Commit 4adf2fa

patch(errors): fix all warnings and errors in build log. fixes #35 (#34)
Signed-off-by: Laura Santamaria <[email protected]>
1 parent a45c143 commit 4adf2fa

10 files changed: +134 -110 lines changed

.gitignore (+8)

@@ -1 +1,9 @@
 venv*
+
+# uv
+uv.lock
+pyproject.toml
+.python-version
+
+# pycharm
+.idea

docs/adding-data-to-model/creating_new_knowledge_or_skills.md (+8 -8)

@@ -162,7 +162,7 @@ ilab model train
 
 ⏳ This step can potentially take **several hours** to complete depending on your computing resources. Please stop `ilab model chat` and `ilab model serve` first to free resources.
 
-When running multi phase training evaluation is run on each phase, we will tell you which checkpoint in this folder performs the best.
+When running multiphase training evaluation is run on each phase, we will tell you which checkpoint in this folder performs the best.
 
 #### Train the model locally on an M-series Mac or on Linux using the full pipeline
 
@@ -237,7 +237,7 @@ On a Mac `ilab model train` outputs a brand-new model that is saved in the `<mod
 
 #### Train the model locally with GPU acceleration
 
-Training has support for GPU acceleration with Nvidia CUDA or AMD ROCm. Please see [the GPU acceleration documentation](./docs/gpu-acceleration.md) for more details. At present, hardware acceleration requires a data center GPU or high-end consumer GPU with at least 18 GB free memory.
+Training has support for GPU acceleration with Nvidia CUDA or AMD ROCm. Please see [the GPU acceleration documentation](https://github.com/instructlab/instructlab/blob/main/docs/gpu-acceleration.md) for more details. At present, hardware acceleration requires a data center GPU or high-end consumer GPU with at least 18 GB free memory.
 
 ```shell
 ilab model train --pipeline accelerated --device cuda --data-path <path-to-sdg-data>
@@ -251,9 +251,9 @@ ilab model train --pipeline full --device cpu --data-path ~/.local/share/instruc
 
 This version of `ilab model train` outputs brand-new models that can be served in the `~/.local/share/instructlab/checkpoints` directory. These models can be run through `ilab model evaluate` to choose the best one.
 
-#### Train the model locally with multi-phase training and GPU acceleration
+#### Train the model locally with multiphase training and GPU acceleration
 
-`ilab model train` supports multi-phase training. This results in the following workflow:
+`ilab model train` supports multiphase training. This results in the following workflow:
 
 1. We train the model on knowledge
 2. Evaluate the trained model to find the best checkpoint
@@ -264,13 +264,13 @@ This version of `ilab model train` outputs brand-new models that can be served i
 ilab model train --strategy lab-multiphase --phased-phase1-data <knowledge train messages jsonl> --phased-phase2-data <skills train messages jsonl> -y
 ```
 
-This command takes in two `.jsonl` files from your `datasets` directory, one is the knowledge jsonl and the other is a skills jsonl. The `-y` flag skips an interactive prompt asking the user if they are sure they want to run multi-phase training.
+This command takes in two `.jsonl` files from your `datasets` directory, one is the knowledge jsonl and the other is a skills jsonl. The `-y` flag skips an interactive prompt asking the user if they are sure they want to run multiphase training.
 
 ⏳ This command may take 3 or more hours depending on the size of the data and number of training epochs you run.
 
 #### Train the model in the cloud
 
-Follow the instructions in [Training](./notebooks/README.md).
+Follow the instructions in [Training](https://github.com/instructlab/instructlab/blob/main/notebooks/README.md).
 
 ⏳ Approximate amount of time taken on each platform:
 
@@ -452,12 +452,12 @@ argument to specify your new model:
 ilab model chat -m <New model path>
 ```
 
-If you are interested in optimizing the quality of the model's responses, please see [`TROUBLESHOOTING.md`](./TROUBLESHOOTING.md#model-fine-tuning-and-response-optimization)
+If you are interested in optimizing the quality of the model's responses, please see [`TROUBLESHOOTING.md`](https://github.com/instructlab/instructlab/blob/main/TROUBLESHOOTING.md#model-fine-tuning-and-response-optimization)
 
 ## 🎁 Submit your new knowledge or skills
 
 Of course, the final step is, if you've improved the model, to open a pull-request in the [taxonomy repository](https://github.com/instructlab/taxonomy) that includes the files (e.g. `qna.yaml`) with your improved data.
 
 ## 📬 Contributing
 
-Check out our [contributing](CONTRIBUTING/CONTRIBUTING.md) guide to learn how to contribute.
+Check out our [contributing](../community/CONTRIBUTING.md) guide to learn how to contribute.

docs/community/CONTRIBUTING.md (+5 -5)

@@ -66,7 +66,7 @@ Once you've created a pull request (PR), maintainers will review your code and m
 * Write detailed commit messages
 * Break large changes into a logical series of smaller patches, which are easy to understand individually and combine to solve a broader issue
 
-For a list of the maintainers and triagers, see the [MAINTAINERS.md](MAINTAINERS.md) page.
+For a list of the maintainers and triagers, see the [MAINTAINERS.md](https://github.com/instructlab/community/blob/main/MAINTAINERS.md) page.
 
 ### Proposing new features
 
@@ -93,7 +93,7 @@ Distributed under the [Apache License, Version 2.0](http://www.apache.org/licens
 
 SPDX-License-Identifier: [Apache-2.0](https://spdx.org/licenses/Apache-2.0)
 
-If you would like to see the detailed LICENSE click [here](LICENSE).
+If you would like to see the detailed LICENSE click [here](../LICENSE).
 
 ### Developer Certificate of Origin (DCO)
 
@@ -118,7 +118,7 @@ We automatically verify that all commit messages contain a `Signed-off-by:` line
 
 There are a number of tools that make it easier for developers to manage DCO signoffs.
 
-* DCO command line tool, which let's you do a single signoff for an entire repo ( <https://github.com/coderanger/dco> )
+* DCO command line tool, which lets you do a single signoff for an entire repo ( <https://github.com/coderanger/dco> )
 * GitHub UI integrations for adding the signoff automatically ( <https://github.com/scottrigby/dco-gh-ui> )
 * Chrome - <https://chrome.google.com/webstore/detail/dco-github-ui/onhgmjhnaeipfgacbglaphlmllkpoijo>
 * Firefox - <https://addons.mozilla.org/en-US/firefox/addon/scott-rigby/?src=search>
@@ -133,9 +133,9 @@ The following resources include additional information about each repository, su
 
 ### ilab CLI tool additional resources
 
-* [`ilab` CLI tool README.md](https://github.com/instructlab/instructlab/blob/main/README.md#). This resources provides information about the `ilab` CLI tool, including an overview, getting started, training the model, submitting a pull request, etc.
+* [`ilab` CLI tool README.md](https://github.com/instructlab/instructlab/blob/main/README.md#). This resource provides information about the `ilab` CLI tool, including an overview, getting started, training the model, submitting a pull request, etc.
 
-* [`ilab` CLI tool CONTRIBUTING.md](https://github.com/instructlab/instructlab/blob/main/CONTRIBUTING/CONTRIBUTING.md). This resources provides information about contributing to the `ilab` CLI tool repository, reporting bugs, testing, coding styles, etc.
+* [`ilab` CLI tool CONTRIBUTING.md](https://github.com/instructlab/instructlab/blob/main/CONTRIBUTING/CONTRIBUTING.md). This resource provides information about contributing to the `ilab` CLI tool repository, reporting bugs, testing, coding styles, etc.
 
 ### Taxonomy additional resources
 

docs/community/FAQ.md (+7 -1)

@@ -86,7 +86,7 @@ InstructLab is driven by taxonomies and works by empowering users to add new [_s
 
 ### What are the goals of the InstructLab project?
 
-The goal on the InstructLab project is to emocratize contributions to AI and LLMs. There are two approaches to achieving this goal in our community:
+The goal on the InstructLab project is to democratize contributions to AI and LLMs. There are two approaches to achieving this goal in our community:
 
 * Enabling collaborative contribution to a large language model (LLM) through [the project's _taxonomy_ repository](https://github.com/instructlab/taxonomy). When users contribute to this repository, the project resynthesizes its open source training data. Our community Granite-based model is then retrained, ensuring that community contributions are integrated while enriching the model’s capabilities over time.
 
@@ -118,6 +118,12 @@ When contributors write an addition to the existing taxonomy, make a pull reques
 
 Contributions to the InstructLab project include fine-tuning Granite-7b, an open-source licensed LLM. Contributors have direct access to the model they are improving through [Hugging Face](https://huggingface.co/instructlab).
 
+### What is Merlinite-7b?
+
+Merlinite-7b is a Mistral-7b derivative model fine-tuned with the LAB (**L**arge-scale **A**lignment for chat**B**ots) method using Mixtral-8x7b-Instruct as a teacher model.
+
+More information about the Merlinite-7b can be found on the [Hugging Face project page](https://huggingface.co/instructlab/merlinite-7b-lab).
+
 ### What is Granite-7-lab?
 
 Granite-7b-lab is a model that was built from scratch by IBM and fine tuned with the LAB (**L**arge-scale **A**lignment for chat**B**ots) method.

docs/taxonomy/index.md (+7 -7)

@@ -66,11 +66,11 @@ domain. Maintainers can decide to change the names of the existing branches or t
 knowledge --> knowledge/miscellaneous_unknown
 knowledge --> knowledge/science
 knowledge --> knowledge/technology
-knowledge/science --> animals --> birds --> black_capped_chickadee --> black_capped_chikadee-a & black_capped_chikadee-q
+knowledge/science --> animals --> birds --> black_capped_chickadee --> black_capped_chickadee-a & black_capped_chickadee-q
 knowledge/science --> astronomy --> constellations --> phoenix --> phoenix-a & phoenix-q
 
-black_capped_chikadee-a{attribution.txt}
-black_capped_chikadee-q{qna.yaml}
+black_capped_chickadee-a{attribution.txt}
+black_capped_chickadee-q{qna.yaml}
 phoenix-a{attribution.txt}
 phoenix-q{qna.yaml}
 classDef na fill:#EEE
@@ -108,7 +108,7 @@ This taxonomy repository will be used as the seed to synthesize the training dat
 
 By contributing your skills and knowledge to this repository, you will see your changes built into an LLM within days of your contribution rather than months or years! If you are working with a model and notice its knowledge or ability lacking, you can correct it by contributing knowledge or skills and check if it's improved after your changes are built.
 
-While public contributions are welcome to help drive community progress, you can also fork this repository under [the Apache License, Version 2.0](LICENSE), add your own internal skills, and train your own models internally. However, you might need your own access to significant compute infrastructure to perform sufficient retraining.
+While public contributions are welcome to help drive community progress, you can also fork this repository under [the Apache License, Version 2.0](../LICENSE), add your own internal skills, and train your own models internally. However, you might need your own access to significant compute infrastructure to perform sufficient retraining.
 
 ## Ways to Contribute
 
@@ -121,10 +121,10 @@ For more information, see the [Ways of contributing to the taxonomy repository](
 
 ## How to contribute skills and knowledge
 
-To contribute to this repo, you'll use the *Fork and Pull* model common in many open source repositories. You can add your skills and knowledge to the taxonomy in multiple ways; for additional information on how to make a contribution, see the [Documentation on contributing](CONTRIBUTING.md). You can also use the following guides to help with contributing:
+To contribute to this repo, you'll use the *Fork and Pull* model common in many open source repositories. You can add your skills and knowledge to the taxonomy in multiple ways; for additional information on how to make a contribution, see the [Documentation on contributing](../community/CONTRIBUTING.md). You can also use the following guides to help with contributing:
 
-- Contributing using the [GitHub webpage UI](docs/contributing_via_GH_UI.md).
-- Contributing knowledge to the taxonomy in the [Knowledge contribution guidelines](docs/knowledge-contribution-guide.md).
+- Contributing using the [GitHub webpage UI](https://github.com/instructlab/taxonomy/blob/main/docs/contributing_via_GH_UI.md).
+- Contributing knowledge to the taxonomy in the [Knowledge contribution guidelines](../taxonomy/knowledge/guide.md).
 
 ### Why should I contribute?
 

docs/user-interface/knowledge_contributions.md (+7 -7)

@@ -8,7 +8,7 @@ The UI Simplifies the process for Skills & Knowledge contributions by:
 
 * Minimising risk of human error when writing YAML by using the web form.
 
-* Directly submit a github pull request with a press of a button.
+* Directly submit a GitHub pull request with a press of a button.
 
 When the form is filled out, you also are given the option to download the YAML and attribution files to your local machine, and to view the form in its original YAML structure before submission.
 
@@ -17,11 +17,11 @@ You can view all your submissions on the dashboard page.
 !!! warning
 Even when running the UI locally, you must be logged in via github to successfully submit your Knowledge and Skills contributions. You can still fill out the form, and download the YAML and attribution files.
 
-For tips on writing Skills & Knowledge contributions, please visit the documentation under the [Taxonomy](/taxonomy/) heading.
+For tips on writing Skills & Knowledge contributions, please visit the documentation under the [Taxonomy](../taxonomy/index.md) heading.
 
 ## Knowledge Contributions
 
-Firstly you will need to find a source document for your knowledge. Accepted sources can be found [here](/taxonomy/knowledge/guide).
+Firstly you will need to find a source document for your knowledge. Accepted sources can be found [here](../taxonomy/knowledge/guide.md).
 
 Navigate to the Contribute section of the sidebar and click Knowledge. Here you will see the form to contribute Knowledge to the open-source taxonomy tree.
 
@@ -51,13 +51,13 @@ Here you will begin filling out your QNA examples that represent the knowledge y
 
 ### Document Information
 
-You must prepare a markdown file version of the document you wish to use for the knowledge submission. By dragging and dropping the markdown file into the box, and clicking the submit files button, a forked version of the taxonomy repository will be automatically created on your github profile.
+You must prepare a markdown file version of the document you wish to use for the knowledge submission. By dragging and dropping the markdown file into the box, and clicking the submit files button, a forked version of the taxonomy repository will be automatically created on your GitHub profile.
 
 ![UI Knowledge Document Information](../images/user-interface/ui_knowledge_document_info.png)
 
-![Forked Repository Showcase](../images//user-interface/ui_knowledge_repo_created.png)
+![Forked Repository Showcase](../images/user-interface/ui_knowledge_repo_created.png)
 
-If you've already uploaded the markdown file to your github, you can switch to manually adding the document, and entering the `commit sha`.
+If you've already uploaded the markdown file to your GitHub, you can switch to manually adding the document, and entering the `commit sha`.
 
 ![UI Knowledge Document Manual Information](../images/user-interface/ui_knowledge_document_manual_info.png)
 
@@ -77,4 +77,4 @@ Once you have submitted a Skills or Knowledge Contribution, you can view it on y
 
 ![UI Dashboard With Contribution](../images/user-interface/ui_dashboard_with_submission.png)
 
-[Next Steps](/user-interface/skills_contributions/){: .md-button .md-button--primary }
+[Next Steps](skills_contributions.md){: .md-button .md-button--primary }

docs/user-interface/playground_chat.md (+6 -6)

@@ -4,28 +4,28 @@ description: Steps to set up the playground to chat with a model
 logo: images/ilab_dog.png
 ---
 
-To run with a locally run model, make sure that iLab model serve is running in a seperate terminal. If you are unsure on how to do this, please visit the [Intro to serve and chat](/getting-started/serve_and_chat/) section of this document.
+To run with a locally run model, make sure that iLab model serve is running in a separate terminal. If you are unsure on how to do this, please visit the [Intro to serve and chat](../getting-started/serve_and_chat.md) section of this document.
 
-If you go to `Playground > Chat` by using the side navigation bar, you can interact with the merlinite and granite models.
+If you go to `Playground > Chat` by using the side navigation bar, you can interact with the Merlinite and Granite models.
 
 ![UI No Model Response](../images/user-interface/ui_no_model_response.png)
 
-If you are running the ui within a dev environment, the model won't reply because a granite/merinite model endpoint hasn't been given. In this case, we will create a new custom model endpoint, using our locally hosted quantised model.
+If you are running the ui within a dev environment, the model won't reply because a Granite/Merinite model endpoint hasn't been given. In this case, we will create a new custom model endpoint, using our locally hosted quantised model.
 
 To add a custom model endpoint, go to `Playground > Custom Model Endpoints` and press the `Add Endpoint` button on the right side.
 
-You will have 3 fields to fill out
+You will have 3 fields to fill out:
 
 * The URL, where your customised model is hosted, if hosting locally, the URL would be `http://127.0.0.1:8000/`
 
 * The Model Name, `merlinite-7b-lab-Q4_K_M.gguf`
 
-* API Key, you may put any text in here; in this case I've used`randomCharacters`. If you are setting up an API key, please provide the key in this section.
+* API Key, you may put any text in here; in this case I've used `randomCharacters`. If you are setting up an API key, please provide the key in this section.
 
 ![UI Custom Model Endpoint](../images/user-interface/ui_custom_model_endpoint.png)
 
 Go back to the playground chat, select newly added model and chat.
 
 ![UI Model Response](../images/user-interface/ui_model_response.png)
 
-[Next Steps](/user-interface/knowledge_contributions/){: .md-button .md-button--primary }
+[Next Steps](knowledge_contributions.md){: .md-button .md-button--primary }
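
Note on the link fixes: many of the changes above follow one pattern, where a root-absolute link such as `/taxonomy/` or a relative link to a file outside the docs tree is replaced by a relative `.md` path or a full GitHub URL. The sketch below is not part of this commit. It is a minimal illustration, assuming a local docs directory of Markdown files, of how such links could be spotted before a build. The names `find_suspect_links` and `classify_link` are hypothetical, and the regular expression only covers simple inline links and images.

```python
import re
from pathlib import Path

# Matches simple inline links and images: [text](target) or ![alt](target).
# Assumption: targets contain no spaces or nested parentheses.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def classify_link(target: str, source: Path) -> str:
    """Classify one link target found in the Markdown file `source`."""
    if re.match(r"^[a-z][a-z0-9+.-]*:", target, re.IGNORECASE):
        return "external"   # e.g. https://..., mailto:...
    if target.startswith("#"):
        return "anchor"     # in-page anchor, nothing to resolve
    if target.startswith("/"):
        return "absolute"   # root-absolute, like /taxonomy/ in the old lines
    # Relative path: drop any #fragment and resolve against the source file.
    path = target.split("#", 1)[0]
    resolved = (source.parent / path).resolve()
    return "ok" if resolved.exists() else "missing"


def find_suspect_links(docs_root: Path):
    """Return (file, line number, target, kind) for absolute or missing links."""
    problems = []
    for md in sorted(docs_root.rglob("*.md")):
        text = md.read_text(encoding="utf-8")
        for lineno, line in enumerate(text.splitlines(), 1):
            for match in LINK_RE.finditer(line):
                kind = classify_link(match.group(1), md)
                if kind in ("absolute", "missing"):
                    rel = md.relative_to(docs_root).as_posix()
                    problems.append((rel, lineno, match.group(1), kind))
    return problems
```

Calling `find_suspect_links(Path("docs"))` from the repository root would list candidates for the kind of rewrite shown in this diff. It does not validate external URLs or heading anchors.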
