**`docs/adding-data-to-model/creating_new_knowledge_or_skills.md`** (+8 −8)

````diff
@@ -162,7 +162,7 @@ ilab model train
 ⏳ This step can potentially take **several hours** to complete depending on your computing resources. Please stop `ilab model chat` and `ilab model serve` first to free resources.
 
-When running multi phase training evaluation is run on each phase, we will tell you which checkpoint in this folder performs the best.
+When running multiphase training, evaluation is run on each phase and we will tell you which checkpoint in this folder performs the best.
 
 #### Train the model locally on an M-series Mac or on Linux using the full pipeline
````
````diff
@@ -237,7 +237,7 @@ On a Mac `ilab model train` outputs a brand-new model that is saved in the `<mod
 #### Train the model locally with GPU acceleration
 
-Training has support for GPU acceleration with Nvidia CUDA or AMD ROCm. Please see [the GPU acceleration documentation](./docs/gpu-acceleration.md) for more details. At present, hardware acceleration requires a data center GPU or high-end consumer GPU with at least 18 GB free memory.
+Training has support for GPU acceleration with Nvidia CUDA or AMD ROCm. Please see [the GPU acceleration documentation](https://github.com/instructlab/instructlab/blob/main/docs/gpu-acceleration.md) for more details. At present, hardware acceleration requires a data center GPU or high-end consumer GPU with at least 18 GB free memory.
 
 ```shell
 ilab model train --pipeline accelerated --device cuda --data-path <path-to-sdg-data>
````
````diff
@@ -251,9 +251,9 @@ ilab model train --pipeline full --device cpu --data-path ~/.local/share/instruc
 ```
 
 This version of `ilab model train` outputs brand-new models that can be served in the `~/.local/share/instructlab/checkpoints` directory. These models can be run through `ilab model evaluate` to choose the best one.
 
-#### Train the model locally with multi-phase training and GPU acceleration
+#### Train the model locally with multiphase training and GPU acceleration
 
-`ilab model train` supports multi-phase training. This results in the following workflow:
+`ilab model train` supports multiphase training. This results in the following workflow:
 
 1. We train the model on knowledge
 2. Evaluate the trained model to find the best checkpoint
````
````diff
@@ -264,13 +264,13 @@ This version of `ilab model train` outputs brand-new models that can be served i
-This command takes in two `.jsonl` files from your `datasets` directory, one is the knowledge jsonl and the other is a skills jsonl. The `-y` flag skips an interactive prompt asking the user if they are sure they want to run multi-phase training.
+This command takes in two `.jsonl` files from your `datasets` directory: one is the knowledge `.jsonl` and the other is a skills `.jsonl`. The `-y` flag skips an interactive prompt asking the user if they are sure they want to run multiphase training.
 
 ⏳ This command may take 3 or more hours depending on the size of the data and number of training epochs you run.
 
 #### Train the model in the cloud
 
-Follow the instructions in [Training](./notebooks/README.md).
+Follow the instructions in [Training](https://github.com/instructlab/instructlab/blob/main/notebooks/README.md).
 
 ⏳ Approximate amount of time taken on each platform:
````
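For reference, the multiphase run described in this hunk can be sketched as a single command. The `--strategy` and `--phased-phase*-data` flag names and the dataset file names below are assumptions about the `ilab` CLI; verify them against `ilab model train --help` on your installed version:

```shell
# Hypothetical dataset paths -- substitute the knowledge and skills .jsonl
# files that data generation wrote to your datasets directory.
knowledge_data="$HOME/.local/share/instructlab/datasets/knowledge_train_msgs.jsonl"
skills_data="$HOME/.local/share/instructlab/datasets/skills_train_msgs.jsonl"

# Phase 1 trains on knowledge, phase 2 on skills; -y skips the
# "are you sure?" confirmation for this long-running job.
# `echo` prints the full command for review -- remove it to actually train.
echo ilab model train \
  --strategy lab-multiphase \
  --phased-phase1-data "$knowledge_data" \
  --phased-phase2-data "$skills_data" \
  -y
```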
````diff
@@ -452,12 +452,12 @@ argument to specify your new model:
 ilab model chat -m <New model path>
 ```
 
-If you are interested in optimizing the quality of the model's responses, please see [`TROUBLESHOOTING.md`](./TROUBLESHOOTING.md#model-fine-tuning-and-response-optimization)
+If you are interested in optimizing the quality of the model's responses, please see [`TROUBLESHOOTING.md`](https://github.com/instructlab/instructlab/blob/main/TROUBLESHOOTING.md#model-fine-tuning-and-response-optimization)
 
 ## 🎁 Submit your new knowledge or skills
 
 Of course, the final step is, if you've improved the model, to open a pull-request in the [taxonomy repository](https://github.com/instructlab/taxonomy) that includes the files (e.g. `qna.yaml`) with your improved data.
 
 ## 📬 Contributing
 
-Check out our [contributing](CONTRIBUTING/CONTRIBUTING.md) guide to learn how to contribute.
+Check out our [contributing](../community/CONTRIBUTING.md) guide to learn how to contribute.
````
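Before opening that pull request, it can help to sanity-check the shape of the `qna.yaml` being submitted. The skeleton below is a hypothetical minimal example: the field names follow the taxonomy knowledge schema as best understood here, and every value (domain, repo URL, commit, questions, answers) is a placeholder -- consult the taxonomy contribution docs for the authoritative schema:

```shell
# Write a minimal, hypothetical knowledge qna.yaml skeleton (placeholders only).
cat > qna.yaml <<'EOF'
version: 3
created_by: your-github-username
domain: example-domain
seed_examples:
  - context: A short excerpt copied from your source document.
    questions_and_answers:
      - question: What does this excerpt describe?
        answer: A one-sentence answer grounded in the excerpt.
document:
  repo: https://github.com/your-org/your-knowledge-docs
  commit: <commit-sha>
  patterns:
    - "*.md"
EOF

# Quick structural check before committing it to your taxonomy fork:
grep -q 'seed_examples:' qna.yaml && echo "qna.yaml skeleton written"
```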
````diff
@@ -133,9 +133,9 @@ The following resources include additional information about each repository, su
 ### ilab CLI tool additional resources
 
-*[`ilab` CLI tool README.md](https://github.com/instructlab/instructlab/blob/main/README.md#). This resources provides information about the `ilab` CLI tool, including an overview, getting started, training the model, submitting a pull request, etc.
+*[`ilab` CLI tool README.md](https://github.com/instructlab/instructlab/blob/main/README.md#). This resource provides information about the `ilab` CLI tool, including an overview, getting started, training the model, submitting a pull request, etc.
-*[`ilab` CLI tool CONTRIBUTING.md](https://github.com/instructlab/instructlab/blob/main/CONTRIBUTING/CONTRIBUTING.md). This resources provides information about contributing to the `ilab` CLI tool repository, reporting bugs, testing, coding styles, etc.
+*[`ilab` CLI tool CONTRIBUTING.md](https://github.com/instructlab/instructlab/blob/main/CONTRIBUTING/CONTRIBUTING.md). This resource provides information about contributing to the `ilab` CLI tool repository, reporting bugs, testing, coding styles, etc.
````
**`docs/community/FAQ.md`** (+7 −1)

````diff
@@ -86,7 +86,7 @@ InstructLab is driven by taxonomies and works by empowering users to add new [_s
 ### What are the goals of the InstructLab project?
 
-The goal on the InstructLab project is to emocratize contributions to AI and LLMs. There are two approaches to achieving this goal in our community:
+The goal of the InstructLab project is to democratize contributions to AI and LLMs. There are two approaches to achieving this goal in our community:
 
 * Enabling collaborative contribution to a large language model (LLM) through [the project's _taxonomy_ repository](https://github.com/instructlab/taxonomy). When users contribute to this repository, the project resynthesizes its open source training data. Our community Granite-based model is then retrained, ensuring that community contributions are integrated while enriching the model’s capabilities over time.
@@ -118,6 +118,12 @@ When contributors write an addition to the existing taxonomy, make a pull reques
 Contributions to the InstructLab project include fine-tuning Granite-7b, an open-source licensed LLM. Contributors have direct access to the model they are improving through [Hugging Face](https://huggingface.co/instructlab).
 
+### What is Merlinite-7b?
+
+Merlinite-7b is a Mistral-7b derivative model fine-tuned with the LAB (**L**arge-scale **A**lignment for chat**B**ots) method using Mixtral-8x7b-Instruct as a teacher model.
+
+More information about Merlinite-7b can be found on the [Hugging Face project page](https://huggingface.co/instructlab/merlinite-7b-lab).
+
 ### What is Granite-7-lab?
 
 Granite-7b-lab is a model that was built from scratch by IBM and fine tuned with the LAB (**L**arge-scale **A**lignment for chat**B**ots) method.
````
````diff
@@ -108,7 +108,7 @@ This taxonomy repository will be used as the seed to synthesize the training dat
 By contributing your skills and knowledge to this repository, you will see your changes built into an LLM within days of your contribution rather than months or years! If you are working with a model and notice its knowledge or ability lacking, you can correct it by contributing knowledge or skills and check if it's improved after your changes are built.
 
-While public contributions are welcome to help drive community progress, you can also fork this repository under [the Apache License, Version 2.0](LICENSE), add your own internal skills, and train your own models internally. However, you might need your own access to significant compute infrastructure to perform sufficient retraining.
+While public contributions are welcome to help drive community progress, you can also fork this repository under [the Apache License, Version 2.0](../LICENSE), add your own internal skills, and train your own models internally. However, you might need your own access to significant compute infrastructure to perform sufficient retraining.
 
 ## Ways to Contribute
@@ -121,10 +121,10 @@ For more information, see the [Ways of contributing to the taxonomy repository](
 ## How to contribute skills and knowledge
 
-To contribute to this repo, you'll use the *Fork and Pull* model common in many open source repositories. You can add your skills and knowledge to the taxonomy in multiple ways; for additional information on how to make a contribution, see the [Documentation on contributing](CONTRIBUTING.md). You can also use the following guides to help with contributing:
+To contribute to this repo, you'll use the *Fork and Pull* model common in many open source repositories. You can add your skills and knowledge to the taxonomy in multiple ways; for additional information on how to make a contribution, see the [Documentation on contributing](../community/CONTRIBUTING.md). You can also use the following guides to help with contributing:
 
-- Contributing using the [GitHub webpage UI](docs/contributing_via_GH_UI.md).
-- Contributing knowledge to the taxonomy in the [Knowledge contribution guidelines](docs/knowledge-contribution-guide.md).
+- Contributing using the [GitHub webpage UI](https://github.com/instructlab/taxonomy/blob/main/docs/contributing_via_GH_UI.md).
+- Contributing knowledge to the taxonomy in the [Knowledge contribution guidelines](../taxonomy/knowledge/guide.md).
````
**`docs/user-interface/knowledge_contributions.md`** (+7 −7)

````diff
@@ -8,7 +8,7 @@ The UI Simplifies the process for Skills & Knowledge contributions by:
 * Minimising risk of human error when writing YAML by using the web form.
 
-* Directly submit a github pull request with a press of a button.
+* Directly submit a GitHub pull request with a press of a button.
 
 When the form is filled out, you also are given the option to download the YAML and attribution files to your local machine, and to view the form in its original YAML structure before submission.
@@ -17,11 +17,11 @@ You can view all your submissions on the dashboard page.
 !!! warning
     Even when running the UI locally, you must be logged in via github to successfully submit your Knowledge and Skills contributions. You can still fill out the form, and download the YAML and attribution files.
 
-For tips on writing Skills & Knowledge contributions, please visit the documentation under the [Taxonomy](/taxonomy/) heading.
+For tips on writing Skills & Knowledge contributions, please visit the documentation under the [Taxonomy](../taxonomy/index.md) heading.
 
 ## Knowledge Contributions
 
-Firstly you will need to find a source document for your knowledge. Accepted sources can be found [here](/taxonomy/knowledge/guide).
+Firstly you will need to find a source document for your knowledge. Accepted sources can be found [here](../taxonomy/knowledge/guide.md).
 
 Navigate to the Contribute section of the sidebar and click Knowledge. Here you will see the form to contribute Knowledge to the open-source taxonomy tree.
@@ -51,13 +51,13 @@ Here you will begin filling out your QNA examples that represent the knowledge y
 ### Document Information
 
-You must prepare a markdown file version of the document you wish to use for the knowledge submission. By dragging and dropping the markdown file into the box, and clicking the submit files button, a forked version of the taxonomy repository will be automatically created on your github profile.
+You must prepare a markdown file version of the document you wish to use for the knowledge submission. By dragging and dropping the markdown file into the box, and clicking the submit files button, a forked version of the taxonomy repository will be automatically created on your GitHub profile.
````
**`docs/user-interface/playground_chat.md`** (+6 −6)

````diff
@@ -4,28 +4,28 @@ description: Steps to set up the playground to chat with a model
 logo: images/ilab_dog.png
 ---
 
-To run with a locally run model, make sure that iLab model serve is running in a seperate terminal. If you are unsure on how to do this, please visit the [Intro to serve and chat](/getting-started/serve_and_chat/) section of this document.
+To run with a locally run model, make sure that `ilab model serve` is running in a separate terminal. If you are unsure how to do this, please visit the [Intro to serve and chat](../getting-started/serve_and_chat.md) section of this document.
 
-If you go to `Playground > Chat` by using the side navigation bar, you can interact with the merlinite and granite models.
+If you go to `Playground > Chat` by using the side navigation bar, you can interact with the Merlinite and Granite models.
 
 ![]()
 
-If you are running the ui within a dev environment, the model won't reply because a granite/merinite model endpoint hasn't been given. In this case, we will create a new custom model endpoint, using our locally hosted quantised model.
+If you are running the UI within a dev environment, the model won't reply because a Granite/Merlinite model endpoint hasn't been given. In this case, we will create a new custom model endpoint, using our locally hosted quantised model.
 
 To add a custom model endpoint, go to `Playground > Custom Model Endpoints` and press the `Add Endpoint` button on the right side.
 
-You will have 3 fields to fill out
+You will have 3 fields to fill out:
 
 * The URL, where your customised model is hosted; if hosting locally, the URL would be `http://127.0.0.1:8000/`
 
 * The Model Name, `merlinite-7b-lab-Q4_K_M.gguf`
 
-* API Key, you may put any text in here; in this case I've used`randomCharacters`. If you are setting up an API key, please provide the key in this section.
+* API Key, you may put any text in here; in this case I've used `randomCharacters`. If you are setting up an API key, please provide the key in this section.
 
 ![]()
 
 Go back to the playground chat, select the newly added model and chat.
 
 ![]()
````