
Commit f54680e

byjlw and mikekgfb authored: Refactor and Fix the Readme (#563)

* refactoring the readme
* continued refining
* more cleanup
* more cleanup
* more cleanup
* more cleanup
* more cleanup
* more refining
* Update README.md
* move the disclaimer down
* remove torchtune from main readme
* Fix pathing issues for runner commands
* don't use pybindings for et setup

Co-authored-by: Michael Gschwind <[email protected]>

1 parent 2030aab commit f54680e

File tree

3 files changed: +54 −2 lines changed


.gitignore

Lines changed: 1 addition & 0 deletions

```diff
@@ -8,6 +8,7 @@ __pycache__/
 
 .model-artifacts/
 .venv
+.torchchat
 
 # Build directories
 build/android/*
```

README.md

Lines changed: 23 additions & 2 deletions
```diff
@@ -93,7 +93,6 @@ You can also remove downloaded models with the remove command:
 `python3 torchchat.py remove llama3`
 
 
-
 ## Running via PyTorch / Python
 [Follow the installation steps if you haven't](#installation)
 
```

````diff
@@ -199,7 +198,7 @@ export TORCHCHAT_ROOT=${PWD}
 ### Export for mobile
 The following example uses the Llama3 8B Instruct model.
 
-[#shell default]: echo '{"embedding": {"bitwidth": 4, "groupsize" : 32}, "linear:a8w4dq": {"groupsize" : 32}}' >./config/data/mobile.json
+[comment default]: echo '{"embedding": {"bitwidth": 4, "groupsize" : 32}, "linear:a8w4dq": {"groupsize" : 32}}' >./config/data/mobile.json
 
 ```
 # Export
````
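The quantization recipe in the `mobile.json` line of the hunk above (4-bit embeddings with group size 32, plus `a8w4dq` — 8-bit-activation / 4-bit-weight dynamically quantized — linear layers) can also be written and sanity-checked from Python instead of the `echo` one-liner. A minimal sketch; only the JSON schema comes from the diff, the temp-file handling here is illustrative (in the repo the file goes to `./config/data/mobile.json`):

```python
import json
import os
import tempfile

# Quantization recipe from the README's mobile-export example:
# 4-bit embeddings (groupsize 32) and a8w4dq linear layers.
mobile_config = {
    "embedding": {"bitwidth": 4, "groupsize": 32},
    "linear:a8w4dq": {"groupsize": 32},
}

# Write the config to a temp dir here; the export step in the repo
# expects it at ./config/data/mobile.json.
path = os.path.join(tempfile.mkdtemp(), "mobile.json")
with open(path, "w") as f:
    json.dump(mobile_config, f, indent=2)

# Round-trip to confirm the file parses back to the same recipe.
with open(path) as f:
    assert json.load(f) == mobile_config
```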
```diff
@@ -250,8 +249,11 @@ Now, follow the app's UI guidelines to pick the model and tokenizer files from t
 <img src="https://pytorch.org/executorch/main/_static/img/llama_ios_app.png" width="600" alt="iOS app running a LlaMA model">
 </a>
 
+
 ### Deploy and run on Android
 
+
+
 MISSING. TBD.
 
 
```

```diff
@@ -262,6 +264,8 @@ Uses the lm_eval library to evaluate model accuracy on a variety of
 tasks. Defaults to wikitext and can be manually controlled using the
 tasks and limit args.
 
+See [Evaluation](docs/evaluation.md)
+
 For more information run `python3 torchchat.py eval --help`
 
 **Examples**
```
```diff
@@ -317,6 +321,7 @@ you can perform the example commands with any of these models.
 **CERTIFICATE_VERIFY_FAILED**
 Run `pip install --upgrade certifi`.
 
+
 **Access to model is restricted and you are not in the authorized
 list** Some models require an additional step to access. Follow the
 link provided in the error to get access.
```
```diff
@@ -338,6 +343,22 @@ third-party models, weights, data, or other technologies, and you are
 solely responsible for complying with all such obligations.
 
 
+### Disclaimer
+The torchchat Repository Content is provided without any guarantees about
+performance or compatibility. In particular, torchchat makes available
+model architectures written in Python for PyTorch that may not perform
+in the same manner or meet the same standards as the original versions
+of those models. When using the torchchat Repository Content, including
+any model architectures, you are solely responsible for determining the
+appropriateness of using or redistributing the torchchat Repository Content
+and assume any risks associated with your use of the torchchat Repository Content
+or any models, outputs, or results, both alone and in combination with
+any other technologies. Additionally, you may have other legal obligations
+that govern your use of other content, such as the terms of service for
+third-party models, weights, data, or other technologies, and you are
+solely responsible for complying with all such obligations.
+
+
 ## Acknowledgements
 Thank you to the [community](docs/ACKNOWLEDGEMENTS.md) for all the
 awesome libraries and tools you've built around local LLM inference.
```

docs/torchtune.md

Lines changed: 30 additions & 0 deletions

````diff
@@ -0,0 +1,30 @@
+# Fine-tuned models from torchtune
+
+torchchat supports running inference with models fine-tuned using [torchtune](https://github.com/pytorch/torchtune). To do so, we first need to convert the checkpoints into a format supported by torchchat.
+
+Below is a simple workflow to run inference on a fine-tuned Llama3 model. For more details on how to fine-tune Llama3, see the instructions [here](https://github.com/pytorch/torchtune?tab=readme-ov-file#llama3)
+
+```bash
+# install torchtune
+pip install torchtune
+
+# download the llama3 model
+tune download meta-llama/Meta-Llama-3-8B \
+  --output-dir ./Meta-Llama-3-8B \
+  --hf-token <ACCESS TOKEN>
+
+# Run LoRA fine-tuning on a single device. This assumes the config points to <checkpoint_dir> above
+tune run lora_finetune_single_device --config llama3/8B_lora_single_device
+
+# convert the fine-tuned checkpoint to a format compatible with torchchat
+python3 build/convert_torchtune_checkpoint.py \
+  --checkpoint-dir ./Meta-Llama-3-8B \
+  --checkpoint-files meta_model_0.pt \
+  --model-name llama3_8B \
+  --checkpoint-format meta
+
+# run inference on a single GPU
+python3 torchchat.py generate \
+  --checkpoint-path ./Meta-Llama-3-8B/model.pth \
+  --device cuda
+```
````
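The conversion step in the new doc rewrites torchtune's meta-format checkpoint into the layout torchchat expects; the real logic lives in `build/convert_torchtune_checkpoint.py`. The sketch below only illustrates the general state-dict key-remapping pattern such converters use — the specific rules and key names here are hypothetical, not the actual mapping:

```python
import re

def remap_keys(state_dict, rules):
    """Rename state-dict keys using the first matching (pattern, replacement) rule."""
    remapped = {}
    for key, value in state_dict.items():
        for pattern, replacement in rules:
            if re.fullmatch(pattern, key):
                key = re.sub(pattern, replacement, key)
                break
        remapped[key] = value
    return remapped

# Hypothetical meta-format -> torchchat rules; the real mapping is defined
# inside build/convert_torchtune_checkpoint.py and may differ.
rules = [
    (r"tok_embeddings\.weight", "model.tok_embeddings.weight"),
    (r"layers\.(\d+)\.attention\.wq\.weight", r"model.layers.\1.attn.wq.weight"),
]

# Stand-in strings where a real checkpoint would hold tensors.
checkpoint = {
    "tok_embeddings.weight": "embedding-tensor",
    "layers.0.attention.wq.weight": "q-proj-tensor",
}
print(remap_keys(checkpoint, rules))
```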
