# LLaMA 3

A single-file implementation of [LLaMA 3](https://arxiv.org/abs/2407.21783), with support for jitting, KV caching and prompting.

The original implementation can be found at https://github.com/meta-llama/llama3.

---------------------------------------------------------------------------------------------------------
## 🛠️ Installation

### Using Pip

First of all, install [Python 3.8 or later](https://www.python.org). Open a terminal and run:

```bash
pip install git+https://github.com/lucadellalib/llama3@main#egg=llama3[all]
```
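
To quickly verify the installation, you can check that the package imports and exposes `LlamaDecoder`, the class used in the quickstart below:

```python
# Sanity check: this should print the class without raising ImportError
from llama3 import LlamaDecoder

print(LlamaDecoder)
```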

### From Source

First of all, install [Python 3.8 or later](https://www.python.org).
Clone or download and extract the repository, navigate to `<path-to-repository>`, open a terminal and run:

```bash
# Install the package locally in editable mode
pip install -e .[all]
```

---------------------------------------------------------------------------------------------------------

## ▶️ Quickstart

### Importing the Model in Your Own Script

```python
import torch
from llama3 import LlamaDecoder

B, H, K = 3, 512, 30
model = LlamaDecoder(K)
print(model)

# Process 50 timesteps
x = torch.randn(B, 50, H)
output, state = model(x)
print(output.shape)

# Process 2 additional timesteps, reusing the cached state
x = torch.randn(B, 2, H)
output, state = model(x, state=state)
print(output.shape)

# JIT the model
model_jit = model.jit()
output_jit, state_jit = model_jit(x)
print(output_jit.shape)
```
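
Because the returned `state` carries the KV cache, processing a sequence in chunks should reproduce a single full-length pass. Below is a minimal sketch of that check; it assumes `LlamaDecoder` is a standard `torch.nn.Module` that is deterministic in `eval` mode, and it reuses the toy sizes from above:

```python
import torch
from llama3 import LlamaDecoder

model = LlamaDecoder(30).eval()
x = torch.randn(3, 52, 512)

with torch.no_grad():
    # One pass over the full 52-step sequence
    full, _ = model(x)
    # The same sequence in two chunks, carrying the KV cache between calls
    first, state = model(x[:, :50])
    second, _ = model(x[:, 50:], state=state)

chunked = torch.cat([first, second], dim=1)
print(torch.allclose(full, chunked, atol=1e-5))
```

If the outputs diverge slightly, loosen the tolerance: attention kernels are not always bitwise reproducible across call patterns.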

### Inference Example With Pretrained Checkpoint

First of all, download the model weights and tokenizer (pretrained variant, e.g. Llama3.2-1B). Check the official
repository for instructions on how to [download the models](https://github.com/meta-llama/llama3#download).

Navigate to `<path-to-repository>`, open a terminal and run:

```bash
python main.py --checkpoint_path <path-to-checkpoint>
```

It is recommended to run this script on a machine with at least one GPU.
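
If you load the model in your own script instead of using `main.py`, the usual PyTorch device-placement idiom applies. The sketch below assumes `LlamaDecoder` is a regular `torch.nn.Module` and reuses the toy constructor argument from the quickstart:

```python
import torch
from llama3 import LlamaDecoder

# Run on GPU when available, otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
model = LlamaDecoder(30).to(device)

x = torch.randn(1, 8, 512, device=device)
output, state = model(x)
print(output.device)
```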

---------------------------------------------------------------------------------------------------------

## 📧 Contact

[luca310795@gmail.com](mailto:luca310795@gmail.com)

---------------------------------------------------------------------------------------------------------