
Commit de621eb

Merge pull request #1 from MayankChaturvedi/gemma3
Add Gemma3
2 parents 3294a89 + 0dd75b0 commit de621eb

22 files changed: +1899 -256 lines

Diff for: README.md (+40 -25)
# Gemma in PyTorch

**Gemma** is a family of lightweight, state-of-the-art open models built from research and technology used to create Google Gemini models. They include both text-only and multimodal decoder-only large language models, with open weights, pre-trained variants, and instruction-tuned variants. For more details, please check out the following links:

* [Gemma on Google AI](https://ai.google.dev/gemma)
* [Gemma on Kaggle](https://www.kaggle.com/models/google/gemma-3)
* [Gemma on Vertex AI Model Garden](https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/gemma3)

This is the official PyTorch implementation of Gemma models. We provide model and inference implementations using both PyTorch and PyTorch/XLA, and support running inference on CPU, GPU and TPU.

## Updates

* [March 12th, 2025 🔥] Support Gemma v3. You can find the checkpoints [on Kaggle](https://www.kaggle.com/models/google/gemma-3/pytorch) and [Hugging Face](https://huggingface.co/models?other=gemma_torch).

* [June 26th, 2024] Support Gemma v2. You can find the checkpoints [on Kaggle](https://www.kaggle.com/models/google/gemma-2/pytorch) and Hugging Face.

* [April 9th, 2024] Support CodeGemma. You can find the checkpoints [on Kaggle](https://www.kaggle.com/models/google/codegemma/pytorch) and [Hugging Face](https://huggingface.co/collections/google/codegemma-release-66152ac7b683e2667abdee11).

* [April 5th, 2024] Support Gemma v1.1. You can find the v1.1 checkpoints [on Kaggle](https://www.kaggle.com/models/google/gemma/frameworks/pyTorch) and [Hugging Face](https://huggingface.co/collections/google/gemma-release-65d5efbccdbb8c4202ec078b).

## Download Gemma model checkpoint

You can find the model checkpoints on Kaggle:

- [Gemma 3](https://www.kaggle.com/models/google/gemma-3/pyTorch)
- [Gemma 2](https://www.kaggle.com/models/google/gemma-2/pyTorch)
- [Gemma](https://www.kaggle.com/models/google/gemma/pyTorch)
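
If you prefer to script the Kaggle download, the `kagglehub` package exposes `kagglehub.model_download`; a minimal sketch, noting that the variation name `gemma-3-4b-it` below is an assumption — copy the exact handle from the model page:

```bash
pip install kagglehub
python -c "
import kagglehub
# Handle format: <owner>/<model>/<framework>/<variation>; the variation here is illustrative.
path = kagglehub.model_download('google/gemma-3/pyTorch/gemma-3-4b-it')
print(path)  # local directory containing the checkpoint files
"
```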

Alternatively, you can find the model checkpoints on the Hugging Face Hub [here](https://huggingface.co/models?other=gemma_torch). To download the models, go to the model repository of the model of interest, click the `Files and versions` tab, and download the model and tokenizer files. For programmatic downloading, if you have `huggingface_hub` installed, you can also run:

```
huggingface-cli download google/gemma-3-4b-it-pytorch
```
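
To place the files in a directory of your choosing (for example, somewhere you can later point `CKPT_PATH` at), `huggingface-cli download` also accepts a `--local-dir` flag; the target path below is illustrative:

```bash
# Download the checkpoint and tokenizer files into a fixed local directory.
huggingface-cli download google/gemma-3-4b-it-pytorch --local-dir /tmp/gemma-3-ckpt
```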

The following model sizes are available:

- **Gemma 3**:
  - **Text only**: 1b
  - **Multimodal**: 4b, 12b, 27b_v3
- **Gemma 2**:
  - **Text only**: 2b-v2, 9b, 27b
- **Gemma**:
  - **Text only**: 2b, 7b

Note that for Gemma 3 you can choose between the 1B, 4B, 12B, and 27B variants.

```
VARIANT=<1b, 2b, 2b-v2, 4b, 7b, 9b, 12b, 27b, 27b_v3>
CKPT_PATH=<Insert ckpt path here>
```
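
For example, to target the multimodal Gemma 3 4B checkpoint downloaded earlier (paths illustrative):

```bash
VARIANT=4b
CKPT_PATH=/tmp/gemma-3-ckpt  # wherever you placed the downloaded checkpoint
```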

[…]

### Run Gemma inference on CPU.

> NOTE: This is a multimodal example. Use a multimodal variant.

```bash
docker run -t --rm \
-v ${CKPT_PATH}:/tmp/ckpt \
${DOCKER_URI} \
python scripts/run_multimodal.py \
--ckpt=/tmp/ckpt \
--variant="${VARIANT}"
# add `--quant` for the int8 quantized model.
```
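
For text-only variants, the previous revision of this README invoked `scripts/run.py` with a `--prompt` flag; a sketch carried over from those pre-Gemma 3 instructions:

```bash
PROMPT="The meaning of life is"

docker run -t --rm \
-v ${CKPT_PATH}:/tmp/ckpt \
${DOCKER_URI} \
python scripts/run.py \
--ckpt=/tmp/ckpt \
--variant="${VARIANT}" \
--prompt="${PROMPT}"
```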

### Run Gemma inference on GPU.

> NOTE: This is a multimodal example. Use a multimodal variant.

```bash
docker run -t --rm \
--gpus all \
-v ${CKPT_PATH}:/tmp/ckpt \
${DOCKER_URI} \
python scripts/run_multimodal.py \
--device=cuda \
--ckpt=/tmp/ckpt \
--variant="${VARIANT}"
# add `--quant` for the int8 quantized model.
```

[…]

### Run Gemma inference on CPU.

> NOTE: This is a multimodal example. Use a multimodal variant.

```bash
docker run -t --rm \
--shm-size 4gb \
...
```

[…]

### Tokenizer Notes

98 unused tokens are reserved in the pretrained tokenizer model to assist with more efficient training/fine-tuning. Unused tokens are in the string format of `<unused[0-97]>` with token id range of `[7-104]`.

```
"<unused0>": 7,
"<unused1>": 8,
"<unused2>": 9,
...
"<unused97>": 104,
```
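
You can verify these ids against the checkpoint's SentencePiece model; a minimal sketch, assuming the tokenizer file is named `tokenizer.model` inside your checkpoint directory (adjust the path to match your download):

```bash
python -c "
import sentencepiece as spm
# Path is illustrative; point it at the tokenizer shipped with your checkpoint.
sp = spm.SentencePieceProcessor(model_file='/tmp/gemma-3-ckpt/tokenizer.model')
print(sp.piece_to_id('<unused0>'))   # expected: 7
print(sp.piece_to_id('<unused97>'))  # expected: 104
"
```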

## Disclaimer
