
Commit d2b2eeb

Author: pytorchbot
Commit message: 2025-02-11 nightly release (a0f74c3)
1 parent 7754294 · commit d2b2eeb

14 files changed, +69 -49 lines

Diff for: README.md (+2, -1)

@@ -72,7 +72,8 @@ torchtune provides the following finetuning recipes for training on one or more
 | DoRA/QDoRA Finetuning | ✅ | ✅ | ❌ | [lora_finetune_single_device](recipes/lora_finetune_single_device.py) <br> [lora_finetune_distributed](recipes/lora_finetune_distributed.py)| [Llama3 8B QDoRA single-device](recipes/configs/llama3/8B_qdora_single_device.yaml) <br> [Llama3 8B DoRA distributed](recipes/configs/llama3/8B_dora.yaml)
 | Quantization-Aware Training | ❌ | ✅ | ❌ | [qat_distributed](recipes/qat_distributed.py)| [Llama3 8B QAT](recipes/configs/llama3/8B_qat_full.yaml)
 | Quantization-Aware Training and LoRA Finetuning | ❌ | ✅ | ❌ | [qat_lora_finetune_distributed](recipes/qat_lora_finetune_distributed.py)| [Llama3 8B QAT](recipes/configs/llama3/8B_qat_lora.yaml)
-| Direct Preference Optimization | ✅ | ✅ | ❌ | [lora_dpo_single_device](recipes/lora_dpo_single_device.py) <br> [lora_dpo_distributed](recipes/lora_dpo_distributed.py) | [Llama2 7B single-device](recipes/configs/llama2/7B_lora_dpo_single_device.yaml) <br> [Llama2 7B distributed](recipes/configs/llama2/7B_lora_dpo.yaml)
+| Direct Preference Optimization: Full Finetuning | ❌ | ✅ | ❌ | [full_dpo_distributed](recipes/full_dpo_distributed.py) | [Llama3.1 8B DPO](recipes/configs/llama3_1/8B_full_dpo.yaml)
+| LoRA Direct Preference Optimization | ✅ | ✅ | ❌ | [lora_dpo_single_device](recipes/lora_dpo_single_device.py) <br> [lora_dpo_distributed](recipes/lora_dpo_distributed.py) | [Llama3.1 8B single-device](recipes/configs/llama3_1/8B_lora_dpo_single_device.yaml) <br> [Llama3.1 8B distributed](recipes/configs/llama3_1/8B_lora_dpo.yaml)
 | Proximal Policy Optimization | ✅ | ❌ | ❌ | [ppo_full_finetune_single_device](recipes/ppo_full_finetune_single_device.py) | [Mistral 7B](recipes/configs/mistral/7B_full_ppo_low_memory.yaml)
 | LoRA Knowledge Distillation | ✅ | ✅ | ❌ | [knowledge_distillation_single_device](recipes/knowledge_distillation_single_device.py) <br> [knowledge_distillation_distributed](recipes/knowledge_distillation_distributed.py) | [Qwen2 1.5B -> 0.5B single-device](recipes/configs/qwen2/1.5B_to_0.5B_KD_lora_single_device.yaml) <br> [Qwen2 1.5B -> 0.5B distributed](recipes/configs/qwen2/1.5B_to_0.5B_KD_lora_distributed.yaml)
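
The single Llama2-based DPO row is split into two entries: full-parameter DPO gets its own distributed recipe, and the LoRA DPO recipe now points at Llama3.1 8B configs. Assuming the new recipe follows the same launch convention as the other distributed recipes shown in the configs below, a run would look roughly like `tune run --nnodes 1 --nproc_per_node 8 full_dpo_distributed --config llama3_1/8B_full_dpo`; the node and device counts here are illustrative, not taken from the new config.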

Diff for: recipes/configs/qwen2_5/14B_lora_single_device.yaml (+4, -4)

@@ -3,7 +3,7 @@
 #
 # This config assumes that you've run the following command before launching
 # this run:
-# tune download Qwen/Qwen2.5-14B-Instruct --output-dir /tmp/Qwen2_5-14B-Instruct
+# tune download Qwen/Qwen2.5-14B-Instruct --output-dir /tmp/Qwen2.5-14B-Instruct
 #
 # To launch on a single device, run the following command from root:
 # tune run lora_finetune_single_device --config qwen2_5/14B_lora_single_device
@@ -30,13 +30,13 @@ model:
 
 tokenizer:
   _component_: torchtune.models.qwen2_5.qwen2_5_tokenizer
-  path: /tmp/Qwen2_5-14B-Instruct/vocab.json
-  merges_file: /tmp/Qwen2_5-14B-Instruct/merges.txt
+  path: /tmp/Qwen2.5-14B-Instruct/vocab.json
+  merges_file: /tmp/Qwen2.5-14B-Instruct/merges.txt
   max_seq_len: null
 
 checkpointer:
   _component_: torchtune.training.FullModelHFCheckpointer
-  checkpoint_dir: /tmp/Qwen2_5-14B-Instruct
+  checkpoint_dir: /tmp/Qwen2.5-14B-Instruct
   checkpoint_files:
     filename_format: model-{}-of-{}.safetensors
     max_filename: "00008"
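
Every Qwen2.5 config in this commit gets the same treatment: the `--output-dir` in the download comment, the tokenizer `path` and `merges_file`, and the checkpointer `checkpoint_dir` all move from `/tmp/Qwen2_5-*` to `/tmp/Qwen2.5-*`, so the configs keep pointing at the directory the weights were actually downloaded into. A quick sanity check along these lines can catch a mismatch before launching; the snippet below is a hypothetical helper, not part of torchtune:

    from pathlib import Path

    # Hypothetical pre-flight check: confirm the directory referenced by the
    # config actually contains the tokenizer files the recipe expects.
    ckpt_dir = Path("/tmp/Qwen2.5-14B-Instruct")  # must match tune download --output-dir
    for name in ("vocab.json", "merges.txt"):
        if not (ckpt_dir / name).exists():
            raise FileNotFoundError(
                f"{ckpt_dir / name} not found; re-run tune download with --output-dir {ckpt_dir}"
            )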

Diff for: recipes/configs/qwen2_5/32B_lora.yaml (+4, -4)

@@ -3,7 +3,7 @@
 #
 # This config assumes that you've run the following command before launching
 # this run:
-# tune download Qwen/Qwen2.5-32B-Instruct --output-dir /tmp/Qwen2_5-32B-Instruct
+# tune download Qwen/Qwen2.5-32B-Instruct --output-dir /tmp/Qwen2.5-32B-Instruct
 #
 # To launch on 8 devices, run the following command from root:
 # tune run --nnodes 1 --nproc_per_node 8 lora_finetune_distributed --config qwen2_5/32B_lora
@@ -28,13 +28,13 @@ model:
 
 tokenizer:
   _component_: torchtune.models.qwen2_5.qwen2_5_tokenizer
-  path: /tmp/Qwen2_5-32B-Instruct/vocab.json
-  merges_file: /tmp/Qwen2_5-32B-Instruct/merges.txt
+  path: /tmp/Qwen2.5-32B-Instruct/vocab.json
+  merges_file: /tmp/Qwen2.5-32B-Instruct/merges.txt
   max_seq_len: null
 
 checkpointer:
   _component_: torchtune.training.FullModelHFCheckpointer
-  checkpoint_dir: /tmp/Qwen2_5-32B-Instruct
+  checkpoint_dir: /tmp/Qwen2.5-32B-Instruct
   checkpoint_files:
     filename_format: model-{}-of-{}.safetensors
     max_filename: "00017"

Diff for: recipes/configs/qwen2_5/3B_full.yaml (+4, -4)

@@ -3,7 +3,7 @@
 #
 # This config assumes that you've run the following command before launching
 # this run:
-# tune download Qwen/Qwen2.5-3B-Instruct --output-dir /tmp/Qwen2_5-3B-Instruct
+# tune download Qwen/Qwen2.5-3B-Instruct --output-dir /tmp/Qwen2.5-3B-Instruct
 #
 # To launch on 2 devices, run the following command from root:
 # tune run --nnodes 1 --nproc_per_node 2 full_finetune_distributed --config qwen2_5/3B_full
@@ -22,8 +22,8 @@ output_dir: /tmp/torchtune/qwen2_5_3B/full # /tmp may be deleted by your system.
 # Tokenizer
 tokenizer:
   _component_: torchtune.models.qwen2_5.qwen2_5_tokenizer
-  path: /tmp/Qwen2_5-3B-Instruct/vocab.json
-  merges_file: /tmp/Qwen2_5-3B-Instruct/merges.txt
+  path: /tmp/Qwen2.5-3B-Instruct/vocab.json
+  merges_file: /tmp/Qwen2.5-3B-Instruct/merges.txt
   max_seq_len: null
 
 # Dataset
@@ -39,7 +39,7 @@ model:
 
 checkpointer:
   _component_: torchtune.training.FullModelHFCheckpointer
-  checkpoint_dir: /tmp/Qwen2_5-3B-Instruct
+  checkpoint_dir: /tmp/Qwen2.5-3B-Instruct
   checkpoint_files: [
     model-00001-of-00002.safetensors,
     model-00002-of-00002.safetensors,

Diff for: recipes/configs/qwen2_5/3B_full_single_device.yaml (+4, -4)

@@ -3,7 +3,7 @@
 #
 # This config assumes that you've run the following command before launching
 # this run:
-# tune download Qwen/Qwen2.5-3B-Instruct --output-dir /tmp/Qwen2_5-3B-Instruct
+# tune download Qwen/Qwen2.5-3B-Instruct --output-dir /tmp/Qwen2.5-3B-Instruct
 #
 # The default config uses an optimizer from bitsandbytes. If you do not have it installed,
 # you can install it with
@@ -24,8 +24,8 @@ output_dir: /tmp/torchtune/qwen2_5_3B/full_single_device # /tmp may be deleted b
 # Tokenizer
 tokenizer:
   _component_: torchtune.models.qwen2_5.qwen2_5_tokenizer
-  path: /tmp/Qwen2_5-3B-Instruct/vocab.json
-  merges_file: /tmp/Qwen2_5-3B-Instruct/merges.txt
+  path: /tmp/Qwen2.5-3B-Instruct/vocab.json
+  merges_file: /tmp/Qwen2.5-3B-Instruct/merges.txt
   max_seq_len: null
 
 # Dataset
@@ -41,7 +41,7 @@ model:
 
 checkpointer:
   _component_: torchtune.training.FullModelHFCheckpointer
-  checkpoint_dir: /tmp/Qwen2_5-3B-Instruct
+  checkpoint_dir: /tmp/Qwen2.5-3B-Instruct
   checkpoint_files: [
     model-00001-of-00002.safetensors,
     model-00002-of-00002.safetensors,

Diff for: recipes/configs/qwen2_5/3B_lora.yaml (+4, -4)

@@ -3,7 +3,7 @@
 #
 # This config assumes that you've run the following command before launching
 # this run:
-# tune download Qwen/Qwen2.5-3B-Instruct --output-dir /tmp/Qwen2_5-3B-Instruct
+# tune download Qwen/Qwen2.5-3B-Instruct --output-dir /tmp/Qwen2.5-3B-Instruct
 #
 # To launch on 2 devices, run the following command from root:
 # tune run --nnodes 1 --nproc_per_node 2 lora_finetune_distributed --config qwen2_5/3B_lora
@@ -30,13 +30,13 @@ model:
 
 tokenizer:
   _component_: torchtune.models.qwen2_5.qwen2_5_tokenizer
-  path: /tmp/Qwen2_5-3B-Instruct/vocab.json
-  merges_file: /tmp/Qwen2_5-3B-Instruct/merges.txt
+  path: /tmp/Qwen2.5-3B-Instruct/vocab.json
+  merges_file: /tmp/Qwen2.5-3B-Instruct/merges.txt
   max_seq_len: null
 
 checkpointer:
   _component_: torchtune.training.FullModelHFCheckpointer
-  checkpoint_dir: /tmp/Qwen2_5-3B-Instruct
+  checkpoint_dir: /tmp/Qwen2.5-3B-Instruct
   checkpoint_files: [
     model-00001-of-00002.safetensors,
     model-00002-of-00002.safetensors,

Diff for: recipes/configs/qwen2_5/3B_lora_single_device.yaml (+4, -4)

@@ -3,7 +3,7 @@
 #
 # This config assumes that you've run the following command before launching
 # this run:
-# tune download Qwen/Qwen2.5-3B-Instruct --output-dir /tmp/Qwen2_5-3B-Instruct
+# tune download Qwen/Qwen2.5-3B-Instruct --output-dir /tmp/Qwen2.5-3B-Instruct
 #
 # To launch on a single device, run the following command from root:
 # tune run lora_finetune_single_device --config qwen2_5/3B_lora_single_device
@@ -29,13 +29,13 @@ model:
 
 tokenizer:
   _component_: torchtune.models.qwen2_5.qwen2_5_tokenizer
-  path: /tmp/Qwen2_5-3B-Instruct/vocab.json
-  merges_file: /tmp/Qwen2_5-3B-Instruct/merges.txt
+  path: /tmp/Qwen2.5-3B-Instruct/vocab.json
+  merges_file: /tmp/Qwen2.5-3B-Instruct/merges.txt
   max_seq_len: null
 
 checkpointer:
   _component_: torchtune.training.FullModelHFCheckpointer
-  checkpoint_dir: /tmp/Qwen2_5-3B-Instruct
+  checkpoint_dir: /tmp/Qwen2.5-3B-Instruct
   checkpoint_files: [
     model-00001-of-00002.safetensors,
     model-00002-of-00002.safetensors,

Diff for: recipes/configs/qwen2_5/72B_lora.yaml (+4, -4)

@@ -3,7 +3,7 @@
 #
 # This config assumes that you've run the following command before launching
 # this run:
-# tune download Qwen/Qwen2.5-72B-Instruct --output-dir /tmp/Qwen2_5-72B-Instruct
+# tune download Qwen/Qwen2.5-72B-Instruct --output-dir /tmp/Qwen2.5-72B-Instruct
 #
 # To launch on 8 devices, run the following command from root:
 # tune run --nnodes 1 --nproc_per_node 8 lora_finetune_distributed --config qwen2_5/72B_lora
@@ -28,13 +28,13 @@ model:
 
 tokenizer:
   _component_: torchtune.models.qwen2_5.qwen2_5_tokenizer
-  path: /tmp/Qwen2_5-72B-Instruct/vocab.json
-  merges_file: /tmp/Qwen2_5-72B-Instruct/merges.txt
+  path: /tmp/Qwen2.5-72B-Instruct/vocab.json
+  merges_file: /tmp/Qwen2.5-72B-Instruct/merges.txt
   max_seq_len: null
 
 checkpointer:
   _component_: torchtune.training.FullModelHFCheckpointer
-  checkpoint_dir: /tmp/Qwen2_5-72B-Instruct
+  checkpoint_dir: /tmp/Qwen2.5-72B-Instruct
   checkpoint_files:
     filename_format: model-{}-of-{}.safetensors
     max_filename: "00037"

Diff for: recipes/configs/qwen2_5/7B_full.yaml (+4, -4)

@@ -3,7 +3,7 @@
 #
 # This config assumes that you've run the following command before launching
 # this run:
-# tune download Qwen/Qwen2.5-7B-Instruct --output-dir /tmp/Qwen2_5-7B-Instruct
+# tune download Qwen/Qwen2.5-7B-Instruct --output-dir /tmp/Qwen2.5-7B-Instruct
 #
 # To launch on 2 devices, run the following command from root:
 # tune run --nnodes 1 --nproc_per_node 2 full_finetune_distributed --config qwen2_5/7B_full
@@ -22,8 +22,8 @@ output_dir: /tmp/torchtune/qwen2_5_7B/full # /tmp may be deleted by your system.
 # Tokenizer
 tokenizer:
   _component_: torchtune.models.qwen2_5.qwen2_5_tokenizer
-  path: /tmp/Qwen2_5-7B-Instruct/vocab.json
-  merges_file: /tmp/Qwen2_5-7B-Instruct/merges.txt
+  path: /tmp/Qwen2.5-7B-Instruct/vocab.json
+  merges_file: /tmp/Qwen2.5-7B-Instruct/merges.txt
   max_seq_len: null
 
 # Dataset
@@ -39,7 +39,7 @@ model:
 
 checkpointer:
   _component_: torchtune.training.FullModelHFCheckpointer
-  checkpoint_dir: /tmp/Qwen2_5-7B-Instruct
+  checkpoint_dir: /tmp/Qwen2.5-7B-Instruct
   checkpoint_files: [
     model-00001-of-00004.safetensors,
     model-00002-of-00004.safetensors,

Diff for: recipes/configs/qwen2_5/7B_full_single_device.yaml (+4, -4)

@@ -3,7 +3,7 @@
 #
 # This config assumes that you've run the following command before launching
 # this run:
-# tune download Qwen/Qwen2.5-7B-Instruct --output-dir /tmp/Qwen2_5-7B-Instruct
+# tune download Qwen/Qwen2.5-7B-Instruct --output-dir /tmp/Qwen2.5-7B-Instruct
 #
 # The default config uses an optimizer from bitsandbytes. If you do not have it installed,
 # you can install it with
@@ -24,8 +24,8 @@ output_dir: /tmp/torchtune/qwen2_5_7B/full_single_device # /tmp may be deleted b
 # Tokenizer
 tokenizer:
   _component_: torchtune.models.qwen2_5.qwen2_5_tokenizer
-  path: /tmp/Qwen2_5-7B-Instruct/vocab.json
-  merges_file: /tmp/Qwen2_5-7B-Instruct/merges.txt
+  path: /tmp/Qwen2.5-7B-Instruct/vocab.json
+  merges_file: /tmp/Qwen2.5-7B-Instruct/merges.txt
   max_seq_len: null
 
 # Dataset
@@ -41,7 +41,7 @@ model:
 
 checkpointer:
   _component_: torchtune.training.FullModelHFCheckpointer
-  checkpoint_dir: /tmp/Qwen2_5-7B-Instruct
+  checkpoint_dir: /tmp/Qwen2.5-7B-Instruct
   checkpoint_files: [
     model-00001-of-00004.safetensors,
     model-00002-of-00004.safetensors,

Diff for: recipes/configs/qwen2_5/7B_lora.yaml (+4, -4)

@@ -3,7 +3,7 @@
 #
 # This config assumes that you've run the following command before launching
 # this run:
-# tune download Qwen/Qwen2.5-7B-Instruct --output-dir /tmp/Qwen2_5-7B-Instruct
+# tune download Qwen/Qwen2.5-7B-Instruct --output-dir /tmp/Qwen2.5-7B-Instruct
 #
 # To launch on 2 devices, run the following command from root:
 # tune run --nnodes 1 --nproc_per_node 2 lora_finetune_distributed --config qwen2_5/7B_lora
@@ -31,13 +31,13 @@ model:
 
 tokenizer:
   _component_: torchtune.models.qwen2_5.qwen2_5_tokenizer
-  path: /tmp/Qwen2_5-7B-Instruct/vocab.json
-  merges_file: /tmp/Qwen2_5-7B-Instruct/merges.txt
+  path: /tmp/Qwen2.5-7B-Instruct/vocab.json
+  merges_file: /tmp/Qwen2.5-7B-Instruct/merges.txt
   max_seq_len: null
 
 checkpointer:
   _component_: torchtune.training.FullModelHFCheckpointer
-  checkpoint_dir: /tmp/Qwen2_5-7B-Instruct
+  checkpoint_dir: /tmp/Qwen2.5-7B-Instruct
   checkpoint_files: [
     model-00001-of-00004.safetensors,
     model-00002-of-00004.safetensors,

Diff for: recipes/configs/qwen2_5/7B_lora_single_device.yaml (+4, -4)

@@ -3,7 +3,7 @@
 #
 # This config assumes that you've run the following command before launching
 # this run:
-# tune download Qwen/Qwen2.5-7B-Instruct --output-dir /tmp/Qwen2_5-7B-Instruct
+# tune download Qwen/Qwen2.5-7B-Instruct --output-dir /tmp/Qwen2.5-7B-Instruct
 #
 # To launch on a single device, run the following command from root:
 # tune run lora_finetune_single_device --config qwen2_5/7B_lora_single_device
@@ -30,13 +30,13 @@ model:
 
 tokenizer:
   _component_: torchtune.models.qwen2_5.qwen2_5_tokenizer
-  path: /tmp/Qwen2_5-7B-Instruct/vocab.json
-  merges_file: /tmp/Qwen2_5-7B-Instruct/merges.txt
+  path: /tmp/Qwen2.5-7B-Instruct/vocab.json
+  merges_file: /tmp/Qwen2.5-7B-Instruct/merges.txt
   max_seq_len: null
 
 checkpointer:
   _component_: torchtune.training.FullModelHFCheckpointer
-  checkpoint_dir: /tmp/Qwen2_5-7B-Instruct
+  checkpoint_dir: /tmp/Qwen2.5-7B-Instruct
   checkpoint_files: [
     model-00001-of-00004.safetensors,
     model-00002-of-00004.safetensors,

Diff for: recipes/configs/qwen2_5/evaluation.yaml (+3, -3)

@@ -11,7 +11,7 @@ model:
 
 checkpointer:
   _component_: torchtune.training.FullModelHFCheckpointer
-  checkpoint_dir: /tmp/Qwen2_5-0_5B-Instruct
+  checkpoint_dir: /tmp/Qwen2.5-0_5B-Instruct
   checkpoint_files: [
     model.safetensors,
   ]
@@ -21,8 +21,8 @@ checkpointer:
 # Tokenizer
 tokenizer:
   _component_: torchtune.models.qwen2_5.qwen2_5_tokenizer
-  path: /tmp/Qwen2_5-0_5B-Instruct/vocab.json
-  merges_file: /tmp/Qwen2_5-0_5B-Instruct/merges.txt
+  path: /tmp/Qwen2.5-0_5B-Instruct/vocab.json
+  merges_file: /tmp/Qwen2.5-0_5B-Instruct/merges.txt
   max_seq_len: null
 
 # Environment

Diff for: torchtune/modules/attention_utils.py (+20, -1)

@@ -22,7 +22,26 @@
         flex_attention,
     )
 
-    flex_attention_compiled = torch.compile(flex_attention, dynamic=False)
+    def compile_flex_attention():
+        try:
+            return torch.compile(flex_attention, dynamic=False)
+        except Exception as e:
+            # It may fail on some combinations of hardware/versions. Using max-autotune fixes this issue.
+            # Context: https://github.com/pytorch/torchtune/issues/2113
+            _log.info(
+                f"Compiling flex_attention failed with error '{e}'. Retrying with mode='max-autotune'."
+            )
+            try:
+                return torch.compile(flex_attention, dynamic=False, mode="max-autotune")
+            except Exception as e:
+                _log.info(
+                    f"Compiling flex_attention failed with error: '{e}', "
+                    "Updating your pytorch version to nightlies may solve it, or you can set"
+                    "in your config dataset.packed=False to avoid using flex attention."
+                )
+                raise
+
+    flex_attention_compiled = compile_flex_attention()
 
     # We cannot do nested compile, but flex attention only has perf benefits
     # when compiled. To insulate it from the compiler, we wrap it with
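
The change above makes the flex-attention compile step fault-tolerant: try the default torch.compile first, and only fall back to mode='max-autotune' when that fails (see pytorch/torchtune#2113). Below is a minimal, self-contained sketch of the same guarded-compile pattern applied to a toy scaled-dot-product attention function; toy_attention and compile_with_fallback are illustrative names, not torchtune APIs.

    import logging

    import torch

    _log = logging.getLogger(__name__)


    def toy_attention(q, k, v):
        # Stand-in for flex_attention: plain scaled dot-product attention.
        scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
        return torch.softmax(scores, dim=-1) @ v


    def compile_with_fallback(fn):
        # First attempt: default compile mode with static shapes, as in the commit.
        try:
            return torch.compile(fn, dynamic=False)
        except Exception as e:
            # Some hardware/version combinations fail here; retrying with
            # max-autotune mirrors the fallback introduced above.
            _log.info(f"torch.compile failed with '{e}'; retrying with mode='max-autotune'")
            return torch.compile(fn, dynamic=False, mode="max-autotune")


    compiled_attention = compile_with_fallback(toy_attention)
    q = k = v = torch.randn(2, 4, 16, 8)  # (batch, heads, seq_len, head_dim)
    out = compiled_attention(q, k, v)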
