Commit 7bb7f23

[float8] Support passing extra args to benchmarking script (#1961)
1 parent fe5bf73 commit 7bb7f23

2 files changed: 3 additions, 1 deletion

benchmarks/float8/training/README.md

+1
@@ -14,5 +14,6 @@ Training parameters can be configured via environment variables.
 - `FLOAT8_RECIPE_WITH_BEST_SETTINGS`: "rowwise" or "tensorwise". Applies float8 training with the specified scaling recipe, as well as additional training configs which are optimal for that scaling recipe. See `float8_training_benchmark.sh` for more details.
 - `BATCH_SIZE`: Defaults to 1.
 - `STEPS`: Defaults to 100.
+- `EXTRA_ARGS`: Extra arguments to pass to torchtitan training script. See [torchtitan](https://github.com/pytorch/torchtitan) docs for the full list of options.

 **NOTE**: `torch.compile` and FSDP2 are always used. Other forms of parallelism supported in torchtitan are not yet supported in this script.

benchmarks/float8/training/float8_training_benchmark.sh

+2-1
@@ -22,6 +22,7 @@ if [ -z "${TORCHTITAN_ROOT}" ]; then
 echo " * FLOAT8_RECIPE_WITH_BEST_SETTINGS: "rowwise" or "tensorwise". if set, use float8 training in torchtitan with the specified recipe, including the additional settings which are optimal for that recipe. otherwise, use bf16 mixed precision training."
 echo " * BATCH_SIZE: defaults to 1."
 echo " * STEPS: defaults to 100."
+echo " * EXTRA_ARGS: additional arguments to pass to the torchtitan training script."
 exit 1
 fi

@@ -44,7 +45,7 @@ cd ${TORCHTITAN_ROOT}
 echo "float8 args: ${FLOAT8_ARGS}"

 # run the command with the specified arguments
-CONFIG_FILE="./torchtitan/models/llama/train_configs/llama3_8b.toml" ${TORCHTITAN_ROOT}/run_train.sh --training.steps=${STEPS} --training.batch_size=${BATCH_SIZE} --training.compile ${FLOAT8_ARGS} 2>&1 | tee ${LOG_FILE}
+CONFIG_FILE="./torchtitan/models/llama/train_configs/llama3_8b.toml" ${TORCHTITAN_ROOT}/run_train.sh --training.steps=${STEPS} --training.batch_size=${BATCH_SIZE} --training.compile ${FLOAT8_ARGS} ${EXTRA_ARGS} 2>&1 | tee ${LOG_FILE}

 # return to original working directory
 cd $original_dir
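
The changed command line expands `${EXTRA_ARGS}` without quotes, so the shell splits its value on whitespace before `run_train.sh` sees it. The sketch below illustrates that behavior in isolation. It is not part of the commit: `count_args` is a hypothetical stand-in for `run_train.sh`, and the flag values are assumptions chosen for illustration, so check the torchtitan docs for the options your version accepts.

```shell
#!/usr/bin/env bash
# Illustrative only: count_args is a hypothetical stand-in for run_train.sh
# that reports how many arguments it received.
count_args() {
  echo "$#"
}

# Hypothetical flag values, chosen for illustration; they are assumptions,
# not options confirmed by this commit.
EXTRA_ARGS="--training.seq_len=4096 --metrics.log_freq=5"

# Unquoted, as in the benchmark script: the string is split on whitespace,
# so each flag arrives as its own argument.
unquoted_count="$(count_args ${EXTRA_ARGS})"

# Quoted: the whole string arrives as a single argument, which an option
# parser would not see as two separate flags.
quoted_count="$(count_args "${EXTRA_ARGS}")"

# Empty (or unset): the unquoted expansion contributes no arguments at all,
# so omitting EXTRA_ARGS leaves the original command line unchanged.
EMPTY_ARGS=""
empty_count="$(count_args ${EMPTY_ARGS})"

echo "unquoted=${unquoted_count} quoted=${quoted_count} empty=${empty_count}"
```

One consequence of this design is that a single option value containing spaces cannot be passed through `EXTRA_ARGS`, because the split happens on every whitespace run. For flags of the `--section.option=value` form used elsewhere in the script, that limitation should rarely matter.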
