This repository contains a reference implementation of SAMI (Self-Supervised Alignment with Mutual Information) using the TL;DR dataset.
- Set up a conda environment (we used `python==3.10.0`) and install the required dependencies by running `pip install -e .`.
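  For example, a minimal setup sketch (the environment name `sami` here is our own choice, not something the repository prescribes):

  ```bash
  # create and activate a fresh environment with the Python version used in the paper
  conda create -n sami python=3.10.0
  conda activate sami

  # install the repository and its dependencies in editable mode (run from the repo root)
  pip install -e .
  ```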
- Adjust the `experiments/tldr/config/generate.yaml` config file to match your directories and desired configurations. Example constitutions using principles written by `mistral-7b` and `claude-opus` are provided in `constitutions_mistral` and `constitutions_opus`.
- Navigate to `experiments/tldr` and run `python generate.py` to generate your own data. By default, the generated data will be stored in `experiments/tldr/data/base`. Note that this directory is already populated with the data used in the paper, so you can skip this step and finetune a model directly if you prefer.
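  For example:

  ```bash
  cd experiments/tldr
  python generate.py  # writes generated data to data/base by default
  ```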
- Select a model configuration (e.g., `mistral-7b`) from the `experiments/tldr/conf/model` directory and update the `cache_dir` accordingly (e.g., `/scr/YOUR_USERNAME/sami/checkpoints`).
- Adjust the `experiments/tldr/conf/train_sami.yaml` config as needed, including optional wandb logging. If you set `log: true`, make sure you have a wandb account and are logged in (e.g., via `wandb login`).
- Navigate to `experiments/tldr` and run training as an interactive job using the command below, or adapt the example slurm script to meet your computing needs and submit it using `sbatch` (or modify the script to be a standard bash script and submit it from, e.g., a `tmux` window).
```bash
python train.py \
    training.beta=0.0 \
    wandb.name="$YOUR_WANDB_NAME" \
    training.checkpoint_dir="$YOUR_CHECKPOINT_DIR" \
    training.lr=5e-7 \
    data_path="data/base" \
    data_file="base_mistral_from_mistral_principles.json" \
    n_examples=2000
```
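The dotted arguments are config overrides: each one replaces the corresponding field in `train_sami.yaml` (e.g., `training.lr=5e-7` overrides the learning rate).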
- Adjust the `experiments/tldr/config/evaluate.yaml` configuration, navigate to `experiments/tldr`, and run `python evaluate.py`. This will write the generated responses into `experiments/tldr/results/responses`.
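  For example:

  ```bash
  cd experiments/tldr
  python evaluate.py
  ls results/responses  # generated responses are written here
  ```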
- Compute win rates by adjusting the `experiments/tldr/config/win_rates.yaml` configuration and running `python win_rates.py` from the same directory. Note that this script currently uses Azure, so if you don't have access to GPT-4 via Azure, you might have to copy `/scr/models/openai_models/azure.py` and create your own `AsyncOpenAI` class. FYI: we used the `gpt-4-0613` snapshot for all evaluations.
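If you need to replace the Azure client, a minimal sketch of a wrapper around the standard `AsyncOpenAI` client from the `openai` v1.x SDK might look like the following (the class name, method, and prompts are illustrative only; you would still need to match whatever interface `win_rates.py` expects):

```python
# Illustrative minimal replacement for the Azure client: a thin wrapper around
# the standard (non-Azure) AsyncOpenAI client from the `openai` v1.x SDK.
# Class and method names here are our own, not part of the repository.
import asyncio
import os

from openai import AsyncOpenAI


class SimpleAsyncChatClient:
    """Bare-bones async chat-completion wrapper (illustrative only)."""

    def __init__(self, model: str = "gpt-4-0613"):
        self.model = model
        self.client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

    async def complete(self, system_prompt: str, user_prompt: str) -> str:
        response = await self.client.chat.completions.create(
            model=self.model,
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_prompt},
            ],
        )
        return response.choices[0].message.content


async def main() -> None:
    client = SimpleAsyncChatClient()
    answer = await client.complete(
        "You are a helpful judge.",
        "Which of the two summaries below is better? ...",
    )
    print(answer)


if __name__ == "__main__":
    asyncio.run(main())
```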
If you don't have access to GPUs, you can attempt to run training using `experiments/tldr/conf/model/mistral_tiny_base`, which we tested locally on an Apple M2 Pro (2023 MacBook Pro with 16 GB of memory).
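For example (assuming the model config is selected via a `model=` override, which is our reading of the config layout rather than a documented flag):

```bash
cd experiments/tldr
python train.py model=mistral_tiny_base
```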
The `SAMITrainer` and `train.py` use FSDP (`FullyShardedDataParallel`). To learn more about FSDP, you may find the PyTorch FSDP tutorial series and the DDP tutorial series helpful.
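For orientation, here is a minimal, generic FSDP sketch (a toy model and a single training step of our own, not the repository's actual trainer code):

```python
# Generic FSDP sketch (illustrative only). Launch with, e.g.:
#   torchrun --nproc_per_node=2 fsdp_sketch.py
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP


def main() -> None:
    # torchrun sets RANK, WORLD_SIZE, and LOCAL_RANK for each process
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # toy model standing in for a language model
    model = nn.Sequential(
        nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512)
    ).cuda()

    # shard parameters, gradients, and optimizer state across ranks
    model = FSDP(model)
    optimizer = torch.optim.AdamW(model.parameters(), lr=5e-7)

    # one dummy training step
    batch = torch.randn(8, 512, device="cuda")
    loss = model(batch).pow(2).mean()
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```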
If you found this work useful, please cite:
```bibtex
@article{fränken2024selfsupervised,
      title={Self-Supervised Alignment with Mutual Information: Learning to Follow Principles without Preference Labels},
      author={Jan-Philipp Fränken and Eric Zelikman and Rafael Rafailov and Kanishk Gandhi and Tobias Gerstenberg and Noah D. Goodman},
      year={2024},
      eprint={2404.14313},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```