[CHAPTER] New chapter on supervised fine tuning based on smol course #777

Merged · 31 commits · Feb 17, 2025
7bc134b
initial copy from smol-course
burtenshaw Jan 29, 2025
995493b
convert smol course material into nlp course style
burtenshaw Jan 30, 2025
beec8b5
review text and read through
burtenshaw Jan 30, 2025
4cb5f93
add links to colab
burtenshaw Feb 5, 2025
f7fc25d
add quiz app
burtenshaw Feb 5, 2025
564d9ec
add toc
burtenshaw Feb 5, 2025
edcf049
format code blocks
burtenshaw Feb 5, 2025
267c171
combine pages together and add extra guidance
burtenshaw Feb 5, 2025
a9847d0
update toc and format snippets
burtenshaw Feb 5, 2025
82b1d4a
update structure
burtenshaw Feb 5, 2025
549612b
followinf readthrough: simplify and add more tips
burtenshaw Feb 6, 2025
881865e
format code blocks
burtenshaw Feb 6, 2025
a386bbf
suggestions in intro page
burtenshaw Feb 11, 2025
6cefbc9
respond to suggestions on chat templates page
burtenshaw Feb 11, 2025
3b7cc5a
Update chapters/en/chapter11/3.mdx
burtenshaw Feb 11, 2025
c6800f1
Update chapters/en/chapter11/5.mdx
burtenshaw Feb 11, 2025
844d715
Update chapters/en/chapter11/5.mdx
burtenshaw Feb 11, 2025
7d0519c
Merge branch 'add-supervised-finetuning' of https://github.com/burten…
burtenshaw Feb 11, 2025
3f9815c
respond to suggestions in SFT page
burtenshaw Feb 11, 2025
c47a5a5
improve loss illustrations on sft page
burtenshaw Feb 11, 2025
d66fa86
respond to feedback in chat template
burtenshaw Feb 12, 2025
21c8dd1
respond to feedback on sft section
burtenshaw Feb 12, 2025
5d4025d
respond to feedback on lora section
burtenshaw Feb 12, 2025
e0ecc8c
respond to feedback in unit 5
burtenshaw Feb 12, 2025
2c30171
update toc with new tag and subtitle
burtenshaw Feb 14, 2025
cc7ddee
improve intro congruency with previous chapters
burtenshaw Feb 14, 2025
f040b6c
make chat templates more about structure
burtenshaw Feb 14, 2025
6d2a54c
add packing and references to the sft section
burtenshaw Feb 14, 2025
3a2ee3c
fix qlora mistake in lora page
burtenshaw Feb 14, 2025
a02b2d2
add more benchmarks to evaluation section
burtenshaw Feb 14, 2025
c74ebd3
add final quizzes to quiz section
burtenshaw Feb 17, 2025
18 changes: 18 additions & 0 deletions chapters/en/_toctree.yml
@@ -210,6 +210,24 @@
    title: End-of-chapter quiz
    quiz: 10

- title: 11. Supervised fine-tuning
  sections:
  - local: chapter11/1
    title: Introduction
  - local: chapter11/2
    title: Chat Templates
  - local: chapter11/3
    title: Fine-Tuning with SFTTrainer
  - local: chapter11/4
    title: LoRA (Low-Rank Adaptation)
  - local: chapter11/5
    title: Evaluation
  - local: chapter11/6
    title: Conclusion
  - local: chapter11/7
    title: Exam Time!
    quiz: 11

- title: Course Events
sections:
- local: events/1
33 changes: 33 additions & 0 deletions chapters/en/chapter11/1.mdx
@@ -0,0 +1,33 @@
# Supervised Fine-Tuning

This chapter will introduce fine-tuning generative language models with supervised fine-tuning (SFT). SFT involves adapting pre-trained models to specific tasks by further training them on task-specific datasets, which improves their performance on targeted tasks. We will separate this chapter into four sections:

## 1️⃣ Chat Templates

Chat templates structure interactions between users and AI models, ensuring consistent and contextually appropriate responses. They include components like system prompts and role-based messages.

## 2️⃣ Supervised Fine-Tuning

Supervised Fine-Tuning (SFT) is a critical process for adapting pre-trained language models to specific tasks. It involves training the model on a task-specific dataset with labeled examples. This section offers a detailed guide to SFT, including key steps and best practices.
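
As a preview of what this looks like in practice, here is a minimal sketch using TRL's `SFTTrainer`; the dataset, model, and hyperparameters are illustrative placeholders rather than recommendations:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Illustrative choices: a small chat dataset and a small base model
dataset = load_dataset("HuggingFaceTB/smoltalk", "everyday-conversations", split="train")

trainer = SFTTrainer(
    model="HuggingFaceTB/SmolLM2-135M",  # SFTTrainer accepts a Hub model id
    train_dataset=dataset,
    args=SFTConfig(output_dir="./sft-output", max_steps=100),
)
trainer.train()
```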

## 3️⃣ Low Rank Adaptation (LoRA)

Low Rank Adaptation (LoRA) is a technique for fine-tuning language models by adding low-rank matrices to the model's layers. This allows for efficient fine-tuning while preserving the model's pre-trained knowledge.
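
For a taste of the mechanics, here is a minimal sketch using the 🤗 PEFT library; the rank and target modules are illustrative assumptions and depend on the model architecture:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-135M")

# Inject trainable low-rank adapters into the attention projections;
# the original weights stay frozen
lora_config = LoraConfig(
    r=8,  # rank of the low-rank update matrices
    lora_alpha=16,  # scaling factor applied to the adapter output
    target_modules=["q_proj", "v_proj"],  # assumed module names; vary by model
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of parameters train
```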

## 4️⃣ Evaluation

Evaluation is a crucial step in the fine-tuning process. It allows us to measure the performance of the model on a task-specific dataset.

<Tip>
⚠️ In order to benefit from all features available with the Model Hub and 🤗 Transformers, we recommend <a href="https://huggingface.co/join">creating an account</a>.
</Tip>

## References

- [Transformers documentation on chat templates](https://huggingface.co/docs/transformers/main/en/chat_templating)
- [Script for Supervised Fine-Tuning in TRL](https://github.com/huggingface/trl/blob/main/examples/scripts/sft.py)
- [`SFTTrainer` in TRL](https://huggingface.co/docs/trl/main/en/sft_trainer)
- [Direct Preference Optimization Paper](https://arxiv.org/abs/2305.18290)
- [Supervised Fine-Tuning with TRL](https://huggingface.co/docs/trl/main/en/tutorials/supervised_finetuning)
- [How to fine-tune Google Gemma with ChatML and Hugging Face TRL](https://www.philschmid.de/fine-tune-google-gemma)
- [Fine-tuning LLM to Generate Persian Product Catalogs in JSON Format](https://huggingface.co/learn/cookbook/en/fine_tuning_llm_to_generate_persian_product_catalogs_in_json_format)
257 changes: 257 additions & 0 deletions chapters/en/chapter11/2.mdx
@@ -0,0 +1,257 @@
<CourseFloatingBanner chapter={11}
classNames="absolute z-10 right-0 top-0"
notebooks={[
{label: "Google Colab", value: "https://colab.research.google.com/github/huggingface/smol-course/blob/main/1_instruction_tuning/notebooks/chat_templates_example.ipynb"},
]} />

# Chat Templates

## Introduction
Chat templates are essential for structuring interactions between language models and users. They provide a consistent format for conversations, ensuring that models understand the context and role of each message while maintaining appropriate response patterns.

<Tip>
Chat templates are crucial for:
- Maintaining consistent conversation structure
- Ensuring proper role identification
- Managing context across multiple turns
- Supporting advanced features like tool use
</Tip>

## Model Types and Templates

### Base Models vs Instruct Models
A base model is trained on raw text data to predict the next token, while an instruct model is fine-tuned specifically to follow instructions and engage in conversations. For example, `SmolLM2-135M` is a base model, while `SmolLM2-135M-Instruct` is its instruction-tuned variant.

To make a base model behave like an instruct model, we need to format our prompts in a consistent way that the model can understand. This is where chat templates come in. ChatML is one such template format that structures conversations with clear role indicators (system, user, assistant).

<Tip warning={true}>
When using an instruct model, always verify you're using the correct chat template format. Using the wrong template can result in poor model performance or unexpected behavior.
</Tip>
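
To make this concrete, here is a minimal sketch that attaches a ChatML template to a base model's tokenizer (it assumes the base `HuggingFaceTB/SmolLM2-135M` checkpoint ships without a chat template of its own):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M")

# A minimal ChatML Jinja template: each message becomes
# <|im_start|>{role}\n{content}<|im_end|>\n
tokenizer.chat_template = (
    "{% for message in messages %}"
    "{{ '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n' }}"
    "{% endfor %}"
    "{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}"
)

messages = [{"role": "user", "content": "Hello!"}]
print(tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
```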

### Common Template Formats

Different models use different chat template formats. Here's how the same conversation would be formatted for a few different models:

We'll use the following conversation structure for all examples:

```python
messages = [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hello!"},
{"role": "assistant", "content": "Hi! How can I help you today?"},
{"role": "user", "content": "What's the weather?"},
]
```

Here is the same conversation using the Mistral instruct template, which folds the system message into the first user instruction:

```sh
<s>[INST] You are a helpful assistant.

Hello! [/INST] Hi! How can I help you today?</s>[INST] What's the weather? [/INST]
```

This is the chat template for a Qwen 2 model:

```sh
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Hello!<|im_end|>
<|im_start|>assistant
Hi! How can I help you today?<|im_end|>
<|im_start|>user
What's the weather?<|im_end|>
<|im_start|>assistant
```

Key differences between these formats include:
1. **System Message Handling**:
   - Llama 2 wraps system messages in `<<SYS>>` tags inside the first `[INST]` block
   - Llama 3 uses a `system` header (`<|start_header_id|>system<|end_header_id|>`) ended by `<|eot_id|>`
   - Mistral folds the system message into the first instruction
   - Qwen uses an explicit `system` role with `<|im_start|>` tags
   - ChatGPT uses a `SYSTEM:` prefix

2. **Message Boundaries**:
   - Llama 2 uses `[INST]` and `[/INST]` tags
   - Llama 3 uses role-specific headers (`<|start_header_id|>...<|end_header_id|>`) with `<|eot_id|>` endings
   - Mistral uses `[INST]` and `[/INST]` with `<s>` and `</s>`
   - Qwen uses role-specific start/end tokens

3. **Special Tokens**:
   - Llama 2 uses `<s>` and `</s>` for conversation boundaries
   - Llama 3 uses `<|eot_id|>` to end each message
   - Mistral uses `<s>` and `</s>` for turn boundaries
   - Qwen uses role-specific start/end tokens (`<|im_start|>`, `<|im_end|>`)

The transformers library handles these differences through model-specific chat templates. When you load a tokenizer, it automatically uses the correct template for that model:

```python
from transformers import AutoTokenizer

# These will use different templates automatically
llama_tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
mistral_tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
qwen_tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B-Chat")

messages = [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hello!"},
]

# Each will format according to its model's template
llama_chat = llama_tokenizer.apply_chat_template(messages, tokenize=False)
mistral_chat = mistral_tokenizer.apply_chat_template(messages, tokenize=False)
qwen_chat = qwen_tokenizer.apply_chat_template(messages, tokenize=False)
```
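
Printing any of these strings (for example, `print(qwen_chat)`) is a quick way to inspect exactly what each model will see. Note that gated checkpoints such as `meta-llama/Llama-2-7b-chat-hf` require accepting the license on the Hub and logging in before the tokenizer can be downloaded.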

## Working with Templates

### Basic Implementation

The transformers library provides built-in support for chat templates through the `apply_chat_template()` method:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct")

messages = [
{"role": "system", "content": "You are a helpful coding assistant."},
{"role": "user", "content": "Write a Python function to sort a list"},
]

# Apply the chat template
formatted_chat = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
```

This will return a formatted string that looks like:

```sh
<|im_start|>system
You are a helpful coding assistant.<|im_end|>
<|im_start|>user
Write a Python function to sort a list<|im_end|>
<|im_start|>assistant
```

Because we passed `add_generation_prompt=True`, the string ends by opening an assistant turn, which cues the model to respond next.
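
To see the template end to end, a sketch like the following feeds the formatted prompt back into the matching model (the model choice and generation settings are illustrative):

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct")

# The template already contains all special tokens, so don't add them again
inputs = tokenizer(formatted_chat, return_tensors="pt", add_special_tokens=False)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=128)

# Decode only the newly generated tokens, skipping the prompt we fed in
response = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1] :], skip_special_tokens=True
)
print(response)
```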

### Advanced Features
Chat templates can handle more complex scenarios, including:

1. **Tool Use**: When models need to interact with external tools or APIs
2. **Multimodal Inputs**: For handling images, audio, or other media types
3. **Function Calling**: For structured function execution
4. **Multi-turn Context**: For maintaining conversation history

<Tip>
When implementing advanced features:
- Test thoroughly with your specific model
- Handle errors gracefully
- Monitor token usage carefully
- Document the expected format for each feature
</Tip>

For multimodal conversations, chat templates can include image references or base64-encoded images:

```python
messages = [
    {
        "role": "system",
        "content": "You are a helpful vision assistant that can analyze images.",
    },
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image", "image_url": "https://example.com/image.jpg"},
        ],
    },
]
```

Here's an example of a chat template with tool use:

```python
messages = [
    {
        "role": "system",
        "content": "You are an AI assistant that can use tools. Available tools: calculator, weather_api",
    },
    {"role": "user", "content": "What's 123 * 456 and is it raining in Paris?"},
    {
        "role": "assistant",
        "content": "Let me help you with that.",
        "tool_calls": [
            {
                "tool": "calculator",
                "parameters": {"operation": "multiply", "x": 123, "y": 456},
            },
            {"tool": "weather_api", "parameters": {"city": "Paris", "country": "France"}},
        ],
    },
    {"role": "tool", "tool_name": "calculator", "content": "56088"},
    {
        "role": "tool",
        "tool_name": "weather_api",
        "content": "{'condition': 'rain', 'temperature': 15}",
    },
]
```
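
Recent versions of 🤗 Transformers can also render tool definitions for you: `apply_chat_template` accepts a `tools` argument that turns Python functions, via their type hints and docstrings, into the model's tool-definition format. A minimal sketch, assuming the tokenizer's template supports tools:

```python
def multiply(x: int, y: int) -> int:
    """Multiply two integers.

    Args:
        x: The first integer.
        y: The second integer.
    """
    return x * y


messages = [{"role": "user", "content": "What's 123 * 456?"}]

# The function's signature and docstring are rendered into the prompt
prompt = tokenizer.apply_chat_template(
    messages, tools=[multiply], tokenize=False, add_generation_prompt=True
)
```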

## Best Practices

### General Guidelines
When working with chat templates, follow these key practices:

1. **Consistent Formatting**: Always use the same template format throughout your application
2. **Clear Role Definition**: Clearly specify roles (system, user, assistant, tool) for each message
3. **Context Management**: Be mindful of token limits when maintaining conversation history
4. **Error Handling**: Include proper error handling for tool calls and multimodal inputs
5. **Validation**: Validate message structure before sending to the model (a sketch follows below)

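As a sketch of point 5, a lightweight validator might check roles and content before every call; the allowed role set here is an assumption, so adjust it to your template:

```python
ALLOWED_ROLES = {"system", "user", "assistant", "tool"}  # assumed role set


def validate_messages(messages):
    """Raise ValueError if the message list is not template-ready."""
    if not messages:
        raise ValueError("Message list is empty")
    for i, message in enumerate(messages):
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"Message {i} has invalid role: {message.get('role')!r}")
        if "content" not in message:
            raise ValueError(f"Message {i} is missing 'content'")
    return messages
```
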
<Tip warning={true}>
Common pitfalls to avoid:
- Mixing different template formats in the same application
- Exceeding token limits with long conversation histories
- Not properly escaping special characters in messages
- Forgetting to validate input message structure
- Ignoring model-specific template requirements
</Tip>

## Hands-on Exercise

Let's practice implementing chat templates with a real-world example.

<Tip>
Follow these steps to convert the `HuggingFaceTB/smoltalk` dataset into chatml format:

1. Load the dataset (smoltalk ships several configs, so pick a subset; `everyday-conversations` is a small one):
```python
from datasets import load_dataset

dataset = load_dataset("HuggingFaceTB/smoltalk", "everyday-conversations")
```

2. Create a processing function:
```python
def convert_to_chatml(example):
    # Assumes columns named "input" and "output"; adjust the keys to the
    # schema of the subset you loaded
    return {
        "messages": [
            {"role": "user", "content": example["input"]},
            {"role": "assistant", "content": example["output"]},
        ]
    }
```

3. Apply the chat template using your chosen model's tokenizer (a sketch follows this tip)

Remember to validate your output format matches your target model's requirements!
</Tip>
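
Under the assumptions above (an `input`/`output` schema and the `SmolLM2-135M-Instruct` tokenizer), the full pipeline might fit together like this sketch:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct")


def apply_template(example):
    # Render the messages into a single training-ready string
    example["text"] = tokenizer.apply_chat_template(example["messages"], tokenize=False)
    return example


dataset = dataset.map(convert_to_chatml)
dataset = dataset.map(apply_template)
print(dataset["train"][0]["text"])
```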

## Additional Resources

- [Hugging Face Chat Templating Guide](https://huggingface.co/docs/transformers/main/en/chat_templating)
- [Transformers Documentation](https://huggingface.co/docs/transformers)
- [Chat Templates Examples Repository](https://github.com/chujiezheng/chat_templates)