[CHAPTER] New chapter on supervised fine tuning based on smol course #777

Merged · 31 commits · Feb 17, 2025
7bc134b
initial copy from smol-course
burtenshaw Jan 29, 2025
995493b
convert smol course material into nlp course style
burtenshaw Jan 30, 2025
beec8b5
review text and read through
burtenshaw Jan 30, 2025
4cb5f93
add links to colab
burtenshaw Feb 5, 2025
f7fc25d
add quiz app
burtenshaw Feb 5, 2025
564d9ec
add toc
burtenshaw Feb 5, 2025
edcf049
format code blocks
burtenshaw Feb 5, 2025
267c171
combine pages together and add extra guidance
burtenshaw Feb 5, 2025
a9847d0
update toc and format snippets
burtenshaw Feb 5, 2025
82b1d4a
update structure
burtenshaw Feb 5, 2025
549612b
followinf readthrough: simplify and add more tips
burtenshaw Feb 6, 2025
881865e
format code blocks
burtenshaw Feb 6, 2025
a386bbf
suggestions in intro page
burtenshaw Feb 11, 2025
6cefbc9
respond to suggestions on chat templates page
burtenshaw Feb 11, 2025
3b7cc5a
Update chapters/en/chapter11/3.mdx
burtenshaw Feb 11, 2025
c6800f1
Update chapters/en/chapter11/5.mdx
burtenshaw Feb 11, 2025
844d715
Update chapters/en/chapter11/5.mdx
burtenshaw Feb 11, 2025
7d0519c
Merge branch 'add-supervised-finetuning' of https://github.com/burten…
burtenshaw Feb 11, 2025
3f9815c
respond to suggestions in SFT page
burtenshaw Feb 11, 2025
c47a5a5
improve loss illustrations on sft page
burtenshaw Feb 11, 2025
d66fa86
respond to feedback in chat template
burtenshaw Feb 12, 2025
21c8dd1
respond to feedback on sft section
burtenshaw Feb 12, 2025
5d4025d
respond to feedback on lora section
burtenshaw Feb 12, 2025
e0ecc8c
respond to feedback in unit 5
burtenshaw Feb 12, 2025
2c30171
update toc with new tag and subtitle
burtenshaw Feb 14, 2025
cc7ddee
improve intro congruency with previous chapters
burtenshaw Feb 14, 2025
f040b6c
make chat templates more about structure
burtenshaw Feb 14, 2025
6d2a54c
add packing and references to the sft section
burtenshaw Feb 14, 2025
3a2ee3c
fix qlora mistake in lora page
burtenshaw Feb 14, 2025
a02b2d2
add more benchmarks to evaluation section
burtenshaw Feb 14, 2025
c74ebd3
add final quizzes to quiz section
burtenshaw Feb 17, 2025
18 changes: 18 additions & 0 deletions chapters/en/_toctree.yml
@@ -210,6 +210,24 @@
    title: End-of-chapter quiz
    quiz: 10

- title: 11. Supervised fine-tuning
  sections:
  - local: chapter11/1
    title: Introduction
  - local: chapter11/2
    title: Chat Templates
  - local: chapter11/3
    title: Fine-Tuning with SFTTrainer
  - local: chapter11/4
    title: LoRA (Low-Rank Adaptation)
  - local: chapter11/5
    title: Evaluation
  - local: chapter11/6
    title: Conclusion
  - local: chapter11/7
    title: Exam Time!
    quiz: 11

- title: Course Events
sections:
- local: events/1
33 changes: 33 additions & 0 deletions chapters/en/chapter11/1.mdx
@@ -0,0 +1,33 @@
# Supervised Fine-Tuning

This chapter will introduce fine-tuning generative language models with supervised fine-tuning (SFT). SFT involves adapting pre-trained models to specific tasks by further training them on task-specific datasets, which improves their performance on targeted tasks. We will separate this chapter into four sections:

## 1️⃣ Chat Templates

Chat templates structure interactions between users and AI models, ensuring consistent and contextually appropriate responses. They include components like system prompts and role-based messages.

## 2️⃣ Supervised Fine-Tuning

Supervised Fine-Tuning (SFT) is a critical process for adapting pre-trained language models to specific tasks. It involves training the model on a task-specific dataset with labeled examples. This section offers a detailed guide to SFT, including key steps and best practices.
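
As a preview of what this looks like in practice, here is a minimal sketch using TRL's `SFTTrainer`; the dataset, model, and hyperparameters are illustrative placeholders rather than recommendations:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Illustrative choices: a small chat dataset and a small base model
dataset = load_dataset("HuggingFaceTB/smoltalk", "everyday-conversations", split="train")

trainer = SFTTrainer(
    model="HuggingFaceTB/SmolLM2-135M",  # SFTTrainer accepts a Hub model id
    train_dataset=dataset,
    args=SFTConfig(output_dir="./sft-output", max_steps=100),
)
trainer.train()
```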

## 3️⃣ Low Rank Adaptation (LoRA)

Low Rank Adaptation (LoRA) is a technique for fine-tuning language models by adding low-rank matrices to the model's layers. This allows for efficient fine-tuning while preserving the model's pre-trained knowledge.
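
For a taste of the mechanics, here is a minimal sketch using the 🤗 PEFT library; the rank and target modules are illustrative assumptions and depend on the model architecture:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-135M")

# Inject trainable low-rank adapters into the attention projections;
# the original weights stay frozen
lora_config = LoraConfig(
    r=8,  # rank of the low-rank update matrices
    lora_alpha=16,  # scaling factor applied to the adapter output
    target_modules=["q_proj", "v_proj"],  # assumed module names; vary by model
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of parameters train
```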

## 4️⃣ Evaluation

Evaluation is a crucial step in the fine-tuning process. It allows us to measure the performance of the model on a task-specific dataset.

<Tip>
⚠️ In order to benefit from all features available with the Model Hub and 🤗 Transformers, we recommend <a href="https://huggingface.co/join">creating an account</a>.
</Tip>

## References

- [Transformers documentation on chat templates](https://huggingface.co/docs/transformers/main/en/chat_templating)
- [Script for Supervised Fine-Tuning in TRL](https://github.com/huggingface/trl/blob/main/examples/scripts/sft.py)
- [`SFTTrainer` in TRL](https://huggingface.co/docs/trl/main/en/sft_trainer)
- [Direct Preference Optimization Paper](https://arxiv.org/abs/2305.18290)
- [Supervised Fine-Tuning with TRL](https://huggingface.co/docs/trl/main/en/tutorials/supervised_finetuning)
- [How to fine-tune Google Gemma with ChatML and Hugging Face TRL](https://www.philschmid.de/fine-tune-google-gemma)
- [Fine-tuning LLM to Generate Persian Product Catalogs in JSON Format](https://huggingface.co/learn/cookbook/en/fine_tuning_llm_to_generate_persian_product_catalogs_in_json_format)
257 changes: 257 additions & 0 deletions chapters/en/chapter11/2.mdx
@@ -0,0 +1,257 @@
<CourseFloatingBanner chapter={11}
classNames="absolute z-10 right-0 top-0"
notebooks={[
{label: "Google Colab", value: "https://colab.research.google.com/github/huggingface/smol-course/blob/main/1_instruction_tuning/notebooks/chat_templates_example.ipynb"},
]} />

# Chat Templates

## Introduction
Chat templates are essential for structuring interactions between language models and users. They provide a consistent format for conversations, ensuring that models understand the context and role of each message while maintaining appropriate response patterns.

<Tip>
Chat templates are crucial for:
- Maintaining consistent conversation structure
- Ensuring proper role identification
- Managing context across multiple turns
- Supporting advanced features like tool use
</Tip>

## Model Types and Templates

### Base Models vs Instruct Models
A base model is trained on raw text data to predict the next token, while an instruct model is fine-tuned specifically to follow instructions and engage in conversations. For example, `SmolLM2-135M` is a base model, while `SmolLM2-135M-Instruct` is its instruction-tuned variant.

To make a base model behave like an instruct model, we need to format our prompts in a consistent way that the model can understand. This is where chat templates come in. ChatML is one such template format that structures conversations with clear role indicators (system, user, assistant).

<Tip warning={true}>
When using an instruct model, always verify you're using the correct chat template format. Using the wrong template can result in poor model performance or unexpected behavior.
</Tip>
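
To make this concrete, here is a minimal sketch that attaches a ChatML template to a base model's tokenizer (it assumes the base `HuggingFaceTB/SmolLM2-135M` checkpoint ships without a chat template of its own):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M")

# A minimal ChatML Jinja template: each message becomes
# <|im_start|>{role}\n{content}<|im_end|>\n
tokenizer.chat_template = (
    "{% for message in messages %}"
    "{{ '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n' }}"
    "{% endfor %}"
    "{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}"
)

messages = [{"role": "user", "content": "Hello!"}]
print(tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
```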

### Common Template Formats

Different models use different chat template formats. Here's how the same conversation would be formatted for a few different models:

We'll use the following conversation structure for all examples:

```python
messages = [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hello!"},
{"role": "assistant", "content": "Hi! How can I help you today?"},
{"role": "user", "content": "What's the weather?"},
]
```

Here is the same conversation using the Mistral instruct template, which folds the system message into the first user instruction:

```sh
<s>[INST] You are a helpful assistant.

Hello! [/INST] Hi! How can I help you today?</s>[INST] What's the weather? [/INST]
```

This is the chat template for a Qwen 2 model:

```sh
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Hello!<|im_end|>
<|im_start|>assistant
Hi! How can I help you today?<|im_end|>
<|im_start|>user
What's the weather?<|im_end|>
<|im_start|>assistant
```

Key differences between these formats include:
1. **System Message Handling**:
   - Llama 2 wraps system messages in `<<SYS>>` tags inside the first `[INST]` block
   - Llama 3 uses a `system` header (`<|start_header_id|>system<|end_header_id|>`) ended by `<|eot_id|>`
   - Mistral folds the system message into the first instruction
   - Qwen uses an explicit `system` role with `<|im_start|>` tags
   - ChatGPT uses a `SYSTEM:` prefix

2. **Message Boundaries**:
   - Llama 2 uses `[INST]` and `[/INST]` tags
   - Llama 3 uses role-specific headers (`<|start_header_id|>...<|end_header_id|>`) with `<|eot_id|>` endings
   - Mistral uses `[INST]` and `[/INST]` with `<s>` and `</s>`
   - Qwen uses role-specific start/end tokens

3. **Special Tokens**:
   - Llama 2 uses `<s>` and `</s>` for conversation boundaries
   - Llama 3 uses `<|eot_id|>` to end each message
   - Mistral uses `<s>` and `</s>` for turn boundaries
   - Qwen uses role-specific start/end tokens (`<|im_start|>`, `<|im_end|>`)

The transformers library handles these differences through model-specific chat templates. When you load a tokenizer, it automatically uses the correct template for that model:

```python
from transformers import AutoTokenizer

# These will use different templates automatically
llama_tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
mistral_tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
qwen_tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B-Chat")

messages = [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hello!"},
]

# Each will format according to its model's template
llama_chat = llama_tokenizer.apply_chat_template(messages, tokenize=False)
mistral_chat = mistral_tokenizer.apply_chat_template(messages, tokenize=False)
qwen_chat = qwen_tokenizer.apply_chat_template(messages, tokenize=False)
```
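
Printing any of these strings (for example, `print(qwen_chat)`) is a quick way to inspect exactly what each model will see. Note that gated checkpoints such as `meta-llama/Llama-2-7b-chat-hf` require accepting the license on the Hub and logging in before the tokenizer can be downloaded.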

## Working with Templates

### Basic Implementation

The transformers library provides built-in support for chat templates through the `apply_chat_template()` method:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct")

messages = [
{"role": "system", "content": "You are a helpful coding assistant."},
{"role": "user", "content": "Write a Python function to sort a list"},
]

# Apply the chat template
formatted_chat = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
```

This will return a formatted string that looks like:

```sh
<|im_start|>system
You are a helpful coding assistant.<|im_end|>
<|im_start|>user
Write a Python function to sort a list<|im_end|>
<|im_start|>assistant
```

Because we passed `add_generation_prompt=True`, the string ends by opening an assistant turn, which cues the model to respond next.
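
To see the template end to end, a sketch like the following feeds the formatted prompt back into the matching model (the model choice and generation settings are illustrative):

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct")

# The template already contains all special tokens, so don't add them again
inputs = tokenizer(formatted_chat, return_tensors="pt", add_special_tokens=False)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=128)

# Decode only the newly generated tokens, skipping the prompt we fed in
response = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1] :], skip_special_tokens=True
)
print(response)
```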

### Advanced Features
Chat templates can handle more complex scenarios, including:

1. **Tool Use**: When models need to interact with external tools or APIs
2. **Multimodal Inputs**: For handling images, audio, or other media types
3. **Function Calling**: For structured function execution
4. **Multi-turn Context**: For maintaining conversation history

<Tip>
When implementing advanced features:
- Test thoroughly with your specific model
- Handle errors gracefully
- Monitor token usage carefully
- Document the expected format for each feature
</Tip>

For multimodal conversations, chat templates can include image references or base64-encoded images:

```python
messages = [
    {
        "role": "system",
        "content": "You are a helpful vision assistant that can analyze images.",
    },
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image", "image_url": "https://example.com/image.jpg"},
        ],
    },
]
```

Here's an example of a chat template with tool use:

```python
messages = [
    {
        "role": "system",
        "content": "You are an AI assistant that can use tools. Available tools: calculator, weather_api",
    },
    {"role": "user", "content": "What's 123 * 456 and is it raining in Paris?"},
    {
        "role": "assistant",
        "content": "Let me help you with that.",
        "tool_calls": [
            {
                "tool": "calculator",
                "parameters": {"operation": "multiply", "x": 123, "y": 456},
            },
            {"tool": "weather_api", "parameters": {"city": "Paris", "country": "France"}},
        ],
    },
    {"role": "tool", "tool_name": "calculator", "content": "56088"},
    {
        "role": "tool",
        "tool_name": "weather_api",
        "content": "{'condition': 'rain', 'temperature': 15}",
    },
]
```
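
Recent versions of 🤗 Transformers can also render tool definitions for you: `apply_chat_template` accepts a `tools` argument that turns Python functions, via their type hints and docstrings, into the model's tool-definition format. A minimal sketch, assuming the tokenizer's template supports tools:

```python
def multiply(x: int, y: int) -> int:
    """Multiply two integers.

    Args:
        x: The first integer.
        y: The second integer.
    """
    return x * y


messages = [{"role": "user", "content": "What's 123 * 456?"}]

# The function's signature and docstring are rendered into the prompt
prompt = tokenizer.apply_chat_template(
    messages, tools=[multiply], tokenize=False, add_generation_prompt=True
)
```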

## Best Practices

### General Guidelines
When working with chat templates, follow these key practices:

1. **Consistent Formatting**: Always use the same template format throughout your application
2. **Clear Role Definition**: Clearly specify roles (system, user, assistant, tool) for each message
3. **Context Management**: Be mindful of token limits when maintaining conversation history
4. **Error Handling**: Include proper error handling for tool calls and multimodal inputs
5. **Validation**: Validate message structure before sending to the model (a sketch follows below)

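As a sketch of point 5, a lightweight validator might check roles and content before every call; the allowed role set here is an assumption, so adjust it to your template:

```python
ALLOWED_ROLES = {"system", "user", "assistant", "tool"}  # assumed role set


def validate_messages(messages):
    """Raise ValueError if the message list is not template-ready."""
    if not messages:
        raise ValueError("Message list is empty")
    for i, message in enumerate(messages):
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"Message {i} has invalid role: {message.get('role')!r}")
        if "content" not in message:
            raise ValueError(f"Message {i} is missing 'content'")
    return messages
```
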
<Tip warning={true}>
Common pitfalls to avoid:
- Mixing different template formats in the same application
- Exceeding token limits with long conversation histories
- Not properly escaping special characters in messages
- Forgetting to validate input message structure
- Ignoring model-specific template requirements
</Tip>

## Hands-on Exercise

Let's practice implementing chat templates with a real-world example.

<Tip>
Follow these steps to convert the `HuggingFaceTB/smoltalk` dataset into chatml format:

1. Load the dataset (smoltalk ships several configs, so pick a subset; `everyday-conversations` is a small one):
```python
from datasets import load_dataset

dataset = load_dataset("HuggingFaceTB/smoltalk", "everyday-conversations")
```

2. Create a processing function:
```python
def convert_to_chatml(example):
    # Assumes columns named "input" and "output"; adjust the keys to the
    # schema of the subset you loaded
    return {
        "messages": [
            {"role": "user", "content": example["input"]},
            {"role": "assistant", "content": example["output"]},
        ]
    }
```

3. Apply the chat template using your chosen model's tokenizer (a sketch follows this tip)

Remember to validate your output format matches your target model's requirements!
</Tip>
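
Under the assumptions above (an `input`/`output` schema and the `SmolLM2-135M-Instruct` tokenizer), the full pipeline might fit together like this sketch:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct")


def apply_template(example):
    # Render the messages into a single training-ready string
    example["text"] = tokenizer.apply_chat_template(example["messages"], tokenize=False)
    return example


dataset = dataset.map(convert_to_chatml)
dataset = dataset.map(apply_template)
print(dataset["train"][0]["text"])
```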

## Additional Resources

- [Hugging Face Chat Templating Guide](https://huggingface.co/docs/transformers/main/en/chat_templating)
- [Transformers Documentation](https://huggingface.co/docs/transformers)
- [Chat Templates Examples Repository](https://github.com/chujiezheng/chat_templates)