Merge pull request #777 from burtenshaw/add-supervised-finetuning
[CHAPTER] New chapter on supervised fine tuning based on smol course
Showing 8 changed files with 1,130 additions and 1 deletion.

# Supervised Fine-Tuning

In [Chapter 2 Section 2](/course/chapter2/2), we saw that generative language models can be fine-tuned on specific tasks like summarization and question answering. However, nowadays it is far more common to fine-tune language models on a broad range of tasks simultaneously, a method known as supervised fine-tuning (SFT). This process helps models become more versatile and capable of handling diverse use cases. Most LLMs that people interact with on platforms like ChatGPT have undergone SFT to make them more helpful and aligned with human preferences. We will separate this chapter into four sections:

## 1️⃣ Chat Templates

Chat templates structure interactions between users and AI models, ensuring consistent and contextually appropriate responses. They include components like system prompts and role-based messages.

## 2️⃣ Supervised Fine-Tuning

Supervised Fine-Tuning (SFT) is a critical process for adapting pre-trained language models to specific tasks. It involves training the model on a task-specific dataset with labeled examples. For a detailed guide on SFT, including key steps and best practices, see [the supervised fine-tuning section of the TRL documentation](https://huggingface.co/docs/trl/en/sft_trainer).
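As a preview, here is a minimal sketch of what an SFT run looks like with TRL's `SFTTrainer`. The model, dataset subset, and configuration values are illustrative choices, and the exact `SFTConfig` options can vary between TRL versions:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# A small chat dataset and model, chosen purely for illustration
dataset = load_dataset("HuggingFaceTB/smoltalk", "everyday-conversations", split="train")

training_args = SFTConfig(output_dir="./sft-smollm2", max_steps=100)

trainer = SFTTrainer(
    model="HuggingFaceTB/SmolLM2-135M",  # SFTTrainer accepts a model name or a model instance
    train_dataset=dataset,
    args=training_args,
)
trainer.train()
```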

## 3️⃣ Low-Rank Adaptation (LoRA)

Low-Rank Adaptation (LoRA) is a technique for fine-tuning language models by adding low-rank matrices to the model's layers. This allows for efficient fine-tuning while preserving the model's pre-trained knowledge. One of the key benefits of LoRA is the significant memory savings it offers, making it possible to fine-tune large models on hardware with limited resources.
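To make this concrete, here is a minimal sketch using the PEFT library. The rank, scaling factor, and target modules are illustrative defaults rather than tuned values:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-135M")

# Wrap the base model with low-rank adapters; only the adapters are trained
lora_config = LoraConfig(
    r=8,  # rank of the low-rank matrices
    lora_alpha=16,  # scaling factor for the adapter updates
    target_modules=["q_proj", "v_proj"],  # attention projections, a common choice
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # shows how few parameters are actually trained
```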

## 4️⃣ Evaluation

Evaluation is a crucial step in the fine-tuning process. It allows us to measure the performance of the model on a task-specific dataset.
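As a brief illustration, standard benchmarks can be run with an evaluation harness such as `lm-evaluation-harness`. The task and model below are placeholders, and the API shown assumes a recent version of the `lm_eval` package:

```python
import lm_eval

# Evaluate a checkpoint on a standard benchmark task (placeholder choices)
results = lm_eval.simple_evaluate(
    model="hf",  # use the Hugging Face transformers backend
    model_args="pretrained=HuggingFaceTB/SmolLM2-135M-Instruct",
    tasks=["hellaswag"],
)
print(results["results"])
```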

<Tip>
⚠️ In order to benefit from all features available with the Model Hub and 🤗 Transformers, we recommend <a href="https://huggingface.co/join">creating an account</a>.
</Tip>

## References

- [Transformers documentation on chat templates](https://huggingface.co/docs/transformers/main/en/chat_templating)
- [Script for Supervised Fine-Tuning in TRL](https://github.com/huggingface/trl/blob/main/examples/scripts/sft.py)
- [`SFTTrainer` in TRL](https://huggingface.co/docs/trl/main/en/sft_trainer)
- [Direct Preference Optimization Paper](https://arxiv.org/abs/2305.18290)
- [Supervised Fine-Tuning with TRL](https://huggingface.co/docs/trl/main/en/tutorials/supervised_finetuning)
- [How to fine-tune Google Gemma with ChatML and Hugging Face TRL](https://github.com/huggingface/alignment-handbook)
- [Fine-tuning LLM to Generate Persian Product Catalogs in JSON Format](https://huggingface.co/learn/cookbook/en/fine_tuning_llm_to_generate_persian_product_catalogs_in_json_format)
<CourseFloatingBanner chapter={2}
  classNames="absolute z-10 right-0 top-0"
  notebooks={[
    {label: "Google Colab", value: "https://colab.research.google.com/github/huggingface/notebooks/blob/main/course/en/chapter11/section2.ipynb"},
]} />

# Chat Templates

## Introduction

Chat templates are essential for structuring interactions between language models and users. Whether you're building a simple chatbot or a complex AI agent, understanding how to properly format your conversations is crucial for getting the best results from your model. In this guide, we'll explore what chat templates are, why they matter, and how to use them effectively.

<Tip>
Chat templates are crucial for:
- Maintaining consistent conversation structure
- Ensuring proper role identification
- Managing context across multiple turns
- Supporting advanced features like tool use
</Tip>
## Model Types and Templates

### Base Models vs Instruct Models

A base model is trained on raw text data to predict the next token, while an instruct model is fine-tuned specifically to follow instructions and engage in conversations. For example, [`SmolLM2-135M`](https://huggingface.co/HuggingFaceTB/SmolLM2-135M) is a base model, while [`SmolLM2-135M-Instruct`](https://huggingface.co/HuggingFaceTB/SmolLM2-135M-Instruct) is its instruction-tuned variant.

Instruction-tuned models are trained to follow a specific conversational structure, making them more suitable for chatbot applications. Moreover, instruct models can handle complex interactions, including tool use, multimodal inputs, and function calling.

To make a base model behave like an instruct model, we need to format our prompts in a consistent way that the model can understand. This is where chat templates come in. ChatML is one such template format that structures conversations with clear role indicators (system, user, assistant). Here's an example of [ChatML as defined in a tokenizer configuration](https://huggingface.co/HuggingFaceTB/SmolLM2-135M-Instruct/blob/e2c3f7557efbdec707ae3a336371d169783f1da1/tokenizer_config.json#L146).

<Tip warning={true}>
When using an instruct model, always verify you're using the correct chat template format. Using the wrong template can result in poor model performance or unexpected behavior. The easiest way to ensure this is to check the model tokenizer configuration on the Hub. For example, the `SmolLM2-135M-Instruct` model uses [this configuration](https://huggingface.co/HuggingFaceTB/SmolLM2-135M-Instruct/blob/e2c3f7557efbdec707ae3a336371d169783f1da1/tokenizer_config.json#L146).
</Tip>
### Common Template Formats

Before diving into specific implementations, it's important to understand how different models expect their conversations to be formatted. Let's explore some common template formats using a simple example conversation.

We'll use the following conversation structure for all examples:

```python
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi! How can I help you today?"},
    {"role": "user", "content": "What's the weather?"},
]
```

This is the ChatML template used in models like SmolLM2 and Qwen 2:

```sh
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Hello!<|im_end|>
<|im_start|>assistant
Hi! How can I help you today?<|im_end|>
<|im_start|>user
What's the weather?<|im_end|>
<|im_start|>assistant
```

This is the `mistral` template format, where the system message is folded into the first user instruction:

```sh
<s>[INST] You are a helpful assistant.

Hello! [/INST] Hi! How can I help you today?</s>[INST] What's the weather? [/INST]
```

Key differences between these formats include:

1. **System Message Handling**:
   - Llama 2 wraps system messages in `<<SYS>>` tags
   - Llama 3 uses a dedicated `system` header (`<|start_header_id|>system<|end_header_id|>`)
   - Mistral includes the system message in the first instruction
   - Qwen uses an explicit `system` role with `<|im_start|>` tags

2. **Message Boundaries**:
   - Llama 2 uses `[INST]` and `[/INST]` tags
   - Llama 3 uses role-specific headers, ending each message with `<|eot_id|>`
   - Mistral uses `[INST]` and `[/INST]` with `<s>` and `</s>`
   - Qwen uses role-specific start/end tokens

3. **Special Tokens**:
   - Llama 2 uses `<s>` and `</s>` for conversation boundaries
   - Llama 3 uses `<|begin_of_text|>` to open the conversation and `<|eot_id|>` to close each message
   - Mistral uses `<s>` and `</s>` for turn boundaries
   - Qwen uses role-specific start/end tokens

Understanding these differences is key to working with various models. Let's look at how the transformers library helps us handle these variations automatically:

```python
from transformers import AutoTokenizer

# These will use different templates automatically
mistral_tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
qwen_tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B-Chat")
smol_tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

# Each will format according to its model's template
mistral_chat = mistral_tokenizer.apply_chat_template(messages, tokenize=False)
qwen_chat = qwen_tokenizer.apply_chat_template(messages, tokenize=False)
smol_chat = smol_tokenizer.apply_chat_template(messages, tokenize=False)
```

<details>
<summary>Click to see template examples</summary>

Qwen 2 and SmolLM2 ChatML template:

```sh
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Hello!<|im_end|>
<|im_start|>assistant
Hi! How can I help you today?<|im_end|>
<|im_start|>user
What's the weather?<|im_end|>
<|im_start|>assistant
```

Mistral template:

```sh
<s>[INST] You are a helpful assistant.

Hello! [/INST] Hi! How can I help you today?</s>[INST] What's the weather? [/INST]
```

</details>

### Advanced Features

Chat templates can handle more complex scenarios beyond just conversational interactions, including:

1. **Tool Use**: When models need to interact with external tools or APIs
2. **Multimodal Inputs**: For handling images, audio, or other media types
3. **Function Calling**: For structured function execution
4. **Multi-turn Context**: For maintaining conversation history

<Tip>
When implementing advanced features:
- Test thoroughly with your specific model, since vision and tool-use templates vary widely between models
- Monitor token usage carefully, as multimodal content and tool definitions can consume many tokens
- Document the expected format for each feature
</Tip>

For multimodal conversations, chat templates can include image references or base64-encoded images:

```python
messages = [
    {
        "role": "system",
        "content": "You are a helpful vision assistant that can analyze images.",
    },
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image", "image_url": "https://example.com/image.jpg"},
        ],
    },
]
```

Here's an example of a chat template with tool use:

```python
messages = [
    {
        "role": "system",
        "content": "You are an AI assistant that can use tools. Available tools: calculator, weather_api",
    },
    {"role": "user", "content": "What's 123 * 456 and is it raining in Paris?"},
    {
        "role": "assistant",
        "content": "Let me help you with that.",
        "tool_calls": [
            {
                "tool": "calculator",
                "parameters": {"operation": "multiply", "x": 123, "y": 456},
            },
            {"tool": "weather_api", "parameters": {"city": "Paris", "country": "France"}},
        ],
    },
    {"role": "tool", "tool_name": "calculator", "content": "56088"},
    {
        "role": "tool",
        "tool_name": "weather_api",
        "content": "{'condition': 'rain', 'temperature': 15}",
    },
]
```

## Best Practices

### General Guidelines

When working with chat templates, follow these key practices:

1. **Consistent Formatting**: Always use the same template format throughout your application
2. **Clear Role Definition**: Clearly specify roles (system, user, assistant, tool) for each message
3. **Context Management**: Be mindful of token limits when maintaining conversation history
4. **Error Handling**: Include proper error handling for tool calls and multimodal inputs
5. **Validation**: Validate message structure before sending to the model, as shown in the sketch below
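To illustrate the validation point, here is a minimal sketch of a message-structure check. The role set and rules follow the examples in this section; a real application might add model-specific constraints:

```python
VALID_ROLES = {"system", "user", "assistant", "tool"}

def validate_messages(messages: list[dict]) -> None:
    """Raise ValueError if the message list is not well-formed."""
    if not messages:
        raise ValueError("Message list is empty")
    for i, message in enumerate(messages):
        if "role" not in message or "content" not in message:
            raise ValueError(f"Message {i} is missing 'role' or 'content'")
        if message["role"] not in VALID_ROLES:
            raise ValueError(f"Message {i} has unknown role: {message['role']}")
    # A system prompt, if present, should come first
    if any(m["role"] == "system" for m in messages[1:]):
        raise ValueError("System message must be the first message")

validate_messages([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
```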
<Tip warning={true}>
Common pitfalls to avoid:
- Mixing different template formats in the same application
- Exceeding token limits with long conversation histories
- Not properly escaping special characters in messages
- Forgetting to validate input message structure
- Ignoring model-specific template requirements
</Tip>

## Hands-on Exercise

Let's practice implementing chat templates with a real-world example.

<Tip>
Follow these steps to convert the `HuggingFaceTB/smoltalk` dataset into ChatML format:

1. Load the dataset:

```python
from datasets import load_dataset

# smoltalk ships multiple subsets; "all" loads the full mixture
dataset = load_dataset("HuggingFaceTB/smoltalk", "all")
```

2. Create a processing function:

```python
def convert_to_chatml(example):
    return {
        "messages": [
            {"role": "user", "content": example["input"]},
            {"role": "assistant", "content": example["output"]},
        ]
    }
```

3. Apply the chat template using your chosen model's tokenizer, as in the sketch below.

Remember to validate your output format matches your target model's requirements!
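Here is a minimal sketch of step 3, assuming the `SmolLM2-135M-Instruct` tokenizer; the model choice and the `text` column name are illustrative:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct")

def apply_template(example):
    # Render the message list into the model's own chat format
    return {"text": tokenizer.apply_chat_template(example["messages"], tokenize=False)}

dataset = dataset.map(convert_to_chatml)
dataset = dataset.map(apply_template)
```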
</Tip>

## Additional Resources

- [Hugging Face Chat Templating Guide](https://huggingface.co/docs/transformers/main/en/chat_templating)
- [Transformers Documentation](https://huggingface.co/docs/transformers)
- [Chat Templates Examples Repository](https://github.com/chujiezheng/chat_templates)