From 7d2a168ac0bb55b5c9a1e662329e00b5b3fda65a Mon Sep 17 00:00:00 2001
From: Ofir Zafrir
Date: Thu, 6 Feb 2025 16:46:42 +0200
Subject: [PATCH] Add link to DeepSeek FastDraft notebook in the DeepSeek
 notebook (#2722)

@eaidova
---
 notebooks/deepseek-r1/README.md         | 1 +
 notebooks/deepseek-r1/deepseek-r1.ipynb | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/notebooks/deepseek-r1/README.md b/notebooks/deepseek-r1/README.md
index 0cf3b169576..e76d386feb2 100644
--- a/notebooks/deepseek-r1/README.md
+++ b/notebooks/deepseek-r1/README.md
@@ -13,6 +13,7 @@ The tutorial supports different models, you can select one from the provided opt
 * **DeepSeek-R1-Distill-Qwen-7B** is a distilled model based on [Qwen-2.5-Math-7B](https://huggingface.co/Qwen/Qwen2.5-Math-7B). The model demonstrates a good balance between mathematical and factual reasoning and can be less suited for complex coding tasks. Check [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B) for more info.
 * **DeepSeek-R1-Distil-Qwen-14B** is a distilled model based on [Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) that has great competence in factual reasoning and solving complex mathematical tasks. Check [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B) for more info.
+Learn how to accelerate **DeepSeek-R1-Distill-Llama-8B** with **FastDraft** and the OpenVINO GenAI speculative decoding pipeline in this [notebook](../../supplementary_materials/notebooks/fastdraft-deepseek/fastdraft_deepseek.ipynb).
 
 ## Notebook Contents
 
 The tutorial consists of the following steps:
 
diff --git a/notebooks/deepseek-r1/deepseek-r1.ipynb b/notebooks/deepseek-r1/deepseek-r1.ipynb
index 926a60d2bdd..07bf0d982e1 100644
--- a/notebooks/deepseek-r1/deepseek-r1.ipynb
+++ b/notebooks/deepseek-r1/deepseek-r1.ipynb
@@ -106,7 +106,7 @@
     "\n",
     "The tutorial supports different models, you can select one from the provided options to compare the quality of LLM solutions:\n",
     "\n",
-    "* **DeepSeek-R1-Distill-Llama-8B** is a distilled model based on [Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B), that prioritizes high performance and advanced reasoning capabilities, particularly excelling in tasks requiring mathematical and factual precision. Check [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B) for more info.\n",
+    "* **DeepSeek-R1-Distill-Llama-8B** is a distilled model based on [Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B), that prioritizes high performance and advanced reasoning capabilities, particularly excelling in tasks requiring mathematical and factual precision. Check [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B) for more info. Note: this model can also be accelerated with [FastDraft](../../supplementary_materials/notebooks/fastdraft-deepseek/fastdraft_deepseek.ipynb).\n",
     "* **DeepSeek-R1-Distill-Qwen-1.5B** is the smallest DeepSeek-R1 distilled model based on [Qwen2.5-Math-1.5B](https://huggingface.co/Qwen/Qwen2.5-Math-1.5B). Despite its compact size, the model demonstrates strong capabilities in solving basic mathematical tasks, at the same time its programming capabilities are limited. Check [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) for more info.\n",
     "* **DeepSeek-R1-Distill-Qwen-7B** is a distilled model based on [Qwen-2.5-Math-7B](https://huggingface.co/Qwen/Qwen2.5-Math-7B). The model demonstrates a good balance between mathematical and factual reasoning and can be less suited for complex coding tasks. Check [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B) for more info.\n",
     "* **DeepSeek-R1-Distil-Qwen-14B** is a distilled model based on [Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) that has great competence in factual reasoning and solving complex mathematical tasks. Check [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B) for more info.\n",