From 7d2a168ac0bb55b5c9a1e662329e00b5b3fda65a Mon Sep 17 00:00:00 2001
From: Ofir Zafrir
Date: Thu, 6 Feb 2025 16:46:42 +0200
Subject: [PATCH] Add link to DeepSeek FastDraft notebook in the DeepSeek
 notebook (#2722)

@eaidova
---
 notebooks/deepseek-r1/README.md         | 1 +
 notebooks/deepseek-r1/deepseek-r1.ipynb | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/notebooks/deepseek-r1/README.md b/notebooks/deepseek-r1/README.md
index 0cf3b169576..e76d386feb2 100644
--- a/notebooks/deepseek-r1/README.md
+++ b/notebooks/deepseek-r1/README.md
@@ -13,6 +13,7 @@ The tutorial supports different models, you can select one from the provided opt
 * **DeepSeek-R1-Distill-Qwen-7B** is a distilled model based on [Qwen-2.5-Math-7B](https://huggingface.co/Qwen/Qwen2.5-Math-7B). The model demonstrates a good balance between mathematical and factual reasoning and can be less suited for complex coding tasks. Check [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B) for more info.
 * **DeepSeek-R1-Distil-Qwen-14B** is a distilled model based on [Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) that has great competence in factual reasoning and solving complex mathematical tasks. Check [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B) for more info.
+Learn how to accelerate **DeepSeek-R1-Distill-Llama-8B** with **FastDraft** and the OpenVINO GenAI speculative decoding pipeline in this [notebook](../../supplementary_materials/notebooks/fastdraft-deepseek/fastdraft_deepseek.ipynb).
 
 ## Notebook Contents
 
 The tutorial consists of the following steps:
 
diff --git a/notebooks/deepseek-r1/deepseek-r1.ipynb b/notebooks/deepseek-r1/deepseek-r1.ipynb
index 926a60d2bdd..07bf0d982e1 100644
--- a/notebooks/deepseek-r1/deepseek-r1.ipynb
+++ b/notebooks/deepseek-r1/deepseek-r1.ipynb
@@ -106,7 +106,7 @@
     "\n",
     "The tutorial supports different models, you can select one from the provided options to compare the quality of LLM solutions:\n",
     "\n",
-    "* **DeepSeek-R1-Distill-Llama-8B** is a distilled model based on [Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B), that prioritizes high performance and advanced reasoning capabilities, particularly excelling in tasks requiring mathematical and factual precision. Check [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B) for more info.\n",
+    "* **DeepSeek-R1-Distill-Llama-8B** is a distilled model based on [Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B), that prioritizes high performance and advanced reasoning capabilities, particularly excelling in tasks requiring mathematical and factual precision. Check [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B) for more info. Note: this model can also be accelerated with [FastDraft](../../supplementary_materials/notebooks/fastdraft-deepseek/fastdraft_deepseek.ipynb).\n",
     "* **DeepSeek-R1-Distill-Qwen-1.5B** is the smallest DeepSeek-R1 distilled model based on [Qwen2.5-Math-1.5B](https://huggingface.co/Qwen/Qwen2.5-Math-1.5B). Despite its compact size, the model demonstrates strong capabilities in solving basic mathematical tasks, at the same time its programming capabilities are limited. Check [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) for more info.\n",
     "* **DeepSeek-R1-Distill-Qwen-7B** is a distilled model based on [Qwen-2.5-Math-7B](https://huggingface.co/Qwen/Qwen2.5-Math-7B). The model demonstrates a good balance between mathematical and factual reasoning and can be less suited for complex coding tasks. Check [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B) for more info.\n",
     "* **DeepSeek-R1-Distil-Qwen-14B** is a distilled model based on [Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) that has great competence in factual reasoning and solving complex mathematical tasks. Check [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B) for more info.\n",