
Commit cf02913

support deepseek-r1 distilled models in llm chatbot (#2696)

1 parent 9fce4a6 commit cf02913

File tree

6 files changed: +359 -254 lines


.ci/spellcheck/.pyspelling.wordlist.txt

Lines changed: 1 addition & 0 deletions

@@ -183,6 +183,7 @@ DeciDiffusion's
 deduplicated
 DeepFloyd
 DeepLabV
+DeepSeek
 denoise
 denoised
 denoises

notebooks/llm-chatbot/README.md

Lines changed: 3 additions & 0 deletions

@@ -67,6 +67,9 @@ For more details, please refer to [model_card](https://huggingface.co/Qwen/Qwen2
 * **internlm2-chat-1.8b** - InternLM2 is the second generation InternLM series. Compared to the previous generation model, it shows significant improvements in various capabilities, including reasoning, mathematics, and coding. More details about model can be found in [model repository](https://huggingface.co/internlm).
 * **glm-4-9b-chat** - GLM-4-9B is the open-source version of the latest generation of pre-trained models in the GLM-4 series launched by Zhipu AI. In the evaluation of data sets in semantics, mathematics, reasoning, code, and knowledge, GLM-4-9B and its human preference-aligned version GLM-4-9B-Chat have shown superior performance beyond Llama-3-8B. In addition to multi-round conversations, GLM-4-9B-Chat also has advanced features such as web browsing, code execution, custom tool calls (Function Call), and long text reasoning (supporting up to 128K context). More details about model can be found in [model card](https://huggingface.co/THUDM/glm-4-9b-chat/blob/main/README_en.md), [technical report](https://arxiv.org/pdf/2406.12793) and [repository](https://github.com/THUDM/GLM-4).
 * **minicpm3-4b** - MiniCPM3-4B is the 3rd generation of MiniCPM series. The overall performance of MiniCPM3-4B surpasses Phi-3.5-mini-Instruct, being comparable with many recent 7B~9B models. Compared to previous generations, MiniCPM3-4B has a more powerful and versatile skill set to enable more general usage. More details can be found in [model card](https://huggingface.co/openbmb/MiniCPM3-4B).
+* **DeepSeek-R1-Distill-Qwen-1.5B** - Qwen2.5-1.5B fine-tuned using the reasoning data generated by [DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1). You can find more info in [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B)
+* **DeepSeek-R1-Distill-Qwen-7B** - Qwen2.5-7B fine-tuned using the reasoning data generated by [DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1). You can find more info in [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B)
+* **DeepSeek-R1-Distill-Llama-8B** - Llama-3.1-8B fine-tuned using the reasoning data generated by [DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1). You can find more info in [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B)
 
 The image below illustrates the provided user instruction and model answer examples.
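The new README entries imply that each DeepSeek-R1 distill also gets an entry in the notebook's model-configuration mapping, which the updated gradio helper reads via `model_configuration.get("system_prompt")` and the `"genai_chat_template"` key. A minimal sketch of what such an entry might look like; the dictionary name and all values here are illustrative assumptions, not the notebook's actual configuration:

```python
# Hypothetical configuration entry for a DeepSeek-R1 distilled model.
# Only the key names "system_prompt" and "genai_chat_template" come from
# this commit's helper change; every value below is an assumption.
SUPPORTED_LLM_MODELS_SKETCH = {
    "DeepSeek-R1-Distill-Qwen-1.5B": {
        "model_id": "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
        # Per-model system prompt that overrides the language-based default.
        "system_prompt": "You are a helpful assistant.",
        # Simplified Jinja-style chat template; a real template for these
        # models would also account for their <think> reasoning output.
        "genai_chat_template": (
            "{% for message in messages %}{{ message['content'] }}{% endfor %}"
        ),
    },
}
```

The helper falls back to its defaults for models whose entries omit these keys, so the keys only need to be added where a model requires special prompting.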

notebooks/llm-chatbot/gradio_helper_genai.py

Lines changed: 8 additions & 2 deletions

@@ -3,6 +3,7 @@
 from uuid import uuid4
 from threading import Event, Thread
 import queue
+import sys
 
 max_new_tokens = 256
 
@@ -54,7 +55,9 @@
 """
 
 
-def get_system_prompt(model_language):
+def get_system_prompt(model_language, system_prompt=None):
+    if system_prompt is not None:
+        return system_prompt
     return (
         DEFAULT_SYSTEM_PROMPT_CHINESE
         if (model_language == "Chinese")
@@ -189,6 +192,7 @@ def put(self, token_id: int) -> bool:
         if (len(self.tokens_cache) + 1) % self.tokens_len != 0:
             self.tokens_cache.append(token_id)
             return False
+        sys.stdout.flush()
         return super().put(token_id)
 
 
@@ -197,7 +201,9 @@ def make_demo(pipe, model_configuration, model_id, model_language, disable_advan
 
     max_new_tokens = 256
 
-    start_message = get_system_prompt(model_language)
+    start_message = get_system_prompt(model_language, model_configuration.get("system_prompt"))
+    if "genai_chat_template" in model_configuration:
+        pipe.get_tokenizer().set_chat_template(model_configuration["genai_chat_template"])
 
     def get_uuid():
         """
