
Commit cf02913

support deepseek-r1 distilled models in llm chatbot (#2696)

1 parent 9fce4a6 commit cf02913

File tree

6 files changed: +359 -254 lines


.ci/spellcheck/.pyspelling.wordlist.txt

Lines changed: 1 addition & 0 deletions

@@ -183,6 +183,7 @@ DeciDiffusion's
 deduplicated
 DeepFloyd
 DeepLabV
+DeepSeek
 denoise
 denoised
 denoises

notebooks/llm-chatbot/README.md

Lines changed: 3 additions & 0 deletions

@@ -67,6 +67,9 @@ For more details, please refer to [model_card](https://huggingface.co/Qwen/Qwen2
 * **internlm2-chat-1.8b** - InternLM2 is the second generation InternLM series. Compared to the previous generation model, it shows significant improvements in various capabilities, including reasoning, mathematics, and coding. More details about model can be found in [model repository](https://huggingface.co/internlm).
 * **glm-4-9b-chat** - GLM-4-9B is the open-source version of the latest generation of pre-trained models in the GLM-4 series launched by Zhipu AI. In the evaluation of data sets in semantics, mathematics, reasoning, code, and knowledge, GLM-4-9B and its human preference-aligned version GLM-4-9B-Chat have shown superior performance beyond Llama-3-8B. In addition to multi-round conversations, GLM-4-9B-Chat also has advanced features such as web browsing, code execution, custom tool calls (Function Call), and long text reasoning (supporting up to 128K context). More details about model can be found in [model card](https://huggingface.co/THUDM/glm-4-9b-chat/blob/main/README_en.md), [technical report](https://arxiv.org/pdf/2406.12793) and [repository](https://github.com/THUDM/GLM-4).
 * **minicpm3-4b** - MiniCPM3-4B is the 3rd generation of MiniCPM series. The overall performance of MiniCPM3-4B surpasses Phi-3.5-mini-Instruct, being comparable with many recent 7B~9B models. Compared to previous generations, MiniCPM3-4B has a more powerful and versatile skill set to enable more general usage. More details can be found in [model card](https://huggingface.co/openbmb/MiniCPM3-4B).
+* **DeepSeek-R1-Distill-Qwen-1.5B** - Qwen2.5-1.5B fine-tuned using the reasoning data generated by [DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1). You can find more info in [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B)
+* **DeepSeek-R1-Distill-Qwen-7B** - Qwen2.5-7B fine-tuned using the reasoning data generated by [DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1). You can find more info in [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B)
+* **DeepSeek-R1-Distill-Llama-8B** - Llama-3.1-8B fine-tuned using the reasoning data generated by [DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1). You can find more info in [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B)
 
 The image below illustrates the provided user instruction and model answer examples.
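The new README entries imply that each DeepSeek-R1 distill also gets an entry in the notebook's model-configuration mapping, which the updated gradio helper reads via `model_configuration.get("system_prompt")` and the `"genai_chat_template"` key. A minimal sketch of what such an entry might look like; the dictionary name and all values here are illustrative assumptions, not the notebook's actual configuration:

```python
# Hypothetical configuration entry for a DeepSeek-R1 distilled model.
# Only the key names "system_prompt" and "genai_chat_template" come from
# this commit's helper change; every value below is an assumption.
SUPPORTED_LLM_MODELS_SKETCH = {
    "DeepSeek-R1-Distill-Qwen-1.5B": {
        "model_id": "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
        # Per-model system prompt that overrides the language-based default.
        "system_prompt": "You are a helpful assistant.",
        # Simplified Jinja-style chat template; a real template for these
        # models would also account for their <think> reasoning output.
        "genai_chat_template": (
            "{% for message in messages %}{{ message['content'] }}{% endfor %}"
        ),
    },
}
```

The helper falls back to its defaults for models whose entries omit these keys, so the keys only need to be added where a model requires special prompting.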

notebooks/llm-chatbot/gradio_helper_genai.py

Lines changed: 8 additions & 2 deletions

@@ -3,6 +3,7 @@
 from uuid import uuid4
 from threading import Event, Thread
 import queue
+import sys
 
 max_new_tokens = 256
 
@@ -54,7 +55,9 @@
 """
 
 
-def get_system_prompt(model_language):
+def get_system_prompt(model_language, system_prompt=None):
+    if system_prompt is not None:
+        return system_prompt
     return (
         DEFAULT_SYSTEM_PROMPT_CHINESE
         if (model_language == "Chinese")
@@ -189,6 +192,7 @@ def put(self, token_id: int) -> bool:
         if (len(self.tokens_cache) + 1) % self.tokens_len != 0:
             self.tokens_cache.append(token_id)
             return False
+        sys.stdout.flush()
         return super().put(token_id)
 
 
@@ -197,7 +201,9 @@ def make_demo(pipe, model_configuration, model_id, model_language, disable_advan
 
     max_new_tokens = 256
 
-    start_message = get_system_prompt(model_language)
+    start_message = get_system_prompt(model_language, model_configuration.get("system_prompt"))
+    if "genai_chat_template" in model_configuration:
+        pipe.get_tokenizer().set_chat_template(model_configuration["genai_chat_template"])
 
     def get_uuid():
         """
