
Fix the gibberish output from llm-prompt #279

Closed
wants to merge 2 commits

Conversation


@apsonawane apsonawane commented Feb 1, 2025

Currently, when we run llm-prompt with any model, we get gibberish output similar to this:

deepseek-ai/DeepSeek-R1-Distill-Qwen-7B:
        Parameters:     7,615,616,512 (28.37 GB)
        Build dir:      C:\Users\asonawane\.cache\lemonade\deepseek-ai_DeepSeek-R1-Distill-Qwen-7B
        Peak memory:    21.036 GB
        Status:         Successful build!
        Dtype:                      float16
        Prompt Tokens:              3
        Prompt:                     Hi There
        Response Tokens:            512
        Response:                   : is a 911111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
                                    1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
                                    1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
                                    1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
                                    1111111111111111111111111111111111111111111

Added a fix to support chat_template, similar to the oga script here: https://github.com/microsoft/onnxruntime-genai/blob/main/examples/python/model-chat.py
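A minimal sketch of the idea behind the fix (helper name and template string are illustrative, not the actual patch): instead of tokenizing the raw prompt, wrap it in the model's chat template first, the way the linked model-chat.py example does. Without the template, chat-tuned models like DeepSeek-R1-Distill see a bare string and degenerate into repeated tokens.

```python
# Hypothetical sketch of chat-template handling, assuming a ChatML-style
# template (used by the Qwen family). The real fix follows the pattern in
# onnxruntime-genai's examples/python/model-chat.py.

CHAT_TEMPLATE = "<|im_start|>user\n{input}<|im_end|>\n<|im_start|>assistant\n"

def apply_chat_template(prompt: str, template: str = CHAT_TEMPLATE) -> str:
    """Wrap a raw user prompt in the model's chat template before tokenizing."""
    return template.format(input=prompt)

# The tokenizer then encodes the formatted string rather than the raw prompt:
formatted = apply_chat_template("Hi there")
print(formatted)
```

The template string itself is model-specific; in practice it should come from the model's tokenizer config rather than being hard-coded.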

After the fix:

deepseek-ai/DeepSeek-R1-Distill-Qwen-7B:
        Build dir:      C:\Users\asonawane\.cache\lemonade\deepseek-ai_DeepSeek-R1-Distill-Qwen-7B
        Peak memory:    1.995 GB
        Status:         Successful build!
        Dtype:                      int4
        Device:                     cuda
        Oga Models Subfolder:       deepseek-ai_deepseek-r1-distill-qwen-7b\cuda-int4
        Prompt Tokens:              3
        Prompt:                     Hi there
        Response Tokens:            446
        Response:                   <|im_start|>user Hi there<|im_end|> <|im_start|>assistant Okay, so I need to figure out how to respond to this message. The user wrote something like "Hi there<|im_end|>
                                    <|im_start|>assistant" and then sent "Hi there<|im_end|> <|im_start|>assistant". I'm not entirely sure what they're trying to do, but I think they want to say "Hi there"
                                    and then "Hi there assistant" or something like that.  Maybe they're testing how the system responds to messages with different tags or something. I'm not very familiar
                                    with all the syntax here, but I know that "<|im_end|>" and "<|im_start|>" are used to mark the beginning and end of interactions. So perhaps the user is trying to see if
                                    the assistant can properly respond to their messages.  I should probably respond in a friendly way, letting them know I can help with whatever they need. Maybe something
                                    like "Hi there! How can I assist you today?" That sounds polite and gives them a clear direction to follow up if they need assistance.  I also need to make sure I'm using
                                    the correct tags so that the system recognizes my response. I think I should place my response between "<|im_start|>" and "<|im_end|>". So putting it all together, my
                                    response would be "<|im_start|>Hi there! How can I assist you today?<|im_end|>".  I should double-check to make sure I'm not missing any tags or misplacing them. It's easy
                                    to get confused with all the different parts, so taking it step by step would help prevent errors. Once I'm confident that my response is correctly formatted, I can send
                                    it off.  In summary, the user seems to be testing message formatting, and I want to make sure my response is properly structured so it gets through without issues. Being
                                    clear and friendly in my response should help them understand that I'm ready to assist them. </think>  <|im_start|>   Hi there! How can I assist you today?   <|im_end|>

@apsonawane apsonawane force-pushed the asonawane/prompt-fix branch 3 times, most recently from 1feb9de to 6adaa89 Compare February 1, 2025 00:22
@apsonawane apsonawane force-pushed the asonawane/prompt-fix branch 2 times, most recently from c048ccd to 505c391 Compare February 3, 2025 15:48
Signed-off-by: Akshay Sonawane <[email protected]>
@apsonawane apsonawane force-pushed the asonawane/prompt-fix branch from 505c391 to 3536a37 Compare February 3, 2025 16:45
@apsonawane
Collaborator Author

Closing this; opening a new PR in the main repo.

@apsonawane apsonawane closed this Feb 4, 2025