Better user experience for llamacpp-server #10012
PierreCarceller started this conversation in Ideas
Hello!
I'm a llamacpp-server user, in particular of the OpenAI-compatible APIs. In a dream world I would like:

A bit like what you can do with vLLM.
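For context, this is the kind of OpenAI-compatible call the post refers to, as a minimal sketch; the port, API key, and model name below are assumptions, not taken from the discussion:

```python
# Minimal sketch of the current OpenAI-compatible usage. Assumes llama-server
# was started with something like: llama-server -m model.gguf --port 8080
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # assumed address of the local llama-server
    api_key="sk-no-key-required",         # llama-server does not check the key by default
)

resp = client.chat.completions.create(
    model="local-model",  # placeholder; the server answers with whatever model it loaded
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)
print(resp.choices[0].message.content)
```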
I can think of 2 alternative solutions that are a little less practical but easier to set up.
Solution 1: Create a /message-format endpoint that would apply the chat template to the list of messages sent, and then use the /completion endpoint that already exists.
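To make the first proposal concrete, here is a sketch of the two-step flow a client could use. The /message-format route and the shape of its response are hypothetical (they are the proposal, not an existing llama.cpp API); the /completion call matches the server's existing prompt-based endpoint, with the address assumed:

```python
# Sketch of the proposed two-step flow. /message-format is hypothetical
# (it is the proposal, not an existing llama.cpp endpoint); /completion exists today.
import requests

BASE = "http://localhost:8080"  # assumed llama-server address

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

# Step 1 (proposed): ask the server to apply its chat template to the messages.
# The "prompt" field in the response is an assumed shape, for illustration only.
fmt = requests.post(f"{BASE}/message-format", json={"messages": messages})
prompt = fmt.json()["prompt"]

# Step 2 (existing): send the pre-formatted prompt to /completion.
resp = requests.post(f"{BASE}/completion", json={"prompt": prompt, "n_predict": 128})
print(resp.json()["content"])
```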
Solution 2: Let the client format the message list on its own (with Jinja, for example) before using the /completion endpoint. But in that case, the server must provide access to the information needed to do the job on the client side (the BOS token, the EOS token, etc.).

I hope I haven't missed any important information.
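And a sketch of the second proposal, doing the formatting client-side with the Jinja chat template bundled in the model's Hugging Face tokenizer; the repo id is a placeholder, and whether the server adds another BOS token on top of the rendered prompt should be verified against the running instance:

```python
# Sketch of client-side formatting with the Jinja chat template shipped in the
# model's tokenizer. The repo id is a placeholder: use the tokenizer matching
# the GGUF the server is actually running.
import requests
from transformers import AutoTokenizer

BASE = "http://localhost:8080"  # assumed llama-server address

tok = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

# Render the template to a plain string; special tokens (BOS etc.) are whatever
# the template itself inserts.
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Send the pre-formatted prompt to the existing /completion endpoint.
# Worth checking whether the server prepends another BOS token; strip it here if so.
resp = requests.post(f"{BASE}/completion", json={"prompt": prompt, "n_predict": 128})
print(resp.json()["content"])
```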
Reply:
Let me know if you need more help or if you have a specific example that you would like demonstrated.