Issue
I use aider with ollama_chat: models.
Something I find moderately annoying is that the LLM output in Aider is not smooth, letter by letter as in ollama run or Open WebUI; instead I get 5-10 words in one go. That makes reading the output really annoying: while it is fast enough to read comfortably as it happens, my eyes have to come to a full stop and wait for the next few words to appear all at once.
Now, I have tcpdump'ed Aider's calls to Ollama, and Ollama is returning a streaming response in which each payload contains a single word. However, Aider does not show them one by one: the output updates with a bunch of words all at once.
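For reference, the same observation can be reproduced without tcpdump with a minimal Python sketch against Ollama's /api/chat streaming endpoint. This assumes a local Ollama server on the default port 11434; the model name is a placeholder, substitute your own. Each NDJSON chunk arrives carrying roughly one word:

```python
import json
import requests

# Stream a chat completion from a local Ollama server and print each
# chunk as it arrives (Ollama's /api/chat returns newline-delimited JSON).
resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llama3",  # placeholder model name
        "messages": [{"role": "user", "content": "Say hello in five words."}],
        "stream": True,
    },
    stream=True,
)
for line in resp.iter_lines():
    if not line:
        continue
    chunk = json.loads(line)
    # Each chunk carries a small piece of the reply, typically a single
    # token/word, in message.content; the final chunk sets done=true.
    print(repr(chunk.get("message", {}).get("content", "")), flush=True)
    if chunk.get("done"):
        break
```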
Is this a known issue, and are there any pointers on how to get the streaming response to flow smoothly into the Aider chat window?
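One mechanism that could explain this (an assumption on my part, not confirmed in Aider's source) is a terminal renderer that repaints at a fixed refresh rate, coalescing any tokens that arrive between repaints. A minimal sketch with rich, which Aider uses for terminal output, shows the effect: words arrive one at a time, but at 4 repaints per second they appear on screen in bursts of several words.

```python
import time
from rich.live import Live
from rich.text import Text

# Illustration only: words "arrive" one at a time (like Ollama's one-word
# chunks), but Live repaints at most refresh_per_second times per second,
# so several words show up in a single visual update.
words = "this output appears in bursts rather than word by word".split()
text = Text()
with Live(text, refresh_per_second=4) as live:
    for word in words:
        text.append(word + " ")
        time.sleep(0.05)  # ~20 words/s arriving vs ~4 repaints/s
```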
Version and model info
aider 0.86.1
model: ollama_chat