Issue
I use aider with ollama_chat: models.
Something I find moderately annoying is that the LLM output in Aider is not smooth, letter by letter as in ollama run or Open WebUI; instead I get 5-10 words in one go. That makes reading the output really annoying: while it is fast enough to read comfortably as it happens, my eyes have to come to a full stop and wait for the next few words to appear all at once.
Now, I have tcpdump'ed Aider's calls to Ollama, and Ollama is returning a streaming response in which each payload contains a single word. However, Aider does not show them one by one: the output updates with a bunch of words all at once.
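For reference, the same observation can be reproduced without tcpdump with a minimal Python sketch against Ollama's /api/chat streaming endpoint. This assumes a local Ollama server on the default port 11434; the model name is a placeholder, substitute your own. Each NDJSON chunk arrives carrying roughly one word:

```python
import json
import requests

# Stream a chat completion from a local Ollama server and print each
# chunk as it arrives (Ollama's /api/chat returns newline-delimited JSON).
resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llama3",  # placeholder model name
        "messages": [{"role": "user", "content": "Say hello in five words."}],
        "stream": True,
    },
    stream=True,
)
for line in resp.iter_lines():
    if not line:
        continue
    chunk = json.loads(line)
    # Each chunk carries a small piece of the reply, typically a single
    # token/word, in message.content; the final chunk sets done=true.
    print(repr(chunk.get("message", {}).get("content", "")), flush=True)
    if chunk.get("done"):
        break
```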
Is this a known issue, and are there any pointers on how to get the streaming response to flow smoothly into the Aider chat window?
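One mechanism that could explain this (an assumption on my part, not confirmed in Aider's source) is a terminal renderer that repaints at a fixed refresh rate, coalescing any tokens that arrive between repaints. A minimal sketch with rich, which Aider uses for terminal output, shows the effect: words arrive one at a time, but at 4 repaints per second they appear on screen in bursts of several words.

```python
import time
from rich.live import Live
from rich.text import Text

# Illustration only: words "arrive" one at a time (like Ollama's one-word
# chunks), but Live repaints at most refresh_per_second times per second,
# so several words show up in a single visual update.
words = "this output appears in bursts rather than word by word".split()
text = Text()
with Live(text, refresh_per_second=4) as live:
    for word in words:
        text.append(word + " ")
        time.sleep(0.05)  # ~20 words/s arriving vs ~4 repaints/s
```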
Version and model info
aider 0.86.1
model: ollama_chat