fim : support n_cmpl by ggerganov · Pull Request #125 · ggml-org/llama.vim

ggerganov · 2026-05-12T14:52:18Z

Overview

Support generating multiple completions at a time using the n_cmpl parameter.

Approach

Cache as ring buffer per key

Each cache key (hash of prefix/middle/suffix) now maps to a ring buffer of up to n_cmpl individual completion responses. When a new completion arrives for an existing key, it is appended; if the ring is full, the oldest entry is evicted. Completions with duplicate content are skipped on insert.

This naturally accumulates diverse completions across multiple requests for the same position, without packing them into a single array value.
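The per-key ring buffer described above can be sketched as follows. This is a minimal illustration in Python, not the plugin's Vim script: the class name `CompletionCache`, the SHA-256 key derivation, and the use of plain strings in place of full response objects are all assumptions made for the example.

```python
from collections import deque
import hashlib


class CompletionCache:
    """Maps a position key to a bounded ring of distinct completions."""

    def __init__(self, n_cmpl=1):
        self.n_cmpl = n_cmpl
        self.rings = {}  # key -> deque of completion strings

    @staticmethod
    def key(prefix, middle, suffix):
        # Hypothetical key derivation; the plugin's actual hash may differ.
        data = "\x00".join((prefix, middle, suffix)).encode("utf-8")
        return hashlib.sha256(data).hexdigest()

    def insert(self, key, content):
        """Add a completion; return False if it was a duplicate."""
        ring = self.rings.setdefault(key, deque(maxlen=self.n_cmpl))
        if content in ring:
            return False  # duplicate content is skipped
        ring.append(content)  # a full deque(maxlen=...) drops its oldest entry
        return True

    def get(self, key):
        """Return the cached completions for a key, oldest first."""
        return list(self.rings.get(key, ()))


cache = CompletionCache(n_cmpl=2)
k = CompletionCache.key("def add(a, b):\n    ", "", "\n")
cache.insert(k, "return a + b")
cache.insert(k, "return a + b")       # skipped as a duplicate
cache.insert(k, "return sum((a, b))")
cache.insert(k, "return b + a")       # ring is full: evicts "return a + b"
```

With `n_cmpl=2`, the final `get(k)` would hold only the two most recent distinct completions, which matches the eviction rule stated above.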

Simplified `fim_try_hint`

Two clear phases:

1. Exact match: look up the current position hash. If completions exist, render them and enable cycling (<C-J>/<C-K>).
2. Nearby match (only if phase 1 yields nothing): scan 128 characters back for a cached completion whose start matches what was typed. Pick the single best match and render it without cycling.
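The two-phase lookup can be sketched roughly as below. This is an illustrative Python sketch under stated assumptions, not the plugin's code: the cache is a plain dict keyed by the text before the cursor (the real cache keys are hashes), the function name `try_hint` is hypothetical, and "best match" is assumed to mean the candidate with the longest remaining text, which the PR description does not specify.

```python
def try_hint(cache, prefix, max_back=128):
    """Return (completions, can_cycle) for the text before the cursor.

    `cache` maps a position string to a list of completion strings.
    """
    # Phase 1: exact match at the current position, cycling enabled.
    exact = cache.get(prefix, [])
    if exact:
        return list(exact), True

    # Phase 2: look for a completion cached at an earlier position whose
    # beginning equals the characters typed since that position.
    best = ""
    for back in range(1, min(max_back, len(prefix)) + 1):
        typed = prefix[-back:]
        for content in cache.get(prefix[:-back], []):
            remainder = content[back:]
            # Assumption: prefer the candidate with the most text left.
            if content.startswith(typed) and len(remainder) > len(best):
                best = remainder
    # A nearby match yields a single hint and no cycling.
    return ([best], False) if best else ([], False)
```

For example, with `"bar)"` and `"baz, 1)"` cached at `"foo("`, typing `b` gives no exact match at `"foo(b"`, so phase 2 would offer the remainder of one cached completion as a single, non-cyclable hint.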

New config options

- `n_cmpl` (default: 1): max completions per position in the ring buffer
- `keymap_fim_next` / `keymap_fim_prev` (default: <C-J> / <C-K>): cycle through completions

Changes

- `s:cache_insert` / `s:cache_get`: ring buffer per key, dedup by content
- `s:fim_on_response`: normalize server response (dict or array), insert each individually
- `s:fim_try_hint`: simplified two-phase logic
- `s:fim_render`: accept list of responses + selected index + fuzzy flag
- `llama#fim_cycle`: cycle through completions
- Info bar shows [N/M] completion index and total cached entries
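The response normalization step mentioned for `s:fim_on_response` might look like the following. This is a hedged Python sketch rather than the plugin's Vim script: the function name `normalize_response` is hypothetical, and the `content` field used in the usage lines is assumed from the "dedup by content" wording above.

```python
import json


def normalize_response(raw):
    """Return a list of response dicts from a JSON body that is either
    a single object or an array of objects."""
    parsed = json.loads(raw)
    if isinstance(parsed, dict):
        return [parsed]  # single completion: wrap it
    if isinstance(parsed, list):
        # several completions: keep only well-formed entries
        return [item for item in parsed if isinstance(item, dict)]
    return []  # anything else is ignored


# Each normalized response would then be inserted into the cache on its own.
for response in normalize_response('[{"content": "a"}, {"content": "b"}]'):
    text = response.get("content", "")
```

Normalizing first means the insert path only ever handles one completion at a time, whichever shape the server returned.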

AI usage disclosure: YES. llama.cpp + pi

…port

- cache now stores up to n_cmpl individual completions per key (ring buffer)
- simplify fim_try_hint: exact match with cycling, nearby match without
- add n_cmpl config option (default: 1)
- add keymap_fim_next/keymap_fim_prev for cycling (<C-J>/<C-K>)
- add llama#fim_cycle() for cycling through completions
- deduplicate completions on cache insert by content
- update info bar to show total cached entries and completion index

Assisted-by: llama.cpp:local pi

Map the cycle keymaps whenever a FIM hint is shown, not just when multiple completions are available. llama#fim_cycle returns '' early when there is nothing to cycle, preventing the default insert-mode behavior of moving the cursor. Assisted-by: llama.cpp:local pi

The fuzzy flag was only used to suppress the [N/M] completion index in the info bar. Simply checking len(responses) > 1 is sufficient. Assisted-by: llama.cpp:local pi
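The cycling and info-bar behavior described in the two commit messages above can be illustrated as follows. This is a Python sketch under assumptions, not the plugin's implementation: the `state` dict and its `responses` and `index` keys are hypothetical, wraparound at either end is assumed rather than stated in the PR, and M in the label is taken here to be the number of completions at the current position.

```python
def fim_cycle(state, direction):
    """Move the selected completion by `direction` (+1 or -1).

    Always returns '' so that a key mapped to this function inserts
    nothing, including when there is nothing to cycle.
    """
    responses = state.get("responses", [])
    if len(responses) < 2:
        return ""  # nothing to cycle: return early, selection unchanged
    # Assumption: stepping past either end wraps around.
    state["index"] = (state["index"] + direction) % len(responses)
    return ""


def info_label(state):
    """Return the [N/M] label, shown only when there is a choice."""
    responses = state.get("responses", [])
    if len(responses) > 1:
        return "[%d/%d]" % (state["index"] + 1, len(responses))
    return ""
```

Because the label depends only on `len(responses) > 1`, a nearby match that renders a single response needs no separate flag to suppress it.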

ggerganov changed the title ~~core : add n_cmpl support~~ fim : add n_cmpl support May 12, 2026

ggerganov force-pushed the gg/n-cmpl branch from 85aa9bb to 7a420c9 May 12, 2026 19:08

ggerganov changed the title ~~fim : add n_cmpl support~~ fim : redesign cache as ring buffer per key with multi-completion support May 12, 2026

ggerganov marked this pull request as ready for review May 12, 2026 19:11

ggerganov added 2 commits May 12, 2026 22:17

fim : remove unnecessary fuzzy argument from s:fim_render

6d01f0b


ggerganov changed the title ~~fim : redesign cache as ring buffer per key with multi-completion support~~ fim : support n_cmpl May 13, 2026

ggerganov wants to merge 3 commits into master from gg/n-cmpl
