feat: multiple display modes and JSON flag for list command #149

Open

sammwyy wants to merge 358 commits into AlexsJones:main from sammwyy:feat/display-modes

Conversation

@sammwyy commented Mar 2, 2026

I moved the display.rs file from the TUI (CLI mode) to its own module, src/display/mod.rs, as a generic trait.

I created two implementations: json_mode.rs and table_mode.rs, with the possibility of adding more in the future.

This way, each file owns its implementation. In the CLI, I initialize the display mode once from the global "--json" flag instead of repeating an if statement in each subcommand.

I also added support for the "--json" flag to the "list" command.
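The refactor described above can be sketched as a trait with one implementation per output format. All names below (`ModelRow`, `DisplayMode`, `select_mode`) are illustrative, not the PR's actual API:

```rust
// Hypothetical sketch of a generic display trait in src/display/mod.rs,
// with one implementation per output format.

struct ModelRow {
    name: String,
    vram_gb: f64,
}

trait DisplayMode {
    fn render(&self, rows: &[ModelRow]) -> String;
}

struct TableMode;
impl DisplayMode for TableMode {
    fn render(&self, rows: &[ModelRow]) -> String {
        rows.iter()
            .map(|r| format!("{:<20} {:>6.1} GB", r.name, r.vram_gb))
            .collect::<Vec<_>>()
            .join("\n")
    }
}

struct JsonMode;
impl DisplayMode for JsonMode {
    fn render(&self, rows: &[ModelRow]) -> String {
        // Hand-rolled JSON for the sketch; a real implementation
        // would use serde_json.
        let items: Vec<String> = rows
            .iter()
            .map(|r| format!("{{\"name\":\"{}\",\"vram_gb\":{}}}", r.name, r.vram_gb))
            .collect();
        format!("[{}]", items.join(","))
    }
}

// The CLI picks the mode once from the global --json flag,
// instead of branching in every subcommand.
fn select_mode(json: bool) -> Box<dyn DisplayMode> {
    if json { Box::new(JsonMode) } else { Box::new(TableMode) }
}

fn main() {
    let rows = vec![ModelRow { name: "llama-3".into(), vram_gb: 8.0 }];
    println!("{}", select_mode(true).render(&rows));
}
```

Returning a `Box<dyn DisplayMode>` is what lets the subcommands stay format-agnostic: they only ever call `render`.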

AlexsJones and others added 30 commits February 21, 2026 21:06
Signed-off-by: Alex <alexsimonjones@gmail.com>
- release.yml now excludes v*-mac tags (CLI + crate + homebrew only)
- New release-desktop.yml triggers on v*-mac tags
- Uses --bundles app to produce .app bundle without code signing
- Searches both target/ and llmfit-desktop/target/ for bundle
- Desktop releases no longer slow down normal CLI releases

Signed-off-by: Three Foxes (in a Trenchcoat) <threefoxesyes3inatrenchcoat@gmail.com>
Problem: Multi-GPU systems had their VRAM summed into a single pool, leading to
overly optimistic model fit recommendations since most inference runtimes
(llama.cpp, Ollama, etc.) don't support tensor parallelism by default.

Changes:
- NVIDIA detection: group by model, keep max per-card VRAM (never sum)
- AMD ROCm detection: collect per-card VRAM, use max per-card
- Refactor nvidia-smi parsing into separate testable function
- Update display text from "GB VRAM total" → "GB VRAM each"
- Add unit tests for multi-GPU parsing behavior

This gives more realistic recommendations by assuming models must fit on
a single GPU unless explicitly configured for tensor parallelism.
fix: use per-card VRAM instead of summed for multi-GPU systems
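The per-card policy in this commit can be illustrated with a small sketch (the PR refactors the real nvidia-smi parsing into its own testable function; the helper below is hypothetical):

```rust
// Illustrative sketch of the multi-GPU policy described above:
// group cards by model name and keep the maximum per-card VRAM,
// never the sum across cards.
use std::collections::HashMap;

fn per_card_vram_mb(cards: &[(&str, u64)]) -> HashMap<String, u64> {
    let mut by_model: HashMap<String, u64> = HashMap::new();
    for (model, vram_mb) in cards {
        let entry = by_model.entry(model.to_string()).or_insert(0);
        *entry = (*entry).max(*vram_mb); // max, never sum
    }
    by_model
}

fn main() {
    // Two identical 24 GB cards: usable VRAM for one model is 24 GB,
    // not 48 GB, since llama.cpp/Ollama don't shard tensors by default.
    let cards = [("RTX 4090", 24_576), ("RTX 4090", 24_576)];
    let vram = per_card_vram_mb(&cards);
    assert_eq!(vram["RTX 4090"], 24_576);
}
```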
fix: typo in CHANGELOG.md (suppor -> support)
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…AlexsJones#49)

- For dense models: use choose_quant before deciding GPU path
- For MoE models: try quantization hierarchy in moe_offload_path
- Add moe_memory_for_quant helper to compute MoE memory at specific quant
- Add test_moe_offload_tries_lower_quantization test
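As a rough illustration of walking a quantization hierarchy from highest to lowest precision, here is a sketch; the bytes-per-parameter figures and the function body are hypothetical, not llmfit's actual `choose_quant`:

```rust
// Try quant levels in order of decreasing precision and return the
// first whose memory estimate fits the VRAM budget.
fn choose_quant(params_b: f64, vram_gb: f64) -> Option<&'static str> {
    // (quant name, approximate bytes per parameter) — illustrative values
    let hierarchy = [("Q8_0", 1.0), ("Q6_K", 0.8), ("Q5_K_M", 0.7), ("Q4_K_M", 0.6)];
    for (name, bytes_per_param) in hierarchy {
        // params in billions * bytes/param ≈ weight size in GB
        let needed_gb = params_b * bytes_per_param;
        if needed_gb <= vram_gb {
            return Some(name);
        }
    }
    None // nothing fits even at the lowest quant level
}

fn main() {
    // A 7B model in 8 GB VRAM: Q8_0 needs ~7 GB and fits.
    assert_eq!(choose_quant(7.0, 8.0), Some("Q8_0"));
    // A 70B model in 48 GB VRAM: falls through to Q4_K_M (~42 GB).
    assert_eq!(choose_quant(70.0, 48.0), Some("Q4_K_M"));
}
```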
- Add Remote Ollama instances section to README
- Documents OLLAMA_HOST env var for custom endpoints
- Addresses issue AlexsJones#40 - feature already exists but was undocumented
- Includes examples for remote servers, custom ports, Docker, etc.
docs: document OLLAMA_HOST environment variable for remote connections
…ysfs

Improve GPU identification fallback on Linux containers
- Rename llmfit-tui package to llmfit for crates.io continuity
- Add homepage and keywords to llmfit-core for publishing
- Update authors field to proper format
- Add version requirement for llmfit-core dependency

Fixes AlexsJones#58

Signed-off-by: Three Foxes (in a Trenchcoat) <threefoxesyes3inatrenchcoat@gmail.com>
- Publish llmfit-core first (dependency)
- Wait for crates.io index to update
- Then publish llmfit (depends on llmfit-core)

Signed-off-by: Three Foxes (in a Trenchcoat) <threefoxesyes3inatrenchcoat@gmail.com>
…/crates-io-metadata

fix: correct crates.io metadata and prepare for publishing
Signed-off-by: Alex <alexsimonjones@gmail.com>
Signed-off-by: Alex <alexsimonjones@gmail.com>
Signed-off-by: Alex <alexsimonjones@gmail.com>
Signed-off-by: Alex <alexsimonjones@gmail.com>
- Add RX 9060 XT (16GB) and RX 9060 (8GB) to estimate_vram_from_name()
- Fixes incorrect VRAM detection on Windows due to WMI UINT32 limitation
- Update comment to clarify this is RDNA 4 series

Fixes AlexsJones#55

Signed-off-by: Three Foxes (in a Trenchcoat) <threefoxesyes3inatrenchcoat@gmail.com>
…/amd-rx-9060-vram

fix: add AMD RX 9060 series to VRAM estimation database
AlexsJones and others added 27 commits March 12, 2026 22:08
  - test_gguf_source_deserialization — GgufSource JSON round-trips correctly
  - test_gguf_sources_default_to_empty — models without gguf_sources in JSON default to []
  - test_catalog_popular_models_have_gguf_sources — 5 well-known models (Llama-3.3-70B, Qwen2.5-7B, etc.)
  have non-empty gguf_sources in the catalog
  - test_catalog_gguf_sources_have_valid_repos — every gguf_source in the catalog has owner/repo format,
  non-empty provider, and contains GGUF
  - test_catalog_has_significant_gguf_coverage — at least 25% of catalog models have GGUF sources (currently
  30%)

  providers.rs (7 tests):
  - test_hf_name_to_gguf_candidates_generates_common_patterns — heuristic generates bartowski, ggml-org,
  TheBloke candidates
  - test_hf_name_to_gguf_candidates_strips_owner — strips the Org/ prefix correctly
  - test_lookup_gguf_repo_known_mappings — hardcoded mappings resolve for known models
  - test_lookup_gguf_repo_unknown_returns_none — unknown models return None
  - test_has_gguf_mapping_matches_known_models — boolean check works
  - test_gguf_candidates_fallback_covers_major_providers — fallback covers all 3 providers and all end in
  -GGUF
  - test_gguf_candidates_known_mapping_returns_single — hardcoded mapping returns exactly 1 result

Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
The JSON output (--json flag and API) was missing `moe_offloaded_gb`,
so MoE models showed only active-expert VRAM as `memory_required_gb`
without indicating the additional RAM needed for inactive experts.

Add `moe_offloaded_gb` and `total_memory_gb` (VRAM + offloaded RAM)
to both display and API JSON serializers so consumers can see the
full memory footprint.

Closes AlexsJones#230
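A minimal sketch of the added fields (field names taken from the commit message; the hand-rolled serializer below is illustrative, not the actual serde code):

```rust
// total_memory_gb = VRAM for active experts + RAM offloaded for
// inactive experts, so JSON consumers see the full footprint.
fn fit_json(memory_required_gb: f64, moe_offloaded_gb: f64) -> String {
    let total_memory_gb = memory_required_gb + moe_offloaded_gb;
    format!(
        "{{\"memory_required_gb\":{:.1},\"moe_offloaded_gb\":{:.1},\"total_memory_gb\":{:.1}}}",
        memory_required_gb, moe_offloaded_gb, total_memory_gb
    )
}

fn main() {
    // e.g. a MoE model needing 12 GB VRAM plus 30 GB offloaded to RAM
    println!("{}", fit_json(12.0, 30.0));
}
```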

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…-fields

fix: surface MoE offloaded RAM in JSON output
Signed-off-by: Alex <alexsimonjones@gmail.com>
Signed-off-by: Alex <alexsimonjones@gmail.com>
Add support for Docker Desktop's built-in Model Runner as a fourth
runtime provider alongside Ollama, llama.cpp, and MLX. Detection probes
the OpenAI-compatible /v1/models endpoint on localhost:12434 (configurable
via DOCKER_MODEL_RUNNER_HOST). Downloads use `docker model pull`.

A new scraper (scripts/scrape_docker_models.py) queries Docker Hub's ai/
namespace and cross-references against the HF model database to produce
an embedded catalog (docker_models.json) of confirmed available models.
Only models verified in the catalog appear as downloadable via Docker.

- Provider: detect, list installed, pull via docker CLI
- TUI: status bar shows Docker availability, 'D' in Inst column,
  provider picker includes Docker Model Runner
- Inst column refactored from enum to bitfield for extensibility
- Makefile: `make update-catalogs` refreshes all scrapers and rebuilds

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
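The detection endpoint described above can be sketched as follows, showing only the default/override URL construction; the actual HTTP probe of `/v1/models` and the `docker model pull` integration are omitted:

```rust
// Build the Docker Model Runner probe URL: defaults to localhost:12434,
// overridable via the DOCKER_MODEL_RUNNER_HOST environment variable.
use std::env;

fn model_runner_url() -> String {
    let host = env::var("DOCKER_MODEL_RUNNER_HOST")
        .unwrap_or_else(|_| "localhost:12434".to_string());
    format!("http://{}/v1/models", host)
}

fn main() {
    // Detection would issue a GET against this OpenAI-compatible endpoint
    // and treat a successful model list as "Docker Model Runner available".
    println!("{}", model_runner_url());
}
```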
Signed-off-by: Alex <alexsimonjones@gmail.com>
Signed-off-by: Alex <alexsimonjones@gmail.com>
Signed-off-by: Alex <alexsimonjones@gmail.com>
Signed-off-by: Alex <alexsimonjones@gmail.com>
Bumps [docker/metadata-action](https://github.com/docker/metadata-action) from 5.10.0 to 6.0.0.
- [Release notes](https://github.com/docker/metadata-action/releases)
- [Commits](docker/metadata-action@c299e40...030e881)

---
updated-dependencies:
- dependency-name: docker/metadata-action
  dependency-version: 6.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [docker/setup-buildx-action](https://github.com/docker/setup-buildx-action) from 3 to 4.
- [Release notes](https://github.com/docker/setup-buildx-action/releases)
- [Commits](docker/setup-buildx-action@v3...v4)

---
updated-dependencies:
- dependency-name: docker/setup-buildx-action
  dependency-version: '4'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [tauri-build](https://github.com/tauri-apps/tauri) from 2.5.5 to 2.5.6.
- [Release notes](https://github.com/tauri-apps/tauri/releases)
- [Commits](tauri-apps/tauri@tauri-build-v2.5.5...tauri-build-v2.5.6)

---
updated-dependencies:
- dependency-name: tauri-build
  dependency-version: 2.5.6
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Replaces the abbreviated Chinese README with a full translation covering
all sections: install, usage (TUI/CLI/REST API), how it works, model
database, project structure, runtime providers, platform support,
contributing, OpenClaw integration, and alternatives.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ctions/docker/metadata-action-6.0.0

chore(deps): bump docker/metadata-action from 5.10.0 to 6.0.0
…match

fix: prefer exact matches in info selection
…uri-build-2.5.6

chore(deps): bump tauri-build from 2.5.5 to 2.5.6
…ctions/docker/setup-buildx-action-4

chore(deps): bump docker/setup-buildx-action from 3 to 4
@sammwyy sammwyy force-pushed the feat/display-modes branch from 905cc92 to 5b4370a Compare March 16, 2026 23:49

sammwyy commented Mar 16, 2026

already


Labels

enhancement (New feature or request), Improvement
