feat: multiple display modes and json flag for list command #149
Open
sammwyy wants to merge 358 commits into AlexsJones:main from
Conversation
Signed-off-by: Alex <alexsimonjones@gmail.com>
- release.yml now excludes v*-mac tags (CLI + crate + homebrew only)
- New release-desktop.yml triggers on v*-mac tags
- Uses --bundles app to produce .app bundle without code signing
- Searches both target/ and llmfit-desktop/target/ for bundle
- Desktop releases no longer slow down normal CLI releases

Signed-off-by: Three Foxes (in a Trenchcoat) <threefoxesyes3inatrenchcoat@gmail.com>
Problem: Multi-GPU systems had their VRAM summed into a single pool, leading to overly optimistic model fit recommendations, since most inference runtimes (llama.cpp, Ollama, etc.) don't support tensor parallelism by default.

Changes:
- NVIDIA detection: group by model, keep max per-card VRAM (never sum)
- AMD ROCm detection: collect per-card VRAM, use max per-card
- Refactor nvidia-smi parsing into a separate testable function
- Update display text from "GB VRAM total" → "GB VRAM each"
- Add unit tests for multi-GPU parsing behavior

This gives more realistic recommendations by assuming models must fit on a single GPU unless explicitly configured for tensor parallelism.
fix: use per-card VRAM instead of summed for multi-GPU systems
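The per-card grouping described above can be sketched as follows. This is an illustrative stand-in, not llmfit's actual function: it assumes nvidia-smi was invoked with `--query-gpu=name,memory.total --format=csv,noheader,nounits`, and the function name is hypothetical.

```rust
// Hypothetical sketch of per-card VRAM selection: keep the single largest
// card instead of summing across GPUs.
fn max_vram_per_card(nvidia_smi_csv: &str) -> Option<(String, u64)> {
    nvidia_smi_csv
        .lines()
        .filter_map(|line| {
            // Each line looks like "NVIDIA GeForce RTX 3090, 24576" (MiB).
            let (name, mib) = line.rsplit_once(',')?;
            Some((name.trim().to_string(), mib.trim().parse::<u64>().ok()?))
        })
        // Never sum: the model must fit on one card.
        .max_by_key(|&(_, mib)| mib)
}

fn main() {
    let csv = "NVIDIA GeForce RTX 3090, 24576\nNVIDIA GeForce RTX 3090, 24576";
    if let Some((name, mib)) = max_vram_per_card(csv) {
        println!("{name}: {} GB VRAM each", mib / 1024);
    }
}
```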
fix: typo in CHANGELOG.md (suppor -> support)
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Fix compile warnings in providers and TUI
…AlexsJones#49)
- For dense models: use choose_quant before deciding GPU path
- For MoE models: try quantization hierarchy in moe_offload_path
- Add moe_memory_for_quant helper to compute MoE memory at a specific quant
- Add test_moe_offload_tries_lower_quantization test
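The "try quantization hierarchy" idea above can be sketched like this. The helpers here only model the shape of the logic: `moe_memory_for_quant` and the bytes-per-weight ratios are illustrative assumptions, not the project's real implementation.

```rust
// Quant names with assumed size ratios relative to Q8_0 (illustrative values).
const QUANTS: &[(&str, f64)] = &[("Q8_0", 1.0), ("Q5_K_M", 0.66), ("Q4_K_M", 0.56)];

// Hypothetical stand-in for moe_memory_for_quant: active-expert memory at the
// base quant, scaled by the candidate quant's size ratio.
fn moe_memory_for_quant(active_gb: f64, ratio: f64) -> f64 {
    active_gb * ratio
}

// Walk the hierarchy from highest to lowest quality; return the first quant
// whose MoE active-expert memory fits in available VRAM.
fn moe_offload_path(active_gb: f64, vram_gb: f64) -> Option<&'static str> {
    QUANTS
        .iter()
        .find(|&&(_, ratio)| moe_memory_for_quant(active_gb, ratio) <= vram_gb)
        .map(|&(name, _)| name)
}

fn main() {
    // 20 GB of active experts on a 12 GB card falls through to Q4_K_M.
    println!("{:?}", moe_offload_path(20.0, 12.0));
}
```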
- Add Remote Ollama instances section to README
- Documents OLLAMA_HOST env var for custom endpoints
- Addresses issue AlexsJones#40 - feature already exists but was undocumented
- Includes examples for remote servers, custom ports, Docker, etc.
docs: document OLLAMA_HOST environment variable for remote connections
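The documented behavior amounts to "honor OLLAMA_HOST when set, otherwise use Ollama's default local endpoint". A minimal sketch, assuming the default is `http://localhost:11434` (Ollama's standard port); the function names are illustrative, not llmfit's actual API:

```rust
use std::env;

// Pure helper so the fallback logic is testable without touching the process
// environment.
fn endpoint_from(host: Option<String>) -> String {
    host.unwrap_or_else(|| "http://localhost:11434".to_string())
}

// Honors OLLAMA_HOST, e.g. OLLAMA_HOST=http://192.168.1.50:11434 llmfit
fn ollama_endpoint() -> String {
    endpoint_from(env::var("OLLAMA_HOST").ok())
}

fn main() {
    println!("probing {}", ollama_endpoint());
}
```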
…ysfs Improve GPU identification fallback on Linux containers
- Rename llmfit-tui package to llmfit for crates.io continuity
- Add homepage and keywords to llmfit-core for publishing
- Update authors field to proper format
- Add version requirement for llmfit-core dependency

Fixes AlexsJones#58

Signed-off-by: Three Foxes (in a Trenchcoat) <threefoxesyes3inatrenchcoat@gmail.com>
- Publish llmfit-core first (dependency)
- Wait for crates.io index to update
- Then publish llmfit (depends on llmfit-core)

Signed-off-by: Three Foxes (in a Trenchcoat) <threefoxesyes3inatrenchcoat@gmail.com>
…/crates-io-metadata fix: correct crates.io metadata and prepare for publishing
ci: enable windows build targets
Signed-off-by: Alex <alexsimonjones@gmail.com>
Signed-off-by: Alex <alexsimonjones@gmail.com>
Signed-off-by: Alex <alexsimonjones@gmail.com>
Signed-off-by: Alex <alexsimonjones@gmail.com>
- Add RX 9060 XT (16GB) and RX 9060 (8GB) to estimate_vram_from_name()
- Fixes incorrect VRAM detection on Windows due to WMI UINT32 limitation
- Update comment to clarify this is the RDNA 4 series

Fixes AlexsJones#55

Signed-off-by: Three Foxes (in a Trenchcoat) <threefoxesyes3inatrenchcoat@gmail.com>
…/amd-rx-9060-vram fix: add AMD RX 9060 series to VRAM estimation database
…ndencies chore: Update dependencies
- test_gguf_source_deserialization — GgufSource JSON round-trips correctly
- test_gguf_sources_default_to_empty — models without gguf_sources in JSON default to []
- test_catalog_popular_models_have_gguf_sources — 5 well-known models (Llama-3.3-70B, Qwen2.5-7B, etc.) have non-empty gguf_sources in the catalog
- test_catalog_gguf_sources_have_valid_repos — every gguf_source in the catalog has owner/repo format, a non-empty provider, and contains GGUF
- test_catalog_has_significant_gguf_coverage — at least 25% of catalog models have GGUF sources (currently 30%)

providers.rs (7 tests):
- test_hf_name_to_gguf_candidates_generates_common_patterns — heuristic generates bartowski, ggml-org, TheBloke candidates
- test_hf_name_to_gguf_candidates_strips_owner — strips the Org/ prefix correctly
- test_lookup_gguf_repo_known_mappings — hardcoded mappings resolve for known models
- test_lookup_gguf_repo_unknown_returns_none — unknown models return None
- test_has_gguf_mapping_matches_known_models — boolean check works
- test_gguf_candidates_fallback_covers_major_providers — fallback covers all 3 providers, and all candidates end in -GGUF
- test_gguf_candidates_known_mapping_returns_single — hardcoded mapping returns exactly 1 result

Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
The JSON output (--json flag and API) was missing `moe_offloaded_gb`, so MoE models showed only active-expert VRAM as `memory_required_gb` without indicating the additional RAM needed for inactive experts.

Add `moe_offloaded_gb` and `total_memory_gb` (VRAM + offloaded RAM) to both the display and API JSON serializers so consumers can see the full memory footprint.

Closes AlexsJones#230

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…-fields fix: surface MoE offloaded RAM in JSON output
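The relationship between the fields is simple: `total_memory_gb` is the sum of active-expert VRAM and offloaded RAM. A minimal sketch, assuming these three field names from the commit message; the surrounding function and the hand-rolled JSON are illustrative (the real code presumably uses the project's serializer):

```rust
// Hypothetical serializer sketch: total_memory_gb = VRAM + offloaded RAM,
// so JSON consumers see the full footprint of an MoE model.
fn fit_report_json(memory_required_gb: f64, moe_offloaded_gb: f64) -> String {
    let total_memory_gb = memory_required_gb + moe_offloaded_gb;
    format!(
        "{{\"memory_required_gb\":{memory_required_gb},\
         \"moe_offloaded_gb\":{moe_offloaded_gb},\
         \"total_memory_gb\":{total_memory_gb}}}"
    )
}

fn main() {
    // 18.5 GB of active experts in VRAM, 42 GB of inactive experts in RAM.
    println!("{}", fit_report_json(18.5, 42.0));
}
```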
Signed-off-by: Alex <alexsimonjones@gmail.com>
Signed-off-by: Alex <alexsimonjones@gmail.com>
Add support for Docker Desktop's built-in Model Runner as a fourth runtime provider alongside Ollama, llama.cpp, and MLX. Detection probes the OpenAI-compatible /v1/models endpoint on localhost:12434 (configurable via DOCKER_MODEL_RUNNER_HOST). Downloads use `docker model pull`. A new scraper (scripts/scrape_docker_models.py) queries Docker Hub's ai/ namespace and cross-references against the HF model database to produce an embedded catalog (docker_models.json) of confirmed available models. Only models verified in the catalog appear as downloadable via Docker. - Provider: detect, list installed, pull via docker CLI - TUI: status bar shows Docker availability, 'D' in Inst column, provider picker includes Docker Model Runner - Inst column refactored from enum to bitfield for extensibility - Makefile: `make update-catalogs` refreshes all scrapers and rebuilds Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
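The detection and download plumbing described above can be sketched as follows. This is a sketch under stated assumptions: the endpoint path, default port, and env var come from the commit message; the function names and URL handling are illustrative, and real detection would issue an HTTP GET against the returned URL.

```rust
use std::env;

// Build the probe URL for the OpenAI-compatible models endpoint, honoring
// DOCKER_MODEL_RUNNER_HOST and defaulting to Docker Model Runner's port.
fn model_runner_url(host: Option<String>) -> String {
    let host = host.unwrap_or_else(|| "http://localhost:12434".to_string());
    format!("{}/v1/models", host.trim_end_matches('/'))
}

// Argument vector for downloading a model via the docker CLI.
fn pull_command(model: &str) -> Vec<String> {
    ["docker", "model", "pull", model]
        .iter()
        .map(|s| s.to_string())
        .collect()
}

fn main() {
    println!("probe: {}", model_runner_url(env::var("DOCKER_MODEL_RUNNER_HOST").ok()));
    println!("pull:  {}", pull_command("ai/qwen2.5").join(" "));
}
```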
Signed-off-by: Alex <alexsimonjones@gmail.com>
Signed-off-by: Alex <alexsimonjones@gmail.com>
Signed-off-by: Alex <alexsimonjones@gmail.com>
Signed-off-by: Alex <alexsimonjones@gmail.com>
Bumps [docker/metadata-action](https://github.com/docker/metadata-action) from 5.10.0 to 6.0.0.
- [Release notes](https://github.com/docker/metadata-action/releases)
- [Commits](docker/metadata-action@c299e40...030e881)

updated-dependencies:
- dependency-name: docker/metadata-action
  dependency-version: 6.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [docker/setup-buildx-action](https://github.com/docker/setup-buildx-action) from 3 to 4.
- [Release notes](https://github.com/docker/setup-buildx-action/releases)
- [Commits](docker/setup-buildx-action@v3...v4)

updated-dependencies:
- dependency-name: docker/setup-buildx-action
  dependency-version: '4'
  dependency-type: direct:production
  update-type: version-update:semver-major

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [tauri-build](https://github.com/tauri-apps/tauri) from 2.5.5 to 2.5.6.
- [Release notes](https://github.com/tauri-apps/tauri/releases)
- [Commits](tauri-apps/tauri@tauri-build-v2.5.5...tauri-build-v2.5.6)

updated-dependencies:
- dependency-name: tauri-build
  dependency-version: 2.5.6
  dependency-type: direct:production
  update-type: version-update:semver-patch

Signed-off-by: dependabot[bot] <support@github.com>
Replaces the abbreviated Chinese README with a full translation covering all sections: install, usage (TUI/CLI/REST API), how it works, model database, project structure, runtime providers, platform support, contributing, OpenClaw integration, and alternatives. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ation Feat: Docs/chinese translation
…ctions/docker/metadata-action-6.0.0 chore(deps): bump docker/metadata-action from 5.10.0 to 6.0.0
…match fix: prefer exact matches in info selection
…uri-build-2.5.6 chore(deps): bump tauri-build from 2.5.5 to 2.5.6
…ctions/docker/setup-buildx-action-4 chore(deps): bump docker/setup-buildx-action from 3 to 4
I moved the display.rs file from the TUI (CLI mode) into its own module, src/display/mod.rs, as a generic trait.
I created two implementations, json_mode.rs and table_mode.rs, with the possibility of adding more in the future.
This way, each file handles its own implementation. In the CLI, I simply initialize the display mode based on the global "--json" flag instead of using an if statement in each subcommand.
I also added support for the "--json" flag for the "list" command.
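The refactor described above can be sketched as a trait with one implementation per output mode, chosen once from the global flag. Trait and method names here are illustrative, not the PR's exact API:

```rust
// Generic display trait: each output mode lives in its own implementation,
// mirroring the json_mode.rs / table_mode.rs split described in the PR.
trait DisplayMode {
    fn render_list(&self, models: &[&str]) -> String;
}

struct TableMode;
struct JsonMode;

impl DisplayMode for TableMode {
    fn render_list(&self, models: &[&str]) -> String {
        models.join("\n")
    }
}

impl DisplayMode for JsonMode {
    fn render_list(&self, models: &[&str]) -> String {
        let items: Vec<String> = models.iter().map(|m| format!("\"{m}\"")).collect();
        format!("[{}]", items.join(","))
    }
}

// One selection point instead of an `if json { ... }` branch in every subcommand.
fn make_display(json: bool) -> Box<dyn DisplayMode> {
    if json { Box::new(JsonMode) } else { Box::new(TableMode) }
}

fn main() {
    let display = make_display(true); // from the global --json flag
    println!("{}", display.render_list(&["llama3", "qwen2.5"]));
}
```

Subcommands then only ever call `render_list` (or its siblings) on the boxed trait object, so adding a third mode is a new file plus one arm in the constructor.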