Instructions for AI agents contributing to this codebase.
llmfit is a Rust CLI/TUI tool that matches LLMs against local system hardware (RAM, CPU, GPU). It detects system specs, loads a model database from embedded JSON, scores each model's fit, and presents the results in an interactive terminal UI or a classic table output.
- Rust, edition 2024.
- Build with `cargo build`. Run with `cargo run`.
- No nightly features required. Stable toolchain only.
- Minimum supported Rust version: whatever edition 2024 requires (1.85+).
Source layout:

- `main.rs`: Entrypoint. Parses CLI args via clap. Launches the TUI by default; falls back to CLI subcommands (`system`, `list`, `fit`, `search`, `info`) or the `--cli` flag for classic table output.
- `hardware.rs`: `SystemSpecs::detect()` reads RAM/CPU via the sysinfo crate. `detect_gpu()` shells out to `nvidia-smi` / `rocm-smi` and detects Apple Silicon via `system_profiler`. On unified memory (Apple Silicon), VRAM = system RAM. No async. No unsafe.
- `models.rs`: `LlmModel` struct. `ModelDatabase` loads from `data/hf_models.json`, embedded via `include_str!()` at compile time. No runtime file I/O.
- `fit.rs`: `FitLevel` enum (Perfect, Good, Marginal, TooTight) and `RunMode` enum (Gpu, CpuOffload, CpuOnly). `ModelFit::analyze()` compares a model against `SystemSpecs`, selecting the best available execution path (GPU > CPU offload > CPU). `rank_models_by_fit()` sorts by fit level, then run mode, then utilization.
- `display.rs`: CLI-mode table rendering using the tabled crate. Only used when the `--cli` flag or a subcommand is invoked.
- `tui_app.rs`: TUI application state. Holds all models, the filters (search text, provider toggles, fit filter), and the selection index. All filtering logic lives here: `apply_filters()` recomputes the `filtered_fits` indices whenever any input changes.
- `tui_ui.rs`: Rendering with ratatui. Four layout regions: system bar, search/filter bar, model table (or detail pane), and status bar. Stateless rendering: reads from `App`, writes to `Frame`.
- `tui_events.rs`: Keyboard event handling with crossterm. Two modes: Normal (navigation, filter toggling, quit) and Search (text input).
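The ranking in `fit.rs` can be sketched with plain derived enums (a simplified sketch: the struct fields, the `demo()` helper, and the utilization tie-break direction are assumptions, not the real code):

```rust
// Sketch of the fit.rs ordering. Deriving Ord on C-like enums orders
// variants by declaration position, so declaring best-first lets an
// ascending sort put the best fits at the top.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord, Clone, Copy)]
enum FitLevel {
    Perfect,
    Good,
    Marginal,
    TooTight,
}

#[derive(Debug, PartialEq, Eq, PartialOrd, Ord, Clone, Copy)]
enum RunMode {
    Gpu,
    CpuOffload,
    CpuOnly,
}

struct ModelFit {
    name: &'static str,
    level: FitLevel,
    mode: RunMode,
    utilization: f64, // fraction of the chosen memory pool in use
}

// Sort by fit level, then run mode, then utilization (here: lowest first).
fn rank_models_by_fit(fits: &mut [ModelFit]) {
    fits.sort_by(|a, b| {
        (a.level, a.mode)
            .cmp(&(b.level, b.mode))
            .then(
                a.utilization
                    .partial_cmp(&b.utilization)
                    .expect("utilization is never NaN"),
            )
    });
}

fn demo() -> Vec<&'static str> {
    let mut fits = vec![
        ModelFit { name: "b", level: FitLevel::Good, mode: RunMode::Gpu, utilization: 0.4 },
        ModelFit { name: "a", level: FitLevel::Perfect, mode: RunMode::CpuOnly, utilization: 0.9 },
        ModelFit { name: "c", level: FitLevel::Perfect, mode: RunMode::Gpu, utilization: 0.6 },
    ];
    rank_models_by_fit(&mut fits);
    fits.iter().map(|f| f.name).collect()
}

fn main() {
    println!("{:?}", demo()); // best fit first: ["c", "a", "b"]
}
```

This is also why the invariant below about not adding `FitLevel` variants without updating the sort matters: variant order *is* the sort order.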
Startup and data flow:

- `App::new()` calls `SystemSpecs::detect()` and `ModelDatabase::new()`.
- Every model is analyzed into a `ModelFit` via `ModelFit::analyze()`.
- Results are sorted by `rank_models_by_fit()`.
- `apply_filters()` produces `filtered_fits: Vec<usize>` (indices into `all_fits`).
- The TUI render loop reads `App` state and draws via `tui_ui::draw()`; `tui_events::handle_events()` mutates `App` state, triggering a re-render.
Model database:

- Source: `data/hf_models.json` (33 models).
- Generated by `scripts/scrape_hf_models.py` (Python, stdlib only, no pip deps).
- Embedded at compile time via `include_str!("../data/hf_models.json")`.
- Schema per entry: `name`, `provider`, `parameter_count`, `min_ram_gb`, `recommended_ram_gb`, `min_vram_gb`, `quantization`, `context_length`, `use_case`.
- `min_vram_gb` is the VRAM needed for GPU inference; `min_ram_gb` is the system RAM needed for CPU inference. Both are derived from the same parameter count.
- RAM formula: `params * 0.5 bytes (Q4_K_M) / 1024^3 * 1.2` (runtime overhead).
- VRAM formula: `params * 0.5 bytes (Q4_K_M) / 1024^3 * 1.1` (activation overhead).
- Recommended RAM: `model_size * 2.0`.
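The formulas above, written out for a concrete model (a sketch: the function names are assumptions; the authoritative derivation lives in the scraper):

```rust
// Q4_K_M quantization stores roughly 0.5 bytes per parameter.
const BYTES_PER_PARAM_Q4: f64 = 0.5;
const GIB: f64 = 1024.0 * 1024.0 * 1024.0;

fn model_size_gb(params: f64) -> f64 {
    params * BYTES_PER_PARAM_Q4 / GIB
}

fn min_ram_gb(params: f64) -> f64 {
    model_size_gb(params) * 1.2 // 20% runtime overhead for CPU inference
}

fn min_vram_gb(params: f64) -> f64 {
    model_size_gb(params) * 1.1 // 10% activation overhead on GPU
}

fn recommended_ram_gb(params: f64) -> f64 {
    model_size_gb(params) * 2.0 // recommended RAM = model_size * 2.0
}

fn main() {
    let params = 7e9; // a 7B-parameter model
    println!("size:        {:.2} GiB", model_size_gb(params)); // 3.26
    println!("min RAM:     {:.2} GiB", min_ram_gb(params)); // 3.91
    println!("min VRAM:    {:.2} GiB", min_vram_gb(params)); // 3.59
    println!("recommended: {:.2} GiB", recommended_ram_gb(params)); // 6.52
}
```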
Do not manually edit `hf_models.json`. Regenerate it by running the scraper:

```sh
python3 scripts/scrape_hf_models.py
```

The scraper has hardcoded fallback entries for gated models that require authentication.
Conventions and invariants:

- No `unsafe` code.
- No `.unwrap()` on user-facing paths. Use proper error handling, or `expect()` with a descriptive message for internal invariants only.
- Fit levels are ordered: Perfect > Good > Marginal > TooTight. Do not add levels without updating the `rank_models_by_fit()` sort logic.
- Fit is VRAM-first. GPU inference with sufficient VRAM is the ideal path; CPU inference via system RAM is a fallback. The `RunMode` enum tracks which memory pool is in use (Gpu, CpuOffload, CpuOnly).
- `min_vram_gb` is the VRAM needed to load model weights on the GPU; `min_ram_gb` is the system RAM needed for CPU-only inference (same weights, loaded into RAM instead). They represent the same workload on different hardware paths.
- On Apple Silicon (unified memory), VRAM = system RAM. The `CpuOffload` path is skipped because there is no separate RAM pool to spill to. `SystemSpecs::unified_memory` tracks this.
- TUI rendering is stateless. `tui_ui::draw()` must not mutate `App`. Pass `&mut App` only for `TableState` widget requirements; do not use it to change application state.
- Event handling in `tui_events.rs` is the sole place that mutates `App` in the TUI loop.
- Keep `display.rs` and the `tui_*.rs` modules independent. The CLI path must work without initializing any TUI state.
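The VRAM-first selection with the unified-memory skip can be sketched with simplified types (`Specs`, `choose_mode`, and the exact guard conditions are assumptions; the real logic is `ModelFit::analyze()` in `fit.rs`):

```rust
// Sketch of the VRAM-first execution-path selection described above.
#[derive(Debug, PartialEq)]
enum RunMode {
    Gpu,
    CpuOffload,
    CpuOnly,
}

struct Specs {
    ram_gb: f64,
    vram_gb: f64,
    unified_memory: bool,
}

fn choose_mode(specs: &Specs, min_vram_gb: f64, min_ram_gb: f64) -> Option<RunMode> {
    if specs.vram_gb >= min_vram_gb {
        Some(RunMode::Gpu) // weights fit entirely in VRAM: ideal path
    } else if !specs.unified_memory && specs.vram_gb > 0.0 && specs.ram_gb >= min_ram_gb {
        // Partial offload needs a separate RAM pool to spill into, so it
        // is skipped on unified-memory machines (VRAM *is* system RAM).
        Some(RunMode::CpuOffload)
    } else if specs.ram_gb >= min_ram_gb {
        Some(RunMode::CpuOnly) // fallback: run entirely from system RAM
    } else {
        None // the model does not fit on this machine at all
    }
}

fn main() {
    let discrete = Specs { ram_gb: 32.0, vram_gb: 8.0, unified_memory: false };
    let unified = Specs { ram_gb: 16.0, vram_gb: 16.0, unified_memory: true };
    println!("{:?}", choose_mode(&discrete, 12.0, 10.0)); // Some(CpuOffload)
    println!("{:?}", choose_mode(&unified, 12.0, 10.0)); // Some(Gpu)
}
```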
To add a new model:

- Add the model's HuggingFace repo ID to `TARGET_MODELS` in `scripts/scrape_hf_models.py`.
- If the model is gated (requires HF auth), add a fallback entry to the `FALLBACK` dict in the same script.
- Run `python3 scripts/scrape_hf_models.py`.
- Verify the output in `data/hf_models.json`.
- Run `cargo build` to verify compilation.
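For reference when verifying the output, an entry in `data/hf_models.json` follows this shape (illustrative values derived from the 7B formulas above, not a real database entry):

```json
{
  "name": "Example-7B-Instruct",
  "provider": "ExampleOrg",
  "parameter_count": 7000000000,
  "min_ram_gb": 3.9,
  "recommended_ram_gb": 6.5,
  "min_vram_gb": 3.6,
  "quantization": "Q4_K_M",
  "context_length": 8192,
  "use_case": "general chat"
}
```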
To add a new TUI filter:

- Add the filter state to `App` in `tui_app.rs`.
- Add the filtering logic inside `apply_filters()`.
- Add the keybinding in `tui_events.rs` (Normal mode handler).
- Add the UI widget in `tui_ui.rs` (the `draw_search_and_filters()` function).
- Update the status bar help text in `draw_status_bar()`.
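The `apply_filters()` pattern can be sketched as follows (a minimal sketch showing only a search-text filter; the real `App` also holds provider toggles, a fit filter, and full `ModelFit` values):

```rust
// Sketch of the filter recomputation in tui_app.rs: filters never mutate
// the master list, they rebuild an index vector over it.
struct Fit {
    name: String,
}

struct App {
    all_fits: Vec<Fit>,
    search_text: String,
    filtered_fits: Vec<usize>, // indices into all_fits
}

impl App {
    // Recompute filtered_fits from scratch whenever any filter input changes.
    fn apply_filters(&mut self) {
        let needle = self.search_text.to_lowercase();
        self.filtered_fits = self
            .all_fits
            .iter()
            .enumerate()
            .filter(|(_, f)| needle.is_empty() || f.name.to_lowercase().contains(&needle))
            .map(|(i, _)| i)
            .collect();
    }
}

fn main() {
    let mut app = App {
        all_fits: vec![
            Fit { name: "Llama-3-8B".into() },
            Fit { name: "Mistral-7B".into() },
        ],
        search_text: "llama".into(),
        filtered_fits: Vec::new(),
    };
    app.apply_filters();
    println!("{:?}", app.filtered_fits); // [0]
}
```

Keeping indices rather than clones means the detail pane and the table always agree on which model is selected.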
To add a new CLI subcommand:

- Add a variant to the `Commands` enum in `main.rs`.
- Add the match arm in the `main()` function's command dispatch.
- Use `display.rs` functions for output, or add new ones as needed.
There are no tests yet. When adding tests:
- Unit tests for `fit.rs` logic (given known `SystemSpecs` and `LlmModel` values, assert the correct `FitLevel`).
- Unit tests for `models.rs` (verify JSON parsing and search matching).
- Integration tests for CLI subcommands via the `assert_cmd` crate.
- The TUI is difficult to unit test. Keep rendering stateless and test the state mutations in `tui_app.rs` directly.
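A `fit.rs` unit test might look like this sketch (the `classify` helper and its ratio thresholds are assumptions standing in for the real `ModelFit::analyze()` types):

```rust
// Hypothetical fit classification and the kind of unit test described
// above: known inputs, asserted FitLevel. The thresholds are illustrative.
#[derive(Debug, PartialEq)]
enum FitLevel {
    Perfect,
    Good,
    Marginal,
    TooTight,
}

fn classify(required_gb: f64, available_gb: f64) -> FitLevel {
    let used = required_gb / available_gb; // fraction of the pool consumed
    if used <= 0.5 {
        FitLevel::Perfect
    } else if used <= 0.75 {
        FitLevel::Good
    } else if used <= 1.0 {
        FitLevel::Marginal
    } else {
        FitLevel::TooTight
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn known_specs_produce_expected_levels() {
        assert_eq!(classify(4.0, 16.0), FitLevel::Perfect);
        assert_eq!(classify(12.0, 16.0), FitLevel::Good);
        assert_eq!(classify(15.0, 16.0), FitLevel::Marginal);
        assert_eq!(classify(24.0, 16.0), FitLevel::TooTight);
    }
}

fn main() {
    println!("{:?}", classify(4.0, 16.0)); // Perfect
}
```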
- Prefer crates that are well-maintained and have minimal transitive dependencies.
- `sysinfo` is the system detection crate. Do not replace it with raw platform calls.
- `ratatui` + `crossterm` is the TUI stack. Do not mix in `termion` or `ncurses`.
- `clap` with the derive feature handles CLI parsing. Do not use manual arg parsing.
- The Python scraper uses only the stdlib (`urllib`, `json`). Do not add pip dependencies.
```sh
# Build
cargo build

# Run TUI
cargo run

# Run CLI mode
cargo run -- --cli

# Run specific subcommands
cargo run -- system
cargo run -- fit --perfect -n 5
cargo run -- search "llama"

# Refresh model database
python3 scripts/scrape_hf_models.py && cargo build

# Check for compilation issues
cargo check

# Format code
cargo fmt

# Lint
cargo clippy
```

Platform notes:

- GPU detection shells out to `nvidia-smi` (NVIDIA) and `rocm-smi` (AMD). These are best-effort and fail silently if unavailable.
- Apple Silicon detection uses `system_profiler SPDisplaysDataType`. On unified memory Macs, VRAM is reported as the available system RAM (same pool).
- `sysinfo` handles cross-platform RAM/CPU detection. No conditional compilation needed.
- The TUI uses crossterm, which works on Linux, macOS, and Windows terminals.
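A best-effort probe in the style of `detect_gpu()` can be sketched with `std::process::Command` (the parsing and function name here are assumptions; only the `nvidia-smi` query flags are standard):

```rust
use std::process::Command;

// Query total VRAM of the first NVIDIA GPU, in MiB. Returns None if
// nvidia-smi is missing or errors, so detection fails silently as
// described above.
fn nvidia_vram_mb() -> Option<u64> {
    let out = Command::new("nvidia-smi")
        .args(["--query-gpu=memory.total", "--format=csv,noheader,nounits"])
        .output()
        .ok()?; // nvidia-smi not installed or not executable
    if !out.status.success() {
        return None;
    }
    // Output is one line per GPU, e.g. "8192"; take the first GPU.
    String::from_utf8(out.stdout)
        .ok()?
        .lines()
        .next()?
        .trim()
        .parse()
        .ok()
}

fn main() {
    match nvidia_vram_mb() {
        Some(mb) => println!("NVIDIA VRAM: {} MiB", mb),
        None => println!("no NVIDIA GPU detected"),
    }
}
```

Returning `Option` instead of propagating errors keeps the probe best-effort: a machine without the tool simply reports no GPU.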