Skip to content

common : refactor downloading system, handle mmproj with -hf option #12694

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Apr 1, 2025

Conversation

ngxson
Copy link
Collaborator

@ngxson ngxson commented Apr 1, 2025

Ref comment: #12664 (comment)

Changes in this PR:

  • Refactor download system
  • Remove usage of JSON in common_get_hf_file
  • Add --mmproj-url ; if -hf-repo is specify, we default to the value returned by HF backend
    The goal is to make vision examples a bit more simple to play with, for example: llama-gemma3-cli -hf bartowski/google_gemma-3-4b-it-GGUF now works without specifying --mmproj

The download system is now moved to arg.cpp. A new struct common_params_model is added to store the path/url of the model. The logic is as follow:

View text caption
  • either model.path, model.url, or hf_repo/hf_file is supplied from CLI
  • common_params_handle_model firstly fill in and make sure the values are correct:
    • if hf_repo is in the repo/user:tag format, it ask the HF backend for the hf_file
    • if both hf_repo and hf_file is there, we construct the model.url
    • if model.url is there but no model.path, we set model.path to default cache directory
  • common_params_handle_model then call common_download_model to download the actual file
    • it firstly make a call to common_download_file_single to download the first shard
    • then optionally call common_download_file_multiple to download the remaining shards

After common_params_handle_model returns, the model.path is ready to be loaded.

Diagram generated by Claude:

sequenceDiagram
    actor CLI as common_params_parse_ex
    participant HMDL as common_params_handle_model
    participant DWN as common_download_model
    participant S1 as common_download_file_single
    participant SM as common_download_file_multiple
    
    CLI->>HMDL: model.path/model.url/hf_repo/hf_file
    
    Note over HMDL: Process inputs
    
    alt hf_repo in repo/user:tag format
        HMDL->>HMDL: Query HF backend for hf_file
    end
    
    alt both hf_repo & hf_file exist
        HMDL->>HMDL: Construct model.url
    end
    
    alt model.url exists but no model.path
        HMDL->>HMDL: Set model.path to default cache
    end
    
    HMDL->>DWN: Download model.url to model.path
    
    DWN->>S1: Download first shard
    S1-->>DWN: First shard downloaded
    
    alt Multiple shards exist
        DWN->>SM: Download remaining shards
        SM-->>DWN: All shards downloaded
    end
    
    DWN-->>HMDL: Download complete
    HMDL-->>CLI: model.path ready to be loaded
Loading

@ngxson ngxson changed the title common : refactor downloading system common : refactor downloading system, handle mmproj with -hf option Apr 1, 2025
@ngxson ngxson requested a review from ggerganov April 1, 2025 15:27
@ngxson
Copy link
Collaborator Author

ngxson commented Apr 1, 2025

@ggerganov Sorry for including both refactoring + functional changes in one single PR. Indeed, I started with doing the auto download for --mmproj with -hf, but realized that the code is a bit messy to make it work, so I decide to do both at the same time.

@github-actions github-actions bot added testing Everything test related examples server labels Apr 1, 2025
@ngxson
Copy link
Collaborator Author

ngxson commented Apr 1, 2025

Hmm ok seems like I missed one case (that's why server CI fails), fixing it now..

@ngxson ngxson marked this pull request as ready for review April 1, 2025 20:28
@ngxson ngxson merged commit 267c139 into ggml-org:master Apr 1, 2025
47 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
examples server testing Everything test related
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants