LLM plugin for accessing Hugging Face Inference Providers - giving you access to 100+ open-weight models through a unified API.
This is a personal project that is still in development. Contributions and feedback are welcome, but please note that support may be limited.
Make sure LLM is installed on your machine.
Then clone this repository:
```bash
git clone https://github.com/sebington/llm-hf.git
cd llm-hf
llm install -e .
```

You need a Hugging Face access token with "Make calls to Inference Providers" permissions.
First, create a token at https://huggingface.co/settings/tokens/new?tokenType=fineGrained.
Then configure it using one of these methods:
Option 1: Using llm keys (recommended)
```bash
llm keys set hf
# <paste token here>
```
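To confirm the key was saved, you can list the names of stored keys (a standard LLM command, not specific to this plugin):

```bash
# Lists stored key names only, never their values
llm keys list
```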
Option 2: Using environment variable
```bash
export HF_TOKEN="your-token-here"
```
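Since the plugin reads the token from the environment, you can also supply it for a single invocation without exporting it (standard POSIX shell behavior; the prompt here is illustrative):

```bash
HF_TOKEN="your-token-here" llm -m meta-llama/Llama-3.1-8B-Instruct "Hello"
```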
The plugin provides an `llm hf` command group for managing Hugging Face models:

```bash
# List all available Hugging Face models
llm hf models

# Refresh the model list from the API and see what changed
llm hf refresh
```

The `llm hf refresh` command is particularly useful to:
- Check if new models have been added to Hugging Face Inference Providers
- See which models have been removed from the service
- Verify your token is working correctly
Alternative way to list models:
```bash
llm models | grep HuggingFaceChat
```

Both methods show ~116 models, dynamically fetched from the Hugging Face API.
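If you only want to confirm that count, `grep -c` prints the number of matching lines instead of the lines themselves (standard grep, nothing plugin-specific):

```bash
llm models | grep -c HuggingFaceChat
```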
Simply use the model name directly:
```bash
llm -m meta-llama/Llama-3.1-8B-Instruct "Write a poem about translation"
```

With options:
```bash
llm -m Qwen/Qwen2.5-Coder-32B-Instruct \
  -o temperature 0.7 \
  -o max_tokens 500 \
  "Write a Python function to sort a list"
```

With a specific provider:
```bash
llm -m meta-llama/Llama-3.1-8B-Instruct \
  -o provider sambanova \
  "What is the capital of France?"
```

In chat mode:
```bash
llm chat -m meta-llama/Llama-3.1-8B-Instruct
```

With a system prompt:
```bash
llm -m Qwen/Qwen2.5-Coder-32B-Instruct \
  -s "You are a helpful coding assistant" \
  "How do I sort a list in Python?"
```

The following options are supported:

- `provider` (optional): Specify a provider (e.g., `sambanova`, `together`, `fireworks-ai`, `groq`)
  - If not specified, Hugging Face automatically selects the best available provider
  - Note: Not all providers support all models
- `temperature`: Sampling temperature between 0.0 and 2.0 (default: provider default)
- `max_tokens`: Maximum number of tokens to generate (default: provider default)
- `top_p`: Nucleus sampling parameter between 0.0 and 1.0 (default: provider default)
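Putting these together, here is one way to combine a provider choice with the sampling options; the model, provider, and values are illustrative, not recommendations:

```bash
llm -m meta-llama/Llama-3.1-8B-Instruct \
  -o provider groq \
  -o temperature 0.2 \
  -o max_tokens 256 \
  -o top_p 0.9 \
  "Summarize the plot of Hamlet in two sentences"
```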
When using the provider option, you can choose from:
- `sambanova`
- `together`
- `fireworks-ai`
- `groq`
- `cerebras`
- `hyperbolic`
- `featherless-ai`
- `nebius`
- `novita`
- And more!
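Since provider support varies by model (see the note below), a quick shell loop is one way to compare a few of them. This sketch assumes each listed provider serves the example model; unsupported combinations will simply error:

```bash
# Run the same prompt against several providers for comparison
for p in sambanova together groq; do
  echo "== $p =="
  llm -m meta-llama/Llama-3.1-8B-Instruct -o provider "$p" "Say hi in French"
done
```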
Note: Each provider supports different models. If you request a model from a provider that doesn't support it, you'll get an error message.
All models available through Hugging Face Inference Providers are automatically discoverable using the commands above.
You can also browse the available models on the Hugging Face website. The plugin uses the same model list as the Hugging Face API, so any model shown in the playground should work with this plugin. Run `llm hf refresh` periodically to update your local model list.
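If you'd rather refresh on a schedule than by hand, a standard cron entry works; the timing below is illustrative:

```bash
# Hypothetical crontab line: refresh the model list every Monday at 09:00
0 9 * * 1 llm hf refresh
```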
All prompts and responses are automatically logged. View logs with:
```bash
llm logs
```

View the most recent entry:
```bash
llm logs -n 1
```

Apache 2.0