Llama-Nemotron Support

[Llama-Nemotron models](https://huggingface.co/nvidia/Llama-3_3-Nemotron-Super-49B-v1_5) seem to offer significant ahead-of-time optimization, and handling Nemotron [may be useful in general](https://www.arxiv.org/abs/2508.15884). I took a peek at the config.json and it didn't look pretty in there. There's a [PR](https://github.com/turboderp-org/exllamav2/pull/726) on exllamav2 and a [merged](https://github.com/ggml-org/llama.cpp/pull/10669) one on llama.cpp, apparently by the same author.
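To get a feel for how irregular a config.json like this is before wiring up support, something like the sketch below might help. It assumes the messiness comes from per-layer settings stored as a list of dicts under some top-level key; that is my guess, not something verified against the actual file. The `summarize_per_layer_variation` helper and the `layer_specs` key in the example are hypothetical names used only for illustration.

```python
import json
from collections import Counter


def summarize_per_layer_variation(config: dict) -> dict:
    """For each top-level key holding a list of dicts, count distinct entries."""
    summary = {}
    for key, value in config.items():
        if isinstance(value, list) and value and all(isinstance(v, dict) for v in value):
            # Serialize with sorted keys so equal dicts map to the same string.
            variants = Counter(json.dumps(v, sort_keys=True) for v in value)
            summary[key] = {"layers": len(value), "distinct": len(variants)}
    return summary


# Hypothetical hand-written config; key names are illustrative only
# and are NOT taken from the real Nemotron config.json.
example_config = {
    "num_hidden_layers": 3,
    "layer_specs": [
        {"attention": "full", "ffn_mult": 2.0},
        {"attention": "skip", "ffn_mult": 1.0},
        {"attention": "full", "ffn_mult": 2.0},
    ],
}

print(summarize_per_layer_variation(example_config))
```

For a real checkpoint, the dict would come from `json.load` on the downloaded config.json. A high `distinct` count relative to `layers` would suggest the loader cannot assume one uniform layer shape.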

Llama-Nemotron Support #282
