Repositories list
595 repositories
- Ongoing research training transformer models at scale
- cccl (Public): CUDA Core Compute Libraries
- TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.
- NeMo-Guardrails (Public)
- C++ and Python support for the CUDA Quantum programming model for heterogeneous quantum-classical workflows
- A unified library of state-of-the-art model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed.
- cutlass (Public): CUDA Templates for Linear Algebra Subroutines
- NeMo-Skills (Public)
- Isaac-GR00T (Public)
- cloud-native-docs (Public): Documentation repository for NVIDIA Cloud Native Technologies
- nv-ingest (Public): NeMo Retriever extraction is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extraction uses specialized NVIDIA NIM microservices to find, contextualize, and extract text, tables, charts and images that you can use in downstream generative applications.
- cuEquivariance is a math library that provides a collection of low-level primitives and tensor ops to accelerate widely-used models, like DiffDock, MACE, Allegro and NEQUIP, based on equivariant neural networks. Also includes kernels for accelerated structure prediction.
- MatX (Public)
- topograph (Public)
- TransformerEngine (Public): A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory utilization in both training and inference.
- cuda-checkpoint (Public)
- cuopt (Public): GPU-accelerated decision optimization
- jitify (Public): A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC).
- garak (Public): the LLM vulnerability scanner
- nvshmem (Public): NVIDIA NVSHMEM is a parallel programming interface for NVIDIA GPUs based on OpenSHMEM. NVSHMEM can significantly reduce multi-process communication and coordination overheads by allowing programmers to perform one-sided communication from within CUDA kernels and on CUDA streams.
- numba-cuda (Public)