    Repositories list

    • Ongoing research training transformer models at scale
      Python
      3.1k14k301124 Updated Sep 16, 2025
    • The NVIDIA NeMo Agent toolkit is an open-source library for efficiently connecting and optimizing teams of AI agents.
      Python
      3621.4k5319 Updated Sep 16, 2025
    • cccl

      Public
      CUDA Core Compute Libraries
      C++
      2671.9k1.1k157 Updated Sep 16, 2025
    • TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.
      C++
      1.7k12k748388 Updated Sep 16, 2025
    • NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.
      Python
      5385.1k12759 Updated Sep 16, 2025
    • CUDA Python: Performance meets Productivity
      Python
      2053k14913 Updated Sep 16, 2025
    • Fuser

      Public
      A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")
      C++
      67354174173 Updated Sep 16, 2025
    • An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment.
      Go
      32126628 Updated Sep 16, 2025
    • NVIDIA GPU Operator creates, configures, and manages GPUs in Kubernetes
      Go
      3832.3k38573 Updated Sep 16, 2025
    • C++ and Python support for the CUDA Quantum programming model for heterogeneous quantum-classical workflows
      C++
      28378938987 Updated Sep 16, 2025
    • VisRTX

      Public
      NVIDIA OptiX based implementation of ANARI
      C++
      3426380 Updated Sep 16, 2025
    • NVIDIA Digital Biology examples for optimized inference and training at scale
      4817422 Updated Sep 16, 2025
    • QEMU

      Public
      NVIDIA fork of QEMU
      C
      3600 Updated Sep 16, 2025
    • A unified library of state-of-the-art model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed.
      Python
      1561.4k10924 Updated Sep 16, 2025
    • cutlass

      Public
      CUDA Templates for Linear Algebra Subroutines
      C++
      1.4k8.4k36353 Updated Sep 15, 2025
    • A project to improve skills of large language models
      Python
      98559408 Updated Sep 15, 2025
    • NVIDIA Isaac GR00T N1.5 is the world's first open foundation model for generalized humanoid robot reasoning and skills.
      Jupyter Notebook
      7394.9k8919 Updated Sep 15, 2025
    • Documentation repository for NVIDIA Cloud Native Technologies
      PowerShell
      2829415 Updated Sep 15, 2025
    • nv-ingest

      Public
      NeMo Retriever extraction is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extraction uses specialized NVIDIA NIM microservices to find, contextualize, and extract text, tables, charts and images that you can use in downstream generative applications.
      Python
      2622.7k9031 Updated Sep 15, 2025
    • cuEquivariance is a math library that provides a collection of low-level primitives and tensor ops to accelerate widely used models based on equivariant neural networks, such as DiffDock, MACE, Allegro and NEQUIP. It also includes kernels for accelerated structure prediction.
      Python
      1929945 Updated Sep 15, 2025
    • MatX

      Public
      An efficient C++17 GPU numerical computing library with Python-like syntax
      C++
      1061.4k398 Updated Sep 15, 2025
    • topograph

      Public
      A toolkit for discovering cluster network topology.
      Go
      66811 Updated Sep 15, 2025
    • A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory utilization in both training and inference.
      Python
      5022.7k21686 Updated Sep 15, 2025
    • CUDA checkpoint and restore utility
      C
      21367240 Updated Sep 15, 2025
    • cuopt

      Public
      GPU accelerated decision optimization
      Cuda
      714209325 Updated Sep 15, 2025
    • cuDecomp

      Public
      An Adaptive Pencil Decomposition Library for NVIDIA GPUs
      C++
      116901 Updated Sep 15, 2025
    • jitify

      Public
      A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC).
      C++
      72554207 Updated Sep 15, 2025
    • garak

      Public
      the LLM vulnerability scanner
      Python
      6125.8k28634 Updated Sep 15, 2025
    • nvshmem

      Public
      NVIDIA NVSHMEM is a parallel programming interface for NVIDIA GPUs based on OpenSHMEM. NVSHMEM can significantly reduce multi-process communication and coordination overheads by allowing programmers to perform one-sided communication from within CUDA kernels and on CUDA streams.
      C++
      2230632 Updated Sep 15, 2025
    • The CUDA target for Numba
      Python
      381859824 Updated Sep 15, 2025