    Repositories list

    • LightLLM

      Public
      LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
      Python
      2823.7k8125 Updated Nov 1, 2025
    • LightX2V

      Public
      Light Video Generation Inference Framework
      Python
      45730381 Updated Nov 1, 2025
    • LightKernel

      Public
      HTML
      0300 Updated Oct 31, 2025
    • LightCompress

      Public
      A powerful toolkit for compressing large models including LLM, VLM, and video generation models.
      Python
      63599390 Updated Oct 30, 2025
    • ComfyUI-LightVAE

      Public
      Python
      52060 Updated Oct 30, 2025
    • FlashVSR

      Public
      Towards Real-Time Diffusion-Based Streaming Video Super-Resolution — An efficient one-step diffusion framework for streaming VSR with locality-constrained sparse attention and a tiny conditional decoder.
      Python
      45000 Updated Oct 28, 2025
    • general-sam-py

      Public
      Python bindings for general-sam and some utilities
      Python
      0504 Updated Oct 27, 2025
    • mtc-token-healing

      Public
      Token healing implementation in Rust
      Rust
      0403 Updated Oct 27, 2025
    • ComfyUI-Lightx2vWrapper

      Public
      ComfyUI custom node for lightx2v
      Python
      54600 Updated Oct 26, 2025
    • Qwen-Image-Lightning

      Public
      Qwen-Image-Lightning: Speed up Qwen-Image model with distillation
      Python
      36893170 Updated Oct 14, 2025
    • HBP

      Public
      [NIPS 2025] This is the official PyTorch implementation of "Hierarchical Balance Packing: Towards Efficient Supervised Fine-tuning for Long-Context LLM".
      Python
      0300 Updated Sep 30, 2025
    • TFMQ-DM

      Public
      [CVPR 2024 Highlight & TPAMI 2025] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models".
      Jupyter Notebook
      410600 Updated Sep 29, 2025
    • Wan2.2-Lightning: Speed up wan2.2 model with distillation
      Python
      1.2k211170 Updated Sep 28, 2025
    • HTML
      0000 Updated Sep 15, 2025
    • Greedily tokenize strings with the longest tokens iteratively.
      Python
      0003 Updated Sep 15, 2025
    • A general suffix automaton implementation in Rust with Python bindings
      Rust
      0800 Updated Sep 8, 2025
    • SCSS
      0100 Updated Sep 4, 2025
    • Quantized Attention achieves speedup of 2-5x and 3-11x compared to FlashAttention and xformers, without losing end-to-end metrics across language, image, and video models.
      Cuda
      250200 Updated Aug 16, 2025
    • fa3

      Public
      Python
      1000 Updated Aug 7, 2025
    • Dockerfile
      2000 Updated Jul 24, 2025
    • HarmoniCa

      Public
      [ICML 2025] This is the official PyTorch implementation of "🎵 HarmoniCa: Harmonizing Training and Inference for Better Feature Caching in Diffusion Transformer Acceleration".
      Python
      14320 Updated Jul 10, 2025
    • LightTTS

      Public
      Light-tts is a lightweight TTS inference framework optimized for CosyVoice2, enabling fast and scalable speech synthesis in Python.
      Python
      01100 Updated Jun 24, 2025
    • OmniBal

      Public
      [ICML 2025] This is the official PyTorch implementation of "OmniBal: Towards Fast Instruction-Tuning for Vision-Language Models via Omniverse Computation Balance".
      Python
      32530 Updated Jun 16, 2025
    • 0000 Updated Apr 28, 2025
    • MQBench

      Public
      Model Quantization Benchmark
      Python
      142847116 Updated Apr 20, 2025
    • Fast and memory-efficient exact attention
      Python
      2.1k000 Updated Apr 17, 2025
    • verl

      Public
      verl: Volcano Engine Reinforcement Learning for LLMs
      Python
      2.4k100 Updated Mar 17, 2025
    • LLM_QAT

      Public
      Python
      0000 Updated Feb 19, 2025
    • Cuda
      11100 Updated Jan 10, 2025
    • EasyLLM

      Public
      Built upon Megatron-Deepspeed and HuggingFace Trainer, EasyLLM has reorganized the code logic with a focus on usability. While enhancing usability, it also ensures training efficiency.
      Python
      84800 Updated Sep 18, 2024