Releases: NVIDIA-NeMo/Export-Deploy

NVIDIA NeMo-Export-Deploy 0.6.0

nemo-automation-bot released this 23 Jun 01:40
400d5d8

NVIDIA NeMo-Export-Deploy 0.5.0

svcnvidia-nemo-ci released this 16 Apr 20:49
04ca37f

NVIDIA NeMo-Export-Deploy 0.4.0

svcnvidia-nemo-ci released this 26 Feb 00:19
2ba74b0

Highlights

  • vLLM support for Megatron-Bridge LLM checkpoints.
  • Remove NeMo 2.0 support.
  • Deployment of Megatron-Bridge VLM checkpoints.

NVIDIA NeMo-Export-Deploy 0.3.1

chtruong814 released this 15 Dec 23:36
44a30f0
  • Fix vLLM top_p parameter handling in HuggingFace Ray deployment (#524)
  • Pin peft dependency to <0.14.0 for compatibility (#524)

NVIDIA NeMo-Export-Deploy 0.3.0

chtruong814 released this 04 Dec 00:55
2cdaf51
  • Update TensorRT-LLM export to use the NeMo->HF->TensorRT-LLM export path.
  • Add chat template support for VLM deployment.
  • Bug fixes and folder renames, such as nlp to llm.

NVIDIA NeMo-Export-Deploy 0.2.1

chtruong814 released this 22 Oct 23:36
950000c
  • Bug fixes for HuggingFace model deployment (#459)
    • Fixed HuggingFace deployable implementations for both Triton and Ray Serve backends
    • Improved tokenizer handling in HuggingFace deployment scripts
  • Minor fixes for Ray deployment (#464)
    • Additional bug fixes in Ray deployment utilities

NVIDIA NeMo-Export-Deploy 0.2.0

chtruong814 released this 09 Oct 20:01
726695b
  • Megatron-LM and Megatron-Bridge model deployment support with Triton Inference Server and Ray Serve.
  • Multi-node, multi-instance Ray Serve deployment for NeMo 2, Megatron-Bridge, and Megatron-LM models.
  • Update vLLM export to use the NeMo->HF->vLLM export path.
  • Multimodal deployment for NeMo 2 models with Triton Inference Server.
  • NeMo Retriever Text Reranking ONNX and TensorRT export support.

NVIDIA NeMo-Export-Deploy 0.2.0rc2

Pre-release

chtruong814 released this 18 Aug 06:32
7867110

Prerelease: NVIDIA NeMo-Export-Deploy 0.2.0rc2 (2025-08-18)

NVIDIA NeMo-Export-Deploy 0.1.1

chtruong814 released this 15 Aug 08:24
ca72da9
ci: Mock DCO check

Signed-off-by: oliver könig <okoenig@nvidia.com>

NVIDIA NeMo-Export-Deploy 0.2.0rc1

Pre-release

chtruong814 released this 14 Aug 15:54
62485cc

Prerelease: NVIDIA NeMo-Export-Deploy 0.2.0rc1 (2025-08-14)