Pull requests: NVIDIA/Megatron-LM
Fix _set_wandb_writer serialization issues
bug
Something isn't working
module: debugging
#1806
opened Sep 11, 2025 by
gakkiri
5 of 8 tasks
fix typo in SP/TP config validation error
module: documentation
#1790
opened Sep 3, 2025 by
clumsy
Add support for packing-with-padding in gpt-dataset
enhancement
New feature or request
module: data pipeline
#1788
opened Sep 3, 2025 by
terminator123
Add falcon h1 2
enhancement
New feature or request
#1785
opened Sep 2, 2025 by
dhiaEddineRhaiem
bugfix: raise error if eos_token is not set in tokenizer
bug
Something isn't working
module: data pipeline
#1774
opened Aug 27, 2025 by
imomayiz
Fix torch_dist checkpointing ETP replica_id
bug
Something isn't working
module: moe
#1770
opened Aug 25, 2025 by
Skylion007
Fix Context Parallel NaN Loss
bug
Something isn't working
#1765
opened Aug 21, 2025 by
leoleoasd
Fix runaway Etpt in straggler detector by resetting FLOPs accumulator
bug
Something isn't working
#1755
opened Aug 19, 2025 by
cms42
perf(MoE): Use TE quant/dequant for SwiGLU fp8 input store to improve performance and stability
#1753
opened Aug 19, 2025 by
xiaoxi-wangfj
[main][feature][under updating]zero-overhead activation offload
enhancement
New feature or request
#1752
opened Aug 18, 2025 by
GeYuhong
fix: Initialize master_weight with params_dtype directly
bug
Something isn't working
#1748
opened Aug 15, 2025 by
Mirza-Samad-Ahmed-Baig
fix several typos in megatron/core/transformer/multi_token_prediction.py
module: documentation
#1744
opened Aug 13, 2025 by
andy-yangz
Add world_size dict getter method for simple integration with W&B
enhancement
New feature or request
#1735
opened Aug 9, 2025 by
WoosungMyung
export _move_new_state_to_right_device for offload/load
enhancement
New feature or request
#1734
opened Aug 8, 2025 by
techkang
Megatron-LM changes to make Hyena/Evo 2 inference usable, especially for 40B models
enhancement
New feature or request
#1727
opened Aug 1, 2025 by
antonvnv
fix router input jitter dtype
bug
Something isn't working
#1726
opened Aug 1, 2025 by
chaitanyadwivedi96