I'm interested in whether there are plans to implement block-wise quantization for FP8 training, similar to what's described in the DeepSeek-V3 technical report.
Block-wise quantization could provide better numerical stability and accuracy than tensor-wide quantization, especially in the presence of outlier values, because a single large element only inflates the scaling factor of its own block rather than the whole tensor. This could be particularly valuable for large language models, where maintaining precision is crucial.
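To illustrate the outlier argument above, here is a hypothetical NumPy sketch comparing per-tensor and per-block scaling. It simulates FP8 E4M3 quantization by rounding to a uniform grid (real FP8 has a non-uniform floating-point grid, so absolute error values differ, but the effect of the scaling granularity is the same). The 128×128 block size is chosen to match what the DeepSeek-V3 report describes for weights; the function names are mine, not Transformer Engine APIs.

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest representable magnitude in FP8 E4M3


def fake_quantize(x, scale):
    """Scale, clip to the FP8 range, and round.

    Rounding to a uniform grid is a simplification of FP8's
    non-uniform float grid, used here only to compare scaling schemes.
    """
    q = np.clip(x / scale, -FP8_E4M3_MAX, FP8_E4M3_MAX)
    return np.round(q) * scale  # dequantized value


def per_tensor_error(x):
    """Mean absolute error with a single tensor-wide scaling factor."""
    scale = np.abs(x).max() / FP8_E4M3_MAX
    return np.abs(fake_quantize(x, scale) - x).mean()


def per_block_error(x, block=128):
    """Mean absolute error with one scaling factor per block x block tile."""
    total, count = 0.0, 0
    for i in range(0, x.shape[0], block):
        for j in range(0, x.shape[1], block):
            blk = x[i:i + block, j:j + block]
            scale = np.abs(blk).max() / FP8_E4M3_MAX
            total += np.abs(fake_quantize(blk, scale) - blk).sum()
            count += blk.size
    return total / count


rng = np.random.default_rng(0)
x = rng.normal(size=(256, 256)).astype(np.float32)
x[0, 0] = 1000.0  # a single outlier inflates the per-tensor scale

# Per-tensor: every element shares the outlier's coarse scale.
# Per-block: only the outlier's 128x128 block pays that cost.
print("per-tensor error:", per_tensor_error(x))
print("per-block error: ", per_block_error(x))
```

With the outlier present, the per-block error is markedly lower because 3 of the 4 blocks keep a scale matched to their own dynamic range. (For activations, the DeepSeek-V3 report describes finer 1×128 tiles, which the same sketch would cover with a rectangular block shape.)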
Some specific questions:
- Is this feature currently on your roadmap?
- If yes, what's the approximate timeline?
- If not, are there technical challenges preventing this implementation?
Thank you for your time!