Relax Transformers modeling backend MoE experts check #28952
base: main
Conversation
The experts module could now also be a 3D tensor. Note that there are other issues on Transformers main which currently break Transformers modeling backend MoE support; these are being worked on separately. Signed-off-by: Harry Mellor <[email protected]>
Documentation preview: https://vllm--28952.org.readthedocs.build/en/28952/
Code Review
This pull request relaxes the check for Mixture-of-Experts (MoE) layers in the Transformers modeling backend. It now supports identifying packed expert modules where parameters are 3D tensors, in addition to the existing check for nn.ModuleList. The documentation has been updated accordingly. My review focuses on the implementation of this new check. I've found a potential edge case where a module with no parameters could be incorrectly identified as a packed expert module and have provided a suggestion to make the check more robust.
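A minimal sketch of what such a relaxed check might look like. This is an illustration only, not the PR's actual code: the function name `is_moe_experts` and the assumption that packed experts expose parameters of shape `(num_experts, in_dim, out_dim)` are hypothetical.

```python
import torch
from torch import nn


def is_moe_experts(module: nn.Module) -> bool:
    """Heuristic check for an MoE experts container.

    Accepts either the classic layout (an nn.ModuleList of per-expert
    modules) or a "packed" layout where every parameter on the module
    is a single 3D tensor stacking all experts.
    """
    if isinstance(module, nn.ModuleList):
        return True
    params = list(module.parameters(recurse=False))
    # Guard the edge case raised in review: a module with no parameters
    # should not be mistaken for a packed experts module.
    if not params:
        return False
    return all(p.ndim == 3 for p in params)
```

The explicit empty-parameter guard is what makes the check robust: without it, `all()` over an empty list returns True and a parameter-free module would be misclassified as packed experts.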