Are there any plans to support true tensor parallelism in the future? #13013
lingyezhixing started this conversation in General
The current performance overhead for multi-GPU inference is substantial. What optimization methods are available?
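For context on what "true tensor parallelism" refers to in the title, here is a minimal sketch of the idea: each layer's weight matrix is sharded across devices, every device computes a partial result for the same token batch, and the shards are recombined with a collective (all-gather or all-reduce). This is an illustration of the general technique only, not this project's implementation, and it simulates the two "devices" on one host with NumPy.

```python
# Illustration of column-parallel tensor parallelism (concept only, not this
# project's code). A linear layer's weight matrix is split column-wise across
# two simulated devices; each computes a partial output, and concatenation
# stands in for the all-gather a real multi-GPU implementation would perform.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))    # activations: (batch, hidden)
W = rng.standard_normal((8, 16))   # full weight:  (hidden, out)

W0, W1 = np.hsplit(W, 2)           # each "device" holds half the output columns

y0 = x @ W0                        # partial output on device 0
y1 = x @ W1                        # partial output on device 1

y_parallel = np.concatenate([y0, y1], axis=1)   # stands in for an all-gather
y_reference = x @ W                              # single-device result

assert np.allclose(y_parallel, y_reference)
```

The per-layer collective is where the communication overhead mentioned above comes from; by contrast, a layer-split (pipeline-style) scheme only transfers activations between devices at layer boundaries.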