ONNX model bringup #344
MobilenetV2 in PyTorch is working as of today on main. When exporting the model to ONNX and running that compile flow, the parameters become inlined as constants in the graph, and TTIR has no way to consume this. I'm not quite sure how to handle it. If we did bring up functionality for this through tt-mlir, then at runtime the parameter data would be pushed to device on every inference call, since it would have to come from host as an embedded constant op in the graph. Ideally, once consteval is working, these constants would just be hoisted into the consteval function and passed as graph inputs to the main program. However, that seems to be a while away.
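For reference, a minimal sketch (assuming torchvision and onnx are installed) of how the exported MobilenetV2 graph carries its parameters as embedded initializers rather than graph inputs, which is what the compile flow then sees as constant ops:

```python
import torch
import torchvision
import onnx

# Export a pretrained MobilenetV2 to ONNX.
model = torchvision.models.mobilenet_v2(weights="DEFAULT").eval()
dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy, "mobilenet_v2.onnx", opset_version=17)

graph = onnx.load("mobilenet_v2.onnx").graph
# With recent opsets the weights land in graph.initializer (constants baked
# into the graph) rather than graph.input, so they are not visible as
# ordinary graph inputs to a downstream compiler.
print(f"graph inputs:  {len(graph.input)}")
print(f"initializers:  {len(graph.initializer)}")
```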
Much of the following information does not pertain to the ONNX flow. However, when originally attempting to bring up SwinV2 in PyTorch, I discovered a number of issues:
We need to bring up the following ONNX models
The goal is the op-by-op flow in torch first, then in onnx once @ddilbazTT's onnx flow is ready.
If the original source is a GitHub repo (as opposed to Hugging Face), please try to extract the minimum necessary repro code and add it to the model folder as a model_implementation.py file (in PyTorch), then convert the model to ONNX and run it. If the model is published under a licence we don't currently have in our LICENCE file, add it.
Don't add the onnx file to CI; do the conversion at runtime (see the sketch after this list).
If you cannot extract the model cleanly into a single (or several) files, let's discuss.
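A minimal sketch of that layout, assuming a hypothetical model_implementation.py that defines the extracted PyTorch model as MyModel; the .onnx file is produced at test time rather than checked into CI:

```python
import tempfile
import torch
import onnx

from model_implementation import MyModel  # hypothetical extracted model

def export_and_load():
    model = MyModel().eval()
    dummy = torch.randn(1, 3, 224, 224)  # adjust to the model's real input shape
    # Convert to ONNX in a temp file at runtime so no .onnx artifact is committed.
    with tempfile.NamedTemporaryFile(suffix=".onnx") as f:
        torch.onnx.export(model, dummy, f.name, opset_version=17)
        return onnx.load(f.name)

onnx_model = export_and_load()
```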