Open
Labels
bug: Unexpected behaviour that should be corrected (type)
Description
🐞Describing the bug
I'm roughly following this guide on LLM exporting. I adjusted the input names so the model can be used with this HF demo, and I added W8A8 quantization to reduce its size. The saved model takes 1.4 GB, and by my estimate all the activations shouldn't take more than 1 GB. Nevertheless, calling "predict" on an iPhone or iPad (iPhone 15 Pro and M1 iPad Pro, both on iOS 18 with ~8 GB of RAM) runs out of memory, and the memory profiler shows a 10 GB malloc call inside a Metal dispatch call. The same thing happens when using the .mlpackage GUI benchmarking feature. That feature also uses a lot of RAM (~12 GB) when benchmarking on iOS15.
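For context, a back-of-envelope estimate of this kind can be sketched as follows. This is only an illustration: every model dimension below (layer count, head counts, context length, vocabulary size) is a hypothetical placeholder rather than a value from the actual conversion script, and activations are assumed to be stored in fp16.

```python
# Rough memory estimate for a quantized LLM at inference time.
# All dimensions are hypothetical placeholders, NOT the real model's values.
GIB = 1024 ** 3

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem=2):
    # Keys and values: 2 tensors per layer, each [n_kv_heads, context_len, head_dim].
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem

def attention_scores_bytes(n_heads, seq_len, context_len, bytes_per_elem=2):
    # Attention matrix of a single layer, assumed to be freed before the next layer runs.
    return n_heads * seq_len * context_len * bytes_per_elem

def logits_bytes(seq_len, vocab_size, bytes_per_elem=2):
    # Output logits for every position in the input chunk.
    return seq_len * vocab_size * bytes_per_elem

weights = 1.4 * GIB  # saved model size reported above, treated as GiB for simplicity
kv = kv_cache_bytes(n_layers=16, n_kv_heads=8, head_dim=64, context_len=2048)
scores = attention_scores_bytes(n_heads=32, seq_len=64, context_len=2048)
logits = logits_bytes(seq_len=64, vocab_size=128_256)

total = weights + kv + scores + logits
print(f"KV cache:         {kv / GIB:.3f} GiB")
print(f"attention scores: {scores / GIB:.3f} GiB")
print(f"logits:           {logits / GIB:.3f} GiB")
print(f"estimated total:  {total / GIB:.2f} GiB")
```

With placeholder dimensions in this range, the estimate stays far below the 10 GB allocation seen in the profiler, which is why the observed malloc looks unexpected.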
To Reproduce
- Conversion script
- HF demo
- Memory profiler trace (IDK how to copy from that app)
System environment (please complete the following information):
Name: torch
Version: 2.8.0
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: [email protected]
License: BSD-3-Clause
Location: /opt/miniconda3/envs/executorch/lib/python3.11/site-packages
Requires: filelock, fsspec, jinja2, networkx, sympy, typing-extensions
Required-by: executorch, timm, torchaudio, torchdata, torchsr, torchvision
---
Name: transformers
Version: 4.47.1
Summary: State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow
Home-page: https://github.com/huggingface/transformers
Author: The Hugging Face team (past and future) with the help of all our contributors (https://github.com/huggingface/transformers/graphs/contributors)
Author-email: [email protected]
License: Apache 2.0 License
Location: /opt/miniconda3/envs/executorch/lib/python3.11/site-packages
Requires: filelock, huggingface-hub, numpy, packaging, pyyaml, regex, requests, safetensors, tokenizers, tqdm
Required-by:
---
Name: coremltools
Version: 9.0b1
Summary: Community Tools for Core ML
Home-page: https://github.com/apple/coremltools
Author: Apple Inc.
Author-email: [email protected]
License: BSD
Location: /opt/miniconda3/envs/executorch/lib/python3.11/site-packages
Editable project location: /opt/miniconda3/envs/executorch/lib/python3.11/site-packages
Requires: attrs, cattrs, numpy, packaging, protobuf, pyaml, sympy, tqdm
Required-by: executorch
Converted on macOS 15.6.1, Apple M3 Pro, 18 GB.
Ran on:
- macOS 15.6.1, Apple M3 Pro, 18 GB
- iPadOS 18.6.2, iPad Pro M1
- iOS 18.6.2, iPhone 15 Pro