Exported Llama 1B transformer with static 128 sequence length tries to allocate 10Gb on iOS18 causing OOM #2590

@BlackSamorez

🐞Describing the bug

I'm roughly following this guide on LLM exporting. I adjusted the input names so the model can be used with this HF demo, and I added W8A8 quantization to reduce its size. The saved model takes 1.4 GB. By my estimate, all the activations shouldn't take more than 1 GB. Nevertheless, calling "predict" on an iPhone or iPad (an M1 iPad Pro and an iPhone 15 Pro, both on iOS 18 and both with ~8 GB of RAM) runs out of memory, with the memory profiler showing a 10 GB malloc call inside a Metal dispatch call. The same thing happens when using the .mlpackage GUI benchmarking feature. Benchmarking with that feature on iOS15 also uses a lot of RAM (~12 GB).
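For reference, the "activations shouldn't take more than 1 GB" estimate can be sanity-checked with a back-of-the-envelope calculation. The sketch below is not the export code from this issue. The model dimensions are assumptions (a Llama-3.2-1B-style config), as are fp16 activations and the presence of a KV cache; only the sequence length of 128 comes from the issue title. It deliberately overestimates by assuming that no per-layer buffer is ever freed or reused.

```python
from dataclasses import dataclass


@dataclass
class Config:
    # Only seq_len comes from the issue; everything else is an ASSUMED
    # Llama-3.2-1B-style configuration used for illustration.
    seq_len: int = 128
    hidden: int = 2048
    layers: int = 16
    heads: int = 32
    kv_heads: int = 8
    head_dim: int = 64
    intermediate: int = 8192
    vocab: int = 128256
    bytes_per_elem: int = 2  # ASSUMED fp16 activations


def estimate_activation_bytes(c: Config) -> dict:
    """Pessimistic upper bound: every intermediate of every layer stays alive."""
    b = c.bytes_per_elem
    per_layer = (
        c.seq_len * c.hidden * b                                  # hidden states
        + c.seq_len * (c.heads + 2 * c.kv_heads) * c.head_dim * b  # Q, K, V projections
        + c.heads * c.seq_len * c.seq_len * b                      # attention scores
        + 2 * c.seq_len * c.intermediate * b                       # MLP gate and up projections
    )
    kv_cache = 2 * c.layers * c.kv_heads * c.head_dim * c.seq_len * b
    logits = c.seq_len * c.vocab * b
    total = c.layers * per_layer + kv_cache + logits
    return {"per_layer": per_layer, "kv_cache": kv_cache, "logits": logits, "total": total}


estimate = estimate_activation_bytes(Config())
for name, size in estimate.items():
    print(f"{name:>10}: {size / 2**20:8.1f} MiB")
```

Under these assumptions the total comes out far below 1 GB, so the estimate above looks conservative, and a single 10 GB allocation is unlikely to be explained by activation sizes alone. The calculation says nothing about what the Metal dispatch is actually allocating.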

To Reproduce

Image

System environment (please complete the following information):

Name: torch
Version: 2.8.0
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: [email protected]
License: BSD-3-Clause
Location: /opt/miniconda3/envs/executorch/lib/python3.11/site-packages
Requires: filelock, fsspec, jinja2, networkx, sympy, typing-extensions
Required-by: executorch, timm, torchaudio, torchdata, torchsr, torchvision
---
Name: transformers
Version: 4.47.1
Summary: State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow
Home-page: https://github.com/huggingface/transformers
Author: The Hugging Face team (past and future) with the help of all our contributors (https://github.com/huggingface/transformers/graphs/contributors)
Author-email: [email protected]
License: Apache 2.0 License
Location: /opt/miniconda3/envs/executorch/lib/python3.11/site-packages
Requires: filelock, huggingface-hub, numpy, packaging, pyyaml, regex, requests, safetensors, tokenizers, tqdm
Required-by: 
---
Name: coremltools
Version: 9.0b1
Summary: Community Tools for Core ML
Home-page: https://github.com/apple/coremltools
Author: Apple Inc.
Author-email: [email protected]
License: BSD
Location: /opt/miniconda3/envs/executorch/lib/python3.11/site-packages
Editable project location: /opt/miniconda3/envs/executorch/lib/python3.11/site-packages
Requires: attrs, cattrs, numpy, packaging, protobuf, pyaml, sympy, tqdm
Required-by: executorch

Converted on macOS 15.6.1 Apple M3 Pro 18Gb.
Ran on:

  • macOS 15.6.1 Apple M3 Pro 18Gb.
  • iPadOS 18.6.2 iPad Pro M1
  • iOS 18.6.2 iPhone 15 Pro

Labels: bug (Unexpected behaviour that should be corrected)
