
Commit 645cd2c

Update LLaMA iOS docs (#10255)
1 parent adb7056 · commit 645cd2c

3 files changed: +30, -83 lines

Two binary files changed (-1.55 MB and -271 KB), not shown.
````diff
@@ -1,100 +1,47 @@
 # ExecuTorch Llama iOS Demo App
 
-**[UPDATE - 10/24]** We have added support for running quantized Llama 3.2 1B/3B models in demo apps on the [XNNPACK backend](https://github.com/pytorch/executorch/blob/main/examples/demo-apps/apple_ios/LLaMA/docs/delegates/xnnpack_README.md). We currently support inference with the SpinQuant and QAT+LoRA quantization methods.
+Get hands-on with running LLaMA and LLaVA models — exported via ExecuTorch — natively on your iOS device!
 
-We’re excited to share that the newly revamped iOS demo app is live and includes many new updates to provide a more intuitive and smoother user experience with a chat use case! The primary goal of this app is to showcase how easily ExecuTorch can be integrated into an iOS demo app and how to exercise the many features ExecuTorch and Llama models have to offer.
-
-This app serves as a valuable resource to inspire your creativity and provide foundational code that you can customize and adapt for your particular use case.
-
-Please dive in and start exploring our demo app today! We look forward to any feedback and are excited to see your innovative ideas.
-
-## Key Concepts
-From this demo app, you will learn many key concepts, such as:
-* How to prepare Llama models, build the ExecuTorch library, and perform model inference across delegates
-* How to expose the ExecuTorch library via Swift Package Manager
-* Familiarity with current ExecuTorch app-facing capabilities
-
-The goal is for you to see the kind of support ExecuTorch provides and feel comfortable leveraging it for your use cases.
-
-## Supported Models
-
-As a whole, the models that this app supports are (varies by delegate):
-* Llama 3.2 Quantized 1B/3B
-* Llama 3.2 1B/3B in BF16
-* Llama 3.1 8B
-* Llama 3 8B
-* Llama 2 7B
-* Llava 1.5 (XNNPACK only)
-
-## Building the application
-First, it’s important to note that ExecuTorch currently provides support across several delegates. Once you identify the delegate of your choice, select the README link for complete end-to-end instructions, from environment setup to exporting the models to building the ExecuTorch libraries and apps to run on device:
-
-| Delegate | Resource |
-| ------------------------------ | --------------------------------- |
-| XNNPACK (CPU-based library) | [link](https://github.com/pytorch/executorch/blob/main/examples/demo-apps/apple_ios/LLaMA/docs/delegates/xnnpack_README.md)|
-| MPS (Metal Performance Shaders) | [link](https://github.com/pytorch/executorch/blob/main/examples/demo-apps/apple_ios/LLaMA/docs/delegates/mps_README.md) |
-
-## How to Use the App
-This section covers the main steps to use the app, along with a code snippet of the ExecuTorch API.
-
-### Swift Package Manager
-
-The ExecuTorch runtime is distributed as a Swift package providing some .xcframeworks as prebuilt binary targets.
-Xcode will download and cache the package on the first run, which will take some time.
-
-Note: If you run into any issues related to package dependencies, quit Xcode entirely, delete the whole executorch repo, clean the caches by running the command below in the terminal, and clone the repo again.
-
-```
-rm -rf \
-~/Library/org.swift.swiftpm \
-~/Library/Caches/org.swift.swiftpm \
-~/Library/Caches/com.apple.dt.Xcode \
-~/Library/Developer/Xcode/DerivedData
-```
-
-Link your binary with the ExecuTorch runtime and any backends or kernels used by the exported ML model. It is recommended to link the core runtime to the components that use ExecuTorch directly, and to link kernels and backends against the main app target.
-
-Note: To access logs, link against the Debug build of the ExecuTorch runtime, i.e., the executorch_debug framework. For optimal performance, always link against the Release version of the deliverables (those without the _debug suffix), which have all logging overhead removed.
-
-For more details on integrating and running ExecuTorch on Apple platforms, check out the [Using ExecuTorch on iOS](https://pytorch.org/executorch/main/using-executorch-ios) page.
````
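If package resolution still fails after clearing those caches, forcing Swift Package Manager to re-resolve from the command line can also help. A minimal sketch, assuming the demo project path from this repo (the flag itself is standard `xcodebuild`):

```bash
# Force SPM to re-resolve the ExecuTorch package dependencies.
xcodebuild -resolvePackageDependencies \
  -project examples/demo-apps/apple_ios/LLaMA/LLaMA.xcodeproj
```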
````diff
-### Xcode
-* Open Xcode and select "Open an existing project" to open `examples/demo-apps/apple_ios/LLaMA`.
-* Ensure that the ExecuTorch package dependencies are installed correctly, then select which ExecuTorch framework should link against which target.
+*Click the image below to see it in action!*
 
 <p align="center">
-<img src="https://raw.githubusercontent.com/pytorch/executorch/refs/heads/main/docs/source/_static/img/ios_demo_app_swift_pm.png" alt="iOS LLaMA App Swift PM" style="width:600px">
+<a href="../../../../docs/source/_static/img/llama_ios_app.mp4">
+<img src="../../../../docs/source/_static/img/llama_ios_app.png" width="600" alt="iOS app running a Llama model">
+</a>
 </p>
 
-<p align="center">
-<img src="https://raw.githubusercontent.com/pytorch/executorch/refs/heads/main/docs/source/_static/img/ios_demo_app_choosing_package.png" alt="iOS LLaMA App Choosing package" style="width:600px">
-</p>
+## Requirements
+- [Xcode](https://apps.apple.com/us/app/xcode/id497799835?mt=12/) 15.0 or later
+- [CMake](https://cmake.org/download/) 3.19 or later
+  - Download and open the macOS `.dmg` installer and move the CMake app to the `/Applications` folder.
+  - Install the CMake command-line tools: `sudo /Applications/CMake.app/Contents/bin/cmake-gui --install`
+- A development provisioning profile with the [`increased-memory-limit`](https://developer.apple.com/documentation/bundleresources/entitlements/com_apple_developer_kernel_increased-memory-limit) entitlement.
````
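Before building, it is worth confirming the toolchain matches the requirements above. A quick check, assuming CMake was installed to `/Applications` as described:

```bash
# Verify tool versions against the requirements above.
xcodebuild -version                                   # expect 15.0 or later
/Applications/CMake.app/Contents/bin/cmake --version  # expect 3.19 or later
```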

````diff
-* Run the app. This builds and launches the app on the phone.
-* In the app UI, pick a model and tokenizer to use, type a prompt, and tap the arrow button.
+## Models
 
-## Copy the model to Simulator
+Download already exported LLaMA/LLaVA models along with tokenizers from [Hugging Face](https://huggingface.co/executorch-community), or export your own using the [XNNPACK](docs/delegates/xnnpack_README.md) or [MPS](docs/delegates/mps_README.md) backends.
````
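Fetching one of the pre-exported models from that Hugging Face organization can be scripted with the `huggingface_hub` CLI. A sketch in which the repository name is a placeholder, so browse the organization page for an actual one:

```bash
# Download an exported model (.pte) and its tokenizer; the repo name is hypothetical.
pip install huggingface_hub
huggingface-cli download executorch-community/SOME-EXPORTED-MODEL \
  --include "*.pte" "*.model" \
  --local-dir ./models
```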

````diff
-* Drag & drop the model and tokenizer files onto the Simulator window and save them somewhere inside the iLLaMA folder.
-* Pick the files in the app dialog, type a prompt, and click the arrow-up button.
+## Build and Run
 
-## Copy the model to Device
+1. Make sure git submodules are up-to-date:
+   ```bash
+   git submodule update --init --recursive
+   ```
 
-* Wire-connect the device and open the contents in Finder.
-* Navigate to the Files tab and drag & drop the model and tokenizer files onto the iLLaMA folder.
-* Wait until the files are copied.
+2. Open the Xcode project:
+   ```bash
+   open examples/demo-apps/apple_ios/LLaMA/LLaMA.xcodeproj
+   ```
+
+3. Click the Play button to launch the app in the Simulator.
````
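The same build can be driven from the command line instead of the Play button. A sketch, assuming the scheme is named `LLaMA` and an iPhone 15 simulator is available (adjust both to your setup):

```bash
# Build the demo app for the iOS Simulator without opening the Xcode UI.
xcodebuild build \
  -project examples/demo-apps/apple_ios/LLaMA/LLaMA.xcodeproj \
  -scheme LLaMA \
  -destination 'platform=iOS Simulator,name=iPhone 15'
```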

````diff
-If the app runs successfully on your device, you should see something like the image below:
+4. To run on a device, ensure you have it set up for development and a provisioning profile with the `increased-memory-limit` entitlement. Update the app's bundle identifier to match your provisioning profile with the required capability.
 
-<p align="center">
-<img src="https://raw.githubusercontent.com/pytorch/executorch/refs/heads/main/docs/source/_static/img/ios_demo_app.jpg" alt="iOS LLaMA App" style="width:300px">
-</p>
+5. After successfully launching the app, copy the exported ExecuTorch model (`.pte`) and tokenizer (`.model`) files to the iLLaMA folder.
 
-For Llava 1.5 models, you can select an image (via the image/camera selector button) before typing the prompt and tapping the send button.
+   - **For the Simulator:** Drag and drop both files onto the Simulator window and save them in the `On My iPhone > iLLaMA` folder.
+   - **For a Device:** Open a separate Finder window, navigate to the Files tab, drag and drop both files into the iLLaMA folder, and wait for the copying to finish.
````
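When targeting the Simulator, the file copy in step 5 can also be done from the terminal. A sketch, assuming a hypothetical bundle identifier `org.pytorch.executorch.illama` and placeholder file names (check the Xcode project for the real identifier):

```bash
# Locate the booted Simulator's data container for the app, then copy the files in.
CONTAINER=$(xcrun simctl get_app_container booted org.pytorch.executorch.illama data)
cp llama3.pte tokenizer.model "$CONTAINER/Documents/"
```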
````diff
-<p align="center">
-<img src="https://raw.githubusercontent.com/pytorch/executorch/refs/heads/main/docs/source/_static/img/ios_demo_app_llava.jpg" alt="iOS LLaMA App" style="width:300px">
-</p>
+6. Follow the app's UI guidelines to select the model and tokenizer files from the local filesystem and issue a prompt.
 
-## Reporting Issues
-If you encounter any bugs or issues while following this tutorial, please file an issue on [GitHub](https://github.com/pytorch/executorch/issues/new).
+For more details, check out the [Using ExecuTorch on iOS](../../../../docs/source/using-executorch-ios.md) page.
````
