|
| 1 | +# Pull Request: MAP_JIT support for ARM64 macOS |
| 2 | + |
| 3 | +## Title |
| 4 | + |
| 5 | +fix(jit): Add MAP_JIT support for 100% stable JIT execution on ARM64 macOS |
| 6 | + |
| 7 | +## Summary |
| 8 | + |
| 9 | +This PR fixes non-deterministic JIT execution failures on Apple Silicon (ARM64 macOS) by allocating JIT memory with `MAP_JIT` and switching the thread's W^X mode with `pthread_jit_write_protect_np()`, as the platform requires. |
| 10 | + |
| 11 | +- **Before:** 56% success rate (28/50 runs) |
| 12 | +- **After:** 100% success rate (50/50 consecutive runs) |
| 13 | + |
| 14 | +## Problem |
| 15 | + |
| 16 | +JIT-compiled code on ARM64 macOS fails non-deterministically in multi-threaded scenarios. Symptoms include: |
| 17 | + |
| 18 | +| Failure Mode | Share of runs | |
| 19 | +|--------------|-----------| |
| 20 | +| SIGBUS on JIT function calls | ~30% | |
| 21 | +| Silent incorrect results | ~10% | |
| 22 | +| Segmentation fault | ~4% | |
| 23 | + |
| 24 | +### Root Cause |
| 25 | + |
| 26 | +Apple Silicon enforces W^X (write XOR execute) on JIT memory at the hardware level, which imposes two requirements: |
| 27 | + |
| 28 | +1. **Memory must be allocated with the `MAP_JIT` flag** so the kernel can track it for W^X enforcement. |
| 29 | +2. **Threads must switch to execute mode** via `pthread_jit_write_protect_np(1)` before calling JIT code (and back to write mode via `pthread_jit_write_protect_np(0)` before modifying it). |
| 30 | + |
| 31 | +The current implementation meets neither requirement: |
| 32 | +- It uses the standard allocator (no `MAP_JIT`). |
| 33 | +- It never calls `pthread_jit_write_protect_np()`. |
| 34 | + |
| 35 | +## Solution |
| 36 | + |
| 37 | +### `cranelift/jit/src/memory/system.rs` |
| 38 | + |
| 39 | +Added an ARM64 macOS-specific `PtrLen::with_size()` implementation that uses `mmap` with the `MAP_JIT` flag (0x0800) instead of the standard allocator. This allows macOS to properly track the memory for W^X policy enforcement. |
| 40 | + |
| 41 | +Also added a corresponding `Drop` implementation that uses `munmap` to deallocate the memory, since memory allocated with `mmap` cannot be freed with the standard allocator. |
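A minimal sketch of the allocation strategy described above, not the code in this PR. `JitRegion` is a hypothetical stand-in for `PtrLen`, the libc functions are declared by hand rather than through the `libc` crate, and the non-macOS `MAP_ANON` value assumes a 64-bit Linux x86_64/aarch64 host. On targets other than ARM64 macOS, `MAP_JIT` is compiled out to `0`:

```rust
use std::ffi::c_void;
use std::io;
use std::os::raw::c_int;
use std::ptr;

// Hand-written libc declarations; `offset` is `off_t`, assumed 64-bit here.
extern "C" {
    fn mmap(addr: *mut c_void, len: usize, prot: c_int, flags: c_int,
            fd: c_int, offset: i64) -> *mut c_void;
    fn munmap(addr: *mut c_void, len: usize) -> c_int;
}

const PROT_READ: c_int = 0x1;
const PROT_WRITE: c_int = 0x2;
const MAP_PRIVATE: c_int = 0x0002;

#[cfg(target_os = "macos")]
const MAP_ANON: c_int = 0x1000;
#[cfg(not(target_os = "macos"))]
const MAP_ANON: c_int = 0x20; // assumption: Linux x86_64 / aarch64

// MAP_JIT only exists on macOS; 0x0800 is the value quoted in this PR.
#[cfg(all(target_arch = "aarch64", target_os = "macos"))]
const MAP_JIT: c_int = 0x0800;
#[cfg(not(all(target_arch = "aarch64", target_os = "macos")))]
const MAP_JIT: c_int = 0;

/// Hypothetical owner of one anonymous mapping (stand-in for `PtrLen`).
pub struct JitRegion {
    ptr: *mut u8,
    len: usize,
}

impl JitRegion {
    /// Maps `size` bytes of read/write memory, tagged `MAP_JIT` where available.
    pub fn with_size(size: usize) -> io::Result<Self> {
        if size == 0 {
            return Err(io::Error::new(io::ErrorKind::InvalidInput, "size must be non-zero"));
        }
        let flags = MAP_PRIVATE | MAP_ANON | MAP_JIT;
        let ptr = unsafe { mmap(ptr::null_mut(), size, PROT_READ | PROT_WRITE, flags, -1, 0) };
        if ptr as isize == -1 {
            // MAP_FAILED is (void *)-1; errno carries the reason.
            return Err(io::Error::last_os_error());
        }
        Ok(Self { ptr: ptr.cast(), len: size })
    }

    pub fn as_ptr(&self) -> *const u8 {
        self.ptr
    }

    pub fn len(&self) -> usize {
        self.len
    }
}

impl Drop for JitRegion {
    fn drop(&mut self) {
        // mmap'd memory must go back through munmap, never the allocator.
        let rc = unsafe { munmap(self.ptr.cast(), self.len) };
        debug_assert_eq!(rc, 0, "munmap failed");
    }
}
```

Tying `munmap` to `Drop` keeps the pairing with `mmap` in one type, so the mapping can never be handed to the standard allocator by mistake.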
| 42 | + |
| 43 | +### `cranelift/jit/src/memory/mod.rs` |
| 44 | + |
| 45 | +After making memory executable in `set_readable_and_executable()`, added a call to `pthread_jit_write_protect_np(1)` to switch the current thread to execute mode. Apple's W^X enforcement requires this: each thread must explicitly enter execute mode before running JIT code, and the call affects only the calling thread. |
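The mode switch can be sketched as a small RAII scope, assuming code is emitted and run on the same thread. `set_jit_write_protect`, `JitWriteScope`, and `with_jit_write` are hypothetical names used for illustration, not part of `cranelift-jit`; on targets other than ARM64 macOS the toggle is a no-op:

```rust
/// `true` puts the calling thread in execute mode (MAP_JIT pages executable,
/// not writable); `false` puts it in write mode. Affects this thread only.
#[cfg(all(target_arch = "aarch64", target_os = "macos"))]
pub fn set_jit_write_protect(enabled: bool) {
    extern "C" {
        // Declared in <pthread.h> on macOS 11 and later.
        fn pthread_jit_write_protect_np(enabled: std::os::raw::c_int);
    }
    unsafe { pthread_jit_write_protect_np(enabled as std::os::raw::c_int) }
}

/// No-op fallback so the sketch compiles on every other target.
#[cfg(not(all(target_arch = "aarch64", target_os = "macos")))]
pub fn set_jit_write_protect(_enabled: bool) {}

/// Hypothetical guard: write mode while alive, execute mode again on drop.
pub struct JitWriteScope(());

impl JitWriteScope {
    pub fn enter() -> Self {
        set_jit_write_protect(false);
        JitWriteScope(())
    }
}

impl Drop for JitWriteScope {
    fn drop(&mut self) {
        // Runs on early return and on unwinding, so the thread
        // does not stay stuck in write mode.
        set_jit_write_protect(true);
    }
}

/// Runs `emit` with the current thread in write mode, then restores execute mode.
pub fn with_jit_write<T>(emit: impl FnOnce() -> T) -> T {
    let _scope = JitWriteScope::enter();
    emit()
}
```

A caller would wrap the code-copying step, for example `with_jit_write(|| copy_machine_code(dst, src))`, where `copy_machine_code` is a placeholder for whatever writes into the JIT region.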
| 46 | + |
| 47 | +## Testing |
| 48 | + |
| 49 | +### Test Script |
| 50 | + |
| 51 | +```bash |
| 52 | +#!/bin/bash |
| 53 | +passed=0 |
| 54 | +failed=0 |
| 55 | +# NOTE: `timeout` is part of GNU coreutils and is not shipped with macOS; install it separately. |
| 56 | +for i in {1..50}; do |
| 57 | +  result=$(timeout 120 ./target/release/jit_test 2>&1) |
| 58 | +  if echo "$result" | grep -q "All tests passed"; then |
| 59 | +    echo "Run $i: PASSED" |
| 60 | +    passed=$((passed+1)) |
| 61 | +  else |
| 62 | +    echo "Run $i: FAILED" |
| 63 | +    failed=$((failed+1)) |
| 64 | +  fi |
| 65 | +done |
| 66 | + |
| 67 | +echo "" |
| 68 | +echo "=== RESULTS ===" |
| 69 | +echo "Passed: $passed/50, Failed: $failed/50" |
| 70 | +echo "Success rate: $((passed * 100 / 50))%" |
| 71 | +``` |
| 72 | + |
| 73 | +### Results |
| 74 | + |
| 75 | +Tested with the [Rayzor compiler's stdlib e2e test suite](https://github.com/darmie/rayzor/blob/main/compiler/examples/test_rayzor_stdlib_e2e.rs) (50+ JIT-compiled runtime functions, multi-threaded): |
| 76 | + |
| 77 | +| Configuration | Success Rate | |
| 78 | +|--------------|--------------| |
| 79 | +| Before fix (standard allocator) | 56% (28/50) | |
| 80 | +| After fix (MAP_JIT + pthread_jit_write_protect_np) | **100%** (50/50) | |
| 81 | + |
| 82 | +**Note:** Simple standalone tests may not reliably reproduce this issue. The failure is non-deterministic and depends on timing, memory layout, and CPU core scheduling (P-core vs E-core). |
| 83 | + |
| 84 | +## Platform Impact |
| 85 | + |
| 86 | +- **ARM64 macOS**: Fixed (was broken) |
| 87 | +- **x86_64 macOS**: No change (not affected) |
| 88 | +- **Linux (all architectures)**: No change (not affected) |
| 89 | +- **Windows**: No change (not affected) |
| 90 | + |
| 91 | +All changes are gated behind `#[cfg(all(target_arch = "aarch64", target_os = "macos"))]`. |
| 92 | + |
| 93 | +## Related Issues |
| 94 | + |
| 95 | +- Fixes #XXXX (replace with issue number after creating) |
| 96 | +- Related to #2735 - Support PLT entries in `cranelift-jit` crate on aarch64 |
| 97 | +- Related to #8852 - Cranelift: JIT assertion failure on macOS (A64) |
| 98 | +- Related to #4000 - JIT relocations depend on system allocator behaviour |
| 99 | + |
| 100 | +## References |
| 101 | + |
| 102 | +- [Apple: Writing ARM64 Code for Apple Platforms](https://developer.apple.com/documentation/xcode/writing_arm64_code_for_apple_platforms) |
| 103 | +- [Apple: Porting Just-In-Time Compilers to Apple Silicon](https://developer.apple.com/documentation/apple-silicon/porting-just-in-time-compilers-to-apple-silicon) |
| 104 | +- [Apple: Allow Execution of JIT-compiled Code Entitlement (`com.apple.security.cs.allow-jit`)](https://developer.apple.com/documentation/bundleresources/entitlements/com_apple_security_cs_allow-jit) |
| 105 | + |
| 106 | +## Notes for Reviewers |
| 107 | + |
| 108 | +1. **Thread safety consideration**: The `pthread_jit_write_protect_np(1)` call in `mod.rs` covers only the compiling thread. Applications that run JIT code on other threads must switch those threads to execute mode as well. This could be either: |
| 109 | + - Documented as a requirement for users |
| 110 | + - Handled via a helper function in the public API |
| 111 | + |
| 112 | +2. **Deallocation**: Memory allocated with `mmap` must be freed with `munmap`, hence the separate `Drop` implementation. |
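For the first note, one possible shape for a public helper is sketched below. `enter_jit_execute_mode` and `spawn_jit_thread` are hypothetical names and do not exist in `cranelift-jit`; the sketch only shows that the switch has to happen on the spawned thread itself, before any JIT code runs there:

```rust
use std::thread::{self, JoinHandle};

/// Switches the calling thread to execute mode on ARM64 macOS.
#[cfg(all(target_arch = "aarch64", target_os = "macos"))]
pub fn enter_jit_execute_mode() {
    extern "C" {
        // Declared in <pthread.h> on macOS 11 and later.
        fn pthread_jit_write_protect_np(enabled: std::os::raw::c_int);
    }
    unsafe { pthread_jit_write_protect_np(1) }
}

/// No-op on every other target.
#[cfg(not(all(target_arch = "aarch64", target_os = "macos")))]
pub fn enter_jit_execute_mode() {}

/// Hypothetical wrapper around `thread::spawn` for threads that call JIT code.
pub fn spawn_jit_thread<F, T>(f: F) -> JoinHandle<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::spawn(move || {
        // The mode is per-thread, so it must be set here,
        // not on the thread that calls `spawn_jit_thread`.
        enter_jit_execute_mode();
        f()
    })
}
```

Documenting the requirement instead would keep the API surface unchanged, at the cost of leaving every caller to remember the per-thread switch.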
| 113 | + |
| 114 | +## Checklist |
| 115 | + |
| 116 | +- [x] Code compiles without warnings |
| 117 | +- [x] All existing tests pass |
| 118 | +- [x] New functionality tested (50+ stability runs) |
| 119 | +- [x] Changes are platform-specific (no impact on other platforms) |
| 120 | +- [x] Comments explain the "why" not just the "what" |