Pre-built wheels that erase Flash Attention 3 installation headaches — now with Windows & Arm support! 🎉
**🚀 Update: We've successfully built Flash Attention 3 wheels for Windows and Arm CUDA SBSA platforms like GH200!**
Upstream PR: Windows compatibility fixes submitted to Dao-AILab/flash-attention#2047
- 2026-03-19: 🆕 CUDA 13.0 support added for Windows and Linux
- 2026-03-16: ⚙️ GitHub Actions updated to Node.js 24 compatible versions
- 2026-03-04: 🐛 Fixed corrupt patch error in Windows builds
- 2026-03-04: 🔧 Build scripts now clone directly from upstream flash-attention
- Earlier: 🪟 Windows and Arm64 (GH200) wheel support added; upstream PR #2047 submitted
Pick the line that matches your setup (change cu128 / torch280 if needed):
```bash
# CUDA 12.8 + PyTorch 2.8.0
pip install flash_attn_3 \
  --find-links https://windreamer.github.io/flash-attention3-wheels/cu128_torch280
```

Visit the GitHub Pages site and choose the link that matches:
- CUDA 13.0 → `cu130_torch...`
- CUDA 12.9 → `cu129_torch...`
- CUDA 12.8 → `cu128_torch...`
- CUDA 12.6 → `cu126_torch...`
Each page shows the one-liner you need.
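If you script your environment setup, the index path can be derived from your CUDA and PyTorch versions. A minimal sketch, assuming the naming pattern implied by the `cu128_torch280` example above (the `index_suffix` helper is illustrative, not part of this project — check the GitHub Pages index for the suffixes that actually exist):

```python
def index_suffix(cuda_version: str, torch_version: str) -> str:
    """Build the find-links path segment, e.g. ('12.8', '2.8.0') -> 'cu128_torch280'.

    Assumes the cu<major><minor>_torch<major><minor><patch> pattern shown above.
    """
    cu = "cu" + cuda_version.replace(".", "")
    torch_tag = "torch" + torch_version.replace(".", "")
    return f"{cu}_{torch_tag}"

BASE = "https://windreamer.github.io/flash-attention3-wheels/"
print(BASE + index_suffix("12.8", "2.8.0"))
# -> https://windreamer.github.io/flash-attention3-wheels/cu128_torch280
```

The printed URL is what you pass to `pip install --find-links`.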
Wheels are rebuilt:

- On a schedule: the 2nd and 4th Sunday of each month at 22:00 UTC
- On demand: trigger the workflow manually if you need a fresher build
Releases are tagged with the build date (e.g. `2025.10.15`), so you always know how fresh your wheel is.
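Because tags are calendar dates, the freshest release can be found with a plain numeric sort. A small sketch (the tag values here are made-up examples):

```python
def tag_key(tag: str) -> tuple:
    """Parse a date tag like '2025.10.15' into a sortable (year, month, day) tuple."""
    return tuple(int(part) for part in tag.split("."))

# Hypothetical list of release tags; sort numerically, not lexically,
# so '2025.9.28' correctly precedes '2025.10.15'.
tags = ["2025.9.28", "2025.10.15", "2025.8.3"]
print(max(tags, key=tag_key))  # -> 2025.10.15
```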
The build scripts and index generator are licensed under Apache-2.0.