- Run the workflow 'Build Triton'. Set 'Git tag' (e.g. `v3.3.x-windows` as of now) and 'Triton wheel version suffix' (e.g. `.post18` for a regular release and `a0.post18` for a pre-release)
 - Download the artifacts
 - Do some sanity checks locally, e.g. diff with the last wheel, pip install the wheel, run it in ComfyUI
 - Upload the wheels to PyPI using twine
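
The "diff with the last wheel" check can be scripted. Below is a minimal sketch, assuming the previous and the newly built wheel are both available locally. The helper names `wheel_members` and `diff_wheels` are hypothetical and not part of this repository. Because a wheel is a zip archive, the sketch compares only member names and uncompressed sizes.

```python
import zipfile


def wheel_members(path):
    """Map each file inside the wheel to its uncompressed size."""
    with zipfile.ZipFile(path) as zf:
        return {info.filename: info.file_size for info in zf.infolist()}


def diff_wheels(old_path, new_path):
    """Return (added, removed, resized) file names between two wheels."""
    old = wheel_members(old_path)
    new = wheel_members(new_path)
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    # Files present in both wheels whose uncompressed size changed.
    resized = sorted(n for n in old.keys() & new.keys() if old[n] != new[n])
    return added, removed, resized
```

Calling `diff_wheels(old_wheel, new_wheel)` returns three sorted lists. Entries under the `.dist-info` directory are expected to show up as added and removed, since that directory name contains the version. Unexpected additions or removals elsewhere are worth a closer look before uploading.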
 
The workflow 'Build and Test Triton' runs all unit tests. It requires a self-hosted runner with a GPU. Because of the cost, we only turn on the VM for significant releases.
- Run the workflow 'Build SageAttention'. We support torch 2.5.1 + CUDA 12.4.1 and torch 2.6.0 + CUDA 12.6.3 as of now
 - Download the artifacts
 - Do some sanity checks locally
 - Upload the wheels to GitHub releases
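
One possible local sanity check for the SageAttention artifacts is to confirm that every supported torch + CUDA combination has a wheel. The sketch below is only an illustration. It assumes wheel file names carry a local version tag of the form `+cu126torch2.6.0`, which is an assumption about the naming scheme and should be adjusted to match the actual artifacts. The names `SUPPORTED`, `cuda_short`, and `missing_combinations` are hypothetical.

```python
import re

# Supported combinations listed in this document: (torch version, CUDA version).
SUPPORTED = [("2.5.1", "12.4.1"), ("2.6.0", "12.6.3")]

# ASSUMPTION: wheel names contain a local version tag like "+cu126torch2.6.0".
TAG_RE = re.compile(r"\+cu(\d+)torch(\d+\.\d+\.\d+)")


def cuda_short(cuda_version):
    """Convert e.g. '12.6.3' to '126' (major and minor digits only)."""
    major, minor = cuda_version.split(".")[:2]
    return major + minor


def missing_combinations(wheel_names):
    """Return the supported (torch, CUDA) pairs that no wheel name covers."""
    found = set()
    for name in wheel_names:
        match = TAG_RE.search(name)
        if match:
            # Store as (torch version, short CUDA tag).
            found.add((match.group(2), match.group(1)))
    return [(t, c) for t, c in SUPPORTED if (t, cuda_short(c)) not in found]
```

Passing the list of downloaded wheel file names to `missing_combinations` returns an empty list when every supported combination is covered. When the supported versions change, `SUPPORTED` needs to be updated to match.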