Update Inductor windows tutorial with xpu support #3309

Merged
merged 27 commits into from
Apr 18, 2025
Changes from 10 commits
1d49a44
update link to a intel compiler guide with performance data
ZhaoqiongZ Oct 21, 2024
d3f2ca0
update
ZhaoqiongZ Oct 21, 2024
7335f5b
add title on the link
ZhaoqiongZ Oct 21, 2024
a773e9b
update
ZhaoqiongZ Oct 21, 2024
53f3886
rephrase the sentence
ZhaoqiongZ Oct 21, 2024
ca95172
Merge branch 'main' into main
ZhaoqiongZ Oct 21, 2024
372f0c9
Merge branch 'main' into main
svekars Oct 22, 2024
7aae4e6
Merge branch 'pytorch:main' into main
ZhaoqiongZ Mar 31, 2025
3f09390
update inductor windows with xpu support
ZhaoqiongZ Mar 31, 2025
ab6da28
Update conclusion inductor_windows.rst
ZhaoqiongZ Apr 1, 2025
615f97b
Update inductor_windows.rst
ZhaoqiongZ Apr 1, 2025
1ce21b9
Merge branch 'main' into main
ZhaoqiongZ Apr 2, 2025
8cac362
add inductor_windows_cpu and redirect to inductor_windows
ZhaoqiongZ Apr 3, 2025
e40b266
update inductor_windows
ZhaoqiongZ Apr 4, 2025
9bc6895
Merge branch 'main' into main
ZhaoqiongZ Apr 14, 2025
8fff638
Merge branch 'main' into main
svekars Apr 15, 2025
a81c441
Update prototype_source/inductor_windows.rst
ZhaoqiongZ Apr 17, 2025
836b2b9
Update prototype_source/inductor_windows.rst
ZhaoqiongZ Apr 17, 2025
5af0cd5
Update prototype_source/inductor_windows.rst
ZhaoqiongZ Apr 17, 2025
c8f5f6c
Update prototype_source/inductor_windows.rst
ZhaoqiongZ Apr 17, 2025
62358be
Update prototype_source/inductor_windows.rst
ZhaoqiongZ Apr 17, 2025
fcbd371
Update prototype_source/inductor_windows.rst
ZhaoqiongZ Apr 17, 2025
a90c38a
Update prototype_source/inductor_windows.rst
ZhaoqiongZ Apr 17, 2025
8202be7
Update prototype_source/inductor_windows.rst
ZhaoqiongZ Apr 17, 2025
b41b86c
Update prototype_source/inductor_windows.rst
ZhaoqiongZ Apr 17, 2025
2ce29a0
Update prototype_source/inductor_windows.rst
ZhaoqiongZ Apr 17, 2025
eec7ebf
Merge branch 'main' into main
svekars Apr 18, 2025
111 changes: 111 additions & 0 deletions prototype_source/inductor_windows.rst
@@ -0,0 +1,111 @@
How to use ``torch.compile`` on Windows CPU/XPU
===============================================

**Author**: `Zhaoqiong Zheng <https://github.com/ZhaoqiongZ>`_, `Xu, Han <https://github.com/xuhancn>`_


Introduction
------------

TorchInductor is the new compiler backend that compiles the FX Graphs generated by TorchDynamo into optimized C++/Triton kernels.

This tutorial introduces the steps for utilizing TorchInductor via ``torch.compile`` on Windows CPU/XPU.


Software Installation
---------------------

This section walks through the steps for using ``torch.compile`` on Windows CPU/XPU.

Install a Compiler
^^^^^^^^^^^^^^^^^^

A C++ compiler is required for TorchInductor optimization. This tutorial uses Microsoft Visual C++ (MSVC) as an example.

Contributor: keep formatting consistent, i.e., TorchInductor

Contributor Author: keep formatting for all the TorchInductor

Download and install `MSVC <https://visualstudio.microsoft.com/downloads/>`_.

During installation, open the ``Workloads`` tab, go to the ``Desktop & Mobile`` section, check ``Desktop Development with C++``, and then install.

Contributor: Might be nice to have screenshots here.

Contributor Author: I've added a screenshot here

.. note::

   Windows CPU Inductor also supports the `LLVM Compiler <https://github.com/llvm/llvm-project/releases>`_ and `Intel Compiler <https://www.intel.com/content/www/us/en/developer/tools/oneapi/dpc-compiler-download.html>`_ C++ compilers for better performance.
   Please check `Alternative Compiler for better performance on CPU <#alternative-compiler-for-better-performance>`_.

Conda Installation
^^^^^^^^^^^^^^^^^^

Contributor: Per pytorch/pytorch#149551, Conda is no longer being used.

Contributor Author: Remove the conda installation and let users create and activate a virtual environment on their own

Prepare a conda environment with Miniforge or Anaconda.
For example, download and install `Miniforge <https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Windows-x86_64.exe>`_.

Set Up Environment
^^^^^^^^^^^^^^^^^^

#. Open a command line environment via ``cmd.exe``.
#. Activate ``MSVC`` with the following command::

    "C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Auxiliary/Build/vcvars64.bat"

#. Activate ``conda`` with the following command::

    "C:/ProgramData/miniforge3/Scripts/activate.bat"

#. Create a custom conda environment::

    conda create -n inductor_windows python=3.10 -y

#. Activate the custom conda environment::

    conda activate inductor_windows

#. Install `PyTorch 2.5 <https://pytorch.org/get-started/locally/>`_ or later for CPU usage. For XPU usage, install PyTorch 2.7 or later by following `Getting Started on Intel GPU <https://pytorch.org/docs/main/notes/get_start_xpu.html>`_.
#. Use TorchInductor on Windows::

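    import torch

    device = "cpu"  # or "xpu" for XPU

    def foo(x, y):
        # y is accepted but unused; the result is sin(x) + cos(x)
        a = torch.sin(x)
        b = torch.cos(x)
        return a + b

    # Compile foo with torch.compile, which uses TorchInductor by default
    opt_foo1 = torch.compile(foo)
    print(opt_foo1(torch.randn(10, 10).to(device), torch.randn(10, 10).to(device)))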

#. Output of the above example::

    tensor([[-3.9074e-02,  1.3994e+00,  1.3894e+00,  3.2630e-01,  8.3060e-01,
              1.1833e+00,  1.4016e+00,  7.1905e-01,  9.0637e-01, -1.3648e+00],
            [ 1.3728e+00,  7.2863e-01,  8.6888e-01, -6.5442e-01,  5.6790e-01,
              5.2025e-01, -1.2647e+00,  1.2684e+00, -1.2483e+00, -7.2845e-01],
            [-6.7747e-01,  1.2028e+00,  1.1431e+00,  2.7196e-02,  5.5304e-01,
              6.1945e-01,  4.6654e-01, -3.7376e-01,  9.3644e-01,  1.3600e+00],
            [-1.0157e-01,  7.7200e-02,  1.0146e+00,  8.8175e-02, -1.4057e+00,
              8.8119e-01,  6.2853e-01,  3.2773e-01,  8.5082e-01,  8.4615e-01],
            [ 1.4140e+00,  1.2130e+00, -2.0762e-01,  3.3914e-01,  4.1122e-01,
              8.6895e-01,  5.8852e-01,  9.3310e-01,  1.4101e+00,  9.8318e-01],
            [ 1.2355e+00,  7.9290e-02,  1.3707e+00,  1.3754e+00,  1.3768e+00,
              9.8970e-01,  1.1171e+00, -5.9944e-01,  1.2553e+00,  1.3394e+00],
            [-1.3428e+00,  1.8400e-01,  1.1756e+00, -3.0654e-01,  9.7973e-01,
              1.4019e+00,  1.1886e+00, -1.9194e-01,  1.3632e+00,  1.1811e+00],
            [-7.1615e-01,  4.6622e-01,  1.2089e+00,  9.2011e-01,  1.0659e+00,
              9.0892e-01,  1.1932e+00,  1.3888e+00,  1.3898e+00,  1.3218e+00],
            [ 1.4139e+00, -1.4000e-01,  9.1192e-01,  3.0175e-01, -9.6432e-01,
             -1.0498e+00,  1.4115e+00, -9.3212e-01, -9.0964e-01,  1.0127e+00],
            [ 5.7244e-04,  1.2799e+00,  1.3595e+00,  1.0907e+00,  3.7191e-01,
              1.4062e+00,  1.3672e+00,  6.8502e-02,  8.5216e-01,  8.6046e-01]])
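
As a quick sanity check on the installation step, the following sketch compares the installed PyTorch version against the minimums stated in this tutorial (2.5 for CPU, 2.7 for XPU). The helper names ``parse_major_minor``, ``meets_minimum``, and ``installed_torch_version`` are illustrative and not part of PyTorch. The version is read with the standard library's ``importlib.metadata``, assuming PyTorch was installed under the package name ``torch``.

```python
from importlib import metadata
from typing import Optional, Tuple

# Minimum PyTorch versions stated in the installation step of this tutorial.
MIN_VERSION = {"cpu": (2, 5), "xpu": (2, 7)}


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '2.5.1+cpu'."""
    public = version.split("+", 1)[0]  # drop a local suffix like '+cpu'
    major, minor = public.split(".")[:2]
    return int(major), int(minor)


def meets_minimum(version: str, device: str) -> bool:
    """Return True if `version` is new enough for `device` ('cpu' or 'xpu')."""
    return parse_major_minor(version) >= MIN_VERSION[device]


def installed_torch_version() -> Optional[str]:
    """Return the installed PyTorch version string, or None if it is missing."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_torch_version()
    if found is None:
        print("PyTorch is not installed in this environment")
    else:
        for device in MIN_VERSION:
            status = "ok" if meets_minimum(found, device) else "too old"
            print(f"{device}: {status} (found {found})")
```

Run the script inside the activated ``inductor_windows`` environment. It only inspects package metadata, so it does not confirm that a C++ compiler or an XPU device is available.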

Alternative Compiler for better performance on CPU
--------------------------------------------------

To improve Inductor performance on Windows CPU, you can use the Intel Compiler or LLVM Compiler. Both rely on the runtime libraries from Microsoft Visual C++ (MSVC), so install MSVC first.

Intel Compiler
^^^^^^^^^^^^^^

#. Download and install the `Intel Compiler <https://www.intel.com/content/www/us/en/developer/tools/oneapi/dpc-compiler-download.html>`_ for Windows.
#. Set the compiler used by Inductor on Windows through the ``CXX`` environment variable: ``set CXX=icx-cl``.

LLVM Compiler
^^^^^^^^^^^^^

#. Download and install the `LLVM Compiler <https://github.com/llvm/llvm-project/releases>`_, choosing the win64 version.
#. Set the compiler used by Inductor on Windows through the ``CXX`` environment variable: ``set CXX=clang-cl``.
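
Before running ``torch.compile``, it can be useful to confirm that the compiler named by ``CXX`` is actually reachable from the current command prompt. The sketch below uses only the Python standard library. ``resolve_cxx`` is a hypothetical helper, and the fallback to ``cl`` when ``CXX`` is unset is an assumption based on this tutorial's MSVC-first setup, not a statement about Inductor internals.

```python
import os
import shutil
from typing import Mapping, Optional, Tuple

# Compiler executables named in this tutorial: MSVC, Intel, and LLVM.
KNOWN_COMPILERS = ("cl", "icx-cl", "clang-cl")


def resolve_cxx(env: Mapping[str, str] = os.environ) -> Tuple[str, Optional[str]]:
    """Return (compiler name, full path), with path None if it is not on PATH.

    Assumption: when CXX is unset, MSVC's `cl` is the compiler in use.
    """
    name = env.get("CXX", "cl")
    return name, shutil.which(name, path=env.get("PATH"))


if __name__ == "__main__":
    name, path = resolve_cxx()
    if name not in KNOWN_COMPILERS:
        print(f"CXX={name} is not one of the compilers covered here")
    if path is None:
        print(f"{name} was not found on PATH; activate its environment first")
    else:
        print(f"{name} resolves to {path}")
```

If the compiler is not found, rerun ``vcvars64.bat`` (or the corresponding setup script for the Intel or LLVM compiler) in the same command prompt and try again.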

Conclusion
----------

This tutorial introduced how to use Inductor on Windows CPU/XPU with PyTorch. For better performance on CPU, you can use the Intel Compiler or LLVM Compiler.
130 changes: 0 additions & 130 deletions prototype_source/inductor_windows_cpu.rst

This file was deleted.

4 changes: 2 additions & 2 deletions prototype_source/prototype_index.rst
@@ -221,7 +221,7 @@ Prototype features are not available as part of binary distributions like PyPI o
:header: Inductor Windows CPU Tutorial
:card_description: Speed up your models with Inductor On Windows CPU
:image: ../_static/img/thumbnails/cropped/generic-pytorch-logo.png
-:link: ../prototype/inductor_windows_cpu.html
+:link: ../prototype/inductor_windows.html
:tags: Model-Optimization

.. customcarditem::
@@ -271,7 +271,7 @@ Prototype features are not available as part of binary distributions like PyPI o
prototype/flight_recorder_tutorial.html
prototype/graph_mode_dynamic_bert_tutorial.html
prototype/inductor_cpp_wrapper_tutorial.html
-prototype/inductor_windows_cpu.html
+prototype/inductor_windows.html
prototype/pt2e_quantizer.html
prototype/pt2e_quant_ptq.html
prototype/pt2e_quant_qat.html