From 310aa6a49db4596aaaa52ee04dc91d1aa9e1c74d Mon Sep 17 00:00:00 2001 From: Alanna Burke Date: Thu, 3 Apr 2025 20:06:23 -0400 Subject: [PATCH 1/9] Removing and marking conda mentions for update. --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index f17f7f5e3d9..80c097d0a31 100644 --- a/README.md +++ b/README.md @@ -37,7 +37,7 @@ The tutorial build is very large and requires a GPU. If your machine does not ha 1. Install required dependencies by running: `pip install -r requirements.txt`. -> Typically, you would run either in `conda` or `virtualenv`. If you want to use `virtualenv`, in the root of the repo, run: `virtualenv venv`, then `source venv/bin/activate`. +> To use `virtualenv`, in the root of the repo, run: `virtualenv venv`, then `source venv/bin/activate`. - If you have a GPU-powered laptop, you can build using `make docs`. This will download the data, execute the tutorials and build the documentation to `docs/` directory. This might take about 60-120 min for systems with GPUs. If you do not have a GPU installed on your system, then see next step. - You can skip the computationally intensive graph generation by running `make html-noplot` to build basic html documentation to `_build/html`. This way, you can quickly preview your tutorial. From d922a63fef63b6bfc83c5c29765cefed9bba17c4 Mon Sep 17 00:00:00 2001 From: Alanna Burke Date: Thu, 3 Apr 2025 20:06:32 -0400 Subject: [PATCH 2/9] Removing and marking conda mentions for update. --- _templates/layout.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_templates/layout.html b/_templates/layout.html index d6946eab087..ac19f8b4869 100644 --- a/_templates/layout.html +++ b/_templates/layout.html @@ -22,7 +22,7 @@ From 927dde7690b4bc81053ab8e01c9455a75c963326 Mon Sep 17 00:00:00 2001 From: Alanna Burke Date: Thu, 3 Apr 2025 20:06:43 -0400 Subject: [PATCH 3/9] Removing and marking conda mentions for update. --- advanced_source/sharding.rst | 11 +++++------ advanced_source/torch_script_custom_ops.rst | 1 + 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/advanced_source/sharding.rst b/advanced_source/sharding.rst index 7dfeeb88bf1..8c811034671 100644 --- a/advanced_source/sharding.rst +++ b/advanced_source/sharding.rst @@ -15,8 +15,8 @@ We highly recommend CUDA when using torchRec. If using CUDA: - cuda >= 11.0 .. code:: python - - # install conda to make installying pytorch with cudatoolkit 11.3 easier. + # TODO: replace these + # install conda to make installying pytorch with cudatoolkit 11.3 easier. !sudo rm Miniconda3-py37_4.9.2-Linux-x86_64.sh Miniconda3-py37_4.9.2-Linux-x86_64.sh.* !sudo wget https://repo.anaconda.com/miniconda/Miniconda3-py37_4.9.2-Linux-x86_64.sh !sudo chmod +x Miniconda3-py37_4.9.2-Linux-x86_64.sh @@ -213,7 +213,7 @@ embedding table placement using planner and generate sharded model using ) sharders = [cast(ModuleSharder[torch.nn.Module], EmbeddingBagCollectionSharder())] plan: ShardingPlan = planner.collective_plan(module, sharders, pg) - + sharded_model = DistributedModelParallel( module, env=ShardingEnv.from_process_group(pg), @@ -234,7 +234,7 @@ ranks. .. code:: python import multiprocess - + def spmd_sharing_simulation( sharding_type: ShardingType = ShardingType.TABLE_WISE, world_size = 2, @@ -254,7 +254,7 @@ ranks. ) p.start() processes.append(p) - + for p in processes: p.join() assert 0 == p.exitcode @@ -333,4 +333,3 @@ With data parallel, we will repeat the tables for all devices. 
rank:0,sharding plan: {'': {'large_table_0': ParameterSharding(sharding_type='data_parallel', compute_kernel='batched_dense', ranks=[0, 1], sharding_spec=None), 'large_table_1': ParameterSharding(sharding_type='data_parallel', compute_kernel='batched_dense', ranks=[0, 1], sharding_spec=None), 'small_table_0': ParameterSharding(sharding_type='data_parallel', compute_kernel='batched_dense', ranks=[0, 1], sharding_spec=None), 'small_table_1': ParameterSharding(sharding_type='data_parallel', compute_kernel='batched_dense', ranks=[0, 1], sharding_spec=None)}} rank:1,sharding plan: {'': {'large_table_0': ParameterSharding(sharding_type='data_parallel', compute_kernel='batched_dense', ranks=[0, 1], sharding_spec=None), 'large_table_1': ParameterSharding(sharding_type='data_parallel', compute_kernel='batched_dense', ranks=[0, 1], sharding_spec=None), 'small_table_0': ParameterSharding(sharding_type='data_parallel', compute_kernel='batched_dense', ranks=[0, 1], sharding_spec=None), 'small_table_1': ParameterSharding(sharding_type='data_parallel', compute_kernel='batched_dense', ranks=[0, 1], sharding_spec=None)}} - diff --git a/advanced_source/torch_script_custom_ops.rst b/advanced_source/torch_script_custom_ops.rst index 0a0e6e2bd70..99d0185c47c 100644 --- a/advanced_source/torch_script_custom_ops.rst +++ b/advanced_source/torch_script_custom_ops.rst @@ -189,6 +189,7 @@ Environment setup We need an installation of PyTorch and OpenCV. The easiest and most platform independent way to get both is to via Conda:: +.. # TODO: replace these conda install -c pytorch pytorch conda install opencv From 6491947db3ae077abd85ca18f521a24815223cfe Mon Sep 17 00:00:00 2001 From: Alanna Burke Date: Thu, 3 Apr 2025 20:06:56 -0400 Subject: [PATCH 4/9] Removing and marking conda mentions for update. --- beginner_source/hta_intro_tutorial.rst | 6 +- beginner_source/introyt/captumyt.py | 85 +++++++++---------- .../introyt/tensorboardyt_tutorial.py | 61 ++++++------- 3 files changed, 69 insertions(+), 83 deletions(-) diff --git a/beginner_source/hta_intro_tutorial.rst b/beginner_source/hta_intro_tutorial.rst index dc7c8cedf9e..0443202d627 100644 --- a/beginner_source/hta_intro_tutorial.rst +++ b/beginner_source/hta_intro_tutorial.rst @@ -9,7 +9,7 @@ below. Installing HTA ~~~~~~~~~~~~~~ - +.. # TODO: replace We recommend using a Conda environment to install HTA. To install Anaconda, see `the official Anaconda documentation `_. @@ -130,12 +130,12 @@ on each rank. .. image:: ../_static/img/hta/idle_time_summary.png :scale: 100% - + .. tip:: By default, the idle time breakdown presents the percentage of each of the idle time categories. Setting the ``visualize_pctg`` argument to ``False``, - the function renders with absolute time on the y-axis. + the function renders with absolute time on the y-axis. Kernel Breakdown diff --git a/beginner_source/introyt/captumyt.py b/beginner_source/introyt/captumyt.py index abf2391d254..824998b63da 100644 --- a/beginner_source/introyt/captumyt.py +++ b/beginner_source/introyt/captumyt.py @@ -106,14 +106,7 @@ - Matplotlib version 3.3.4, since Captum currently uses a Matplotlib function whose arguments have been renamed in later versions -To install Captum in an Anaconda or pip virtual environment, use the -appropriate command for your environment below: - -With ``conda``: - -.. 
code-block:: sh - - conda install pytorch torchvision captum flask-compress matplotlib=3.3.4 -c pytorch +To install Captum in a pip virtual environment, use the command below: With ``pip``: @@ -127,14 +120,14 @@ A First Example --------------- - + To start, let’s take a simple, visual example. We’ll start with a ResNet model pretrained on the ImageNet dataset. We’ll get a test input, and use different **Feature Attribution** algorithms to examine how the input images affect the output, and see a helpful visualization of this input attribution map for some test images. - -First, some imports: + +First, some imports: """ @@ -160,7 +153,7 @@ # Now we’ll use the TorchVision model library to download a pretrained # ResNet. Since we’re not training, we’ll place it in evaluation mode for # now. -# +# model = models.resnet18(weights='IMAGENET1K_V1') model = model.eval() @@ -169,7 +162,7 @@ ####################################################################### # The place where you got this interactive notebook should also have an # ``img`` folder with a file ``cat.jpg`` in it. -# +# test_img = Image.open('img/cat.jpg') test_img_data = np.asarray(test_img) @@ -183,7 +176,7 @@ # range of values. We’ll also pull in the list of human-readable labels # for the categories our model recognizes - that should be in the ``img`` # folder as well. -# +# # model expects 224x224 3-color image transform = transforms.Compose([ @@ -210,7 +203,7 @@ ###################################################################### # Now, we can ask the question: What does our model think this image # represents? -# +# output = model(input_img) output = F.softmax(output, dim=1) @@ -223,53 +216,53 @@ ###################################################################### # We’ve confirmed that ResNet thinks our image of a cat is, in fact, a # cat. But *why* does the model think this is an image of a cat? -# +# # For the answer to that, we turn to Captum. -# +# ########################################################################## # Feature Attribution with Integrated Gradients # --------------------------------------------- -# +# # **Feature attribution** attributes a particular output to features of # the input. It uses a specific input - here, our test image - to generate # a map of the relative importance of each input feature to a particular # output feature. -# +# # `Integrated # Gradients `__ is one of # the feature attribution algorithms available in Captum. Integrated # Gradients assigns an importance score to each input feature by # approximating the integral of the gradients of the model’s output with # respect to the inputs. -# +# # In our case, we’re going to be taking a specific element of the output # vector - that is, the one indicating the model’s confidence in its # chosen category - and use Integrated Gradients to understand what parts # of the input image contributed to this output. -# +# # Once we have the importance map from Integrated Gradients, we’ll use the # visualization tools in Captum to give a helpful representation of the # importance map. Captum’s ``visualize_image_attr()`` function provides a # variety of options for customizing display of your attribution data. # Here, we pass in a custom Matplotlib color map. -# +# # Running the cell with the ``integrated_gradients.attribute()`` call will # usually take a minute or two. 
-# +# # Initialize the attribution algorithm with the model integrated_gradients = IntegratedGradients(model) -# Ask the algorithm to attribute our output target to +# Ask the algorithm to attribute our output target to attributions_ig = integrated_gradients.attribute(input_img, target=pred_label_idx, n_steps=200) # Show the original image for comparison -_ = viz.visualize_image_attr(None, np.transpose(transformed_img.squeeze().cpu().detach().numpy(), (1,2,0)), +_ = viz.visualize_image_attr(None, np.transpose(transformed_img.squeeze().cpu().detach().numpy(), (1,2,0)), method="original_image", title="Original Image") -default_cmap = LinearSegmentedColormap.from_list('custom blue', +default_cmap = LinearSegmentedColormap.from_list('custom blue', [(0, '#ffffff'), (0.25, '#0000ff'), (1, '#0000ff')], N=256) @@ -286,13 +279,13 @@ ####################################################################### # In the image above, you should see that Integrated Gradients gives us # the strongest signal around the cat’s location in the image. -# +# ########################################################################## # Feature Attribution with Occlusion # ---------------------------------- -# +# # Gradient-based attribution methods help to understand the model in terms # of directly computing out the output changes with respect to the input. # *Perturbation-based attribution* methods approach this more directly, by @@ -300,7 +293,7 @@ # `Occlusion `__ is one such method. # It involves replacing sections of the input image, and examining the # effect on the output signal. -# +# # Below, we set up Occlusion attribution. Similarly to configuring a # convolutional neural network, you can specify the size of the target # region, and a stride length to determine the spacing of individual @@ -310,7 +303,7 @@ # image with the positive attribution regions. The masking gives a very # instructive view of what regions of our cat photo the model found to be # most “cat-like”. -# +# occlusion = Occlusion(model) @@ -334,18 +327,18 @@ ###################################################################### # Again, we see greater significance placed on the region of the image # that contains the cat. -# +# ######################################################################### # Layer Attribution with Layer GradCAM # ------------------------------------ -# +# # **Layer Attribution** allows you to attribute the activity of hidden # layers within your model to features of your input. Below, we’ll use a # layer attribution algorithm to examine the activity of one of the # convolutional layers within our model. -# +# # GradCAM computes the gradients of the target output with respect to the # given layer, averages for each output channel (dimension 2 of output), # and multiplies the average gradient for each channel by the layer @@ -353,12 +346,12 @@ # designed for convnets; since the activity of convolutional layers often # maps spatially to the input, GradCAM attributions are often upsampled # and used to mask the input. -# +# # Layer attribution is set up similarly to input attribution, except that # in addition to the model, you must specify a hidden layer within the # model that you wish to examine. As above, when we call ``attribute()``, # we specify the target class of interest. 
-# +# layer_gradcam = LayerGradCam(model, model.layer3[1].conv2) attributions_lgc = layer_gradcam.attribute(input_img, target=pred_label_idx) @@ -373,7 +366,7 @@ # `LayerAttribution `__ # base class to upsample this attribution data for comparison to the input # image. -# +# upsamp_attr_lgc = LayerAttribution.interpolate(attributions_lgc, input_img.shape[2:]) @@ -393,26 +386,26 @@ ####################################################################### # Visualizations such as this can give you novel insights into how your # hidden layers respond to your input. -# +# ########################################################################## # Visualization with Captum Insights # ---------------------------------- -# +# # Captum Insights is an interpretability visualization widget built on top # of Captum to facilitate model understanding. Captum Insights works # across images, text, and other features to help users understand feature # attribution. It allows you to visualize attribution for multiple # input/output pairs, and provides visualization tools for image, text, # and arbitrary data. -# +# # In this section of the notebook, we’ll visualize multiple image # classification inferences with Captum Insights. -# +# # First, let’s gather some image and see what the model thinks of them. # For variety, we’ll take our cat, a teapot, and a trilobite fossil: -# +# imgs = ['img/cat.jpg', 'img/teapot.jpg', 'img/trilobite.jpg'] @@ -437,9 +430,9 @@ # imported below. The ``AttributionVisualizer`` expects batches of data, # so we’ll bring in Captum’s ``Batch`` helper class. And we’ll be looking # at images specifically, so well also import ``ImageFeature``. -# +# # We configure the ``AttributionVisualizer`` with the following arguments: -# +# # - An array of models to be examined (in our case, just the one) # - A scoring function, which allows Captum Insights to pull out the # top-k predictions from a model @@ -447,7 +440,7 @@ # - A list of features to look for - in our case, an ``ImageFeature`` # - A dataset, which is an iterable object returning batches of inputs # and labels - just like you’d use for training -# +# from captum.insights import AttributionVisualizer, Batch from captum.insights.attr_vis.features import ImageFeature @@ -488,12 +481,12 @@ def full_img_transform(input): # configure different attribution algorithms in a visual widget, after # which it will compute and display the attributions. *That* process will # take a few minutes. -# +# # Running the cell below will render the Captum Insights widget. You can # then choose attributions methods and their arguments, filter model # responses based on predicted class or prediction correctness, see the # model’s predictions with associated probabilities, and view heatmaps of # the attribution compared with the original image. -# +# visualizer.render() diff --git a/beginner_source/introyt/tensorboardyt_tutorial.py b/beginner_source/introyt/tensorboardyt_tutorial.py index 49d321bd6df..4b7aec8e2f3 100644 --- a/beginner_source/introyt/tensorboardyt_tutorial.py +++ b/beginner_source/introyt/tensorboardyt_tutorial.py @@ -24,13 +24,6 @@ To run this tutorial, you’ll need to install PyTorch, TorchVision, Matplotlib, and TensorBoard. -With ``conda``: - -.. code-block:: sh - - conda install pytorch torchvision -c pytorch - conda install matplotlib tensorboard - With ``pip``: .. code-block:: sh @@ -43,11 +36,11 @@ Introduction ------------ - + In this notebook, we’ll be training a variant of LeNet-5 against the Fashion-MNIST dataset. 
Fashion-MNIST is a set of image tiles depicting various garments, with ten class labels indicating the type of garment -depicted. +depicted. """ @@ -79,9 +72,9 @@ ###################################################################### # Showing Images in TensorBoard # ----------------------------- -# +# # Let’s start by adding sample images from our dataset to TensorBoard: -# +# # Gather datasets and prepare them for consumption transform = transforms.Compose( @@ -138,7 +131,7 @@ def matplotlib_imshow(img, one_channel=False): # minibatch of our input data. Below, we use the ``add_image()`` call on # ``SummaryWriter`` to log the image for consumption by TensorBoard, and # we also call ``flush()`` to make sure it’s written to disk right away. -# +# # Default log_dir argument is "runs" - but it's good to be specific # torch.utils.tensorboard.SummaryWriter is imported above @@ -157,17 +150,17 @@ def matplotlib_imshow(img, one_channel=False): # If you start TensorBoard at the command line and open it in a new # browser tab (usually at `localhost:6006 `__), you should # see the image grid under the IMAGES tab. -# +# # Graphing Scalars to Visualize Training # -------------------------------------- -# +# # TensorBoard is useful for tracking the progress and efficacy of your # training. Below, we’ll run a training loop, track some metrics, and save # the data for TensorBoard’s consumption. -# +# # Let’s define a model to categorize our image tiles, and an optimizer and # loss function for training: -# +# class Net(nn.Module): def __init__(self): @@ -187,7 +180,7 @@ def forward(self, x): x = F.relu(self.fc2(x)) x = self.fc3(x) return x - + net = Net() criterion = nn.CrossEntropyLoss() @@ -197,7 +190,7 @@ def forward(self, x): ########################################################################## # Now let’s train a single epoch, and evaluate the training vs. validation # set losses every 1000 batches: -# +# print(len(validation_loader)) for epoch in range(1): # loop over the dataset multiple times @@ -217,7 +210,7 @@ def forward(self, x): print('Batch {}'.format(i + 1)) # Check against the validation set running_vloss = 0.0 - + # In evaluation mode some model specific operations can be omitted eg. dropout layer net.train(False) # Switching to evaluation mode, eg. turning off regularisation for j, vdata in enumerate(validation_loader, 0): @@ -226,10 +219,10 @@ def forward(self, x): vloss = criterion(voutputs, vlabels) running_vloss += vloss.item() net.train(True) # Switching back to training mode, eg. turning on regularisation - + avg_loss = running_loss / 1000 avg_vloss = running_vloss / len(validation_loader) - + # Log the running loss averaged per batch writer.add_scalars('Training vs. Validation Loss', { 'Training' : avg_loss, 'Validation' : avg_vloss }, @@ -243,14 +236,14 @@ def forward(self, x): ######################################################################### # Switch to your open TensorBoard and have a look at the SCALARS tab. -# +# # Visualizing Your Model # ---------------------- -# +# # TensorBoard can also be used to examine the data flow within your model. # To do this, call the ``add_graph()`` method with a model and sample # input: -# +# # Again, grab a single mini-batch of images dataiter = iter(training_loader) @@ -266,10 +259,10 @@ def forward(self, x): # When you switch over to TensorBoard, you should see a GRAPHS tab. # Double-click the “NET” node to see the layers and data flow within your # model. 
-# +# # Visualizing Your Dataset with Embeddings # ---------------------------------------- -# +# # The 28-by-28 image tiles we’re using can be modeled as 784-dimensional # vectors (28 \* 28 = 784). It can be instructive to project this to a # lower-dimensional representation. The ``add_embedding()`` method will @@ -277,9 +270,9 @@ def forward(self, x): # and display them as an interactive 3D chart. The ``add_embedding()`` # method does this automatically by projecting to the three dimensions # with highest variance. -# +# # Below, we’ll take a sample of our data, and generate such an embedding: -# +# # Select a random subset of data and corresponding labels def select_n_random(data, labels, n=100): @@ -309,19 +302,19 @@ def select_n_random(data, labels, n=100): # zoom the model. Examine it at large and small scales, and see whether # you can spot patterns in the projected data and the clustering of # labels. -# +# # For better visibility, it’s recommended to: -# +# # - Select “label” from the “Color by” drop-down on the left. # - Toggle the Night Mode icon along the top to place the # light-colored images on a dark background. -# +# # Other Resources # --------------- -# +# # For more information, have a look at: -# +# # - PyTorch documentation on `torch.utils.tensorboard.SummaryWriter `__ -# - Tensorboard tutorial content in the `PyTorch.org Tutorials `__ +# - Tensorboard tutorial content in the `PyTorch.org Tutorials `__ # - For more information about TensorBoard, see the `TensorBoard # documentation `__ From 6a8af9346b7d2f0b217b127163f8f73cc47dbc3d Mon Sep 17 00:00:00 2001 From: Alanna Burke Date: Thu, 3 Apr 2025 20:07:02 -0400 Subject: [PATCH 5/9] Removing and marking conda mentions for update. --- intermediate_source/dist_tuto.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/intermediate_source/dist_tuto.rst b/intermediate_source/dist_tuto.rst index 1b622aa2770..fa1930b3d5c 100644 --- a/intermediate_source/dist_tuto.rst +++ b/intermediate_source/dist_tuto.rst @@ -523,6 +523,7 @@ for an available MPI implementation. The following steps install the MPI backend, by installing PyTorch `from source `__. +.. #TODO: replace 1. Create and activate your Anaconda environment, install all the pre-requisites following `the guide `__, but do From 7633425548745492932b7b1cd5280ce03e5ea287 Mon Sep 17 00:00:00 2001 From: Alanna Burke Date: Thu, 3 Apr 2025 20:07:07 -0400 Subject: [PATCH 6/9] Removing and marking conda mentions for update. --- prototype_source/gpu_quantization_torchao_tutorial.py | 2 +- prototype_source/inductor_windows_cpu.rst | 9 +++++---- prototype_source/prototype_index.rst | 2 +- 3 files changed, 7 insertions(+), 6 deletions(-) diff --git a/prototype_source/gpu_quantization_torchao_tutorial.py b/prototype_source/gpu_quantization_torchao_tutorial.py index f901f8abd31..f8f111bd93a 100644 --- a/prototype_source/gpu_quantization_torchao_tutorial.py +++ b/prototype_source/gpu_quantization_torchao_tutorial.py @@ -27,7 +27,7 @@ # # # .. 
code-block:: bash -# +# TODO: replace # > conda create -n myenv python=3.10 # > pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121 # > pip install git+https://github.com/facebookresearch/segment-anything.git diff --git a/prototype_source/inductor_windows_cpu.rst b/prototype_source/inductor_windows_cpu.rst index 96e1bf46909..46e4ac99330 100644 --- a/prototype_source/inductor_windows_cpu.rst +++ b/prototype_source/inductor_windows_cpu.rst @@ -27,7 +27,7 @@ Install the Required Software ----------------------------- First, let's install the required software. C++ compiler is required for TorchInductor optimization. -We will use Microsoft Visual C++ (MSVC) for this example. +We will use Microsoft Visual C++ (MSVC) for this example. 1. Download and install `MSVC `_. @@ -38,6 +38,7 @@ We will use Microsoft Visual C++ (MSVC) for this example. We recommend C++ compiler `Clang `_ and `Intel Compiler `_. Please check `Alternative Compiler for better performance <#alternative-compiler-for-better-performance>`_. +.. TODO: replace instructions 3. Download and install `Miniforge3-Windows-x86_64.exe `__. Set Up the Environment @@ -55,10 +56,10 @@ Set Up the Environment "C:/ProgramData/miniforge3/Scripts/activate.bat" #. Create and activate a custom conda environment: - + .. code-block:: sh - conda create -n inductor_cpu_windows python=3.10 -y + conda create -n inductor_cpu_windows python=3.10 -y conda activate inductor_cpu_windows #. Install `PyTorch 2.5 `_ or later. @@ -79,7 +80,7 @@ Here’s a simple example to demonstrate how to use TorchInductor: opt_foo1 = torch.compile(foo) print(opt_foo1(torch.randn(10, 10), torch.randn(10, 10))) -Here is the sample output that this code might return: +Here is the sample output that this code might return: .. code-block:: sh diff --git a/prototype_source/prototype_index.rst b/prototype_source/prototype_index.rst index 927f5f694b8..5498c5e2a3c 100644 --- a/prototype_source/prototype_index.rst +++ b/prototype_source/prototype_index.rst @@ -1,6 +1,6 @@ PyTorch Prototype Recipes --------------------------------------------- -Prototype features are not available as part of binary distributions like PyPI or Conda (except maybe behind run-time flags). To test these features we would, depending on the feature, recommend building from master or using the nightly wheels that are made available on `pytorch.org `_. +Prototype features are not available as part of binary distributions like PyPI (except maybe behind run-time flags). To test these features we would, depending on the feature, recommend building from master or using the nightly wheels that are made available on `pytorch.org `_. *Level of commitment*: We are committing to gathering high bandwidth feedback only on these features. Based on this feedback and potential further engagement between community members, we as a community will decide if we want to upgrade the level of commitment or to fail fast. From 2b9926717595074c1a144d69ec4d54aaee296fe5 Mon Sep 17 00:00:00 2001 From: Alanna Burke Date: Thu, 3 Apr 2025 20:07:14 -0400 Subject: [PATCH 7/9] Removing and marking conda mentions for update. 
--- .../intel_neural_compressor_for_pytorch.rst | 3 - recipes_source/recipes/Captum_Recipe.py | 53 +++++++------- .../recipes/tensorboard_with_pytorch.py | 72 +++++++++---------- recipes_source/xeon_run_cpu.rst | 31 ++------ 4 files changed, 65 insertions(+), 94 deletions(-) diff --git a/recipes_source/intel_neural_compressor_for_pytorch.rst b/recipes_source/intel_neural_compressor_for_pytorch.rst index 02ce3d7b378..3c108afd9f9 100755 --- a/recipes_source/intel_neural_compressor_for_pytorch.rst +++ b/recipes_source/intel_neural_compressor_for_pytorch.rst @@ -50,9 +50,6 @@ Installation # install nightly version from pip pip install -i https://test.pypi.org/simple/ neural-compressor - # install stable version from from conda - conda install neural-compressor -c conda-forge -c intel - *Supported python versions are 3.6 or 3.7 or 3.8 or 3.9* Usages diff --git a/recipes_source/recipes/Captum_Recipe.py b/recipes_source/recipes/Captum_Recipe.py index 11fdc24429c..74951621c0f 100644 --- a/recipes_source/recipes/Captum_Recipe.py +++ b/recipes_source/recipes/Captum_Recipe.py @@ -9,37 +9,36 @@ # Captum helps you understand how the data features impact your model # predictions or neuron activations, shedding light on how your model # operates. -# +# # Using Captum, you can apply a wide range of state-of-the-art feature # attribution algorithms such as \ ``Guided GradCam``\ and # \ ``Integrated Gradients``\ in a unified way. -# -# In this recipe you will learn how to use Captum to: # -# - Attribute the predictions of an image classifier to their corresponding image features. +# In this recipe you will learn how to use Captum to: +# +# - Attribute the predictions of an image classifier to their corresponding image features. # - Visualize the attribution results. -# +# ###################################################################### # Before you begin # ---------------- -# +# ###################################################################### # Make sure Captum is installed in your active Python environment. Captum -# is available both on GitHub, as a ``pip`` package, or as a ``conda`` -# package. For detailed instructions, consult the installation guide at -# https://captum.ai/ -# +# is available on GitHub, as a ``pip`` package. For detailed instructions, +# consult the installation guide at https://captum.ai/ +# ###################################################################### # For a model, we use a built-in image classifier in PyTorch. Captum can # reveal which parts of a sample image support certain predictions made by # the model. -# +# import torchvision from torchvision import models, transforms @@ -70,23 +69,23 @@ ###################################################################### # Computing Attribution # --------------------- -# +# ###################################################################### # Among the top-3 predictions of the models are classes 208 and 283 which # correspond to dog and cat. -# +# # Let us attribute each of these predictions to the corresponding part of # the input, using Captum’s \ ``Occlusion``\ algorithm. -# +# -from captum.attr import Occlusion +from captum.attr import Occlusion occlusion = Occlusion(model) strides = (3, 9, 9) # smaller = more fine-grained attribution but slower -target=208, # Labrador index in ImageNet +target=208, # Labrador index in ImageNet sliding_window_shapes=(3,45, 45) # choose size enough to change object appearance baselines = 0 # values to occlude the image with. 
0 corresponds to gray @@ -97,7 +96,7 @@ baselines=baselines) -target=283, # Persian cat index in ImageNet +target=283, # Persian cat index in ImageNet attribution_cat = occlusion.attribute(input_img, strides = strides, target=target, @@ -113,22 +112,22 @@ # ``Attribution`` which expects your model as a callable ``forward_func`` # upon initialization and has an ``attribute(...)`` method which returns # the attribution result in a unified format. -# +# # Let us visualize the computed attribution results in case of images. -# +# ###################################################################### # Visualizing the Results # ----------------------- -# +# ###################################################################### # Captum’s \ ``visualization``\ utility provides out-of-the-box methods # to visualize attribution results both for pictorial and for textual # inputs. -# +# import numpy as np from captum.attr import visualization as viz @@ -154,7 +153,7 @@ _ = viz.visualize_image_attr_multiple(attribution_cat, np.array(center_crop(img)), - ["heat_map", "original_image"], + ["heat_map", "original_image"], ["all", "all"], # positive/negative attribution or all ["attribution for cat", "image"], show_colorbar = True @@ -165,13 +164,13 @@ # If your data is textual, ``visualization.visualize_text()`` offers a # dedicated view to explore attribution on top of the input text. Find out # more at http://captum.ai/tutorials/IMDB_TorchText_Interpret -# +# ###################################################################### # Final Notes # ----------- -# +# ###################################################################### @@ -181,10 +180,10 @@ # specific output to a hidden-layer neuron (see Captum API reference). \* # Attribute a hidden-layer neuron response to the model input (see Captum # API reference). -# +# # For complete API of the supported methods and a list of tutorials, # consult our website http://captum.ai -# +# # Another useful post by Gilbert Tanner: # https://gilberttanner.com/blog/interpreting-pytorch-models-with-captum -# +# diff --git a/recipes_source/recipes/tensorboard_with_pytorch.py b/recipes_source/recipes/tensorboard_with_pytorch.py index 4bceda81eaf..7eb8722e603 100644 --- a/recipes_source/recipes/tensorboard_with_pytorch.py +++ b/recipes_source/recipes/tensorboard_with_pytorch.py @@ -1,24 +1,16 @@ """ How to use TensorBoard with PyTorch =================================== -TensorBoard is a visualization toolkit for machine learning experimentation. -TensorBoard allows tracking and visualizing metrics such as loss and accuracy, -visualizing the model graph, viewing histograms, displaying images and much more. -In this tutorial we are going to cover TensorBoard installation, +TensorBoard is a visualization toolkit for machine learning experimentation. +TensorBoard allows tracking and visualizing metrics such as loss and accuracy, +visualizing the model graph, viewing histograms, displaying images and much more. +In this tutorial we are going to cover TensorBoard installation, basic usage with PyTorch, and how to visualize data you logged in TensorBoard UI. Installation ---------------------- -PyTorch should be installed to log models and metrics into TensorBoard log -directory. The following command will install PyTorch 1.4+ via -Anaconda (recommended): - -.. code-block:: sh - - $ conda install pytorch torchvision -c pytorch - - -or pip +PyTorch should be installed to log models and metrics into TensorBoard log +directory. 
The following command will install PyTorch 1.4+ via pip: .. code-block:: sh @@ -29,10 +21,10 @@ ###################################################################### # Using TensorBoard in PyTorch # ----------------------------- -# -# Let’s now try using TensorBoard with PyTorch! Before logging anything, +# +# Let’s now try using TensorBoard with PyTorch! Before logging anything, # we need to create a ``SummaryWriter`` instance. -# +# import torch from torch.utils.tensorboard import SummaryWriter @@ -40,20 +32,20 @@ ###################################################################### # Writer will output to ``./runs/`` directory by default. -# +# ###################################################################### # Log scalars # ----------- -# -# In machine learning, it’s important to understand key metrics such as -# loss and how they change during training. Scalar helps to save -# the loss value of each training step, or the accuracy after each epoch. -# -# To log a scalar value, use -# ``add_scalar(tag, scalar_value, global_step=None, walltime=None)``. -# For example, lets create a simple linear regression training, and +# +# In machine learning, it’s important to understand key metrics such as +# loss and how they change during training. Scalar helps to save +# the loss value of each training step, or the accuracy after each epoch. +# +# To log a scalar value, use +# ``add_scalar(tag, scalar_value, global_step=None, walltime=None)``. +# For example, lets create a simple linear regression training, and # log loss value using ``add_scalar`` # @@ -72,18 +64,18 @@ def train_model(iter): optimizer.zero_grad() loss.backward() optimizer.step() - + train_model(10) writer.flush() -###################################################################### -# Call ``flush()`` method to make sure that all pending events +###################################################################### +# Call ``flush()`` method to make sure that all pending events # have been written to disk. -# -# See `torch.utils.tensorboard tutorials `_ +# +# See `torch.utils.tensorboard tutorials `_ # to find more TensorBoard visualization types you can log. -# +# # If you do not need the summary writer anymore, call ``close()`` method. # @@ -92,7 +84,7 @@ def train_model(iter): ###################################################################### # Run TensorBoard # ---------------- -# +# # Install TensorBoard through the command line to visualize data you logged # # .. code-block:: sh @@ -100,9 +92,9 @@ def train_model(iter): # pip install tensorboard # # -# Now, start TensorBoard, specifying the root log directory you used above. -# Argument ``logdir`` points to directory where TensorBoard will look to find -# event files that it can display. TensorBoard will recursively walk +# Now, start TensorBoard, specifying the root log directory you used above. +# Argument ``logdir`` points to directory where TensorBoard will look to find +# event files that it can display. TensorBoard will recursively walk # the directory structure rooted at ``logdir``, looking for ``.*tfevents.*`` files. # # .. code-block:: sh @@ -114,9 +106,9 @@ def train_model(iter): # .. image:: ../../_static/img/thumbnails/tensorboard_scalars.png # :scale: 40 % # -# This dashboard shows how the loss and accuracy change with every epoch. -# You can use it to also track training speed, learning rate, and other -# scalar values. 
It’s helpful to compare these metrics across different +# This dashboard shows how the loss and accuracy change with every epoch. +# You can use it to also track training speed, learning rate, and other +# scalar values. It’s helpful to compare these metrics across different # training runs to improve your model. # @@ -124,7 +116,7 @@ def train_model(iter): ######################################################################## # Learn More # ---------------------------- -# +# # - `torch.utils.tensorboard `_ docs # - `Visualizing models, data, and training with TensorBoard `_ tutorial # diff --git a/recipes_source/xeon_run_cpu.rst b/recipes_source/xeon_run_cpu.rst index 6426bc57819..4ecfc65ed68 100644 --- a/recipes_source/xeon_run_cpu.rst +++ b/recipes_source/xeon_run_cpu.rst @@ -99,11 +99,6 @@ The Intel® OpenMP Runtime Library can be installed using one of these commands: $ pip install intel-openmp -or - -.. code-block:: console - - $ conda install mkl Choosing an Optimized Memory Allocator ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -122,12 +117,6 @@ On CentOS, you can install it by running: $ yum install gperftools -In a conda environment, it can also be installed by running: - -.. code-block:: console - - $ conda install conda-forge::gperftools - On Ubuntu ``JeMalloc`` can be installed by this command: .. code-block:: console @@ -140,12 +129,6 @@ On CentOS it can be installed by running: $ yum install jemalloc -In a conda environment, it can also be installed by running: - -.. code-block:: console - - $ conda install conda-forge::jemalloc - Quick Start Example Commands ---------------------------- @@ -214,12 +197,12 @@ The generic option settings (knobs) include the following: - default value - help * - ``-h``, ``--help`` - - - - + - + - - To show the help message and exit. * - ``-m``, ``--module`` - - - - + - + - - To change each process to interpret the launch script as a python module, executing with the same behavior as "python -m". * - ``--no-python`` - bool @@ -323,7 +306,7 @@ Knobs for controlling instance number and compute resource allocation are: - bool - False - To disable the usage of ``taskset`` command. - + .. note:: Environment variables that will be set by this script include the following: @@ -344,13 +327,13 @@ Knobs for controlling instance number and compute resource allocation are: - Value of ``ncores_per_instance`` * - MALLOC_CONF - If libjemalloc.so is preloaded, MALLOC_CONF will be set to ``"oversize_threshold:1,background_thread:true,metadata_thp:auto"``. - + Please note that the script respects environment variables set preliminarily. For example, if you have set the environment variables mentioned above before running the script, the values of the variables will not be overwritten by the script. Conclusion ---------- -In this tutorial, we explored a variety of advanced configurations and tools designed to optimize PyTorch inference performance on Intel® Xeon® Scalable Processors. +In this tutorial, we explored a variety of advanced configurations and tools designed to optimize PyTorch inference performance on Intel® Xeon® Scalable Processors. By leveraging the ``torch.backends.xeon.run_cpu`` script, we demonstrated how to fine-tune thread and memory management to achieve peak performance. We covered essential concepts such as NUMA access control, optimized memory allocators like ``TCMalloc`` and ``JeMalloc``, and the use of Intel® OpenMP for efficient multithreading. 
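As a quick illustration of the workflow summarized in that conclusion, a minimal invocation of the launcher might look like the sketch below. The script name ``inference.py`` is a placeholder for your own workload, and only the basic form is shown — run the launcher with ``--help`` to see the exact instance-count, core-binding, and allocator knobs available in your PyTorch build.

.. code-block:: console

   $ # list the knobs supported by this PyTorch build
   $ python -m torch.backends.xeon.run_cpu --help
   $ # launch a (hypothetical) inference script with the launcher's defaults;
   $ # thread affinity and memory-allocator settings are applied automatically
   $ python -m torch.backends.xeon.run_cpu inference.py

From there, the knobs described earlier in the recipe (instance counts, cores per instance, TCMalloc/JeMalloc selection) can be layered on as needed.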
From 3ee545d780ec1735392f6feb55eaf3815d24d9e3 Mon Sep 17 00:00:00 2001 From: Alanna Burke Date: Mon, 7 Apr 2025 14:56:23 -0400 Subject: [PATCH 8/9] Fixing Conda referernces. --- advanced_source/sharding.rst | 24 +++++++++---------- beginner_source/hta_intro_tutorial.rst | 1 - intermediate_source/dist_tuto.rst | 2 +- .../gpu_quantization_torchao_tutorial.py | 1 - 4 files changed, 13 insertions(+), 15 deletions(-) diff --git a/advanced_source/sharding.rst b/advanced_source/sharding.rst index 8c811034671..f09b474143e 100644 --- a/advanced_source/sharding.rst +++ b/advanced_source/sharding.rst @@ -9,14 +9,14 @@ tables by explicitly configuring them. Installation ------------ -Requirements: - python >= 3.7 +Requirements: - Python >= 3.7 We highly recommend CUDA when using torchRec. If using CUDA: - cuda >= 11.0 .. code:: python # TODO: replace these - # install conda to make installying pytorch with cudatoolkit 11.3 easier. + # install Conda to make installing PyTorch with cudatoolkit 11.3 easier. !sudo rm Miniconda3-py37_4.9.2-Linux-x86_64.sh Miniconda3-py37_4.9.2-Linux-x86_64.sh.* !sudo wget https://repo.anaconda.com/miniconda/Miniconda3-py37_4.9.2-Linux-x86_64.sh !sudo chmod +x Miniconda3-py37_4.9.2-Linux-x86_64.sh @@ -24,29 +24,29 @@ We highly recommend CUDA when using torchRec. If using CUDA: - cuda >= .. code:: python - # install pytorch with cudatoolkit 11.3 + # install PyTorch with cudatoolkit 11.3 !sudo conda install pytorch cudatoolkit=11.3 -c pytorch-nightly -y -Installing torchRec will also install +Installing TorchRec will also install `FBGEMM `__, a collection of CUDA -kernels and GPU enabled operations to run +kernels and GPU enabled operations to run. .. code:: python # install torchrec !pip3 install torchrec-nightly -Install multiprocess which works with ipython to for multi-processing -programming within colab +Install `multiprocess`` which works with `iPython` for multi-processing +programming within `Colab``: .. code:: python !pip3 install multiprocess The following steps are needed for the Colab runtime to detect the added -shared libraries. The runtime searches for shared libraries in /usr/lib, -so we copy over the libraries which were installed in /usr/local/lib/. -**This is a very necessary step, only in the colab runtime**. +shared libraries. The runtime searches for shared libraries is in `/usr/lib`, +so we copy over the libraries which were installed in `/usr/local/lib/`. +**This is a very necessary step, only in the Colab runtime**. .. code:: python @@ -54,7 +54,7 @@ so we copy over the libraries which were installed in /usr/local/lib/. **Restart your runtime at this point for the newly installed packages to be seen.** Run the step below immediately after restarting so that -python knows where to look for packages. **Always run this step after +Python knows where to look for packages. **Always run this step after restarting the runtime.** .. code:: python @@ -71,7 +71,7 @@ Due to the notebook enviroment, we cannot run can do multiprocessing inside the notebook to mimic the setup. Users should be responsible for setting up their own `SPMD `_ launcher when using -Torchrec. We setup our environment so that torch distributed based +TorchRec. We setup our environment so that torch distributed based communication backend can work. .. 
code:: python diff --git a/beginner_source/hta_intro_tutorial.rst b/beginner_source/hta_intro_tutorial.rst index 0443202d627..96f63deafed 100644 --- a/beginner_source/hta_intro_tutorial.rst +++ b/beginner_source/hta_intro_tutorial.rst @@ -9,7 +9,6 @@ below. Installing HTA ~~~~~~~~~~~~~~ -.. # TODO: replace We recommend using a Conda environment to install HTA. To install Anaconda, see `the official Anaconda documentation `_. diff --git a/intermediate_source/dist_tuto.rst b/intermediate_source/dist_tuto.rst index fa1930b3d5c..f73dccb6b26 100644 --- a/intermediate_source/dist_tuto.rst +++ b/intermediate_source/dist_tuto.rst @@ -523,7 +523,7 @@ for an available MPI implementation. The following steps install the MPI backend, by installing PyTorch `from source `__. -.. #TODO: replace +.. #TODO: replace? 1. Create and activate your Anaconda environment, install all the pre-requisites following `the guide `__, but do diff --git a/prototype_source/gpu_quantization_torchao_tutorial.py b/prototype_source/gpu_quantization_torchao_tutorial.py index f8f111bd93a..75264f39904 100644 --- a/prototype_source/gpu_quantization_torchao_tutorial.py +++ b/prototype_source/gpu_quantization_torchao_tutorial.py @@ -27,7 +27,6 @@ # # # .. code-block:: bash -# TODO: replace # > conda create -n myenv python=3.10 # > pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121 # > pip install git+https://github.com/facebookresearch/segment-anything.git From 0527eb786e490216a86a536eb972e87fd9d412fd Mon Sep 17 00:00:00 2001 From: Alanna Burke Date: Tue, 8 Apr 2025 19:00:26 -0400 Subject: [PATCH 9/9] Fixing conda references. --- .../introyt/tensorboardyt_tutorial.py | 7 +++++++ .../intel_neural_compressor_for_pytorch.rst | 3 +++ recipes_source/xeon_run_cpu.rst | 17 +++++++++++++++++ 3 files changed, 27 insertions(+) diff --git a/beginner_source/introyt/tensorboardyt_tutorial.py b/beginner_source/introyt/tensorboardyt_tutorial.py index 4b7aec8e2f3..1bababa237d 100644 --- a/beginner_source/introyt/tensorboardyt_tutorial.py +++ b/beginner_source/introyt/tensorboardyt_tutorial.py @@ -24,6 +24,13 @@ To run this tutorial, you’ll need to install PyTorch, TorchVision, Matplotlib, and TensorBoard. +With ``conda``: + +.. code-block:: sh + + conda install pytorch torchvision -c pytorch + conda install matplotlib tensorboard + With ``pip``: .. code-block:: sh diff --git a/recipes_source/intel_neural_compressor_for_pytorch.rst b/recipes_source/intel_neural_compressor_for_pytorch.rst index 3c108afd9f9..02ce3d7b378 100755 --- a/recipes_source/intel_neural_compressor_for_pytorch.rst +++ b/recipes_source/intel_neural_compressor_for_pytorch.rst @@ -50,6 +50,9 @@ Installation # install nightly version from pip pip install -i https://test.pypi.org/simple/ neural-compressor + # install stable version from from conda + conda install neural-compressor -c conda-forge -c intel + *Supported python versions are 3.6 or 3.7 or 3.8 or 3.9* Usages diff --git a/recipes_source/xeon_run_cpu.rst b/recipes_source/xeon_run_cpu.rst index 4ecfc65ed68..af6f91bd83d 100644 --- a/recipes_source/xeon_run_cpu.rst +++ b/recipes_source/xeon_run_cpu.rst @@ -99,6 +99,11 @@ The Intel® OpenMP Runtime Library can be installed using one of these commands: $ pip install intel-openmp + or + +.. 
code-block:: console + $ conda install mkl + Choosing an Optimized Memory Allocator ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -117,6 +122,12 @@ On CentOS, you can install it by running: $ yum install gperftools +In a conda environment, it can also be installed by running: + +.. code-block:: console + + $ conda install conda-forge::gperftools + On Ubuntu ``JeMalloc`` can be installed by this command: .. code-block:: console @@ -129,6 +140,12 @@ On CentOS it can be installed by running: $ yum install jemalloc +In a conda environment, it can also be installed by running: + +.. code-block:: console + + $ conda install conda-forge::jemalloc + Quick Start Example Commands ----------------------------