
Commit 24cd587

Improved READMEs
1 parent 9517dd7 commit 24cd587

9 files changed: +34 -43 lines changed


BootstrapNAS/README.md

+6 -11

@@ -16,7 +16,7 @@ If you already have a super-network trained with BootstrapNAS, please follow the

 More information about BootstrapNAS is available in our papers:

-[Automated Super-Network Generation for Scalable Neural Architecture Search](https://openreview.net/attachment?id=HK-zmbTB8gq&name=main_paper_and_supplementary_material).
+[Automated Super-Network Generation for Scalable Neural Architecture Search](https://openreview.net/pdf?id=HK-zmbTB8gq).

 ```bibtex
 @inproceedings{
@@ -25,7 +25,7 @@ More information about BootstrapNAS is available in our papers:
 author={Muñoz, J. Pablo and Lyalyushkin, Nikolay and Lacewell, Chaunte and Senina, Anastasia and Cummings, Daniel and Sarah, Anthony and Kozlov, Alexander and Jain, Nilesh},
 booktitle={First Conference on Automated Machine Learning (Main Track)},
 year={2022},
-url={https://openreview.net/forum?id=HK-zmbTB8gq}
+url={https://openreview.net/pdf?id=HK-zmbTB8gq}
 }
 ```
 [Enabling NAS with Automated Super-Network Generation](https://arxiv.org/abs/2112.10878)
@@ -35,15 +35,10 @@ More information about BootstrapNAS is available in our papers:
 bootstrapNAS,
 author = {Muñoz, J. Pablo and Lyalyushkin, Nikolay and Akhauri, Yash and Senina, Anastasia and Kozlov, Alexander and Jain, Nilesh},
 title = {Enabling NAS with Automated Super-Network Generation},
-journal = {CoRR},
-volume = {abs/2112.10878},
-year = {2021},
-url = {https://arxiv.org/abs/2112.10878},
-eprinttype = {arXiv},
-eprint = {2112.10878},
-timestamp = {Tue, 04 Jan 2022 15:59:27 +0100},
-biburl = {https://dblp.org/rec/journals/corr/abs-2112-10878.bib},
-bibsource = {dblp computer science bibliography, https://dblp.org}
+journal = {1st International Workshop on Practical
+Deep Learning in the Wild at AAAI},
+year = {2022},
+url = {https://practical-dl.github.io/2022/short_paper/21.pdf},
 }
 ```

BootstrapNAS/instructions/Configuration.md

+7 -7

@@ -1,6 +1,6 @@
 ### Configuration file

-The parameters for generating, training and searching on the super-network are defined in a configuration file within two exclusive subsets of parameters for training and search:
+The parameters for generating, training, and searching on the super-network are defined in a configuration file within two exclusive subsets of parameters for training and search:
 ```json
 "bootstrapNAS": {
 "training": {
@@ -12,7 +12,7 @@ The parameters for generating, training and searching on the super-network are d
 }
 ```

-In the `training` section, you specify the training algorithm, e.g., `progressive_shrinking`, schedule and elasticity parameters:
+In the `training` section, you specify the training algorithm, e.g., `progressive_shrinking`, schedule, and elasticity parameters:

 ```json
 "training": {
@@ -41,7 +41,7 @@ In the `training` section, you specify the training algorithm, e.g., `progressiv
 }

 ```
-In the search section, you specify the search algorithm, e.g., `NSGA-II` and its parameters. For example:
+In the search section, you specify the search algorithm, e.g., `NSGA-II`, and its parameters. For example:
 ```json
 "search": {
 "algorithm": "NSGA2",
@@ -51,7 +51,7 @@ In the search section, you specify the search algorithm, e.g., `NSGA-II` and its
 }
 ```

-By default, BootstrapNAS uses `NSGA-II` (Dev et al., 2002), an genetic algorithm that constructs a pareto front of efficient sub-networks.
+By default, BootstrapNAS uses `NSGA-II` (Dev et al., 2002), a genetic algorithm that constructs a Pareto front of efficient sub-networks.

 List of parameters that can be used in the configuration file:

@@ -65,20 +65,20 @@ List of parameters that can be used in the configuration file:

 `schedule`: The schedule section includes a list of stage descriptors (`list_stage_descriptions`) that specify the elasticity dimensions enabled for a particular stage (`train_dims`), the number of `epochs` for the stage, the `depth_indicator` which in the case of elastic depth, restricts the maximum number of blocks in each independent group that can be skipped, the `width_indicator`, which restricts the maximum number of width values in each elastic layer. The user can also specify whether weights should be reorganized (`reorg_weights`), whether batch norm adaptation should be triggered at the beginning of the stage (`bn_adapt`), the initial learning rate for the stage (`init_lr`), and the epochs to use for adjusting the learning rate (`epochs_lr`).

-`elasticity`: Currently, BootstrapNAS supports three elastic dimensions (`kernel`, `width` and `depth`). The `mode` for elastic depth can be set as `auto` or `manual`. If manual is selected, the user can specify, a list of possible `skipped_blocks` that, as the name suggest, might be skipped. In `auto` mode, the user can specify the `min_block_size`, i.e., minimal number of operations in the skipping block, and the `max_block_size`, i.e., maximal number of operations in the block. The user can also `allow_nested_blocks` or `allow_linear_combination` of blocks. In the case of elastic width, the user can specify the `min_width`, i.e., the minimal number of output channels that can be activated for each layers with elastic width. Default value is 32, the `max_num_widths`, which restricts total number of different elastic width values for each layer, a `width_step`, which defines a step size for a generation of the elastic width search space, or a `width_multiplier` to define the elastic width search space via a list of multipliers. Finally, the user can determine the type of filter importance metric: L1, L2 or geometric mean. L2 is selected by default. For elastic kernel, the user can specify the `max_num_kernels`, which restricts the total number of different elastic kernel values for each layer.
+`elasticity`: Currently, BootstrapNAS supports three elastic dimensions (`kernel`, `width`, and `depth`). The `mode` for elastic depth can be set as `auto` or `manual`. If _manual_ is selected, the user can specify a list of possible `skipped_blocks` that might be skipped, as the name suggests. In `auto` mode, the user can specify the `min_block_size`, i.e., the minimal number of operations in the skipping block, and the `max_block_size`, i.e., the maximal number of operations in the block. The user can also `allow_nested_blocks` or `allow_linear_combination` of blocks. In the case of elastic width, the user can specify the `min_width`, i.e., the minimal number of output channels that can be activated for each layer with elastic width. The default value is 32, the `max_num_widths`, which restricts the total number of different elastic width values for each layer, a `width_step`, which defines a step size for a generation of the elastic width search space, or a `width_multiplier` to define the elastic width search space via a list of multipliers. Finally, the user can determine the type of filter importance metric: L1, L2 or geometric mean. L2 is selected by default. The user can specify the `max_num_kernels` for the elastic kernel, which restricts the total number of different elastic kernel values for each layer.

 `train_steps`: Defines the number of samples used for each training epoch.

 **Search:**

 `algorithm`: Defines the search algorithm. The default algorithm is NSGA-II.

-`num_evals`: Defines the number of evaluations that will be used by the search algorithm.
+`num_evals`: Defines the number of evaluations the search algorithm will use.

 `population`: Defines the population size when using an evolutionary search algorithm.

 `acc_delta`: Defines the absolute difference in accuracy that is tolerated when looking for a subnetwork.

 `ref_acc`: Defines the reference accuracy from the pre-trained model used to generate the super-network.

-*A full list of the possible configuration parameters can be found [here](https://github.com/openvinotoolkit/nncf/blob/develop/nncf/config/experimental_schema.py).
+*A complete list of the possible configuration parameters can be found [here](https://github.com/openvinotoolkit/nncf/blob/develop/nncf/config/experimental_schema.py).

BootstrapNAS/instructions/Home.md

-2

@@ -6,5 +6,3 @@

 ### [Configuration](https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/BootstrapNAS/instructions/Configuration.md)

-More detailed guides coming soon.
-

BootstrapNAS/instructions/Quickstart.md

+3 -3

@@ -51,7 +51,7 @@ python bootstrap_nas.py -m train \


 ### Expected Output Files after executing BootstrapNAS
-The output of running ```bootstrap_nas.py``` will be a sub-network configuration that has an accuracy similar to the input model (by default a $\pm$1% absolute difference in accuracy is allowed), but with improvements in MACs. Format: ([MACs_subnet, ACC_subnet]).
+The output of running ```bootstrap_nas.py``` will be a sub-network configuration with an accuracy similar to the input model (by default a $\pm$1% absolute difference in accuracy is allowed), but with improvements in MACs. Format: ([MACs_subnet, ACC_subnet]).

 Several files are saved to your `log_dir` after the training has ended:

@@ -62,9 +62,9 @@ Several files are saved to your `log_dir` after the training has ended:
 - `last_elasticity.pth`- Super-network's elasticity information. This file can be used when loading super-networks for searching or inspection.
 - `last_model_weights.pth`- Super-network's weights after training.
 - `snapshot.tar.gz` - Copy of the code used for this run.
-- `subnetwork_best.pth` - Dictionary with the configuration of the best sub-network. Best defined as a sub-network that performs in the Pareto front, and that deviates a maximum `acc_delta` from original model.
+- `subnetwork_best.pth` - Dictionary with the configuration of the best sub-network. Best is defined as a sub-network that performs in the Pareto front and deviates a maximum `acc_delta` from the original model.
 - `supernet_{best, last}.pth` - Super-network weights at its best and last state.

 If the user wants to have a CSV output file of the search progression, ```search_algo.search_progression_to_csv()``` can be called after running the search step.

-For a visualization of the search progression please use ```search_algo.visualize_search_progression()``` after the search has concluded. A PNG file will be generated.
+For a visualization of the search progression, please use ```search_algo.visualize_search_progression()``` after the search has concluded. A PNG file will be generated.
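
Editor's note: as a small illustrative sketch, the artifacts listed above can be inspected from Python once training has ended. The `log_dir` path below is hypothetical; only the file names and the two `search_algo` helpers come from the documentation.

```python
import torch

# Sketch: inspecting BootstrapNAS training artifacts described above.
# The log_dir path is a placeholder; only the file names are taken from the docs.
log_dir = "results/bootstrapnas_run"

# subnetwork_best.pth holds a dictionary with the best sub-network configuration.
best_config = torch.load(f"{log_dir}/subnetwork_best.pth", map_location="cpu")
print(best_config)

# After the search step, the helpers mentioned above export the search progression:
# search_algo.search_progression_to_csv()       # writes a CSV file
# search_algo.visualize_search_progression()    # writes a PNG file
```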

BootstrapNAS/instructions/Subnetwork_Search.md

+3 -3

@@ -2,15 +2,15 @@

 If you have a trained super-network, you can start the search stage directly using the ```bootstrap_nas_search.py``` script located [here](https://github.com/openvinotoolkit/nncf/blob/develop/examples/experimental/torch/classification/bootstrap_nas_search.py).

-You will need to pass the path where the weights and elasticity information has been stored, which by default is your log directory.
+You must pass the path where the weights and elasticity information have been stored, which is your log directory by default.

 ```shell
 python bootstrap_nas_search.py -m
 train
 --config <Config path to your config.json used when training the super-network>
 --log-dir <Path to your log dir for the search stage>
 --dataset
-<cifar10, imagenet, or other depending on your model>
+<cifar10, imagenet, or other, depending on your model>
 --data <Path to your dataset>
 --elasticity-state-path
 <Path to your last_elasticity.pth file generated when training of the super-network>
@@ -20,4 +20,4 @@ train

 #### Hardware-aware search

-BootstrapNAS can be made hardware-aware when searching for efficient sub-networks. To accomplish this, you can pass your own `eficiency evaluator` for the target hardware to the search component.
+BootstrapNAS can be made hardware-aware when searching for efficient sub-networks. To accomplish this, you can pass your own `efficiency evaluator` for the target hardware to the search component.
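
Editor's note: to make the idea of an efficiency evaluator concrete, here is a hypothetical sketch of one that scores candidate sub-networks by measured latency. The class name and the callable interface are assumptions for illustration only; the exact signature expected by the NNCF search component is defined in NNCF itself.

```python
import time

import torch


class LatencyEvaluator:
    """Hypothetical efficiency evaluator: scores a candidate sub-network by its
    average inference latency (in milliseconds) on the target device."""

    def __init__(self, example_input, device="cpu", warmup=5, iters=20):
        self.example_input = example_input.to(device)
        self.device = device
        self.warmup = warmup
        self.iters = iters

    @torch.no_grad()
    def __call__(self, model):
        model = model.to(self.device).eval()
        for _ in range(self.warmup):          # warm-up runs are not timed
            model(self.example_input)
        start = time.perf_counter()
        for _ in range(self.iters):
            model(self.example_input)
        return (time.perf_counter() - start) * 1000.0 / self.iters


# Usage sketch: evaluator = LatencyEvaluator(torch.randn(1, 3, 224, 224))
# and pass the instance to the search component as your efficiency evaluator.
```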

EFTNAS/README.md

+4 -4

@@ -74,15 +74,15 @@ CUDA_VISIBLE_DEVICES=${DEVICES} python examples/pytorch/text-classification/run_
 - `--nncf_config`: the NNCF configuration including the config of movement sparsity
 (refer to [MovementSparsity.md](https://github.com/openvinotoolkit/nncf/blob/develop/nncf/experimental/torch/sparsity/movement/MovementSparsity.md)
 for more details).
-- `--output_dir`: the directory to save importance weights.
+- `--output_dir`: the directory used to save the weights' importance information.

 After movement sparsity training, the importance for the pretrained weights of `--model_name_or_path` can
-be obtained in the `--output_dir` directory, which will be utilized for search space generation and weight-reorder during
+be obtained in the `--output_dir` directory, which will be utilized for search space generation and weight reorder during
 NAS training.

 #### Generate search space

-Based on the trained weight importance, EFTNAS has a well-designed algorithm to automatically generate
+Based on the trained weight importance, EFTNAS has a well-designed algorithm to generate automatically
 the search space for the super-network.
 Below is an example command for search space generation using the weight importance:
 ```bash
@@ -101,7 +101,7 @@ The generated search space will be saved in `--target_config`.
 ### Step 2. Training

 Once the weight importance scores and the NNCF configuration with the automatically generated search space are obtained,
-EFTNAS conducts NAS training utilizing based on the information from Step 1.
+EFTNAS conducts NAS training based on the information from Step 1.

 Due to the feature of BootstrapNAS, we need to manually set the epoch and learning rate for NAS training in the
 NNCF configuration (please refer to [BootstrapNAS.md](https://github.com/openvinotoolkit/nncf/blob/develop/nncf/experimental/torch/nas/bootstrapNAS/BootstrapNAS.md)
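
Editor's note: to illustrate the general idea behind the importance-based weight reorder mentioned above, here is a simplified sketch. It is not EFTNAS's actual algorithm; EFTNAS derives its importance scores from movement sparsity training, while the toy metric and shapes below are purely illustrative.

```python
import torch


def reorder_by_importance(weight: torch.Tensor, importance: torch.Tensor):
    """Sort the rows (output units) of a weight matrix by descending importance.

    Simplified illustration of importance-based weight reorder: the most
    important output units are moved to the front of the tensor.
    """
    order = torch.argsort(importance, descending=True)
    return weight[order], order


w = torch.randn(8, 16)              # toy layer with 8 output units
importance = w.abs().sum(dim=1)     # toy per-unit importance (L1 norm)
w_sorted, permutation = reorder_by_importance(w, importance)
```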

LoNAS/README.md

+3 -4

@@ -55,7 +55,7 @@ CUDA_VISIBLE_DEVICES=${DEVICES} python run_commonsense.py \

 The `nncf_config` indicates the NNCF configuration encompassing the search space for elastic adapters and modules of the base model (e.g., `q_proj`).
 The implementation of the elastic modules leverages the BootstrapNAS feature of [OpenVINO™ NNCF](https://github.com/openvinotoolkit/nncf).
-And we employ the stage LR scheduler within NNCF, so the learning rate schedule is specified within the NNCF configuration file,
+We employ the stage LR scheduler within NNCF, so the learning rate schedule is specified within the NNCF configuration file,
 rather than within the arguments of `TrainingArguments`. For instance,
 ```json
 "schedule": {
@@ -128,9 +128,8 @@ CUDA_VISIBLE_DEVICES=${DEVICES} python run_commonsense.py \
 ```

 The argument `--val_set_size 1000` signifies the utilization of 1k validation samples to evaluate each discovered
-subnetwork. After running this command, results of the 200 identified subnetworks (`"num_evals": 200` set in `search` field of NNCF config)
-can be obtained in the `--output_dir` folder, including `search_progression.png` and `search_progression.csv`.
-From these results, we can select the subnetwork configurations that best meets different requirements.
+subnetwork. After running this command, the results of the 200 identified subnetworks (`"num_evals": 200` set in the `search` field of NNCF config) will be placed in the `--output_dir` folder, including `search_progression.png` and `search_progression.csv`.
+From these results, we can select the subnetwork configurations that best meet different requirements.


 ## Released Models
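
Editor's note: as a small post-processing sketch, the `search_progression.csv` mentioned above could be filtered to pick configurations that meet a requirement. The column names (`accuracy`, `macs`) and the threshold are assumptions, since the actual CSV schema is not shown here.

```python
import csv

# Hypothetical sketch: select a sub-network row from search_progression.csv.
# Column names ("accuracy", "macs") and the threshold are assumptions.
with open("output_dir/search_progression.csv") as f:
    rows = list(csv.DictReader(f))

good = [r for r in rows if float(r["accuracy"]) >= 0.82]
best = min(good, key=lambda r: float(r["macs"]))   # cheapest config that still qualifies
print(best)
```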

SQFT/README.md

+2 -2

@@ -117,7 +117,7 @@ quantized_sparse_base_model_path=sqft-llama-3-8b-50-base-gptq
 python utils/quantization.py --base_model_path ${sparse_base_model_path} --output_dir ${quantized_sparse_base_model_path}
 ```

-You can also skip the quantization step and adopt our released quantized models (find them in Sparse-and-Quantized Model of [this Table](#released-foundation-models-)).
+You can also skip the quantization step and adopt our released quantized models (find them in the *Sparse-and-Quantized Model* column of [this Table](#released-foundation-models-)).

 #### :hammer_and_wrench: SQFT

@@ -152,7 +152,7 @@ python run_standard_tuning.py \
 --search_space ${search_space} # low-rank search space
 ```

-After the completion of the super-adapter training, the command to extract the heuristic sub-adapter is as follows.
+After completing the super-adapter training, the command to extract the heuristic sub-adapter is as follows.
 Additionally, more powerful sub-adapters can be obtained through other advanced search algorithms.

 ```bash
