Commit be5fd49
new sample - INC Quantization with PyTorch (#1550)
* new sample - INC Quantization with PyTorch
* add additional instructions
* readme & json changes

Signed-off-by: devpramod-intel <[email protected]>
1 parent 73d7ca2 commit be5fd49

File tree

11 files changed: +1495 -0 lines changed

AI-and-Analytics/Getting-Started-Samples/INC-Quantization-Sample-for-PyTorch/.ipynb_checkpoints/quantize_with_inc-checkpoint.ipynb (+489 lines)

Large diffs are not rendered by default.
requirements.txt (+1 line):

neural_compressor==2.1
License.txt (+7 lines):

Copyright Intel Corporation

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
README.md (+140 lines):
# `Getting Started with Intel® Neural Compressor for Quantization` Sample

This sample is a getting-started tutorial for Intel® Neural Compressor (INC). It demonstrates how to perform INT8 quantization on a Hugging Face BERT model and shows how to achieve performance boosts using Intel hardware.
| Area | Description
|:--- |:---
| What you will learn | How to quantize a BERT model using Intel® Neural Compressor
| Time to complete | 20 minutes
| Category | Code Optimization
## Purpose

Intel® Neural Compressor offers many options for deep learning model compression, one of them being INT8 quantization. Quantization helps reduce the size of the model, which enables faster inference. The approach trades some accuracy for the smaller size; however, Intel® Neural Compressor provides automated, accuracy-driven tuning recipes that let you quantize your model while maintaining your accuracy goals.
The sample starts by loading a BERT model from Hugging Face. After loading the model, we set up an evaluation function using the PyTorch* Dataset and DataLoader classes. Using this evaluation function, Intel® Neural Compressor can perform both post-training static and post-training dynamic quantization to achieve the speedups.
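As a rough sketch of that flow with Intel® Neural Compressor 2.x (the toy model, data, `eval_func`, and the `tolerable_loss` value below are illustrative stand-ins, not the notebook's exact code):

```
import torch
from torch.utils.data import DataLoader, TensorDataset
from neural_compressor import quantization
from neural_compressor.config import AccuracyCriterion, PostTrainingQuantConfig

# Toy FP32 model and calibration data so the sketch runs end to end.
model = torch.nn.Sequential(torch.nn.Linear(64, 2))
data = TensorDataset(torch.randn(32, 64), torch.randint(0, 2, (32,)))
calib_dataloader = DataLoader(data, batch_size=8)

def eval_func(m):
    # Return a scalar accuracy; INC compares FP32 and INT8 scores against
    # the tolerable-loss criterion while tuning. On a real workload this
    # would evaluate your validation set.
    m.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in calib_dataloader:
            correct += (m(x).argmax(dim=1) == y).sum().item()
            total += y.numel()
    return correct / total

conf = PostTrainingQuantConfig(
    approach="static",  # "dynamic" selects post-training dynamic quantization
    accuracy_criterion=AccuracyCriterion(tolerable_loss=0.01))

q_model = quantization.fit(model=model, conf=conf,
                           calib_dataloader=calib_dataloader,
                           eval_func=eval_func)
q_model.save("./quantized_model")
```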
## Prerequisites

| Optimized for | Description
|:--- |:---
| OS | Ubuntu* 20.04 (or newer)
| Hardware | Intel® Xeon® Scalable processor family
| Software | Intel® AI Analytics Toolkit (AI Kit)
### For Local Development Environments

You will need to download and install the following toolkits, tools, and components to use the sample.

- **Intel® AI Analytics Toolkit (AI Kit)**

  You can get the AI Kit from [Intel® oneAPI Toolkits](https://www.intel.com/content/www/us/en/developer/tools/oneapi/toolkits.html#analytics-kit). <br> See [*Get Started with the Intel® AI Analytics Toolkit for Linux*](https://www.intel.com/content/www/us/en/develop/documentation/get-started-with-ai-linux) for AI Kit installation information and post-installation steps and scripts.
- **Jupyter Notebook**

  Install using PIP: `pip install notebook`. <br> Alternatively, see [*Installing Jupyter*](https://jupyter.org/install) for detailed installation instructions.
### For Intel® DevCloud

The necessary tools and components are already installed in the environment. You do not need to install additional components. See [Intel® DevCloud for oneAPI](https://devcloud.intel.com/oneapi/get_started/) for information.
### Additional Packages

You will need to install the additional packages listed in *requirements.txt*:

```
python -m pip install -r requirements.txt
```
## Key Implementation Details

The sample contains one Jupyter Notebook and one Python script.

### Jupyter Notebook

| Notebook | Description
|:--- |:---
| `quantize_with_inc.ipynb` | Getting-started tutorial for using Intel® Neural Compressor for PyTorch*

### Python Script

| Script | Description
|:--- |:---
| `dataset.py` | Provides a PyTorch* Dataset class that tokenizes text data
## Run the `Getting Started with Intel® Neural Compressor for Quantization` Sample

> **Note**: If you have not already done so, set up your CLI
> environment by sourcing the `setvars` script in the root of your oneAPI installation.
>
> Linux*:
> - For system-wide installations: `. /opt/intel/oneapi/setvars.sh`
> - For private installations: `. ~/intel/oneapi/setvars.sh`
> - For non-POSIX shells, like csh, use the following command: `bash -c 'source <install-dir>/setvars.sh ; exec csh'`
>
> For more information on configuring environment variables, see [Use the setvars Script with Linux* or macOS*](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-linux-or-macos.html).
### On Linux*

#### Activate Conda

1. Activate the Conda environment.

   ```
   conda activate pytorch
   ```

By default, the AI Kit is installed in the `/opt/intel/oneapi` folder and requires root privileges to manage it.

You can activate the Conda environment without root access. To bypass root access, clone and activate your desired Conda environment using commands similar to the ones shown below.
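A typical clone-and-activate sequence looks like this (the cloned environment name `user_pytorch` is only an example):

```
conda create --name user_pytorch --clone pytorch
conda activate user_pytorch
```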
#### Run the Notebook

1. Launch Jupyter Notebook.

   ```
   jupyter notebook --ip=0.0.0.0
   ```

2. Follow the instructions to open the URL with the token in your browser.
3. Locate and select the Notebook.

   ```
   quantize_with_inc.ipynb
   ```

4. Change the kernel to **pytorch**.
5. Run every cell in the Notebook in sequence.
#### Troubleshooting

If you receive an error message, troubleshoot the problem using the **Diagnostics Utility for Intel® oneAPI Toolkits**. The diagnostic utility provides configuration and system checks to help find missing dependencies, permissions errors, and other issues. See the [Diagnostics Utility for Intel® oneAPI Toolkits User Guide](https://www.intel.com/content/www/us/en/develop/documentation/diagnostic-utility-user-guide/top.html) for more information on using the utility.
### Run the Sample on Intel® DevCloud (Optional)

1. If you do not already have an account, request an Intel® DevCloud account at [*Create an Intel® DevCloud Account*](https://intelsoftwaresites.secure.force.com/DevCloud/oneapi).
2. On a Linux* system, open a terminal.
3. SSH into Intel® DevCloud.

   ```
   ssh DevCloud
   ```

   > **Note**: You can find information about configuring your Linux system and connecting to Intel DevCloud at Intel® DevCloud for oneAPI [Get Started](https://devcloud.intel.com/oneapi/get_started).

4. Follow the instructions to open the URL with the token in your browser.
5. Locate and select the Notebook.

   ```
   quantize_with_inc.ipynb
   ```

6. Change the kernel to **PyTorch (AI Kit)**.
7. Run every cell in the Notebook in sequence.
## Example Output

You should see an image showing the performance comparison and analysis between FP32 and INT8.

> **Note**: The image shown below is an example of a general performance comparison for inference speedup obtained by quantization. (Your results might be different.)

![Performance Numbers](images/inc_speedup.png)
## License

Code samples are licensed under the MIT license. See
[License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt) for details.

Third party program licenses can be found here: [third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt).
Notebook test-runner script (+26 lines):
import os

def runJupyterNotebook(input_notebook_filename, output_notebook_filename, conda_env, fdpath='./'):
    import nbformat
    from nbconvert.preprocessors import ExecutePreprocessor
    from nbconvert.preprocessors import CellExecutionError

    # Bail out if the input notebook does not exist.
    if not os.path.isfile(input_notebook_filename):
        print("No Jupyter notebook found :", input_notebook_filename)
        return -1
    try:
        with open(input_notebook_filename) as f:
            nb = nbformat.read(f, as_version=4)
        # Execute every cell with the requested kernel; allow_errors lets
        # execution continue so a full output notebook is always written.
        ep = ExecutePreprocessor(timeout=6000, kernel_name=conda_env, allow_errors=True)
        ep.preprocess(nb, {'metadata': {'path': fdpath}})
        with open(output_notebook_filename, 'w', encoding='utf-8') as f:
            nbformat.write(nb, f)
        return 0
    except CellExecutionError:
        print("Exception while executing :", input_notebook_filename)
        return -1


# Execute the sample notebook in the 'workshop' kernel and save the result.
runJupyterNotebook(os.path.join(os.path.dirname(os.path.realpath(__file__)),
                                'quantize_with_inc.ipynb'),
                   'quantize_with_inc.ipynb',
                   'workshop')
dataset.py (+41 lines):
from torch.utils.data import Dataset
from typing import List
from transformers import AutoTokenizer
import torch


class IMDBDataset(Dataset):
    """Dataset with strings to predict pos/neg sentiment

    Args:
        text (List[str]): list of strings
        label (List[str]): list of corresponding labels (pos/neg)
        tokenizer (AutoTokenizer): tokenizer used to encode each string
        max_length (int): fixed token length to pad/truncate each example to
        data_size (int): number of data rows to use
    """

    def __init__(
            self,
            text: List[str],
            label: List[str],
            tokenizer: AutoTokenizer,
            max_length: int = 64,
            data_size: int = 1000):

        if data_size > len(text):
            raise ValueError(f"Maximum rows in dataset {len(text)}")
        self.text = text[:data_size]
        # Slice labels too so they stay aligned with the truncated text.
        self.label = label[:data_size]
        self.tokenizer = tokenizer
        self.max_length = max_length

    def __len__(self):
        return len(self.text)

    def __getitem__(self, idx):
        # Tokenize one example to a fixed-length encoding.
        encoding = self.tokenizer(
            self.text[idx],
            max_length=self.max_length,
            padding='max_length',
            truncation=True)
        item = {key: torch.as_tensor(val) for key, val in encoding.items()}

        return (item, self.label[idx])
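
For reference, a hedged usage sketch of this class (the texts, labels, and `bert-base-uncased` checkpoint are illustrative, not necessarily what the notebook uses):

```
from torch.utils.data import DataLoader
from transformers import AutoTokenizer
from dataset import IMDBDataset

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
texts = ["A wonderful film.", "A tedious, overlong mess."]
labels = [1, 0]  # 1 = positive, 0 = negative

ds = IMDBDataset(texts, labels, tokenizer, max_length=64, data_size=2)
loader = DataLoader(ds, batch_size=2)

batch_inputs, batch_labels = next(iter(loader))
print(batch_inputs["input_ids"].shape)  # torch.Size([2, 64])
```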
