This example trains a convolutional neural network to classify fashion items. The complete source code is available [here](test_convolutional_neural_network.py).

> The `get_device()` utility function was defined in a [previous example](../fundamentals/README.md#gpu-support).

```python
device = get_device()
print(f"PyTorch {torch.__version__}, using {device} device")
```

## Hyperparameters

```python
# Hyperparameters
n_epochs = 10  # Number of training iterations on the whole dataset
learning_rate = 0.001  # Rate of parameter change during gradient descent
batch_size = 64  # Number of samples used for one gradient descent step
conv2d_kernel_size = 3  # Size of the 2D convolution kernels
```
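
To make the interplay between `batch_size` and `n_epochs` concrete, here is a small worked computation. It assumes the 60,000-sample Fashion-MNIST training set and that the last, smaller batch of each epoch is kept rather than dropped. The variable names `n_train_samples`, `steps_per_epoch` and `total_steps` are illustrative.

```python
import math

n_epochs = 10
batch_size = 64
n_train_samples = 60_000  # Assumed size of the Fashion-MNIST training set

# One gradient descent step per batch; the last batch may be smaller
steps_per_epoch = math.ceil(n_train_samples / batch_size)
total_steps = steps_per_epoch * n_epochs

print(f"{steps_per_epoch} steps per epoch, {total_steps} steps in total")
```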

## Dataset loading

We use [Fashion-MNIST](https://github.com/zalandoresearch/fashion-mnist), a classic dataset for image recognition. Each example is a 28x28 grayscale image, associated with one label from 10 classes: t-shirt, trouser, pullover...

This dataset is provided by PyTorch through the [FashionMNIST](https://pytorch.org/vision/0.19/generated/torchvision.datasets.FashionMNIST.html) class. To evaluate the trained model's performance on unseen data, this class provides separate training and test sets.

When each dataset is constructed, a [transform](https://pytorch.org/vision/main/transforms.html) operation is specified to turn images into PyTorch tensors of shape `(color_depth, height, width)`, with pixel values scaled to the $[0,1]$ range.

### Dataset download

```python
# Directory for downloaded files
DATA_DIR = "./_output"

# Download and construct the Fashion-MNIST images dataset
# The training set is used to train the model
train_dataset = datasets.FashionMNIST(
    root=DATA_DIR,
    train=True,  # Training set
    download=True,
    transform=transforms.ToTensor(),
)
# The test set is used to evaluate the trained model performance on unseen data
test_dataset = datasets.FashionMNIST(
    root=DATA_DIR,
    train=False,  # Test set
    download=True,
    transform=transforms.ToTensor(),
)
```
82
+
83
+
### Bacth loading: training set
84
+
85
+
```python
86
+
# Create data loader for loading training data as randomized batches
print(f"{n_train_samples} training samples, {n_test_samples} test samples")
108
+
```
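
As an illustration of batch loading, here is a minimal sketch assuming the standard `torch.utils.data.DataLoader` class. To stay self-contained, it uses a small synthetic `TensorDataset` of 150 random images instead of Fashion-MNIST; the dataset, its size and the variable names are assumptions.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

batch_size = 64

# Hypothetical stand-in for the training set: 150 random 28x28 grayscale images
images = torch.rand(150, 1, 28, 28)
labels = torch.randint(0, 10, (150,))
train_dataset = TensorDataset(images, labels)

# shuffle=True reorders the samples at each epoch to produce randomized batches
train_dataloader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

n_train_samples = len(train_dataloader.dataset)
batch_sizes = [x_batch.shape[0] for x_batch, _ in train_dataloader]

print(f"{n_train_samples} training samples in {len(train_dataloader)} batches")
print(f"Batch sizes: {batch_sizes}")
```

Shuffling is typically enabled for the training set only, since the order of samples does not affect evaluation on the test set.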

## Model definition

### PyTorch models as classes

Non-trivial PyTorch models are created as subclasses of the [Module](https://pytorch.org/docs/stable/generated/torch.nn.Module.html) class. Two elements must be included in a model class:

- the constructor (`__init__()` function) to define the model architecture;
- the `forward()` function to implement the forward pass of input data through the model.
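
The two elements listed above can be sketched with a deliberately tiny model. The `TinyClassifier` class below is hypothetical and is not the network used in this example; it only shows where the architecture and the forward pass are declared.

```python
import torch
from torch import nn


class TinyClassifier(nn.Module):
    """Hypothetical minimal model: flattens an image and applies one linear layer."""

    def __init__(self):
        super().__init__()
        # Architecture definition
        self.flatten = nn.Flatten()
        self.linear = nn.Linear(in_features=28 * 28, out_features=10)

    def forward(self, x):
        # Forward pass: (batch, 1, 28, 28) -> (batch, 784) -> (batch, 10)
        return self.linear(self.flatten(x))


model = TinyClassifier()
outputs = model(torch.rand(2, 1, 28, 28))  # Calling the model runs forward()
print(outputs.shape)
```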

### Model architecture

We design a basic convolutional network. It takes a tensor of shape `(1, 28, 28)` (a rescaled grayscale image) as input and applies 2D convolution and max-pooling operations to detect interesting features. The output of these operations is flattened into a vector and passed through two linear layers to compute 10 values, one for each possible class.

Our model implementation leverages the following PyTorch classes:

- [Sequential](https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html) to create a sequential container of operations.
- [Conv2d](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html) to apply a 2D convolution operation.
- The [ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html) activation function.
- [MaxPool2d](https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html) to apply max-pooling.
- [Flatten](https://pytorch.org/docs/stable/generated/torch.nn.Flatten.html) to flatten the extracted features into a vector.
- [LazyLinear](https://pytorch.org/docs/stable/generated/torch.nn.LazyLinear.html), a fully connected layer whose input features are inferred during the first forward pass.
- [Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html), a fully connected layer used for final classification.
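
One way these classes might fit together is sketched below. The number of convolution blocks, the filter counts (32 and 64) and the hidden layer size (128) are assumptions chosen for illustration, and the `Convnet` name is hypothetical; the actual model may differ.

```python
import torch
from torch import nn

conv2d_kernel_size = 3  # Same value as in the hyperparameters section


class Convnet(nn.Module):
    """Hypothetical convolutional network for 28x28 grayscale images."""

    def __init__(self, n_classes: int = 10):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=conv2d_kernel_size),  # -> (32, 26, 26)
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),  # -> (32, 13, 13)
            nn.Conv2d(32, 64, kernel_size=conv2d_kernel_size),  # -> (64, 11, 11)
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),  # -> (64, 5, 5)
            nn.Flatten(),  # -> vector of 64 * 5 * 5 = 1600 values
            nn.LazyLinear(out_features=128),  # Input size inferred at first call
            nn.ReLU(),
            nn.Linear(128, n_classes),  # One output value per class
        )

    def forward(self, x):
        return self.layers(x)


model = Convnet()
logits = model(torch.rand(4, 1, 28, 28))  # First forward pass sets LazyLinear's input size
print(logits.shape)
print(model.layers[7].in_features)
```

Using `LazyLinear` avoids computing the flattened size by hand, at the cost of requiring one forward pass before the layer's parameters exist.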

For this multiclass classification task, we use the [CrossEntropyLoss](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html) class.

> As seen in a [previous example](../logistic_regression/README.md#loss-function), this class uses a softmax operation to output a probability distribution before computing the loss value.

```python
# Use cross-entropy loss function.
# nn.CrossEntropyLoss computes softmax internally
criterion = nn.CrossEntropyLoss()
```
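
The internal softmax can be verified on a toy example. This sketch compares `nn.CrossEntropyLoss` with a manual computation (log-softmax followed by the mean negative log-probability of the true classes); the logits and targets are made-up values.

```python
import torch
from torch import nn

criterion = nn.CrossEntropyLoss()

# Hypothetical raw model outputs (logits) for 2 samples and 3 classes
logits = torch.tensor([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])  # Index of the true class for each sample

loss = criterion(logits, targets)

# Manual equivalent: log-softmax, then average negative log-probability of the true class
log_probs = torch.log_softmax(logits, dim=1)
manual_loss = -log_probs[torch.arange(len(targets)), targets].mean()

print(loss.item(), manual_loss.item())
```

Because of this, the model should output raw logits rather than probabilities when trained with this loss.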

## Gradient descent optimizer

We use the standard [Adam](https://pytorch.org/docs/stable/generated/torch.optim.Adam.html) optimizer, which improves on plain gradient descent by combining momentum with per-parameter adaptive learning rates ([more details](https://github.com/bpesquet/mlcourse/tree/main/lectures/gradient_descent#gradient-descent-optimization-algorithms)).
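
A minimal sketch of how such an optimizer might be created and used for a single update step follows. The stand-in linear model and the random batch are assumptions made to keep the snippet self-contained; only the `learning_rate` value comes from the hyperparameters above.

```python
import torch
from torch import nn

torch.manual_seed(0)

learning_rate = 0.001  # Same value as in the hyperparameters section

# Hypothetical stand-in model and batch: 4 input features, 3 classes, 8 samples
model = nn.Linear(4, 3)
inputs = torch.randn(8, 4)
targets = torch.randint(0, 3, (8,))

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

weights_before = model.weight.detach().clone()

optimizer.zero_grad()  # Reset gradients accumulated by previous steps
loss = criterion(model(inputs), targets)
loss.backward()  # Compute gradients of the loss w.r.t. parameters
optimizer.step()  # Update parameters using Adam

max_change = (model.weight.detach() - weights_before).abs().max().item()
print(f"Largest weight change after one step: {max_change:.6f}")
```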