
Commit 3859a6f

chore: Some udpates
1 parent 4a18895 commit 3859a6f

File tree

2 files changed: +247 −2 lines changed

tutorials/01-basics/05-neural-network/main.py

+8 −2
@@ -8,8 +8,14 @@
 # Get Device for Training #
 # ================================================================ #
 
-device = ('cuda' if torch.cuda.is_available() else 'cpu')
-print(f'Using {device} device.')
+device = (
+    "cuda"
+    if torch.cuda.is_available()
+    else "mps"
+    if torch.backends.mps.is_available()
+    else "cpu"
+)
+print(f'Using {device} device')
 
 
 # ================================================================ #
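
The new fallback chain only picks a device string (`torch.backends.mps` requires PyTorch 1.12+); for it to matter, the model and tensors still have to be moved onto that device. Below is a minimal, self-contained usage sketch, not part of this commit; `TinyNet` is a placeholder model invented for illustration.

import torch
import torch.nn as nn

# Same device selection as the updated tutorial code.
device = (
    "cuda"
    if torch.cuda.is_available()
    else "mps"
    if torch.backends.mps.is_available()
    else "cpu"
)

# TinyNet is a hypothetical stand-in for the tutorial's model class.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)

model = TinyNet().to(device)          # move the parameters onto the device
x = torch.rand(8, 4, device=device)   # allocate inputs on the same device
logits = model(x)                     # forward pass runs on cuda/mps/cpu
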
@@ -0,0 +1,239 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "f02477e6-c231-4b62-a390-1ceebe858060",
+   "metadata": {},
+   "source": [
+    "# A GENTLE INTRODUCTION TO `TORCH.AUTOGRAD`\n",
+    "\n",
+    "`torch.autograd` is PyTorch’s automatic differentiation engine that powers neural network training."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1d124872-f6e3-4cfa-8d91-9232c16b1a0c",
+   "metadata": {},
+   "source": [
+    "## Background\n",
+    "\n",
+    "Neural networks (NNs) are a collection of nested functions that are executed on some input data. These functions are defined by parameters (consisting of weights and biases), which in PyTorch are stored in tensors.\n",
+    "\n",
+    "Training a NN happens in two steps:\n",
+    "\n",
+    "**Forward Propagation**: In forward prop, the NN makes its best guess about the correct output. It runs the input data through each of its functions to make this guess.\n",
+    "\n",
+    "**Backward Propagation**: In backprop, the NN adjusts its parameters proportionate to the error in its guess. It does this by traversing backwards from the output, collecting the derivatives of the error with respect to the parameters of the functions (gradients), and optimizing the parameters using gradient descent."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "38242089-98ee-487e-b26b-c99b1f2c44f9",
+   "metadata": {},
+   "source": [
+    "## Usage in PyTorch"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "fe758d79-6218-4460-b138-5cfab18bc2ee",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import torch\n",
+    "from torchvision.models import resnet18, ResNet18_Weights\n",
+    "model = resnet18(weights=ResNet18_Weights.DEFAULT)\n",
+    "data = torch.rand(1, 3, 64, 64)\n",
+    "labels = torch.rand(1, 1000)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "c8cbe694-3c24-499b-bfb4-1af6a0f60b23",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# let us run the forward pass\n",
+    "prediction = model(data)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "aeb5732e-61eb-4857-8bf5-32e9b4bdcb2d",
+   "metadata": {},
+   "source": [
+    "We use the model’s prediction and the corresponding label to calculate the error (`loss`). The next step is to backpropagate this error through the network. Backward propagation is kicked off when we call `.backward()` on the error tensor. Autograd then calculates and stores the gradients for each model parameter in the parameter’s `.grad` attribute."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "1e9ff9be-e712-420d-a713-7a8c2d3bd202",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "loss = (prediction - labels).sum()\n",
+    "loss.backward() # backward pass"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "48d67658-e90d-49ba-8f2e-e0afcb06764e",
+   "metadata": {},
+   "source": [
+    "Next, we load an optimizer, in this case SGD with a learning rate of 0.01 and momentum of 0.9. We register all the parameters of the model in the optimizer."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "a0fd8859-3772-4896-80bd-044752a844be",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "optim = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "81c7e0eb-0d80-411b-9886-79e84058aa98",
+   "metadata": {},
+   "source": [
+    "Finally, we call `.step()` to initiate gradient descent. The optimizer adjusts each parameter by its gradient stored in `.grad`.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "67640c1a-e163-49f5-a083-95540469a8ed",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "optim.step() # gradient descent"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e18cc09f-62b0-4dea-8ca8-659a977ccd19",
+   "metadata": {},
+   "source": [
+    "## Differentiation in Autograd\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "04cc0273-2d6e-4a97-a484-8ce1c54d4141",
+   "metadata": {},
+   "source": [
+    "Let’s take a look at how `autograd` collects gradients. We create two tensors `a` and `b` with `requires_grad=True`. This signals to `autograd` that every operation on them should be tracked."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "id": "c3bb911e-08a1-40cd-9bcf-df1452cdbcc1",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import torch\n",
+    "\n",
+    "a = torch.tensor([2., 3.], requires_grad=True)\n",
+    "b = torch.tensor([6., 4.], requires_grad=True)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "id": "a83aae7d-61bb-45c8-b67b-6ef4f268dea5",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# let us create another tensor from a and b\n",
+    "Q = 3*a**3 - b**2"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ac6f95ef-6ee9-447a-a060-c33dcf3c3305",
+   "metadata": {},
+   "source": [
+    "Let’s assume `a` and `b` to be parameters of an NN, and `Q` to be the error function. In NN training, we want gradients of the error w.r.t. parameters.\n",
+    "\n",
+    "When we call `.backward()` on `Q`, autograd calculates these gradients and stores them in the respective tensors’ `.grad` attribute.\n",
+    "\n",
+    "Equivalently, we can also aggregate `Q` into a scalar and call backward implicitly, like `Q.sum().backward()`.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "id": "adaf0998-9c4e-4024-a6c1-d52f9b7d4835",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "external_grad = torch.tensor([1., 1.])\n",
+    "Q.backward(gradient=external_grad)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9c30509b-d704-4c85-8ba9-be2eb837f34f",
+   "metadata": {},
+   "source": [
+    "Gradients are now deposited in `a.grad` and `b.grad`.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "id": "cb866a0d-af2a-47f0-af5e-fa506efccb76",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "tensor([True, True])\n",
+      "tensor([True, True])\n"
+     ]
+    }
+   ],
+   "source": [
+    "# check if the collected gradients are correct\n",
+    "print(9*a**2 == a.grad)\n",
+    "print(-2*b == b.grad)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "fcb6225e-5954-414f-b0bc-49e1c0ba0712",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.13"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
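
Taken together, the notebook's first example boils down to one forward/backward/step cycle. The following standalone sketch restates those cells as a plain script and adds an `optim.zero_grad()` call, which becomes necessary as soon as the cycle is repeated in a loop, because gradients otherwise accumulate in `.grad`.

import torch
from torchvision.models import resnet18, ResNet18_Weights

model = resnet18(weights=ResNet18_Weights.DEFAULT)
data = torch.rand(1, 3, 64, 64)      # one random 64x64 "image"
labels = torch.rand(1, 1000)         # random targets for the 1000 output classes

optim = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)

prediction = model(data)             # forward pass
loss = (prediction - labels).sum()   # toy error measure from the notebook
loss.backward()                      # autograd stores d(loss)/d(param) in each param.grad
optim.step()                         # gradient descent update using those .grad values
optim.zero_grad()                    # reset .grad before the next iteration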
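
The final check in the notebook (`9*a**2 == a.grad`, `-2*b == b.grad`) follows from differentiating `Q = 3*a**3 - b**2` elementwise: ∂Q/∂a = 9a² and ∂Q/∂b = −2b. Because `Q` is a vector, `Q.backward(gradient=external_grad)` computes a vector-Jacobian product; with an all-ones `external_grad` that product is exactly these partial derivatives, the same result as `Q.sum().backward()`. A minimal standalone sketch of that check:

import torch

a = torch.tensor([2., 3.], requires_grad=True)
b = torch.tensor([6., 4.], requires_grad=True)

Q = 3 * a**3 - b**2                  # elementwise: Q_i = 3*a_i**3 - b_i**2

# Q is not a scalar, so backward() needs a gradient argument (dL/dQ).
# Passing ones is equivalent to calling Q.sum().backward().
Q.backward(gradient=torch.ones_like(Q))

print(torch.allclose(a.grad, 9 * a.detach() ** 2))  # dQ_i/da_i = 9*a_i**2 -> True
print(torch.allclose(b.grad, -2 * b.detach()))      # dQ_i/db_i = -2*b_i   -> True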
