Commit e33f472

authored Mar 13, 2023
Update README.md
1 parent 0fa674e commit e33f472

1 file changed

+66
-2
lines changed

‎README.md

@@ -1,2 +1,66 @@
1-
# mnist-dataset-classification
2-
A standard (non-convolution based) neural network to classify the MNIST dataset.
1+
# MNIST Dataset Classification
2+
> A standard (non-convolutional) neural network to classify the MNIST dataset.
3+
4+
The MNIST database contains 28x28 grayscale images, each showing a handwritten digit that the network has to identify.
5+
6+
## Step 1 : Setting up the database
7+
8+
### Downloading and Transforming the database :
9+
10+
We need to download the MNIST dataset and transform it into tensors, which are the inputs to the model. This is achieved by :
11+
12+
https://github.com/infinitecoder1729/mnist-dataset-classification/blob/0fa674e4325acf4e82ea8513c948062677d04baf/MNIST%20Classification%20Model..py#L8-L9
13+
14+
> The 'train' dataset is our training dataset and the 'test' dataset is our testing dataset.
15+
16+
### Getting to know the dataset better :
17+
18+
To find the number of samples in each dataset, we can simply use :
19+
20+
https://github.com/infinitecoder1729/mnist-dataset-classification/blob/0fa674e4325acf4e82ea8513c948062677d04baf/MNIST%20Classification%20Model..py#L10-L11
21+
22+
To see an example image (here, the first one in the test dataset), we can use :
23+
24+
```py
import matplotlib.pyplot as plt

# Display the first image in the test dataset along with its label
image, label = test[0]
plt.imshow(image.numpy().squeeze(), cmap='gray_r')
print("\nThe Number is : ", label, "\n")
```
29+
### Deciding on whether to use batches or not :
30+
31+
The more training examples are used to estimate the error gradient, the more accurate that estimate is, and the more likely it is that the resulting weight update improves the model's performance.
32+
33+
A smaller batch size produces a noisier estimate of the error gradient, and therefore noisier updates to the model: successive updates may be based on very different gradient estimates. However, these noisy updates sometimes lead to a more robust model, and smaller batches often make learning faster.
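
To make the "noisy estimate" idea concrete, here is a small sketch using only the standard library. The per-example "gradients" are synthetic random numbers (an assumption for illustration, not values from this model), and `estimate_spread` is a hypothetical helper; it measures how much the batch average varies from one random batch to the next.

```python
import random
import statistics

random.seed(0)

# Synthetic per-example "gradients": noisy values around a true mean of 1.0
per_example_grads = [random.gauss(1.0, 2.0) for _ in range(10_000)]


def estimate_spread(batch_size, trials=200):
    """Standard deviation of the batch-mean gradient over repeated random batches."""
    estimates = [
        statistics.fmean(random.sample(per_example_grads, batch_size))
        for _ in range(trials)
    ]
    return statistics.stdev(estimates)


spread_small = estimate_spread(batch_size=4)
spread_large = estimate_spread(batch_size=400)

# The small-batch estimates scatter far more widely than the large-batch ones
print(f"batch size   4: spread {spread_small:.3f}")
print(f"batch size 400: spread {spread_large:.3f}")
```

The spread shrinks roughly with the square root of the batch size, which is why larger batches give steadier but less frequent updates.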
34+
35+
Types of gradient descent :
36+
1. Batch Gradient Descent : The whole dataset is treated as one batch.
37+
2. Stochastic Gradient Descent : Batch size is set to one example.
38+
3. Minibatch Gradient Descent : Batch size is set somewhere between one and the total number of examples in the training dataset.
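
The three variants above differ only in batch size, which in turn fixes how many parameter updates happen per pass over the data. A minimal sketch of that arithmetic, using the 60,000 images of the MNIST training set (`updates_per_epoch` is a hypothetical helper name):

```python
import math


def updates_per_epoch(num_examples, batch_size):
    """Number of parameter updates in one full pass over the training data."""
    return math.ceil(num_examples / batch_size)


n = 60_000  # number of images in the MNIST training set

print(updates_per_epoch(n, n))   # batch gradient descent      -> 1
print(updates_per_epoch(n, 1))   # stochastic gradient descent -> 60000
print(updates_per_epoch(n, 30))  # mini-batch with 30 examples -> 2000
```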
39+
40+
Given that the dataset is quite large, we will not use a batch size equal to the whole dataset.
41+
42+
Smaller batch sizes also give us certain benefits, such as :
43+
44+
1. Lower generalization error.
45+
2. It is easier to fit one batch of training data in memory.
46+
47+
We will use mini-batch gradient descent so that we update our parameters frequently while still using a vectorized implementation for faster computation.
48+
49+
A batch size of around 30 examples should be suitable.
50+
51+
We will use a DataLoader to randomly split our datasets into small batches :
52+
53+
https://github.com/infinitecoder1729/mnist-dataset-classification/blob/0fa674e4325acf4e82ea8513c948062677d04baf/MNIST%20Classification%20Model..py#L13
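
The linked line is not shown here, so the following is only a sketch of how `torch.utils.data.DataLoader` batches and shuffles a dataset. It uses a small random stand-in dataset shaped like MNIST instead of the real one (an assumption, to keep the example self-contained); `stand_in_train` and `train_loader` are hypothetical names.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in for the real training set: 120 random "images" shaped like MNIST
images = torch.rand(120, 1, 28, 28)
labels = torch.randint(0, 10, (120,))
stand_in_train = TensorDataset(images, labels)

# shuffle=True reshuffles the samples at the start of every epoch
# before they are split into batches of 30.
train_loader = DataLoader(stand_in_train, batch_size=30, shuffle=True)

batch_images, batch_labels = next(iter(train_loader))
print(batch_images.shape)  # torch.Size([30, 1, 28, 28])
print(batch_labels.shape)  # torch.Size([30])
print(len(train_loader))   # 4 batches cover the 120 samples
```

With the real 'train' dataset in place of the stand-in, the same call would yield batches of 30 image tensors and their 30 labels.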
54+
55+
## Step 2 : Creating the neural network
56+
57+
### Deciding on Number of Hidden Layers and neurons :
58+
59+
This is a widely debated topic; to keep things simple, the discussion in the [AI FAQs](http://www.faqs.org/faqs/ai-faq/neural-nets/part3/) was followed in making this model. Thus, one hidden layer is used, with 490 hidden nodes. This loosely follows the rule of thumb that the number of hidden neurons should be 2/3 the size of the input layer, plus the size of the output layer (applied literally to 784 inputs and 10 outputs, that rule gives about 533).
60+
61+
There are 784 input nodes, one per pixel of each 28 x 28 image, while the output layer has 10 nodes, one for each digit (0 to 9).
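
The layer sizes and the rule of thumb quoted above can be checked with a few lines of arithmetic. Note that the rule, applied literally, gives about 533 hidden neurons, so the 490 chosen for this model sits somewhat below it. Variable names are illustrative.

```python
input_size = 28 * 28   # one input node per pixel
output_size = 10       # one output node per digit, 0 to 9

# Rule of thumb: 2/3 of the input layer size, plus the output layer size
rule_of_thumb = (2 / 3) * input_size + output_size

print(input_size)            # 784
print(round(rule_of_thumb))  # 533
```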
62+
63+
This is implemented as :
64+
65+
https://github.com/infinitecoder1729/mnist-dataset-classification/blob/0fa674e4325acf4e82ea8513c948062677d04baf/MNIST%20Classification%20Model..py#L16-L18
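
Since the linked lines are not reproduced here, the following is a hedged sketch of a network with the stated sizes (784 inputs, 490 hidden nodes, 10 outputs) in PyTorch. The use of `nn.Sequential`, the `nn.Flatten` step, and the ReLU activation are assumptions; the actual script may differ.

```python
import torch
from torch import nn

model = nn.Sequential(
    nn.Flatten(),         # (N, 1, 28, 28) -> (N, 784)
    nn.Linear(784, 490),  # input layer -> single hidden layer
    nn.ReLU(),            # activation choice is an assumption
    nn.Linear(490, 10),   # hidden layer -> one score per digit
)

# A random batch of 30 image-shaped tensors stands in for real data
dummy_batch = torch.rand(30, 1, 28, 28)
scores = model(dummy_batch)
print(scores.shape)  # torch.Size([30, 10])

# Weights and biases of both linear layers: 784*490 + 490 + 490*10 + 10
num_params = sum(p.numel() for p in model.parameters())
print(num_params)  # 389560
```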
66+
