Course project submission for CS6910: Fundamentals of Deep Learning.
Check this link for the task description: Problem statement link
Team Members: Vamsi Sai Krishna Malineni (OE20S302), Mohammed Safi Ur Rahman Khan (CS21M035)
- Install the required libraries using the following command:
pip install -r requirements.txt
The project is divided into two parts: PART A and PART B. You can find the Jupyter notebooks in the respective folders. Along with the Jupyter notebooks, we have also provided Python code files (.py). These contain the code to directly train and test the CNN in a non-interactive way.
To see the outputs, the various explanations, and how the code was developed, please check the Jupyter notebooks (i.e., the .ipynb files for both Part A and Part B). To train and test the CNN model from the command line, run the (.py) files by following the instructions given in the sections below.
If you are running the Jupyter notebooks on Colab, the libraries from the requirements.txt file are preinstalled, with the exception of wandb. You can install wandb using the following command:
!pip install wandb
- The dataset for this project can be found at: Dataset Link
As mentioned earlier, there are two files in the Part A folder: a Jupyter notebook and a Python code file.
The Jupyter notebook still has its outputs intact, so it can be used for reference.
The Python file has all the functions and code used in the notebook (along with some additional code that can be used to run from the command line).
The Python file can be run from the terminal by passing the various command-line arguments. Please make sure that the unzipped dataset folder is present in the same directory as the Python file.
There are two modes of running this file:
1. Running the hyperparameter sweeps using wandb
python assgn2.py --sweep yes
The code will now run in sweep mode, enable wandb integration, and log all the data to wandb. Make sure you have wandb installed if you want to run in this mode. Also, change the entity name and project name in the code before running in this mode.
2. Running in normal mode
python assgn2.py --sweep no --batchNorm xxx --numFilters xxx --filterOrg xxx --dropout xxx --dataAugment xxx --numEpochs xxx --batchSize xxx --denseLayer xxx --learningRate xxx --kernelSize xxx --denseAct xxx --convAct xxx
Replace xxx above with the appropriate values you want to train the model with.
For example:
python assgn2.py --sweep no --batchNorm True --numFilters 32 --filterOrg 2 --dropout 0.4 --dataAugment False --numEpochs 10 --batchSize 128 --denseLayer 512 --learningRate 0.0001 --kernelSize 3 --denseAct relu --convAct relu
Description of the various command-line arguments (a sketch of how they might be parsed follows the list):
- `--sweep`: Whether to run a sweep: enter 'yes' or 'no'. If this is 'yes', the arguments below are not required; enter them only if this is 'no'
- `--batchNorm`: Batch normalization: True or False
- `--numFilters`: Number of filters: integer value
- `--filterOrg`: Filter organization: float value
- `--dropout`: Dropout: float value
- `--dataAugment`: Data augmentation: True or False
- `--numEpochs`: Number of epochs: integer value
- `--batchSize`: Batch size: integer value
- `--denseLayer`: Dense layer size: integer value
- `--learningRate`: Learning rate: float value
- `--kernelSize`: Kernel size: integer value
- `--denseAct`: Dense layer activation function: string value
- `--convAct`: Conv layer activation function: string value
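Below is a hedged sketch of how these flags could be parsed with argparse. This is not the project's actual parsing code; the defaults and the `str2bool` helper are assumptions for illustration.

```python
import argparse

def str2bool(v):
    # interpret strings like "True"/"true"/"1" passed on the command line as booleans
    return str(v).lower() in ("yes", "true", "t", "1")

parser = argparse.ArgumentParser(description="Train/test the Part A CNN")
parser.add_argument("--sweep", choices=["yes", "no"], required=True)
parser.add_argument("--batchNorm", type=str2bool, default=True)
parser.add_argument("--numFilters", type=int, default=32)
parser.add_argument("--filterOrg", type=float, default=1.0)
parser.add_argument("--dropout", type=float, default=0.0)
parser.add_argument("--dataAugment", type=str2bool, default=False)
parser.add_argument("--numEpochs", type=int, default=10)
parser.add_argument("--batchSize", type=int, default=64)
parser.add_argument("--denseLayer", type=int, default=128)
parser.add_argument("--learningRate", type=float, default=1e-4)
parser.add_argument("--kernelSize", type=int, default=3)
parser.add_argument("--denseAct", type=str, default="relu")
parser.add_argument("--convAct", type=str, default="relu")
args = parser.parse_args()
```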
The notebook can be run sequentially, i.e., one cell at a time. It also contains the code for plotting the various images required for the assignment.
Dataset for training and validation is prepared using the following functions (a sketch follows the list):
- Un-augmented dataset: `get_data()`
- Augmented dataset: `get_augmented_data()`
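As a rough illustration, `get_augmented_data()` might be implemented with a Keras ImageDataGenerator along these lines. The directory name "inaturalist_12K/train", the image size, the validation split, and the specific augmentations are assumptions, not the project's exact settings.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def get_augmented_data(img_size=(256, 256), batch_size=64):
    # augmentation pipeline; 10% of the training folder is held out for validation
    datagen = ImageDataGenerator(
        rescale=1.0 / 255,      # normalize pixel values to [0, 1]
        rotation_range=30,      # random rotations
        zoom_range=0.2,         # random zooms
        horizontal_flip=True,   # random horizontal flips
        validation_split=0.1,
    )
    train = datagen.flow_from_directory(
        "inaturalist_12K/train",  # assumed dataset layout
        target_size=img_size, batch_size=batch_size, subset="training")
    val = datagen.flow_from_directory(
        "inaturalist_12K/train",
        target_size=img_size, batch_size=batch_size, subset="validation")
    return train, val
```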
Building a small CNN with 5 convolution layers can be done using the following method:
build_cnn(conv_activation , dense_activation, num_filters, conv_filter_size, pool_filter_size, batch_norm, dense_layer, dropout)
where (a sketch of a possible implementation follows the list):
- `conv_activation`: dtype="List"; activations used for the convolution layers
- `dense_activation`: dtype="String"; activation used for the densely connected layer
- `num_filters`: dtype="List"; number of filters for each layer
- `conv_filter_size`: dtype="List"; kernel sizes for the convolution layers
- `pool_filter_size`: dtype="List"; kernel sizes for the max-pooling layers
- `batch_norm`: dtype="Boolean"; set to True if you are using batch normalization
- `dense_layer`: dtype="Integer"; dimensionality of the output space after the 5 convolution-maxpooling blocks
- `dropout`: dtype="float or double"; the dropout fraction for regularization (in decimals)
The hyperparameter sweeps can be run using the following method:
sweeper(entity_name, project_name)
where:
- `entity_name`: the wandb entity name
- `project_name`: the wandb project name
The various hyperparameters used are:
hyperparameters = {
    'batch_norm': {'values': [True, False]},
    'num_filter': {'values': [32, 64, 128, 256]},
    'filter_org': {'values': [0.5, 1, 2]},
    'dropout': {'values': [0.0, 0.5, 0.6, 0.4]},
    'data_augmentation': {'values': [True, False]},
    'num_epochs': {'values': [10, 20, 30]},
    'batch_size': {'values': [32, 64, 128]},
    'dense_layer': {'values': [32, 64, 128, 512]},
    'learning_rate': {'values': [0.001, 0.0001]},
    'kernel_size': {'values': [3, 5, 7]}
}
sweep_config = {
    'method': 'bayes',
    'metric': {'name': 'val_acc', 'goal': 'maximize'},
    'parameters': hyperparameters
}
The following function will define the model, train it according to the hyperparameters given to it by wandb, and log the metrics to wandb:
train()
Use the following function to run the wandb sweeps:
sweeper(entity_name,project_name)
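A plausible sketch of `sweeper()`, assuming it simply registers the `sweep_config` above with wandb and launches an agent that calls `train()` for each configuration:

```python
import wandb

def sweeper(entity_name, project_name):
    # register the sweep and launch an agent; sweep_config and train()
    # are assumed to be defined as above
    sweep_id = wandb.sweep(sweep_config, entity=entity_name,
                           project=project_name)
    wandb.agent(sweep_id, function=train)
```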
Use the following function to generate the test dataset, which is used to determine the test accuracy of the model with the best validation accuracy:
get_test_data()
Use the following function to determine the test accuracy of the best performing model and log the metrics to wandb:
testing(entity_name,project_name)
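A hedged sketch of what `testing()` might do: load the saved best model, evaluate it on the test set, and log the result. The checkpoint filename "best_model.h5" is an assumption.

```python
import wandb
from tensorflow.keras.models import load_model

def testing(entity_name, project_name):
    # start a fresh wandb run for the final test-set evaluation
    wandb.init(entity=entity_name, project=project_name)
    model = load_model("best_model.h5")  # assumed checkpoint name
    test_data = get_test_data()          # test generator from above
    loss, acc = model.evaluate(test_data)
    wandb.log({"test_loss": loss, "test_acc": acc})
    wandb.finish()
```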
- The best trained model can be accessed at: https://drive.google.com/file/d/1aInmPFMV_rpJI_xPDP45h7XD4sGi1KaC/view?usp=sharing
The .ipynb file contains all the necessary plots and the code to generate them. The plots include:
- Plotting images with their true and predicted labels
- Plotting the filters and the feature maps for a random image
- Visualizing 10 random neurons using guided backpropagation (a sketch of the technique follows)
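For reference, guided backpropagation modifies the ReLU gradient so that only positive gradients flow back through positive activations. A minimal TensorFlow 2 sketch of the idea; the model is assumed to be a copy of the trained CNN with its ReLUs swapped for `guided_relu`:

```python
import tensorflow as tf

@tf.custom_gradient
def guided_relu(x):
    def grad(dy):
        # pass gradient only where both the upstream gradient and the
        # forward activation are positive (the guided-backprop rule)
        return tf.cast(dy > 0, dy.dtype) * tf.cast(x > 0, dy.dtype) * dy
    return tf.nn.relu(x), grad

def guided_backprop(model, image, neuron_index):
    # model: trained CNN with ReLUs replaced by guided_relu (assumed)
    inputs = tf.convert_to_tensor(image, tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(inputs)
        activation = model(inputs)[..., neuron_index]
    return tape.gradient(activation, inputs)
```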
As mentioned earlier, there are two files in the Part B folder: a Jupyter notebook and a Python code file.
The Jupyter notebook still has its outputs intact, so it can be used for reference.
The Python file has all the functions and code used in the notebook (along with some additional code that can be used to run from the command line).
The Python file can be run from the terminal by passing the various command-line arguments. Please make sure that the unzipped dataset folder is present in the same directory as the Python file.
There are two modes of running this file:
1. Running the hyperparameter sweeps using wandb
python assgn2B.py --sweep yes
The code will now run in sweep mode, enable wandb integration, and log all the data to wandb. Make sure you have wandb installed if you want to run in this mode. Also, change the entity name and project name in the code before running in this mode.
2. Running in normal mode
python assgn2B.py --sweep no --model xxx --dropout xxx --dataAugment xxx --numEpochs xxx --batchSize xxx --denseLayer xxx --learningRate xxx --trainLayers xxx --denseAct xxx
Replace xxx above with the appropriate values you want to train the model with.
For example:
python assgn2B.py --sweep no --model Xception --dropout 0.3 --dataAugment False --numEpochs 10 --batchSize 32 --denseLayer 128 --learningRate 0.001 --trainLayers 10 --denseAct relu
Description of the various command-line arguments:
- `--sweep`: Whether to run a sweep: enter 'yes' or 'no'. If this is 'yes', the arguments below are not required; enter them only if this is 'no'
- `--model`: Pretrained model to use: string value
- `--dropout`: Dropout: float value
- `--dataAugment`: Data augmentation: True or False
- `--numEpochs`: Number of epochs: integer value
- `--batchSize`: Batch size: integer value
- `--denseLayer`: Dense layer size: integer value
- `--learningRate`: Learning rate: float value
- `--trainLayers`: Number of trainable layers: integer value
- `--denseAct`: Dense layer activation function: string value
The notebook can be run sequentially, i.e., one cell at a time. It also contains the code for plotting the various images required for the assignment.
Dataset for training and validation is prepared using the following functions:
- Un-augmented dataset: `get_data()`
- Augmented dataset: `get_augmented_data()`
The following function is used to build a model based on a pretrained model:
build_model(model_name, dense_activation, dense_layer, dropout, trainable_layers)
where (a sketch of a possible implementation follows the model list below):
- `model_name`: the name of a pretrained model ("String")
- `dense_activation`: the name of the activation function for the dense layer ("String")
- `dense_layer`: the number of units in the dense layer ("Integer")
- `dropout`: the dropout fraction in decimals ("Double/float")
- `trainable_layers`: the number of layers to be fine-tuned ("Integer")
The available pretrained models are:
- `ResNet50`
- `InceptionV3`
- `InceptionResNetV2`
- `Xception`
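A hedged sketch of `build_model()` under the signature above, assuming tf.keras.applications backbones with ImageNet weights: all layers except the last `trainable_layers` are frozen, and a dense head is added on top. The input shape and class count are assumptions.

```python
from tensorflow.keras import layers, models, applications

def build_model(model_name, dense_activation, dense_layer,
                dropout, trainable_layers):
    base_cls = {"ResNet50": applications.ResNet50,
                "InceptionV3": applications.InceptionV3,
                "InceptionResNetV2": applications.InceptionResNetV2,
                "Xception": applications.Xception}[model_name]
    base = base_cls(include_top=False, weights="imagenet",
                    input_shape=(256, 256, 3))  # assumed input size
    # freeze everything except the last `trainable_layers` layers
    for layer in base.layers[:len(base.layers) - trainable_layers]:
        layer.trainable = False
    model = models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dense(dense_layer, activation=dense_activation),
        layers.Dropout(dropout),
        layers.Dense(10, activation="softmax"),  # 10 classes assumed
    ])
    return model
```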
The hyperparameter sweeps can be run using the following method:
sweeper(entity_name, project_name)
where:
- `entity_name`: the wandb entity name
- `project_name`: the wandb project name
The various hyperparameters used are:
hyperparameters = {
    "model_name": {"values": ["InceptionV3", "ResNet50", "InceptionResNetV2", "Xception"]},
    "data_augmentation": {"values": [True, False]},
    "dense_layer": {"values": [64, 128, 256, 512]},
    "dropout": {"values": [0.0, 0.1, 0.2, 0.3]},
    "trainable_layers": {"values": [0, 10, 15, 20]},
    "batch_size": {"values": [64, 128]},
    "num_epochs": {"values": [5, 10, 15]}
}
sweep_config = {
"method": "bayes",
"metric": {"name":"val_acc","goal": "maximize"},
"parameters": hyperparameters
}
The following function will define the model, train it according to the hyperparameters given to it by wandb, and log the metrics to wandb:
wandb_train()
Use the following function to run the wandb sweeps:
sweeper(entity_name,project_name)
The test data can be accessed using the following functions:
- Un-augmented data: use this function if the best model was trained without data augmentation:
get_test_data()
- Augmented data: use this function if the best model was trained with data augmentation:
get_test_augmented_data()
Use the following function to determine the test accuracy of the best performing model and log the metrics to wandb:
testing(entity_name,project_name)
The results and the learnings from this assignment can be found here: https://wandb.ai/safi-vamsi-cs6910/Assignment%202/reports/Assignment-2--VmlldzoxNzY2Njky