From ad36b46eaea40bc22c4558bdbd7b2e7599ae34e5 Mon Sep 17 00:00:00 2001
From: Julius Simonelli
Date: Mon, 3 Jul 2023 10:23:08 -0700
Subject: [PATCH] Update 2021-5-26-torchvision-mobilenet-v3-implementation.md

---
 _posts/2021-5-26-torchvision-mobilenet-v3-implementation.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/_posts/2021-5-26-torchvision-mobilenet-v3-implementation.md b/_posts/2021-5-26-torchvision-mobilenet-v3-implementation.md
index 6496c42a1806..8b9419e7a193 100644
--- a/_posts/2021-5-26-torchvision-mobilenet-v3-implementation.md
+++ b/_posts/2021-5-26-torchvision-mobilenet-v3-implementation.md
@@ -81,7 +81,7 @@ Another important detail is that though PyTorch’s and TensorFlow’s RMSProp i
 
 **Increasing our accuracy by tuning hyperparameters & improving our training recipe**
 
-After configuring the optimizer to achieve fast and stable training, we turned into optimizing the accuracy of the model. There are a few techniques that helped us achieve this. First of all, to avoid overfitting we augmented out data using the AutoAugment algorithm, followed by RandomErasing. Additionally we tuned parameters such as the weight decay using cross validation. We also found beneficial to perform [weight averaging](https://github.com/pytorch/vision/blob/674e8140042c2a3cbb1eb9ebad1fa49501599130/references/classification/utils.py#L259) across different epoch checkpoints after the end of the training. Finally, though not used in our published training recipe, we found that using Label Smoothing, Stochastic Depth and LR noise injection improve the overall accuracy by over [1.5 points](https://rwightman.github.io/pytorch-image-models/training_hparam_examples/#mobilenetv3-large-100-75766-top-1-92542-top-5).
+After configuring the optimizer to achieve fast and stable training, we turned into optimizing the accuracy of the model. There are a few techniques that helped us achieve this. First of all, to avoid overfitting we augmented our data using the AutoAugment algorithm, followed by RandomErasing. Additionally we tuned parameters such as the weight decay using cross validation. We also found beneficial to perform [weight averaging](https://github.com/pytorch/vision/blob/674e8140042c2a3cbb1eb9ebad1fa49501599130/references/classification/utils.py#L259) across different epoch checkpoints after the end of the training. Finally, though not used in our published training recipe, we found that using Label Smoothing, Stochastic Depth and LR noise injection improve the overall accuracy by over [1.5 points](https://rwightman.github.io/pytorch-image-models/training_hparam_examples/#mobilenetv3-large-100-75766-top-1-92542-top-5).
 
 The graph and table depict a simplified summary of the most important iterations for improving the accuracy of the MobileNetV3 Large variant. Note that the actual number of iterations done while training the model was significantly larger and that the progress in accuracy was not always monotonically increasing. Also note that the Y-axis of the graph starts from 70% instead from 0% to make the difference between iterations more visible:
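
Reviewer note: the paragraph touched by this patch mentions weight averaging across epoch checkpoints. As a rough illustration only, the idea can be sketched in plain Python as an element-wise mean over matching parameters. Everything here is an assumption made for the example: the `Checkpoint` type, the `average_checkpoints` helper as written below, the parameter names, and the toy weights are hypothetical, and this is not a copy of the utility linked in the post. A real recipe would presumably operate on model state dicts holding tensors rather than lists of floats.

```python
from typing import Dict, List, Sequence

# Hypothetical stand-in for a checkpoint: parameter name -> flat list of weights.
Checkpoint = Dict[str, List[float]]


def average_checkpoints(checkpoints: Sequence[Checkpoint]) -> Checkpoint:
    """Return the element-wise mean of every parameter across checkpoints."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint")

    reference = checkpoints[0]
    for ckpt in checkpoints[1:]:
        # All checkpoints must describe the same set of parameters.
        if ckpt.keys() != reference.keys():
            raise ValueError("checkpoints must share the same parameter names")

    count = len(checkpoints)
    averaged: Checkpoint = {}
    for name, values in reference.items():
        total = [0.0] * len(values)
        for ckpt in checkpoints:
            if len(ckpt[name]) != len(values):
                raise ValueError(f"shape mismatch for parameter {name!r}")
            # Accumulate this checkpoint's weights position by position.
            for i, value in enumerate(ckpt[name]):
                total[i] += value
        averaged[name] = [t / count for t in total]
    return averaged


# Toy example with made-up weights from three epoch checkpoints.
epoch_checkpoints = [
    {"fc.weight": [1.0, 2.0], "fc.bias": [0.0]},
    {"fc.weight": [2.0, 4.0], "fc.bias": [3.0]},
    {"fc.weight": [3.0, 6.0], "fc.bias": [6.0]},
]
averaged = average_checkpoints(epoch_checkpoints)
print(averaged)
```

The sketch uses a uniform mean over all supplied checkpoints. Which checkpoints to include and how to weight them are choices the post does not spell out, so they are left to the caller here.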