Consistent with the paper, the two _trainval_ datasets are to be used for training, while the VOC 2007 _test_ will serve as our test data.
Make sure you extract both the VOC 2007 _trainval_ and 2007 _test_ data to the same location, i.e. merge them.
#### PyTorch DataLoader
The `Dataset` described above, `PascalVOCDataset`, will be used by a PyTorch [`DataLoader`](https://pytorch.org/docs/master/data.html#torch.utils.data.DataLoader) in `train.py` to **create and feed batches of data to the model** for training or evaluation.
Since the number of objects varies across different images, their bounding boxes, labels, and difficulties cannot simply be stacked together in the batch, as there would be no way of knowing which objects belong to which image.
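A common way to handle this is a custom collate function that stacks the fixed-size images into a single tensor but keeps the per-image annotations as lists. A minimal sketch of the idea (the tutorial implements something similar as a `collate_fn` on the dataset; the names below are illustrative):

```python
import torch

def collate_fn(batch):
    """Combine samples of (image, boxes, labels, difficulties) into a batch.

    Images are all the same size, so they can be stacked into one tensor;
    boxes, labels, and difficulties vary in length per image, so they are
    kept as lists of tensors, one entry per image.
    """
    images, boxes, labels, difficulties = [], [], [], []
    for image, b, l, d in batch:
        images.append(image)
        boxes.append(b)
        labels.append(l)
        difficulties.append(d)
    return torch.stack(images, dim=0), boxes, labels, difficulties
```

The lists preserve the image-to-object correspondence that naive stacking would destroy.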
# Training
Before you begin, make sure to save the required data files for training and evaluation. To do this, run the contents of [`create_data_lists.py`](https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Object-Detection/blob/master/create_data_lists.py) after pointing it to the `VOC2007` and `VOC2012` folders in your [downloaded data](https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Object-Detection#download).
See [`train.py`](https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Object-Detection/blob/master/train.py).
To **train your model from scratch**, run this file.
To **resume training at a checkpoint**, point to the corresponding file with the `checkpoint` parameter at the beginning of the code.
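For example, near the top of `train.py` (the checkpoint filename shown is illustrative):

```python
checkpoint = None  # None means train from scratch
# checkpoint = 'checkpoint_ssd300.pth.tar'  # or point to a saved checkpoint to resume
```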
### Remarks
In the paper, they recommend using **Stochastic Gradient Descent** in batches of `32` images, with an initial learning rate of `1e-3`, momentum of `0.9`, and `5e-4` weight decay.
I ended up using a batch size of `8` images for increased stability.
The authors also doubled the learning rate for bias parameters. As you can see in the code, this is easy to do in PyTorch, by passing [separate groups of parameters](https://pytorch.org/docs/stable/optim.html#per-parameter-options) to the `params` argument of its [SGD optimizer](https://pytorch.org/docs/stable/optim.html#torch.optim.SGD).
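A sketch of the idea, using a single convolution as a stand-in model (the tutorial separates biases from weights the same way, by parameter name):

```python
import torch

model = torch.nn.Conv2d(3, 64, kernel_size=3)  # stand-in for the SSD model
lr = 1e-3

# Split parameters into biases and everything else by name
biases = [p for name, p in model.named_parameters() if name.endswith('bias')]
not_biases = [p for name, p in model.named_parameters() if not name.endswith('bias')]

# Biases get twice the base learning rate; other parameters use the default
optimizer = torch.optim.SGD(
    [{'params': biases, 'lr': 2 * lr},
     {'params': not_biases}],
    lr=lr, momentum=0.9, weight_decay=5e-4)
```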
The paper recommends training for 80000 iterations at the initial learning rate. Then, it is decayed by 90% (i.e. to a tenth) for an additional 20000 iterations, _twice_. With the paper's batch size of `32`, this means that the learning rate is decayed by 90% once at the 155th epoch and once more at the 194th epoch, and training is stopped at 232 epochs. I followed the same schedule.
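The iteration-to-epoch arithmetic can be checked directly, assuming the combined VOC 2007 + 2012 _trainval_ sets total 16,551 images:

```python
import math

n_images = 16551        # VOC 2007 trainval (5,011) + VOC 2012 trainval (11,540)
batch_size = 32
iters_per_epoch = n_images / batch_size  # about 517 iterations per epoch

# The paper's decay points and stopping point, expressed in epochs
first_decay = math.ceil(80000 / iters_per_epoch)    # epoch 155
second_decay = math.ceil(100000 / iters_per_epoch)  # epoch 194
stop = round(120000 / iters_per_epoch)              # about 232 epochs
```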
On a TitanX (Pascal), each epoch of training required about 6 minutes.
### Model checkpoint
We will use `calculate_mAP()` in [`utils.py`](https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Object-Detection/blob/master/utils.py) for this purpose. As is the norm, we will ignore _difficult_ detections in the mAP calculation. Nevertheless, it is important to include them in the evaluation dataset, because if the model does detect an object that is considered _difficult_, it must not be counted as a false positive.
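How _difficult_ ground truths affect the counts can be illustrated with a simplified, single-class matching routine (a sketch of the bookkeeping only, not the tutorial's `calculate_mAP()`):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def count_tp_fp(detections, gts, difficult, thresh=0.5):
    """Greedily match detections (score, box) to ground truths in
    descending score order. A detection that matches a difficult ground
    truth is simply ignored: it is neither a true nor a false positive."""
    matched = [False] * len(gts)
    tp = fp = 0
    for score, box in sorted(detections, reverse=True):
        best, best_i = 0.0, -1
        for i, g in enumerate(gts):
            o = iou(box, g)
            if o > best:
                best, best_i = o, i
        if best >= thresh:
            if difficult[best_i]:
                continue  # matched a difficult object: skip entirely
            if not matched[best_i]:
                matched[best_i] = True
                tp += 1
            else:
                fp += 1  # duplicate detection of an already-matched object
        else:
            fp += 1  # no ground truth overlaps enough
    return tp, fp
```

This is why difficult objects must stay in the evaluation data: removing them would turn correct detections of those objects into false positives.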
The model scores **77.2 mAP**, same as the result reported in the paper.
Class-wise average precisions (not scaled to 100) are listed below.
| Class | Average Precision |
| :-----: | :------: |
| _aeroplane_ | 0.7888 |
| _bicycle_ | 0.8352 |
| _bird_ | 0.7623 |
| _boat_ | 0.7218 |
| _bottle_ | 0.4598 |
| _bus_ | 0.8705 |
| _car_ | 0.8656 |
| _cat_ | 0.8829 |
| _chair_ | 0.5917 |
| _cow_ | 0.8256 |
| _diningtable_ | 0.7569 |
| _dog_ | 0.8563 |
| _horse_ | 0.8778 |
| _motorbike_ | 0.8317 |
| _person_ | 0.7884 |
| _pottedplant_ | 0.5072 |
| _sheep_ | 0.7937 |
| _sofa_ | 0.7998 |
| _train_ | 0.8656 |
| _tvmonitor_ | 0.7492 |
You can see that some objects, like bottles and potted plants, are considerably harder to detect than others.
The learning parameters at the top of [`train.py`](https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Object-Detection/blob/master/train.py) are –

```python
# Learning parameters
checkpoint = None  # path to model checkpoint, None if none
batch_size = 8  # batch size
iterations = 120000  # number of iterations to train
workers = 4  # number of workers for loading data in the DataLoader
print_freq = 200  # print training status every __ batches
lr = 1e-3  # learning rate
decay_lr_at = [80000, 100000]  # decay learning rate after these many iterations
decay_lr_to = 0.1  # decay learning rate to this fraction of the existing learning rate
momentum = 0.9  # momentum
weight_decay = 5e-4  # weight decay
grad_clip = None  # clip if gradients are exploding, which may happen at larger batch sizes (sometimes at 32) - you will recognize it by a sorting error in the MultiBox loss calculation
```
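Decaying the learning rate at the iterations in `decay_lr_at` then amounts to scaling each parameter group's `lr` in place. A minimal sketch of such a helper (the tutorial keeps a similar function in `utils.py`; the name here is illustrative):

```python
def adjust_learning_rate(optimizer, scale):
    """Scale the learning rate of every parameter group by `scale`.

    For example, scale=0.1 decays the learning rate by 90%, matching
    the decay_lr_to fraction above.
    """
    for param_group in optimizer.param_groups:
        param_group['lr'] = param_group['lr'] * scale
```

Because it scales per group, the doubled bias learning rate keeps its 2:1 ratio to the base rate after every decay.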