
Commit 091c967

Commit message: "changes"

1 parent 0d38943 · commit 091c967

5 files changed: +55 −144 lines


README.md

+27 −31
@@ -621,7 +621,7 @@ Specifically, you will need to download the following VOC datasets –
 
 - [2007 _test_](http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar) (451MB)
 
-Consistent with the paper, the two _trainval_ datasets are to be used for training, while the VOC 2007 _test_ will serve as our validation and testing data.
+Consistent with the paper, the two _trainval_ datasets are to be used for training, while the VOC 2007 _test_ will serve as our test data.
 
 Make sure you extract both the VOC 2007 _trainval_ and 2007 _test_ data to the same location, i.e. merge them.

@@ -712,7 +712,7 @@ As mentioned in the paper, these transformations play a crucial role in obtainin
 
 #### PyTorch DataLoader
 
-The `Dataset` described above, `PascalVOCDataset`, will be used by a PyTorch [`DataLoader`](https://pytorch.org/docs/master/data.html#torch.utils.data.DataLoader) in `train.py` to **create and feed batches of data to the model** for training or validation.
+The `Dataset` described above, `PascalVOCDataset`, will be used by a PyTorch [`DataLoader`](https://pytorch.org/docs/master/data.html#torch.utils.data.DataLoader) in `train.py` to **create and feed batches of data to the model** for training or evaluation.
 
 Since the number of objects varies across different images, their bounding boxes, labels, and difficulties cannot simply be stacked together in the batch. There would be no way of knowing which objects belong to which image.
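(Editorial aside: a minimal sketch of the kind of collate function described above, which stacks the images but keeps the variable-length annotations as plain lists. The repository's actual `collate_fn`, a method of `PascalVOCDataset`, may differ in detail, and the tuple layout assumed here is illustrative.)

```python
import torch

def collate_fn(batch):
    """
    batch: list of (image, boxes, labels, difficulties) tuples, one per image,
    as assumed to be returned by PascalVOCDataset.__getitem__.
    """
    images, boxes, labels, difficulties = [], [], [], []
    for image, b, l, d in batch:
        images.append(image)        # (3, 300, 300)
        boxes.append(b)             # (n_objects_i, 4), n_objects_i varies per image
        labels.append(l)            # (n_objects_i,)
        difficulties.append(d)      # (n_objects_i,)
    images = torch.stack(images, dim=0)  # (N, 3, 300, 300)
    # Boxes, labels, and difficulties stay as lists of tensors, so each entry
    # can still be matched to its image by index within the batch.
    return images, boxes, labels, difficulties
```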

@@ -788,7 +788,7 @@ The **Multibox Loss is the aggregate of these two losses**, combined in the rati
 
 # Training
 
-Before you begin, make sure to save the required data files for training and validation. To do this, run the contents of [`create_data_lists.py`](https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Object-Detection/blob/master/create_data_lists.py) after pointing it to the `VOC2007` and `VOC2012` folders in your [downloaded data](https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Object-Detection#download).
+Before you begin, make sure to save the required data files for training and evaluation. To do this, run the contents of [`create_data_lists.py`](https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Object-Detection/blob/master/create_data_lists.py) after pointing it to the `VOC2007` and `VOC2012` folders in your [downloaded data](https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Object-Detection#download).
 
 See [`train.py`](https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Object-Detection/blob/master/train.py).
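(Editorial aside: as a concrete example of the step mentioned in the added line above, running `create_data_lists.py` essentially amounts to a call like the following. The paths are placeholders for wherever you extracted the data; the function signature matches the `create_data_lists(voc07_path, voc12_path, output_folder)` definition shown in the `utils.py` diff further down.)

```python
from utils import create_data_lists

if __name__ == '__main__':
    # Placeholder paths - point these at the VOC2007 and VOC2012 folders you extracted
    create_data_lists(voc07_path='/path/to/VOCdevkit/VOC2007',
                      voc12_path='/path/to/VOCdevkit/VOC2012',
                      output_folder='./')
```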

@@ -800,8 +800,6 @@ To **train your model from scratch**, run this file –
 
 To **resume training at a checkpoint**, point to the corresponding file with the `checkpoint` parameter at the beginning of the code.
 
-Note that we perform validation at the end of every training epoch.
-
 ### Remarks
 
 In the paper, they recommend using **Stochastic Gradient Descent** in batches of `32` images, with an initial learning rate of `1e−3`, momentum of `0.9`, and `5e-4` weight decay.
@@ -810,11 +808,9 @@ I ended up using a batch size of `8` images for increased stability. If you find
 
 The authors also doubled the learning rate for bias parameters. As you can see in the code, this is easy to do in PyTorch, by passing [separate groups of parameters](https://pytorch.org/docs/stable/optim.html#per-parameter-options) to the `params` argument of its [SGD optimizer](https://pytorch.org/docs/stable/optim.html#torch.optim.SGD).
 
-The paper recommends training for 80000 iterations at the initial learning rate. Then, it is decayed by 90% for an additional 20000 iterations, _twice_. With the paper's batch size of `32`, this means that the learning rate is decayed by 90% once at the 155th epoch and once more at the 194th epoch, and training is stopped at 232 epochs.
-
-In practice, I just decayed the learning rate by 90% when the validation loss stopped improving for long periods. I resumed training at this reduced learning rate from the best checkpoint obtained thus far, not the most recent.
+The paper recommends training for 80000 iterations at the initial learning rate. Then, it is decayed by 90% (i.e. to a tenth) for an additional 20000 iterations, _twice_. With the paper's batch size of `32`, this means that the learning rate is decayed by 90% once at the 155th epoch and once more at the 194th epoch, and training is stopped at 232 epochs. I followed the same schedule.
 
-On a TitanX (Pascal), each epoch of training required about 6 minutes. My best checkpoint was from epoch 186, with a validation loss of `2.515`.
+On a TitanX (Pascal), each epoch of training required about 6 minutes.
 
 ### Model checkpoint
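(Editorial aside on the schedule in the hunk above: the quoted epoch numbers follow from a quick conversion of iterations to epochs. The ~16,551-image size of the combined VOC 2007+2012 _trainval_ set is an assumption here, not something stated in this commit.)

```python
n_train_images = 16551            # assumed size of VOC 2007+2012 trainval
batch_size = 32                   # the paper's batch size
iters_per_epoch = n_train_images // batch_size   # ~517 iterations per epoch

print(80000 // iters_per_epoch)   # ~154, i.e. the first decay lands around the 155th epoch
print(100000 // iters_per_epoch)  # ~193, i.e. the second decay lands around the 194th epoch
print(120000 // iters_per_epoch)  # ~232, i.e. training stops after roughly 232 epochs
```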

@@ -834,32 +830,32 @@ To begin evaluation, simply run the `evaluate()` function with the data-loader a
 
 We will use `calculate_mAP()` in [`utils.py`](https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Object-Detection/blob/master/utils.py) for this purpose. As is the norm, we will ignore _difficult_ detections in the mAP calculation. But nevertheless, it is important to include them in the evaluation dataset because if the model does detect an object that is considered to be _difficult_, it must not be counted as a false positive.
 
-The model scores **77.1 mAP**, against the 77.2 mAP reported in the paper.
+The model scores **77.2 mAP**, the same as the result reported in the paper.
 
-Class-wise average precisions are listed below.
+Class-wise average precisions (not scaled to 100) are listed below.
 
 | Class | Average Precision |
 | :-----: | :------: |
-| aeroplane | 78.9 |
-| bicycle | 83.7 |
-| bird | 76.9 |
-| boat | 72.0 |
-| bottle | 46.0 |
-| bus | 86.7 |
-| car | 86.9 |
-| cat | 89.2 |
-| chair | 59.6 |
-| cow | 82.7 |
-| diningtable | 75.2 |
-| dog | 85.6 |
-| horse | 87.4 |
-| motorbike | 82.9 |
-| person | 78.8 |
-| pottedplant | 50.3 |
-| sheep | 78.7 |
-| sofa | 80.5 |
-| train | 85.7 |
-| tvmonitor | 75.0 |
+| _aeroplane_ | 0.7887580990791321 |
+| _bicycle_ | 0.8351995348930359 |
+| _bird_ | 0.7623348236083984 |
+| _boat_ | 0.7218425273895264 |
+| _bottle_ | 0.45978495478630066 |
+| _bus_ | 0.8705356121063232 |
+| _car_ | 0.8655831217765808 |
+| _cat_ | 0.8828985095024109 |
+| _chair_ | 0.5917483568191528 |
+| _cow_ | 0.8255912661552429 |
+| _diningtable_ | 0.756867527961731 |
+| _dog_ | 0.856262743473053 |
+| _horse_ | 0.8778411149978638 |
+| _motorbike_ | 0.8316892385482788 |
+| _person_ | 0.7884440422058105 |
+| _pottedplant_ | 0.5071538090705872 |
+| _sheep_ | 0.7936667799949646 |
+| _sofa_ | 0.7998116612434387 |
+| _train_ | 0.8655905723571777 |
+| _tvmonitor_ | 0.7492395043373108 |
 
 You can see that some objects, like bottles and potted plants, are considerably harder to detect than others.
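(Editorial aside: an evaluation loop along the lines described above might be driven roughly as follows. It assumes a module-level `device` and the `calculate_mAP` helper from `utils.py`, as in `eval.py`; the `detect_objects` call and its thresholds are illustrative rather than authoritative.)

```python
import torch

@torch.no_grad()
def evaluate(test_loader, model):
    model.eval()
    det_boxes, det_labels, det_scores = [], [], []
    true_boxes, true_labels, true_difficulties = [], [], []

    for images, boxes, labels, difficulties in test_loader:
        images = images.to(device)
        predicted_locs, predicted_scores = model(images)

        # Decode and NMS-filter the raw predictions; thresholds here are assumptions
        boxes_b, labels_b, scores_b = model.detect_objects(
            predicted_locs, predicted_scores, min_score=0.01, max_overlap=0.45, top_k=200)

        det_boxes.extend(boxes_b)
        det_labels.extend(labels_b)
        det_scores.extend(scores_b)
        # Difficult objects stay in the ground truth, so detecting one is not
        # punished as a false positive; they are simply ignored in the AP計... they are ignored in the AP computation
        true_boxes.extend([b.to(device) for b in boxes])
        true_labels.extend([l.to(device) for l in labels])
        true_difficulties.extend([d.to(device) for d in difficulties])

    APs, mAP = calculate_mAP(det_boxes, det_labels, det_scores,
                             true_boxes, true_labels, true_difficulties)
    return APs, mAP
```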

detect.py

+2 −3

@@ -5,11 +5,10 @@
 device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
 
 # Load model checkpoint
-checkpoint = 'BEST_checkpoint_ssd300.pth.tar'
+checkpoint = 'checkpoint_ssd300.pth.tar'
 checkpoint = torch.load(checkpoint)
 start_epoch = checkpoint['epoch'] + 1
-best_loss = checkpoint['best_loss']
-print('\nLoaded checkpoint from epoch %d. Best loss so far is %.3f.\n' % (start_epoch, best_loss))
+print('\nLoaded checkpoint from epoch %d.\n' % start_epoch)
 model = checkpoint['model']
 model = model.to(device)
 model.eval()

eval.py

+1 −1

@@ -12,7 +12,7 @@
 batch_size = 64
 workers = 4
 device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
-checkpoint = './BEST_checkpoint_ssd300.pth.tar'
+checkpoint = './checkpoint_ssd300.pth.tar'
 
 # Load model checkpoint that is to be evaluated
 checkpoint = torch.load(checkpoint)

train.py

+20 −94
@@ -18,13 +18,12 @@
 # Learning parameters
 checkpoint = None # path to model checkpoint, None if none
 batch_size = 8 # batch size
-start_epoch = 0 # start at this epoch
-epochs = 200 # number of epochs to run without early-stopping
-epochs_since_improvement = 0 # number of epochs since there was an improvement in the validation metric
-best_loss = 100. # assume a high loss at first
+iterations = 120000 # number of iterations to train
 workers = 4 # number of workers for loading data in the DataLoader
-print_freq = 200 # print training or validation status every __ batches
+print_freq = 200 # print training status every __ batches
 lr = 1e-3 # learning rate
+decay_lr_at = [80000, 100000] # decay learning rate after these many iterations
+decay_lr_to = 0.1 # decay learning rate to this fraction of the existing learning rate
 momentum = 0.9 # momentum
 weight_decay = 5e-4 # weight decay
 grad_clip = None # clip if gradients are exploding, which may happen at larger batch sizes (sometimes at 32) - you will recognize it by a sorting error in the MultiBox loss calculation
@@ -34,12 +33,13 @@
 
 def main():
     """
-    Training and validation.
+    Training.
     """
-    global epochs_since_improvement, start_epoch, label_map, best_loss, epoch, checkpoint
+    global start_epoch, label_map, epoch, checkpoint, decay_lr_at
 
     # Initialize model or load checkpoint
     if checkpoint is None:
+        start_epoch = 0
         model = SSD300(n_classes=n_classes)
         # Initialize the optimizer, with twice the default learning rate for biases, as in the original Caffe repo
         biases = list()
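(Editorial aside: the context above cuts off at `biases = list()`; the parameter-group split it begins presumably continues along these lines. This is a sketch consistent with the README's description of doubling the bias learning rate, not a verbatim quote of the file.)

```python
biases = list()
not_biases = list()
for param_name, param in model.named_parameters():
    if param.requires_grad:
        if param_name.endswith('.bias'):
            biases.append(param)
        else:
            not_biases.append(param)
# Biases get twice the base learning rate, as in the original Caffe repo
optimizer = torch.optim.SGD(params=[{'params': biases, 'lr': 2 * lr},
                                    {'params': not_biases}],
                            lr=lr, momentum=momentum, weight_decay=weight_decay)
```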
@@ -56,9 +56,7 @@ def main():
     else:
         checkpoint = torch.load(checkpoint)
         start_epoch = checkpoint['epoch'] + 1
-        epochs_since_improvement = checkpoint['epochs_since_improvement']
-        best_loss = checkpoint['best_loss']
-        print('\nLoaded checkpoint from epoch %d. Best loss so far is %.3f.\n' % (start_epoch, best_loss))
+        print('\nLoaded checkpoint from epoch %d.\n' % start_epoch)
         model = checkpoint['model']
         optimizer = checkpoint['optimizer']

@@ -70,28 +68,22 @@
     train_dataset = PascalVOCDataset(data_folder,
                                      split='train',
                                      keep_difficult=keep_difficult)
-    val_dataset = PascalVOCDataset(data_folder,
-                                   split='test',
-                                   keep_difficult=keep_difficult)
     train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True,
                                                collate_fn=train_dataset.collate_fn, num_workers=workers,
                                                pin_memory=True) # note that we're passing the collate function here
-    val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=batch_size, shuffle=True,
-                                             collate_fn=val_dataset.collate_fn, num_workers=workers,
-                                             pin_memory=True)
+
+    # Calculate total number of epochs to train and the epochs to decay learning rate at (i.e. convert iterations to epochs)
+    # To convert iterations to epochs, divide iterations by the number of iterations per epoch
+    # The paper trains for 120,000 iterations with a batch size of 32, decays after 80,000 and 100,000 iterations
+    epochs = iterations // (len(train_dataset) // 32)
+    decay_lr_at = [it // (len(train_dataset) // 32) for it in decay_lr_at]
+
     # Epochs
     for epoch in range(start_epoch, epochs):
-        # Paper describes decaying the learning rate at the 80000th, 100000th, 120000th 'iteration', i.e. model update or batch
-        # The paper uses a batch size of 32, which means there were about 517 iterations in an epoch
-        # Therefore, to find the epochs to decay at, you could do,
-        # if epoch in {80000 // 517, 100000 // 517, 120000 // 517}:
-        # adjust_learning_rate(optimizer, 0.1)
-
-        # In practice, I just decayed the learning rate when loss stopped improving for long periods,
-        # and I would resume from the last best checkpoint with the new learning rate,
-        # since there's no point in resuming at the most recent and significantly worse checkpoint.
-        # So, when you're ready to decay the learning rate, just set checkpoint = 'BEST_checkpoint_ssd300.pth.tar' above
-        # and have adjust_learning_rate(optimizer, 0.1) BEFORE this 'for' loop
+
+        # Decay learning rate at particular epochs
+        if epoch in decay_lr_at:
+            adjust_learning_rate(optimizer, decay_lr_to)
 
         # One epoch's training
         train(train_loader=train_loader,
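(Editorial aside on the `adjust_learning_rate(optimizer, decay_lr_to)` call added in this hunk: such a helper just rescales each parameter group's learning rate in place. A minimal sketch follows; the repository's own version lives in `utils.py` and may differ in detail.)

```python
def adjust_learning_rate(optimizer, scale):
    """Multiply the learning rate of every parameter group by `scale` (e.g. 0.1)."""
    for param_group in optimizer.param_groups:
        param_group['lr'] = param_group['lr'] * scale
    print("DECAYING learning rate: new LRs are %s" % [g['lr'] for g in optimizer.param_groups])
```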
@@ -100,24 +92,8 @@ def main():
               optimizer=optimizer,
               epoch=epoch)
 
-        # One epoch's validation
-        val_loss = validate(val_loader=val_loader,
-                            model=model,
-                            criterion=criterion)
-
-        # Did validation loss improve?
-        is_best = val_loss < best_loss
-        best_loss = min(val_loss, best_loss)
-
-        if not is_best:
-            epochs_since_improvement += 1
-            print("\nEpochs since last improvement: %d\n" % (epochs_since_improvement,))
-
-        else:
-            epochs_since_improvement = 0
-
         # Save checkpoint
-        save_checkpoint(epoch, epochs_since_improvement, model, optimizer, val_loss, best_loss, is_best)
+        save_checkpoint(epoch, model, optimizer)
 
 
 def train(train_loader, model, criterion, optimizer, epoch):
@@ -180,55 +156,5 @@ def train(train_loader, model, criterion, optimizer, epoch):
         del predicted_locs, predicted_scores, images, boxes, labels # free some memory since their histories may be stored
 
 
-def validate(val_loader, model, criterion):
-    """
-    One epoch's validation.
-
-    :param val_loader: DataLoader for validation data
-    :param model: model
-    :param criterion: MultiBox loss
-    :return: average validation loss
-    """
-    model.eval() # eval mode disables dropout
-
-    batch_time = AverageMeter()
-    losses = AverageMeter()
-
-    start = time.time()
-
-    # Prohibit gradient computation explicitly because I had some problems with memory
-    with torch.no_grad():
-        # Batches
-        for i, (images, boxes, labels, difficulties) in enumerate(val_loader):
-
-            # Move to default device
-            images = images.to(device) # (N, 3, 300, 300)
-            boxes = [b.to(device) for b in boxes]
-            labels = [l.to(device) for l in labels]
-
-            # Forward prop.
-            predicted_locs, predicted_scores = model(images) # (N, 8732, 4), (N, 8732, n_classes)
-
-            # Loss
-            loss = criterion(predicted_locs, predicted_scores, boxes, labels)
-
-            losses.update(loss.item(), images.size(0))
-            batch_time.update(time.time() - start)
-
-            start = time.time()
-
-            # Print status
-            if i % print_freq == 0:
-                print('[{0}/{1}]\t'
-                      'Batch Time {batch_time.val:.3f} ({batch_time.avg:.3f})\t'
-                      'Loss {loss.val:.4f} ({loss.avg:.4f})\t'.format(i, len(val_loader),
-                                                                      batch_time=batch_time,
-                                                                      loss=losses))
-
-    print('\n * LOSS - {loss.avg:.3f}\n'.format(loss=losses))
-
-    return losses.avg
-
-
 if __name__ == '__main__':
     main()

utils.py

+5 −15
@@ -93,12 +93,12 @@ def create_data_lists(voc07_path, voc12_path, output_folder):
     print('\nThere are %d training images containing a total of %d objects. Files have been saved to %s.' % (
         len(train_images), n_objects, os.path.abspath(output_folder)))
 
-    # Validation data
+    # Test data
     test_images = list()
    test_objects = list()
     n_objects = 0
 
-    # Find IDs of images in validation data
+    # Find IDs of images in the test data
     with open(os.path.join(voc07_path, 'ImageSets/Main/test.txt')) as f:
         ids = f.read().splitlines()
 
@@ -119,7 +119,7 @@ def create_data_lists(voc07_path, voc12_path, output_folder):
     with open(os.path.join(output_folder, 'TEST_objects.json'), 'w') as j:
         json.dump(test_objects, j)
 
-    print('\nThere are %d validation images containing a total of %d objects. Files have been saved to %s.' % (
+    print('\nThere are %d test images containing a total of %d objects. Files have been saved to %s.' % (
         len(test_images), n_objects, os.path.abspath(output_folder)))
 

@@ -602,7 +602,7 @@ def transform(image, boxes, labels, difficulties, split):
     new_boxes = boxes
     new_labels = labels
     new_difficulties = difficulties
-    # Skip the following operations if validation/evaluation
+    # Skip the following operations for evaluation/testing
     if split == 'TRAIN':
         # A series of photometric distortions in random order, each with 50% chance of occurrence, as in Caffe repo
         new_image = photometric_distort(new_image)
@@ -666,29 +666,19 @@ def accuracy(scores, targets, k):
     return correct_total.item() * (100.0 / batch_size)
 
 
-def save_checkpoint(epoch, epochs_since_improvement, model, optimizer, loss, best_loss, is_best):
+def save_checkpoint(epoch, model, optimizer):
     """
     Save model checkpoint.
 
     :param epoch: epoch number
-    :param epochs_since_improvement: number of epochs since last improvement
     :param model: model
     :param optimizer: optimizer
-    :param loss: validation loss in this epoch
-    :param best_loss: best validation loss achieved so far (not necessarily in this checkpoint)
-    :param is_best: is this checkpoint the best so far?
     """
     state = {'epoch': epoch,
-             'epochs_since_improvement': epochs_since_improvement,
-             'loss': loss,
-             'best_loss': best_loss,
              'model': model,
             'optimizer': optimizer}
     filename = 'checkpoint_ssd300.pth.tar'
     torch.save(state, filename)
-    # If this checkpoint is the best so far, store a copy so it doesn't get overwritten by a worse checkpoint
-    if is_best:
-        torch.save(state, 'BEST_' + filename)
 
 
 class AverageMeter(object):
