strong_baseline: late in training, acc suddenly drops to 0 and the loss becomes NaN #43

@Tidewww

Epoch: [12][ 380/1161] Time 0.496 (0.498) Acc@1 54.69% (60.06%) cross_entropy 3.617 (4.314) softmax_triplet 2.303 (3.466)
Epoch: [12][ 390/1161] Time 0.493 (0.498) Acc@1 57.03% (60.02%) cross_entropy 3.469 (4.292) softmax_triplet 1.859 (3.471)
Epoch: [12][ 400/1161] Time 0.501 (0.498) Acc@1 56.25% (59.95%) cross_entropy 3.451 (4.274) softmax_triplet 4.916 (3.471)
Epoch: [12][ 410/1161] Time 0.487 (0.498) Acc@1 53.12% (59.85%) cross_entropy 3.627 (4.257) softmax_triplet 3.050 (3.438)
Epoch: [12][ 420/1161] Time 0.489 (0.498) Acc@1 54.69% (59.79%) cross_entropy 3.693 (4.240) softmax_triplet 5.153 (3.429)
Epoch: [12][ 430/1161] Time 0.505 (0.498) Acc@1 58.59% (59.72%) cross_entropy 3.444 (4.225) softmax_triplet 1.498 (3.448)
Epoch: [12][ 440/1161] Time 0.488 (0.498) Acc@1 57.03% (59.68%) cross_entropy 3.482 (4.208) softmax_triplet 5.507 (3.431)
Epoch: [12][ 450/1161] Time 0.478 (0.498) Acc@1 60.94% (59.59%) cross_entropy 3.388 (4.195) softmax_triplet 0.360 (3.432)
Epoch: [12][ 460/1161] Time 0.487 (0.498) Acc@1 51.56% (59.52%) cross_entropy 3.739 (4.181) softmax_triplet 2.203 (3.410)
Epoch: [12][ 470/1161] Time 0.185 (0.493) Acc@1 0.00% (58.61%) cross_entropy nan (nan) softmax_triplet nan (nan)
Epoch: [12][ 480/1161] Time 0.182 (0.487) Acc@1 0.00% (57.40%) cross_entropy nan (nan) softmax_triplet nan (nan)
Epoch: [12][ 490/1161] Time 0.191 (0.481) Acc@1 0.00% (56.23%) cross_entropy nan (nan) softmax_triplet nan (nan)
Epoch: [12][ 500/1161] Time 1.477 (0.477) Acc@1 0.00% (55.11%) cross_entropy nan (nan) softmax_triplet nan (nan)
Epoch: [12][ 510/1161] Time 0.186 (0.472) Acc@1 0.00% (54.03%) cross_entropy nan (nan) softmax_triplet nan (nan)
Epoch: [12][ 520/1161] Time 0.192 (0.466) Acc@1 0.00% (52.99%) cross_entropy nan (nan) softmax_triplet nan (nan)
Epoch: [12][ 530/1161] Time 0.181 (0.461) Acc@1 0.00% (52.00%) cross_entropy nan (nan) softmax_triplet nan (nan)
Epoch: [12][ 540/1161] Time 0.191 (0.456) Acc@1 0.00% (51.03%) cross_entropy nan (nan) softmax_triplet nan (nan)
Epoch: [12][ 550/1161] Time 0.196 (0.451) Acc@1 0.00% (50.11%) cross_entropy nan (nan) softmax_triplet nan (nan)

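  • As shown above, when training strong_baseline on a single GPU, acc suddenly drops to 0 and both losses become NaN.

  • The same thing happened earlier when training Market-to-Duke: on a single GPU, at around epoch 49, acc suddenly dropped to 0 and the loss became NaN for a few iterations in the middle of the epoch.

  • This did not happen when training on four GPUs. What could be the reason?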
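
To pin down exactly where a run goes bad, it can help to scan the saved log for the first iteration whose current loss values are no longer finite. Below is a minimal sketch, assuming the log lines follow the format pasted above. The helper name `first_non_finite` and the regular expression are illustrative only and are not part of the repository.

```python
import math
import re

# Assumed line format, taken from the log above, e.g.
# "Epoch: [12][ 470/1161] ... cross_entropy nan (nan) softmax_triplet nan (nan)"
LINE_RE = re.compile(
    r"Epoch: \[(\d+)\]\[\s*(\d+)/(\d+)\].*?"
    r"cross_entropy (\S+) \(\S+\)\s+softmax_triplet (\S+) \(\S+\)"
)


def first_non_finite(log_lines):
    """Return (epoch, iteration) of the first logged line whose current
    cross_entropy or softmax_triplet value is NaN or inf, else None."""
    for line in log_lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a training-progress line
        epoch, iteration = int(match.group(1)), int(match.group(2))
        # float() accepts "nan" and "inf" as well as ordinary decimals
        losses = (float(match.group(4)), float(match.group(5)))
        if not all(math.isfinite(value) for value in losses):
            return epoch, iteration
    return None


# Two lines copied from the log above
sample = [
    "Epoch: [12][ 460/1161] Time 0.487 (0.498) Acc@1 51.56% (59.52%) "
    "cross_entropy 3.739 (4.181) softmax_triplet 2.203 (3.410)",
    "Epoch: [12][ 470/1161] Time 0.185 (0.493) Acc@1 0.00% (58.61%) "
    "cross_entropy nan (nan) softmax_triplet nan (nan)",
]
print(first_non_finite(sample))  # (12, 470)
```

Since the log is only printed every 10 iterations, this narrows the failure to a window (here, somewhere after iteration 460 and no later than 470) rather than a single step.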
