Is there a specific reason you chose NVIDIA Apex AMP over PyTorch's native AMP (`torch.cuda.amp`)? Have you tried the native implementation?
I tried training poolformer_s12 and poolformer_s24 with solo-learn: with native fp16 the loss goes to NaN after a few epochs, while with fp32 training is stable. Did you observe similar behavior?
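Just to pin down what I mean by "native AMP", here is a minimal sketch of a `torch.cuda.amp` training loop with a stand-in model and random data; my actual runs go through solo-learn's trainer, so the model, data, and hyperparameters below are placeholders, not my real setup:

```python
import torch
import torch.nn as nn
from torch.cuda.amp import autocast, GradScaler

device = "cuda"
# Stand-in model for poolformer_s12 (placeholder, not the real architecture)
model = nn.Sequential(
    nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.GELU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 100),
).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scaler = GradScaler()  # dynamic loss scaling

for step in range(10):
    images = torch.randn(8, 3, 224, 224, device=device)   # dummy batch
    targets = torch.randint(0, 100, (8,), device=device)
    optimizer.zero_grad(set_to_none=True)
    with autocast():                       # fp16 autocast region
        loss = criterion(model(images), targets)
    scaler.scale(loss).backward()          # backward on the scaled loss
    scaler.step(optimizer)                 # unscales grads, skips step on inf/NaN
    scaler.update()
```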
On a side note, can you provide the implementation and the hyperparameters for the hybrid stage [Pool, Pool, Attention, Attention]? It seems very interesting!
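To clarify what I mean, below is my rough guess at the structure: pooling token mixers in the first two stages and plain self-attention in the last two, wrapped in standard MetaFormer blocks. This is only a sketch under my own assumptions (patch embedding / downsampling between stages omitted, dims and depths are placeholders), not your actual implementation, which is exactly what I'd love to see:

```python
import torch
import torch.nn as nn

class Pooling(nn.Module):
    """PoolFormer-style token mixer: average pooling minus identity."""
    def __init__(self, pool_size=3):
        super().__init__()
        self.pool = nn.AvgPool2d(pool_size, stride=1,
                                 padding=pool_size // 2, count_include_pad=False)
    def forward(self, x):                  # x: [B, C, H, W]
        return self.pool(x) - x

class Attention(nn.Module):
    """Multi-head self-attention on flattened tokens (my assumption for the attention stages)."""
    def __init__(self, dim, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
    def forward(self, x):                  # x: [B, C, H, W]
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)          # [B, HW, C]
        out, _ = self.attn(tokens, tokens, tokens)
        return out.transpose(1, 2).reshape(b, c, h, w)

class Block(nn.Module):
    """MetaFormer block: norm -> token mixer -> residual, norm -> MLP -> residual."""
    def __init__(self, dim, mixer):
        super().__init__()
        self.norm1, self.norm2 = nn.GroupNorm(1, dim), nn.GroupNorm(1, dim)
        self.mixer = mixer
        self.mlp = nn.Sequential(nn.Conv2d(dim, dim * 4, 1), nn.GELU(),
                                 nn.Conv2d(dim * 4, dim, 1))
    def forward(self, x):
        x = x + self.mixer(self.norm1(x))
        return x + self.mlp(self.norm2(x))

# Hypothetical stage spec: [Pool, Pool, Attention, Attention]
# (dims and depths are placeholders, not your hyperparameters)
dims, depths, mixers = [64, 128, 320, 512], [2, 2, 6, 2], ["pool", "pool", "attn", "attn"]
stages = nn.ModuleList([
    nn.Sequential(*[Block(d, Pooling() if m == "pool" else Attention(d))
                    for _ in range(n)])
    for d, n, m in zip(dims, depths, mixers)
])
```

In particular I'm curious about the training hyperparameters (learning rate, drop path, etc.) you used for this hybrid variant.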