F1 Score for Structured/Beer on paper can't be reproduced #30

junwei-h · 2022-12-16T22:01:52Z

Hello,
I ran the code on Windows without a GPU, with --fp16 turned off:
python.exe .\train_ditto.py '--task' 'Structured/Beer' '--batch_size' '32' '--max_len' '256' '--lr' '3e-5' '--n_epochs' '40' '--lm' 'roberta' '--da' 'del' '--dk' 'product' '--summarize'

RoBERTa is from https://huggingface.co/roberta-base

The paper says "We use the base uncased variant of each model in all our experiments". RoBERTa is case-sensitive.
So which uncased variant of RoBERTa was used?
Also which uncased variant of XLNet was used?

The paper reports an F1 of 94.34% for this dataset; the best I got is
epoch 14: dev_f1=0.896551724137931, f1=0.8666666666666666, best_f1=0.9032258064516129
Any suggestions on what may be causing the lower score?

Thank you.

Here is the output:
step: 0, loss: 0.5871710777282715
epoch 1: dev_f1=0.37931034482758624, f1=0.36666666666666664, best_f1=0.36666666666666664
step: 0, loss: 0.2969485819339752
epoch 2: dev_f1=0.2745098039215686, f1=0.2692307692307693, best_f1=0.36666666666666664
step: 0, loss: 0.2463674694299698
epoch 3: dev_f1=0.32558139534883723, f1=0.32499999999999996, best_f1=0.36666666666666664
step: 0, loss: 0.5062930583953857
epoch 4: dev_f1=0.32558139534883723, f1=0.32499999999999996, best_f1=0.36666666666666664
step: 0, loss: 0.2536587119102478
epoch 5: dev_f1=0.4117647058823529, f1=0.36923076923076925, best_f1=0.36923076923076925
step: 0, loss: 0.3347562551498413
epoch 6: dev_f1=0.6923076923076924, f1=0.6470588235294117, best_f1=0.6470588235294117
step: 0, loss: 0.3830795884132385
epoch 7: dev_f1=0.8275862068965518, f1=0.6666666666666665, best_f1=0.6666666666666665
step: 0, loss: 0.27009156346321106
epoch 8: dev_f1=0.8387096774193549, f1=0.9333333333333333, best_f1=0.9333333333333333
step: 0, loss: 0.13321542739868164
epoch 9: dev_f1=0.8666666666666666, f1=0.9032258064516129, best_f1=0.9032258064516129
step: 0, loss: 0.024025270715355873
epoch 10: dev_f1=0.8666666666666666, f1=0.9032258064516129, best_f1=0.9032258064516129
step: 0, loss: 0.0391874834895134
epoch 11: dev_f1=0.896551724137931, f1=0.9032258064516129, best_f1=0.9032258064516129
step: 0, loss: 0.00302126444876194
epoch 12: dev_f1=0.8387096774193549, f1=0.9032258064516129, best_f1=0.9032258064516129
step: 0, loss: 0.06331554800271988
epoch 13: dev_f1=0.8666666666666666, f1=0.9032258064516129, best_f1=0.9032258064516129
step: 0, loss: 0.026920529082417488
epoch 14: dev_f1=0.896551724137931, f1=0.8666666666666666, best_f1=0.9032258064516129
step: 0, loss: 0.023745562881231308
epoch 15: dev_f1=0.8666666666666666, f1=0.9032258064516129, best_f1=0.9032258064516129
step: 0, loss: 0.012241823598742485
epoch 16: dev_f1=0.8666666666666666, f1=0.9032258064516129, best_f1=0.9032258064516129
step: 0, loss: 0.0017187324119731784
epoch 17: dev_f1=0.8666666666666666, f1=0.9032258064516129, best_f1=0.9032258064516129
step: 0, loss: 0.0006802910938858986
epoch 18: dev_f1=0.8484848484848484, f1=0.8484848484848484, best_f1=0.9032258064516129
step: 0, loss: 0.0009096315479837358
epoch 19: dev_f1=0.8387096774193549, f1=0.9032258064516129, best_f1=0.9032258064516129
step: 0, loss: 0.0005167351919226348
epoch 20: dev_f1=0.8387096774193549, f1=0.9032258064516129, best_f1=0.9032258064516129
step: 0, loss: 0.0003216741606593132
epoch 21: dev_f1=0.8387096774193549, f1=0.9032258064516129, best_f1=0.9032258064516129
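
A note on reading the log above: best_f1 does not look like the maximum test F1 seen so far. It drops from 0.9333 at epoch 8 to 0.9032 at epoch 9, exactly when dev_f1 improves, so it appears to be the test F1 at the epoch with the highest dev F1 so far. That is inferred from the log only, not checked against train_ditto.py. Below is a minimal Python sketch under that assumption. The helper name `summarize` and the `SAMPLE` excerpt are mine, for illustration; the sample lines are copied from the output above.

```python
import re

# Matches lines such as:
# epoch 8: dev_f1=0.8387096774193549, f1=0.9333333333333333, best_f1=0.9333333333333333
LINE_RE = re.compile(
    r"epoch (\d+): dev_f1=([0-9.]+), f1=([0-9.]+), best_f1=([0-9.]+)"
)

# Excerpt of the training output posted above (epochs 7 to 11).
SAMPLE = """\
epoch 7: dev_f1=0.8275862068965518, f1=0.6666666666666665, best_f1=0.6666666666666665
epoch 8: dev_f1=0.8387096774193549, f1=0.9333333333333333, best_f1=0.9333333333333333
epoch 9: dev_f1=0.8666666666666666, f1=0.9032258064516129, best_f1=0.9032258064516129
epoch 10: dev_f1=0.8666666666666666, f1=0.9032258064516129, best_f1=0.9032258064516129
epoch 11: dev_f1=0.896551724137931, f1=0.9032258064516129, best_f1=0.9032258064516129
"""


def summarize(log_text):
    """Hypothetical helper: recompute the model-selection result from a log.

    Assumption: the reported test F1 is the one measured at the epoch where
    dev F1 strictly improves on the previous best (ties keep the earlier epoch).
    """
    best_dev = -1.0
    best_epoch = None
    test_at_best_dev = None
    max_test = -1.0
    for match in LINE_RE.finditer(log_text):
        epoch = int(match.group(1))
        dev_f1 = float(match.group(2))
        test_f1 = float(match.group(3))
        # Track the raw maximum test F1 separately, for comparison only.
        max_test = max(max_test, test_f1)
        if dev_f1 > best_dev:
            best_dev = dev_f1
            best_epoch = epoch
            test_at_best_dev = test_f1
    return {
        "best_epoch": best_epoch,
        "best_dev_f1": best_dev,
        "test_at_best_dev": test_at_best_dev,
        "max_test_f1": max_test,
    }


summary = summarize(SAMPLE)
print(summary)
```

On this excerpt the selected epoch is 11, and the test F1 at that epoch (0.9032) is lower than the highest test F1 in the log (0.9333 at epoch 8). So part of the gap to the paper's number may depend on which of the two values is being compared.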
