adjust aapb-collaboration-27-e batch #105

Open
keighrim opened this issue Nov 26, 2024 · 1 comment
Comments


keighrim commented Nov 26, 2024

As discussed in a recent meeting (and reported at https://github.com/clamsproject/aapb-collaboration/issues/27#issuecomment-2501578284), we might need to split the 27-e batch, which currently contains two sets of GUIDs from the "challenging images" annotation (namely the bm set and the pbd set), so that we can fully reserve the pbd set as the "test" set moving forward, one that can be fed to a scripted evaluation process without additional manual intervention.

However, on the training side, I currently use the pbd set as a validation set while training SWT models for the 7.1 release, and keep the 27-d batch for testing both timepoint-wise and timeframe-wise, since the 27-d batch (unlike the pbd set) is annotated "sequentially", which enables natural conversion from point annotations to timeframe annotations. Re-purposing the pbd set as the future test set may mean we need¹ either

  • a new validation set (a.k.a. a "dev" set, used for evaluation-during-training for hyperparameter tuning), or
  • to go back to k-fold validation
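To illustrate the second option, here is a minimal sketch of a GUID-level k-fold split, where the dev set rotates through the training GUIDs while the pbd set stays fully held out as the test set. All names and the fold count here are hypothetical, not taken from the actual batch configuration:

```python
import random

def kfold(guids, k, seed=42):
    """Yield (train, dev) GUID lists for each of k folds.

    Shuffles once with a fixed seed for reproducibility, then
    assigns every GUID to exactly one dev fold.
    """
    order = guids[:]
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    for i in range(k):
        dev = folds[i]
        train = [g for j, fold in enumerate(folds) if j != i for g in fold]
        yield train, dev

# placeholder GUIDs standing in for the training-side batch
guids = [f"guid-{i:02d}" for i in range(20)]
for i, (train, dev) in enumerate(kfold(guids, k=5)):
    print(f"fold {i}: {len(train)} train / {len(dev)} dev")
```

The upside over a fixed dev set is that no GUIDs are permanently lost to validation; the downside is k training runs per hyperparameter setting.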

I'd like to hear any preferences, suggestions, or other ideas about this, especially from @owencking and @marcverhagen.

Footnotes

  1. https://datascience.stackexchange.com/questions/18339/why-use-both-validation-set-and-test-set

@keighrim
Member Author

On second thought, maybe I misunderstood our discussion during the meeting yesterday, and it was actually about keeping the validation set (pbd) from being re-used for training after hyperparameter tuning. If that's the case, I think there's no need to re-configure the data split (though we should expect some loss in model performance due to the smaller number of training examples).
