Plot training calibration, PR, and ROC curves; logging of label breakdown and number of epochs #332 #340
Conversation
Comments are mostly about the handling of the generator workers; we should be consistent and handle both the training and validation generators. Also, let's get a review from @lucidtronix or @ndiamant.
ml4cvd/models.py (outdated)
@@ -1016,6 +1016,7 @@ def train_model_from_generators(
     inspect_show_labels: bool,
     return_history: bool = False,
     plot: bool = True,
+    plot_train_curves: bool = False
Adding the argument here as `plot_train_curves` implies this function will do the plotting or have some functionality relating to it; instead it's just deferring the worker management to the caller of the function. Maybe we can rename this argument of `train_model_from_generators` to `defer_worker_halt` or something similar?
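A minimal sketch of that suggestion, with assumed names (`MultiprocessGenerator`, `kill_workers`) standing in for ml4cvd's real generator classes: the flag only decides who halts the workers, so a caller that still needs the training generator for plotting can opt to clean up both generators itself.

```python
class MultiprocessGenerator:
    """Toy stand-in for a batch generator backed by worker processes."""
    def __init__(self, name):
        self.name = name
        self.workers_alive = True

    def kill_workers(self):
        self.workers_alive = False
        print(f"halted workers for {self.name}")


def train_model_from_generators(generate_train, generate_valid, defer_worker_halt=False):
    # ... training loop would run here ...
    if not defer_worker_halt:
        # Handle both generators consistently, per the review comment above.
        generate_train.kill_workers()
        generate_valid.kill_workers()


train_gen, valid_gen = MultiprocessGenerator("train"), MultiprocessGenerator("valid")
train_model_from_generators(train_gen, valid_gen, defer_worker_halt=True)
# The caller plots training curves while train_gen is still alive, then halts both:
train_gen.kill_workers()
valid_gen.kill_workers()
```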
ml4cvd/recipes.py (outdated)
    out_path = os.path.join(args.output_folder, args.id + '/')
    test_data, test_labels, test_paths = big_batch_from_minibatch_generator(generate_test, args.test_steps)
    train_data, train_labels = big_batch_from_minibatch_generator(generate_train, args.training_steps)
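For context on the question that follows, a minimal sketch of what a big-batch helper like this does, assuming fixed-shape numpy minibatches (the real function also returns labels and, for the test split, sample paths):

```python
import numpy as np

def big_batch_from_minibatch_generator(generator, steps):
    """Pull `steps` minibatches and stack them into one array along the sample axis."""
    return np.concatenate([next(generator) for _ in range(steps)], axis=0)

def toy_ecg_generator(batch_size=128, window=2500, leads=12):
    while True:
        yield np.zeros((batch_size, window, leads), dtype=np.float32)

big = big_batch_from_minibatch_generator(toy_ecg_generator(), steps=16)
print(big.shape)  # (2048, 2500, 12), matching the test-split shape discussed below
```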
What is the size of the big_batch returned here? training_steps is usually a lot larger than test_steps.
train big_batch shape = (25600, 2500, 12), as opposed to (2048, 2500, 12) for test
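(For scale: assuming a batch size of 128, which is an assumption not stated in this thread, those shapes work out to 200 training steps × 128 = 25,600 samples versus 16 test steps × 128 = 2,048, so the training big batch is 12.5× larger and is materialized in memory all at once.)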
Is there a need to return `train_paths`?
I don't think we need to return it since it isn't used in plotting?
It is optional; if provided, it will be used to label outliers.
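A minimal sketch (not the ml4cvd plotting API; all names here are illustrative assumptions) of how optional sample paths can label outliers: annotate the points with the largest |prediction − truth| using the path of the underlying sample.

```python
import numpy as np
import matplotlib.pyplot as plt

def scatter_with_outlier_labels(predictions, truth, paths=None, top_k=3):
    fig, ax = plt.subplots()
    ax.scatter(truth, predictions, s=8)
    if paths is not None:
        residuals = np.abs(predictions - truth)
        for i in np.argsort(residuals)[-top_k:]:  # indices of the k largest residuals
            ax.annotate(paths[i], (truth[i], predictions[i]), fontsize=6)
    return fig

rng = np.random.default_rng(0)
truth = rng.normal(size=50)
preds = truth + rng.normal(scale=0.2, size=50)
fig = scatter_with_outlier_labels(preds, truth, paths=[f"sample_{i}.hd5" for i in range(50)])
fig.savefig("outliers.png")
```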
Awesome! Please remove the two extra files below, though, and also request a review from a Broadie:
._.DS_Store
ml4cvd/._models.py
This repo has a `.gitignore`. Also, ensure […]
Those aren't currently in the codebase (https://github.com/broadinstitute/ml/blob/106a1cca25a1e2f68cc4db99658ff38d4f5a94ce/.gitignore). I wonder what's generating the `._*` files.
It seems like they're metadata files created by macOS that get separated out when I push (I can't see them on my computer, but I see them on the GitHub website)?
Would recommend doing […] before you add and commit to git; can also do […]. We could also just add […] to the `.gitignore`.
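Though the exact suggestion is elided above, one common fix (an assumption, not the thread's verbatim recommendation) is to ignore the macOS metadata patterns repo-wide:

```gitignore
# macOS Finder / AppleDouble metadata (assumed additions; not yet in the repo's .gitignore)
.DS_Store
._*
```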
Closing this issue for now. I've merged the changes into a separate fork […].
resolves #332
resolves #216
Added argument `plot_train_curves` (defaults to `False`) to plot calibration, PR, and ROC curves for the training set.
Reports the label breakdown for train/validation/test at the end of train mode.
Reports the number of epochs completed.
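As a rough illustration of the breakdown logging (a sketch assuming one one-hot label array per split, not ml4cvd's actual TensorMap-keyed label structures):

```python
import logging
import numpy as np

logging.basicConfig(level=logging.INFO, format="%(message)s")

def log_label_breakdown(split_name, one_hot_labels):
    # Column sums of a one-hot matrix give per-class example counts.
    counts = one_hot_labels.sum(axis=0).astype(int)
    logging.info(f"{split_name} label breakdown: {counts.tolist()} examples per class")

rng = np.random.default_rng(0)
for split, n in (("train", 25600), ("valid", 2048), ("test", 2048)):
    log_label_breakdown(split, np.eye(2)[rng.integers(0, 2, size=n)])
```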