Use Botorch MultiTaskGP for transfer learning #549
base: main
Conversation
Force-pushed from 8fee382 to 88e1dfe
Hi @Hrovatin, here is the first batch of comments.
@Hrovatin would you consider abandoning this PR? I think if this topic is picked up again it's better to start afresh (and only open a PR after investigations have concluded).
@Scienfitz I would keep it open, as the main blocker for this was randomness in benchmarks. Since that may be solved now, I would suggest running the benchmarks again on the new HPC (need to confirm it is also reproducible there).
@Hrovatin any update?
No, I need to first set up testing on one HPC to benchmark reproducibly, as that seems to be the only option to make it fully reproducible. I will post an update here once I have the results @Scienfitz
Force-pushed from 8ce5fba to bee32aa
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
@AdrianSosic @Scienfitz @AVHopp Update on the comparison of MultiTask GP from botorch and current kernel:
[benchmark comparison plot]
First round of comments, but we should discuss some of the points (in particular the one regarding multiple active values) internally first.
Force-pushed from de81707 to 68a9c24
I would be willing to approve; however, since this is technically my PR, I can't.
Force-pushed from 5cfb366 to 7bb49d9
Let's do the final check referenced here and merge if everything is alright.
Co-authored-by: AdrianSosic <[email protected]>
Co-authored-by: Alexander V. Hopp <[email protected]>
Force-pushed from 953b609 to 6c3dd93
@AdrianSosic even after the new rebase, this branch and main still differ, even in the naive case (when no TaskParam/Kernel is used).
This PR branch at its last commit:
[benchmark plot]
@AVHopp and @AdrianSosic the issue is indeed the handling of active dimensions in the base kernel (as speculated here). Changing the active dims as in this branch ensures that in the naive case the outcome matches the main branch exactly, while in the non-naive case it still stays as it was in this branch. These are equality analyses for Michalewicz (as that one was usually the most obviously discrepant); output True means equal:
covar_module = kernel.to_gpytorch(
    ard_num_dims=kernel_num_dims,
    batch_shape=batch_shape,
    active_dims=tuple(range(kernel_num_dims)),
Interesting 🤔 Can you explain how exactly I have to understand these prints?
> @AVHopp and @AdrianSosic the issue is indeed the handling of active dimensions in the base kernel (as speculated here). Changing the active dims as in this branch ensures that in the naive case the outcome matches the main branch exactly, while in the non-naive case it still stays as it was in this branch. These are equality analyses for Michalewicz (as that one was usually the most obviously discrepant); output True means equal:
>
> tl-benchmarking-investigation-activeDims and main in naive True: True
> tl-benchmarking-investigation-activeDims and main in naive False: False
> tl-benchmarking-investigation-activeDims and tl-benchmarking-investigation in naive True: False
> tl-benchmarking-investigation-activeDims and tl-benchmarking-investigation in naive False: True
> main and tl-benchmarking-investigation in naive True: False
> main and tl-benchmarking-investigation in naive False: False
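(Editor's aside, not from the thread: one hypothetical way such exact-equality prints could be produced, assuming each branch's benchmark run is exported as a CSV with a CumBest column; the file names and column name below are made up.)

```python
# Hypothetical equality check between two benchmark runs; the paths and the
# "CumBest" column name are assumptions, not taken from the repository.
import pandas as pd


def cumbest_equal(path_a: str, path_b: str) -> bool:
    """Return True if both runs produced exactly the same cumulative-best trajectory."""
    a = pd.read_csv(path_a)["CumBest"]
    b = pd.read_csv(path_b)["CumBest"]
    return a.equals(b)  # exact element-wise equality, including length


print(
    "tl-benchmarking-investigation-activeDims and main in naive True:",
    cumbest_equal("activeDims_naive.csv", "main_naive.csv"),
)
```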
Branches
- `tl-benchmarking-investigation-activeDims` - branch where I specify the active dims directly as `tuple(range(train_x.shape[-1] - context.n_task_dimensions))`
- `tl-benchmarking-investigation` - version of this branch before the last commit, where I added active_dims. To my understanding of GPyTorch, the two should be the same, since passing None just uses all dimensions. Note that in BoTorch MultiTaskGP the data is split into non-task and task parts, so in both the TL and the naive case (with SingleTaskGP and no task kernel) one would just use the integers [0:n-1] for the base_kernel.
- `main` - the main branch with the last commit from 9.10.
Naive
- naive = True uses no task, i.e. no TL
- naive = False uses the task, i.e. TL

The True/False printed for each comparison indicates whether the two branches had exactly the same CumBest result in the given naive or TL setting. All comparisons were done on the full Michalewicz domain benchmark.
You can see that `tl-benchmarking-investigation-activeDims` retains the performance of `main` in the naive setting while matching the current `tl-benchmarking-investigation` branch in the TL setting. I am not sure why setting the active_dims seems to make a difference only in the naive setting (and not in TL). But at least we now have full reproducibility for the naive setting. For the TL setting it is still not 100% reproducible with respect to main (also in general due to using MultiTaskGP), but I think we already decided this is acceptable.
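(For illustration only, not the code of either branch: a self-contained GPyTorch sketch of the active_dims behaviour discussed above. The dimension counts are invented; with an explicit active_dims the base kernel only sees the non-task columns, while without it the kernel acts on every column of the tensor it receives.)

```python
import torch
from gpytorch.kernels import MaternKernel

# Assumed toy setup: 5 input columns in total, the last one being the task index,
# so n_base_dims plays the role of train_x.shape[-1] - context.n_task_dimensions.
n_total_dims, n_task_dims = 5, 1
n_base_dims = n_total_dims - n_task_dims

x = torch.rand(10, n_total_dims)

# Base kernel explicitly restricted to the non-task columns.
kernel_restricted = MaternKernel(
    nu=2.5,
    ard_num_dims=n_base_dims,
    active_dims=tuple(range(n_base_dims)),
)

# Base kernel with active_dims=None: it uses every column of the tensor it is given,
# so whether the task column enters the covariance depends on what is passed upstream.
kernel_unrestricted = MaternKernel(nu=2.5, ard_num_dims=n_total_dims)

K_restricted = kernel_restricted(x).to_dense()      # task column is ignored
K_unrestricted = kernel_unrestricted(x).to_dense()  # task column contributes to distances
print(torch.allclose(K_restricted, K_unrestricted))  # generally False
```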
Replaces the custom `IndexKernel` construction with BoTorch's `MultiTaskGP` (which became possible due to the added `all_tasks` argument).
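For context, a minimal sketch of constructing a BoTorch `MultiTaskGP` with the `all_tasks` argument, assuming a recent BoTorch version; this is not the PR's actual surrogate code, and the data, task IDs, and shapes are invented for illustration.

```python
import torch
from botorch.fit import fit_gpytorch_mll
from botorch.models import MultiTaskGP
from gpytorch.mlls import ExactMarginalLogLikelihood

# Toy training data: 8 points, 3 continuous features plus a task-index column (last column).
X_cont = torch.rand(8, 3, dtype=torch.double)
task_idx = torch.tensor([[0.0]] * 4 + [[1.0]] * 4, dtype=torch.double)  # tasks 0 and 1 observed
train_X = torch.cat([X_cont, task_idx], dim=-1)
train_Y = torch.rand(8, 1, dtype=torch.double)

model = MultiTaskGP(
    train_X=train_X,
    train_Y=train_Y,
    task_feature=-1,       # the task index lives in the last column of train_X
    output_tasks=[0],      # predictions are requested for the target task only
    all_tasks=[0, 1, 2],   # declare all tasks, even ones absent from the training data
)

# MultiTaskGP internally splits train_X into the continuous part (handled by the data
# kernel) and the task column (handled by an IndexKernel over the declared tasks).
mll = ExactMarginalLogLikelihood(model.likelihood, model)
fit_gpytorch_mll(mll)
```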