
Rework validation for measurements and pending_experiments #456

Open · Scienfitz wants to merge 11 commits into main from feature/exp_input_validation
Conversation

@Scienfitz (Collaborator) commented Jan 3, 2025

Fixes #453

Notes re measurements validation

  • Two input validation utilities (one for parameters, one for targets) have been extracted; they incorporate parts of add_measurements and fuzzy_row_match. Additional validations for binary targets have been added. A sketch of the parameter utility follows this list.
  • As a result, fuzzy_row_match no longer performs any validation.
  • add_measurements now simply calls the utilities.
  • Tests for invalid parameter input have been extended.
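
A minimal sketch of what the extracted parameter utility could look like, assuming it lives in baybe/utils/validation.py (the file reviewed below); the exact signature, the handling of the tolerance flag, and the error wording are assumptions, not the PR's actual code:

```python
from collections.abc import Sequence

import pandas as pd

from baybe.parameters.base import Parameter


def validate_parameter_input(
    data: pd.DataFrame,
    parameters: Sequence[Parameter],
    numerical_measurements_must_be_within_tolerance: bool = False,
) -> None:
    """Assert that a dataframe provides valid values for all given parameters."""
    for parameter in parameters:
        if parameter.name not in data.columns:
            raise ValueError(
                f"The input is missing a column for parameter '{parameter.name}'."
            )
        if parameter.is_numerical and not numerical_measurements_must_be_within_tolerance:
            # Skip the range/tolerance check for numerical parameters when disabled.
            continue
        for value in data[parameter.name]:
            if not parameter.is_in_range(value):
                raise ValueError(
                    f"Value '{value}' is invalid for parameter '{parameter.name}'."
                )
```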

Notes re pending_experiments validation

  • The parameter input validation utility is now called as part of the Bayesian recommenders, producing proper error messages instead of cryptic and seemingly unrelated ones.
  • A recommender-based test for invalid values in pending_experiments has been added; a sketch follows this list.
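
A hedged sketch of what such a recommender-based test could look like; the fixture names, the parameter name, and the matched error message are assumptions:

```python
import pandas as pd
import pytest

from baybe.recommenders import BotorchRecommender


def test_invalid_pending_experiments(searchspace, objective, measurements):
    """Invalid values in pending_experiments should raise a descriptive error."""
    # A value that does not occur in the (hypothetical) search space:
    pending = pd.DataFrame({"Custom_Param": ["unknown_value"]})
    with pytest.raises(ValueError, match="invalid"):
        BotorchRecommender().recommend(
            batch_size=2,
            searchspace=searchspace,
            objective=objective,
            measurements=measurements,
            pending_experiments=pending,
        )
```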

Remaining problem:
The places where measurements and pending_experiments are validated are inconsistent. The former happens in the campaign and not in any recommender; the latter happens in the recommender but not in the campaign. One of the issues is that numerical_measurements_must_be_within_tolerance is not available to recommenders, but it is required for the parameter input validation.

Suggestion:
IMO we don't want to add more keywords to .recommend, hence the pending_experiments validation currently assumes numerical_measurements_must_be_within_tolerance=False and cannot be configured otherwise. To me, the simplest solution would be to get rid of the numerical_measurements_must_be_within_tolerance keyword entirely and make it an environment variable: we wouldn't have to worry about its availability anymore, and it wouldn't have to be added to more and more signatures. A sketch of the idea follows.
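
A minimal sketch of the environment variable idea; the variable name is hypothetical (following the existing BAYBE_ prefix convention), and the actual settings mechanism might look different:

```python
import os

# Hypothetical variable name; read once at import time so that validation
# utilities can use the module-level constant instead of a keyword argument.
NUMERICAL_MEASUREMENTS_MUST_BE_WITHIN_TOLERANCE = os.environ.get(
    "BAYBE_NUMERICAL_MEASUREMENTS_MUST_BE_WITHIN_TOLERANCE", "True"
).lower() in ("1", "true", "yes")
```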

@Scienfitz Scienfitz added the enhancement Expand / change existing functionality label Jan 3, 2025
@Scienfitz Scienfitz self-assigned this Jan 3, 2025
@AdrianSosic (Collaborator) left a comment

Hi @Scienfitz, thanks for the refactor 🏗️ Below are my comments.

(Review threads on baybe/utils/validation.py, tests/test_input_output.py, and tests/test_pending_experiments.py; resolved.)
@Scienfitz Scienfitz force-pushed the feature/exp_input_validation branch 3 times, most recently from 0a42492 to a412305 Compare January 6, 2025 12:36
@AVHopp (Collaborator) left a comment

LGTM

(Review threads on baybe/utils/dataframe.py and baybe/telemetry.py.)
@Scienfitz Scienfitz force-pushed the feature/exp_input_validation branch from 7056db6 to 969dea4 Compare January 14, 2025 13:20
@Scienfitz Scienfitz requested a review from AdrianSosic January 14, 2025 13:21
@Scienfitz (Collaborator, Author) commented Feb 6, 2025

After a pause, I reanalyzed the outstanding issues here:

  1. Validation should be added to both the campaign and the recommenders. This ensures proper validation of measurements and pending_experiments for users working with or without campaigns. There needs to be a mechanism by which a campaign can tell the recommender not to perform any validation; otherwise, validation would be repeated (simply accepting the repetition could also be an option).

  2. The flag numerical_measurements_must_be_within_tolerance is only available in .add_measurements, but for validating pending_experiments in a Campaign, it also needs to be available in .recommend.

Solution Proposal:

  • numerical_measurements_must_be_within_tolerance must become an attribute of the Campaign again. That way, it is available to both recommend and add_measurements. I would prefer this over including it in the recommend signature.
  • RecommenderProtocol.recommend should get an additional kwarg called numerical_measurements_must_be_within_tolerance of type bool | None. If it is True/False, the recommender validates measurements / pending_experiments with the respective flag. If it is None, validation is skipped, conveniently solving point 1. A sketch follows this list.
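
A hedged sketch of how the tri-state keyword could behave in a recommender, reusing the hypothetical validate_parameter_input utility sketched in the PR description above; the recommendation logic itself is omitted:

```python
import pandas as pd

from baybe.searchspace import SearchSpace


def recommend(
    batch_size: int,
    searchspace: SearchSpace,
    measurements: pd.DataFrame | None = None,
    pending_experiments: pd.DataFrame | None = None,
    numerical_measurements_must_be_within_tolerance: bool | None = None,
) -> None:
    """Sketch of the proposed tri-state validation keyword."""
    if numerical_measurements_must_be_within_tolerance is not None:
        # True/False: validate both inputs, forwarding the respective flag.
        for frame in (measurements, pending_experiments):
            if frame is not None:
                validate_parameter_input(  # hypothetical utility, sketched above
                    frame,
                    searchspace.parameters,
                    numerical_measurements_must_be_within_tolerance,
                )
    # None: skip validation, e.g. because the calling Campaign has already
    # validated the inputs, which avoids repeating the validation (point 1).
```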

Labels: enhancement (Expand / change existing functionality)
Successfully merging this pull request may close this issue:
  • #453: Pending experiments with CustomDiscreteParameter value not in search space gives cryptic error
3 participants