Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix group selection in sample_posterior_predictive when predictions=True is passed in kwargs #426

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

butterman0
Copy link

@butterman0 butterman0 commented Feb 17, 2025

Summary

Fixes hard-coded group selection in sample_posterior_predictive which unnecessarily restricts usage of predict functions. Previously, if predictions=True (ideally set in pm.sample_posterior_predictive when predicting out-of-sample) is passed as a kwarg to the predict functions, the inference data was extracted from posterior_predictive group which is incorrect when predictions = True.

Changes

Selects appropriate group depending if predictions is passed.

@butterman0
Copy link
Author

I'm sorry I haven't opened an issue first - I thought it was such a minor change that it wasn't necessary. This is also my first contribution so not 100% on the process!

Comment on lines 653 to 654
# Determine the correct group dynamically
group_name = "predictions" if kwargs.get("predictions", False) else "posterior_predictive"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to make predictions an explicit kwarg (with the same default as PyMC) and use that directly?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I initially had that! Although I wasn't sure what was best practice. It is nice to make it explicit, but it means passing predictions=False as explicit args through the predict method and then to sample_posterior_predictive which is used in other methods - although this shouldn't be a problem if keeping the same default as PyMC as you suggest.

Copy link
Author

@butterman0 butterman0 Feb 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, the class method sample_posterior_predictive is called twice, on both occasions it is for prediction: class methods predict and predict_posterior.

I think I would argue that in this case we would like the default to be predictions=True (as opposed to the pymc pm.sample_posterior_predictive default). The default would be set in the predict and predict_posterior methods.

I say this because when False, the posterior_predictive group in the idata object is overridden - meaning we would have to run fit or sample_model again if we wanted to do posterior predictive checks?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checking you agree with setting predictions=True as default @ricardoV94 ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah makes sense in the predict oriented methods

Copy link
Member

@ricardoV94 ricardoV94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The predictions argument should be mentioned in the docstrings now that it is explicit

@@ -624,7 +625,7 @@ def sample_prior_predictive(

return prior_predictive_samples

def sample_posterior_predictive(self, X_pred, extend_idata, combined, **kwargs):
def sample_posterior_predictive(self, X_pred, extend_idata, predictions, combined, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Provide default

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other arguments do not have defaults. The sample_posterior_predictive is only called through the predict functions, which do have defaults.

Would you be able to explain why we would want predictions to have a default, when the other arguments do not?

@@ -559,7 +560,7 @@ def predict(
"""

posterior_predictive_samples = self.sample_posterior_predictive(
X_pred, extend_idata, combined=False, **kwargs
X_pred, extend_idata, predictions, combined=False, **kwargs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pass by keyword to be on the safe side

@@ -723,7 +728,7 @@ def predict_posterior(

X_pred = self._validate_data(X_pred)
posterior_predictive_samples = self.sample_posterior_predictive(
X_pred, extend_idata, combined, **kwargs
X_pred, extend_idata, predictions, combined, **kwargs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pass by keyword argument

Copy link
Author

@butterman0 butterman0 Feb 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was aiming to keep it in the same format as current implementation. i.e. x_pred, extend_idata and combined do not use keyword arguments..

Similar question to the one above - should these all be changed to use keyword arguments? Why would we treat predictions differently?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ricardoV94, let me know what you think and I can adjust.

@butterman0
Copy link
Author

butterman0 commented Mar 6, 2025

Hi @ricardoV94, I've made the requested changes.

Two commits updating the doc strings.

Most recent commit passing by keyword argument and setting default as requested.

I'm still a little unclear as to why we would treat the predictions argument differently to the other arguments in the sample_posterior_predictive method (i.e. combined and extend_idata). Similarly, the same question as to why we would pass by keyword when calling the method when we don't with the other variables.

Harry

@ricardoV94
Copy link
Member

We should always pass by keyword argument, but since you didn't write the previous code I didn't ask you to change those lines

Copy link
Member

@ricardoV94 ricardoV94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test that confirms this is working now?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants