EBM pmml does not support missing values #440

Open
sadsquirrel369 opened this issue Jan 25, 2025 · 2 comments

@sadsquirrel369

Hi there,

I came across a discrepancy between the PMML predictions and the EBM predictions on a dataset that contains nulls. The EBM object has a score for NaNs, but the PMML file does not. Is there a way to fix this? This also becomes a problem when two interacting variables both contain NaNs.

Thank you

@vruusmann changed the title from "EBM pmml does not support NAN" to "EBM pmml does not support missing values" on Jan 25, 2025
@vruusmann
Member

EBM is an ensemble model.

PMML uses the `Segmentation@missingPredictionTreatment` attribute to specify what to do in case some member model(s) of an ensemble model return a missing (sub-)prediction.

The current behaviour is `returnMissing`, which means "abandon the scoring process, and return a missing value as the final prediction".

You may change this attribute to `skipSegment`, which means "continue the scoring process, and ignore this missing sub-prediction when computing the final prediction".
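The attribute change suggested above can be made by hand in a text editor, or scripted. The following is a minimal sketch using only the Python standard library; the helper name `set_missing_prediction_treatment` is hypothetical, and it assumes the exported PMML contains one or more `Segmentation` elements under a single default PMML namespace.

```python
import xml.etree.ElementTree as ET

# Values that the PMML schema allows for this attribute.
ALLOWED_TREATMENTS = {"returnMissing", "skipSegment", "continue"}


def set_missing_prediction_treatment(pmml_text: str, treatment: str = "skipSegment") -> str:
    """Return a copy of the PMML text with every Segmentation element
    carrying the requested missingPredictionTreatment value."""
    if treatment not in ALLOWED_TREATMENTS:
        raise ValueError(f"Unsupported treatment: {treatment!r}")

    root = ET.fromstring(pmml_text)

    # PMML root tags look like "{http://www.dmg.org/PMML-4_4}PMML".
    # Registering that URI as the default namespace keeps the output
    # free of generated "ns0:" prefixes.
    if root.tag.startswith("{"):
        namespace = root.tag[1:].split("}", 1)[0]
        ET.register_namespace("", namespace)

    for element in root.iter():
        # Compare on the local name so the code works for any PMML version.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "Segmentation":
            element.set("missingPredictionTreatment", treatment)

    return ET.tostring(root, encoding="unicode")
```

To apply it to a file, read the PMML text, pass it through the function, and write the result to a new file so that the original export is preserved for comparison.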

Does this attribute change fix your issue, meaning that the EBM PMML starts making correct predictions? If not, then it must mean that EBM is performing some model-internal imputation. For example, maybe it replaces all NaN values with 0 values while ingesting features.
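To make the possible outcomes concrete, here is a toy sketch of the two treatments discussed in this thread, next to an EBM-like sum that adds a dedicated score for the missing value. All numbers and function names are hypothetical; this is not the EBM or the PMML evaluator implementation, only an illustration of why `skipSegment` may still disagree with the EBM if the EBM assigns a non-zero score to NaN.

```python
from typing import List, Optional


def aggregate_sum(sub_predictions: List[Optional[float]], treatment: str) -> Optional[float]:
    """Sum member-model outputs; None marks a missing sub-prediction."""
    total = 0.0
    for value in sub_predictions:
        if value is None:
            if treatment == "returnMissing":
                # Abandon scoring and report a missing final prediction.
                return None
            if treatment == "skipSegment":
                # Ignore this member model and keep going.
                continue
            raise ValueError(f"Treatment not modelled in this sketch: {treatment!r}")
        total += value
    return total


def ebm_like_sum(sub_predictions: List[Optional[float]], missing_score: float) -> float:
    """Sum member-model outputs, substituting a learned score for missing inputs."""
    return sum(missing_score if value is None else value for value in sub_predictions)


# Hypothetical row: the second feature is NaN, so its term cannot be scored.
row_terms = [0.5, None, 0.25]

print(aggregate_sum(row_terms, "returnMissing"))      # None
print(aggregate_sum(row_terms, "skipSegment"))        # 0.75
print(ebm_like_sum(row_terms, missing_score=-0.125))  # 0.625
```

In this toy case `skipSegment` yields a prediction, but it differs from the EBM-like result by exactly the missing-value score, which is the kind of residual discrepancy the question above is probing for.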

@vruusmann
Member

@sadsquirrel369 If you want to see this issue fixed soon, then it's your job to dig through the EBM codebase and identify what happens to those inputted NaN values. Paste your research results here.
