Add `model_to_minibatch` transformation to convert all `pm.Data` to `pm.Minibatch` #7785

jessegrabowski · 2025-05-15T12:28:17Z

Description

A pain point for me when testing different algorithms (e.g. MCMC vs VI) is that I don't want to write a 2nd version of the model with pm.Minibatch on the data.

This PR adds a model transformation that does that for the user. It's the reverse of the remove_minibatched_nodes transformer that @zaxtax implemented recently.

This is a WIP, it doesn't actually work now, because I can't figure out how to rebuild the observed variable with the total_size set correctly. Help wanted.
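
A minimal sketch of why `total_size` matters when an observed variable is fed a minibatch. It assumes the usual convention that the minibatch log-likelihood is rescaled by `total_size / batch_size`, so that it estimates the full-data log-likelihood. The helper names (`normal_logpdf`, `full_loglike`, `minibatch_loglike`) are hypothetical and use only the standard library; this is not PyMC code.

```python
import math
import random


def normal_logpdf(y, mu, sigma):
    """Log-density of a Normal(mu, sigma) evaluated at y."""
    return -0.5 * math.log(2 * math.pi * sigma**2) - (y - mu) ** 2 / (2 * sigma**2)


def full_loglike(data, mu, sigma=1.0):
    """Log-likelihood summed over the whole dataset."""
    return sum(normal_logpdf(y, mu, sigma) for y in data)


def minibatch_loglike(batch, total_size, mu, sigma=1.0):
    """Batch log-likelihood rescaled by total_size / batch_size (assumed convention)."""
    scale = total_size / len(batch)
    return scale * sum(normal_logpdf(y, mu, sigma) for y in batch)


rng = random.Random(0)
data = [rng.gauss(1.0, 1.0) for _ in range(1000)]
full = full_loglike(data, mu=0.5)

# Averaging many rescaled minibatch estimates should land close to the full value.
estimates = [
    minibatch_loglike(rng.sample(data, 50), len(data), mu=0.5) for _ in range(2000)
]
mean_estimate = sum(estimates) / len(estimates)
```

If `total_size` is missing or wrong, the likelihood is weighted as though the dataset had only `batch_size` rows, so the prior carries too much weight relative to the data.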

Related Issue

Closes #
Related to #

Checklist

Checked that the pre-commit linting/style checks pass
Included tests that prove the fix is effective or that the new feature works
Added necessary documentation (docstrings and/or example notebooks)
If you are a pro: each commit corresponds to a relevant logical change

Type of change

📚 Documentation preview 📚: https://pymc--7785.org.readthedocs.build/en/7785/

ricardoV94 · 2025-05-15T12:56:21Z

> This is a WIP, it doesn't actually work now, because I can't figure out how to rebuild the observed variable with the total_size set correctly. Help wanted.

You can use the lower-level utility:

pymc/pymc/variational/minibatch_rv.py, line 53 in ef26ae8

def create_minibatch_rv(

Then make that a vanilla observed RV.

ricardoV94 · 2025-05-15T12:59:51Z

Ah, you already did that, so your question is how to get the total size? Grab the batch shape of the variable and constant fold it, without raising if it can't be fully folded.

jessegrabowski · 2025-05-15T13:10:51Z

My real issue was not understanding what needs to be the key and value in the replacements, between:

The model variable
The memo variable
The fgraph variable

ricardoV94 · 2025-05-15T13:25:41Z

The best approach is usually to replace the whole fgraph ModelObservedRV with a new one. You will probably have to discard any dims on the batch dimension, since the dims are inputs to that op.

jessegrabowski · 2025-05-15T13:28:20Z

I don't really understand what that answer means

ricardoV94 · 2025-05-15T13:32:23Z

dprint the fgraph and it will perhaps be more obvious what I am mumbling

jessegrabowski · 2025-05-15T13:34:06Z

The problem I was running into was that I ended up with two beta RVs after doing the replace. Beta was the only RV implicated in the ModelObservedRV sub-graph.
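
One way a duplicated upstream variable can appear, sketched on a toy graph. This is an illustration under assumptions, not a diagnosis of this PR: `Node`, `clone_graph`, and `count_named` are hypothetical stand-ins written for this example, not PyTensor or PyMC classes. The idea is that once a graph has been cloned with a memo (original to clone), a replacement built from the original-side variable drags the original `beta` into the cloned graph alongside the cloned one.

```python
class Node:
    """Toy graph node: a name plus the nodes it depends on."""

    def __init__(self, name, inputs=()):
        self.name = name
        self.inputs = tuple(inputs)


def clone_graph(outputs):
    """Deep-copy a graph; memo maps each original node to its clone."""
    memo = {}

    def clone(node):
        if node not in memo:
            memo[node] = Node(node.name, [clone(i) for i in node.inputs])
        return memo[node]

    return [clone(out) for out in outputs], memo


def count_named(roots, name):
    """Count distinct nodes called `name` reachable from `roots`."""
    seen = {}
    stack = list(roots)
    while stack:
        node = stack.pop()
        if id(node) not in seen:
            seen[id(node)] = node
            stack.extend(node.inputs)
    return sum(node.name == name for node in seen.values())


# "Model-side" graph: obs depends on mu, which depends on beta.
beta = Node("beta")
mu = Node("mu", [beta])
obs = Node("obs", [mu])

# "fgraph-side" graph: a clone, with memo linking the two sides.
(beta_clone, obs_clone), memo = clone_graph([beta, obs])

# Mistake: the new observed node is built from the model-side mu.
wrong_obs = Node("obs_minibatch", [mu])
n_beta_wrong = count_named([beta_clone, wrong_obs], "beta")

# Fix: build it from the cloned mu, looked up through the memo.
right_obs = Node("obs_minibatch", [memo[mu]])
n_beta_right = count_named([beta_clone, right_obs], "beta")
```

In this toy setup `n_beta_wrong` is 2 and `n_beta_right` is 1, which mirrors the symptom of two beta RVs after the replace.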

zaxtax · 2025-05-15T14:25:29Z

Because Minibatch assumes the data variables have the same length, it might make sense to take a variables argument. Or have some way to group data variables of the same size (same dim name maybe?)
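
A rough sketch of the grouping idea, assuming each data variable records its dim names and that variables sharing a leading dim name have the same length. The function `group_by_leading_dim` and the example names are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def group_by_leading_dim(data_dims):
    """Group data variable names by the name of their first dim.

    data_dims maps a data variable name to a tuple of dim names.
    Variables without dims are collected under the key None.
    """
    groups = defaultdict(list)
    for name, dims in data_dims.items():
        key = dims[0] if dims else None
        groups[key].append(name)
    return dict(groups)


# Hypothetical data variables and their dims.
data_dims = {
    "x": ("obs_idx", "feature"),
    "y": ("obs_idx",),
    "group_idx": ("obs_idx",),
    "group_covariate": ("group",),
    "prior_scale": (),
}

groups = group_by_leading_dim(data_dims)
```

Each group could then be minibatched together, while variables under `None` or under a dim the user did not ask for would be left untouched.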


initial PR

c1168de

jessegrabowski requested a review from zaxtax May 15, 2025 12:28
