Problem
The SBI codebase currently creates deep copies of neural networks and posteriors in three places:
- `train()` returns `deepcopy(self._neural_net)`
- `build_posterior()` returns `deepcopy(self._posterior)`
- `build_posterior()` stores `self._model_bank.append(deepcopy(self._posterior))`
These deep copies can consume significant memory (10-100MB+ per round for modern density estimators).
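As a rough illustration, here is a minimal sketch contrasting the two return patterns. `MockDensityEstimator` and `MockTrainer` are invented stand-ins for this example, not real sbi classes:

```python
import copy

class MockDensityEstimator:
    """Stand-in for a neural density estimator (hypothetical, for illustration)."""
    def __init__(self, n_params=100_000):
        # One Python float per "parameter"; a real network would hold tensors.
        self.weights = [0.0] * n_params

class MockTrainer:
    def __init__(self):
        self._neural_net = MockDensityEstimator()

    def train_with_copy(self):
        # Current pattern: every call duplicates all parameter storage.
        return copy.deepcopy(self._neural_net)

    def train_without_copy(self):
        # Proposed pattern: hand back a reference, no extra memory.
        return self._neural_net

trainer = MockTrainer()
print(trainer.train_with_copy() is trainer._neural_net)     # False: duplicated
print(trainer.train_without_copy() is trainer._neural_net)  # True: shared
```

Each `train_with_copy()` call allocates a full second set of weights, which is exactly the per-round overhead described above.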
Details
- Model bank is completely unused: `_model_bank` is write-only; no code ever reads from it.
- Neural networks aren't reused between rounds: SNPE only needs the data (`_theta_roundwise`, `_x_roundwise`) from previous rounds, not the networks.
- No modification risk: users interact with posteriors through stable APIs (`sample()`, `log_prob()`).
The deep copies appear to be legacy from when multi-round inference was handled internally. Now that users manage rounds explicitly, they serve no purpose.
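To make the second point concrete, here is a minimal mock of a multi-round loop (invented class and simplified `train()`, mirroring the attribute names above but not the real sbi API) in which only the accumulated data crosses round boundaries:

```python
class MockMultiRoundInference:
    """Mock (not the real sbi API) showing that only data crosses rounds."""
    def __init__(self):
        self._theta_roundwise = []  # simulation parameters, kept per round
        self._x_roundwise = []      # simulation outputs, kept per round
        self._neural_net = None     # rebuilt each round

    def append_simulations(self, theta, x):
        self._theta_roundwise.append(theta)
        self._x_roundwise.append(x)
        return self

    def train(self):
        # Training reads *all* rounds' data; the previous round's network
        # object is never consulted, so deep-copying it buys nothing.
        n_total = sum(len(t) for t in self._theta_roundwise)
        self._neural_net = f"net fit to {n_total} simulations"
        return self._neural_net

inference = MockMultiRoundInference()
for round_idx in range(3):
    theta = [round_idx] * 10  # fake simulations for this round
    x = [round_idx] * 10
    net = inference.append_simulations(theta, x).train()
print(net)  # net fit to 30 simulations
```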
Proposed Solution
- Remove `_model_bank` entirely; it appears to be legacy code.
- Return networks/posteriors without copying:
```python
def train(self, ...):
    return self._neural_net  # No deepcopy

def build_posterior(self, ...):
    # Remove: self._model_bank.append(deepcopy(self._posterior))
    return self._posterior  # No deepcopy
```

Impact
- Memory savings: ~30x reduction for multi-round inference (e.g., 300MB → 10MB for 10 rounds)
- No functionality loss: All tests should pass unchanged
- Backward compatibility: we could also add a `return_copy=False` parameter so callers who need an independent copy can explicitly opt in.
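A minimal sketch of that opt-in parameter, using a hypothetical `Trainer` stand-in (the real `train()` signature differs):

```python
import copy

class Trainer:
    def __init__(self):
        # Dict stands in for a trained network's state.
        self._neural_net = {"weights": [1.0, 2.0, 3.0]}

    def train(self, return_copy=False):
        # Default: return the live object (no duplication).
        # Opt-in: callers who mutate the result can request a private copy.
        net = self._neural_net
        return copy.deepcopy(net) if return_copy else net

t = Trainer()
print(t.train() is t._neural_net)                  # True: shared by default
print(t.train(return_copy=True) is t._neural_net)  # False: independent copy
```

Defaulting to `False` gives the memory savings out of the box while leaving an escape hatch for any caller that relied on receiving an independent object.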
Questions
- Any hidden use cases for the model bank we're missing?
- Should we make copying opt-in via parameter or just remove it?