
Remove unnecessary deep copies in SBI training/inference workflow #1616

@janfb

Description


Problem

The SBI codebase currently creates deep copies of neural networks and posteriors in three places:

  1. train() returns deepcopy(self._neural_net)
  2. build_posterior() returns deepcopy(self._posterior)
  3. build_posterior() stores self._model_bank.append(deepcopy(self._posterior))

These deep copies can consume significant memory (10-100MB+ per round for modern density estimators).
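For scale, the per-copy cost can be estimated by summing the byte sizes of a network's parameters and buffers. A minimal sketch, assuming the density estimator is a plain torch.nn.Module (net_size_mb is a hypothetical helper, not part of sbi):

import torch

def net_size_mb(net: torch.nn.Module) -> float:
    # Approximate in-memory size of a module's parameters and buffers, in MB.
    n_bytes = sum(p.numel() * p.element_size() for p in net.parameters())
    n_bytes += sum(b.numel() * b.element_size() for b in net.buffers())
    return n_bytes / 1e6

Every deepcopy duplicates this full footprint, so a multi-round loop can keep several redundant copies alive at once.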

Details

  • Model bank is completely unused: _model_bank is write-only - no code ever reads from it
  • Neural networks aren't reused between rounds: SNPE only needs the data (_theta_roundwise, _x_roundwise) from previous rounds, not the networks
  • No modification risk: Users interact with posteriors through stable APIs (sample(), log_prob())

The deep copies appear to be legacy from when multi-round inference was handled internally. Now that users manage rounds explicitly, they serve no purpose.
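For reference, a sketch of the explicit multi-round workflow, following the pattern in the sbi multi-round tutorial (the simulator and x_o below are stand-ins, and API details may differ across sbi versions):

import torch
from sbi.inference import SNPE
from sbi.utils import BoxUniform

prior = BoxUniform(low=-2 * torch.ones(3), high=2 * torch.ones(3))
inference = SNPE(prior=prior)

def simulator(theta):
    return theta + 0.1 * torch.randn_like(theta)  # stand-in simulator

x_o = torch.zeros(3)  # stand-in observation
proposal = prior

for _ in range(10):  # e.g., 10 rounds
    theta = proposal.sample((500,))
    x = simulator(theta)
    # Each round appends fresh (theta, x) pairs; the previous round's network
    # enters only through the proposal used for sampling, never by copy.
    density_estimator = inference.append_simulations(
        theta, x, proposal=proposal
    ).train()
    posterior = inference.build_posterior(density_estimator)
    proposal = posterior.set_default_x(x_o)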

Proposed Solution

  1. Remove _model_bank entirely - it appears to be unused legacy code
  2. Return networks/posteriors without copying:
def train(self, ...):
    return self._neural_net  # previously: return deepcopy(self._neural_net)

def build_posterior(self, ...):
    # Removed: self._model_bank.append(deepcopy(self._posterior))
    return self._posterior  # previously: return deepcopy(self._posterior)

Impact

  • Memory savings: ~30x reduction for multi-round inference (e.g., 300MB → 10MB for 10 rounds)
  • No functionality loss: All tests should pass unchanged
  • Backward compatibility: an opt-in return_copy=False parameter could preserve the old copying behavior for users who rely on it (see the sketch after this list)
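A minimal sketch of the return_copy idea, in the same abbreviated style as the snippet above (return_copy is the parameter proposed in this issue, not an existing one):

from copy import deepcopy

def train(self, ..., return_copy: bool = False):
    ...
    # Default: hand back the live network; copy only on explicit request.
    return deepcopy(self._neural_net) if return_copy else self._neural_net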

Questions

  • Are there hidden use cases for the model bank that we're missing?
  • Should we make copying opt-in via a parameter, or remove it entirely?
