
@mratsim mratsim commented Nov 13, 2025

This is basically extracted from util/recompile.py.

When experimenting with mixed-precision quantization, it saves time and disk wear to skip materializing a quant that would end up over the size target.
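For illustration only, here is a minimal sketch of the idea of estimating size without writing anything to disk. It is not the code from this PR or from util/recompile.py: the function name `estimate_quant_size`, the shape mapping, and the substring-based override matching are all assumptions made for the example, and it ignores scales, metadata, and any tensors kept at full precision.

```python
from math import prod


def estimate_quant_size(shapes, default_bpw, overrides=None):
    """Estimate quantized weight size in bytes from tensor shapes alone.

    shapes:      hypothetical mapping of tensor name -> shape tuple
    default_bpw: bits per weight used when no override matches
    overrides:   hypothetical mapping of name substring -> bits per weight
    """
    overrides = overrides or {}
    total_bits = 0
    for name, shape in shapes.items():
        bpw = default_bpw
        # The first override whose key appears in the tensor name wins.
        for pattern, override_bpw in overrides.items():
            if pattern in name:
                bpw = override_bpw
                break
        total_bits += prod(shape) * bpw
    return total_bits / 8


# Made-up shapes, only to show the call.
shapes = {
    "layers.0.attn.q_proj": (4096, 4096),
    "layers.0.mlp.up_proj": (4096, 11008),
}
size_bytes = estimate_quant_size(shapes, default_bpw=4.0, overrides={"mlp": 3.0})
print(f"{size_bytes / 2**20:.1f} MiB")
```

Comparing the returned value against the size target before starting the quantization is what avoids the wasted write.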

I've duplicated the code for now, but it might be worth factoring out at least a load_override function, since it would now be used in 3 places:

  • model_diff.py
  • recompile.py
  • size_estimation.py (this PR)

I'm not sure which name best fits the naming convention: size_estimation, estimate_size, or size_estimate.
