Replace Hessian scaling guessing by an identity matrix #2236
While they start with identity, they also adapt during sampling according to the recent samples. I think this still makes sense, but we should test it on some models first (like in the examples folder) to see if it affects sampling in a bad way.
The thing is, compared to ADVI init, the fallback option (Hessian) kind of sucks right now.
I completely agree, the Hessian at the test point is a poor default choice if nothing is passed in. I was mainly worried about breaking people's code or unanticipated effects. Thinking more about this, however, I think this should be pretty safe to merge; most people should use
I think so too. Both are pretty bad, but using the Hessian is just much more trouble and also probably more surprising. I don't think it will break models either.
For now, I checked only It turned out that only two of them create
@twiecki @aseyboldt But we still support the ability to provide the scaling parameter as a test point, which is used in examples with It is what we want, right? I mean, when we get an explicitly specified start point (most probably MAP), we should still use the Hessian and not identity, right?
Just FYI: model built with
Our examples folder is in poor health to begin with. I wonder if we should rather delete them at this point and turn the ones we think are useful into proper NBs. But currently they are either broken or show incorrect usage of pymc3.
Let's fix them. I think there is merit to having some examples that are not in notebook format.
@twiecki @fonnesbeck I can go through them.
@junpenglao I would fully update everything to the current best practices (i.e. let PyMC choose), unless it is explicitly demonstrating an alternative.
New PR instead of #2232, with `np.ones` cast to `floatX`.
If I understand it right, it seems that Stan uses identity scaling by default. The CmdStan manual states that:
And the source code corresponding to `diag_e` shows that by default the metric is an identity matrix.
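For illustration only, here is a minimal sketch of what an identity scaling built from `np.ones` and cast to `floatX` might look like. The `floatX` string below is an assumption standing in for Theano's `theano.config.floatX`, and `identity_scaling` is a hypothetical helper name, not the function changed in this PR.

```python
import numpy as np

# Assumption: stands in for theano.config.floatX ("float32" or "float64").
floatX = "float32"


def identity_scaling(ndim, dtype=floatX):
    """Hypothetical helper: diagonal of an identity matrix as a vector of ones."""
    # A diagonal scaling only needs the diagonal entries, so a 1-D vector
    # of ones of length `ndim` represents the identity matrix.
    return np.ones(ndim, dtype=dtype)


# Example: scaling for a model with three free (unobserved) dimensions.
scaling = identity_scaling(3)
print(scaling.shape, scaling.dtype)
```

Storing only the diagonal keeps the memory cost linear in the number of dimensions, and the explicit dtype avoids mixing float64 arrays into a float32 graph.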