Replace Hessian scaling guessing by an identity matrix #2236


Merged
merged 1 commit into pymc-devs:master on May 31, 2017

Conversation


@ghost ghost commented May 29, 2017

New PR instead of #2232, with np.ones cast to floatX.

If I understand it right, it seems like Stan uses identity scaling by default. The CmdStan manual states that:

All HMC implementations in Stan utilize quadratic kinetic energy functions which are specified up to the choice of a symmetric, positive-definite matrix known as a mass matrix or, more formally, a metric (Betancourt and Stein, 2011).

If the metric is constant then the resulting implementation is known as Euclidean HMC. Stan allows for three Euclidean HMC implementations: a unit metric, a diagonal metric, and a dense metric. These can be specified with the values unit_e, diag_e, and dense_e, respectively.

Future versions of Stan will also include dynamic metrics associated with Riemannian HMC (Girolami and Calderhead, 2011; Betancourt, 2012).

metric=<list element>
Geometry of base manifold
Valid values: unit_e, diag_e, dense_e (Defaults to diag_e)
    unit_e
        Euclidean manifold with unit metric
    diag_e
        Euclidean manifold with diag metric
    dense_e
        Euclidean manifold with dense metric

And the source code corresponding to diag_e shows that by default the metric is an identity matrix.
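As a rough sketch of what this PR's fallback amounts to, the unit metric can be represented as the diagonal of an identity mass matrix, cast to the configured float type. The names below are illustrative only, not pymc3's actual internals:

```python
import numpy as np

# Hypothetical sketch: identity scaling as a vector of ones for a model
# with `n_params` free parameters, cast to a dtype analogous to theano's
# floatX setting. `floatX` and `n_params` are assumptions for illustration.
floatX = "float32"                         # assumption: mimics theano.config.floatX
n_params = 5                               # assumption: number of free parameters
scaling = np.ones(n_params, dtype=floatX)  # diagonal of a unit (identity) metric
```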


twiecki commented May 29, 2017

While they start with identity, they also adapt during sampling according to the recent samples.

I think this still makes sense, but we should test it on some models first (like in the examples folder) to see if it affects sampling in a bad way.
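The adaptation mentioned above can be sketched in miniature: a diagonal metric is typically re-estimated from the per-dimension variance of recent warmup draws. This is a toy illustration, not Stan's or pymc3's actual windowed estimator:

```python
import numpy as np

# Toy sketch of diagonal metric adaptation: estimate per-dimension
# variance from recent (fake) warmup draws and use it as the diagonal
# of the inverse mass matrix. The draws here are synthetic stand-ins.
rng = np.random.default_rng(0)
recent_draws = rng.normal(scale=[1.0, 3.0], size=(500, 2))  # fake warmup draws

diag_metric = recent_draws.var(axis=0)  # adapted diagonal metric estimate
```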

@junpenglao (Member)

The thing is, compared to the ADVI init, the fallback option (the Hessian) kind of sucks right now.


twiecki commented May 31, 2017

I completely agree, the Hessian at the test point is a poor default choice if nothing is passed in. I was mainly worried about breaking people's code or unanticipated effects. Thinking more about this, however, I think this should be pretty safe to merge: most people use pm.sample() with auto-tuning, or they pass in a scaling or point when instantiating NUTS.

@aseyboldt?

@aseyboldt (Member)

I think so too. Both are pretty bad, but using the Hessian is just much more trouble and probably also more surprising. I don't think it will break models either.

@twiecki twiecki merged commit 6f58dbf into pymc-devs:master May 31, 2017

ghost commented May 31, 2017

For now, I checked only the .py examples (i.e. not the notebooks) in pymc3/examples.

It turned out that only two of them create a NUTS sampler explicitly without specifying the scaling parameter: simpletest.py and gelman_bioassay.py. I ran Kolmogorov-Smirnov tests to compare the marginal distributions of all sampled variables in the traces for the master branch and for this branch rebased onto master. The KS test passed with a p-value larger than 10% for all of them except x in simpletest.py. However, when I plotted the traces for this x variable, it turned out to be much closer to normal with identity scaling than with the Hessian.
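The comparison described above can be sketched with scipy's two-sample KS test. The traces here are synthetic stand-ins, not the actual pymc3 traces:

```python
import numpy as np
from scipy import stats

# Stand-in traces: the real comparison would use draws of one variable
# from the master branch and from this branch, respectively.
rng = np.random.default_rng(42)
trace_master = rng.normal(size=2000)   # stand-in: draws from master
trace_branch = rng.normal(size=2000)   # stand-in: draws from this branch

# Two-sample KS test on the marginal distributions; the comparison above
# treated a p-value larger than 10% as a match.
stat, pvalue = stats.ks_2samp(trace_master, trace_branch)
marginals_match = pvalue > 0.10
```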


ghost commented May 31, 2017

@twiecki @aseyboldt
Just to clarify: so now we use the identity when no scaling is provided at all.

But we still support providing the scaling parameter as a test point, which is used in examples with find_MAP (like lasso_missing.py). In that case the Hessian at the test point is still computed and used as the scaling.

That is what we want, right? I mean, when we get an explicitly specified start point (most probably the MAP), we should still use the Hessian and not the identity, right?
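To illustrate why the Hessian at an explicit start point is a sensible scaling: for a Gaussian log-posterior, the negative Hessian at the MAP is exactly the inverse posterior covariance. A toy analytic example (not pymc3 code):

```python
import numpy as np

# Toy example: logp(x) = -0.5 * x^T A x has its MAP at x = 0 and a
# constant Hessian of -A, so the negative Hessian evaluated at the MAP
# recovers A, the inverse posterior covariance.
A = np.array([[2.0, 0.3],
              [0.3, 1.0]])

x_map = np.zeros(2)            # MAP of this Gaussian log-posterior
neg_hessian_at_map = A.copy()  # -(d^2 logp / dx^2) evaluated at x_map
scaling = neg_hessian_at_map   # what would be used as mass-matrix scaling
```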

@junpenglao (Member)

Just FYI: models built with pm.glm.GLM are mostly OK with find_MAP and the Hessian, as I gather from the limited testing yesterday.


twiecki commented May 31, 2017

Our examples folder is in poor health to begin with. I wonder if we should rather delete them at this point and turn the ones we think are useful into proper notebooks. Currently they are either broken or show incorrect usage of pymc3.

@fonnesbeck (Member)

Let's fix them. I think there is merit to having some examples that are not in notebook format.

@junpenglao (Member)

@twiecki @fonnesbeck I can go through them.
Many of them use Metropolis; should I change the default to NUTS with ADVI init, and add a new kwarg "legacy" for the old way of sampling?

@fonnesbeck (Member)

@junpenglao I would fully update everything to the current best practices (i.e. let PyMC choose), unless it is explicitly demonstrating an alternative.

@ghost ghost mentioned this pull request Jun 1, 2017
22 tasks
@junpenglao junpenglao mentioned this pull request Jun 1, 2017
@ghost ghost changed the title WIP: replace Hessian scaling guessing by an identity matrix Replace Hessian scaling guessing by an identity matrix Jun 1, 2017