Raise NotImplementedError for SplineWrapper gradient operation #2211


Merged
merged 7 commits into from May 27, 2017

Conversation


@ghost ghost commented May 21, 2017

It is necessary to allow guess_scaling to fall back to fixed_hessian here instead of just failing.

Fixes #2209.

@ghost ghost changed the title Raise NotImplementedError for SplineWrapper Raise NotImplementedError for SplineWrapper gradient operation May 21, 2017

ferrine commented May 21, 2017

A better design is to create a new spline in the grad method, without subclassing.


twiecki commented May 22, 2017

@ferrine not sure what you mean.


ferrine commented May 22, 2017


ghost commented May 22, 2017

@ferrine Yes, it is a good idea to instantiate objects of a single class recursively instead of having two separate classes.

      itypes = [tt.dscalar]
      otypes = [tt.dscalar]

-     def __init__(self, spline):
+     def __init__(self, spline, n_derivatives):
Member

Why n_derivatives? Recursion is OK.

Author

@ghost ghost May 23, 2017

n_derivatives is the number of derivatives that a given SplineWrapper object should implement. One wouldn't want it to be larger than necessary: invocations of UnivariateSpline.derivative() are slow, with O(n) complexity, so unbounded recursion doesn't seem like a good choice here.
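For reference, SciPy's derivative() constructs a whole new spline object from the original's coefficients each time it is called, which is the cost being discussed (a standalone sketch, independent of the Theano wrapper in this PR):

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Fit an interpolating cubic spline to sin(x); s=0 forces interpolation.
x = np.linspace(0, 2 * np.pi, 50)
spl = UnivariateSpline(x, np.sin(x), s=0)

# derivative() builds and returns a *new* spline object on every call,
# which is the O(n) work one wants to avoid repeating.
dspl = spl.derivative()
assert abs(float(dspl(np.pi)) + 1.0) < 1e-2  # d/dx sin(x) at pi is cos(pi) = -1
```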

      x, = inputs
      x_grad, = grads
-     return [x_grad * self.spline_grad(x)]
+     return [x_grad * self.grad_op(x)]
Member

You can create the new op right here. Pure Theano code is expected in the grad method.

Author

@ghost ghost May 23, 2017

To avoid O(n) calculations on each call of the grad method, the SciPy spline for the gradient (created by self.spline.derivative()) should be pre-computed and stored inside the object. But in that case it is simpler to also store a pre-created Theano operation for the gradient, rather than maintain a list of pre-created SciPy splines and pass them to the constructor of a Theano operation in the grad method, so that the resulting operation also uses a pre-computed SciPy spline for the n-th order derivative.

Member

You can memoize the call

Member

Memoize the op creation, to be more accurate

Author

Yes, I thought about it. But I'm concerned that gradient calculation time then becomes non-deterministic; for example, it might significantly bias tqdm's estimate of the sampling time.

Member

Why? Functions are compiled after the graph is constructed. That will not affect runtime.

Author

Hmm, you are right. I added lazy creation of the derivatives.
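In outline, the lazy scheme looks roughly like this (a pure-Python sketch of the idea, not the actual Theano op; DummySpline is a hypothetical stand-in for a SciPy spline):

```python
class SplineWrapper:
    """Sketch: the derivative wrapper is built on first use, then cached."""

    def __init__(self, spline):
        self.spline = spline
        # No derivative is computed here; grad_op appears lazily.

    def grad(self):
        if not hasattr(self, 'grad_op'):
            # The O(n) derivative() call happens at most once.
            self.grad_op = SplineWrapper(self.spline.derivative())
        return self.grad_op


class DummySpline:
    """Hypothetical stand-in counting how often derivative() is called."""

    calls = 0

    def derivative(self):
        DummySpline.calls += 1
        return DummySpline()


op = SplineWrapper(DummySpline())
op.grad()
op.grad()  # cached: derivative() is not invoked again
assert DummySpline.calls == 1
```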


ferrine commented May 24, 2017

Lgtm


if not hasattr(self, 'grad_op'):
    try:
        self.grad_op = SplineWrapper(self.spline.derivative())
Member

What about using a property?

Author

@ghost ghost May 24, 2017

I think we don't want to add grad_op to __props__ because it would break __hash__() and __eq__(). For example, this code works now as expected, but wouldn't work if grad_op were added to __props__:

s = InterpolatedUnivariateSpline(...)
op1 = SplineWrapper(s)
op2 = SplineWrapper(s)
op2.grad(...) # calculate gradient
assert op1 == op2

One would expect op1 to be equal to op2 because they wrap the same spline, but if grad_op were listed in __props__, that would no longer be the case.

As far as I understand, the purpose of __props__ is to be used for implementation of correct comparisons:

__props__ enables the automatic generation of appropriate __eq__() and __hash__(). Given the method __eq__(), automatically generated from __props__, two ops will be equal if they have the same values for all the properties listed in __props__. Given the method __hash__(), automatically generated from __props__, two ops will have the same hash if they have the same values for all the properties listed in __props__. __props__ will also generate a suitable __str__() for your op. This requires development version after September 1st, 2014 or version 0.7.

http://deeplearning.net/software/theano/extending/extending_theano.html#op-s-auxiliary-methods
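The behaviour described above can be illustrated with a simplified stand-in for Theano's __props__ machinery (a sketch for intuition, not Theano's actual implementation):

```python
class Op:
    """Simplified sketch of __props__-driven __eq__/__hash__ generation."""

    __props__ = ()

    def __eq__(self, other):
        return (type(self) is type(other)
                and all(getattr(self, p) == getattr(other, p)
                        for p in self.__props__))

    def __hash__(self):
        return hash((type(self),)
                    + tuple(getattr(self, p) for p in self.__props__))


class SplineWrapper(Op):
    __props__ = ('spline',)

    def __init__(self, spline):
        self.spline = spline


s = object()  # stands in for a SciPy spline
op1, op2 = SplineWrapper(s), SplineWrapper(s)
op2.grad_op = 'cached derivative'  # non-prop state does not affect equality
assert op1 == op2 and hash(op1) == hash(op2)
```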

Member

Good point. Isn't the derivative deterministic? In that case only the one prop (spline) is required.

Author

@ghost ghost May 24, 2017

Yes, that's why we have only one Theano prop, spline. grad_op is just a helper property for internal usage, needed to avoid unnecessary recalculation of the derivative spline; it is not a Theano property.

Member

By "property" I meant

@property
def grad_op(self):
    ...

Author

But we don't want to expose it to the user, do we?

Author

@ghost ghost May 24, 2017

I mean, what would the possible usages be? This class is internal to PyMC3 anyway, so I think it only needs the functionality that PyMC3 actually uses.

Member

Why not?

Member

Moreover, it would be prettier.

Author

@ghost ghost May 27, 2017

@ferrine Hmm, why not indeed :) Added it in a new commit.

@@ -378,6 +378,18 @@ class SplineWrapper (theano.Op):
      def __init__(self, spline):
          self.spline = spline

+     @property
Member

Nice!
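Filled out, the property-based variant hinted at in the diff above might look roughly like this (a sketch; the private attribute name and DummySpline are assumptions, not code from the PR):

```python
class SplineWrapper:
    """Sketch of the @property variant for lazy gradient-op creation."""

    def __init__(self, spline):
        self.spline = spline

    @property
    def grad_op(self):
        # Build and cache the derivative wrapper on first access only.
        if not hasattr(self, '_grad_op'):
            self._grad_op = SplineWrapper(self.spline.derivative())
        return self._grad_op


class DummySpline:
    """Hypothetical spline counting derivative() calls."""

    calls = 0

    def derivative(self):
        DummySpline.calls += 1
        return DummySpline()


op = SplineWrapper(DummySpline())
op.grad_op
op.grad_op  # second access hits the cache
assert DummySpline.calls == 1
```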


ferrine commented May 27, 2017

Could you please add a simple test that ensures that the problem in #2209 is solved?

Member

@ferrine ferrine left a comment

Code looks really great! I've left some code-style nitpicks, though.

@@ -365,6 +365,7 @@ def conjugate_solve_triangular(outer, inner):
grad = tt.triu(s + s.T) - tt.diag(tt.diagonal(s))
return [tt.switch(ok, grad, floatX(np.nan))]


class SplineWrapper (theano.Op):
Member

PEP 8 recommends against a space before the opening bracket.

Author

@ghost ghost May 27, 2017

I've fixed all these issues. But just to argue about PEP8... :)

PEP 8 forbids a space before the opening parenthesis of a function or method call, but says nothing about whitespace before the base-class list:

Avoid extraneous whitespace in the following situations:
... (nothing interesting)
Immediately before the open parenthesis that starts the argument list of a function call
... (nothing interesting)

Also, it says that

Method definitions inside a class are surrounded by a single blank line

so a blank line before the first method in a class is legitimate, as it surrounds a class method; one could even argue that PEP 8 requires it there.

Actually, the pep8 linter seems to accept both ways:

$ cat t.py 
class Foo (object):

    def foo(self):
        print('hi')


class Bar(object):
    def bar(self):
        print('hi')
$ pep8 t.py
$ # no errors reported

And as PEP 8 is a formalized description of the code style used in the Python standard library, we can check what is used there; it turns out that both variants are present.

However, that was mostly for the sake of argument. I've already committed the fixes, and I see that we need not just to follow PEP 8 but also to be consistent with other parts of PyMC3, so I agree with your nitpicks and find them absolutely reasonable :)

As an off-topic note, I actually think it would be nice to have pep8 or flake8 run automatically in Travis and fail builds on PEP 8 violations. However, I briefly investigated this possibility, and it seems necessary to either fix some rather large parts of the existing code and/or agree on exceptions from PEP 8 that we don't want to follow, such as the maximum line length.

@@ -16,7 +16,7 @@
from pymc3.theanof import floatX
from . import transforms

-from .dist_math import bound, logpow, gammaln, betaln, std_cdf, i0, i1, alltrue_elemwise, DifferentiableSplineWrapper
+from .dist_math import bound, logpow, gammaln, betaln, std_cdf, i0, i1, alltrue_elemwise, SplineWrapper
Member

I think this line is too long; we can wrap this import in brackets and make it a multiline import.
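The parenthesized multiline form suggested here would look something like the commented lines below (the pymc3 line itself is a relative import, so the runnable demonstration uses a stdlib module instead):

```python
# Hypothetical multiline form of the import under review (a relative
# import, so it cannot run outside the pymc3 package):
#
# from .dist_math import (bound, logpow, gammaln, betaln, std_cdf,
#                         i0, i1, alltrue_elemwise, SplineWrapper)

# The same parenthesized style, shown with a stdlib module:
from os.path import (basename, dirname,
                     join, splitext)

assert splitext('t.py') == ('t', '.py')
```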



class TestSplineWrapper:

Member

The blank line is not needed; also, it is good to inherit from object.


ferrine commented May 27, 2017

Once tests pass I'll merge. Good job, thanks!


ghost commented May 27, 2017

Thank you!

@ferrine ferrine merged commit 73694ef into pymc-devs:master May 27, 2017
Successfully merging this pull request may close these issues.

NUTS step, when selected explicitly, requires the second derivative