[ReverseAD] Add support for user-defined hessians #1819

odow · 2022-04-26T00:35:03Z

Refactored from jump-dev/JuMP.jl#2953

src/Nonlinear/ReverseAD/utils.jl

frapac

Nice! As far as I understand, we assume by default that the Hessian of a custom operator is dense, right?

If so, would it be possible to emphasize this in the documentation (I can do it if that helps)?

src/Nonlinear/ReverseAD/utils.jl

test/Nonlinear/ReverseAD.jl

src/Nonlinear/ReverseAD/utils.jl

src/Nonlinear/ReverseAD/mathoptinterface_api.jl

test/Nonlinear/ReverseAD.jl

odow · 2022-05-23T23:22:16Z

So I did some benchmarking of this (a diagonal quadratic is also effectively a worst-case usage of this):

julia> using JuMP, Ipopt

julia> function main(N)
           model = Model(Ipopt.Optimizer)
           set_silent(model)
           @variable(model, x[1:N])
           function foo(x...)
               return sum((x[i] - 1.1*i)^2 for i in 1:N)
           end
           register(model, :foo, N, foo; autodiff = true)
           @NLobjective(model, Min, foo(x...))
           optimize!(model)
           return value.(x)
       end
main (generic function with 1 method)

julia> main(10); # compilation

julia> for i in [10, 100, 1_000, 10_000]
           GC.gc()
           @time main(i)
       end
  0.003252 seconds (1.53 k allocations: 98.891 KiB)
  0.830141 seconds (3.08 M allocations: 175.975 MiB, 3.97% gc time)
  0.186039 seconds (591.61 k allocations: 153.440 MiB, 10.86% gc time)
  7.209030 seconds (33.99 M allocations: 13.457 GiB, 12.40% gc time)

julia> function main_1(N)
           model = Model(Ipopt.Optimizer)
           set_silent(model)
           @variable(model, x[1:N])
           function foo(x...)
               return sum((x[i] - 1.1*i)^2 for i in 1:N)
           end
           function ∇foo(g, x...)
               for i in 1:N
                   g[i] = 2 * (x[i] - 1.1 * i)
               end
               return
           end
           register(model, :foo, N, foo, ∇foo)
           @NLobjective(model, Min, foo(x...))
           optimize!(model)
           return value.(x)
       end
main_1 (generic function with 1 method)

julia> main_1(10); # compilation

julia> for i in [10, 100, 1_000, 10_000]
           GC.gc()
           @time main_1(i)
       end
  0.002592 seconds (1.44 k allocations: 71.875 KiB)
  0.028930 seconds (80.08 k allocations: 4.595 MiB, 88.17% compilation time)
  0.032589 seconds (117.40 k allocations: 6.269 MiB, 75.08% compilation time)
  0.083687 seconds (495.45 k allocations: 23.409 MiB, 30.14% compilation time)

julia> function main_2(N)
           model = Model(Ipopt.Optimizer)
           set_silent(model)
           @variable(model, x[1:N])
           function foo(x...)
               return sum((x[i] - 1.1*i)^2 for i in 1:N)
           end
           function ∇foo(g, x...)
               for i in 1:N
                   g[i] = 2 * (x[i] - 1.1 * i)
               end
               return
           end
           function ∇²foo(H, x...)
               for i in 1:N
                   H[i, i] = 2.0
               end
               return
           end
           register(model, :foo, N, foo, ∇foo, ∇²foo)
           @NLobjective(model, Min, foo(x...))
           optimize!(model)
           return value.(x)
       end
main_2 (generic function with 1 method)

julia> main_2(10); # compilation

julia> for i in [10, 100, 1_000] # , 10_000]
           GC.gc()
           @time main_2(i)
       end
  0.002018 seconds (2.67 k allocations: 162.812 KiB)
  0.047883 seconds (216.95 k allocations: 13.330 MiB, 71.50% compilation time)
  5.314729 seconds (13.42 M allocations: 822.473 MiB, 7.54% gc time, 0.63% compilation time)

As could be expected, it scales pretty poorly if you have large input vectors. But there's an easy escape hatch: just don't define the hessian and use first-order methods. This is really targeting problems which have lots of other structure, and a small user-defined function in one part that is disabling second-order information for the full problem.

odow · 2022-05-26T20:23:45Z

Conclusion was if we wanted to add a sparse method in future, we could add register(model, :foo, N, foo, ∇foo, ∇²foo; hessian_sparsity = [(1, 1), (2, 2)]) and then H would be a Vector{Float64} as input. This would be non-breaking.

mlubin · 2022-05-29T20:19:22Z

src/Nonlinear/ReverseAD/forward_over_reverse.jl

+                        f_input[i] = ex.forward_storage[children_arr[c]]
+                    end
+                    H = _UnsafeLowerTriangularMatrixView(
+                        d.user_output_buffer,


Where is it enforced that user_output_buffer is large enough?

It calls this constructor: https://github.com/jump-dev/MathOptInterface.jl/pull/1819/files#diff-b6954ec7398499723bd82286762989c12564fda78890c07d49a3c1c06e0ea3a7R116

So the constructor could break other unsafe views on the same array. That sounds extra unsafe... At least needs to be documented.

Yes. For now though, the only usage of this is evaluating the user-defined hessians, and we use the .user_output_buffer as storage, so it's a safe usage of the unsafe behavior.

odow added the Project: next-gen nonlinear support Issues relating to nonlinear support label Apr 26, 2022

odow mentioned this pull request Apr 26, 2022

Add support for user-defined multivariate hessians jump-dev/JuMP.jl#2961

Merged

odow force-pushed the od/nonlinear branch from 9acff59 to f125e16 Compare May 2, 2022 20:45

odow mentioned this pull request May 2, 2022

Add [Nonlinear.ReverseAD] submodule #1803

Merged

3 tasks

odow force-pushed the od/multivariate-hessians branch from 493b86d to 9e2fd5b Compare May 3, 2022 03:01

Base automatically changed from od/nonlinear to master May 5, 2022 21:47

odow force-pushed the od/multivariate-hessians branch from 9e2fd5b to 4101a43 Compare May 5, 2022 21:52

odow added the Status: Needs review label May 5, 2022

matbesancon reviewed May 6, 2022

View reviewed changes

src/Nonlinear/ReverseAD/utils.jl Outdated Show resolved Hide resolved

frapac reviewed May 16, 2022

View reviewed changes

odow mentioned this pull request May 23, 2022

Prep for v1.3.0 #1849

Merged

3 tasks

mlubin reviewed May 29, 2022

View reviewed changes

odow added 5 commits May 30, 2022 13:14

[ReverseAD] Add support for user-defined hessians

1dda3c5

Fix formatting

9b8ca72

Improve coverage

95c66bd

Updates from francois

51ffec8

Update docs of unsafe vector views

e815011

odow force-pushed the od/multivariate-hessians branch from 305a157 to e815011 Compare May 30, 2022 01:14

mlubin approved these changes May 30, 2022

View reviewed changes

odow merged commit 5802209 into master May 30, 2022

odow deleted the od/multivariate-hessians branch May 30, 2022 21:12

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ReverseAD] Add support for user-defined hessians #1819

[ReverseAD] Add support for user-defined hessians #1819

odow commented Apr 26, 2022

frapac left a comment

odow commented May 23, 2022 •

edited

Loading

odow commented May 26, 2022

mlubin May 29, 2022

odow May 29, 2022 •

edited

Loading

mlubin May 29, 2022

odow May 30, 2022

[ReverseAD] Add support for user-defined hessians #1819

[ReverseAD] Add support for user-defined hessians #1819

Conversation

odow commented Apr 26, 2022

frapac left a comment

Choose a reason for hiding this comment

odow commented May 23, 2022 • edited Loading

odow commented May 26, 2022

mlubin May 29, 2022

Choose a reason for hiding this comment

odow May 29, 2022 • edited Loading

Choose a reason for hiding this comment

mlubin May 29, 2022

Choose a reason for hiding this comment

odow May 30, 2022

Choose a reason for hiding this comment

odow commented May 23, 2022 •

edited

Loading

odow May 29, 2022 •

edited

Loading