
WIP: explore API for multivariate hessians of user-defined functions #2953

Closed
odow wants to merge 25 commits into od/jump-nonlinear from od/multivariate-hessians

Conversation

@odow (Member) commented Apr 13, 2022

This is going to need some testing/performance evaluations.

But if we can get rid of the overhead, this is a much more extensible way of evaluating the Hessians. It also doesn't rely on ForwardDiff, which is nice.

Would solve the magic issue: #1198

@odow odow added the Category: Nonlinear Related to nonlinear programming label Apr 13, 2022
@codecov codecov bot commented Apr 13, 2022

Codecov Report

Merging #2953 (e82e6c0) into od/jump-nonlinear (981e575) will increase coverage by 0.10%.
The diff coverage is 97.08%.

@@                  Coverage Diff                  @@
##           od/jump-nonlinear    #2953      +/-   ##
=====================================================
+ Coverage              95.27%   95.38%   +0.10%     
=====================================================
  Files                     46       46              
  Lines                   5693     5717      +24     
=====================================================
+ Hits                    5424     5453      +29     
+ Misses                   269      264       -5     
| Impacted Files | Coverage Δ |
| --- | --- |
| src/nlp.jl | 94.96% <ø> (+0.63%) ⬆️ |
| src/Nonlinear/ReverseAD/utils.jl | 91.17% <89.28%> (-8.83%) ⬇️ |
| src/Nonlinear/ReverseAD/forward_over_reverse.jl | 97.26% <100.00%> (+1.10%) ⬆️ |
| src/Nonlinear/ReverseAD/mathoptinterface_api.jl | 96.73% <100.00%> (+0.01%) ⬆️ |
| src/Nonlinear/operators.jl | 93.60% <100.00%> (+1.57%) ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@odow (Member, Author) commented Apr 13, 2022

So I think this actually worked:

julia> f(x...) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
f (generic function with 1 method)

julia> function ∇f(g, x...)
           g[1] = 400 * x[1]^3 - 400 * x[1] * x[2] + 2 * x[1] - 2
           g[2] = 200 * (x[2] - x[1]^2)
           return
       end
∇f (generic function with 1 method)

julia> function ∇²f(H, x...)
           H[1, 1] = 1200 * x[1]^2 - 400 * x[2] + 2
           H[1, 2] = -400 * x[1]
           H[2, 2] = 200.0
           return
       end
∇²f (generic function with 1 method)

julia> using JuMP, Ipopt

julia> model = Model(Ipopt.Optimizer)
A JuMP Model
Feasibility problem with:
Variables: 0
Model mode: AUTOMATIC
CachingOptimizer state: EMPTY_OPTIMIZER
Solver name: Ipopt

julia> @variable(model, x)
x

julia> @variable(model, y)
y

julia> register(model, :rosenbrock, 2, f, ∇f, ∇²f)

julia> @NLobjective(model, Min, rosenbrock(x, y))

julia> optimize!(model)
This is Ipopt version 3.14.4, running with linear solver MUMPS 5.4.1.

Number of nonzeros in equality constraint Jacobian...:        0
Number of nonzeros in inequality constraint Jacobian.:        0
Number of nonzeros in Lagrangian Hessian.............:        3

Total number of variables............................:        2
                     variables with only lower bounds:        0
                variables with lower and upper bounds:        0
                     variables with only upper bounds:        0
Total number of equality constraints.................:        0
Total number of inequality constraints...............:        0
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:        0

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0  1.0000000e+00 0.00e+00 2.00e+00  -1.0 0.00e+00    -  0.00e+00 0.00e+00   0
   1  9.5312500e-01 0.00e+00 1.25e+01  -1.0 1.00e+00    -  1.00e+00 2.50e-01f  3
   2  4.8320569e-01 0.00e+00 1.01e+00  -1.0 9.03e-02    -  1.00e+00 1.00e+00f  1
   3  4.5708829e-01 0.00e+00 9.53e+00  -1.0 4.29e-01    -  1.00e+00 5.00e-01f  2
   4  1.8894205e-01 0.00e+00 4.15e-01  -1.0 9.51e-02    -  1.00e+00 1.00e+00f  1
   5  1.3918726e-01 0.00e+00 6.51e+00  -1.7 3.49e-01    -  1.00e+00 5.00e-01f  2
   6  5.4940990e-02 0.00e+00 4.51e-01  -1.7 9.29e-02    -  1.00e+00 1.00e+00f  1
   7  2.9144630e-02 0.00e+00 2.27e+00  -1.7 2.49e-01    -  1.00e+00 5.00e-01f  2
   8  9.8586451e-03 0.00e+00 1.15e+00  -1.7 1.10e-01    -  1.00e+00 1.00e+00f  1
   9  2.3237475e-03 0.00e+00 1.00e+00  -1.7 1.00e-01    -  1.00e+00 1.00e+00f  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  10  2.3797236e-04 0.00e+00 2.19e-01  -1.7 5.09e-02    -  1.00e+00 1.00e+00f  1
  11  4.9267371e-06 0.00e+00 5.95e-02  -1.7 2.53e-02    -  1.00e+00 1.00e+00f  1
  12  2.8189505e-09 0.00e+00 8.31e-04  -2.5 3.20e-03    -  1.00e+00 1.00e+00f  1
  13  1.0095041e-15 0.00e+00 8.68e-07  -5.7 9.78e-05    -  1.00e+00 1.00e+00f  1
  14  1.9770826e-29 0.00e+00 1.71e-13  -8.6 4.65e-08    -  1.00e+00 1.00e+00f  1

Number of Iterations....: 14

                                   (scaled)                 (unscaled)
Objective...............:   1.9770826437101608e-29    1.9770826437101608e-29
Dual infeasibility......:   1.7097434579227411e-13    1.7097434579227411e-13
Constraint violation....:   0.0000000000000000e+00    0.0000000000000000e+00
Variable bound violation:   0.0000000000000000e+00    0.0000000000000000e+00
Complementarity.........:   0.0000000000000000e+00    0.0000000000000000e+00
Overall NLP error.......:   1.7097434579227411e-13    1.7097434579227411e-13


Number of objective function evaluations             = 36
Number of objective gradient evaluations             = 15
Number of equality constraint evaluations            = 0
Number of inequality constraint evaluations          = 0
Number of equality constraint Jacobian evaluations   = 0
Number of inequality constraint Jacobian evaluations = 0
Number of Lagrangian Hessian evaluations             = 14
Total seconds in IPOPT                               = 0.054

EXIT: Optimal Solution Found.

julia> solution_summary(model; verbose = true)
* Solver : Ipopt

* Status
  Termination status : LOCALLY_SOLVED
  Primal status      : FEASIBLE_POINT
  Dual status        : FEASIBLE_POINT
  Result count       : 1
  Has duals          : true
  Message from the solver:
  "Solve_Succeeded"

* Candidate solution
  Objective value      : 1.97708e-29
  Primal solution :
    x : 1.00000e+00
    y : 1.00000e+00
  Dual solution :

* Work counters
  Solve time (sec)   : 5.53389e-02
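Not part of the PR, but as a quick standalone sanity check of the hand-coded derivatives above (sketched in Python rather than Julia so it runs independently of JuMP), the gradient and Hessian can be compared against central finite differences. The test point `x = (0.5, 0.7)` is arbitrary, not taken from the solver run.

```python
# Finite-difference cross-check of the Rosenbrock gradient and Hessian
# used in the Julia demo above.
def f(x1, x2):
    return (1 - x1) ** 2 + 100 * (x2 - x1 ** 2) ** 2

def grad(x1, x2):
    return [400 * x1 ** 3 - 400 * x1 * x2 + 2 * x1 - 2,
            200 * (x2 - x1 ** 2)]

def hess(x1, x2):
    # The Julia callback only fills the upper triangle; the matrix is
    # symmetric, so H[2, 1] == H[1, 2].
    return [[1200 * x1 ** 2 - 400 * x2 + 2, -400 * x1],
            [-400 * x1, 200.0]]

h = 1e-6
x = (0.5, 0.7)
# Gradient via central differences of f.
fd_g = [(f(x[0] + h, x[1]) - f(x[0] - h, x[1])) / (2 * h),
        (f(x[0], x[1] + h) - f(x[0], x[1] - h)) / (2 * h)]
g = grad(*x)
assert all(abs(a - b) < 1e-4 for a, b in zip(fd_g, g))

# First row of the Hessian via central differences of the gradient.
fd_h = [(grad(x[0] + h, x[1])[j] - grad(x[0] - h, x[1])[j]) / (2 * h)
        for j in range(2)]
H = hess(*x)
assert all(abs(fd_h[j] - H[0][j]) < 1e-3 for j in range(2))
```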

@odow (Member, Author) commented Apr 14, 2022

cc @ccoffrin @joaquimg here's a cool demo

julia> using JuMP

julia> import Ipopt

julia> function bilevel_problem()
           last_x, last_f, last_y = nothing, 0.0, [NaN, NaN]
           function _update(x...)
               if last_x == x
                   return last_y
               end
               model = Model(Ipopt.Optimizer)
               set_silent(model)
               @variable(model, y[1:2])
               @NLobjective(
                   model,
                   Max,
                   y[1] * x[1]^2 + y[2] * x[2]^2 - x[1] * y[1]^4 - 2 * x[2] * y[2]^4,
               )
               @constraint(model, (y[1] - 10)^2 + (y[2] - 10)^2 <= 25)
               optimize!(model)
               @assert termination_status(model) == LOCALLY_SOLVED
               last_x = x
               last_f = objective_value(model)
               last_y[1] = value(y[1])
               last_y[2] = value(y[2])
               return last_y
           end
           function f(x...)
               _update(x...)
               return last_f
           end
           function ∇f(g, x...)
               _update(x...)
               g[1] = 2 * x[1] * last_y[1] - last_y[1]^4
               g[2] = 2 * x[2] * last_y[2] - 2 * last_y[2]^4
               return
           end
           function ∇²f(H, x...)
               _update(x...)
               H[1, 1] = 2 * last_y[1]
               H[2, 2] = 2 * last_y[2]
               return
           end
           return _update, f, ∇f, ∇²f
       end
bilevel_problem (generic function with 1 method)

julia> function bench_bilevel_nlp()
           model = Model(Ipopt.Optimizer)
           set_optimizer_attribute(model, "tol", 1e-5)
           @variable(model, x[1:2] >= 0)
           subproblem, f, ∇f, ∇²f = bilevel_problem()
           register(model, :bilevel_f, 2, f, ∇f, ∇²f)
           @NLobjective(model, Min, bilevel_f(x[1], x[2]) + x[1]^2 + x[2]^2)
           optimize!(model)
           @assert termination_status(model) == LOCALLY_SOLVED
           println(objective_value(model))
           println("x = ", value.(x))
           println("y = ", subproblem(value.(x)...))
           return
       end
bench_bilevel_nlp (generic function with 1 method)

julia> bench_bilevel_nlp()
This is Ipopt version 3.14.4, running with linear solver MUMPS 5.4.1.

Number of nonzeros in equality constraint Jacobian...:        0
Number of nonzeros in inequality constraint Jacobian.:        0
Number of nonzeros in Lagrangian Hessian.............:        3

Total number of variables............................:        2
                     variables with only lower bounds:        2
                variables with lower and upper bounds:        0
                     variables with only upper bounds:        0
Total number of equality constraints.................:        0
Total number of inequality constraints...............:        0
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:        0

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0 -4.9912084e+01 0.00e+00 1.40e+01  -1.0 0.00e+00    -  0.00e+00 0.00e+00   0
   1 -1.1605773e+03 0.00e+00 1.27e+01  -1.0 2.30e-01    -  7.08e-02 1.00e+00f  1
   2 -1.9284708e+05 0.00e+00 2.13e+01  -1.0 1.19e+02    -  2.18e-03 1.00e+00f  1
   3 -3.0193185e+05 0.00e+00 8.75e+00  -1.0 2.22e+02    -  8.52e-02 1.00e+00f  1
   4 -3.9250131e+05 0.00e+00 3.95e+00  -1.0 1.17e+02    -  1.00e+00 1.00e+00f  1
   5 -4.1203885e+05 0.00e+00 2.12e+00  -1.0 5.51e+01    -  1.00e+00 1.00e+00f  1
   6 -4.1717829e+05 0.00e+00 1.06e+00  -1.0 2.72e+01    -  1.00e+00 1.00e+00f  1
   7 -4.1850263e+05 0.00e+00 5.53e-01  -1.0 1.38e+01    -  1.00e+00 1.00e+00f  1
   8 -4.1885636e+05 0.00e+00 2.83e-01  -1.7 7.01e+00    -  1.00e+00 1.00e+00f  1
   9 -4.1894969e+05 0.00e+00 1.46e-01  -1.7 3.61e+00    -  1.00e+00 1.00e+00f  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  10 -4.1897452e+05 0.00e+00 7.52e-02  -2.5 1.85e+00    -  1.00e+00 1.00e+00f  1
  11 -4.1898111e+05 0.00e+00 3.88e-02  -2.5 9.54e-01    -  1.00e+00 1.00e+00f  1
  12 -4.1898286e+05 0.00e+00 2.00e-02  -2.5 4.91e-01    -  1.00e+00 1.00e+00f  1
  13 -4.1898332e+05 0.00e+00 1.03e-02  -3.8 2.53e-01    -  1.00e+00 1.00e+00f  1
  14 -4.1898344e+05 0.00e+00 5.30e-03  -3.8 1.30e-01    -  1.00e+00 1.00e+00f  1
  15 -4.1898347e+05 0.00e+00 2.73e-03  -3.8 6.72e-02    -  1.00e+00 1.00e+00f  1
  16 -4.1898348e+05 0.00e+00 1.41e-03  -3.8 3.46e-02    -  1.00e+00 1.00e+00f  1
  17 -4.1898349e+05 0.00e+00 7.25e-04  -5.7 1.78e-02    -  1.00e+00 1.00e+00f  1
  18 -4.1898349e+05 0.00e+00 3.74e-04  -5.7 9.19e-03    -  1.00e+00 1.00e+00f  1
  19 -4.1898349e+05 0.00e+00 1.92e-04  -5.7 4.73e-03    -  1.00e+00 1.00e+00f  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  20 -4.1898349e+05 0.00e+00 9.92e-05  -5.7 2.44e-03    -  1.00e+00 1.00e+00f  1
  21 -4.1898349e+05 0.00e+00 5.11e-05  -5.7 1.26e-03    -  1.00e+00 1.00e+00f  1
  22 -4.1898349e+05 0.00e+00 2.63e-05  -5.7 6.47e-04    -  1.00e+00 1.00e+00f  1
  23 -4.1898349e+05 0.00e+00 1.36e-05  -5.7 3.34e-04    -  1.00e+00 1.00e+00f  1
  24 -4.1898349e+05 0.00e+00 6.99e-06  -7.4 1.72e-04    -  1.00e+00 1.00e+00f  1

Number of Iterations....: 24

                                   (scaled)                 (unscaled)
Objective...............:  -2.0563555340131775e+03   -4.1898348680632992e+05
Dual infeasibility......:   6.9876474386248716e-06    1.4237367225572424e-03
Constraint violation....:   0.0000000000000000e+00    0.0000000000000000e+00
Variable bound violation:   0.0000000000000000e+00    0.0000000000000000e+00
Complementarity.........:   4.4619837019171138e-08    9.0913145055891301e-06
Overall NLP error.......:   6.9876474386248716e-06    1.4237367225572424e-03


Number of objective function evaluations             = 25
Number of objective gradient evaluations             = 25
Number of equality constraint evaluations            = 0
Number of inequality constraint evaluations          = 0
Number of equality constraint Jacobian evaluations   = 0
Number of inequality constraint Jacobian evaluations = 0
Number of Lagrangian Hessian evaluations             = 24
Total seconds in IPOPT                               = 0.367

EXIT: Optimal Solution Found.
-418983.4868063299
x = [154.978565241817, 180.00967268617077]
y = [7.072594398709957, 5.946569576826147]
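The gradient in `∇f` treats the inner solution `y` as locally constant, which is justified by the envelope theorem: at the inner optimum, the first-order effect of `y` moving with `x` vanishes. As a quick check (a Python sketch, using the rounded `x` and `y` values printed by the run above), the hand-coded gradient matches central finite differences of the inner objective with `y` held fixed:

```python
# Envelope-theorem check: with y held at the reported inner optimum, the
# partial derivatives of the outer objective match the expressions in ∇f.
# The x and y values are the (rounded) numbers printed by the run above.
def outer(x1, x2, y1, y2):
    return y1 * x1 ** 2 + y2 * x2 ** 2 - x1 * y1 ** 4 - 2 * x2 * y2 ** 4

x = (154.9786, 180.0097)
y = (7.0726, 5.9466)
g = [2 * x[0] * y[0] - y[0] ** 4,       # matches g[1] in ∇f
     2 * x[1] * y[1] - 2 * y[1] ** 4]   # matches g[2] in ∇f
h = 1e-4
fd = [(outer(x[0] + h, x[1], *y) - outer(x[0] - h, x[1], *y)) / (2 * h),
      (outer(x[0], x[1] + h, *y) - outer(x[0], x[1] - h, *y)) / (2 * h)]
assert all(abs(a - b) < 1e-2 for a, b in zip(fd, g))
```

(The Hessian callback `∇²f` makes the same y-held-fixed approximation; that is exact for the gradient but not, in general, for the second derivatives of the value function.)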

@odow (Member, Author) commented Apr 14, 2022

Sadly, the first-order problem (i.e., if you don't pass the Hessian) solves faster and in fewer iterations. I'm not sure why.

@ccoffrin (Contributor) commented
It's pretty common for small problems that first-order methods are fast to converge. I'm sure that for some classes of larger problems Hessian support is more important; I can dig up a compelling example if there is support for user-defined Hessians.

@odow odow force-pushed the od/multivariate-hessians branch from 24a7ce7 to 91e082f Compare April 18, 2022 01:08
@odow odow force-pushed the od/jump-nonlinear branch from 9bdbb1a to 63eb7b6 Compare April 18, 2022 04:36
odow added 12 commits April 18, 2022 16:41
- Updates
- More updates with beginnings of auto-registration
- Add back printing
- Simplify parsing macro
- Re-add support for generators
- Support . field access
- More updates
- More updates
- More fixes
- Updates to the docs
- More fixes for the docs
- Documentation fix
- More fixes
- Lots more fixes
- Fix format
- More fixes
- Fix formatting and re-org
- Coloring tests
- More tests
- Fix tests
- More test fixes
- More fixes
- More printing fixes
- F
- Implement value
- Fix test failure
- Fix broken tests
- Revert change to nonlinear constraint expressions
- Update
- Fix tests
- Another fix
- Fix test
- Another fix
- Start on documentation
- Reenable precompile
- Various updates and improvements
- Add more docs
- Fix docs
- Add docstrings
- Variety of docs and other improvements
@odow odow force-pushed the od/jump-nonlinear branch from 63eb7b6 to 981e575 Compare April 18, 2022 04:41
@odow odow force-pushed the od/multivariate-hessians branch from 5390f68 to 575c20b Compare April 18, 2022 04:50
@odow (Member, Author) commented Apr 26, 2022

Closing in favor of #2961

@odow odow closed this Apr 26, 2022
@odow odow deleted the od/multivariate-hessians branch April 26, 2022 00:48