
Commit 907080a

[Nonlinear] add the Nonlinear.ReverseAD submodule
The majority of this development was carried out in the JuMP PRs:

* jump-dev/JuMP.jl#2939
* jump-dev/JuMP.jl#2942
* jump-dev/JuMP.jl#2943

Nonlinear.ReverseAD is a minor refactoring of code that previously existed in JuMP under the _Derivatives submodule, and before that in the ReverseDiffSparse.jl package.

16 files changed: +3418 −0 lines

docs/src/submodules/Nonlinear/overview.md (+105)
@@ -285,6 +285,7 @@ The following backends are available to choose from within MOI, although other
packages may add more options by sub-typing
[`Nonlinear.AbstractAutomaticDifferentiation`](@ref):
* [`Nonlinear.ExprGraphOnly`](@ref)
* [`Nonlinear.SparseReverseMode`](@ref).

```jldoctest nonlinear_developer
julia> evaluator = Nonlinear.Evaluator(model, Nonlinear.ExprGraphOnly(), [x])
@@ -317,6 +318,52 @@ julia> MOI.set(model, MOI.NLPBlock(), block);
Only call [`NLPBlockData`](@ref) once you have finished modifying the
problem in `model`.

If, instead, we set [`Nonlinear.SparseReverseMode`](@ref), then we get access
to `:Grad`, the gradient of the objective function, `:Jac`, the Jacobian
matrix of the constraints, `:JacVec`, the ability to compute Jacobian-vector
products, and `:ExprGraph`.
```jldoctest nonlinear_developer
julia> Nonlinear.set_differentiation_backend(
           data,
           Nonlinear.SparseReverseMode(),
           [x],
       )

julia> data
NonlinearData with available features:
  * :Grad
  * :Jac
  * :JacVec
  * :ExprGraph
```

However, before calling anything, we need to call [`initialize`](@ref):
```jldoctest nonlinear_developer
julia> MOI.initialize(data, [:Grad, :Jac, :JacVec, :ExprGraph])
```
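
The list of features does not have to be hard-coded. The block below is a
minimal sketch, not one of the doctests above, assuming `data` implements
`MOI.features_available` like other `MOI.AbstractNLPEvaluator`s:

```julia
# A sketch, not a doctest: query the supported features instead of
# hard-coding them. We assume `data` implements `MOI.features_available`.
features = MOI.features_available(data)

# Initialize only the features we intend to use.
MOI.initialize(data, [f for f in (:Grad, :Jac, :JacVec) if f in features])
```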

Now we can call methods like [`eval_objective`](@ref):
```jldoctest nonlinear_developer
julia> x = [1.0]
1-element Vector{Float64}:
 1.0

julia> MOI.eval_objective(data, x)
7.268073418273571
```
and [`eval_objective_gradient`](@ref):
```jldoctest nonlinear_developer
julia> grad = [NaN]
1-element Vector{Float64}:
 NaN

julia> MOI.eval_objective_gradient(data, grad, x)

julia> grad
1-element Vector{Float64}:
 1.909297426825682
```
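
The `:Jac` and `:JacVec` features follow the same pattern. The block below is
a sketch rather than a doctest, because the values depend on the constraints
in `model`; it assumes the model contains a single nonlinear constraint and
uses the standard MOI evaluator API:

```julia
# Sparsity pattern of the constraint Jacobian: one (row, column) tuple per
# structural non-zero.
structure = MOI.jacobian_structure(data)

# Evaluate the structural non-zeros in-place at the point `x`.
J = fill(NaN, length(structure))
MOI.eval_constraint_jacobian(data, J, x)

# :JacVec computes the product J(x) * w without materializing J.
# `y` needs one entry per constraint; we assume a single constraint here.
w = ones(length(x))
y = zeros(1)
MOI.eval_constraint_jacobian_product(data, y, x, w)
```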

## Expression-graph representation

[`Nonlinear.Model`](@ref) stores nonlinear expressions in
@@ -477,3 +524,61 @@ user-defined functions using [`Nonlinear.register_operator`](@ref).
[`Nonlinear.Model`](@ref) is a struct that stores the
[`Nonlinear.OperatorRegistry`](@ref), as well as a list of parameters and
subexpressions in the model.

## ReverseAD

`Nonlinear.ReverseAD` is a submodule for computing derivatives of the problem
inside [`Nonlinear.NonlinearData`](@ref) using sparse reverse-mode automatic
differentiation (AD).

This section does not attempt to explain how sparse reverse-mode AD works.
Instead, it explains why MOI contains its own implementation, and highlights
notable differences from similar packages.

!!! warning
    You should not interact with `ReverseAD` directly. Instead, you should
    create a [`Nonlinear.NonlinearData`](@ref) object, call
    [`Nonlinear.set_differentiation_backend`](@ref) with
    [`Nonlinear.SparseReverseMode`](@ref), and then query the MOI API methods.

### Why another AD package?

The JuliaDiff organization maintains a [list of packages](https://juliadiff.org)
for doing AD in Julia. At last count, there were at least ten packages, not
including `ReverseAD`, for reverse-mode AD in Julia. Given this multitude, why
does MOI maintain another implementation instead of re-using existing tooling?

Here are four reasons:

* **Scale and Sparsity:** the types of functions that MOI computes derivatives
  of have two key characteristics: they can be very large scale (10^5 or more
  functions across 10^5 or more variables) and they are very sparse. For large
  problems, it is common for the Hessian to have `O(n)` non-zero elements
  instead of the `O(n^2)` it would have if it were dense. To the best of our
  knowledge, `ReverseAD` is the only reverse-mode AD system in Julia that
  handles sparsity by default. The lack of sparsity support is _the_ main
  reason why we do not use a generic package; the sketch after this list shows
  how this sparsity surfaces in the MOI API.
* **Limited scope:** most other AD packages accept arbitrary Julia functions
  as input and then trace an expression graph using operator overloading. This
  means they must deal with (or detect and ignore) control flow, I/O, and
  other vagaries of Julia. In contrast, `ReverseAD` only accepts functions in
  the form of [`Nonlinear.NonlinearExpression`](@ref), which greatly limits
  the range of syntax that it must deal with. By reducing the scope of what we
  accept as input to functions relevant for mathematical optimization, we can
  provide a simpler implementation with various performance optimizations.
* **Historical:** `ReverseAD` started life as [ReverseDiffSparse.jl](https://github.com/mlubin/ReverseDiffSparse.jl),
  development of which began in early 2014(!). This was well before the other
  packages started development. Because we had a well-tested, working AD in
  JuMP, there was less motivation to contribute to and explore other AD
  packages. The lack of historical interaction also meant that other packages
  were not optimized for the types of problems that JuMP is built for (i.e.,
  large-scale sparse problems). When we first created MathOptInterface, we
  kept the AD in JuMP to simplify the transition, and postponed the
  development of a first-class nonlinear interface in MathOptInterface.
* **Technical debt:** prior to the introduction of `Nonlinear`, JuMP's
  nonlinear implementation was a confusing mix of functions and types spread
  across the code base and in the private `_Derivatives` submodule. This made
  it hard to swap the AD system for another. The main motivation for
  refactoring JuMP to create the `Nonlinear` submodule in MathOptInterface was
  to abstract the interface between JuMP and the AD system, allowing us to
  swap in and test new AD systems in the future.
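
To make the sparsity point concrete, here is a sketch of how sparsity
surfaces in the MOI API. It reuses `data` and `x` from the examples above,
and the single constraint multiplier is an assumption; the point is that the
Hessian is exchanged as a list of structural non-zeros, never as a dense
`n`-by-`n` matrix:

```julia
# :Hess must be requested via `initialize` before use (shown standalone
# here for brevity; in practice, include it in the single `initialize` call).
MOI.initialize(data, [:Hess])

# (row, column) tuples for the lower triangle of the Hessian of the
# Lagrangian; for typical large-scale problems this has O(n) entries.
structure = MOI.hessian_lagrangian_structure(data)

# Evaluate only those entries in-place: `1.0` scales the objective and
# `[1.0]` weights the constraints (assuming one nonlinear constraint).
H = fill(NaN, length(structure))
MOI.eval_hessian_lagrangian(data, H, x, 1.0, [1.0])
```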

docs/src/submodules/Nonlinear/reference.md (+1)
@@ -70,6 +70,7 @@ Nonlinear.eval_comparison_function
Nonlinear.Evaluator
Nonlinear.AbstractAutomaticDifferentiation
Nonlinear.ExprGraphOnly
Nonlinear.SparseReverseMode
```

## Data-structure

src/Nonlinear/Nonlinear.jl (+2)
@@ -50,4 +50,6 @@ include("parse.jl")
include("model.jl")
include("evaluator.jl")

include("ReverseAD/ReverseAD.jl")

end # module
