Commit aabe99b
[Nonlinear] add the Nonlinear.ReverseAD submodule
The majority of this development was carried out in the JuMP PRs:

* jump-dev/JuMP.jl#2939
* jump-dev/JuMP.jl#2942
* jump-dev/JuMP.jl#2943

Nonlinear.ReverseAD is a minor refactoring of code that previously existed in
JuMP under the `_Derivatives` submodule, and prior to that in the
ReverseDiffSparse.jl package.
1 parent 406e190

16 files changed: +3424 -1 lines changed
docs/src/submodules/Nonlinear/overview.md

Lines changed: 105 additions & 0 deletions
@@ -208,6 +208,7 @@ before we can call [`initialize`](@ref), we need to set an
There are two options to choose from within MOI, although other packages may
add more by sub-typing [`Nonlinear.AbstractAutomaticDifferentiation`](@ref):
* [`Nonlinear.ExprGraphOnly`](@ref)
* [`Nonlinear.SparseReverseMode`](@ref).

If we set [`Nonlinear.ExprGraphOnly`](@ref), then we get access to `:ExprGraph`:
```jldoctest nonlinear_developer
@@ -229,6 +230,52 @@ However, we cannot call gradient terms such as
[`eval_objective_gradient`](@ref) because [`Nonlinear.ExprGraphOnly`](@ref) does
not know how to differentiate a nonlinear expression.

If, instead, we set [`Nonlinear.SparseReverseMode`](@ref), then we get access to
`:Grad`, the gradient of the objective function, `:Jac`, the Jacobian matrix of
the constraints, `:JacVec`, the ability to compute Jacobian-vector products, and
`:ExprGraph`.
```jldoctest nonlinear_developer
julia> Nonlinear.set_differentiation_backend(
           data,
           Nonlinear.SparseReverseMode(),
           [x],
       )

julia> data
NonlinearData with available features:
* :Grad
* :Jac
* :JacVec
* :ExprGraph
```

However, before calling anything, we need to call [`initialize`](@ref):
```jldoctest nonlinear_developer
julia> MOI.initialize(data, [:Grad, :Jac, :JacVec, :ExprGraph])
```

Now we can call methods like [`eval_objective`](@ref):
```jldoctest nonlinear_developer
julia> x = [1.0]
1-element Vector{Float64}:
 1.0

julia> MOI.eval_objective(data, x)
7.268073418273571
```
and [`eval_objective_gradient`](@ref):
```jldoctest nonlinear_developer
julia> grad = [NaN]
1-element Vector{Float64}:
 NaN

julia> MOI.eval_objective_gradient(data, grad, x)

julia> grad
1-element Vector{Float64}:
 1.909297426825682
```
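Since the backend also reports `:Jac` and `:JacVec`, the corresponding MOI
methods can be used to work with the constraint Jacobian. Here is a minimal
sketch, assuming the model contains at least one constraint; `num_constraints`
is a placeholder, and this is shown as a plain code block rather than a doctest
because the output depends on the model's contents:
```julia
# :Jac -- query the sparsity pattern of the constraint Jacobian as a vector
# of (row, column) pairs, then fill a buffer with the corresponding non-zero
# values at `x`.
J_structure = MOI.jacobian_structure(data)
J = zeros(length(J_structure))
MOI.eval_constraint_jacobian(data, J, x)

# :JacVec -- compute the Jacobian-vector product y = J(x) * w without
# materializing the Jacobian. `num_constraints` is a placeholder for the
# number of constraints in the model.
w = ones(length(x))
y = zeros(num_constraints)
MOI.eval_constraint_jacobian_product(data, y, x, w)
```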
## Expression-graph representation

[`Nonlinear.NonlinearData`](@ref) stores nonlinear expressions in
@@ -388,3 +435,61 @@ user-defined functions using [`Nonlinear.register_operator`](@ref).
[`Nonlinear.NonlinearData`](@ref) is a struct that stores the
[`Nonlinear.OperatorRegistry`](@ref), as well as a list of parameters and
subexpressions in the model.
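To see how these pieces fit together, here is a minimal sketch. The call to
`Nonlinear.set_objective` appears in this commit; `add_parameter` and
`add_expression` are assumed helper names based on the surrounding
documentation:
```julia
import MathOptInterface
const MOI = MathOptInterface
const Nonlinear = MOI.Nonlinear

data = Nonlinear.NonlinearData()
x = MOI.VariableIndex(1)

# `add_parameter` (assumed name): a parameter whose value can be updated
# later without rebuilding the expression graph.
p = Nonlinear.add_parameter(data, 1.2)

# `add_expression` (assumed name): a subexpression that can be shared
# between the objective and the constraints.
expr = Nonlinear.add_expression(data, :($p * sin($x)))

Nonlinear.set_objective(data, :($expr^2))
```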
## ReverseAD

`Nonlinear.ReverseAD` is a submodule for computing derivatives of the problem
inside [`Nonlinear.NonlinearData`](@ref) using sparse reverse-mode automatic
differentiation (AD).

This section does not attempt to explain how sparse reverse-mode AD works, but
instead explains why MOI contains its own implementation, and highlights
notable differences from similar packages.

!!! warning
    You should not interact with `ReverseAD` directly. Instead, you should
    create a [`Nonlinear.NonlinearData`](@ref) object, call
    [`Nonlinear.set_differentiation_backend`](@ref) with
    [`Nonlinear.SparseReverseMode`](@ref), and then query the MOI API methods.

### Why another AD package?

The JuliaDiff organization maintains a [list of packages](https://juliadiff.org)
for doing AD in Julia. At last count, there were at least ten packages (not
including `ReverseAD`) for reverse-mode AD in Julia. Given this multitude, why
does MOI maintain another implementation instead of re-using existing tooling?

Here are four reasons:

* **Scale and sparsity:** the types of functions that MOI computes derivatives
  of have two key characteristics: they can be very large scale (10^5 or more
  functions across 10^5 or more variables) and they are very sparse. For large
  problems, it is common for the Hessian to have `O(n)` non-zero elements
  instead of the `O(n^2)` it would have if it were dense; see the sketch after
  this list. To the best of our knowledge, `ReverseAD` is the only reverse-mode
  AD system in Julia that handles sparsity by default. The lack of sparsity
  support is _the_ main reason why we do not use a generic package.
* **Limited scope:** most other AD packages accept arbitrary Julia functions as
  input and then trace an expression graph using operator overloading. This
  means they must deal with (or detect and ignore) control flow, I/O, and other
  vagaries of Julia. In contrast, `ReverseAD` only accepts functions in the
  form of [`Nonlinear.NonlinearExpression`](@ref), which greatly limits the
  range of syntax that it must deal with. By reducing the scope of what we
  accept as input to functions relevant for mathematical optimization, we can
  provide a simpler implementation with various performance optimizations.
* **Historical:** `ReverseAD` started life as [ReverseDiffSparse.jl](https://github.com/mlubin/ReverseDiffSparse.jl),
  development of which began in early 2014(!). This was well before the other
  packages started development. Because we had a well-tested, working AD in
  JuMP, there was less motivation to contribute to and explore other AD
  packages. The lack of historical interaction also meant that other packages
  were not optimized for the types of problems that JuMP is built for (i.e.,
  large-scale sparse problems). When we first created MathOptInterface, we kept
  the AD in JuMP to simplify the transition, and postponed the development of a
  first-class nonlinear interface in MathOptInterface.
* **Technical debt:** prior to the introduction of `Nonlinear`, JuMP's
  nonlinear implementation was a confusing mix of functions and types spread
  across the code base and in the private `_Derivatives` submodule. This made
  it hard to swap the AD system for another. The main motivation for
  refactoring JuMP to create the `Nonlinear` submodule in MathOptInterface was
  to abstract the interface between JuMP and the AD system, allowing us to swap
  in and test new AD systems in the future.
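
To make the sparsity point concrete, here is a small illustration (not part of
the MOI API): a separable function couples each variable only with itself, so
every mixed second derivative is zero and the Hessian is diagonal.
```julia
# Illustration only: f(x) = sum(x[i] * sin(x[i]) for i in 1:n) is separable,
# so d²f/dx[i]dx[j] == 0 whenever i != j, and the Hessian is diagonal.
n = 100_000

# A dense Hessian would store n^2 = 10^10 entries. Storing only the
# structural non-zeros as (row, column) pairs needs just n = 10^5 of them.
hessian_sparsity = [(i, i) for i in 1:n]

length(hessian_sparsity)  # 100_000, that is, O(n) instead of O(n^2)
```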

docs/src/submodules/Nonlinear/reference.md

Lines changed: 1 addition & 0 deletions
@@ -68,6 +68,7 @@ Nonlinear.eval_comparison_function
```@docs
Nonlinear.AbstractAutomaticDifferentiation
Nonlinear.ExprGraphOnly
Nonlinear.SparseReverseMode
Nonlinear.set_differentiation_backend
```

src/Nonlinear/Nonlinear.jl

Lines changed: 2 additions & 0 deletions
@@ -34,6 +34,8 @@ include("operators.jl")
include("types.jl")
include("parse.jl")

include("ReverseAD/ReverseAD.jl")

"""
    set_objective(data::NonlinearData, obj)::Nothing
