@@ -208,7 +208,6 @@ before we can call [`initialize`](@ref), we need to set an
There are two to choose from within MOI, although other packages may add more
options by subtyping [`Nonlinear.AbstractAutomaticDifferentiation`](@ref) (see
the sketch after this list):
* [`Nonlinear.ExprGraphOnly`](@ref)
- * [`Nonlinear.SparseReverseMode`](@ref).
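
As a rough illustration of that extension point, a third-party package could
declare its own backend by subtyping the abstract type. This is a hypothetical
sketch: the name `MyForwardMode` is invented, and a real backend must also
implement the evaluation methods that MOI queries, which are not shown here.
```julia
import MathOptInterface
const MOI = MathOptInterface
const Nonlinear = MOI.Nonlinear

# Hypothetical sketch only: declare a new AD backend by subtyping
# `Nonlinear.AbstractAutomaticDifferentiation`. `MyForwardMode` is an
# invented name; the derivative-evaluation methods are not implemented.
struct MyForwardMode <: Nonlinear.AbstractAutomaticDifferentiation end
```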

If we set [`Nonlinear.ExprGraphOnly`](@ref), then we get access to `:ExprGraph`:
```jldoctest nonlinear_developer
@@ -230,52 +229,6 @@ However, we cannot call gradient terms such as
[`eval_objective_gradient`](@ref) because [`Nonlinear.ExprGraphOnly`](@ref) does
not know how to differentiate a nonlinear expression.
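
As a hedged sketch of that limitation (not a doctest; the exact error type and
message depend on the implementation), requesting a derivative feature from
this backend at initialization time is expected to fail. Here `data` is the
object from the running example above:
```julia
# Sketch only: `Nonlinear.ExprGraphOnly` exposes just `:ExprGraph`, so
# asking `initialize` for a derivative feature such as `:Grad` should
# raise an error rather than silently succeed.
MOI.initialize(data, [:Grad])  # expected to error: :Grad is unavailable
```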

- If, instead, we set [`Nonlinear.SparseReverseMode`](@ref), then we get access to
- `:Grad`, the gradient of the objective function, `:Jac`, the Jacobian matrix of
- the constraints, `:JacVec`, the ability to compute Jacobian-vector products, and
- `:ExprGraph`.
- ```jldoctest nonlinear_developer
- julia> Nonlinear.set_differentiation_backend(
-            data,
-            Nonlinear.SparseReverseMode(),
-            [x],
-        )
-
- julia> data
- NonlinearData with available features:
-   * :Grad
-   * :Jac
-   * :JacVec
-   * :ExprGraph
- ```
-
- However, before calling anything, we need to call [`initialize`](@ref):
- ```jldoctest nonlinear_developer
- julia> MOI.initialize(data, [:Grad, :Jac, :JacVec, :ExprGraph])
- ```
-
- Now we can call methods like [`eval_objective`](@ref):
- ```jldoctest nonlinear_developer
- julia> x = [1.0]
- 1-element Vector{Float64}:
-  1.0
-
- julia> MOI.eval_objective(data, x)
- 7.268073418273571
- ```
- and [`eval_objective_gradient`](@ref):
- ```jldoctest nonlinear_developer
- julia> grad = [NaN]
- 1-element Vector{Float64}:
-  NaN
-
- julia> MOI.eval_objective_gradient(data, grad, x)
-
- julia> grad
- 1-element Vector{Float64}:
-  1.909297426825682
- ```
-
## Expression-graph representation

[`Nonlinear.NonlinearData`](@ref) stores nonlinear expressions in
@@ -435,61 +388,3 @@ user-defined functions using [`Nonlinear.register_operator`](@ref).
[`Nonlinear.NonlinearData`](@ref) is a struct that stores the
[`Nonlinear.OperatorRegistry`](@ref), as well as a list of parameters and
subexpressions in the model.
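
To make that description concrete, here is a rough conceptual sketch of the
struct. The field names and types are assumptions for illustration, not the
definitive definition in the source; it assumes the `Nonlinear` alias used
throughout this page:
```julia
# Conceptual sketch only: the real definition lives in the Nonlinear
# submodule and differs in detail. Field names here are assumed.
struct NonlinearData
    operators::Nonlinear.OperatorRegistry                # registered operators
    parameters::Vector{Float64}                          # parameter values
    expressions::Vector{Nonlinear.NonlinearExpression}   # subexpressions
    # ... plus the objective, the constraints, and the AD backend state.
end
```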
-
- ## ReverseAD
-
- `Nonlinear.ReverseAD` is a submodule for computing derivatives of the problem
- inside [`Nonlinear.NonlinearData`](@ref) using sparse reverse-mode automatic
- differentiation (AD).
-
- This section does not attempt to explain how sparse reverse-mode AD works, but
- instead explains why MOI contains its own implementation, and highlights
- notable differences from similar packages.
-
- !!! warning
-     You should not interact with `ReverseAD` directly. Instead, you should
-     create a [`Nonlinear.NonlinearData`](@ref) object, call
-     [`Nonlinear.set_differentiation_backend`](@ref) with
-     [`Nonlinear.SparseReverseMode`](@ref), and then query the MOI API methods.
-
- ### Why another AD package?
-
- The JuliaDiff organization maintains a [list of packages](https://juliadiff.org)
- for doing AD in Julia. At last count, there were at least ten packages (not
- including `ReverseAD`) for reverse-mode AD in Julia. Given this multitude, why
- does MOI maintain another implementation instead of re-using existing tooling?
-
- Here are four reasons:
-
- * **Scale and Sparsity:** the types of functions that MOI computes derivatives
-   of have two key characteristics: they can be very large scale (10^5 or more
-   functions across 10^5 or more variables) and they are very sparse. For large
-   problems, it is common for the Hessian to have `O(n)` non-zero elements,
-   rather than the `O(n^2)` it would have if it were dense. To the best of our
-   knowledge, `ReverseAD` is the only reverse-mode AD system in Julia that
-   handles sparsity by default. The lack of sparsity support is _the_ main
-   reason why we do not use a generic package.
- * **Limited scope:** most other AD packages accept arbitrary Julia functions as
-   input and then trace an expression graph using operator overloading. This
-   means they must deal with (or detect and ignore) control flow, I/O, and other
-   vagaries of Julia. In contrast, `ReverseAD` only accepts functions in the
-   form of [`Nonlinear.NonlinearExpression`](@ref), which greatly limits the
-   range of syntax that it must deal with. By reducing the scope of what we
-   accept as input to functions relevant for mathematical optimization, we can
-   provide a simpler implementation with various performance optimizations.
- * **Historical:** `ReverseAD` started life as [ReverseDiffSparse.jl](https://github.com/mlubin/ReverseDiffSparse.jl),
-   development of which began in early 2014(!). This was well before the other
-   packages started development. Because we had a well-tested, working AD in
-   JuMP, there was less motivation to contribute to and explore other AD
-   packages. The lack of historical interaction also meant that other packages
-   were not optimized for the types of problems that JuMP is built for (i.e.,
-   large-scale sparse problems). When we first created MathOptInterface, we kept
-   the AD in JuMP to simplify the transition, and postponed the development of
-   a first-class nonlinear interface in MathOptInterface.
- * **Technical debt:** prior to the introduction of `Nonlinear`, JuMP's nonlinear
-   implementation was a confusing mix of functions and types spread across the
-   code base and in the private `_Derivatives` submodule. This made it hard to
-   swap the AD system for another. The main motivation for refactoring JuMP to
-   create the `Nonlinear` submodule in MathOptInterface was to abstract the
-   interface between JuMP and the AD system, allowing us to swap in and test new
-   AD systems in the future.