@@ -285,6 +285,7 @@ The following backends are available to choose from within MOI, although other
packages may add more options by sub-typing
[`Nonlinear.AbstractAutomaticDifferentiation`](@ref):
* [`Nonlinear.ExprGraphOnly`](@ref)
+ * [`Nonlinear.SparseReverseMode`](@ref).

```jldoctest nonlinear_developer
julia> evaluator = Nonlinear.Evaluator(model, Nonlinear.ExprGraphOnly(), [x])
@@ -317,6 +318,52 @@ julia> MOI.set(model, MOI.NLPBlock(), block);
Only call [`NLPBlockData`](@ref) once you have finished modifying the
problem in `model`.

+ If, instead, we set [`Nonlinear.SparseReverseMode`](@ref), then we get access to
+ `:Grad`, the gradient of the objective function, `:Jac`, the Jacobian matrix of
+ the constraints, `:JacVec`, the ability to compute Jacobian-vector products, and
+ `:ExprGraph`.
+ ```jldoctest nonlinear_developer
+ julia> Nonlinear.set_differentiation_backend(
+            data,
+            Nonlinear.SparseReverseMode(),
+            [x],
+        )
+
+ julia> data
+ NonlinearData with available features:
+   * :Grad
+   * :Jac
+   * :JacVec
+   * :ExprGraph
+ ```
+
+ However, before calling anything, we need to call [`initialize`](@ref):
+ ```jldoctest nonlinear_developer
+ julia> MOI.initialize(data, [:Grad, :Jac, :JacVec, :ExprGraph])
+ ```
+
+ Now we can call methods like [`eval_objective`](@ref):
+ ```jldoctest nonlinear_developer
+ julia> x = [1.0]
+ 1-element Vector{Float64}:
+  1.0
+
+ julia> MOI.eval_objective(data, x)
+ 7.268073418273571
+ ```
+ and [`eval_objective_gradient`](@ref):
+ ```jldoctest nonlinear_developer
+ julia> grad = [NaN]
+ 1-element Vector{Float64}:
+  NaN
+
+ julia> MOI.eval_objective_gradient(data, grad, x)
+
+ julia> grad
+ 1-element Vector{Float64}:
+  1.909297426825682
+ ```
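+
+ The `:Jac` and `:JacVec` features are queried in the same way. The following is
+ a minimal sketch rather than a doctest, because the exact sparsity structure and
+ values depend on the constraints in `model`; `num_constraints` is a hypothetical
+ placeholder for the number of constraints:
+ ```julia
+ # The Jacobian is returned in sparse triplet form: `jacobian_structure` gives
+ # the (row, column) index of each structural non-zero, and
+ # `eval_constraint_jacobian` fills in the corresponding values:
+ J_structure = MOI.jacobian_structure(data)
+ J = zeros(length(J_structure))
+ MOI.eval_constraint_jacobian(data, J, x)
+ # :JacVec computes Jacobian-vector products without materializing the matrix:
+ w = ones(length(x))         # a vector in the variable space
+ y = zeros(num_constraints)  # output: the product J * w
+ MOI.eval_constraint_jacobian_product(data, y, x, w)
+ ```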
+
## Expression-graph representation

[`Nonlinear.Model`](@ref) stores nonlinear expressions in
@@ -477,3 +524,61 @@ user-defined functions using [`Nonlinear.register_operator`](@ref).
[`Nonlinear.Model`](@ref) is a struct that stores the
[`Nonlinear.OperatorRegistry`](@ref), as well as a list of parameters and
subexpressions in the model.
+
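+ As a rough sketch of how these pieces fit together (the operator name
+ `my_square`, the parameter value, and the expression are illustrative
+ assumptions, not fixed parts of the API):
+ ```julia
+ model = Nonlinear.Model()
+ x = MOI.VariableIndex(1)
+ # The OperatorRegistry records user-defined operators:
+ Nonlinear.register_operator(model, :my_square, 1, x -> x^2)
+ # Parameters and subexpressions are stored alongside the registry:
+ p = Nonlinear.add_parameter(model, 2.0)
+ expr = Nonlinear.add_expression(model, :(my_square(sin($x)) + $p))
+ ```
+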
+ ## ReverseAD
+
+ `Nonlinear.ReverseAD` is a submodule for computing derivatives of the problem
+ inside [`Nonlinear.NonlinearData`](@ref) using sparse reverse-mode automatic
+ differentiation (AD).
+
+ This section does not attempt to explain how sparse reverse-mode AD works, but
+ instead explains why MOI contains its own implementation, and highlights
+ notable differences from similar packages.
+
+ !!! warning
+     You should not interact with `ReverseAD` directly. Instead, you should
+     create a [`Nonlinear.NonlinearData`](@ref) object, call
+     [`Nonlinear.set_differentiation_backend`](@ref) with
+     [`Nonlinear.SparseReverseMode`](@ref), and then query the MOI API methods.
+
+ ### Why another AD package?
+
+ The JuliaDiff organization maintains a [list of packages](https://juliadiff.org)
+ for doing AD in Julia. At last count, there were at least ten packages (not
+ including `ReverseAD`) for reverse-mode AD in Julia. Given this multitude, why
+ does MOI maintain another implementation instead of re-using existing tooling?
+
+ Here are four reasons:
+
+ * **Scale and Sparsity:** the types of functions that MOI computes derivatives
+   of have two key characteristics: they can be very large scale (10^5 or more
+   functions across 10^5 or more variables) and they are very sparse. For large
+   problems, it is common for the Hessian to have `O(n)` non-zero elements
+   instead of `O(n^2)` if it were dense (see the sketch after this list). To the
+   best of our knowledge, `ReverseAD` is the only reverse-mode AD system in
+   Julia that handles sparsity by default. The lack of sparsity support is _the_
+   main reason why we do not use a generic package.
+ * **Limited scope:** most other AD packages accept arbitrary Julia functions as
+   input and then trace an expression graph using operator overloading. This
+   means they must deal with (or detect and ignore) control flow, I/O, and other
+   vagaries of Julia. In contrast, `ReverseAD` only accepts functions in the
+   form of [`Nonlinear.NonlinearExpression`](@ref), which greatly limits the
+   range of syntax that it must deal with. By reducing the scope of what we
+   accept as input to functions relevant for mathematical optimization, we can
+   provide a simpler implementation with various performance optimizations.
+ * **Historical:** `ReverseAD` started life as [ReverseDiffSparse.jl](https://github.com/mlubin/ReverseDiffSparse.jl),
+   development of which began in early 2014(!). This was well before the other
+   packages started development. Because we had a well-tested, working AD in
+   JuMP, there was less motivation to contribute to and explore other AD
+   packages. The lack of historical interaction also meant that other packages
+   were not optimized for the types of problems that JuMP is built for (i.e.,
+   large-scale sparse problems). When we first created MathOptInterface, we kept
+   the AD in JuMP to simplify the transition, and postponed the development of
+   a first-class nonlinear interface in MathOptInterface.
+ * **Technical debt:** prior to the introduction of `Nonlinear`, JuMP's nonlinear
+   implementation was a confusing mix of functions and types spread across the
+   code base and in the private `_Derivatives` submodule. This made it hard to
+   swap the AD system for another. The main motivation for refactoring JuMP to
+   create the `Nonlinear` submodule in MathOptInterface was to abstract the
+   interface between JuMP and the AD system, allowing us to swap in and test new
+   AD systems in the future.
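+
+ To make the sparsity point concrete, the following sketch shows how sparsity
+ surfaces in the evaluator API. It assumes the `:Hess` feature is requested at
+ initialization and that `data` and `x` are the objects from the examples
+ above; `num_constraints` is a hypothetical placeholder:
+ ```julia
+ # Request the :Hess feature when initializing the evaluator:
+ MOI.initialize(data, [:Grad, :Jac, :Hess])
+ # The structure is a vector of (row, column) indices for the structural
+ # non-zeros of the Hessian of the Lagrangian. For sparse problems its length
+ # is O(n), not the O(n^2) of a dense matrix:
+ H_structure = MOI.hessian_lagrangian_structure(data)
+ H = zeros(length(H_structure))  # storage for the non-zero values only
+ σ = 1.0                         # weight on the objective Hessian
+ μ = zeros(num_constraints)      # one multiplier per constraint
+ MOI.eval_hessian_lagrangian(data, H, x, σ, μ)
+ ```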