Commit 050fcdb

Improve content for automatic differentiation page
1 parent 564b74c commit 050fcdb

1 file changed: +48 -33 lines changed

_pages/automatic_differentiation.md

@@ -1,29 +1,32 @@
 ---
 title: "Compiler Research Research Areas"
 layout: gridlay
-excerpt: "Automatic differentiation (AD) is a powerful technique for evaluating the
-derivatives of mathematical functions in C++, offering significant advantages
-over traditional differentiation methods."
+excerpt: "Automatic Differentiation (AD) is a general and powerful technique
+for computing partial derivatives (or the complete gradient) of a function expressed as a
+computer program."
 sitemap: true
 permalink: /automatic_differentiation
 ---
 
 ## Automatic differentiation
 
-Automatic differentiation (AD) is a technique for evaluating the derivatives
-of mathematical functions in C++, offering a number of advantages over
-traditional differentiation methods. By leveraging the principles of Automatic
-Differentiation, programmers can efficiently calculate partial derivatives of
-functions, opening up a range of applications in scientific computing and
-machine learning.
+Automatic Differentiation (AD) is a general and powerful technique
+for computing partial derivatives (or the complete gradient) of a function expressed as a
+computer program.
+
+Automatic Differentiation takes advantage of the fact that any computation can
+be represented as a composition of simple operations/functions. This is
+generally represented in a graphical format and referred to as the [computation
+graph](https://colah.github.io/posts/2015-08-Backprop/). AD works by repeated
+application of the chain rule over this graph.
 
 ### Understanding Differentiation in Computing
 
-Differentiation in calculus is the process of finding the rate of change of
-one quantity with respect to another. There are several principles and
-formulas for differentiation, such as Sum Rule, Product Rule, Quotient Rule,
-Constant Rule, and Chain Rule. For Automatic Differentiation, the Chain Rule
-of differential calculus is of special interest.
+Efficient computation of gradients is a crucial requirement in the fields of
+scientific computing and machine learning, where approaches like
+[Gradient Descent](https://en.wikipedia.org/wiki/Gradient_descent)
+are used to iteratively converge on the optimum parameters of a mathematical
+model.
 
 Within the context of computing, there are various methods for
 differentiation:
@@ -34,22 +37,34 @@ differentiation:
 
 - **Numerical Differentiation**: This method approximates the derivatives
 using finite differences. It is relatively simple to implement, but can
-suffer from numerical instability and inaccuracy in its results.
+suffer from numerical instability and inaccuracy in its results. It doesn't
+scale well with the number of inputs of the function.
 
 - **Symbolic Differentiation**: This approach uses symbolic manipulation to
 compute derivatives analytically. It provides accurate results but can lead to
-lengthy expressions for large computations. It is limited to closed-form
-expressions; that is, it cannot process the control flow.
-
-- **Automatic Differentiation (AD)**: Automatic Differentiation is a highly
-efficient technique that computes derivatives of mathematical functions by
-applying differentiation rules to every arithmetic operation in the code.
-Automatic Differentiation can be used in two modes:
-
-- Forward Mode: calculates derivatives with respect to a single variable, and
-
-- Reverse Mode: calculates gradients with respect to all inputs
-simultaneously.
+lengthy expressions for large computations. It requires the computer program to
+be representable as a closed-form mathematical expression, and thus doesn't work
+well with control flow scenarios (if conditions and loops) in the program.
+
+- **Automatic Differentiation (AD)**: Automatic Differentiation is a general and
+efficient technique that works by repeated application of the chain rule over the
+computation graph of the program. Given its composable nature, it can easily scale
+to computing gradients over a very large number of inputs.
+
+### Forward and Reverse mode AD
+Automatic Differentiation works by applying the chain rule and merging the derivatives
+at each node of the computation graph. The direction of this graph traversal and
+derivative accumulation results in two modes of operation:
+
+- Forward Mode: starts at an input to the graph and moves towards all the output nodes.
+For every node, it sums all the paths feeding in; adding them up gives the total
+effect of the input on that node. Hence, it calculates derivatives of the output(s)
+with respect to a single input variable.
+
+- Reverse Mode: starts at the output node of the graph and moves backward towards all
+the input nodes. For every node, it merges all paths which originated at that node,
+tracking how every node affects one output. Hence, it calculates the derivative of a single
+output with respect to all inputs simultaneously - the gradient.
 
 ### Automatic Differentiation in C++
 
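To make the two traversal orders above concrete, here is a minimal hand-written forward-mode sketch using dual numbers. It is illustrative only: the `Dual` type, its operator set, and the example function are assumptions made for this note, not code from the page or from Clad. Each value carries its derivative, and every primitive operation applies the chain rule as the computation graph is evaluated.

```cpp
// Minimal forward-mode AD sketch (illustrative, not from this page).
// Each value carries its derivative; every operation applies the chain rule.
#include <cmath>
#include <cstdio>

struct Dual {
  double val;  // value of this node in the computation graph
  double dot;  // derivative of this node w.r.t. the seeded input
};

Dual operator+(Dual a, Dual b) { return {a.val + b.val, a.dot + b.dot}; }
Dual operator*(Dual a, Dual b) {
  return {a.val * b.val, a.dot * b.val + a.val * b.dot};  // product rule
}
Dual sin(Dual a) { return {std::sin(a.val), std::cos(a.val) * a.dot}; }  // chain rule

// f(x, y) = sin(x * y) + x, built from simple operations
Dual f(Dual x, Dual y) { return sin(x * y) + x; }

int main() {
  Dual x{2.0, 1.0};  // seed dot = 1 on the input we differentiate against
  Dual y{3.0, 0.0};
  Dual r = f(x, y);
  std::printf("f = %g, df/dx = %g\n", r.val, r.dot);  // df/dx = y*cos(x*y) + 1
  return 0;
}
```

Reverse mode would instead record the operations and propagate adjoints from the output back towards `x` and `y`, producing both partial derivatives in a single backward pass.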
@@ -61,10 +76,10 @@ compile time.
 
 [The source code transformation approach] enables optimization by retaining
 all the complex knowledge of the original source code. The compute graph is
-constructed before compilation and then transformed and compiled. It typically
-uses a custom parser to build code representation and produce the transformed
-code. It is difficult to implement (especially in C++), but it is very
-efficient, since many computations and optimizations are done ahead of time.
+constructed during compilation and then transformed to generate the derivative
+code. It typically uses a custom parser to build a code representation and produce
+the transformed code. It is difficult to implement (especially in C++), but it is
+very efficient, since many computations and optimizations are done ahead of time.
 
 ### Advantages of using Automatic Differentiation
 
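As a rough illustration of the transformation described above, the sketch below shows what a source-transformation tool might emit for a tiny function. The function `f`, the generated-looking name `f_dx`, and the statement layout are hypothetical, not actual tool output; the point is that the derivative is produced as a separate function that mirrors the original computation and can be optimized ahead of time.

```cpp
// Hand-written illustration of what a source-transformation AD tool might
// emit for f(x, y) = x * y + sin(x). The generated-looking names (f_dx,
// t0, t1) are hypothetical, not actual tool output.
#include <cmath>
#include <cstdio>

double f(double x, double y) { return x * y + std::sin(x); }

// "Generated" derivative of f with respect to x: every operation in f gets
// a matching derivative statement in the transformed code.
double f_dx(double x, double y) {
  double t0 = x * y;
  double dt0_dx = y;            // d(x * y)/dx
  double t1 = std::sin(x);
  double dt1_dx = std::cos(x);  // d(sin(x))/dx
  (void)t0; (void)t1;           // originals kept to mirror the source
  return dt0_dx + dt1_dx;       // df/dx = y + cos(x)
}

int main() {
  std::printf("f(2, 3) = %g, df/dx = %g\n", f(2, 3), f_dx(2, 3));
  return 0;
}
```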
@@ -76,7 +91,7 @@ efficient, since many computations and optimizations are done ahead of time.
 - It can take derivatives of algorithms involving conditionals, loops, and
 recursion.
 
-- It works without generating inefficiently long expressions.
+- It can be easily scaled to functions with a very large number of inputs.
 
 ### Automatic Differentiation Implementation with Clad - a Clang Plugin
 
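A short sketch of the control-flow point above (illustrative; the function and names are made up for this note): carrying a derivative through a loop only requires applying the product rule at each multiplication of the iteration, which is exactly what AD does and what closed-form symbolic differentiation cannot express directly.

```cpp
// Illustrative only: forward-mode derivative carried through a loop.
// pow_with_derivative and its names are made up for this sketch.
#include <cstdio>

// Computes f(x) = x^n with a loop, accumulating df/dx alongside it by
// applying the product rule at every multiplication.
double pow_with_derivative(double x, int n, double* df) {
  double f = 1.0;   // running value
  double d = 0.0;   // running derivative, d(1)/dx = 0
  for (int i = 0; i < n; ++i) {
    d = d * x + f;  // product rule: d(f * x)/dx = f' * x + f * 1
    f = f * x;
  }
  *df = d;
  return f;
}

int main() {
  double df = 0.0;
  double f = pow_with_derivative(3.0, 4, &df);
  std::printf("f = %g, df/dx = %g\n", f, df);  // 81 and 108 (= 4 * 3^3)
  return 0;
}
```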
@@ -88,7 +103,7 @@ It is implemented as a plugin for the Clang compiler.
 
 [Clad] operates on Clang AST (Abstract Syntax Tree) and is capable of
 performing C++ Source Code Transformation. When Clad is given the C++ source
-code of a mathematical function, it can automatically generate C++ code for
+code of a mathematical function, it can algorithmically generate C++ code for
 the computing derivatives of that function. Clad has comprehensive coverage of
 the latest C++ features and a well-rounded fallback and recovery system in
 place.

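A minimal usage sketch based on Clad's documented `clad::differentiate` and `clad::gradient` entry points. The example function and the expected values are ours, and the exact include path, plugin flag, and signatures can vary between Clad versions and setups.

```cpp
// Sketch of typical Clad usage (based on Clad's public documentation; the
// example function and printed values are ours, and the build invocation
// differs per setup, e.g. clang++ -fplugin=<path/to/clad.so> ...).
#include "clad/Differentiator/Differentiator.h"
#include <cstdio>

double f(double x, double y) { return x * x * y + y * y; }

int main() {
  // Forward mode: derivative of f with respect to x only.
  auto df_dx = clad::differentiate(f, "x");
  std::printf("df/dx(3, 4) = %g\n", df_dx.execute(3, 4));  // 2*x*y = 24

  // Reverse mode: the whole gradient in one pass.
  auto grad = clad::gradient(f);
  double dx = 0, dy = 0;
  grad.execute(3, 4, &dx, &dy);
  std::printf("grad f(3, 4) = (%g, %g)\n", dx, dy);         // (24, 17)
  return 0;
}
```

The derivative code is generated while Clang compiles the translation unit, so the `execute` calls run ordinary compiled C++ at runtime.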