Commit c3a5b71

Use ICD(d) instead of f(d) for inv cumulative dist
We use f(x) throughout the book to mean many different things. In book 3, section 3, we use f(x) to mean the inverse cumulative distribution of x. In section 4, however, we switch to f(theta, phi) to mean a function that we integrate over the surface of the unit sphere. That f(theta, phi) appears in sphere_importance.cc in the same place where f(d) appeared in integrate_x_sq.cc, so the same name ends up denoting two very different functions. Dimitry Ishenko suggested changing to a more explicit name for the inverse cumulative distribution function, which makes things much clearer.

Resolves #1537
1 parent d653f2a

2 files changed (+33 / -31 lines)
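To make the name collision concrete, here is an illustrative C++ sketch with hypothetical signatures (not the book's actual declarations) of the two unrelated meanings that previously shared the name `f`, and the rename this commit applies to the first of them:

```cpp
// Illustrative only: hypothetical signatures, not the book's actual code.

// Book 3, section 3 (integrate_x_sq.cc): "f" was the inverse cumulative
// distribution of a 1D sample. This commit renames it to icd().
double icd(double d);

// Book 3, section 4 (sphere_importance.cc): "f" is the function integrated
// over the surface of the unit sphere, and keeps its name.
double f(double theta, double phi);
```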

books/RayTracingTheRestOfYourLife.html

Lines changed: 31 additions & 29 deletions
@@ -1130,13 +1130,13 @@
 
 $$ r = \sqrt{4y} $$
 
-Which means the inverse of our CDF is defined as
+Which means the inverse of our CDF (which we'll call $ICD(x)$) is defined as
 
-$$ P^{-1}(r) = \sqrt{4y} $$
+$$ P^{-1}(r) = \operatorname{ICD}(r) = \sqrt{4y} $$
 
 Thus our random number generator with density $p(r)$ can be created with:
 
-$$ f(d) = \sqrt{4 \cdot \operatorname{random\_double}()} $$
+$$ \operatorname{ICD}(d) = \sqrt{4 \cdot \operatorname{random\_double}()} $$
 
 Note that this ranges from 0 to 2 as we hoped, and if we check our work, we replace
 `random_double()` with $1/4$ to get 1, and also replace with $1/2$ to get $\sqrt{2}$, just as
@@ -1155,7 +1155,8 @@
 The last time that we tried to solve for the integral we used a Monte Carlo approach, uniformly
 sampling from the interval $[0, 2]$. We didn't know it at the time, but we were implicitly using a
 uniform PDF between 0 and 2. This means that we're using a PDF = $1/2$ over the range $[0,2]$, which
-means the CDF is $P(x) = x/2$, so $f(d) = 2d$. Knowing this, we can make this uniform PDF explicit:
+means the CDF is $P(x) = x/2$, so $\operatorname{ICD}(d) = 2d$. Knowing this, we can make this
+uniform PDF explicit:
 
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++
 #include "rtweekend.h"
@@ -1165,7 +1166,7 @@
 
 
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ highlight
-double f(double d) {
+double icd(double d) {
     return 2.0 * d;
 }
 
@@ -1184,7 +1185,7 @@
 
     for (int i = 0; i < N; i++) {
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ highlight
-        auto x = f(random_double());
+        auto x = icd(random_double());
         sum += x*x / pdf(x);
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++
     }
@@ -1199,29 +1200,28 @@
 </div>
 
 There are a couple of important things to emphasize. Every value of $x$ represents one sample of the
-function $x^2$ within the distribution $[0, 2]$. We use a function $f$ to randomly select samples
-from within this distribution. We were previously multiplying the average over the interval
-(`sum / N`) times the length of the interval (`b - a`) to arrive at the final answer. Here, we
-don't need to multiply by the interval length--that is, we no longer need to multiply the average
+function $x^2$ within the distribution $[0, 2]$. We use a function $\operatorname{ICD}$ to randomly
+select samples from within this distribution. We were previously multiplying the average over the
+interval (`sum / N`) times the length of the interval (`b - a`) to arrive at the final answer. Here,
+we don't need to multiply by the interval length--that is, we no longer need to multiply the average
 by $2$.
 
 We need to account for the nonuniformity of the PDF of $x$. Failing to account for this
 nonuniformity will introduce bias in our scene. Indeed, this bias is the source of our inaccurately
-bright image--if we account for nonuniformity, we will get accurate results. The PDF will "steer"
+bright image. Accounting for the nonuniformity will yield accurate results. The PDF will "steer"
 samples toward specific parts of the distribution, which will cause us to converge faster, but at
 the cost of introducing bias. To remove this bias, we need to down-weight where we sample more
 frequently, and to up-weight where we sample less frequently. For our new nonuniform random number
-generator, the PDF defines how much or how little we sample a specific portion.
-So the weighting function should be proportional to $1/\mathit{pdf}$.
-In fact it is _exactly_ $1/\mathit{pdf}$.
-This is why we divide `x*x` by `pdf(x)`.
+generator, the PDF defines how much or how little we sample a specific portion. So the weighting
+function should be proportional to $1/\mathit{pdf}$. In fact it is _exactly_ $1/\mathit{pdf}$. This
+is why we divide `x*x` by `pdf(x)`.
 
-We can try to solve for the integral using the linear PDF $p(r) = \frac{r}{2}$, for which we were
-able to solve for the CDF and its inverse. To do that, all we need to do is replace the functions
-$f = \sqrt{4d}$ and $pdf = x/2$.
+We can try to solve for the integral using the linear PDF, $p(r) = \frac{r}{2}$, for which we were
+able to solve for the CDF and its inverse, ICD. To do that, all we need to do is replace the
+functions $\operatorname{ICD}(d) = \sqrt{4d}$ and $p(x) = x/2$.
 
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++
-double f(double d) {
+double icd(double d) {
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ highlight
     return std::sqrt(4.0 * d);
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++
@@ -1243,7 +1243,7 @@
         if (z == 0.0) // Ignore zero to avoid NaNs
             continue;
 
-        auto x = f(z);
+        auto x = icd(z);
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++
         sum += x*x / pdf(x);
     }
@@ -1290,13 +1290,13 @@
 
 and
 
-$$ P^{-1}(x) = f(d) = 8d^\frac{1}{3} $$
+$$ P^{-1}(x) = \operatorname{ICD}(d) = 8d^\frac{1}{3} $$
 
 <div class='together'>
 For just one sample we get:
 
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++
-double f(double d) {
+double icd(double d) {
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ highlight
     return 8.0 * std::pow(d, 1.0/3.0);
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++
@@ -1319,7 +1319,7 @@
         if (z == 0.0) // Ignore zero to avoid NaNs
             continue;
 
-        auto x = f(z);
+        auto x = icd(z);
         sum += x*x / pdf(x);
     }
     std::cout << std::fixed << std::setprecision(12);
@@ -1342,17 +1342,19 @@
 nonuniform PDF is usually called _importance sampling_.
 
 In all of the examples given, we always converged to the correct answer of $8/3$. We got the same
-answer when we used both a uniform PDF and the "correct" PDF ($i.e. f(d)=8d^{\frac{1}{3}}$). While
-they both converged to the same answer, the uniform PDF took much longer. After all, we only needed
-a single sample from the PDF that perfectly matched the integral. This should make sense, as we were
-choosing to sample the important parts of the distribution more often, whereas the uniform PDF just
-sampled the whole distribution equally, without taking importance into account.
+answer when we used both a uniform PDF and the "correct" PDF (that is, $\operatorname{ICD}(d) =
+8d^{\frac{1}{3}}$). While they both converged to the same answer, the uniform PDF took much longer.
+After all, we only needed a single sample from the PDF that perfectly matched the integral. This
+should make sense, as we were choosing to sample the important parts of the distribution more often,
+whereas the uniform PDF just sampled the whole distribution equally, without taking importance into
+account.
 
 Indeed, this is the case for any PDF that you create--they will all converge eventually. This is
 just another part of the power of the Monte Carlo algorithm. Even the naive PDF where we solved for
 the 50% value and split the distribution into two halves: $[0, \sqrt{2}]$ and $[\sqrt{2}, 2]$. That
 PDF will converge. Hopefully you should have an intuition as to why that PDF will converge faster
-than a pure uniform PDF, but slower than the linear PDF ($i.e. f(d) = \sqrt{4d}$).
+than a pure uniform PDF, but slower than the linear PDF (that is, $\operatorname{ICD}(d) =
+\sqrt{4d}$).
 
 The perfect importance sampling is only possible when we already know the answer (we got $P$ by
 integrating $p$ analytically), but it’s a good exercise to make sure our code works.
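As a quick end-to-end check of the renamed pieces for the linear PDF discussed above, here is a minimal self-contained sketch. It assumes a `std::mt19937`-based random number source standing in for the book's `random_double()` from `rtweekend.h`; `icd`, `pdf`, and the sampling loop follow the diff.

```cpp
#include <cmath>
#include <iomanip>
#include <iostream>
#include <random>

// Inverse cumulative distribution for the linear PDF p(x) = x/2 on [0, 2].
double icd(double d) {
    return std::sqrt(4.0 * d);
}

double pdf(double x) {
    return x / 2.0;
}

int main() {
    // Stand-in for the book's random_double() from rtweekend.h (assumption).
    std::mt19937 rng(12345);
    std::uniform_real_distribution<double> random_double(0.0, 1.0);

    int N = 1000000;
    auto sum = 0.0;

    for (int i = 0; i < N; i++) {
        auto z = random_double(rng);
        if (z == 0.0) // Ignore zero to avoid NaNs
            continue;

        auto x = icd(z);
        sum += x*x / pdf(x);
    }

    std::cout << std::fixed << std::setprecision(12);
    std::cout << "I = " << sum / N << '\n'; // should converge toward 8/3
}
```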

src/TheRestOfYourLife/integrate_x_sq.cc

Lines changed: 2 additions & 2 deletions
@@ -15,7 +15,7 @@
 #include <iomanip>
 
 
-double f(double d) {
+double icd(double d) {
     return 8.0 * std::pow(d, 1.0/3.0);
 }
 
@@ -34,7 +34,7 @@ int main() {
         if (z == 0.0) // Ignore zero to avoid NaNs
             continue;
 
-        auto x = f(z);
+        auto x = icd(z);
         sum += x*x / pdf(x);
     }
 
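The `pdf()` helper paired with this `icd()` is not part of the diff. As a sketch only, assuming the "perfect" density $p(x) = 3x^2/8$ (that is, $x^2$ normalized over $[0, 2]$), the weight `x*x / pdf(x)` is the constant $8/3$ for every sample, which is why the book's text notes that a single sample from the perfectly matched PDF already gives the answer.

```cpp
// Assumed companion helper, not shown in this diff. With p(x) = 3x^2/8,
// x*x / pdf(x) evaluates to 8/3 regardless of x, so even N = 1 is exact.
double pdf(double x) {
    return 3.0 * x*x / 8.0;
}
```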