
MAINT: Fix build failures and execution timeouts #68


Merged
merged 13 commits into from Mar 17, 2021
5 changes: 5 additions & 0 deletions .circleci/config.yml
@@ -11,6 +11,10 @@ jobs:
    steps:
      - checkout

      - run:
          name: Install deps for building atari-py
          command: sudo apt-get install -y cmake ffmpeg

      - run:
          name: Install Python dependencies
          command: |
@@ -25,6 +29,7 @@ jobs:

      - run:
          name: Build site
          no_output_timeout: 30m
          command: |
            source venv/bin/activate
            # n = nitpicky (broken links), W = warnings as errors,
2 changes: 1 addition & 1 deletion content/pairing.md
@@ -59,7 +59,7 @@ output.
> supports a variety of reStructuredText directives. These Sphinx
> markdown directives will render when NumPy tutorials are built into a
> static website, but they will show up as raw code when you open them in
> Jupyter locally or on [Binder](mybinder.org).
> Jupyter locally or on [Binder](https://mybinder.org).

Consider these two versions of the same __Simple notebook example__. You
have three things in the notebooks:
65 changes: 39 additions & 26 deletions content/tutorial-deep-learning-on-mnist.md
@@ -201,12 +201,18 @@ print('The data type of training images: {}'.format(x_train.dtype))
print('The data type of test images: {}'.format(x_test.dtype))
```

**2.** Normalize the arrays by dividing them by 255 (and thus promoting the data type from `uint8` to `float64`) and then assign the train and test image data variables — `x_train` and `x_test` — to `training_images` and `test_images`, respectively. To make the neural network model train faster in this example, `training_images` contains only 1,000 samples out of 60,000. To learn from the entire sample size, change the `sample` variable to `60000`.
**2.** Normalize the arrays by dividing them by 255 (and thus promoting the data type from `uint8` to `float64`) and then assign the train and test image data variables — `x_train` and `x_test` — to `training_images` and `test_images`, respectively.
To reduce the model training and evaluation time in this example, only a subset
of the training and test images will be used.
Both `training_images` and `test_images` will contain only 1,000 samples each out
of the complete datasets of 60,000 and 10,000 images, respectively.
These values can be controlled by changing the `training_sample` and
`test_sample` variables below, up to their maximum values of 60,000 and 10,000.

```{code-cell} ipython3
sample = 1000
training_images = x_train[0:sample] / 255
test_images = x_test / 255
training_sample, test_sample = 1000, 1000
training_images = x_train[0:training_sample] / 255
test_images = x_test[0:test_sample] / 255
```

**3.** Confirm that the image data has changed to the floating-point format:
@@ -257,8 +263,8 @@ def one_hot_encoding(labels, dimension=10):
**3.** Encode the labels and assign the values to new variables:

```{code-cell} ipython3
training_labels = one_hot_encoding(y_train)
test_labels = one_hot_encoding(y_test)
training_labels = one_hot_encoding(y_train[:training_sample])
test_labels = one_hot_encoding(y_test[:test_sample])
```
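Because both the images and their labels are now subsampled, it is worth confirming that the two stay aligned. A minimal illustrative check (not part of this diff), reusing the variables defined in the cells above:

```python
# Each one-hot label row must correspond to exactly one subsampled image.
assert len(training_labels) == len(training_images) == training_sample
assert len(test_labels) == len(test_images) == test_sample
```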

**4.** Check that the data type has changed to floating point:
@@ -405,6 +411,8 @@ weights_2 = 0.2 * np.random.random((hidden_size, num_labels)) - 0.1
```

**5.** Set up the neural network's learning experiment with a training loop and start the training process.
Note that the model is evaluated against the test set at each epoch to track
its performance as training progresses.

Start the training process:

@@ -419,6 +427,11 @@ store_test_accurate_pred = []
# This is a training loop.
# Run the learning experiment for a defined number of epochs (iterations).
for j in range(epochs):

    #################
    # Training step #
    #################

    # Set the initial loss/error and the number of accurate predictions to zero.
    training_loss = 0.0
    training_accurate_predictions = 0
@@ -467,32 +480,32 @@ for j in range(epochs):
    store_training_loss.append(training_loss)
    store_training_accurate_pred.append(training_accurate_predictions)

    # Evaluate on the test set:
    # 1. Set the initial error and the number of accurate predictions to zero.
    test_loss = 0.0
    test_accurate_predictions = 0

    # 2. Start testing the model by evaluating on the test image dataset.
    for i in range(len(test_images)):
        # 1. Pass the test images through the input layer.
        layer_0 = test_images[i]
        # 2. Compute the weighted sum of the test image inputs in and
        # pass the hidden layer's output through ReLU.
        layer_1 = relu(np.dot(layer_0, weights_1))
        # 3. Compute the weighted sum of the hidden layer's inputs.
        # Produce a 10-dimensional vector with 10 scores.
        layer_2 = np.dot(layer_1, weights_2)
    ###################
    # Evaluation step #
    ###################

    # Evaluate model performance on the test set at each epoch.

    # Unlike the training step, the weights are not modified for each image
    # (or batch). Therefore the model can be applied to the test images in a
    # vectorized manner, eliminating the need to loop over each image
    # individually:

    results = relu(test_images @ weights_1) @ weights_2

    # Measure the error between the actual label (truth) and prediction values.
    test_loss = np.sum((test_labels - results)**2)

        # 4. Measure the error between the actual label (truth) and prediction values.
        test_loss += np.sum((test_labels[i] - layer_2) ** 2)
        # 5. Increment the accurate prediction count.
        test_accurate_predictions += int(np.argmax(layer_2) == np.argmax(test_labels[i]))
    # Measure prediction accuracy on test set
    test_accurate_predictions = np.sum(
        np.argmax(results, axis=1) == np.argmax(test_labels, axis=1)
    )

    # Store test set losses and accurate predictions.
    store_test_loss.append(test_loss)
    store_test_accurate_pred.append(test_accurate_predictions)

    # 3. Display the error and accuracy metrics in the output.
    # Summarize error and accuracy metrics at each epoch
    print("\n" + \
          "Epoch: " + str(j) + \
          " Training set error:" + str(training_loss/ float(len(training_images)))[0:5] +\
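The core of this change replaces the per-image evaluation loop with a single vectorized pass. Below is a self-contained sketch with toy shapes, assuming an elementwise ReLU like the tutorial's (all names and sizes here are illustrative, not the tutorial's real data), showing that the two approaches produce the same scores:

```python
import numpy as np

rng = np.random.default_rng(0)
test_images = rng.random((5, 784))          # 5 fake flattened 28x28 images
weights_1 = rng.random((784, 100)) - 0.5    # input -> hidden
weights_2 = rng.random((100, 10)) - 0.5     # hidden -> output

def relu(x):
    return (x >= 0) * x

# Old approach: score each test image in a Python loop.
looped = np.array([np.dot(relu(np.dot(image, weights_1)), weights_2)
                   for image in test_images])

# New approach: one vectorized pass over the whole test set.
vectorized = relu(test_images @ weights_1) @ weights_2

assert np.allclose(looped, vectorized)
```

Because multiplying a stacked batch by a matrix is row-wise identical to multiplying each row separately, vectorizing changes only the speed, not the results.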
content/tutorial-deep-reinforcement-learning-with-pong-from-pixels.md
@@ -22,7 +22,7 @@ This tutorial demonstrates how to implement a deep reinforcement learning (RL) a

Pong is a 2D game from 1972 in which two players use "rackets" to play a form of table tennis. Each player moves their racket up and down the screen and tries to hit the ball toward their opponent. The goal is to hit the ball past the opponent's racket so that they miss it. According to the rules, the first player to reach 21 points wins. In Pong, the RL agent that learns to play against an opponent is displayed on the right.

<center><img src="../../../content/tutorial-deep-reinforcement-learning-with-pong-from-pixels.png" width="800", hspace="20" vspace="20"></center>
![pong_rl](tutorial-deep-reinforcement-learning-with-pong-from-pixels.png)

This example is based on the [code](https://gist.github.com/karpathy/a4166c7fe253700972fcbc77e4ea32c5) developed by [Andrej Karpathy](https://karpathy.ai) for the [Deep RL Bootcamp](https://sites.google.com/view/deep-rl-bootcamp/home) in 2017 at UC Berkeley. His [blog post](http://karpathy.github.io/2016/05/31/rl/) from 2016 also provides more background on the mechanics and theory used in Pong RL.

@@ -51,7 +51,7 @@ This tutorial can also be run locally in an isolated environment, such as [Virtu
3. Create the policy (the neural network) and the forward pass
4. Set up the update step (backpropagation)
5. Define the discounted rewards (expected return) function
6. Train the agent for 100 episodes
6. Train the agent for 3 episodes
7. Next steps
8. Appendix
- Notes on RL and deep RL
@@ -480,18 +480,18 @@ The pseudocode for the policy gradient method for Pong:

- Maximize the probability of actions that lead to high rewards.

<center><img src="../../../content/tutorial-deep-reinforcement-learning-with-pong-from-pixels.png" width="800", hspace="20" vspace="20"></center>
![pong_rl](tutorial-deep-reinforcement-learning-with-pong-from-pixels.png)
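To make the pseudocode concrete, here is a toy, self-contained sketch of the reward-weighted log-probability update it describes; every name, shape, and value below is illustrative rather than taken from the tutorial's code:

```python
import numpy as np

rng = np.random.default_rng(0)

# One illustrative episode of 5 timesteps for a Bernoulli ("move up?") policy.
action_probs = rng.random(5)                             # P(up) per timestep
actions = (rng.random(5) < action_probs).astype(float)   # actions sampled
rewards = np.array([0.0, 0.0, 0.0, 0.0, 1.0])            # reward only at the end

def discount_rewards(r, gamma=0.99):
    """Discounted returns: later rewards are credited back to earlier steps."""
    discounted = np.zeros_like(r)
    running = 0.0
    for t in reversed(range(len(r))):
        running = running * gamma + r[t]
        discounted[t] = running
    return discounted

advantage = discount_rewards(rewards)
advantage = (advantage - advantage.mean()) / (advantage.std() + 1e-8)

# Reward-weighted gradient of the log-probability of the sampled actions:
# actions followed by higher (normalized) returns get a larger nudge.
grad_logp = (actions - action_probs) * advantage
```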

You can stop the training at any time and/or check saved MP4 videos of the plays in the `/video` directory on your disk. You can set a maximum number of episodes that is appropriate for your setup.

+++ {"id": "gD6XBqUqfNOV"}

1. For demo purposes, let's limit the number of episodes for training to 10. If you are using hardware acceleration (CPUs and GPUs), you can increase the number to 1,000 or beyond. For comparison, Andrej Karpathy's original experiment took about 8,000 episodes.
1. For demo purposes, let's limit the number of episodes for training to 3. If you are using hardware acceleration (CPUs and GPUs), you can increase the number to 1,000 or beyond. For comparison, Andrej Karpathy's original experiment took about 8,000 episodes.

```{code-cell} ipython3
:id: TdRXrc37Rfvo

max_episodes = 10
max_episodes = 3
```

+++ {"id": "ORj7JFGB0Gy8"}
@@ -503,7 +503,7 @@ max_episodes = 10
```{code-cell} ipython3
:id: eKLLYUKbG-5A

batch_size = 10
batch_size = 3
learning_rate = 1e-4
```

1 change: 1 addition & 0 deletions requirements.txt
@@ -7,3 +7,4 @@ pytest
nbval
statsmodels
imageio
gym[atari]
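The new `gym[atari]` requirement (together with the `cmake`/`ffmpeg` install step added to CI above) is what the Pong tutorial needs to build its environment. A minimal sketch, assuming the classic pre-0.26 `gym` API that was current when this PR was merged:

```python
import gym

# Create the Atari Pong environment provided by gym[atari] / atari-py.
env = gym.make("Pong-v0")

observation = env.reset()
# Take one random action; the classic gym API returns a 4-tuple.
observation, reward, done, info = env.step(env.action_space.sample())
env.close()
```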
2 changes: 1 addition & 1 deletion site/conf.py
@@ -37,7 +37,7 @@
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store', 'notebooks']

# MyST-NB configuration
execution_timeout = 600
execution_timeout = 900


# -- Options for HTML output -------------------------------------------------
14 changes: 14 additions & 0 deletions site/index.md
@@ -62,6 +62,20 @@ used in the main NumPy documentation has two reasons:

[rst]: https://www.sphinx-doc.org/en/master/usage/restructuredtext/index.html

#### Note

You may notice our content is in markdown format (`.md` files). We review and
host notebooks in the [MyST-NB](https://myst-nb.readthedocs.io/) format. We
accept both Jupyter notebooks (`.ipynb`) and MyST-NB notebooks (`.md`).
If you want to sync your `.ipynb` to your `.md` file, follow the [pairing
tutorial](content/pairing.md); a minimal conversion sketch appears below.

```{toctree}
:hidden:

content/pairing
```
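For reference, a minimal sketch of the conversion that pairing automates, using the `jupytext` Python API (the file names here are placeholders; the pairing tutorial covers the full workflow):

```python
import jupytext

# Read a Jupyter notebook and write it back out as a MyST-NB markdown file.
notebook = jupytext.read("my-tutorial.ipynb")
jupytext.write(notebook, "my-tutorial.md", fmt="md:myst")
```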

### Adding your own tutorials

If you have your own tutorial in the form of a Jupyter notebook (an `.ipynb`