This folder contains my implementations of RL algorithms, exercises and examples from Reinforcement Learning: An Introduction (Sutton & Barto), primarily as runnable Jupyter notebooks and Python scripts.
- Chapter 2 — Multi-armed Bandits
- Chapter 4 — Dynamic Programming
- Chapter 5 — Monte Carlo
- Chapter 6 — Temporal-Difference Learning
- Chapter 7 — n-step Temporal-Difference Learning
The files/ directory contains a few representative figures generated by the notebooks.
- ch_2_Bandits.ipynb
- ch_4_DP_p1_grid_problem.ipynb
- ch_4_DP_p2_car_rental.ipynb
- ch_4_DP_p3_gambler.ipynb
- ch_5_MC_p1_racetrack.ipynb
- ch_6_TD_p1_random_walk.ipynb
- ch_6_TD_p2_windy_gridworld.ipynb
- ch_6_TD_p3_cliff_walking.ipynb
- ch_7_ns_TD_p1_random_walk.ipynb
- ch_7_ns_TD_p2_windy_gridworld.ipynb
toc.py is a small helper script to keep a lightweight table-of-contents for this folder.
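
The contents of toc.py are not reproduced here. As a rough illustration of what such a helper could do, the sketch below groups notebooks by chapter number, assuming the `ch_<chapter>_...ipynb` naming pattern seen in the notebook filenames above. The names `group_notebooks_by_chapter` and `format_toc` are hypothetical and are not taken from toc.py.

```python
import re
from collections import defaultdict
from pathlib import Path

# Hypothetical sketch; this is NOT the actual toc.py.
# Assumes notebooks are named ch_<chapter>_<anything>.ipynb.
CHAPTER_PATTERN = re.compile(r"^ch_(\d+)_.*\.ipynb$")


def group_notebooks_by_chapter(filenames):
    """Return {chapter number: sorted notebook names}, ordered by chapter."""
    chapters = defaultdict(list)
    for name in filenames:
        match = CHAPTER_PATTERN.match(name)
        if match:  # silently skip files that do not follow the pattern
            chapters[int(match.group(1))].append(name)
    return {ch: sorted(names) for ch, names in sorted(chapters.items())}


def format_toc(chapters):
    """Render the grouped notebooks as a plain-text, indented list."""
    lines = []
    for ch, names in chapters.items():
        lines.append(f"Chapter {ch}")
        lines.extend(f"  - {name}" for name in names)
    return "\n".join(lines)


if __name__ == "__main__":
    # List the notebooks found in the current directory.
    found = [p.name for p in Path(".").glob("*.ipynb")]
    print(format_toc(group_notebooks_by_chapter(found)))
```

Parsing the chapter number as an integer keeps the ordering numeric, so a future chapter 10 would sort after chapter 7 rather than before chapter 2.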
- Open any notebook in VS Code or Jupyter and run cells top-to-bottom.
- If you use a virtual environment, install the usual scientific stack (e.g., numpy, matplotlib).
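
Before opening a notebook, it can help to confirm that the active environment has these packages. The snippet below is a minimal sketch that checks only the two packages named above. The `REQUIRED` tuple and the `missing_packages` function are illustrative names, and individual notebooks may import other packages not listed here.

```python
import importlib.util

# Assumption: only the packages named in this README are checked.
REQUIRED = ("numpy", "matplotlib")


def missing_packages(names=REQUIRED):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```

`importlib.util.find_spec` locates a package without importing it, so the check stays fast even for heavy libraries.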



