README.md
+48-39
@@ -27,69 +27,39 @@ There are 4 parts to the package:
27
27
28
28
2)**Complex Environment Wrappers**: Similar to the toy environment, this is parameterised by a `config` dict which contains all the information needed to inject the dimensions into Atari or Mujoco environments. Please see [`example.py`](example.py) for some simple examples of how to use these. The Atari wrapper is in [`mdp_playground/envs/gym_env_wrapper.py`](mdp_playground/envs/gym_env_wrapper.py) and the Mujoco wrapper is in [`mdp_playground/envs/mujoco_env_wrapper.py`](mdp_playground/envs/mujoco_env_wrapper.py).
29
29
30
-
3)**Experiments**: Experiments are launched using [`run_experiments.py`](run_experiments.py). Config files for experiments are located inside the [`experiments`](experiments) directory. Please read the [instructions](#running-experiments) below for details.
30
+
3)**Experiments**: Experiments are launched using [`run_experiments.py`](run_experiments.py). Config files for experiments are located inside the [`experiments`](experiments) directory. Please read the [instructions](#running-experiments) below for details on how to launch experiments.
31
31
32
32
4)**Analysis**: [`plot_experiments.ipynb`](plot_experiments.ipynb) contains code to plot the standard plots from the paper.
33
33
34
-
## Installation
35
-
36
-
### Production use
37
-
We recommend using `conda` to manage environments. After setup of the environment, you can install MDP Playground in two ways:
38
-
#### Manual
39
-
To install MDP Playground manually, clone the repository and run:
40
-
```bash
41
-
pip install -e .[extras]
42
-
```
43
-
This might be the preferred way if you want easy access to the included experiments.
44
34
45
-
#### From PyPI
46
-
MDP Playground is also on PyPI. Just run:
47
-
```bash
48
-
pip install mdp_playground[extras]
49
-
```
35
+
## Running experiments from the main paper
36
+
For reproducing experiments from the main paper, please continue reading.
50
37
38
+
For general instructions, please see [here](#installation).
51
39
52
-
### Reproducing results from the paper
53
-
We recommend using `conda` environments to manage virtual `Python` environments to run the experiments. Unfortunately, you will have to maintain 2 environments - 1 for the "older" **discrete toy** experiments and 1 for the "newer" **continuous and complex** experiments from the paper. As mentioned in Appendix P in the paper, this is because of issues with Ray, the library that we used for our baseline agents.
40
+
### Installation for running experiments from the main paper
41
+
We recommend using `conda` environments to manage virtual `Python` environments to run the experiments. Unfortunately, you will have to maintain 2 environments - 1 for the "older" **discrete toy** experiments and 1 for the "newer" **continuous and complex** experiments from the paper. As mentioned in Appendix section **Tuned Hyperparameters** in the paper, this is because of issues with Ray, the library that we used for our baseline agents.
54
42
55
43
Please follow the following commands to install for the discrete toy experiments:
56
44
```bash
57
45
conda create -n py36_toy_rl_disc_toy python=3.6
58
46
conda activate py36_toy_rl_disc_toy
59
47
cd mdp-playground
48
+
pip install -r requirements.txt
60
49
pip install -e .[extras_disc]
61
50
```
62
51
63
-
Please follow the following commands to install for the continuous and complex experiments:
52
+
Please use the following commands to install for the continuous and complex experiments. **IMPORTANT**: If you do not have MuJoCo, please ignore any mujoco-py related installation errors below:
The `exp_name` is a prefix for the filenames of CSV files where stats for the experiments are recorded. The CSV stats files will be saved to the current directory.<br>
83
-
Each of the command line arguments has defaults. Please refer to the documentation inside [`run_experiments.py`](run_experiments.py) for further details on the command line arguments. (Or run it with the `-h` flag to bring up help.)
84
-
85
-
The config files for experiments from the [paper](https://arxiv.org/abs/1909.07750) are in the experiments directory.<br>
86
-
The name of the file corresponding to an experiment is formed as: `<algorithm_name>_<dimension_names>.py`<br>
87
-
Some sample `algorithm_name`s are: `dqn`, `rainbow`, `a3c`, `a3c_lstm`, `ddpg`, `td3` and `sac`<br>
88
-
Some sample `dimension_name`s are: `seq_del` (for **delay** and **sequence length** varied together), `p_r_noises` (for **P** and **R noises** varied together),
For example, for algorithm **DQN** when varying dimensions **delay** and **sequence length**, the corresponding experiment file is [`dqn_seq_del.py`](experiments/dqn_seq_del.py)
91
-
92
-
## Running experiments from the main paper
93
63
We list here the commands for the experiments from the main paper:
# If you want to parallelise on a cluster, please provide the CLI argument -n <config_number> at the end of the given commands. Please refer to the documentation for run_experiments.py for this.
The `exp_name` is a prefix for the filenames of CSV files where stats for the experiments are recorded. The CSV stats files will be saved to the current directory.<br>
122
+
Each of the command line arguments has defaults. Please refer to the documentation inside [`run_experiments.py`](run_experiments.py) for further details on the command line arguments. (Or run it with the `-h` flag to bring up help.)
123
+
124
+
The config files for experiments from the [paper](https://arxiv.org/abs/1909.07750) are in the experiments directory.<br>
125
+
The name of the file corresponding to an experiment is formed as: `<algorithm_name>_<dimension_names>.py`<br>
126
+
Some sample `algorithm_name`s are: `dqn`, `rainbow`, `a3c`, `a3c_lstm`, `ddpg`, `td3` and `sac`<br>
127
+
Some sample `dimension_name`s are: `seq_del` (for **delay** and **sequence length** varied together), `p_r_noises` (for **P** and **R noises** varied together),
For example, for algorithm **DQN** when varying dimensions **delay** and **sequence length**, the corresponding experiment file is [`dqn_seq_del.py`](experiments/dqn_seq_del.py)
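The naming convention above can be illustrated with a minimal Python sketch. The helper name `experiment_config_path` and its `experiments_dir` parameter are hypothetical and not part of MDP Playground; the sketch only assembles the path from the `<algorithm_name>_<dimension_names>.py` pattern and does not check that the file exists.

```python
from pathlib import Path


def experiment_config_path(algorithm_name: str,
                           dimension_names: str,
                           experiments_dir: str = "experiments") -> Path:
    """Build the expected config file path for an experiment.

    Hypothetical helper: it only follows the documented pattern
    <algorithm_name>_<dimension_names>.py inside the experiments directory.
    """
    return Path(experiments_dir) / f"{algorithm_name}_{dimension_names}.py"


# DQN with delay and sequence length varied together
print(experiment_config_path("dqn", "seq_del").as_posix())
# prints "experiments/dqn_seq_del.py"
```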
130
+
131
+
The CSV stats files will be saved to the current directory and can be analysed in [`plot_experiments.ipynb`](plot_experiments.ipynb).
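Since `exp_name` is used as a prefix for the CSV stats filenames, the files belonging to one experiment can be collected with a short sketch like the one below. The helper name `find_stats_files` is hypothetical, and the exact suffixes that `run_experiments.py` appends after the prefix are not assumed here; the sketch simply matches any `.csv` file whose name starts with the prefix.

```python
from pathlib import Path
from typing import List


def find_stats_files(exp_name: str, directory: str = ".") -> List[Path]:
    """Return the CSV files in `directory` whose names start with `exp_name`.

    Hypothetical helper: it relies only on the documented fact that
    exp_name is a filename prefix; it assumes nothing about the rest
    of the filename.
    """
    return sorted(
        path for path in Path(directory).iterdir()
        if path.is_file()
        and path.name.startswith(exp_name)
        and path.suffix == ".csv"
    )


# "dqn_seq_del" is an illustrative exp_name, not a required value
for stats_file in find_stats_files("dqn_seq_del"):
    print(stats_file)
```

Plain prefix matching is used instead of a glob pattern so that an `exp_name` containing characters such as `[` or `*` is treated literally.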
132
+
124
133
## Plotting
125
134
To plot results from experiments, run `jupyter-notebook` and open [`plot_experiments.ipynb`](plot_experiments.ipynb) in Jupyter. There are instructions within each of the cells on how to generate and save plots.