[RL-baseline] Model v5, experiment #1 #43
Action set #0 was chosen for this experiment:
```python
[0.0, 0.0, 0.0],   # no action
[0.0, 0.8, 0.0],   # throttle
[0.0, 0.3, 0.0],   # throttle
[0.0, 0.0, 0.6],   # brake
[0.0, 0.0, 0.2],   # brake
[-0.9, 0.0, 0.0],  # left
[-0.5, 0.0, 0.0],  # left
[-0.2, 0.0, 0.0],  # left
[0.9, 0.0, 0.0],   # right
[0.5, 0.0, 0.0],   # right
[0.2, 0.0, 0.0],   # right
```
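The three-component vectors look like Gym's CarRacing `[steering, gas, brake]` action space. A minimal sketch of how a discrete policy output could be mapped back to a continuous action, assuming `CarRacing-v0` and the old Gym step API (the environment name and the `ACTIONS` wiring are assumptions, not taken from this repo):

```python
import gym
import numpy as np

# Action set #0 from above; each row is [steering, gas, brake].
ACTIONS = np.array([
    [0.0, 0.0, 0.0],   # no action
    [0.0, 0.8, 0.0],   # throttle
    [0.0, 0.3, 0.0],   # throttle
    [0.0, 0.0, 0.6],   # brake
    [0.0, 0.0, 0.2],   # brake
    [-0.9, 0.0, 0.0],  # left
    [-0.5, 0.0, 0.0],  # left
    [-0.2, 0.0, 0.0],  # left
    [0.9, 0.0, 0.0],   # right
    [0.5, 0.0, 0.0],   # right
    [0.2, 0.0, 0.0],   # right
])

env = gym.make("CarRacing-v0")  # assumed environment
obs = env.reset()
action_index = 1  # e.g. sampled from the policy's softmax over the 11 discrete actions
obs, reward, done, info = env.step(ACTIONS[action_index])
```

Discretizing the continuous action space this way lets a standard categorical policy head drive the environment, at the cost of fixing the available steering/throttle/brake intensities up front.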
Loss, entropy, and running reward were all very low between the 10k and 15k episode marks, but the model managed to recover, ending with a maximum running reward of 528, which is also the final running reward at the end of the 20k episodes. The network would likely keep improving if trained for more episodes.
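For context, the "running reward" in actor-critic training loops is typically an exponential moving average of per-episode returns; a minimal sketch, where the 0.05 smoothing factor and variable names are assumptions rather than values from this repo:

```python
# Hypothetical per-episode returns; in training this would be appended each episode.
episode_rewards = [100.0, 250.0, 528.0]

running_reward = 0.0
for episode_reward in episode_rewards:
    # Exponential moving average: new episodes contribute 5%, history 95%.
    running_reward = 0.05 * episode_reward + 0.95 * running_reward
```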
TensorBoard captures below:
Sample video below. As in all previous samples, the model hasn't learned to brake before a sharp turn, but in this run the car drifts back onto the road by sheer luck and manages to recover.
https://user-images.githubusercontent.com/1465235/113404852-2128a500-93a9-11eb-8881-05101c3cd0e0.mp4