Char01 DQN

chenyingyinglalala

Nov 3, 2019

2ad8b9f · Nov 3, 2019

Name	Last commit message	Last commit date
DQN/pic	add	Nov 3, 2019
DQN.py	add	Nov 3, 2019
DQN_CartPole-v0.py	add	Nov 3, 2019
DQN_MountainCar-v0.py	add	Nov 3, 2019
DQN_mountain_car_v1.py	add	Nov 3, 2019
naiveDQN.py	add	Nov 3, 2019
readme.md	add	Nov 3, 2019

readme.md

Requirements:

tensorflow 1.10
pytorch 0.4.1
tensorboardX
gym

Tips for MountainCar-v0 env:

The reward in MountainCar-v0 is very sparse. In gym, every step returns the same reward (-1), and the only distinct event is reaching the flag at the top of the mountain, which ends the episode. As a result, if no trajectory that reaches the top is sampled during training, the agent has essentially nothing to learn from and training will not converge. One remedy is to change the reward, for example by making it positively correlated with the car's current position. A more advanced approach is inverse reinforcement learning (for example, using a GAN).
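A position-based reward can be sketched as follows. This is only an illustration, not the code in DQN_MountainCar-v0.py: the function name `shaped_reward` and the `scale` parameter are made up for this example, while the track bounds are the ones gym uses for MountainCar-v0.

```python
# Hypothetical reward shaping for MountainCar-v0 (illustrative, not the repo's code).
GOAL_POSITION = 0.5   # x-coordinate of the flag in gym's MountainCar-v0
MIN_POSITION = -1.2   # left end of the track in gym's MountainCar-v0


def shaped_reward(position, env_reward, scale=1.0):
    """Return the environment reward plus a bonus that grows with position.

    position   -- the car's x-coordinate (first entry of the observation)
    env_reward -- the reward returned by env.step()
    scale      -- assumed weight of the bonus; tune it for your setup
    """
    # Map the position linearly so the left wall is 0 and the flag is 1.
    progress = (position - MIN_POSITION) / (GOAL_POSITION - MIN_POSITION)
    return env_reward + scale * progress


# Typical use inside the training loop (gym API as of 2019):
#   next_state, reward, done, info = env.step(action)
#   reward = shaped_reward(next_state[0], reward)
```

A purely position-based bonus is a simple choice. Because the car has to swing left to build momentum, a bonus based on height or speed may work better in practice.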

The value loss curve for DQN shows the loss increasing to about 1e13, yet the network still works well. As training goes on, target_net and act_net become very different, so the calculated loss becomes very large. The earlier loss was small because the reward was very sparse, which resulted in only small updates to the two networks.
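To make the role of the two networks concrete, here is a minimal sketch of the one-step value loss for a single transition, assuming a squared-error loss. The helper names `td_target` and `value_loss` are illustrative and are not taken from DQN.py. Plain lists stand in for the network outputs.

```python
# Illustrative one-step DQN value loss for a single transition.
# q_act and next_q_target stand in for the outputs of act_net(s) and target_net(s').


def td_target(reward, next_q_target, gamma=0.99, done=False):
    """Target value: r + gamma * max_a' Q_target(s', a'), or r if the episode ended."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def value_loss(q_act, action, target):
    """Squared difference between Q_act(s, a) and the target value."""
    return (q_act[action] - target) ** 2


# When both networks produce small, similar values, the loss is small.
small = value_loss([0.1, 0.2], 1, td_target(-1.0, [0.1, 0.2]))

# When the two networks disagree by a large amount, the loss grows quadratically.
large = value_loss([1e6, 2e6], 1, td_target(-1.0, [1e3, 2e3]))
```

Because the error is squared, a gap of about 1e6 between the two networks' estimates already gives a loss on the order of 1e12.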
