Extended version of DDPG_baseline for a HER (Hindsight Experience Replay) implementation. Added features:
- Gym-style Gazebo-ROS environment (supports goal observation)
- Exponential moving average normalizer for observations
- Rollout-train style training sequence

Modified author: [email protected]
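
As an illustration of the exponential moving average normalizer listed among the features, here is a minimal sketch. It is not the code from this repository: the class name `EMANormalizer` and the parameters `alpha`, `eps`, and `clip` are assumptions chosen for the example, and the actual implementation may track its statistics differently.

```python
import numpy as np


class EMANormalizer:
    """Hypothetical EMA observation normalizer (illustrative, not the original code)."""

    def __init__(self, size, alpha=0.01, eps=1e-2, clip=5.0):
        self.alpha = alpha    # assumed smoothing factor for the moving averages
        self.eps = eps        # assumed floor added to the std to avoid division by zero
        self.clip = clip      # assumed clipping range for normalized values
        self.mean = np.zeros(size, dtype=np.float64)
        self.var = np.ones(size, dtype=np.float64)
        self.initialized = False

    def update(self, batch):
        """Blend the statistics of a batch of observations into the running estimates."""
        batch = np.atleast_2d(np.asarray(batch, dtype=np.float64))
        batch_mean = batch.mean(axis=0)
        batch_var = batch.var(axis=0)
        if not self.initialized:
            # Seed the running statistics with the first batch.
            self.mean, self.var = batch_mean, batch_var
            self.initialized = True
        else:
            a = self.alpha
            self.mean = (1.0 - a) * self.mean + a * batch_mean
            self.var = (1.0 - a) * self.var + a * batch_var

    def normalize(self, obs):
        """Return the observation shifted, scaled, and clipped to [-clip, clip]."""
        std = np.sqrt(self.var) + self.eps
        scaled = (np.asarray(obs, dtype=np.float64) - self.mean) / std
        return np.clip(scaled, -self.clip, self.clip)
```

In a rollout-train loop, one would typically call `update` on the observations gathered during a rollout and `normalize` on anything fed to the actor or critic. For goal-conditioned observations, a separate normalizer per component (observation and goal) is a common choice.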