Commit e4caa7f

Fix wrapper comment
1 parent b931b39 commit e4caa7f

1 file changed: +7 -4 lines changed

src/reward_preprocessing/procgen.py

@@ -64,15 +64,18 @@ def register_procgen_envs(
 
 class ProcgenFinalObsWrapper(gym.Wrapper):
     """Returns the final observation of gym3 procgen environment, correcting for the
-    implicit reset.
+    fact that Procgen gym environments return the second-to-last observation again
+    instead of the final observation.
+
     Only works correctly when the 'done' signal coincides with the end of an episode
     (which is not the case when using e.g. the seals AutoResetWrapper).
     Requires the use of the PavelCz/procgenAISC fork, which adds the 'final_obs' value.
 
     Since procgen builds on gym3, it always resets the environment after a terminal
-    state. The 'obs' returned will then be the first observation of the next episode.
-    In our fork of procgen, we save the last observation of the terminated episode in
-    the info dict.
+    state. The final 'obs' returned when done==True will be the obs that was already
+    returned in the previous step. In our fork of procgen, we save the true last
+    observation of the terminated episode in the info dict. This wrapper extracts that
+    obs and returns it.
     """
 
     def step(self, action):
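
The body of step() is not part of this diff. As a rough illustration of the behaviour the updated docstring describes, here is a minimal, dependency-free sketch. FakeProcgenEnv and FinalObsSketch are hypothetical names invented for this example; the real wrapper subclasses gym.Wrapper, and the assumption that the true last observation sits under info["final_obs"] is taken from the docstring's mention of the 'final_obs' value, not from code shown here.

```python
class FakeProcgenEnv:
    """Hypothetical stand-in for the forked Procgen env (not the real API).

    Observations are just the integer timestep. On the terminal step it
    repeats the previous observation and stores the true last observation
    in info["final_obs"], mimicking the behaviour the docstring describes.
    """

    def __init__(self, episode_len=3):
        self.episode_len = episode_len
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= self.episode_len
        if done:
            # Stale obs (same as previous step); true obs goes into info.
            return self.t - 1, 0.0, True, {"final_obs": self.t}
        return self.t, 0.0, False, {}


class FinalObsSketch:
    """Sketch of the wrapper idea: swap in info["final_obs"] when done."""

    def __init__(self, env):
        self.env = env

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, rew, done, info = self.env.step(action)
        if done:
            # Assumed key name, following the docstring's 'final_obs'.
            obs = info["final_obs"]
        return obs, rew, done, info


def rollout(env, n_steps):
    """Collect the observations returned by n_steps consecutive steps."""
    env.reset()
    return [env.step(0)[0] for _ in range(n_steps)]
```

Without the wrapper, the last observation of the episode duplicates the one before it; with the wrapper, the terminal step returns the true final observation instead. As the docstring notes, this only holds when 'done' coincides with the actual end of an episode.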
