Commit 762f34c

Merge pull request dennybritz#102 from sstarzycki/patch-1
Update description of env.P[s][a]
2 parents e7085b2 + 2b576bd commit 762f34c

1 file changed, with 1 addition and 1 deletion
DP/Policy Evaluation.ipynb

Lines changed: 1 addition & 1 deletion
@@ -41,7 +41,7 @@
 " Args:\n",
 " policy: [S, A] shaped matrix representing the policy.\n",
 " env: OpenAI env. env.P represents the transition probabilities of the environment.\n",
-" env.P[s][a] is a (prob, next_state, reward, done) tuple.\n",
+" env.P[s][a] is a list of transition tuples (prob, next_state, reward, done).\n",
 " theta: We stop evaluation once our value function change is less than theta for all states.\n",
 " discount_factor: gamma discount factor.\n",
 " \n",
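
To illustrate why the corrected wording matters, here is a minimal sketch of how a caller might consume env.P[s][a] as a list of transition tuples. The two-state table `P`, the value list `V`, and the helper `expected_backup` are hypothetical and written only for this example; they are not taken from the notebook, and no real environment is loaded.

```python
# Hypothetical transition table in the layout the docstring describes:
# P[state][action] is a LIST of (prob, next_state, reward, done) tuples.
P = {
    0: {0: [(0.8, 1, 1.0, False), (0.2, 0, 0.0, False)]},
    1: {0: [(1.0, 1, 0.0, True)]},
}


def expected_backup(P, V, s, a, discount_factor=1.0):
    """Sum the one-step return over every possible outcome of (s, a)."""
    total = 0.0
    # Iterating is required: P[s][a] holds one tuple per possible next state.
    for prob, next_state, reward, done in P[s][a]:
        total += prob * (reward + discount_factor * V[next_state])
    return total


# Assumed current value estimates for states 0 and 1.
V = [0.0, 2.0]
q = expected_backup(P, V, s=0, a=0, discount_factor=0.5)
print(q)
```

Under the old wording, a reader might try to unpack env.P[s][a] directly into four names, which fails whenever an action has more than one possible outcome; looping over the list handles both the deterministic and the stochastic case.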
