2 files changed, +4 -4 lines changed

File 1 (diff hunks at original lines 50–57 and 82–88):

      " Args:\n",
      " Q: A dictionary that maps from state -> action-values.\n",
      " Each value is a numpy array of length nA (see below)\n",
-     " epsilon: The probability to select a random action . float between 0 and 1.\n",
+     " epsilon: The probability to select a random action. Float between 0 and 1.\n",
      " nA: Number of actions in the environment.\n",
      " \n",
      " Returns:\n",

      " num_episodes: Number of episodes to run for.\n",
      " discount_factor: Gamma discount factor.\n",
      " alpha: TD learning rate.\n",
-     " epsilon: Chance the sample a random action. Float betwen 0 and 1.\n",
+     " epsilon: Chance to sample a random action. Float between 0 and 1.\n",
      " \n",
      " Returns:\n",
      " A tuple (Q, episode_lengths).\n",
File 2 (diff hunks at original lines 49–56 and 81–87):

      " Args:\n",
      " Q: A dictionary that maps from state -> action-values.\n",
      " Each value is a numpy array of length nA (see below)\n",
-     " epsilon: The probability to select a random action . float between 0 and 1.\n",
+     " epsilon: The probability to select a random action. Float between 0 and 1.\n",
      " nA: Number of actions in the environment.\n",
      " \n",
      " Returns:\n",

      " num_episodes: Number of episodes to run for.\n",
      " discount_factor: Gamma discount factor.\n",
      " alpha: TD learning rate.\n",
-     " epsilon: Chance the sample a random action. Float betwen 0 and 1.\n",
+     " epsilon: Chance to sample a random action. Float between 0 and 1.\n",
      " \n",
      " Returns:\n",
      " A tuple (Q, episode_lengths).\n",