Copy task - Learning on input of length 1 #4
Comments
Adrien Ball: Man this is so exciting!
Reply: Nice!
As suggested by @adrienball, I ran an experiment training the NTM on length-one inputs only, to see whether it could learn such a simple behavior at all (even if it overfits). The NTM successfully recovered the length-one inputs.
When I tested this trained NTM on longer inputs, it consistently failed to recover the whole sequences (as expected, given the lack of variety in the training input lengths), but it generally succeeded in remembering the first vector. However, some interesting patterns emerged.
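The setup described above can be sketched as follows. This is a hypothetical helper, not the repository's actual data generator: it builds a copy-task example (random binary vectors followed by an end-of-sequence marker, with the sequence itself as target), trained at length 1 and probed at longer lengths.

```python
import numpy as np

def copy_task_batch(seq_len, dim=8, rng=None):
    """Build one copy-task example: a random binary sequence followed by
    an end-of-sequence marker; the target is the sequence itself.
    (Hypothetical helper; the repo's real generator may differ.)"""
    rng = rng or np.random.default_rng()
    seq = rng.integers(0, 2, size=(seq_len, dim)).astype(np.float32)
    eos = np.zeros((1, dim), dtype=np.float32)  # all-zero end marker (an assumption)
    inputs = np.concatenate([seq, eos], axis=0)
    targets = seq
    return inputs, targets

# Training regime from the experiment: only length-1 inputs.
x, y = copy_task_batch(seq_len=1)
assert x.shape == (2, 8) and y.shape == (1, 8)

# Evaluation regime: longer inputs the model never saw during training.
x_long, y_long = copy_task_batch(seq_len=5)
assert x_long.shape == (6, 8) and y_long.shape == (5, 8)
```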
Parameters of the experiment
- learning_rate=1e-3 (other parameters left as is from his previous paper)
- Activation functions: [add, key, beta], 1 + ReLU for gamma, sigmoid for [gate, dense_output], softmax for shift
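The per-head activations listed above can be sketched as plain NumPy functions. This is a minimal illustration, not the repository's actual (Lasagne/Theano) implementation; the dictionary keys are just labels taken from the list above.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - np.max(x))  # subtract max for numerical stability
    return e / e.sum()

def head_activations(raw):
    """Apply the activations listed above to a head's raw outputs.
    (Sketch only; key names are illustrative.)"""
    return {
        "gamma": 1.0 + relu(raw["gamma"]),            # sharpening factor >= 1
        "gate": sigmoid(raw["gate"]),                 # interpolation gate in (0, 1)
        "shift": softmax(raw["shift"]),               # shift weights sum to 1
        "dense_output": sigmoid(raw["dense_output"]),
    }

out = head_activations({
    "gamma": np.array(-0.5),
    "gate": np.array(0.0),
    "shift": np.array([1.0, 0.0, -1.0]),
    "dense_output": np.array(2.0),
})
# gamma = 1 + relu(-0.5) = 1.0, and gate = sigmoid(0) = 0.5
```

The `1 + ReLU` form guarantees the sharpening exponent gamma never drops below 1, and the softmax over shift offsets keeps the shift distribution normalized.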
Learning curve
Gray: cost function; Red: moving average of the cost function over 500 iterations.
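The smoothed curve above is a plain trailing moving average; a minimal version (an assumption about how the plot was produced, using `np.convolve`) looks like this:

```python
import numpy as np

def moving_average(values, window=500):
    """Trailing moving average used to smooth a noisy cost curve."""
    values = np.asarray(values, dtype=np.float64)
    window = min(window, len(values))
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")

# Example on a synthetic noisy cost curve of 2000 iterations.
costs = np.abs(np.sin(np.arange(2000))) + 0.1
smooth = moving_average(costs, window=500)
assert smooth.shape == (1501,)  # 2000 - 500 + 1 points in 'valid' mode
```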