Doubt on reset session_level init Hs[0] #11

Open
gescobedo opened this issue Dec 10, 2018 · 3 comments

Comments

@gescobedo

In the initialization of the session-level GRU you use a mask to reset the hidden state of the cell for some items of the batch, generating `h_s = Hs[0] * (1 - Sstart[:, None]) + h_s_init * Sstart[:, None]`, but then you use the same old state `Hs[0]` to calculate the final output state of the current cell. Why is `Hs[0]` used instead of `h_s` in `h_s = (1.0 - z) * Hs[0] + z * h_s_candidate.T`? Thanks in advance.
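To make the question concrete, here is a minimal NumPy sketch (not the original Theano code; shapes are simplified and the transpose is dropped) of the two steps being discussed: the masked per-sequence reset, followed by the final GRU blend done once with the stale `Hs[0]` and once with the reset state `h_s`:

```python
import numpy as np

def session_reset_and_update(Hs0, h_s_init, Sstart, z, h_s_candidate):
    """Illustrative sketch of the session-level update.

    Hs0:           previous session state, shape (batch, dim)
    h_s_init:      freshly initialized state for new sessions, (batch, dim)
    Sstart:        0/1 session-start mask, shape (batch,)
    z:             GRU update gate, (batch, dim)
    h_s_candidate: GRU candidate state, (batch, dim)
    """
    # Masked reset: keep the old state where the session continues,
    # swap in the init state where a new session starts.
    h_s = Hs0 * (1 - Sstart[:, None]) + h_s_init * Sstart[:, None]
    # Questioned version: blends the candidate with the *unreset* Hs0,
    # so the init state is ignored wherever Sstart == 1.
    h_buggy = (1.0 - z) * Hs0 + z * h_s_candidate
    # Expected version: blends the candidate with the reset state h_s.
    h_fixed = (1.0 - z) * h_s + z * h_s_candidate
    return h_buggy, h_fixed
```

With `Sstart = 1` everywhere and `z = 0`, the fixed blend returns `h_s_init` while the questioned one returns the old `Hs0`, which is exactly the discrepancy described above.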

@mquad (Owner) commented Dec 13, 2018

Hi @gescobedo , first thanks for carefully checking the code.
I double checked the code and this looks like an actual bug that slipped through the cracks.
Indeed, while `h_s_candidate` is computed correctly, `Hs[0]` still contains the previous session's value, which invalidates the final update of `h_s` through the GRU update gate.
I will work on a patch and rerun some experiments to assess the impact of this issue as soon as I can.

@mquad (Owner) commented Dec 31, 2018

Hi @gescobedo , I just pushed a fix for the initialization issue.
For the moment I could only test small nets (100+100), init version, on XING. These are the results, averaged over 10 random seeds. Original refers to the previous (buggy) version, while Fixed is the new one.

XING, INIT

| Seed | Recall (Original) | Recall (Fixed) | MRR (Original) | MRR (Fixed) |
| --- | --- | --- | --- | --- |
| 0 | 0.1320 | 0.1324 | 0.0823 | 0.0829 |
| 1 | 0.1353 | 0.1334 | 0.0850 | 0.0833 |
| 2 | 0.1348 | 0.1331 | 0.0840 | 0.0824 |
| 3 | 0.1339 | 0.1338 | 0.0842 | 0.0827 |
| 4 | 0.1315 | 0.1337 | 0.0810 | 0.0821 |
| 5 | 0.1333 | 0.1325 | 0.0825 | 0.0829 |
| 6 | 0.1343 | 0.1339 | 0.0844 | 0.0838 |
| 7 | 0.1331 | 0.1325 | 0.0836 | 0.0833 |
| 8 | 0.1338 | 0.1340 | 0.0837 | 0.0827 |
| 9 | 0.1332 | 0.1315 | 0.0834 | 0.0809 |
| MEAN | 0.1335 | 0.1331 | 0.0834 | 0.0827 |
| STD | 0.0012 | 0.0008 | 0.0012 | 0.0008 |

As you may notice there's a small drop in performance, albeit not statistically significant.
The hyperparameters of the Fixed version, computed after 100 runs of random search, are:

```
--learning_rate 0.1 --momentum 0.1 --batch_size 100 \
--dropout_p_hidden_usr 0.1 \
--dropout_p_hidden_ses 0.2 \
--dropout_p_init 0.2
```

I'll let you know when I have more data on this issue.

@gescobedo (Author)

Hi @mquad, thanks for checking this issue so quickly; it's good to know the drop in performance is small. I will keep an eye on this.
