
An extra dropout layer? #5

Open
VincentPisztora opened this issue Nov 7, 2024 · 0 comments
Comments

@VincentPisztora

Hi there - thank you for posting this repository; it has been very helpful. In the model.py file, if I'm reading it right, there are two back-to-back dropout layers in the TransformerBlock call. The first is the final layer of the "mlp" block (defined on line 67, called on line 82), and the second is the "dropout2" layer (defined on line 73, called on line 83). Should one of the two be removed? Thanks!
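To make the concern concrete, here is a minimal PyTorch sketch of the pattern described above. The names (`mlp`, `dropout2`) mirror the issue report, not the repository's actual model.py, and the dropout rate is an assumed placeholder:

```python
import torch
import torch.nn as nn

# Hypothetical reconstruction of the reported pattern: the "mlp" block
# already ends in a Dropout, and a separate "dropout2" is applied
# immediately after it in the forward pass.
mlp = nn.Sequential(
    nn.Linear(16, 64),
    nn.GELU(),
    nn.Linear(64, 16),
    nn.Dropout(p=0.1),        # final layer of "mlp" (the issue's line 67 / call on line 82)
)
dropout2 = nn.Dropout(p=0.1)  # "dropout2" (the issue's line 73 / call on line 83)

x = torch.randn(4, 16)
out = dropout2(mlp(x))        # dropout applied twice in a row

# In training mode, two consecutive Dropout(p=0.1) layers keep each
# activation with probability (1 - 0.1) ** 2 = 0.81, not the intended
# 0.9, so the effective drop rate is roughly doubled.
```

If the double application is unintentional, dropping either the `Dropout` inside `mlp` or the standalone `dropout2` would restore the intended rate.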
