MultiHeadAttention parameter setting #180

Open

Is the in_features parameter of the output linear layer in the MultiHeadAttention class (mha.py) set incorrectly? Shouldn't in_features be heads * d_k?
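
To make the question concrete, here is a minimal sketch of the shape bookkeeping involved. It assumes the common convention d_k = d_model // heads; the helper name output_in_features is hypothetical, and the actual code in mha.py may be set up differently.

```python
def output_in_features(d_model: int, heads: int) -> dict:
    """Shape bookkeeping for a multi-head attention output projection.

    Assumption: d_k = d_model // heads (the usual convention).
    This is an illustration only, not the code from mha.py.
    """
    d_k = d_model // heads   # per-head feature size (assumed convention)
    concat = heads * d_k     # size of the concatenated head outputs
    return {
        "d_k": d_k,
        "heads_times_d_k": concat,
        "matches_d_model": concat == d_model,
    }


# d_model divisible by heads: heads * d_k equals d_model.
print(output_in_features(512, 8))

# d_model not divisible by heads: the two sizes differ.
print(output_in_features(512, 7))
```

Under that assumption, the two values coincide whenever d_model is divisible by heads, so using d_model as in_features would only mismatch the concatenated head outputs when d_k is chosen independently or the division is not exact.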

Metadata

Assignees

vpj
