We view Transformers and RNNs as sequence-to-sequence maps and compare their ability to learn to permute their input sequence of tokens. We find that, without positional encodings, Transformers perform worse than RNNs when learning an unknown permutation from labeled training data. However, the opposite holds once positional encodings are incorporated (a sketch of the setup appears below).
vinodkraman/Learning-Permutations
About
Comparing Transformers and RNNs in their ability to learn permutations.
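To make the setup concrete, here is a minimal sketch of the experiment, assuming PyTorch; it is an illustration, not the repository's actual code. A fixed random permutation `pi` defines the labels: each input `x` is paired with `x[:, pi]`, and a small Transformer encoder (with a flag toggling learned positional embeddings) is compared against a bidirectional GRU baseline. All names and hyperparameters (`PermuteTransformer`, `PermuteRNN`, `SEQ_LEN`, and so on) are hypothetical.

```python
# Hypothetical sketch: learning a fixed, unknown permutation from labeled pairs.
import torch
import torch.nn as nn

SEQ_LEN, VOCAB, D_MODEL = 16, 32, 64
torch.manual_seed(0)
pi = torch.randperm(SEQ_LEN)  # the unknown permutation the models must learn

class PermuteTransformer(nn.Module):
    def __init__(self, use_pos_encoding: bool):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MODEL)
        # Learned positional embeddings; pass use_pos_encoding=False for the PE-free case.
        self.pos = nn.Embedding(SEQ_LEN, D_MODEL) if use_pos_encoding else None
        layer = nn.TransformerEncoderLayer(D_MODEL, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.out = nn.Linear(D_MODEL, VOCAB)

    def forward(self, x):  # x: (batch, SEQ_LEN) token ids
        h = self.embed(x)
        if self.pos is not None:
            h = h + self.pos(torch.arange(SEQ_LEN, device=x.device))
        return self.out(self.encoder(h))  # (batch, SEQ_LEN, VOCAB) logits

class PermuteRNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MODEL)
        # Bidirectional so every output position can see the whole input sequence.
        self.rnn = nn.GRU(D_MODEL, D_MODEL // 2, batch_first=True, bidirectional=True)
        self.out = nn.Linear(D_MODEL, VOCAB)

    def forward(self, x):
        h, _ = self.rnn(self.embed(x))
        return self.out(h)

def make_batch(n=128):
    x = torch.randint(VOCAB, (n, SEQ_LEN))
    return x, x[:, pi]  # label = the input with its tokens rearranged by pi

model = PermuteTransformer(use_pos_encoding=True)  # or PermuteRNN() for the baseline
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
for step in range(500):
    x, y = make_batch()
    loss = loss_fn(model(x).reshape(-1, VOCAB), y.reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
```

One property worth noting: with `use_pos_encoding=False`, a pure self-attention encoder is permutation-equivariant, so it cannot represent a fixed non-identity permutation at all, which is consistent with the finding above. The bidirectional GRU breaks that symmetry through its sequential recurrence; the repository's actual RNN architecture and training details may differ.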