pangoLEARN data release 2021-03-29

@aineniamh aineniamh released this 31 Mar 07:19

Release notes

  • pangoLEARN trained on a downsampled dataset. The 382,806 designated sequences were downsampled to 177,916, maintaining diversity by representing every node in the tree.
  • Downsampling took 2.5 hours; training time dropped from more than 20 hours to 3 hours.
  • We're hosting both the designated metadata and the final downsample that went into the training here.
  • 1,249 lineages in total; 1,169 of them were downsampled because they had more than 10 designated sequences.
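
The per-lineage rule in the notes above (only lineages with more than 10 designated sequences are downsampled) can be illustrated with a minimal sketch. This is not the pangoLEARN procedure: the release describes a tree-aware downsample that keeps every node represented, whereas the sketch below substitutes plain random sampling within each lineage. The function name `downsample_by_lineage`, its parameters, and the use of a single overall keep fraction (derived from the 382,806 and 177,916 figures) are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# Figures quoted in the release notes; used here only to derive an
# overall keep fraction for the illustration (roughly 0.46).
DESIGNATED = 382_806
DOWNSAMPLED = 177_916
KEEP_FRACTION = DOWNSAMPLED / DESIGNATED


def downsample_by_lineage(records, min_size=10, keep_fraction=KEEP_FRACTION, seed=0):
    """Hypothetical helper: randomly thin out large lineages.

    records: iterable of (sequence_id, lineage) pairs.
    Lineages with more than `min_size` sequences are randomly reduced to
    about `keep_fraction` of their size (never below `min_size`);
    smaller lineages are kept in full.
    NOTE: random sampling is a stand-in, not the tree-aware method
    described in the release.
    """
    rng = random.Random(seed)

    # Group sequence IDs by their designated lineage.
    by_lineage = defaultdict(list)
    for seq_id, lineage in records:
        by_lineage[lineage].append(seq_id)

    kept = []
    for lineage, seq_ids in sorted(by_lineage.items()):
        if len(seq_ids) > min_size:
            n_keep = max(min_size, round(len(seq_ids) * keep_fraction))
            seq_ids = rng.sample(seq_ids, n_keep)
        kept.extend((seq_id, lineage) for seq_id in seq_ids)
    return kept


if __name__ == "__main__":
    # Toy data: one large lineage and one small lineage.
    toy = [(f"a{i}", "A") for i in range(100)] + [(f"b{i}", "B.1") for i in range(5)]
    result = downsample_by_lineage(toy)
    print(len(toy), "->", len(result))
```

Flooring each downsampled lineage at `min_size` keeps rare lineages from being thinned below the threshold that triggered downsampling in the first place; a real tree-aware approach would choose which sequences to keep based on phylogenetic placement rather than at random.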