These sequences are mapping to Melanogaster (SH00009.10FU).
Maybe they are not 'classifiable' and therefore map to the first close sequence in the DB file? (Melanogaster ref is the 9th sequence in the reference file of 159189 sequences).
Melanogaster is definitely not the closest match in the DB.
With the previous version I used (v2.21), these sequences were left unclassified.
I am unable to reproduce the problem with the given information. Could you please provide some of the query sequences in refs.fasta that you used, e.g. Laccaria amethystina ITS? Could you please also indicate exactly which database file you used? I cannot find the SH0000009.10FU sequence in the UNITE version 10 files available at https://unite.ut.ee/repository.php.
It is UNITE version 10, modified so that the SH codes are in place of the species names.
Added to the UNITE10.fasta file are a few of my own reference sequences (those that did not map to any SH in a previous round of SINTAX classification). I've given them fake SH codes for easier downstream string manipulation (SH0*[1-9].09FU).
Attached are part of refs.fasta and the sequences added to UNITE10, which show the modified formatting.
Sorry, but I am still unable to reproduce the results you get. Are you sure the input and database files are properly formatted? I've performed several tests with your sequences that all give reasonable results.
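A quick way to sanity-check the database headers is sketched below. It assumes the references carry the `;tax=` annotation that SINTAX reads, with single-letter rank prefixes (d, k, p, c, o, f, g, s, t). The accepted rank letters and the function name `find_bad_headers` are assumptions for illustration, so adjust them to whatever your vsearch version documents.

```python
import re

# Assumed header shape: >ID;tax=d:Name,p:Name,...,s:Name
# The set of rank letters below is an assumption; check your vsearch manual.
TAX_RE = re.compile(
    r";tax=[dkpcofgst]:[^,;]+(?:,[dkpcofgst]:[^,;]+)*(?:;|$)"
)

def find_bad_headers(fasta_lines):
    """Return (line_number, header) pairs for headers without a parsable ;tax= field."""
    bad = []
    for number, line in enumerate(fasta_lines, start=1):
        header = line.rstrip("\n")
        if header.startswith(">") and not TAX_RE.search(header):
            bad.append((number, header))
    return bad
```

Running this over the lines of the database file (for example `find_bad_headers(open("UNITE10.fasta"))`) lists any header that lacks a usable taxonomy string. The hand-added references are the first place to look.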
Could you try to make a tiny example (as small as possible) that still gives the wrong results, and present the exact files used and the exact command line?
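One possible way to cut the files down is sketched below: keep only a handful of records by ID and rerun the same command on the reduced files. The helper name `take_records` is hypothetical, and it treats the record ID as the header text up to the first `;` or whitespace, which is an assumption about how the headers are laid out.

```python
import re

def take_records(fasta_text, wanted_ids):
    """Return only the FASTA records whose ID is in wanted_ids.

    The ID is taken as the header text after '>' up to the first ';'
    or whitespace character (an assumption about the header layout).
    """
    kept = []
    keep = False
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            record_id = re.split(r"[;\s]", line[1:], maxsplit=1)[0]
            keep = record_id in wanted_ids
        if keep:
            kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")
```

Starting from the misclassified queries plus the reference they map to and the reference you expected, you can halve the database repeatedly until the wrong assignment disappears. That usually narrows the problem to a few sequences.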
command:
```
$ vsearch --db UNITE10.fasta --sintax refs.fasta --tabbedout refs_sintaxonomy.tsv --sintax_cutoff 0.8 --sintax_random
vsearch v2.28.1_linux_x86_64, 251.2GB RAM, 96 cores
```
produces this output for the first 5 sequences (these are not closely related):