Sam2conseq script will report "N" instead of majority consensus #17
Comments
Thanks for reporting this, @ewong347. However, I can't reproduce this issue.
Weird, I just tried to reproduce this issue as well, with no luck. When I re-ran the pipeline on the sequence, the new alignment also included more nucleotides on the 5' end of the sequence. It might just have been a hiccup on my end. I think we can close this issue for now. I'll keep an eye out to see if I notice anything like this for the final couple of sequences.
Well, hiccups are not OK!
I used the workflow on the main page for sam2conseq:

    fasterq-dump SRRxxxx
    cutadapt -q 20,20 -a CTGTCTCTTATACACATCT -o SRRxxxx.trim.fastq SRRxxxx.fastq
    bowtie2 -x NC_045512 -U SRRxxxx.trim.fastq -S SRRxxxx.sam --local
    python3 sam2conseq.py --unpaired SRRxxxx.sam freqs.csv SRRxxxx.conseq.txt
Nothing unusual there. @ewong347, can you please re-run the last step a few times to see if you can reproduce this issue? Also, please check your command history to see if there was something different in that run.
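One way to check whether repeated runs of the last step agree is to hash the output of each run and compare the digests. The sketch below is only an illustration: `outputs_are_stable` and `run_sam2conseq` are hypothetical helper names, and it assumes that `SRRxxxx.conseq.txt` (the last argument in the workflow above) is the file worth comparing.

```python
import hashlib
import subprocess
from pathlib import Path


def outputs_are_stable(run_once, repeats=3):
    """Call run_once() `repeats` times and report whether every run
    returned byte-identical output."""
    digests = {hashlib.sha256(run_once()).hexdigest() for _ in range(repeats)}
    return len(digests) == 1


def run_sam2conseq():
    """Hypothetical wrapper: re-run the last workflow step and return the
    bytes of the consensus file (assumed to be the last argument)."""
    subprocess.run(
        ["python3", "sam2conseq.py", "--unpaired",
         "SRRxxxx.sam", "freqs.csv", "SRRxxxx.conseq.txt"],
        check=True,  # raise if the script exits with an error
    )
    return Path("SRRxxxx.conseq.txt").read_bytes()
```

Calling `outputs_are_stable(run_sam2conseq)` would return `False` if any run produced a different consensus file, which would point to a nondeterministic step rather than a one-off hiccup.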
Expected behavior: sam2conseq should report the nucleotide with the highest frequency when the minor nucleotide is observed only once.
Observed behavior: sam2conseq reports N instead of the major nucleotide, even when the ratio is 350:1 (position 7676).
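To make the expected behavior concrete, here is a minimal sketch of a majority-consensus rule. It is not the sam2conseq implementation; the function name `call_base`, the `min_depth` and `majority` parameters, and the specific bases in the example are assumptions chosen for illustration.

```python
def call_base(counts, min_depth=1, majority=0.5):
    """Return the most frequent nucleotide at one position, or "N" when
    coverage is too low or no nucleotide holds a strict majority.

    counts maps nucleotide -> number of reads supporting it.
    """
    depth = sum(counts.values())
    if depth < min_depth:
        return "N"  # not enough coverage to make a call
    base, top = max(counts.items(), key=lambda kv: kv[1])
    if top / depth <= majority:
        return "N"  # tie or no clear majority
    return base


# Hypothetical counts mirroring the 350:1 ratio reported at position 7676.
print(call_base({"A": 350, "G": 1}))
```

Under this rule a single discordant read out of 351 does not change the call, which matches the expected behavior described above.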
WA11-UW7 (SRR11278092), aligned sequence, depth freqs found in langley/covid/problematic
pos 224
Freqs table: