Step 4: Extract Unique Reads

Step 4: Extract Unique Reads:

Extract the name and length of each reference in the database.
Extract the Unique Reads.
Build the Coverage Plot (it takes the plot window size and the read classifier mode).

$ grep '>' fungi_ConcatContigs.fa > GenomeInformation.txt
$ python ReadClassifier.py SRR3546361_MergedContigs_Sorted.sam 1 > Read_FullList.sam
$ python CoveragePlot.py Read_FullList.sam GenomeInformation.txt 100 1

This command also extracts the uniquereads.

$ samtools view SRR3546361_MergedContigs_Sorted.bam | awk 'BEGIN { FS="\t" } { c[$1]++; l[$1,c[$1]]=$0 } END { for (i in c) { if (c[i] == 1) for (j = 1; j <= c[i]; j++) print l[i,j] } }' | sort -t$'\t' -k 3,3 -V -k 1,1 > Read_FullList.sam

Check the correctness of the results:

To count the number of Unique reads, then you compare this number with the number of extracted unique reads:

$ samtools view SRR3546361_MergedContigs_Filtered.bam | awk '{print $1}'| uniq -c | awk -F ' ' '{if ($1==1) print $2}' | wc -l

or more accurately compare the reads:

$ samtools view SRR3546361_MergedContigs_Filtered.bam | awk '{print $1}'| uniq -c | awk -F ' ' '{if ($1==1) print $2}' | sort > test.txt
$ awk '{print $1}' UniqueReads_Sorted.sam | sort > test2.txt
$ diff test.txt test2.txt

alt text

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Step 4: Extract Unique Reads

Step 4: Extract Unique Reads:

Check the correctness of the results:

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally