Progress
mealser edited this page Jul 19, 2017 · 11 revisions
- Concatenate all contigs for each genome and build the genome database (.fa file).
- Extract unique reads.
- Extract multimapped reads within genomes.
- Extract multimapped reads across genomes.
- Build "coverage plot" for 1, 2, and 3.
- Nominate the best alignment for each read that is multimapped within a genome (ranked by edit distance, then alignment score, with remaining ties broken randomly).
- Nominate the best alignment for each read that is multimapped across genomes (based on the genome with the highest percentage of (total number of unique reads × their total length / reference length)).
- Build histogram plot of edit distance for 1, 2, and 3.
- Build histogram plot of match percentage (matched bases / read length; e.g. if read length = 150 and CIGAR = 29S100M1I10M10S, then matches% = 110/150) for 1, 2, and 3.
- Calculate relative abundance for each genome.
- Build our comprehensive reference database (fungi, eukaryote, and plasmid).
- For each fungal reference in the database, generate all substrings with a sliding window (overlapping or non-overlapping) and map them to all bacterial references. From the resulting .sam file, select the single best-matching bacterial reference for each generated substring. Plot this result as a new track on the same coverage plot (the inner circle of the plot).
- Interactive Highcharts plots.
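
The match-percentage and best-alignment steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not the project's actual code: the function names, the dictionary keys `nm` and `as` (standing in for the SAM `NM` and `AS` tags), and the tie-breaking direction (lower edit distance first, then higher alignment score) are all assumptions.

```python
import random
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def match_fraction(cigar):
    """Return aligned bases / read length for a CIGAR string.

    Assumption: "matches" means bases in M operations (plus = and X,
    which some aligners emit instead of M). Read length is the sum of
    operations that consume read bases: M, I, S, = and X.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    read_len = sum(n for n, op in ops if op in "MIS=X")
    matched = sum(n for n, op in ops if op in "M=X")
    return matched / read_len if read_len else 0.0


def pick_best(alignments, rng=random):
    """Pick one alignment of a multimapped read.

    Assumed ordering: lowest edit distance ('nm'), then highest
    alignment score ('as'), then a random choice among remaining ties.
    """
    def key(aln):
        return (aln["nm"], -aln["as"])

    best = min(key(a) for a in alignments)
    tied = [a for a in alignments if key(a) == best]
    return rng.choice(tied)


if __name__ == "__main__":
    # A 150-base read with 110 bases in M operations.
    print(match_fraction("29S100M1I10M10S"))
    hits = [
        {"ref": "contig_a", "nm": 2, "as": 140},
        {"ref": "contig_b", "nm": 1, "as": 130},
        {"ref": "contig_c", "nm": 1, "as": 135},
    ]
    print(pick_best(hits)["ref"])
```

Passing a seeded `random.Random` instance as `rng` makes the random tie-break reproducible between runs.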
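
The sliding-window step for the fungal references could look roughly like the sketch below. The function name, the window and step parameters, and the choice to drop a trailing partial window are assumptions made for illustration; the page does not specify them.

```python
def sliding_windows(seq, window, step):
    """Yield (start, substring) pairs of length `window` from `seq`.

    step == window gives non-overlapping windows; step < window gives
    overlapping ones. A trailing fragment shorter than `window` is
    dropped (an assumption; it could also be kept or padded).
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    for start in range(0, len(seq) - window + 1, step):
        yield start, seq[start:start + window]


if __name__ == "__main__":
    fungal_seq = "ACGTACGT"  # toy sequence standing in for a fungal reference
    for start, sub in sliding_windows(fungal_seq, window=4, step=2):
        # Each substring would be written out as a FASTA record and
        # mapped against the bacterial references.
        print(f">window_{start}\n{sub}")
```

Encoding the start position in each record name makes it possible to place the best bacterial hit for that window back onto the coverage plot track.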