docs/user/nextclade-web/analysis-results-table.md
Lines changed: 14 additions & 13 deletions
@@ -1,20 +1,19 @@
1
1
## Analysis results table
2
2
3
-
Nextclade analyzes your sequences locally in your browser. That means, sequences never leave your computer, ensuring full privacy by design.
3
+
Nextclade analyzes your sequences locally in your browser. Sequences never leave your computer, ensuring full privacy by design.
4
4
5
5
> ⚠️ Since your computer is doing all the computational work (rather than a remote server), it is advisable to analyze at most a few hundred sequences at a time, depending on your computer hardware. Nextclade leverages all processor cores available on your computer and might require large amounts of system memory to operate. For large-scale analysis (thousands to millions of sequences) you might want to try [Nextclade CLI](nextclade-cli) instead.
6
6
7
7
The analysis pipeline comprises the following steps:
8
8
9
-
1. Sequence alignment: Sequences are aligned to the reference genome using our custom Nextalign alignment algorithm.
10
-
2. Translation: Nucleotide sequences are translated into amino acid sequences.
11
-
3. Mutation calling: Nucleotide and amino acid changes are identified
12
-
4. Detection of PCR primer changes
13
-
5. Phylogenetic placement: Sequences are placed on a reference tree, private mutations analyzed
14
-
6. Clade assignment: Clades are taken from the parent node on the tree
15
-
7. Quality Control (QC): Quality control metrics are calculated
9
+
1. Sequence alignment: Sequences are aligned to the reference genome using a banded Smith-Waterman sequence alignment algorithm.
10
+
1. Translation: Coding nucleotide segments are extracted and translated to amino acid sequences.
11
+
1. Mutation calling: Nucleotide and amino acid changes are identified.
12
+
1. Phylogenetic placement: Sequences are placed on a reference tree, and private mutations are identified.
13
+
1. Clade assignment: Clades are inferred from the position where the sequence attaches to the reference tree.
14
+
1. Quality Control (QC): Quality control metrics are calculated
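
To make the mutation-calling step more concrete, here is a minimal Python sketch. It assumes the query has already been aligned to the reference (same length, `-` for deleted positions) and uses 1-based positions. The function name `call_mutations` and the output format are hypothetical and chosen for illustration only; this is not Nextclade's actual implementation.

```python
def call_mutations(reference: str, aligned_query: str):
    """Compare an aligned query to the reference, position by position.

    Returns a tuple (substitutions, deletions):
    - substitutions: strings such as "G3C" (reference base, 1-based position, query base)
    - deletions: 1-based positions where the query has a gap character "-"
    Ambiguous or missing bases (e.g. "N") are skipped rather than called as mutations.
    """
    if len(reference) != len(aligned_query):
        raise ValueError("reference and aligned query must have the same length")

    substitutions = []
    deletions = []
    for pos, (ref_nuc, qry_nuc) in enumerate(zip(reference, aligned_query), start=1):
        if qry_nuc == "-":
            deletions.append(pos)
        elif ref_nuc in "ACGT" and qry_nuc in "ACGT" and ref_nuc != qry_nuc:
            substitutions.append(f"{ref_nuc}{pos}{qry_nuc}")
    return substitutions, deletions


subs, dels = call_mutations("ACGTACGT", "ACCTA-GN")
print(subs, dels)
```

Amino acid changes could be called the same way after translating the coding segments of both sequences.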
16
15
17
-
See [Algorithm](algorithm) section for more details.
16
+
See the [Algorithm](algorithm) section of these docs for more details.
18
17
19
18
You can get a quick overview of the results screen in the screenshot below:
20
19

@@ -27,24 +26,26 @@ Nextclade implements a variety of quality control metrics to quickly spot proble
27
26
28
27
Every icon corresponds to a different metric. See the [Quality control](algorithm/07-quality-control) section for a detailed explanation of the QC metrics.
29
28
29
+
> Bear in mind that QC metrics are heuristics and that good quality sequences can occasionally fail some of the metrics (e.g. due to recombination or absence of close relatives in the reference tree).
30
+
30
31
### Table data
31
32
32
33
Nextclade automatically infers the (probable) clade a sequence belongs to and displays the result in the table. Clades are determined by identifying the clade of the nearest neighbour on a reference tree.
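
The nearest-neighbour idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: each reference tree node is a plain dictionary holding a set of mutations relative to the reference and a clade label, and distance is the number of mutations that differ between the query and the node. The function names, the data layout and the mutation strings are all hypothetical; the real placement algorithm is described in the Algorithm section.

```python
def nearest_node(query_mutations: set, nodes: list) -> dict:
    """Return the node whose mutation set differs least from the query's.

    Distance is the size of the symmetric difference between the two sets,
    i.e. the number of mutations present in one but not in the other.
    """
    return min(nodes, key=lambda node: len(query_mutations ^ node["mutations"]))


def assign_clade(query_mutations: set, nodes: list) -> str:
    """Assign the query the clade of its nearest node on the (toy) tree."""
    return nearest_node(query_mutations, nodes)["clade"]


# Toy reference tree with made-up mutations
nodes = [
    {"name": "node_a", "clade": "20I (Alpha)", "mutations": {"A10T", "C20G"}},
    {"name": "node_b", "clade": "20J (Gamma)", "mutations": {"A10T", "G30A"}},
]
print(assign_clade({"A10T", "G30A", "C40T"}, nodes))
```

In this toy example the mutation `C40T` is not explained by the nearest node, which is what the documentation calls a private mutation.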
33
34
34
35
The result table further displays for each sequence:
35
36
36
-
- "Mut.": number of mutations with respect to the root of the reference tree
37
+
- "Mut.": number of mutations with respect to the reference sequence
37
38
- "non-ACGTN": number of ambiguous nucleotides that are not _N_
38
39
- "Ns": number of missing nucleotides indicated by _N_
39
40
- "Gaps": number of nucleotides that are deleted with respect to the reference sequence
40
41
- "Ins.": number of nucleotides that are inserted with respect to the reference sequence
41
42
- "FS": number of uncommon frame shifts (the total number, including common frame shifts, is shown in parentheses)
42
43
- "SC": number of uncommon premature stop codons (the total number, including common premature stops, is shown in parentheses)
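
As an illustration of how the simpler character-based columns could be derived, the Python sketch below counts "Ns", "Gaps" and "non-ACGTN" in an aligned query sequence. It assumes that `-` marks only positions deleted relative to the reference and that any character other than A, C, G, T, N or `-` is an ambiguous nucleotide code. The function name `table_counts` is hypothetical, and this is not how Nextclade itself computes these values.

```python
def table_counts(aligned_query: str) -> dict:
    """Count missing, deleted and ambiguous positions in an aligned query."""
    seq = aligned_query.upper()
    return {
        # Missing nucleotides, indicated by N
        "Ns": seq.count("N"),
        # Positions deleted with respect to the reference (assumes "-" means deletion)
        "Gaps": seq.count("-"),
        # Ambiguous nucleotide codes such as R or Y, i.e. anything else
        "non-ACGTN": sum(1 for char in seq if char not in "ACGTN-"),
    }


print(table_counts("ACGTNNR-Y-A"))
```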
43
44
44
-
Hovering over table entries reveals more detailed information. For example, hovering over the number of mutations reveals which nucleotides and aminoacids have changed with respect to the reference, as well as so-called _private_ mutations (mutations that differ from the nearest neighbor on the reference tree), which are are split into:
45
+
Hovering over table entries reveals more detailed information in tooltips. For example, hovering over the number of mutations reveals which nucleotides and amino acids have changed with respect to the reference, as well as so-called _private_ mutations (mutations that differ from the nearest neighbor on the reference tree), which are split into:
45
46
46
-
- Reversions: mutations back to reference, often a sign of sequencing problems
47
-
- Labeled: Mutations that are known, for example because they occur often in a clade. If multiple labeled mutations from the same clade appear, it is a sign of contamination, co-infection or recombination.
47
+
- Reversions: mutations back to reference, often a sign of sequencing pipeline problems (e.g. faulty primer trimming or reference bias).
48
+
- Labeled: Mutations that are known, for example because they characteristically occur in a clade. If multiple labeled mutations from the same clade appear, it is often a sign of contamination, co-infection or recombination.
48
49
- Unlabeled: Mutations that are neither reversions nor labeled.
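
The three categories above can be expressed as a small decision rule. The Python sketch below is a hypothetical illustration: private mutations are given as `(position, nucleotide)` pairs with 1-based positions, and the set of labeled mutations is assumed to be supplied by the caller. The name `classify_private` and the data layout are assumptions made for this example, not part of Nextclade.

```python
def classify_private(private_mutations, reference: str, labeled: set) -> dict:
    """Split private mutations into reversions, labeled and unlabeled.

    private_mutations: iterable of (position, query_nucleotide), 1-based positions
    reference: the reference sequence
    labeled: set of (position, nucleotide) pairs known to be typical of some clade
    """
    result = {"reversions": [], "labeled": [], "unlabeled": []}
    for pos, nuc in private_mutations:
        if reference[pos - 1] == nuc:
            # The query carries the reference base again: a reversion
            result["reversions"].append((pos, nuc))
        elif (pos, nuc) in labeled:
            result["labeled"].append((pos, nuc))
        else:
            result["unlabeled"].append((pos, nuc))
    return result


print(classify_private([(1, "A"), (2, "T"), (3, "A")], "ACGT", {(2, "T")}))
```

Reversions are checked first, so a mutation back to the reference is never reported as labeled or unlabeled.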
49
50
50
51
In the screenshot below, the mouse hovers over a _20J (Gamma)_ sequence. The tooltip shows there are 3 reversions and 4 labeled mutations, indicative of sequence quality problems, potentially contamination with _20I (Alpha)_.