Commit b99b475

docs(Analysis results): edit pass
1 parent de7aa5d commit b99b475

1 file changed

+14
-13
lines changed

docs/user/nextclade-web/analysis-results-table.md

Lines changed: 14 additions & 13 deletions
@@ -1,20 +1,19 @@
11
## Analysis results table
22

3-
Nextclade analyzes your sequences locally in your browser. That means, sequences never leave your computer, ensuring full privacy by design.
3+
Nextclade analyzes your sequences locally in your browser. Sequences never leave your computer, ensuring full privacy by design.
44

55
> ⚠️ Since your computer is doing all the computational work (rather than a remote server), it is advisable to analyze at most a few hundred sequences at a time, depending on your computer hardware. Nextclade leverages all processor cores available on your computer and might require large amounts of system memory to operate. For large-scale analysis (thousands to millions of sequences), you might want to try [Nextclade CLI](nextclade-cli) instead.
66
77
The analysis pipeline comprises the following steps:
88

9-
1. Sequence alignment: Sequences are aligned to the reference genome using our custom Nextalign alignment algorithm.
10-
2. Translation: Nucleotide sequences are translated into amino acid sequences.
11-
3. Mutation calling: Nucleotide and amino acid changes are identified
12-
4. Detection of PCR primer changes
13-
5. Phylogenetic placement: Sequences are placed on a reference tree, private mutations analyzed
14-
6. Clade assignment: Clades are taken from the parent node on the tree
15-
7. Quality Control (QC): Quality control metrics are calculated
9+
1. Sequence alignment: Sequences are aligned to the reference genome using a banded Smith-Waterman sequence alignment algorithm.
10+
1. Translation: Coding nucleotide segments are extracted and translated to amino acid sequences.
11+
1. Mutation calling: Nucleotide and amino acid changes are identified
12+
1. Phylogenetic placement: Sequences are placed on a reference tree and private mutations are identified.
13+
1. Clade assignment: Clades are inferred from the position where the sequence attaches to the reference tree.
14+
1. Quality Control (QC): Quality control metrics are calculated
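
The translation and mutation calling steps above can be illustrated with a small sketch. This is not Nextclade's implementation: the function names `translate` and `call_aa_changes` are hypothetical, the input is assumed to be a coding segment already aligned to the reference (same length, insertions removed), and any codon containing a gap or an ambiguous base is simply reported as `X` and skipped.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucs):
    """Translate a coding segment; codons with gaps or ambiguous bases become 'X'."""
    usable = len(nucs) - len(nucs) % 3  # ignore a trailing partial codon
    codons = (nucs[i:i + 3] for i in range(0, usable, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def call_aa_changes(ref_nucs, query_nucs):
    """List amino acid substitutions such as 'A2T' (1-based codon position)."""
    ref_aa = translate(ref_nucs)
    query_aa = translate(query_nucs)
    return [
        f"{r}{pos}{q}"
        for pos, (r, q) in enumerate(zip(ref_aa, query_aa), start=1)
        if r != q and "X" not in (r, q)
    ]
```

For example, `call_aa_changes("ATGGCTTAA", "ATGACTTAA")` reports a single change at codon 2, where `GCT` (alanine) became `ACT` (threonine).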
1615

17-
See [Algorithm](algorithm) section for more details.
16+
See the [Algorithm](algorithm) section of these docs for more details.
1817

1918
You can get a quick overview of the results screen in the screenshot below:
2019
![Results overview](../assets/web_overview.png)
@@ -27,24 +26,26 @@ Nextclade implements a variety of quality control metrics to quickly spot proble
2726

2827
Every icon corresponds to a different metric. See the [Quality control](algorithm/07-quality-control) section for a detailed explanation of the QC metrics.
2928

29+
> Bear in mind that QC metrics are heuristics and that good-quality sequences can occasionally fail some of the metrics (e.g. due to recombination or the absence of close relatives in the reference tree).
30+
3031
### Table data
3132

3233
Nextclade automatically infers the (probable) clade a sequence belongs to and displays the result in the table. Clades are determined by identifying the clade of the nearest neighbour on a reference tree.
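
As a rough illustration of the nearest-neighbour idea, the sketch below picks the reference tree node whose mutation set differs least from the query and reports that node's clade. Everything here is a simplifying assumption for illustration: the function name `assign_clade`, the flat list of nodes, the dictionary layout and the toy mutation and clade names are hypothetical, and the actual placement algorithm is described in the Algorithm section.

```python
def assign_clade(query_mutations, nodes):
    """Return (clade, node name) of the node closest to the query.

    Distance is the number of mutations present in exactly one of the
    two sets (symmetric difference). Ties go to the first node listed.
    """
    nearest = min(nodes, key=lambda node: len(query_mutations ^ node["mutations"]))
    return nearest["clade"], nearest["name"]


# Toy data, not taken from any real reference tree.
toy_nodes = [
    {"name": "node_1", "clade": "clade A", "mutations": {"C3T"}},
    {"name": "node_2", "clade": "clade B", "mutations": {"C3T", "G7A"}},
]
toy_query = {"C3T", "G7A", "T9C"}
```

With this toy data, `assign_clade(toy_query, toy_nodes)` selects `node_2`, because only `T9C` separates the query from it, whereas two mutations separate it from `node_1`.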
3334

3435
The result table further displays for each sequence:
3536

36-
- "Mut.": number of mutations with respect to the root of the reference tree
37+
- "Mut.": number of mutations with respect to the reference sequence
3738
- "non-ACGTN": number of ambiguous nucleotides that are not _N_
3839
- "Ns": number of missing nucleotides indicated by _N_
3940
- "Gaps": number of nucleotides that are deleted with respect to the reference sequence
4041
- "Ins.": number of nucleotides that are inserted with respect to the reference sequence
4142
- "FS": number of uncommon frame shifts (the total number, including common frame shifts, is in parentheses)
4243
- "SC": number of uncommon premature stop codons (the total number, including common premature stops, is in parentheses)
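
To make the first four columns concrete, here is a minimal sketch of how such counts could be derived from an aligned query. It assumes the query has already been aligned to the reference, that both strings have the same length, and that insertions have been stripped (so "Ins." is not covered); the function name `table_counts` is hypothetical and this is not Nextclade's own code.

```python
def table_counts(ref, query):
    """Count per-sequence statistics from a query aligned to the reference."""
    counts = {"Mut.": 0, "non-ACGTN": 0, "Ns": 0, "Gaps": 0}
    for ref_nuc, query_nuc in zip(ref, query):
        if query_nuc == "-":
            counts["Gaps"] += 1        # deleted relative to the reference
        elif query_nuc == "N":
            counts["Ns"] += 1          # missing nucleotide
        elif query_nuc not in "ACGT":
            counts["non-ACGTN"] += 1   # ambiguous code other than N, e.g. R or Y
        elif query_nuc != ref_nuc:
            counts["Mut."] += 1        # substitution relative to the reference
    return counts
```

For instance, `table_counts("ACGTACGTAC", "ACTTNN-RAC")` yields one mutation, one non-ACGTN character, two Ns and one gap.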
4344

44-
Hovering over table entries reveals more detailed information. For example, hovering over the number of mutations reveals which nucleotides and aminoacids have changed with respect to the reference, as well as so-called _private_ mutations (mutations that differ from the nearest neighbor on the reference tree), which are are split into:
45+
Hovering over table entries reveals more detailed information in tooltips. For example, hovering over the number of mutations reveals which nucleotides and amino acids have changed with respect to the reference, as well as so-called _private_ mutations (mutations that differ from the nearest neighbor on the reference tree), which are split into:
4546

46-
- Reversions: mutations back to reference, often a sign of sequencing problems
47-
- Labeled: Mutations that are known, for example because they occur often in a clade. If multiple labeled mutations from the same clade appear, it is a sign of contamination, co-infection or recombination.
47+
- Reversions: mutations back to reference, often a sign of sequencing pipeline problems (e.g. faulty primer trimming or reference bias).
48+
- Labeled: Mutations that are known, for example because they characteristically occur in a clade. If multiple labeled mutations from the same clade appear, it is often a sign of contamination, co-infection or recombination.
4849
- Unlabeled: Mutations that are neither reversions nor labeled.
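
The three categories above can be sketched as a simple classification. This is an illustrative sketch under stated assumptions, not the actual implementation: a private mutation is represented as a `(position, nucleotide)` pair with 1-based positions, `labels` is a hypothetical lookup from such pairs to clade names, and the function name `classify_private_mutations` is made up for this example.

```python
def classify_private_mutations(private, ref, labels):
    """Split private mutations into reversions, labeled and unlabeled."""
    result = {"reversions": [], "labeled": [], "unlabeled": []}
    for pos, nuc in private:
        if ref[pos - 1] == nuc:
            # The query carries the reference nucleotide again at this site.
            result["reversions"].append((pos, nuc))
        elif (pos, nuc) in labels:
            # The mutation is known, e.g. characteristic of some clade.
            result["labeled"].append((pos, nuc))
        else:
            result["unlabeled"].append((pos, nuc))
    return result


# Toy inputs, not real genome data.
toy_ref = "ACGTACGT"
toy_private = [(2, "C"), (4, "A"), (6, "T")]
toy_labels = {(4, "A"): ["toy clade"]}
```

Here `(2, "C")` matches the reference and counts as a reversion, `(4, "A")` is found in the label lookup, and `(6, "T")` falls through to unlabeled.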
4950

5051
In the screenshot below, the mouse hovers over a _20J (Gamma)_ sequence. The tooltip shows there are 3 reversions and 4 labeled mutations, indicative of sequence quality problems, potentially contamination with _20I (Alpha)_.
