Commit 1067fc2

Updated information on host decontamination for the metagenomics pipeline
1 parent bf94dcd commit 1067fc2

3 files changed: 16 additions and 1 deletion

docs/NextFlow/metagenomics.md

Lines changed: 15 additions & 0 deletions
@@ -167,6 +167,7 @@ To adjust the `cluster` profile settings, stay within the appropriate section at
 | *trimgalore*.length | The minimum post-trimming read length required for a read to be retained. (Default: `25`) |
 | *trimgalore*.adapter | Adapter sequence to be trimmed from R1. Defaults to the Illumina adapters. (Default: `'AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC'`) |
 | *trimgalore*.adapter2 | Adapter sequence to be trimmed from R2. Defaults to the Illumina adapters. (Default: `'AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT'`) |
+| *decontaminate*.host | The host species to decontaminate your reads against. Currently, the options are `'human'` or `'mouse'`. If this field is left empty, the pipeline defaults to human. |
 | *decontaminate*.hostIndex | The base file path to the host index `.bt2l` files. Provide only the base name, e.g. `'path/to/folder/chm13v2.0_GRCh38_full_plus_decoy'`. If you don't provide anything, the pipeline will generate the host index for you and place a copy in the results folder. (Default: `''`) |
 | *taxonomy*.kraken2_db | The folder path to the kraken2 database if you have one prepared. If you don't provide anything, the pipeline will generate it for you and place a copy in the results folder. (Default: `''`) |
 | *taxonomy*.kmer_length | The sliding kmer length for Kraken2 to use for generating the database. This will also be used for generating the Bracken-corrected database version. (Default: `35`) |
@@ -215,6 +216,20 @@ To adjust the `cluster` profile settings, stay within the appropriate section at
 
 Nextflow exit status 137 usually means the job was killed because it had insufficient RAM allocated. If you get this error, try allocating more resources to the job that failed.
 
+!!! tip "Host decontamination"
+
+As mentioned under parameter customisation, you can currently select either human or mouse for host decontamination.
+
+- The `'mouse'` option will use the [GRCm39 mouse genome (release M33)](https://www.gencodegenes.org/mouse/release_M33.html) provided by Gencode.
+
+- The `'human'` option will use a custom composite of both the [telomere-to-telomere consortium CHM13](https://github.com/marbl/CHM13) and [1000 Genomes GRCh38 (full analysis set + decoy)](https://www.internationalgenome.org/data/) human genomes to provide better coverage for decontamination.<sup>[1](https://journals.asm.org/doi/10.1128/mbio.01607-23)</sup>
+
+If you want to generate your own genome index for decontamination, download the appropriate genome `.fasta` file and run the following command, where `${fasta_name}` is the genome file and `${index_prefix}` is the base name you want for the index (e.g. `'GRCm39'`). Be sure to include the `--large-index` flag:
+
+```
+bowtie2-build ${fasta_name} ${index_prefix} --large-index
+```
+
 ## Outputs 📤
 
 Several outputs will be copied from their respective Nextflow `work` directories to the output folder of your choice (default: `results`).
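
A note on the `bowtie2-build` step in the added tip: with `--large-index`, Bowtie 2 writes six files ending in `.bt2l` that share the chosen base name, and that base name is what `decontaminate.hostIndex` expects. The sketch below is one way to confirm the index is complete before pointing the pipeline at it. The helper name `check_large_index` is hypothetical and not part of the pipeline.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the pipeline): verify that a Bowtie 2
# large index is complete before passing its base name to hostIndex.
check_large_index() {
    local prefix="$1"
    local suffix
    # A large Bowtie 2 index consists of these six files.
    for suffix in 1 2 3 4 rev.1 rev.2; do
        if [ ! -f "${prefix}.${suffix}.bt2l" ]; then
            echo "missing: ${prefix}.${suffix}.bt2l" >&2
            return 1
        fi
    done
    # All files present: print the base name to use for hostIndex.
    echo "${prefix}"
}
```

For example, `check_large_index path/to/folder/GRCm39` prints the base name when all six files exist, and otherwise reports the first missing file and returns a non-zero status.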

site/search/search_index.json

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

site/sitemap.xml.gz

0 Bytes
Binary file not shown.
