diff --git a/CHANGELOG.md b/CHANGELOG.md index 6b954cd..45d0f6a 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,6 +2,17 @@ ## Unreleased +#### Added `HamronizationNormalizer` +- Removed the `is_hamronized` property from all normalizers and the `--hamronized` flag from the CLI. +- All hamronized results now go through the `HamronizationNormalizer` class. +- `HamronizationNormalizer` reads a hamronized file line by line, extracts the input genes, and loads all ARO mapping tables to support hamronized results that combine the outputs of multiple tools and databases. +- On the CLI, hamronization commands look like: +```bash +argnorm hamronization -i PATH_TO_INPUT -o PATH_TO_OUTPUT +``` + +> Note: Updated preprocessing of resfinder genes. Entries from `gene_name` and `reference_accession` in hamronized results are now concatenated to form input genes for `HamronizationNormalizer`. This improves ARO mapping accuracy (previously only `gene_symbol` was used, and several genes can share the same `gene_symbol`) and also simplifies preprocessing of resfinder inputs (if `gene_symbol` were used, two different preprocessing functions would be required for `resfinder` and `abricate` with the resfinder db). + #### Update `confers_resistance_to()` to use `regulates`, `part_of`, and `participates_in` ARO relationships Previously, argNorm used the `is_a` ARO relationship along with `confers_resistance_to_drug_class` and `confers_resistance_to_antibiotic` to map ARGs to the drugs they confer resistance to. While this worked well for most genes, some ARGs such as those coding for efflux pumps/proteins (e.g. `ARO:3003548`, `ARO:3000826`, `ARO:3003066`) were previously not mapped to any drugs. This is because none of their superclasses mapped to drugs/antibiotics via `confers_resistance_to_antibiotic` or `confers_resistance_to_drug_class`. However, these genes were related to other ARGs that did map to drugs via the `regulates`, `part_of`, or `participates_in` ARO relationships. 
argNorm now also utilizes these three relationships to ensure that even if the superclasses (derived using `is_a`) of an ARG don't map to a drug, the gene can be assigned a drug mapping. diff --git a/README.md b/README.md index a6beb0e..009ddcf 100644 --- a/README.md +++ b/README.md @@ -51,18 +51,18 @@ The `resistance_to_drug_classes` column will contain ARO numbers of the broader If you use argNorm in a publication, please cite the preprint: > Ugarcina Perovic S, Ramji V et al. argNorm: Normalization of Antibiotic Resistance Gene Annotations to the Antibiotic Resistance Ontology (ARO). Queensland University of Technology ePrints, 2024. DOI: https://doi.org/10.5204/rep.eprints.252448 [Preprint] (Under review). -## Supported tools and databases +## Supported ARG annotation tools and databases | ARG database | Tool for ARG annotation | | ---------------------------------- | ------------------------------------------------------- | -| ARG-ANNOT v5.0 | [ABRicate v1.0.1](https://github.com/tseemann/abricate) | -| DeepARG v2 | [DeepARG v1.0.2](https://bench.cs.vt.edu/deeparg) | -| Groot v1.1.2 | [GROOT v1.1.2](https://github.com/will-rowe/groot) | -| MEGARes v3.0 | [ABRicate v1.0.1](https://github.com/tseemann/abricate) | -| NCBI Reference Gene Database v3.12 | [ABRicate v1.0.1](https://github.com/tseemann/abricate) & [AMRFinderPlus v3.10.30](https://github.com/ncbi/amr) | -| ResFinder v4.0 | [ABRicate v1.0.1](https://github.com/tseemann/abricate) & [ResFinder v4.0](https://bitbucket.org/genomicepidemiology/resfinder/src/master/) | -| ResFinderFG v2.0 | [ABRicate v1.0.1](https://github.com/tseemann/abricate) | -| SARG (reads mode) v3.2.1 | [ARGs-OAP v2.3](https://galaxyproject.org/use/args-oap/) | +| ARG-ANNOT v5.0 | [ABRicate v1.0.1](https://github.com/tseemann/abricate) & [hAMRonization](https://github.com/pha4ge/hAMRonization) | +| DeepARG v2 | [DeepARG v1.0.2](https://bench.cs.vt.edu/deeparg) & [hAMRonization](https://github.com/pha4ge/hAMRonization) | +| Groot 
v1.1.2 | [GROOT v1.1.2](https://github.com/will-rowe/groot) & [hAMRonization](https://github.com/pha4ge/hAMRonization) | +| MEGARes v3.0 | [ABRicate v1.0.1](https://github.com/tseemann/abricate) & [hAMRonization](https://github.com/pha4ge/hAMRonization) | +| NCBI Reference Gene Database v3.12 | [ABRicate v1.0.1](https://github.com/tseemann/abricate), [AMRFinderPlus v3.10.30](https://github.com/ncbi/amr), & [hAMRonization](https://github.com/pha4ge/hAMRonization) | +| ResFinder v4.0 | [ABRicate v1.0.1](https://github.com/tseemann/abricate), [ResFinder v4.0](https://bitbucket.org/genomicepidemiology/resfinder/src/master/), & [hAMRonization](https://github.com/pha4ge/hAMRonization) | +| ResFinderFG v2.0 | [ABRicate v1.0.1](https://github.com/tseemann/abricate) & [hAMRonization](https://github.com/pha4ge/hAMRonization) | +| SARG (reads mode) v3.2.1 | [ARGs-OAP v2.3](https://galaxyproject.org/use/args-oap/) & [hAMRonization](https://github.com/pha4ge/hAMRonization) | - Note: ARG database and ARG annotation tool versions can change. argNorm is only intended for supported versions listed above. - Note: the argNorm tool will be periodically updated to support the latest versions of databases and annotation tools if they undergo significant changes. @@ -98,7 +98,7 @@ argNorm is readily available in the funcscan pipeline which can be accessed (her Here is a basic outline of calling argNorm. 
```bash -argnorm [tool] [--db] -i [path to original_annotation.tsv] -o [path to annotation_result_with_aro.tsv] [--hamronized (if hAMRonization used)] +argnorm [tool] [--db] -i [path to original_annotation.tsv] -o [path to annotation_result_with_aro.tsv] ``` ### `tool` (required) @@ -109,6 +109,7 @@ The most important ***required positional*** argument is `tool` (see [here](#sup - `resfinder` - `amrfinderplus` - `groot` +- `hamronization` ### I/O (required) - `-i` or `--input`: path to the annotation result @@ -135,31 +136,26 @@ ARG annotation tools can use several ARG databases for annotation. Hence, the `t | `resfinder` | Not required | | `amrfinderplus` | Not required | | `groot` | Any from `groot-argannot`, `groot-resfinder`, `groot-db`, `groot-core-db`, or `groot-card` | - -### `--hamronized` (optional) -Use this if the input is hamronized by [hAMRonization](https://github.com/pha4ge/hAMRonization) +| `hamronization` | Not required | ### `-h` or `--help` Use `argnorm -h` or `argnorm --help` to see available options. ```bash >argnorm -h -usage: argnorm [-h] - [--db {sarg,ncbi,resfinder,deeparg,megares,argannot,resfinderfg,groot-argannot,groot-resfinder,groot-db,groot-core-db,groot-card}] - [--hamronized] [-i INPUT] [-o OUTPUT] - {argsoap,abricate,deeparg,resfinder,amrfinderplus,groot} +usage: argnorm [-h] [--db {sarg,ncbi,resfinder,deeparg,megares,argannot,resfinderfg,groot-argannot,groot-resfinder,groot-db,groot-core-db,groot-card}] [-i INPUT] [-o OUTPUT] + {argsoap,abricate,deeparg,resfinder,amrfinderplus,groot,hamronization} argNorm normalizes ARG annotation results from different tools and databases to the same ontology, namely ARO (Antibiotic Resistance Ontology). positional arguments: - {argsoap,abricate,deeparg,resfinder,amrfinderplus,groot} + {argsoap,abricate,deeparg,resfinder,amrfinderplus,groot,hamronization} The tool you used to do ARG annotation. 
-optional arguments: +options: -h, --help show this help message and exit --db {sarg,ncbi,resfinder,deeparg,megares,argannot,resfinderfg,groot-argannot,groot-resfinder,groot-db,groot-core-db,groot-card} The database you used to do ARG annotation. - --hamronized Use this if the input is hamronized (processed using the hAMRonization tool) -i INPUT, --input INPUT The annotation result you have -o OUTPUT, --output OUTPUT @@ -209,23 +205,19 @@ argnorm -h ``` > argnorm -h -usage: argnorm [-h] - [--db {sarg,ncbi,resfinder,deeparg,megares,argannot,resfinderfg}] - [--hamronized] [-i INPUT] [-o OUTPUT] - {argsoap,abricate,deeparg,resfinder,amrfinderplus} +usage: argnorm [-h] [--db {sarg,ncbi,resfinder,deeparg,megares,argannot,resfinderfg,groot-argannot,groot-resfinder,groot-db,groot-core-db,groot-card}] [-i INPUT] [-o OUTPUT] + {argsoap,abricate,deeparg,resfinder,amrfinderplus,groot,hamronization} argNorm normalizes ARG annotation results from different tools and databases to the same ontology, namely ARO (Antibiotic Resistance Ontology). positional arguments: - {argsoap,abricate,deeparg,resfinder,amrfinderplus} + {argsoap,abricate,deeparg,resfinder,amrfinderplus,groot,hamronization} The tool you used to do ARG annotation. -optional arguments: +options: -h, --help show this help message and exit - --db {sarg,ncbi,resfinder,deeparg,megares,argannot,resfinderfg} + --db {sarg,ncbi,resfinder,deeparg,megares,argannot,resfinderfg,groot-argannot,groot-resfinder,groot-db,groot-core-db,groot-card} The database you used to do ARG annotation. 
- --hamronized Use this if the input is hamronized (processed using - the hAMRonization tool) -i INPUT, --input INPUT The annotation result you have -o OUTPUT, --output OUTPUT @@ -257,10 +249,10 @@ wget https://raw.githubusercontent.com/BigDataBiology/argNorm/main/examples/raw/ Here is a basic outline of most argNorm commands: ```bash -argnorm [tool] -i [original_annotation.tsv] -o [argnorm_result.tsv] [--hamronized] +argnorm [tool] -i [original_annotation.tsv] -o [argnorm_result.tsv] [--db] ``` -Here, `tool` refers to the ARG annotation tool used (ResFinder in this case). `original_annotation.tsv` is the path to the input data and `argnorm_result.tsv` is the path to output file where the resulting table from argNorm will be stored. `--hamronized` is an option to indicate if the input data is a result of using the [hAMRonization package](https://github.com/pha4ge/hAMRonization). In our example, the input data is not a result of using the hAMRonization package, and so the `--hamronized` option can be omitted. +Here, `tool` refers to the ARG annotation tool used (ResFinder in this case). `original_annotation.tsv` is the path to the input data and `argnorm_result.tsv` is the path to the output file where the resulting table from argNorm will be stored. `--db` is the ARG database used along with `tool` to perform annotation. ResFinder does not require `--db` (argNorm will automatically load the ResFinder database); however, `--db` is required for the ARG annotation tools `groot` and `abricate`. To run argNorm on the input data, use this command in your terminal: diff --git a/docs/api.md b/docs/api.md index f0d23cc..96adad6 100644 --- a/docs/api.md +++ b/docs/api.md @@ -84,9 +84,8 @@ print(drugs_to_drug_classes(['ARO:0000030', 'ARO:0000051', 'ARO:0000069', 'ARO:3 Normalizer classes for specific tools which normalize ARG annotation outputs. Same functionality as the CLI. 
-All normalizers have 2 parameters: +All normalizers have 1 optional parameter: * database (str): name of database. Can be: argannot, deeparg, megares, ncbi, resfinderfg, sarg, groot-db, groot-core-db, groot-card, groot-argannot, and groot-resfinder. -* is_hamronized (bool, False by default): whether or not the ARG annotation output has been processed by the hamronization package. > Note: the database parameter only needs to be specified for AbricateNormalizer and GrootNormalizer. ncbi, deeparg, resfinder, sarg, megares, argannot, resfinderfg are the supported databases for AbricateNormalizer and groot-db, groot-core-db, groot-argannot, groot-resfinder, and groot-card are the supported databases for GrootNormalizer. @@ -97,6 +96,7 @@ Available normalizers: * argnorm.normalizers.AMRFinderPlusNormalizer * argnorm.normalizers.AbricateNormalizer * argnorm.normalizers.GrootNormalizer +* argnorm.normalizers.HamronizationNormalizer ### Methods @@ -128,18 +128,7 @@ resfinder_normalizer.run('./resfinder.resfinder.orfs.tsv').to_csv('./resfinder.r This will create a file called `resfinder.resfinder.orfs.normed.tsv` with ARO mappings and drug categorization. -### Example 2: using AbricteNormalizer with the ResFinderFG database - -The database parameter needs to be specified for the AbricateNormalizer. Supported databases are: -* `ncbi` -* `deeparg` -* `resfinder` -* `sarg` -* `megares` -* `argannot` -* `resfinderfg` - -For this example, we will run the AbricateNormalizer with the [`resfinderfg` database option](https://www.big-data-biology.org/paper/2022_resfinderfgv2/). +### Example 2: using HamronizationNormalizer Download the sample data [here](https://raw.githubusercontent.com/BigDataBiology/argNorm/7ee9d74c9fa51956ecb7706fa979cc0696ae305d/examples/hamronized/abricate.resfinderfg.tsv), and store it in a folder called `argnorm_normalizers_tutorial`. 
@@ -151,13 +140,11 @@ wget https://raw.githubusercontent.com/BigDataBiology/argNorm/7ee9d74c9fa51956ec Save the following piece of Python code in the `argnorm_normalizers_tutorial` folder, and run the script. -> Note: the data is hamronized, and so the `is_hamronized` parameter should be set to `True`. - ``` -from argnorm.normalizers import AbricateNormalizer +from argnorm.normalizers import HamronizationNormalizer -abricate_normalizer = AbricateNormalizer(database='resfinderfg', is_hamronized=True) -abricate_normalizer.run('./abricate.resfinderfg.tsv').to_csv('./abricate.resfinderfg.normed.tsv', sep='\t') +normalizer = HamronizationNormalizer() +normalizer.run('./abricate.resfinderfg.tsv').to_csv('./abricate.resfinderfg.normed.tsv', sep='\t') ``` -This will create a file called `abricate.resfinderfg.normed.tsv` with ARO mappings and drug categorization. +This will create a file called `abricate.resfinderfg.normed.tsv` with ARO mappings and drug categorization. \ No newline at end of file diff --git a/docs/cli.md b/docs/cli.md index 429935a..a210fd7 100644 --- a/docs/cli.md +++ b/docs/cli.md @@ -16,23 +16,19 @@ argnorm -h ``` > argnorm -h -usage: argnorm [-h] - [--db {sarg,ncbi,resfinder,deeparg,megares,argannot,resfinderfg}] - [--hamronized] [-i INPUT] [-o OUTPUT] - {argsoap,abricate,deeparg,resfinder,amrfinderplus} +usage: argnorm [-h] [--db {sarg,ncbi,resfinder,deeparg,megares,argannot,resfinderfg,groot-argannot,groot-resfinder,groot-db,groot-core-db,groot-card}] [-i INPUT] [-o OUTPUT] + {argsoap,abricate,deeparg,resfinder,amrfinderplus,groot,hamronization} argNorm normalizes ARG annotation results from different tools and databases to the same ontology, namely ARO (Antibiotic Resistance Ontology). positional arguments: - {argsoap,abricate,deeparg,resfinder,amrfinderplus} + {argsoap,abricate,deeparg,resfinder,amrfinderplus,groot,hamronization} The tool you used to do ARG annotation. 
-optional arguments: +options: -h, --help show this help message and exit - --db {sarg,ncbi,resfinder,deeparg,megares,argannot,resfinderfg} + --db {sarg,ncbi,resfinder,deeparg,megares,argannot,resfinderfg,groot-argannot,groot-resfinder,groot-db,groot-core-db,groot-card} The database you used to do ARG annotation. - --hamronized Use this if the input is hamronized (processed using - the hAMRonization tool) -i INPUT, --input INPUT The annotation result you have -o OUTPUT, --output OUTPUT @@ -64,10 +60,10 @@ wget https://raw.githubusercontent.com/BigDataBiology/argNorm/main/examples/raw/ Here is a basic outline of most argNorm commands: ```bash -argnorm [tool] -i [original_annotation.tsv] -o [argnorm_result.tsv] [--hamronized] +argnorm [tool] -i [original_annotation.tsv] -o [argnorm_result.tsv] [--db] ``` -Here, `tool` refers to the ARG annotation tool used (ResFinder in this case). `original_annotation.tsv` is the path to the input data and `argnorm_result.tsv` is the path to output file where the resulting table from argNorm will be stored. `--hamronized` is an option to indicate if the input data is a result of using the [hAMRonization package](https://github.com/pha4ge/hAMRonization). In our example, the input data is not a result of using the hAMRonization package, and so the `--hamronized` option can be omitted. +Here, `tool` refers to the ARG annotation tool used (ResFinder in this case). `original_annotation.tsv` is the path to the input data and `argnorm_result.tsv` is the path to the output file where the resulting table from argNorm will be stored. `--db` is the ARG database used along with `tool` to perform annotation. ResFinder does not require `--db` (argNorm will automatically load the ResFinder database); however, `--db` is required for the ARG annotation tools `groot` and `abricate`. 
To run argNorm on the input data, use this command in your terminal: @@ -97,31 +93,18 @@ argnorm [tool] -i [original_annotation.tsv] -o [annotation_result_with_aro.tsv] ```bash argnorm argsoap -i examples/raw/args-oap.sarg.reads.tsv -o outputs/raw/args-oap.sarg.reads.tsv - -argnorm argsoap -i examples/hamronized/args-oap.sarg.reads.tsv -o outputs/hamronized/args-oap.sarg.reads.tsv --hamronized ``` ### DeepARG ```bash argnorm deeparg -i examples/raw/deeparg.deeparg.orfs.tsv -o outputs/raw/deeparg.deeparg.orfs.tsv - -argnorm deeparg -i examples/hamronized/deeparg.deeparg.orfs.tsv -o outputs/hamronized/deeparg.deeparg.orfs.tsv --hamronized ``` ### ABRicate When using abricate, it is necessary to specify the database used: -#### Hamronized -```bash -argnorm abricate --db ncbi -i examples/hamronized/abricate.ncbi.tsv -o outputs/hamronized/abricate.ncbi.tsv --hamronized -argnorm abricate --db megares -i examples/hamronized/abricate.megares.tsv -o outputs/hamronized/abricate.megares.tsv --hamronized -argnorm abricate --db argannot -i examples/hamronized/abricate.argannot.tsv -o outputs/hamronized/abricate.argannot.tsv --hamronized -argnorm abricate --db resfinder -i examples/hamronized/abricate.resfinder.tsv -o outputs/hamronized/abricate.resfinder.tsv --hamronized -``` - -#### Raw ```bash argnorm abricate --db ncbi -i examples/raw/abricate.ncbi.tsv -o outputs/raw/abricate.ncbi.tsv argnorm abricate --db megares -i examples/raw/abricate.megares.tsv -o outputs/raw/abricate.megares.tsv @@ -130,13 +113,6 @@ argnorm abricate --db argannot -i examples/raw/abricate.argannot.tsv -o outputs/ ### ResFinder -#### Hamronized -```bash -argnorm resfinder -i examples/hamronized/resfinder.resfinder.orfs.tsv -o outputs/hamronized/resfinder.resfinder.orfs.tsv --hamronized -argnorm resfinder -i examples/hamronized/resfinder.resfinder.reads.tsv -o outputs/hamronized/resfinder.resfinder.reads.tsv --hamronized -``` - -#### Raw ```bash argnorm resfinder -i 
examples/raw/resfinder.resfinder.orfs.tsv -o outputs/raw/resfinder.resfinder.orfs.tsv argnorm resfinder -i examples/raw/resfinder.resfinder.reads.tsv -o outputs/raw/resfinder.resfinder.reads.tsv @@ -145,8 +121,6 @@ argnorm resfinder -i examples/raw/resfinder.resfinder.reads.tsv -o outputs/raw/r ### AMRFinderPlus ```bash argnorm amrfinderplus -i examples/raw/amrfinderplus.ncbi.orfs.tsv -o outputs/raw/amrfinderplus.ncbi.orfs.tsv - -argnorm amrfinderplus -i examples/hamronized/amrfinderplus.ncbi.orfs.tsv -o outputs/hamronized/amrfinderplus.ncbi.orfs.tsv ``` ### GROOT @@ -156,10 +130,22 @@ argnorm groot -i examples/raw/groot.resfinder.tsv -o outputs/raw/groot.resfinder argnorm groot -i examples/raw/groot.card.tsv -o outputs/raw/groot.card.tsv --db groot-card argnorm groot -i examples/raw/groot.groot-db.tsv -o outputs/raw/groot.groot-db.tsv --db groot-db argnorm groot -i examples/raw/groot.groot-core-db.tsv -o outputs/raw/groot.groot-core-db.tsv --db groot-core-db +``` + +### Hamronization -argnorm groot -i examples/hamronized/groot.argannot.tsv -o outputs/hamronized/groot.argannot.tsv --db groot-argannot --hamronized -argnorm groot -i examples/hamronized/groot.resfinder.tsv -o outputs/hamronized/groot.resfinder.tsv --db groot-resfinder --hamronized -argnorm groot -i examples/hamronized/groot.card.tsv -o outputs/hamronized/groot.card.tsv --db groot-card --hamronized -argnorm groot -i examples/hamronized/groot.groot-db.tsv -o outputs/hamronized/groot.groot-db.tsv --db groot-db --hamronized -argnorm groot -i examples/hamronized/groot.groot-core-db.tsv -o outputs/hamronized/groot.groot-core-db.tsv --db groot-core-db --hamronized -``` \ No newline at end of file +```bash +argnorm hamronization -i examples/hamronized/args-oap.sarg.reads.tsv -o outputs/hamronized/args-oap.sarg.reads.tsv +argnorm hamronization -i examples/hamronized/deeparg.deeparg.orfs.tsv -o outputs/hamronized/deeparg.deeparg.orfs.tsv +argnorm hamronization -i examples/hamronized/abricate.ncbi.tsv -o 
outputs/hamronized/abricate.ncbi.tsv +argnorm hamronization -i examples/hamronized/abricate.megares.tsv -o outputs/hamronized/abricate.megares.tsv +argnorm hamronization -i examples/hamronized/abricate.argannot.tsv -o outputs/hamronized/abricate.argannot.tsv +argnorm hamronization -i examples/hamronized/abricate.resfinder.tsv -o outputs/hamronized/abricate.resfinder.tsv +argnorm hamronization -i examples/hamronized/resfinder.resfinder.orfs.tsv -o outputs/hamronized/resfinder.resfinder.orfs.tsv +argnorm hamronization -i examples/hamronized/resfinder.resfinder.reads.tsv -o outputs/hamronized/resfinder.resfinder.reads.tsv +argnorm hamronization -i examples/hamronized/amrfinderplus.ncbi.orfs.tsv -o outputs/hamronized/amrfinderplus.ncbi.orfs.tsv +argnorm hamronization -i examples/hamronized/groot.argannot.tsv -o outputs/hamronized/groot.argannot.tsv +argnorm hamronization -i examples/hamronized/groot.resfinder.tsv -o outputs/hamronized/groot.resfinder.tsv +argnorm hamronization -i examples/hamronized/groot.card.tsv -o outputs/hamronized/groot.card.tsv +argnorm hamronization -i examples/hamronized/groot.groot-db.tsv -o outputs/hamronized/groot.groot-db.tsv +argnorm hamronization -i examples/hamronized/groot.groot-core-db.tsv -o outputs/hamronized/groot.groot-core-db.tsv \ No newline at end of file diff --git a/docs/index.md b/docs/index.md index d38cb29..4917144 100644 --- a/docs/index.md +++ b/docs/index.md @@ -51,18 +51,18 @@ The `resistance_to_drug_classes` column will contain ARO numbers of the broader If you use argNorm in a publication, please cite the preprint: > Ugarcina Perovic S, Ramji V et al. argNorm: Normalization of Antibiotic Resistance Gene Annotations to the Antibiotic Resistance Ontology (ARO). Queensland University of Technology ePrints, 2024. DOI: https://doi.org/10.5204/rep.eprints.252448 [Preprint] (Under review). 
-## Supported tools and databases +## Supported ARG annotation tools and databases | ARG database | Tool for ARG annotation | | ---------------------------------- | ------------------------------------------------------- | -| ARG-ANNOT v5.0 | [ABRicate v1.0.1](https://github.com/tseemann/abricate) | -| DeepARG v2 | [DeepARG v1.0.2](https://bench.cs.vt.edu/deeparg) | -| Groot v1.1.2 | [GROOT v1.1.2](https://github.com/will-rowe/groot) | -| MEGARes v3.0 | [ABRicate v1.0.1](https://github.com/tseemann/abricate) | -| NCBI Reference Gene Database v3.12 | [ABRicate v1.0.1](https://github.com/tseemann/abricate) & [AMRFinderPlus v3.10.30](https://github.com/ncbi/amr) | -| ResFinder v4.0 | [ABRicate v1.0.1](https://github.com/tseemann/abricate) & [ResFinder v4.0](https://bitbucket.org/genomicepidemiology/resfinder/src/master/) | -| ResFinderFG v2.0 | [ABRicate v1.0.1](https://github.com/tseemann/abricate) | -| SARG (reads mode) v3.2.1 | [ARGs-OAP v2.3](https://galaxyproject.org/use/args-oap/) | +| ARG-ANNOT v5.0 | [ABRicate v1.0.1](https://github.com/tseemann/abricate) & [hAMRonization](https://github.com/pha4ge/hAMRonization) | +| DeepARG v2 | [DeepARG v1.0.2](https://bench.cs.vt.edu/deeparg) & [hAMRonization](https://github.com/pha4ge/hAMRonization) | +| Groot v1.1.2 | [GROOT v1.1.2](https://github.com/will-rowe/groot) & [hAMRonization](https://github.com/pha4ge/hAMRonization) | +| MEGARes v3.0 | [ABRicate v1.0.1](https://github.com/tseemann/abricate) & [hAMRonization](https://github.com/pha4ge/hAMRonization) | +| NCBI Reference Gene Database v3.12 | [ABRicate v1.0.1](https://github.com/tseemann/abricate), [AMRFinderPlus v3.10.30](https://github.com/ncbi/amr), & [hAMRonization](https://github.com/pha4ge/hAMRonization) | +| ResFinder v4.0 | [ABRicate v1.0.1](https://github.com/tseemann/abricate), [ResFinder v4.0](https://bitbucket.org/genomicepidemiology/resfinder/src/master/), & [hAMRonization](https://github.com/pha4ge/hAMRonization) | +| ResFinderFG v2.0 | [ABRicate 
v1.0.1](https://github.com/tseemann/abricate) & [hAMRonization](https://github.com/pha4ge/hAMRonization) | +| SARG (reads mode) v3.2.1 | [ARGs-OAP v2.3](https://galaxyproject.org/use/args-oap/) & [hAMRonization](https://github.com/pha4ge/hAMRonization) | - Note: ARG database and ARG annotation tool versions can change. argNorm is only intended for supported versions listed above. - Note: the argNorm tool will be periodically updated to support the latest versions of databases and annotation tools if they undergo significant changes. @@ -98,7 +98,7 @@ argNorm is readily available in the funcscan pipeline which can be accessed (her Here is a basic outline of calling argNorm. ```bash -argnorm [tool] [--db] -i [path to original_annotation.tsv] -o [path to annotation_result_with_aro.tsv] [--hamronized (if hAMRonization used)] +argnorm [tool] [--db] -i [path to original_annotation.tsv] -o [path to annotation_result_with_aro.tsv] ``` ### `tool` (required) @@ -109,6 +109,7 @@ The most important ***required positional*** argument is `tool` (see [here](#sup - `resfinder` - `amrfinderplus` - `groot` +- `hamronization` ### I/O (required) - `-i` or `--input`: path to the annotation result @@ -135,6 +136,7 @@ ARG annotation tools can use several ARG databases for annotation. Hence, the `t | `resfinder` | Not required | | `amrfinderplus` | Not required | | `groot` | Any from `groot-argannot`, `groot-resfinder`, `groot-db`, `groot-core-db`, or `groot-card` | +| `hamronization` | Not required | ### `--hamronized` (optional) Use this if the input is hamronized by [hAMRonization](https://github.com/pha4ge/hAMRonization) @@ -144,22 +146,19 @@ Use `argnorm -h` or `argnorm --help` to see available options. 
```bash >argnorm -h -usage: argnorm [-h] - [--db {sarg,ncbi,resfinder,deeparg,megares,argannot,resfinderfg,groot-argannot,groot-resfinder,groot-db,groot-core-db,groot-card}] - [--hamronized] [-i INPUT] [-o OUTPUT] - {argsoap,abricate,deeparg,resfinder,amrfinderplus,groot} +usage: argnorm [-h] [--db {sarg,ncbi,resfinder,deeparg,megares,argannot,resfinderfg,groot-argannot,groot-resfinder,groot-db,groot-core-db,groot-card}] [-i INPUT] [-o OUTPUT] + {argsoap,abricate,deeparg,resfinder,amrfinderplus,groot,hamronization} argNorm normalizes ARG annotation results from different tools and databases to the same ontology, namely ARO (Antibiotic Resistance Ontology). positional arguments: - {argsoap,abricate,deeparg,resfinder,amrfinderplus,groot} + {argsoap,abricate,deeparg,resfinder,amrfinderplus,groot,hamronization} The tool you used to do ARG annotation. -optional arguments: +options: -h, --help show this help message and exit --db {sarg,ncbi,resfinder,deeparg,megares,argannot,resfinderfg,groot-argannot,groot-resfinder,groot-db,groot-core-db,groot-card} The database you used to do ARG annotation. - --hamronized Use this if the input is hamronized (processed using the hAMRonization tool) -i INPUT, --input INPUT The annotation result you have -o OUTPUT, --output OUTPUT @@ -209,23 +208,19 @@ argnorm -h ``` > argnorm -h -usage: argnorm [-h] - [--db {sarg,ncbi,resfinder,deeparg,megares,argannot,resfinderfg}] - [--hamronized] [-i INPUT] [-o OUTPUT] - {argsoap,abricate,deeparg,resfinder,amrfinderplus} +usage: argnorm [-h] [--db {sarg,ncbi,resfinder,deeparg,megares,argannot,resfinderfg,groot-argannot,groot-resfinder,groot-db,groot-core-db,groot-card}] [-i INPUT] [-o OUTPUT] + {argsoap,abricate,deeparg,resfinder,amrfinderplus,groot,hamronization} argNorm normalizes ARG annotation results from different tools and databases to the same ontology, namely ARO (Antibiotic Resistance Ontology). 
positional arguments: - {argsoap,abricate,deeparg,resfinder,amrfinderplus} + {argsoap,abricate,deeparg,resfinder,amrfinderplus,groot,hamronization} The tool you used to do ARG annotation. -optional arguments: +options: -h, --help show this help message and exit - --db {sarg,ncbi,resfinder,deeparg,megares,argannot,resfinderfg} + --db {sarg,ncbi,resfinder,deeparg,megares,argannot,resfinderfg,groot-argannot,groot-resfinder,groot-db,groot-core-db,groot-card} The database you used to do ARG annotation. - --hamronized Use this if the input is hamronized (processed using - the hAMRonization tool) -i INPUT, --input INPUT The annotation result you have -o OUTPUT, --output OUTPUT @@ -257,10 +252,10 @@ wget https://raw.githubusercontent.com/BigDataBiology/argNorm/main/examples/raw/ Here is a basic outline of most argNorm commands: ```bash -argnorm [tool] -i [original_annotation.tsv] -o [argnorm_result.tsv] [--hamronized] +argnorm [tool] -i [original_annotation.tsv] -o [argnorm_result.tsv] [--db] ``` -Here, `tool` refers to the ARG annotation tool used (ResFinder in this case). `original_annotation.tsv` is the path to the input data and `argnorm_result.tsv` is the path to output file where the resulting table from argNorm will be stored. `--hamronized` is an option to indicate if the input data is a result of using the [hAMRonization package](https://github.com/pha4ge/hAMRonization). In our example, the input data is not a result of using the hAMRonization package, and so the `--hamronized` option can be omitted. +Here, `tool` refers to the ARG annotation tool used (ResFinder in this case). `original_annotation.tsv` is the path to the input data and `argnorm_result.tsv` is the path to the output file where the resulting table from argNorm will be stored. `--db` is the ARG database used along with `tool` to perform annotation. 
ResFinder does not require `--db` (argNorm will automatically load the ResFinder database); however, `--db` is required for the ARG annotation tools `groot` and `abricate`. To run argNorm on the input data, use this command in your terminal: