Skip to content

Commit

Permalink
Merge pull request #884 from nextstrain/docs/skip-augur-tree
Browse files Browse the repository at this point in the history
[docs] describe how to use your own tree builder
  • Loading branch information
jameshadfield authored Apr 8, 2022
2 parents 8387c31 + 41e94d6 commit 7db4dc4
Show file tree
Hide file tree
Showing 2 changed files with 52 additions and 0 deletions.
1 change: 1 addition & 0 deletions docs/faq/faq.rst
Original file line number Diff line number Diff line change
Expand Up @@ -14,3 +14,4 @@ common questions and problems users run into.
metadata
clades
Specifying `refine` rates <refine>
Creating a tree using your own tree builder <skip_augur_tree>
51 changes: 51 additions & 0 deletions docs/faq/skip_augur_tree.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
===========================================
Creating a tree using your own tree builder
===========================================

The `augur tree` command is a light wrapper around tree building programs such as IQ-TREE, RAxML and FastTree.
It's possible that the functionality you want isn't available in those programs, or that it is available but that `augur tree` doesn't expose the functionality you need.
The solution is just to use your own tree building program directly.
This guide explains the general approach you can use.


Exclude sites
-------------

Augur tree allows you to mask certain problematic sites before tree-building via `augur tree --exclude-sites <exclude_file> --alignment <alignment>` , where each line in `<exclude_file>` is a 1-based nucleotide position to mask from the alignment. We can recreate this via the following command:

.. code-block:: bash
augur mask --sequences <alignment> --mask <exclude_file> --output <masked_alignment>
If you use this, remember to use the `<masked_alignment>` file for tree building, not the input `<alignment>` file!

Build your tree
---------------

This is up to you!
The end result should be a newick tree file with branch lengths representing nucleotide divergence (or similar).
In a typical nextstrain workflow, this file will be handed to `augur refine` for temporal inference (see below if you also want to skip this step).
Note that augur refine may change the tree structure in two ways:

* Rerooting (avoid via `augur refine --keep-root`)
* Resolving polytomies using temporal information (avoid via `augur refine --keep-polytomies`)


There are some gotchas involved in tree building, such as:

* IQ-TREE will change taxon names if they include any of the following characters: `/|()*`. This will result in the tree and alignment falling out of sync. Other tree builders may have similar behavior.
* Bootstrap values are allowed, and should be parsed correctly by `augur refine`.



Avoiding augur refine
---------------------

While augur refine is typically used to reroot the tree and infer the times of internal nodes, it also performs two important tasks which subsequent augur commands expect:

* Labels internal node names. These are typically in the format `NODE_0000001`, but this is not required.
* Stores branch lengths already present in the newick tree as a node-data JSON file, which `augur export` uses.

If you can address both of these then you can easily skip the `augur refine` step.

0 comments on commit 7db4dc4

Please sign in to comment.