
Phylogenetics

Alden Dirks edited this page Aug 19, 2022 · 14 revisions

Gather Sequences

Align Sequences

Trim Alignment(s) and Concatenate Multi-locus Data

Run Phylogenetic Analyses

IQTREE

RAxML

MrBayes

Introduction

MrBayes is the standard software for Bayesian phylogenetic analysis (as opposed to the maximum likelihood approach used in IQTREE and RAxML). In modern phylogenetics, the two methods are complementary. Branch support values from both approaches (bootstrap values from maximum likelihood and posterior probabilities from Bayesian inference) should be presented together on a tree. Reviewers may ask for a Bayesian analysis if only maximum likelihood was conducted.

MrBayes is installed as a module on the Great Lakes Slurm HPC Cluster but I have not been able to get it to work. Instead, MrBayes can be run on the CIPRES Science Gateway portal. There is apparently a plugin for Geneious Prime to run MrBayes and other software on CIPRES, but I have not investigated this yet.

Convert Your FASTA Alignment into a MrBayes NEXUS File

To get started, you need to convert your alignment to NEXUS format, and specifically to the MrBayes variant of NEXUS rather than the standard one. One way to do this is with Mesquite. On a Mac, run Mesquite by double-clicking the Mesquite.jar file. The first time you open it, you will need to "control click", select "open", and confirm that you trust the file. Click "File > Open File..." and select your alignment.

Select your interpreter (FASTA) and save the imported file as a NEXUS file. You should see a Character Matrix window. Click "File > Export", scroll to the bottom of the list, and select "Export NEXUS for MrBayes". You can just accept the default settings in the next window by clicking "Export" (we will override these settings when we run MrBayes in CIPRES).
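If you would rather script this conversion than click through Mesquite, the following is a minimal Python sketch of the same idea. It is not what Mesquite does internally; the function names are made up for illustration, it assumes a DNA alignment in which every sequence has the same length, and it replaces any character other than letters, digits, and underscores in taxon names with an underscore to keep the names simple.

```python
import re


def read_fasta(text):
    """Parse FASTA text into an ordered list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def fasta_to_mrbayes_nexus(text):
    """Return a simple, non-interleaved NEXUS data block for a DNA alignment."""
    records = read_fasta(text)
    if not records:
        raise ValueError("no sequences found")
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("sequences differ in length; the input must be aligned")
    nchar = lengths.pop()
    # Assumption: keep taxon names simple (letters, digits, underscores only).
    names = [re.sub(r"[^A-Za-z0-9_]", "_", name) for name, _ in records]
    if len(set(names)) != len(names):
        raise ValueError("taxon names are not unique after sanitizing")
    width = max(len(name) for name in names)
    lines = [
        "#NEXUS",
        "begin data;",
        f"  dimensions ntax={len(records)} nchar={nchar};",
        "  format datatype=dna missing=? gap=-;",
        "  matrix",
    ]
    for name, (_, seq) in zip(names, records):
        lines.append(f"  {name.ljust(width)}  {seq}")
    lines += ["  ;", "end;"]
    return "\n".join(lines) + "\n"
```

To use it, read your alignment file into a string, pass it to fasta_to_mrbayes_nexus, and write the result to a .nex file. Check the output in a text editor before uploading it.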

Run MrBayes in CIPRES

Make a CIPRES profile if you have not yet done so. Create a new folder for your project. Click "Data" in the file tree to the left and upload your MrBayes NEXUS alignment. Then click on "Tasks" in the file tree and click "Create New Task". Give your task a description. Select your input data (the alignment you just uploaded). This will bring you to the "Select Tool" tab. Scroll down and click "MrBayes on XSEDE (3.2.7a)". Back in the "Task Summary" tab, click "148 Parameters Set" (the button for the Input Parameters section). In the "Simple Parameters" section, uncheck "My Data Contains a MrBayes Data Block (CHECK THIS OR MrBayes BLOCK ENTRIES WILL BE OVERWRITTEN!!!)", because we want to overwrite the default MrBayes block that Mesquite wrote. Then change "Maximum Hours to Run" from the default of 168 to something much smaller, like 8.

Go down to "Advanced Parameters" and change settings in the "Likelihood Model Parameters" section. I don't know what all these options are, so you will need to investigate them according to the best substitution model given to you by IQTREE's ModelFinder. For my analysis of Tolypocladium ITS sequences I used the model HKY+F+I+G4, which meant changing two settings: Nst=2 (HKY distinguishes two substitution types, transitions and transversions) and Rates=invgamma (the +I+G4 part: a proportion of invariable sites plus gamma-distributed rate variation).
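As a rough aid for translating a ModelFinder result into these two settings, here is a small Python sketch. The function name is hypothetical and the lookup table covers only a handful of common DNA models. It looks only at Nst and Rates, so it ignores base-frequency assumptions (for example, the difference between JC and F81). Treat the output as a starting point and check it against the MrBayes documentation.

```python
def iqtree_model_to_lset(model):
    """Map an IQ-TREE style model string (e.g. 'HKY+F+I+G4') to MrBayes
    Nst and Rates values. Only a few common DNA models are covered."""
    parts = model.upper().split("+")
    base, modifiers = parts[0], parts[1:]

    # Number of substitution types for each base model.
    nst_by_model = {
        "JC": "1", "F81": "1",            # all substitutions share one rate
        "K2P": "2", "K80": "2", "HKY": "2",  # transitions vs. transversions
        "GTR": "6",                        # six exchangeability rates
    }
    if base not in nst_by_model:
        raise ValueError(f"no simple Nst equivalent known for {base!r}")

    # Free-rate heterogeneity (+R) has no direct counterpart in this mapping.
    if any(m.startswith("R") for m in modifiers):
        raise ValueError("free-rate (+R) models are not handled by this sketch")

    has_invariant = "I" in modifiers
    has_gamma = any(m.startswith("G") for m in modifiers)
    if has_invariant and has_gamma:
        rates = "invgamma"
    elif has_gamma:
        rates = "gamma"
    elif has_invariant:
        rates = "propinv"
    else:
        rates = "equal"

    return {"nst": nst_by_model[base], "rates": rates}
```

For example, iqtree_model_to_lset("HKY+F+I+G4") gives Nst 2 and Rates invgamma, matching the settings used for the Tolypocladium ITS analysis.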

Then go down to "Parameters for MCMC" and change "Number of Generations" to something in the millions, like 2,000,000. These are the basic settings that need to be changed. Feel free to change other default settings as you see fit. Click "Save Parameters" to be taken back to the main task window. Click "Save and Run Task".
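For orientation, the form settings above correspond roughly to a MrBayes command block like the one this Python sketch prints. The function name is made up, the sampling frequency of 1000 is an illustrative assumption rather than a CIPRES default I have verified, and CIPRES may write additional commands of its own. This can be handy if you later want to run MrBayes outside CIPRES with the same basic settings.

```python
def mrbayes_block(nst, rates, ngen, samplefreq=1000):
    """Build a minimal MrBayes command block as text.

    samplefreq is an illustrative assumption; adjust it to your analysis.
    """
    if ngen <= 0 or samplefreq <= 0:
        raise ValueError("ngen and samplefreq must be positive")
    lines = [
        "begin mrbayes;",
        f"  lset nst={nst} rates={rates};",                # substitution model
        f"  mcmc ngen={ngen} samplefreq={samplefreq};",    # run the chains
        "  sump;",                                          # summarize parameters
        "  sumt;",                                          # summarize trees
        "end;",
    ]
    return "\n".join(lines) + "\n"


def samples_per_run(ngen, samplefreq):
    """Number of trees written per run, counting the starting tree."""
    return ngen // samplefreq + 1


if __name__ == "__main__":
    print(mrbayes_block(2, "invgamma", 2_000_000))
    print(samples_per_run(2_000_000, 1000), "trees sampled per run")
```

The second function shows why the number of generations matters for the output: with these assumed values, each run writes about two thousand trees to its .t file.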

Analyze Output

Once CIPRES has completed your job, download all the output files. Unzip the download and drag the two .t files (i.e., infile.nex.run1.t and infile.nex.run2.t) into a Geneious Prime directory. For each one separately, click the file, click "Tree", and choose your output directory (just choose the same one that contains the two .t files). In "Consensus Tree Builder", change "Support Threshold %" to 95. Click "OK".

You will then see the consensus tree. You can click your outgroup sequence and then click "Root" to root your tree. You might see lots of polytomies, because branches with support below the 95% threshold are collapsed. The values on the remaining branches are your posterior probabilities. When editing your tree for publication, you can thicken the branches that have a posterior probability of 95% or greater, which serves as a statistical measure of support (we choose 95 because alpha = 0.05).

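There are two .t files because MrBayes runs two independent analyses by default. If both runs have converged on the same posterior distribution, their consensus trees should agree, so build a consensus tree for each and compare them.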
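One way to check agreement between the two runs more systematically is to compare how often each split (bipartition of the taxa) appears in each run. The Python sketch below does this for lists of plain Newick strings. All function names are hypothetical. It assumes unquoted tip labels and the same taxa in every tree, it ignores branch lengths and internal node labels, and it does not discard burn-in. You would also need to extract the Newick strings from the .t files yourself, since those are NEXUS files with a translate table.

```python
from collections import Counter


def splits(newick):
    """Return the non-trivial splits of one tree as frozensets of tip labels.

    Each split is stored as the side that does NOT contain the alphabetically
    first tip, so the result does not depend on where the tree is rooted.
    """
    stack, found, tips = [], set(), set()
    label, skip = "", False

    def flush():
        nonlocal label
        if label:
            tips.add(label)
            for open_clade in stack:   # a tip belongs to every open clade
                open_clade.add(label)
            label = ""

    for ch in newick:
        if ch == "(":
            stack.append(set())
            skip = False
        elif ch == ",":
            flush()
            skip = False
        elif ch == ")":
            flush()
            found.add(frozenset(stack.pop()))
            skip = True                # ignore internal label / branch length
        elif ch == ":":
            skip = True                # ignore the branch length that follows
        elif ch == ";":
            break
        elif not skip and not ch.isspace():
            label += ch

    all_tips = frozenset(tips)
    reference = min(all_tips)
    result = set()
    for clade in found:
        side = all_tips - clade if reference in clade else clade
        if len(side) >= 2 and len(all_tips - side) >= 2:
            result.add(side)
    return result


def split_frequencies(trees):
    """Fraction of trees in which each split occurs."""
    counts = Counter()
    for tree in trees:
        counts.update(splits(tree))
    return {split: n / len(trees) for split, n in counts.items()}


def max_support_difference(run1, run2):
    """Largest absolute difference in split frequency between two runs."""
    f1, f2 = split_frequencies(run1), split_frequencies(run2)
    return max(
        (abs(f1.get(s, 0.0) - f2.get(s, 0.0)) for s in set(f1) | set(f2)),
        default=0.0,
    )
```

A value near zero suggests the two runs support the same groupings to a similar degree, while a large value suggests they have not converged. MrBayes reports a related diagnostic itself, the average standard deviation of split frequencies, in its output.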

Visualize and Edit Tree

References

IQTREE

CIPRES

MrBayes