IQ-Tree-Teaching

From Research Computing Center Wiki


Category

Bioinformatics

Program On

Teaching

Version

1.6.12

Author / Distributor

Please see http://www.iqtree.org/

Description

Efficient phylogenomic software by maximum likelihood. More information: http://www.iqtree.org/

Running Program

  • Version 1.6.12, installed in /apps/eb/IQ-TREE/1.6.12-foss-2019b

To use this version of IQ-TREE, please first load the module with

module load IQ-TREE/1.6.12-foss-2019b
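
After loading the module, it can help to confirm that the iqtree executable is actually on the PATH before submitting a job. The following is a minimal sketch of such a check; the function name check_iqtree is hypothetical and not something the cluster provides.

```shell
#!/bin/bash
# check_iqtree is a hypothetical helper name used only for this example.
check_iqtree() {
    if command -v iqtree >/dev/null 2>&1; then
        # -version is listed under GENERAL OPTIONS in the iqtree help text.
        iqtree -version
    else
        echo "iqtree not found; try: module load IQ-TREE/1.6.12-foss-2019b"
        return 1
    fi
}

# Do not abort the calling shell if the check fails.
check_iqtree || true
```

If the message about the missing executable appears, the module has most likely not been loaded in the current shell.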


Sample job submission script (sub.sh) to run IQ-TREE version 1.6.12:

#!/bin/bash
#SBATCH --job-name=jobName
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu
#SBATCH --ntasks=1
#SBATCH --mem=2gb
#SBATCH --time=04:00:00
#SBATCH --output=IQTREE.%j.out
#SBATCH --error=IQTREE.%j.err

cd $SLURM_SUBMIT_DIR

module load IQ-TREE/1.6.12-foss-2019b
iqtree [options]

In a real submission script, at least the placeholder values above (for example the job name, e-mail address, resource requests, and [options]) need to be reviewed and replaced with values appropriate for your job.
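
As an illustration of replacing those values, the sketch below writes a filled-in sub.sh for a multithreaded run. The job name, the alignment file example.phy, the 4 cores, the 4 GB of memory, and the output prefix example_run are all assumptions chosen for the example. The IQ-TREE options used (-s, -m MFP, -bb, -nt, -pre) are taken from the iqtree -h output.

```shell
#!/bin/bash
# Write an example sub.sh. All resource values and file names are placeholders.
cat > sub.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=iqtree_example
#SBATCH --partition=batch
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=4gb
#SBATCH --time=04:00:00
#SBATCH --output=IQTREE.%j.out
#SBATCH --error=IQTREE.%j.err

cd "$SLURM_SUBMIT_DIR"

module load IQ-TREE/1.6.12-foss-2019b

# ModelFinder plus tree search with 1000 ultrafast bootstrap replicates,
# using as many threads as Slurm allocated to the task.
iqtree -s example.phy -m MFP -bb 1000 -nt "$SLURM_CPUS_PER_TASK" -pre example_run
EOF

echo "wrote sub.sh"
```

Requesting cores with --cpus-per-task and passing the same number to -nt keeps the thread count in line with what the scheduler has allocated.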

Please refer to Running_Jobs_on_the_teaching_cluster, Run X window Jobs and Run interactive Jobs for more details on running jobs on the teaching cluster.

Here is an example of a job submission command:

sbatch ./sub.sh 
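
sbatch normally replies with a line of the form "Submitted batch job <id>". The sketch below extracts the job ID from that reply so it can be passed to squeue. The helper name job_id_from is hypothetical, and the Slurm commands run only where sbatch is available.

```shell
#!/bin/bash
# job_id_from: hypothetical helper that extracts the numeric job ID from
# sbatch's usual reply, "Submitted batch job <id>".
job_id_from() {
    printf '%s\n' "$1" | awk '/^Submitted batch job/ {print $4}'
}

if command -v sbatch >/dev/null 2>&1; then
    reply=$(sbatch ./sub.sh)
    jobid=$(job_id_from "$reply")
    echo "job ID: $jobid"
    squeue -j "$jobid"      # show the job's state in the queue
else
    echo "sbatch not available on this machine"
fi
```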

Documentation

module load IQ-TREE/1.6.12-foss-2019b

iqtree -h
IQ-TREE multicore version 1.6.12 for Linux 64-bit built Jul  9 2020
Developed by Bui Quang Minh, Nguyen Lam Tung, Olga Chernomor,
Heiko Schmidt, Dominik Schrempf, Michael Woodhams.

Usage: iqtree -s <alignment> [OPTIONS]

GENERAL OPTIONS:
  -? or -h             Print this help dialog
  -version             Display version number
  -s <alignment>       Input alignment in PHYLIP/FASTA/NEXUS/CLUSTAL/MSF format
  -st <data_type>      BIN, DNA, AA, NT2AA, CODON, MORPH (default: auto-detect)
  -q <partition_file>  Edge-linked partition model (file in NEXUS/RAxML format)
 -spp <partition_file> Like -q option but allowing partition-specific rates
  -sp <partition_file> Edge-unlinked partition model (like -M option of RAxML)
  -t <start_tree_file> or -t BIONJ or -t RANDOM
                       Starting tree (default: 99 parsimony tree and BIONJ)
  -te <user_tree_file> Like -t but fixing user tree (no tree search performed)
  -o <outgroup_taxon>  Outgroup taxon name for writing .treefile
  -pre <PREFIX>        Prefix for all output files (default: aln/partition)
  -nt <num_threads>    Number of cores/threads or AUTO for automatic detection
  -ntmax <max_threads> Max number of threads by -nt AUTO (default: #CPU cores)
  -seed <number>       Random seed number, normally used for debugging purpose
  -v, -vv, -vvv        Verbose mode, printing more messages to screen
  -quiet               Quiet mode, suppress printing to screen (stdout)
  -keep-ident          Keep identical sequences (default: remove & finally add)
  -safe                Safe likelihood kernel to avoid numerical underflow
  -mem RAM             Maximal RAM usage for memory saving mode
  --runs NUMBER        Number of independent runs (default: 1)

CHECKPOINTING TO RESUME STOPPED RUN:
  -redo                Redo analysis even for successful runs (default: resume)
  -cptime <seconds>    Minimum checkpoint time interval (default: 60 sec)

LIKELIHOOD MAPPING ANALYSIS:
  -lmap <#quartets>    Number of quartets for likelihood mapping analysis
  -lmclust <clustfile> NEXUS file containing clusters for likelihood mapping
  -wql                 Print quartet log-likelihoods to .quartetlh file

NEW STOCHASTIC TREE SEARCH ALGORITHM:
  -ninit <number>      Number of initial parsimony trees (default: 100)
  -ntop <number>       Number of top initial trees (default: 20)
  -nbest <number>      Number of best trees retained during search (default: 5)
  -n <#iterations>     Fix number of iterations to stop (default: auto)
  -nstop <number>      Number of unsuccessful iterations to stop (default: 100)
  -pers <proportion>   Perturbation strength for randomized NNI (default: 0.5)
  -sprrad <number>     Radius for parsimony SPR search (default: 6)
  -allnni              Perform more thorough NNI search (default: off)
  -g <constraint_tree> (Multifurcating) topological constraint tree file
  -fast                Fast search to resemble FastTree

ULTRAFAST BOOTSTRAP:
  -bb <#replicates>    Ultrafast bootstrap (>=1000)
  -bsam GENE|GENESITE  Resample GENE or GENE+SITE for partition (default: SITE)
  -wbt                 Write bootstrap trees to .ufboot file (default: none)
  -wbtl                Like -wbt but also writing branch lengths
  -nm <#iterations>    Maximum number of iterations (default: 1000)
  -nstep <#iterations> #Iterations for UFBoot stopping rule (default: 100)
  -bcor <min_corr>     Minimum correlation coefficient (default: 0.99)
  -beps <epsilon>      RELL epsilon to break tie (default: 0.5)
  -bnni                Optimize UFBoot trees by NNI on bootstrap alignment
  -j <jackknife>       Proportion of sites for jackknife (default: NONE)

STANDARD NON-PARAMETRIC BOOTSTRAP:
  -b <#replicates>     Bootstrap + ML tree + consensus tree (>=100)
  -bc <#replicates>    Bootstrap + consensus tree
  -bo <#replicates>    Bootstrap only

SINGLE BRANCH TEST:
  -alrt <#replicates>  SH-like approximate likelihood ratio test (SH-aLRT)
  -alrt 0              Parametric aLRT test (Anisimova and Gascuel 2006)
  -abayes              approximate Bayes test (Anisimova et al. 2011)
  -lbp <#replicates>   Fast local bootstrap probabilities

MODEL-FINDER:
  -m TESTONLY          Standard model selection (like jModelTest, ProtTest)
  -m TEST              Standard model selection followed by tree inference
  -m MF                Extended model selection with FreeRate heterogeneity
  -m MFP               Extended model selection followed by tree inference
  -m TESTMERGEONLY     Find best partition scheme (like PartitionFinder)
  -m TESTMERGE         Find best partition scheme followed by tree inference
  -m MF+MERGE          Find best partition scheme incl. FreeRate heterogeneity
  -m MFP+MERGE         Like -m MF+MERGE followed by tree inference
  -rcluster <percent>  Percentage of partition pairs (relaxed clustering alg.)
  -rclusterf <perc.>   Percentage of partition pairs (fast relaxed clustering)
  -rcluster-max <num>  Max number of partition pairs (default: 10*#partitions)
  -mset program        Restrict search to models supported by other programs
                       (raxml, phyml or mrbayes)
  -mset <lm-subset>    Restrict search to a subset of the Lie-Markov models
                       Options for lm-subset are:
                       liemarkov, liemarkovry, liemarkovws, liemarkovmk, strandsymmetric
  -mset m1,...,mk      Restrict search to models in a comma-separated list
                       (e.g. -mset WAG,LG,JTT)
  -msub source         Restrict search to AA models for specific sources
                       (nuclear, mitochondrial, chloroplast or viral)
  -mfreq f1,...,fk     Restrict search to using a list of state frequencies
                       (default AA: -mfreq FU,F; codon: -mfreq ,F1x4,F3x4,F)
  -mrate r1,...,rk     Restrict search to a list of rate-across-sites models
                       (e.g. -mrate E,I,G,I+G,R is used for -m MF)
  -cmin <kmin>         Min #categories for FreeRate model [+R] (default: 2)
  -cmax <kmax>         Max #categories for FreeRate model [+R] (default: 10)
  -merit AIC|AICc|BIC  Optimality criterion to use (default: all)
  -mtree               Perform full tree search for each model considered
  -mredo               Ignore model results computed earlier (default: reuse)
  -madd mx1,...,mxk    List of mixture models to also consider
  -mdef <nexus_file>   A model definition NEXUS file (see Manual)

SUBSTITUTION MODEL:
  -m <model_name>
                  DNA: HKY (default), JC, F81, K2P, K3P, K81uf, TN/TrN, TNef,
                       TIM, TIMef, TVM, TVMef, SYM, GTR, or 6-digit model
                       specification (e.g., 010010 = HKY)
              Protein: LG (default), Poisson, cpREV, mtREV, Dayhoff, mtMAM,
                       JTT, WAG, mtART, mtZOA, VT, rtREV, DCMut, PMB, HIVb,
                       HIVw, JTTDCMut, FLU, Blosum62, GTR20, mtMet, mtVer, mtInv
      Protein mixture: C10,...,C60, EX2, EX3, EHO, UL2, UL3, EX_EHO, LG4M, LG4X
               Binary: JC2 (default), GTR2
      Empirical codon: KOSI07, SCHN05
    Mechanistic codon: GY (default), MG, MGK, GY0K, GY1KTS, GY1KTV, GY2K,
                       MG1KTS, MG1KTV, MG2K
 Semi-empirical codon: XX_YY where XX is empirical and YY is mechanistic model
       Morphology/SNP: MK (default), ORDERED, GTR
       Lie Markov DNA: One of the following, optionally prefixed by RY, WS or MK:
                       1.1,  2.2b, 3.3a, 3.3b,  3.3c,
                       3.4,  4.4a, 4.4b, 4.5a,  4.5b,
                       5.6a, 5.6b, 5.7a, 5.7b,  5.7c,
                       5.11a,5.11b,5.11c,5.16,  6.6,
                       6.7a, 6.7b, 6.8a, 6.8b,  6.17a,
                       6.17b,8.8,  8.10a,8.10b, 8.16,
                       8.17, 8.18, 9.20a,9.20b,10.12,
                       10.34,12.12
       Non-reversible: STRSYM (strand symmetric model, synonymous with WS6.6)
       Non-reversible: UNREST (most general unrestricted model, functionally equivalent to 12.12)
       Models can have parameters appended in brackets.
           e.g. '-mRY3.4{0.2,-0.3}+I' specifies parameters for
           RY3.4 model but leaves proportion of invariant sites
           unspecified. '-mRY3.4{0.2,-0.3}+I{0.5} gives both.
           When this is done, the given parameters will be taken
           as fixed (default) or as start point for optimization
           (if -optfromgiven option supplied)

        Otherwise: Name of file containing user-model parameters
                   (rate parameters and state frequencies)

STATE FREQUENCY:
  Append one of the following +F... to -m <model_name>
  +F                   Empirically counted frequencies from alignment
  +FO (letter-O)       Optimized frequencies by maximum-likelihood
  +FQ                  Equal frequencies
  +FRY, +FWS, +FMK     For DNA models only, +FRY is freq(A+G)=1/2=freq(C+T),
                       +FWS is freq(A+T)=1/2=freq(C+G), +FMK is freq(A+C)=1/2=freq(G+T).
  +F####               where # are digits - for DNA models only, for basis in ACGT order,
                       digits indicate which frequencies are constrained to be the same.
                       E.g. +F1221 means freq(A)=freq(T), freq(C)=freq(G).
  +FU                  Amino-acid frequencies by the given protein matrix
  +F1x4 (codon model)  Equal NT frequencies over three codon positions
  +F3x4 (codon model)  Unequal NT frequencies over three codon positions

MIXTURE MODEL:
  -m "MIX{model1,...,modelK}"   Mixture model with K components
  -m "FMIX{freq1,...freqK}"     Frequency mixture model with K components
  -mwopt               Turn on optimizing mixture weights (default: none)

RATE HETEROGENEITY AMONG SITES:
  -m modelname+I       A proportion of invariable sites
  -m modelname+G[n]    Discrete Gamma model with n categories (default n=4)
  -m modelname*G[n]    Discrete Gamma model with unlinked model parameters
  -m modelname+I+G[n]  Invariable sites plus Gamma model with n categories
  -m modelname+R[n]    FreeRate model with n categories (default n=4)
  -m modelname*R[n]    FreeRate model with unlinked model parameters
  -m modelname+I+R[n]  Invariable sites plus FreeRate model with n categories
  -m modelname+Hn      Heterotachy model with n classes
  -m modelname*Hn      Heterotachy model with n classes and unlinked parameters
  -a <Gamma_shape>     Gamma shape parameter for site rates (default: estimate)
  -amin <min_shape>    Min Gamma shape parameter for site rates (default: 0.02)
  -gmedian             Median approximation for +G site rates (default: mean)
  --opt-gamma-inv      More thorough estimation for +I+G model parameters
  -i <p_invar>         Proportion of invariable sites (default: estimate)
  -wsr                 Write site rates to .rate file
  -mh                  Computing site-specific rates to .mhrate file using
                       Meyer & von Haeseler (2003) method

POLYMORPHISM AWARE MODELS (PoMo):
 -s <counts_file>      Input counts file (see manual)
 -m <MODEL>+P          DNA substitution model (see above) used with PoMo
   +N<POPSIZE>         Virtual population size (default: 9)
   +[WB|WH|S]          Sampling method (default: +WB), WB: Weighted binomial,
                       WH: Weighted hypergeometric S: Sampled sampling
   +G[n]               Discrete Gamma rate model with n categories (default n=4)

ASCERTAINMENT BIAS CORRECTION:
  -m modelname+ASC     Correction for absence of invariant sites in alignment

SINGLE TOPOLOGY HETEROTACHY MODEL:
 -m <model_name>+H[k]  Heterotachy model mixed branch lengths with k classes
 -m "MIX{m1,...mK}+H"
 -nni-eval <m>         Loop m times for NNI evaluation (default m=1)

SITE-SPECIFIC FREQUENCY MODEL:
  -ft <tree_file>      Input tree to infer site frequency model
  -fs <in_freq_file>   Input site frequency model file
  -fmax                Posterior maximum instead of mean approximation

CONSENSUS RECONSTRUCTION:
  -t <tree_file>       Set of input trees for consensus reconstruction
  -minsup <threshold>  Min split support in range [0,1]; 0.5 for majority-rule
                       consensus (default: 0, i.e. extended consensus)
  -bi <burnin>         Discarding <burnin> trees at beginning of <treefile>
  -con                 Computing consensus tree to .contree file
  -net                 Computing consensus network to .nex file
  -sup <target_tree>   Assigning support values for <target_tree> to .suptree
  -suptag <name>       Node name (or ALL) to assign tree IDs where node occurs

ROBINSON-FOULDS DISTANCE:
  -rf_all              Computing all-to-all RF distances of trees in <treefile>
  -rf <treefile2>      Computing all RF distances between two sets of trees
                       stored in <treefile> and <treefile2>
  -rf_adj              Computing RF distances of adjacent trees in <treefile>

TREE TOPOLOGY TEST:
  -z <trees_file>      Evaluating a set of user trees
  -zb <#replicates>    Performing BP,KH,SH,ELW tests for trees passed via -z
  -zw                  Also performing weighted-KH and weighted-SH tests
  -au                  Also performing approximately unbiased (AU) test

ANCESTRAL STATE RECONSTRUCTION:
  -asr                 Ancestral state reconstruction by empirical Bayes
  -asr-min <prob>      Min probability of ancestral state (default: equil freq)

GENERATING RANDOM TREES:
  -r <num_taxa>        Create a random tree under Yule-Harding model
  -ru <num_taxa>       Create a random tree under Uniform model
  -rcat <num_taxa>     Create a random caterpillar tree
  -rbal <num_taxa>     Create a random balanced tree
  -rcsg <num_taxa>     Create a random circular split network
  -rlen <min_len> <mean_len> <max_len>  
                       min, mean, and max branch lengths of random trees

MISCELLANEOUS:
  -wt                  Write locally optimal trees into .treels file
  -blfix               Fix branch lengths of user tree passed via -te
  -blscale             Scale branch lengths of user tree passed via -t
  -blmin               Min branch length for optimization (default 0.000001)
  -blmax               Max branch length for optimization (default 100)
  -wsr                 Write site rates and categories to .rate file
  -wsl                 Write site log-likelihoods to .sitelh file
  -wslr                Write site log-likelihoods per rate category
  -wslm                Write site log-likelihoods per mixture class
  -wslmr               Write site log-likelihoods per mixture+rate class
  -wspr                Write site probabilities per rate category
  -wspm                Write site probabilities per mixture class
  -wspmr               Write site probabilities per mixture+rate class
  -wpl                 Write partition log-likelihoods to .partlh file
  -fconst f1,...,fN    Add constant patterns into alignment (N=#nstates)
  -me <epsilon>        LogL epsilon for parameter estimation (default 0.01)
  --no-outfiles        Suppress printing output files
  --eigenlib           Use Eigen3 library
  -alninfo             Print alignment sites statistics to .alninfo
  -czb                 Collapse zero branches in final tree
  --show-lh            Compute tree likelihood without optimisation
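
The options listed in the help text can be combined on one command line. As a rough illustration, the sketch below assembles one possible invocation: ModelFinder followed by tree inference (-m MFP), 1000 ultrafast bootstrap replicates (-bb), and 1000 SH-aLRT replicates (-alrt). The alignment name example.phy, the prefix example_run, and the thread count are assumptions for the example; only flags that appear in the help text are used.

```shell
#!/bin/bash
# Placeholder inputs (assumptions for this example).
aln=example.phy
prefix=example_run
threads=4

# Keep the command in the positional parameters so that each argument
# stays a separate word when it is run later with "$@".
set -- iqtree -s "$aln" -pre "$prefix" -m MFP -bb 1000 -alrt 1000 -nt "$threads"

# Print the command line for the job log.
cmd_line="$*"
echo "$cmd_line"

# Run only where iqtree is installed and the placeholder alignment exists.
if command -v iqtree >/dev/null 2>&1 && [ -f "$aln" ]; then
    "$@"
fi
```

According to the CHECKPOINTING section of the help text, a stopped run is resumed by default, so rerunning the same command after an interruption should continue the earlier analysis; adding -redo starts it again from scratch.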


Installation

IQ-Tree


System

64-bit Linux