IQ-Tree-Sapelo2

From Research Computing Center Wiki


Category

Bioinformatics

Program On

Sapelo2

Version

1.6.12, 2.2.2.6

Author / Distributor

Please see http://www.iqtree.org/

Description

Efficient phylogenomic software by maximum likelihood. More information: http://www.iqtree.org/

Running Program

  • Version 2.2.2.6, installed in /apps/eb/IQ-TREE/2.2.2.6-gompi-2022a

To use this version of IQ-TREE, please first load the module with

module load IQ-TREE/2.2.2.6-gompi-2022a

  • Version 1.6.12, installed in /apps/eb/IQ-TREE/1.6.12-foss-2019b

To use this version of IQ-TREE, please first load the module with

module load IQ-TREE/1.6.12-foss-2019b
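
Version 1.6.12 is run with the iqtree command. Upstream IQ-TREE 2 releases normally name the executable iqtree2; this page does not state which command names the 2.2.2.6 module provides, so that is an assumption. The sketch below checks which name is on the PATH after a module has been loaded. The function name find_iqtree is hypothetical.

```shell
# Print the name of the IQ-TREE executable found on PATH.
# Assumption: version 2.x installs "iqtree2" and 1.x installs "iqtree".
find_iqtree() {
    for name in iqtree2 iqtree; do
        if command -v "$name" >/dev/null 2>&1; then
            printf '%s\n' "$name"
            return 0
        fi
    done
    echo "no IQ-TREE executable on PATH; load a module first" >&2
    return 1
}

# Example: run this after one of the "module load" commands above.
IQTREE_BIN=$(find_iqtree) || IQTREE_BIN=""
echo "IQ-TREE executable: ${IQTREE_BIN:-not found}"
```

In a job script, "$IQTREE_BIN" can then stand in for the command name so the same script works with either module.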


Sample job submission script (sub.sh) to run IQ-TREE version 1.6.12:

#!/bin/bash
#SBATCH --job-name=jobName
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu
#SBATCH --ntasks=1
#SBATCH --mem=2gb
#SBATCH --time=04:00:00
#SBATCH --output=IQTREE.%j.out
#SBATCH --error=IQTREE.%j.err

cd $SLURM_SUBMIT_DIR

module load IQ-TREE/1.6.12-foss-2019b
iqtree [options]

In a real submission script, review the placeholder values above (for example jobName, username@uga.edu, the resource requests, and [options]) and replace them with values appropriate for your job.

Please refer to Running Jobs on Sapelo2 for more information on running jobs on the Sapelo2 cluster.

Here is an example of a job submission command:

sbatch ./sub.sh 
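
The sample script above requests a single task and does not pass -nt. For a multi-threaded run, one approach is to request cores with --cpus-per-task and hand the same number to -nt (listed under GENERAL OPTIONS in the help output). The following is a minimal sketch, not a tested site recipe. The helper name make_iqtree_job, the job name iqtree_mt, and the alignment example.phy are placeholders; the memory and time values are copied from the sample script and will likely need adjusting.

```shell
# Print a Slurm job script for a multi-threaded IQ-TREE 1.6.12 run.
# Usage: make_iqtree_job <alignment> <threads>   (hypothetical helper)
make_iqtree_job() {
    aln=$1
    threads=$2
    cat <<EOF
#!/bin/bash
#SBATCH --job-name=iqtree_mt
#SBATCH --partition=batch
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=${threads}
#SBATCH --mem=2gb
#SBATCH --time=04:00:00
#SBATCH --output=IQTREE.%j.out
#SBATCH --error=IQTREE.%j.err

cd \$SLURM_SUBMIT_DIR

module load IQ-TREE/1.6.12-foss-2019b
# Give IQ-TREE exactly the cores that Slurm allocated to this job.
iqtree -s ${aln} -nt \$SLURM_CPUS_PER_TASK
EOF
}

# Show the generated script for a 4-thread run on example.phy.
make_iqtree_job example.phy 4
```

Redirecting the output to a file (for example make_iqtree_job example.phy 4 > sub_mt.sh) gives a script that can be submitted with sbatch in the same way as sub.sh.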

Documentation

module load IQ-TREE/1.6.12-foss-2019b

iqtree -h
IQ-TREE multicore version 1.6.12 for Linux 64-bit built Jul  9 2020
Developed by Bui Quang Minh, Nguyen Lam Tung, Olga Chernomor,
Heiko Schmidt, Dominik Schrempf, Michael Woodhams.

Usage: iqtree -s <alignment> [OPTIONS]

GENERAL OPTIONS:
  -? or -h             Print this help dialog
  -version             Display version number
  -s <alignment>       Input alignment in PHYLIP/FASTA/NEXUS/CLUSTAL/MSF format
  -st <data_type>      BIN, DNA, AA, NT2AA, CODON, MORPH (default: auto-detect)
  -q <partition_file>  Edge-linked partition model (file in NEXUS/RAxML format)
 -spp <partition_file> Like -q option but allowing partition-specific rates
  -sp <partition_file> Edge-unlinked partition model (like -M option of RAxML)
  -t <start_tree_file> or -t BIONJ or -t RANDOM
                       Starting tree (default: 99 parsimony tree and BIONJ)
  -te <user_tree_file> Like -t but fixing user tree (no tree search performed)
  -o <outgroup_taxon>  Outgroup taxon name for writing .treefile
  -pre <PREFIX>        Prefix for all output files (default: aln/partition)
  -nt <num_threads>    Number of cores/threads or AUTO for automatic detection
  -ntmax <max_threads> Max number of threads by -nt AUTO (default: #CPU cores)
  -seed <number>       Random seed number, normally used for debugging purpose
  -v, -vv, -vvv        Verbose mode, printing more messages to screen
  -quiet               Quiet mode, suppress printing to screen (stdout)
  -keep-ident          Keep identical sequences (default: remove & finally add)
  -safe                Safe likelihood kernel to avoid numerical underflow
  -mem RAM             Maximal RAM usage for memory saving mode
  --runs NUMBER        Number of indepedent runs (default: 1)

CHECKPOINTING TO RESUME STOPPED RUN:
  -redo                Redo analysis even for successful runs (default: resume)
  -cptime <seconds>    Minimum checkpoint time interval (default: 60 sec)

LIKELIHOOD MAPPING ANALYSIS:
  -lmap <#quartets>    Number of quartets for likelihood mapping analysis
  -lmclust <clustfile> NEXUS file containing clusters for likelihood mapping
  -wql                 Print quartet log-likelihoods to .quartetlh file

NEW STOCHASTIC TREE SEARCH ALGORITHM:
  -ninit <number>      Number of initial parsimony trees (default: 100)
  -ntop <number>       Number of top initial trees (default: 20)
  -nbest <number>      Number of best trees retained during search (defaut: 5)
  -n <#iterations>     Fix number of iterations to stop (default: auto)
  -nstop <number>      Number of unsuccessful iterations to stop (default: 100)
  -pers <proportion>   Perturbation strength for randomized NNI (default: 0.5)
  -sprrad <number>     Radius for parsimony SPR search (default: 6)
  -allnni              Perform more thorough NNI search (default: off)
  -g <constraint_tree> (Multifurcating) topological constraint tree file
  -fast                Fast search to resemble FastTree

ULTRAFAST BOOTSTRAP:
  -bb <#replicates>    Ultrafast bootstrap (>=1000)
  -bsam GENE|GENESITE  Resample GENE or GENE+SITE for partition (default: SITE)
  -wbt                 Write bootstrap trees to .ufboot file (default: none)
  -wbtl                Like -wbt but also writing branch lengths
  -nm <#iterations>    Maximum number of iterations (default: 1000)
  -nstep <#iterations> #Iterations for UFBoot stopping rule (default: 100)
  -bcor <min_corr>     Minimum correlation coefficient (default: 0.99)
  -beps <epsilon>      RELL epsilon to break tie (default: 0.5)
  -bnni                Optimize UFBoot trees by NNI on bootstrap alignment
  -j <jackknife>       Proportion of sites for jackknife (default: NONE)

STANDARD NON-PARAMETRIC BOOTSTRAP:
  -b <#replicates>     Bootstrap + ML tree + consensus tree (>=100)
  -bc <#replicates>    Bootstrap + consensus tree
  -bo <#replicates>    Bootstrap only

SINGLE BRANCH TEST:
  -alrt <#replicates>  SH-like approximate likelihood ratio test (SH-aLRT)
  -alrt 0              Parametric aLRT test (Anisimova and Gascuel 2006)
  -abayes              approximate Bayes test (Anisimova et al. 2011)
  -lbp <#replicates>   Fast local bootstrap probabilities

MODEL-FINDER:
  -m TESTONLY          Standard model selection (like jModelTest, ProtTest)
  -m TEST              Standard model selection followed by tree inference
  -m MF                Extended model selection with FreeRate heterogeneity
  -m MFP               Extended model selection followed by tree inference
  -m TESTMERGEONLY     Find best partition scheme (like PartitionFinder)
  -m TESTMERGE         Find best partition scheme followed by tree inference
  -m MF+MERGE          Find best partition scheme incl. FreeRate heterogeneity
  -m MFP+MERGE         Like -m MF+MERGE followed by tree inference
  -rcluster <percent>  Percentage of partition pairs (relaxed clustering alg.)
  -rclusterf <perc.>   Percentage of partition pairs (fast relaxed clustering)
  -rcluster-max <num>  Max number of partition pairs (default: 10*#partitions)
  -mset program        Restrict search to models supported by other programs
                       (raxml, phyml or mrbayes)
  -mset <lm-subset>    Restrict search to a subset of the Lie-Markov models
                       Options for lm-subset are:
                       liemarkov, liemarkovry, liemarkovws, liemarkovmk, strandsymmetric
  -mset m1,...,mk      Restrict search to models in a comma-separated list
                       (e.g. -mset WAG,LG,JTT)
  -msub source         Restrict search to AA models for specific sources
                       (nuclear, mitochondrial, chloroplast or viral)
  -mfreq f1,...,fk     Restrict search to using a list of state frequencies
                       (default AA: -mfreq FU,F; codon: -mfreq ,F1x4,F3x4,F)
  -mrate r1,...,rk     Restrict search to a list of rate-across-sites models
                       (e.g. -mrate E,I,G,I+G,R is used for -m MF)
  -cmin <kmin>         Min #categories for FreeRate model [+R] (default: 2)
  -cmax <kmax>         Max #categories for FreeRate model [+R] (default: 10)
  -merit AIC|AICc|BIC  Optimality criterion to use (default: all)
  -mtree               Perform full tree search for each model considered
  -mredo               Ignore model results computed earlier (default: reuse)
  -madd mx1,...,mxk    List of mixture models to also consider
  -mdef <nexus_file>   A model definition NEXUS file (see Manual)

SUBSTITUTION MODEL:
  -m <model_name>
                  DNA: HKY (default), JC, F81, K2P, K3P, K81uf, TN/TrN, TNef,
                       TIM, TIMef, TVM, TVMef, SYM, GTR, or 6-digit model
                       specification (e.g., 010010 = HKY)
              Protein: LG (default), Poisson, cpREV, mtREV, Dayhoff, mtMAM,
                       JTT, WAG, mtART, mtZOA, VT, rtREV, DCMut, PMB, HIVb,
                       HIVw, JTTDCMut, FLU, Blosum62, GTR20, mtMet, mtVer, mtInv
      Protein mixture: C10,...,C60, EX2, EX3, EHO, UL2, UL3, EX_EHO, LG4M, LG4X
               Binary: JC2 (default), GTR2
      Empirical codon: KOSI07, SCHN05
    Mechanistic codon: GY (default), MG, MGK, GY0K, GY1KTS, GY1KTV, GY2K,
                       MG1KTS, MG1KTV, MG2K
 Semi-empirical codon: XX_YY where XX is empirical and YY is mechanistic model
       Morphology/SNP: MK (default), ORDERED, GTR
       Lie Markov DNA: One of the following, optionally prefixed by RY, WS or MK:
                       1.1,  2.2b, 3.3a, 3.3b,  3.3c,
                       3.4,  4.4a, 4.4b, 4.5a,  4.5b,
                       5.6a, 5.6b, 5.7a, 5.7b,  5.7c,
                       5.11a,5.11b,5.11c,5.16,  6.6,
                       6.7a, 6.7b, 6.8a, 6.8b,  6.17a,
                       6.17b,8.8,  8.10a,8.10b, 8.16,
                       8.17, 8.18, 9.20a,9.20b,10.12,
                       10.34,12.12
       Non-reversible: STRSYM (strand symmetric model, synonymous with WS6.6)
       Non-reversible: UNREST (most general unrestricted model, functionally equivalent to 12.12)
       Models can have parameters appended in brackets.
           e.g. '-mRY3.4{0.2,-0.3}+I' specifies parameters for
           RY3.4 model but leaves proportion of invariant sites
           unspecified. '-mRY3.4{0.2,-0.3}+I{0.5} gives both.
           When this is done, the given parameters will be taken
           as fixed (default) or as start point for optimization
           (if -optfromgiven option supplied)

        Otherwise: Name of file containing user-model parameters
                   (rate parameters and state frequencies)

STATE FREQUENCY:
  Append one of the following +F... to -m <model_name>
  +F                   Empirically counted frequencies from alignment
  +FO (letter-O)       Optimized frequencies by maximum-likelihood
  +FQ                  Equal frequencies
  +FRY, +FWS, +FMK     For DNA models only, +FRY is freq(A+G)=1/2=freq(C+T),
                       +FWS is freq(A+T)=1/2=freq(C+G), +FMK is freq(A+C)=1/2=freq(G+T).
  +F####               where # are digits - for DNA models only, for basis in ACGT order,
                       digits indicate which frequencies are constrained to be the same.
                       E.g. +F1221 means freq(A)=freq(T), freq(C)=freq(G).
  +FU                  Amino-acid frequencies by the given protein matrix
  +F1x4 (codon model)  Equal NT frequencies over three codon positions
  +F3x4 (codon model)  Unequal NT frequencies over three codon positions

MIXTURE MODEL:
  -m "MIX{model1,...,modelK}"   Mixture model with K components
  -m "FMIX{freq1,...freqK}"     Frequency mixture model with K components
  -mwopt               Turn on optimizing mixture weights (default: none)

RATE HETEROGENEITY AMONG SITES:
  -m modelname+I       A proportion of invariable sites
  -m modelname+G[n]    Discrete Gamma model with n categories (default n=4)
  -m modelname*G[n]    Discrete Gamma model with unlinked model parameters
  -m modelname+I+G[n]  Invariable sites plus Gamma model with n categories
  -m modelname+R[n]    FreeRate model with n categories (default n=4)
  -m modelname*R[n]    FreeRate model with unlinked model parameters
  -m modelname+I+R[n]  Invariable sites plus FreeRate model with n categories
  -m modelname+Hn      Heterotachy model with n classes
  -m modelname*Hn      Heterotachy model with n classes and unlinked parameters
  -a <Gamma_shape>     Gamma shape parameter for site rates (default: estimate)
  -amin <min_shape>    Min Gamma shape parameter for site rates (default: 0.02)
  -gmedian             Median approximation for +G site rates (default: mean)
  --opt-gamma-inv      More thorough estimation for +I+G model parameters
  -i <p_invar>         Proportion of invariable sites (default: estimate)
  -wsr                 Write site rates to .rate file
  -mh                  Computing site-specific rates to .mhrate file using
                       Meyer & von Haeseler (2003) method

POLYMORPHISM AWARE MODELS (PoMo):
 -s <counts_file>      Input counts file (see manual)
 -m <MODEL>+P          DNA substitution model (see above) used with PoMo
   +N<POPSIZE>         Virtual population size (default: 9)
   +[WB|WH|S]          Sampling method (default: +WB), WB: Weighted binomial,
                       WH: Weighted hypergeometric S: Sampled sampling
   +G[n]               Discrete Gamma rate model with n categories (default n=4)

ASCERTAINMENT BIAS CORRECTION:
  -m modelname+ASC     Correction for absence of invariant sites in alignment

SINGLE TOPOLOGY HETEROTACHY MODEL:
 -m <model_name>+H[k]  Heterotachy model mixed branch lengths with k classes
 -m "MIX{m1,...mK}+H"
 -nni-eval <m>         Loop m times for NNI evaluation (default m=1)

SITE-SPECIFIC FREQUENCY MODEL:
  -ft <tree_file>      Input tree to infer site frequency model
  -fs <in_freq_file>   Input site frequency model file
  -fmax                Posterior maximum instead of mean approximation

CONSENSUS RECONSTRUCTION:
  -t <tree_file>       Set of input trees for consensus reconstruction
  -minsup <threshold>  Min split support in range [0,1]; 0.5 for majority-rule
                       consensus (default: 0, i.e. extended consensus)
  -bi <burnin>         Discarding <burnin> trees at beginning of <treefile>
  -con                 Computing consensus tree to .contree file
  -net                 Computing consensus network to .nex file
  -sup <target_tree>   Assigning support values for <target_tree> to .suptree
  -suptag <name>       Node name (or ALL) to assign tree IDs where node occurs

ROBINSON-FOULDS DISTANCE:
  -rf_all              Computing all-to-all RF distances of trees in <treefile>
  -rf <treefile2>      Computing all RF distances between two sets of trees
                       stored in <treefile> and <treefile2>
  -rf_adj              Computing RF distances of adjacent trees in <treefile>

TREE TOPOLOGY TEST:
  -z <trees_file>      Evaluating a set of user trees
  -zb <#replicates>    Performing BP,KH,SH,ELW tests for trees passed via -z
  -zw                  Also performing weighted-KH and weighted-SH tests
  -au                  Also performing approximately unbiased (AU) test

ANCESTRAL STATE RECONSTRUCTION:
  -asr                 Ancestral state reconstruction by empirical Bayes
  -asr-min <prob>      Min probability of ancestral state (default: equil freq)

GENERATING RANDOM TREES:
  -r <num_taxa>        Create a random tree under Yule-Harding model
  -ru <num_taxa>       Create a random tree under Uniform model
  -rcat <num_taxa>     Create a random caterpillar tree
  -rbal <num_taxa>     Create a random balanced tree
  -rcsg <num_taxa>     Create a random circular split network
  -rlen <min_len> <mean_len> <max_len>  
                       min, mean, and max branch lengths of random trees

MISCELLANEOUS:
  -wt                  Write locally optimal trees into .treels file
  -blfix               Fix branch lengths of user tree passed via -te
  -blscale             Scale branch lengths of user tree passed via -t
  -blmin               Min branch length for optimization (default 0.000001)
  -blmax               Max branch length for optimization (default 100)
  -wsr                 Write site rates and categories to .rate file
  -wsl                 Write site log-likelihoods to .sitelh file
  -wslr                Write site log-likelihoods per rate category
  -wslm                Write site log-likelihoods per mixture class
  -wslmr               Write site log-likelihoods per mixture+rate class
  -wspr                Write site probabilities per rate category
  -wspm                Write site probabilities per mixture class
  -wspmr               Write site probabilities per mixture+rate class
  -wpl                 Write partition log-likelihoods to .partlh file
  -fconst f1,...,fN    Add constant patterns into alignment (N=#nstates)
  -me <epsilon>        LogL epsilon for parameter estimation (default 0.01)
  --no-outfiles        Suppress printing output files
  --eigenlib           Use Eigen3 library
  -alninfo             Print alignment sites statistics to .alninfo
  -czb                 Collapse zero branches in final tree
  --show-lh            Compute tree likelihood without optimisation
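
As an illustration of how the options listed above combine, the sketch below prints three command lines for version 1.6.12 instead of running them, so they can be checked before being pasted into a job script. The alignment name example.phy, the prefix example_run, and the helper show are placeholders. The replicate counts are the minimums given in the help text, not recommendations for any particular data set.

```shell
# Hypothetical inputs: replace with your own alignment and output prefix.
ALN=example.phy
PREFIX=example_run

# Print a command line instead of executing it (dry run).
show() { printf '%s\n' "$*"; }

# 1. Extended model selection only (-m MF), no tree inference.
show iqtree -s "$ALN" -m MF -pre "${PREFIX}.mf"

# 2. Model selection followed by tree inference (-m MFP), with 1000
#    ultrafast bootstrap replicates and 1000 SH-aLRT replicates.
show iqtree -s "$ALN" -m MFP -bb 1000 -alrt 1000 -pre "${PREFIX}.ufboot"

# 3. Fixed model (GTR, invariable sites, Gamma with 4 categories) and
#    100 standard non-parametric bootstrap replicates.
show iqtree -s "$ALN" -m GTR+I+G4 -b 100 -pre "${PREFIX}.boot"
```

Giving each analysis its own -pre prefix keeps the output files of the three runs from overwriting one another.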


Installation

IQ-Tree


System

64-bit Linux