IQ-Tree-Teaching
Revision as of 11:09, 7 January 2020
Category
Bioinformatics
Program On
Teaching
Version
1.5.5, 1.6.5
Author / Distributor
Please see http://www.iqtree.org/
Description
Efficient phylogenomic software by maximum likelihood. More information: http://www.iqtree.org/
Running Program
- Version 1.5.5, installed in /usr/local/apps/eb/IQ-TREE/1.5.5-foss-2016b-omp-mpi
To use this version of IQ-TREE, please first load the module with
module load IQ-TREE/1.5.5-foss-2016b-omp-mpi
- Version 1.6.5, installed in /usr/local/apps/eb/IQ-TREE/1.6.5-omp
To use this version of IQ-TREE, please first load the module with
module load IQ-TREE/1.6.5-omp
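Before submitting a job, it can help to confirm that the chosen module loads and reports the expected version. The following is a minimal sketch, assuming an interactive shell on the teaching cluster where the `module` command is available; the variable name `IQTREE_MODULE` is only illustrative, and `-version` is the flag listed in the program's help output.

```shell
# Module name taken from the list above (version 1.6.5).
IQTREE_MODULE="IQ-TREE/1.6.5-omp"

# Only try to load the module where the 'module' command actually exists.
if type module >/dev/null 2>&1; then
    module load "$IQTREE_MODULE"
    iqtree -version   # prints the IQ-TREE version number
else
    echo "The 'module' command was not found; run this on the teaching cluster." >&2
fi
```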
Sample job submission script (sub.sh) to run IQ-TREE version 1.6.5:
#!/bin/bash
#SBATCH --job-name=j_IQTREE
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu
#SBATCH --ntasks=1
#SBATCH --mem=2gb
#SBATCH --time=04:00:00
#SBATCH --output=IQTREE.%j.out
#SBATCH --error=IQTREE.%j.err
cd $SLURM_SUBMIT_DIR
module load IQ-TREE/1.6.5-omp
iqtree [options]
In a real submission script, values such as the e-mail address, number of tasks, memory, and time limit shown above need to be reviewed and replaced with values appropriate for your job.
Please refer to Running_Jobs_on_the_teaching_cluster, Run X window Jobs and Run interactive Jobs for more details on running jobs on the teaching cluster.
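As an illustration of replacing those values, the sketch below writes a filled-in variant of the script to a file. The file name `sub_example.sh`, the alignment `example.phy`, and the request for 4 cores via `--cpus-per-task` are assumptions for this example, not site recommendations; the `iqtree` options used (`-s`, `-m MFP`, `-bb`, `-nt`) are those listed in the help output under Documentation.

```shell
# Write an example submission script; the quoted 'EOF' keeps $VARIABLES unexpanded in the file.
cat > sub_example.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=j_IQTREE
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=2gb
#SBATCH --time=04:00:00
#SBATCH --output=IQTREE.%j.out
#SBATCH --error=IQTREE.%j.err

cd "$SLURM_SUBMIT_DIR"
module load IQ-TREE/1.6.5-omp

# example.phy is a hypothetical alignment; -m MFP selects a model and then infers the tree,
# -bb 1000 runs 1000 ultrafast bootstrap replicates, -nt uses the cores SLURM allocated.
iqtree -s example.phy -m MFP -bb 1000 -nt "$SLURM_CPUS_PER_TASK"
EOF

# Syntax-check the generated script without running it.
bash -n sub_example.sh && echo "sub_example.sh is syntactically valid"
```

The generated file would then be submitted with `sbatch ./sub_example.sh`. Tying `-nt` to `--cpus-per-task` is one way to keep IQ-TREE from starting more threads than SLURM has allocated.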
Here is an example of a job submission command:
sbatch ./sub.sh
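After submission, the job's state and log files can be checked from the same directory. The sketch below assumes the sample `sub.sh` above and the standard SLURM client commands `sbatch --parsable` and `squeue`; the helper name `iqtree_logs` is hypothetical and simply spells out how the `%j` in the `--output` and `--error` patterns expands to the job ID.

```shell
# Print the log file names produced by --output=IQTREE.%j.out and --error=IQTREE.%j.err
iqtree_logs() {
    printf 'IQTREE.%s.out\nIQTREE.%s.err\n' "$1" "$1"
}

# Only attempt a submission where SLURM and the script are actually present.
if command -v sbatch >/dev/null 2>&1 && [ -f ./sub.sh ]; then
    if jobid=$(sbatch --parsable ./sub.sh); then
        jobid=${jobid%%;*}      # --parsable may print "jobid;cluster"
        squeue -j "$jobid"      # show the job's current queue state
        iqtree_logs "$jobid"    # files to inspect once the job has started
    fi
fi
```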
Documentation
module load IQ-TREE/1.6.5-omp
iqtree -h

IQ-TREE multicore version 1.6.5 for Linux 64-bit built May 8 2018
Developed by Bui Quang Minh, Nguyen Lam Tung, Olga Chernomor,
Heiko Schmidt, Dominik Schrempf, Michael Woodhams.

Usage: iqtree -s <alignment> [OPTIONS]

GENERAL OPTIONS:
  -? or -h               Print this help dialog
  -version               Display version number
  -s <alignment>         Input alignment in PHYLIP/FASTA/NEXUS/CLUSTAL/MSF format
  -st <data_type>        BIN, DNA, AA, NT2AA, CODON, MORPH (default: auto-detect)
  -q <partition_file>    Edge-linked partition model (file in NEXUS/RAxML format)
  -spp <partition_file>  Like -q option but allowing partition-specific rates
  -sp <partition_file>   Edge-unlinked partition model (like -M option of RAxML)
  -t <start_tree_file> or -t BIONJ or -t RANDOM
                         Starting tree (default: 99 parsimony tree and BIONJ)
  -te <user_tree_file>   Like -t but fixing user tree (no tree search performed)
  -o <outgroup_taxon>    Outgroup taxon name for writing .treefile
  -pre <PREFIX>          Prefix for all output files (default: aln/partition)
  -nt <num_threads>      Number of cores/threads or AUTO for automatic detection
  -ntmax <max_threads>   Max number of threads by -nt AUTO (default: #CPU cores)
  -seed <number>         Random seed number, normally used for debugging purpose
  -v, -vv, -vvv          Verbose mode, printing more messages to screen
  -quiet                 Quiet mode, suppress printing to screen (stdout)
  -keep-ident            Keep identical sequences (default: remove & finally add)
  -safe                  Safe likelihood kernel to avoid numerical underflow
  -mem RAM               Maximal RAM usage for memory saving mode
  --runs NUMBER          Number of indepedent runs (default: 1)

CHECKPOINTING TO RESUME STOPPED RUN:
  -redo                  Redo analysis even for successful runs (default: resume)
  -cptime <seconds>      Minimum checkpoint time interval (default: 60 sec)

LIKELIHOOD MAPPING ANALYSIS:
  -lmap <#quartets>      Number of quartets for likelihood mapping analysis
  -lmclust <clustfile>   NEXUS file containing clusters for likelihood mapping
  -wql                   Print quartet log-likelihoods to .quartetlh file

NEW STOCHASTIC TREE SEARCH ALGORITHM:
  -ninit <number>        Number of initial parsimony trees (default: 100)
  -ntop <number>         Number of top initial trees (default: 20)
  -nbest <number>        Number of best trees retained during search (defaut: 5)
  -n <#iterations>       Fix number of iterations to stop (default: auto)
  -nstop <number>        Number of unsuccessful iterations to stop (default: 100)
  -pers <proportion>     Perturbation strength for randomized NNI (default: 0.5)
  -sprrad <number>       Radius for parsimony SPR search (default: 6)
  -allnni                Perform more thorough NNI search (default: off)
  -g <constraint_tree>   (Multifurcating) topological constraint tree file
  -fast                  Fast search to resemble FastTree

ULTRAFAST BOOTSTRAP:
  -bb <#replicates>      Ultrafast bootstrap (>=1000)
  -bsam GENE|GENESITE    Resample GENE or GENE+SITE for partition (default: SITE)
  -wbt                   Write bootstrap trees to .ufboot file (default: none)
  -wbtl                  Like -wbt but also writing branch lengths
  -nm <#iterations>      Maximum number of iterations (default: 1000)
  -nstep <#iterations>   #Iterations for UFBoot stopping rule (default: 100)
  -bcor <min_corr>       Minimum correlation coefficient (default: 0.99)
  -beps <epsilon>        RELL epsilon to break tie (default: 0.5)
  -bnni                  Optimize UFBoot trees by NNI on bootstrap alignment
  -j <jackknife>         Proportion of sites for jackknife (default: NONE)

STANDARD NON-PARAMETRIC BOOTSTRAP:
  -b <#replicates>       Bootstrap + ML tree + consensus tree (>=100)
  -bc <#replicates>      Bootstrap + consensus tree
  -bo <#replicates>      Bootstrap only

SINGLE BRANCH TEST:
  -alrt <#replicates>    SH-like approximate likelihood ratio test (SH-aLRT)
  -alrt 0                Parametric aLRT test (Anisimova and Gascuel 2006)
  -abayes                approximate Bayes test (Anisimova et al. 2011)
  -lbp <#replicates>     Fast local bootstrap probabilities

MODEL-FINDER:
  -m TESTONLY            Standard model selection (like jModelTest, ProtTest)
  -m TEST                Standard model selection followed by tree inference
  -m MF                  Extended model selection with FreeRate heterogeneity
  -m MFP                 Extended model selection followed by tree inference
  -m TESTMERGEONLY       Find best partition scheme (like PartitionFinder)
  -m TESTMERGE           Find best partition scheme followed by tree inference
  -m MF+MERGE            Find best partition scheme incl. FreeRate heterogeneity
  -m MFP+MERGE           Like -m MF+MERGE followed by tree inference
  -rcluster <percent>    Percentage of partition pairs (relaxed clustering alg.)
  -rclusterf <perc.>     Percentage of partition pairs (fast relaxed clustering)
  -rcluster-max <num>    Max number of partition pairs (default: 10*#partitions)
  -mset program          Restrict search to models supported by other programs
                         (raxml, phyml or mrbayes)
  -mset <lm-subset>      Restrict search to a subset of the Lie-Markov models
                         Options for lm-subset are:
                         liemarkov, liemarkovry, liemarkovws, liemarkovmk, strandsymmetric
  -mset m1,...,mk        Restrict search to models in a comma-separated list
                         (e.g. -mset WAG,LG,JTT)
  -msub source           Restrict search to AA models for specific sources
                         (nuclear, mitochondrial, chloroplast or viral)
  -mfreq f1,...,fk       Restrict search to using a list of state frequencies
                         (default AA: -mfreq FU,F; codon: -mfreq ,F1x4,F3x4,F)
  -mrate r1,...,rk       Restrict search to a list of rate-across-sites models
                         (e.g. -mrate E,I,G,I+G,R is used for -m MF)
  -cmin <kmin>           Min #categories for FreeRate model [+R] (default: 2)
  -cmax <kmax>           Max #categories for FreeRate model [+R] (default: 10)
  -merit AIC|AICc|BIC    Optimality criterion to use (default: all)
  -mtree                 Perform full tree search for each model considered
  -mredo                 Ignore model results computed earlier (default: reuse)
  -madd mx1,...,mxk      List of mixture models to also consider
  -mdef <nexus_file>     A model definition NEXUS file (see Manual)

SUBSTITUTION MODEL:
  -m <model_name>
                  DNA: HKY (default), JC, F81, K2P, K3P, K81uf, TN/TrN, TNef,
                       TIM, TIMef, TVM, TVMef, SYM, GTR, or 6-digit model
                       specification (e.g., 010010 = HKY)
              Protein: LG (default), Poisson, cpREV, mtREV, Dayhoff, mtMAM,
                       JTT, WAG, mtART, mtZOA, VT, rtREV, DCMut, PMB, HIVb,
                       HIVw, JTTDCMut, FLU, Blosum62, GTR20, mtMet, mtVer, mtInv
      Protein mixture: C10,...,C60, EX2, EX3, EHO, UL2, UL3, EX_EHO, LG4M, LG4X
               Binary: JC2 (default), GTR2
      Empirical codon: KOSI07, SCHN05
    Mechanistic codon: GY (default), MG, MGK, GY0K, GY1KTS, GY1KTV, GY2K,
                       MG1KTS, MG1KTV, MG2K
 Semi-empirical codon: XX_YY where XX is empirical and YY is mechanistic model
       Morphology/SNP: MK (default), ORDERED, GTR
       Lie Markov DNA: One of the following, optionally prefixed by RY, WS or MK:
                       1.1, 2.2b, 3.3a, 3.3b, 3.3c,
                       3.4, 4.4a, 4.4b, 4.5a, 4.5b,
                       5.6a, 5.6b, 5.7a, 5.7b, 5.7c,
                       5.11a,5.11b,5.11c,5.16, 6.6,
                       6.7a, 6.7b, 6.8a, 6.8b, 6.17a,
                       6.17b,8.8, 8.10a,8.10b, 8.16,
                       8.17, 8.18, 9.20a,9.20b,10.12,
                       10.34,12.12
       Non-reversible: STRSYM (strand symmetric model, synonymous with WS6.6)
       Non-reversible: UNREST (most general unrestricted model, functionally
                       equivalent to 12.12)
                       Models can have parameters appended in brackets.
                       e.g. '-mRY3.4{0.2,-0.3}+I' specifies parameters for
                       RY3.4 model but leaves proportion of invariant sites
                       unspecified. '-mRY3.4{0.2,-0.3}+I{0.5} gives both.
                       When this is done, the given parameters will be taken
                       as fixed (default) or as start point for optimization
                       (if -optfromgiven option supplied)
            Otherwise: Name of file containing user-model parameters
                       (rate parameters and state frequencies)

STATE FREQUENCY:
  Append one of the following +F... to -m <model_name>
  +F                     Empirically counted frequencies from alignment
  +FO (letter-O)         Optimized frequencies by maximum-likelihood
  +FQ                    Equal frequencies
  +FRY, +FWS, +FMK       For DNA models only, +FRY is freq(A+G)=1/2=freq(C+T),
                         +FWS is freq(A+T)=1/2=freq(C+G),
                         +FMK is freq(A+C)=1/2=freq(G+T).
  +F####                 where # are digits - for DNA models only, for basis in
                         ACGT order, digits indicate which frequencies are
                         constrained to be the same.
                         E.g. +F1221 means freq(A)=freq(T), freq(C)=freq(G).
  +FU                    Amino-acid frequencies by the given protein matrix
  +F1x4 (codon model)    Equal NT frequencies over three codon positions
  +F3x4 (codon model)    Unequal NT frequencies over three codon positions

MIXTURE MODEL:
  -m "MIX{model1,...,modelK}"   Mixture model with K components
  -m "FMIX{freq1,...freqK}"     Frequency mixture model with K components
  -mwopt                        Turn on optimizing mixture weights (default: none)

RATE HETEROGENEITY AMONG SITES:
  -m modelname+I         A proportion of invariable sites
  -m modelname+G[n]      Discrete Gamma model with n categories (default n=4)
  -m modelname*G[n]      Discrete Gamma model with unlinked model parameters
  -m modelname+I+G[n]    Invariable sites plus Gamma model with n categories
  -m modelname+R[n]      FreeRate model with n categories (default n=4)
  -m modelname*R[n]      FreeRate model with unlinked model parameters
  -m modelname+I+R[n]    Invariable sites plus FreeRate model with n categories
  -m modelname+Hn        Heterotachy model with n classes
  -m modelname*Hn        Heterotachy model with n classes and unlinked parameters
  -a <Gamma_shape>       Gamma shape parameter for site rates (default: estimate)
  -amin <min_shape>      Min Gamma shape parameter for site rates (default: 0.02)
  -gmedian               Median approximation for +G site rates (default: mean)
  --opt-gamma-inv        More thorough estimation for +I+G model parameters
  -i <p_invar>           Proportion of invariable sites (default: estimate)
  -wsr                   Write site rates to .rate file
  -mh                    Computing site-specific rates to .mhrate file using
                         Meyer & von Haeseler (2003) method

POLYMORPHISM AWARE MODELS (PoMo):
  -s <counts_file>       Input counts file (see manual)
  -m <MODEL>+P           DNA substitution model (see above) used with PoMo
  +N<POPSIZE>            Virtual population size (default: 9)
  +[WB|WH|S]             Sampling method (default: +WB), WB: Weighted binomial,
                         WH: Weighted hypergeometric S: Sampled sampling
  +G[n]                  Discrete Gamma rate model with n categories (default n=4)

ASCERTAINMENT BIAS CORRECTION:
  -m modelname+ASC       Correction for absence of invariant sites in alignment

SINGLE TOPOLOGY HETEROTACHY MODEL:
  -m <model_name>+H[k]   Heterotachy model mixed branch lengths with k classes
  -m "MIX{m1,...mK}+H"
  -nni-eval <m>          Loop m times for NNI evaluation (default m=1)

SITE-SPECIFIC FREQUENCY MODEL:
  -ft <tree_file>        Input tree to infer site frequency model
  -fs <in_freq_file>     Input site frequency model file
  -fmax                  Posterior maximum instead of mean approximation

CONSENSUS RECONSTRUCTION:
  -t <tree_file>         Set of input trees for consensus reconstruction
  -minsup <threshold>    Min split support in range [0,1]; 0.5 for majority-rule
                         consensus (default: 0, i.e. extended consensus)
  -bi <burnin>           Discarding <burnin> trees at beginning of <treefile>
  -con                   Computing consensus tree to .contree file
  -net                   Computing consensus network to .nex file
  -sup <target_tree>     Assigning support values for <target_tree> to .suptree
  -suptag <name>         Node name (or ALL) to assign tree IDs where node occurs

ROBINSON-FOULDS DISTANCE:
  -rf_all                Computing all-to-all RF distances of trees in <treefile>
  -rf <treefile2>        Computing all RF distances between two sets of trees
                         stored in <treefile> and <treefile2>
  -rf_adj                Computing RF distances of adjacent trees in <treefile>

TREE TOPOLOGY TEST:
  -z <trees_file>        Evaluating a set of user trees
  -zb <#replicates>      Performing BP,KH,SH,ELW tests for trees passed via -z
  -zw                    Also performing weighted-KH and weighted-SH tests
  -au                    Also performing approximately unbiased (AU) test

ANCESTRAL STATE RECONSTRUCTION:
  -asr                   Ancestral state reconstruction by empirical Bayes
  -asr-min <prob>        Min probability of ancestral state (default: equil freq)

GENERATING RANDOM TREES:
  -r <num_taxa>          Create a random tree under Yule-Harding model
  -ru <num_taxa>         Create a random tree under Uniform model
  -rcat <num_taxa>       Create a random caterpillar tree
  -rbal <num_taxa>       Create a random balanced tree
  -rcsg <num_taxa>       Create a random circular split network
  -rlen <min_len> <mean_len> <max_len>
                         min, mean, and max branch lengths of random trees

MISCELLANEOUS:
  -wt                    Write locally optimal trees into .treels file
  -blfix                 Fix branch lengths of user tree passed via -te
  -blscale               Scale branch lengths of user tree passed via -t
  -blmin                 Min branch length for optimization (default 0.000001)
  -blmax                 Max branch length for optimization (default 100)
  -wsr                   Write site rates and categories to .rate file
  -wsl                   Write site log-likelihoods to .sitelh file
  -wslr                  Write site log-likelihoods per rate category
  -wslm                  Write site log-likelihoods per mixture class
  -wslmr                 Write site log-likelihoods per mixture+rate class
  -wspr                  Write site probabilities per rate category
  -wspm                  Write site probabilities per mixture class
  -wspmr                 Write site probabilities per mixture+rate class
  -wpl                   Write partition log-likelihoods to .partlh file
  -fconst f1,...,fN      Add constant patterns into alignment (N=#nstates)
  -me <epsilon>          LogL epsilon for parameter estimation (default 0.01)
  --no-outfiles          Suppress printing output files
  --eigenlib             Use Eigen3 library
  -alninfo               Print alignment sites statistics to .alninfo
  -czb                   Collapse zero branches in final tree
  --show-lh              Compute tree likelihood without optimisation
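A handful of the options above cover most routine analyses. The following is a sketch of typical command lines, printed rather than executed by default. The `run` helper, the `DRY_RUN` and `ALN` variables, the file name `example.phy`, the choice of a DNA alignment, and the output prefixes are all assumptions for illustration; the flags themselves are taken from the help text above.

```shell
# Echo the command when DRY_RUN=1 (the default here); execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

ALN="example.phy"   # hypothetical DNA alignment file

# Standard model selection only (no tree search)
run iqtree -s "$ALN" -m TESTONLY -pre modeltest

# Extended model selection, tree inference, 1000 ultrafast bootstrap
# replicates and 1000 SH-aLRT replicates, on one thread
run iqtree -s "$ALN" -m MFP -bb 1000 -alrt 1000 -nt 1 -pre ufboot

# Fixed GTR+I+G model with 100 standard non-parametric bootstrap replicates
run iqtree -s "$ALN" -m GTR+I+G -b 100 -nt 1 -pre stdboot
```

Giving each analysis its own `-pre` prefix keeps the output files of different runs apart. Because the help text lists resuming as the default behaviour (see `-redo`), resubmitting a job that hit the time limit with the same prefix should continue the earlier run rather than start over.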
Installation
System
64-bit Linux