RAxML-NG-Sapelo2
Latest revision as of 09:00, 22 October 2024
Category
Bioinformatics
Program On
Sapelo2
Version
1.2.2
Author / Distributor
Alexey M. Kozlov and Alexandros Stamatakis
Description
RAxML-NG is a phylogenetic tree inference tool that uses the maximum-likelihood (ML) optimality criterion. Its search heuristic is based on iteratively performing a series of Subtree Pruning and Regrafting (SPR) moves, which allows it to quickly navigate to the best-known ML tree.
Running Program
Also refer to Running Jobs on Sapelo2
For more information on Environment Modules on Sapelo2 please see the Lmod page.
RAxML-NG
To use RAxML-NG, please first load the module with:
module load RAxML-NG/1.2.2-GCC-12.2.0
Keep in mind that because this version doesn't use MPI, a job cannot run across multiple nodes, though it can use multiple threads on a single node.
Example of shell script to run in the batch queue, using 8 cores on a single node:
#!/bin/bash
#SBATCH --job-name=j_RAxML-NG
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8 #this needs to be the same as the '--threads' option below!
#SBATCH --mem=64GB
#SBATCH --time=08:00:00
#SBATCH --output=RAxML-NG.%j.out
#SBATCH --error=RAxML-NG.%j.err
cd $SLURM_SUBMIT_DIR
ml RAxML-NG/1.2.2-GCC-12.2.0
raxml-ng --all --msa testAA.fa --model LG+G8+F --tree pars{10} --bs-trees 200 --threads 8
In a real submission script, at least the values underlined in the example above need to be reviewed and, where necessary, replaced with the proper values for your job.
Please refer to Running Jobs on Sapelo2.
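As the comment on the --cpus-per-task line notes, the thread count appears in two places and the two values must agree. One way to avoid a mismatch is to read the count from the environment instead of typing it twice. The following is a minimal sketch, assuming the job runs under Slurm, which sets SLURM_CPUS_PER_TASK when --cpus-per-task is requested. The variable name threads, the fallback to a single thread, and the command -v guard are illustrative additions and not part of the original script.

```shell
# Take the thread count from the Slurm allocation; fall back to 1 outside a job.
threads="${SLURM_CPUS_PER_TASK:-1}"
echo "Using $threads thread(s)"

# Run the same analysis as in the script above, but only if the module
# has been loaded and raxml-ng is therefore on the PATH.
if command -v raxml-ng >/dev/null 2>&1; then
    raxml-ng --all --msa testAA.fa --model LG+G8+F --tree 'pars{10}' --bs-trees 200 --threads "$threads"
else
    echo "raxml-ng not found; load RAxML-NG/1.2.2-GCC-12.2.0 first" >&2
fi
```

With this form, changing --cpus-per-task in the #SBATCH header is enough; the --threads value follows it automatically.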
Documentation
ml RAxML-NG/1.2.2-GCC-12.2.0
raxml-ng --help

RAxML-NG v. 1.2.2-master released on 30.04.2024 by The Exelixis Lab.
Developed by: Alexey M. Kozlov and Alexandros Stamatakis.
Contributors: Diego Darriba, Tomas Flouri, Benoit Morel, Sarah Lutteropp, Ben Bettisworth, Julia Haag, Anastasis Togkousidis.
Latest version: https://github.com/amkozlov/raxml-ng
Questions/problems/suggestions? Please visit: https://groups.google.com/forum/#!forum/raxml

Usage: raxml-ng [OPTIONS]

Commands (mutually exclusive):
  --help                                     display help information
  --version                                  display version information
  --evaluate                                 evaluate the likelihood of a tree (with model+brlen optimization)
  --search                                   ML tree search (default: 10 parsimony + 10 random starting trees)
  --bootstrap                                bootstrapping (default: use bootstopping to auto-detect #replicates)
  --all                                      all-in-one (ML search + bootstrapping)
  --support                                  compute bipartition support for a given reference tree (e.g., best ML tree)
                                             and a set of replicate trees (e.g., from a bootstrap analysis)
  --bsconverge                               test for bootstrapping convergence using autoMRE criterion
  --bsmsa                                    generate bootstrap replicate MSAs
  --terrace                                  check whether a tree lies on a phylogenetic terrace
  --check                                    check alignment correctness and remove empty columns/rows
  --parse                                    parse alignment, compress patterns and create binary MSA file
  --start                                    generate parsimony/random starting trees and exit
  --rfdist                                   compute pair-wise Robinson-Foulds (RF) distances between trees
  --consense [ STRICT | MR | MR<n> | MRE ]   build strict, majority-rule (MR) or extended MR (MRE) consensus tree (default: MR)
                                             eg: --consense MR75 --tree bsrep.nw
  --ancestral                                ancestral state reconstruction at all inner nodes
  --sitelh                                   print per-site log-likelihood values

Command shortcuts (mutually exclusive):
  --search1                                  Alias for: --search --tree rand{1}
  --loglh                                    Alias for: --evaluate --opt-model off --opt-branches off --nofiles --log result
  --rf                                       Alias for: --rfdist --nofiles --log result

Input and output options:
  --tree rand{N} | pars{N} | FILE            starting tree: rand(om), pars(imony) or user-specified (newick file)
                                             N = number of trees (default: rand{10},pars{10})
  --msa FILE                                 alignment file
  --msa-format VALUE                         alignment file format: FASTA, PHYLIP, CATG or AUTO-detect (default)
  --data-type VALUE                          data type: DNA, AA, BIN(ary) or AUTO-detect (default)
  --tree-constraint FILE                     constraint tree
  --prefix STRING                            prefix for output files (default: MSA file name)
  --log VALUE                                log verbosity: ERROR,WARNING,RESULT,INFO,PROGRESS,DEBUG (default: PROGRESS)
  --redo                                     overwrite existing result files and ignore checkpoints (default: OFF)
  --nofiles                                  do not create any output files, print results to the terminal only
  --precision VALUE                          number of decimal places to print (default: 6)
  --outgroup o1,o2,..,oN                     comma-separated list of outgroup taxon names (it's just a drawing option!)
  --site-weights FILE                        file with MSA column weights (positive integers only!)

General options:
  --seed VALUE                               seed for pseudo-random number generator (default: current time)
  --pat-comp on | off                        alignment pattern compression (default: ON)
  --tip-inner on | off                       tip-inner case optimization (default: OFF)
  --site-repeats on | off                    use site repeats optimization, 10%-60% faster than tip-inner (default: ON)
  --threads VALUE                            number of parallel threads to use (default: 1)
  --workers VALUE                            number of tree searches to run in parallel (default: 1)
  --simd none | sse3 | avx | avx2            vector instruction set to use (default: auto-detect).
  --rate-scalers on | off                    use individual CLV scalers for each rate category (default: ON for >2000 taxa)
  --force [ <CHECKS> ]                       disable safety checks (please think twice!)

Model options:
  --model <name>+G[n]+<Freqs> | FILE         model specification OR partition file
  --brlen linked | scaled | unlinked         branch length linkage between partitions (default: scaled)
  --blmin VALUE                              minimum branch length (default: 1e-6)
  --blmax VALUE                              maximum branch length (default: 100)
  --blopt nr_fast    | nr_safe               branch length optimization method (default: nr_fast)
          nr_oldfast | nr_oldsafe
  --opt-model on | off                       ML optimization of all model parameters (default: ON)
  --opt-branches on | off                    ML optimization of all branch lengths (default: ON)
  --prob-msa on | off                        use probabilistic alignment (works with CATG and VCF)
  --lh-epsilon VALUE                         log-likelihood epsilon for optimization/tree search (default: 0.1)

Topology search options:
  --spr-radius VALUE                         SPR re-insertion radius for fast iterations (default: AUTO)
  --spr-cutoff VALUE | off                   relative LH cutoff for descending into subtrees (default: 1.0)
  --lh-epsilon-triplet VALUE                 log-likelihood epsilon for branch length triplet optimization (default: 1000)

Bootstrapping options:
  --bs-trees VALUE                           number of bootstraps replicates
  --bs-trees autoMRE{N}                      use MRE-based bootstrap convergence criterion, up to N replicates (default: 1000)
  --bs-trees FILE                            Newick file containing set of bootstrap replicate trees (with --support)
  --bs-cutoff VALUE                          cutoff threshold for the MRE-based bootstopping criteria (default: 0.03)
  --bs-metric fbp | tbe                      branch support metric: fbp = Felsenstein bootstrap (default), tbe = transfer distance
  --bs-write-msa on | off                    write all bootstrap alignments (default: OFF)

EXAMPLES:
  1. Perform tree inference on DNA alignment
     (10 random + 10 parsimony starting trees, general time-reversible model, ML estimate of substitution rates and
     nucleotide frequencies, discrete GAMMA model of rate heterogeneity with 4 categories):

     ./raxml-ng --msa testDNA.fa --model GTR+G

  2. Perform an all-in-one analysis (ML tree search + non-parametric bootstrap)
     (10 randomized parsimony starting trees, fixed empirical substitution matrix (LG),
     empirical aminoacid frequencies from alignment, 8 discrete GAMMA categories,
     200 bootstrap replicates):

     ./raxml-ng --all --msa testAA.fa --model LG+G8+F --tree pars{10} --bs-trees 200

  3. Optimize branch lengths and free model parameters on a fixed topology
     (using multiple partitions with proportional branch lengths)

     ./raxml-ng --evaluate --msa testAA.fa --model partitions.txt --tree test.tree --brlen scaled
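Two of the commands in the help output, --check and --parse, can be run on the input before a long job is submitted. The sketch below assumes the same testAA.fa alignment and LG+G8+F model as the example batch script. The variable names msa and model and the output prefixes check and parse are illustrative choices. Per the help text, --check validates the alignment and --parse writes a binary MSA file. The --parse log is also understood to report estimated memory requirements and a recommended thread count, which could inform the --mem and --cpus-per-task requests; treat that as an assumption to confirm against your own output.

```shell
msa="testAA.fa"     # alignment file (placeholder from the example above)
model="LG+G8+F"     # model string (placeholder from the example above)

if command -v raxml-ng >/dev/null 2>&1 && [ -f "$msa" ]; then
    # Validate the alignment first; only parse it if the check succeeds.
    raxml-ng --check --msa "$msa" --model "$model" --prefix check &&
        raxml-ng --parse --msa "$msa" --model "$model" --prefix parse
else
    echo "Skipping: raxml-ng is not on the PATH or $msa was not found" >&2
fi
```

Both steps are short compared with a full --all run, so they can usually be tried in an interactive session before the batch job is queued.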
Installation
Source code for RAxML 8.2.12 was downloaded from GitHub and compiled with the intel-2019b compilers and Intel MPI.
System
64-bit Linux