BEST Bayesian

From Research Computing Center Wiki
Revision as of 20:38, 7 February 2013 by Curtis E. Combs Jr.

Category

Bioinformatics

Program On

zcluster

Version

2.3.1

Author / Distributor

Liang Liu

Description

BEST (Bayesian Estimation of Species Trees), a program for estimating species trees from multilocus data; more details at the BEST website

Running Program

BEST is a free phylogenetics program that estimates the joint posterior distribution of gene trees and the species tree from multilocus molecular data. It accounts for deep coalescence but not for other issues such as horizontal transfer or gene duplication.

The code is a modification of the popular phylogenetics software package MrBayes.

The commands mbbest and /usr/local/BEST/latest/mbbest always point to the latest installed version.

Specific versions: /usr/local/BEST/2.3.1/mbbest and /usr/local/BEST/2.2/mbbest

Example of running the program interactively (see the documentation on running jobs on the interactive nodes):

ssh -Y inter2
mbbest
...
exit

When running MPI jobs on the rcluster interactive node (inter2), first create a file named host.list. Refer to Running an Interactive Job for details about host.list. The host.list file will look like this:


inter2
inter2
inter2
inter2
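
A host.list like the one above can also be generated rather than typed by hand. The following is a minimal sketch, not part of BEST or the cluster tooling: make_host_list is a hypothetical helper name, and it assumes all processes run on a single host.

```shell
#!/bin/bash
# make_host_list HOST NPROC [FILE]
# Hypothetical helper: writes HOST once per MPI process, one per line.
make_host_list() {
    mh_host="$1"
    mh_nproc="$2"
    mh_file="${3:-host.list}"
    : > "$mh_file"                      # create or truncate the file
    mh_i=0
    while [ "$mh_i" -lt "$mh_nproc" ]; do
        echo "$mh_host" >> "$mh_file"
        mh_i=$((mh_i + 1))
    done
}

# Write four entries for inter2, matching the listing above
make_host_list inter2 4 host.list
```

The resulting file has four lines, each reading inter2, and can be passed to mpirun with -machinefile host.list.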

Then run the mpirun command with:

ssh -Y inter2
mpirun -np 2 -machinefile host.list mbbest


Here the "2" after -np is the number of parallel processes to start on node inter2. Users can set it to 2 to 4 based on their needs. For more than 4 processes, please contact RCC in advance to make arrangements, and use the queue to submit your job.

Then follow the instructions on the screen. To quit the program, type q and press ENTER.

To run the job in the queue, also refer to the documentation on submitting jobs to the queues at rcluster, IOB.

Below is an example file that defines the data. Save this file as, e.g., yh.nex:

cp /usr/local/mrbayes/primates.nex yh.nex


The content of yh.nex is:

#NEXUS
begin data;
dimensions ntax=12 nchar=898;
format datatype=dna interleave=no gap=-;
matrix
Tarsius_syrichta AAGTTTCATTGGAGCCACCACTCTTATAATTGCCCATGGCCTCACCTCCTCCCTATTATTTTGC
TACGAACGAGTCCACAGTCGAACAATAGCACTAGCCCGTGGCCTTCAAACCCTATTACCTCTTGCAGCAACATGA
.... 
end;
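
Before starting a long run it can help to confirm the dimensions that the data block declares. The following is a minimal sketch, assuming the dimensions statement sits on one line as in the example above; nexus_dim is a hypothetical helper, not a BEST or MrBayes command.

```shell
#!/bin/bash
# nexus_dim KEY FILE
# Hypothetical helper: prints the integer given for KEY (ntax or nchar)
# on the "dimensions" line of a NEXUS file. It only handles a statement
# written on a single line and spelled "dimensions" or "Dimensions".
nexus_dim() {
    sed -n "s/.*[Dd]imensions.*$1=\([0-9][0-9]*\).*/\1/p" "$2" | head -n 1
}

# Example: nexus_dim ntax yh.nex
```

For the yh.nex header shown above, nexus_dim ntax yh.nex prints 12 and nexus_dim nchar yh.nex prints 898.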


Define all the parameters in a .par file, e.g. yh.par. Note that yh.nex is the data file defined above. This example performs three single-run analyses of the data in yh.nex.

set autoclose=yes nowarn=yes;
execute yh.nex;
lset nst=6 rates=gamma;
mcmc nruns=1 ngen=10000 samplefreq=10 file=yh.nex1;
mcmc file=yh.nex2;
mcmc file=yh.nex3;
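
The pattern above, one fully specified mcmc command followed by further mcmc commands that change only the output file, can be generated for any number of runs. The following is a minimal sketch; write_par is a hypothetical helper, and the assumption that later mcmc commands reuse the earlier settings is taken from the example above.

```shell
#!/bin/bash
# write_par DATAFILE NRUNS OUTFILE
# Hypothetical helper: writes a .par file that runs NRUNS single-run
# analyses of DATAFILE, using the same commands as the yh.par example.
write_par() {
    wp_data="$1"
    wp_nruns="$2"
    wp_out="$3"
    {
        echo "set autoclose=yes nowarn=yes;"
        echo "execute ${wp_data};"
        echo "lset nst=6 rates=gamma;"
        wp_i=1
        while [ "$wp_i" -le "$wp_nruns" ]; do
            if [ "$wp_i" -eq 1 ]; then
                # First analysis sets the chain parameters explicitly
                echo "mcmc nruns=1 ngen=10000 samplefreq=10 file=${wp_data}${wp_i};"
            else
                # Later analyses change only the output file name
                echo "mcmc file=${wp_data}${wp_i};"
            fi
            wp_i=$((wp_i + 1))
        done
    } > "$wp_out"
}

# Recreate the three-run example
write_par yh.nex 3 yh.par
```

Called as shown, the helper writes the same six lines as the yh.par listing above.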


Below is a script, subp.sh, for submission to the batch queue; yh.par is the file defined above.

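#!/bin/bash
# Change to the directory holding yh.par and yh.nex (replace with your own path)
cd my-working-dir
# LSB_HOSTS is set by LSF and lists the hosts allocated to this job
echo $LSB_HOSTS
# Build a machine file with one host per line; $$ (the shell PID) keeps the name unique
cat /dev/null > mlist.$$
for variable in $LSB_HOSTS; do
    echo $variable >> mlist.$$
done
# Run BEST under MPI, reading commands from yh.par and writing output to myout
mpirun -np 2 -machinefile mlist.$$ /usr/local/BEST/latest/mbbest < yh.par > myout
# Remove the temporary machine file
rm -f mlist.$$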



After saving subp.sh, use the following command to add execute permission to it:

chmod u+x subp.sh


Then submit the job to the queue with:

bsub -n 2 -q rcc-r32-30d -o mbbtest.out.%J -e mbbtest.err.%J ./subp.sh

The -n 2 option sets the number of processors for mpirun; it must match the number of processes (-np) in subp.sh. If you are using the mri/iob queue, the number can be set to 4.
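
Because the two counts must agree, it may be worth checking them before submitting. The following is a minimal sketch; np_in_script and check_np are hypothetical helpers, and the check assumes the script contains a single mpirun line with a literal number after -np, as in subp.sh above.

```shell
#!/bin/bash
# np_in_script FILE
# Hypothetical helper: prints the number that follows "-np" on the
# first mpirun line of FILE (prints nothing if no such line exists).
np_in_script() {
    sed -n 's/.*mpirun.* -np  *\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# check_np FILE N
# Succeeds when the -np value in FILE equals N, the value intended
# for bsub -n; otherwise reports the mismatch on stderr and fails.
check_np() {
    cn_found=$(np_in_script "$1")
    if [ "$cn_found" = "$2" ]; then
        echo "ok: -n $2 matches -np $cn_found"
    else
        echo "mismatch: bsub -n $2 but $1 uses -np ${cn_found:-unknown}" >&2
        return 1
    fi
}

# Example: check_np subp.sh 2
```

Running check_np subp.sh 2 before the bsub command above reports a mismatch if subp.sh has been edited to use a different -np value.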

Documentation

Online tutorial available at BEST website.

  Commands that are available from the command line or from a MrBayes block include:
                                                                                
  About           -- Describes the program
  Acknowledgments -- Shows program acknowledgments
  Charset         -- Assigns a group of sites to a set
  Charstat        -- Shows status of characters
  Citations       -- Appropriate citation of program
  Comparetree     -- Compares the trees from two tree files
  Constraint      -- Defines a constraint on tree topology
  Ctype           -- Assigns ordering for the characters
  Databreaks      -- Defines nucleotide pairs (doublets) for stem models
  Delete          -- Deletes taxa from the analysis
  Deroot          -- Deroots user tree
  Disclaimer      -- Describes program disclaimer
  Exclude         -- Excludes sites from the analysis
  Execute         -- Executes a file
  Help            -- Provides detailed description of commands
  Include         -- Includes sites
  Link            -- Links parameters across character partitions
  Log             -- Logs screen output to a file
  Lset            -- Sets the parameters of the likelihood model
  Manual          -- Prints a command reference to a text file
  Mcmc            -- Starts Markov chain Monte Carlo analysis
  Mcmcp           -- Sets the parameters of a chain (without starting analysis)
  Outgroup        -- Changes outgroup taxon
  Pairs           -- Defines nucleotide pairs (doublets) for stem models
  Partition       -- Assigns a character partition
  Plot            -- Plots parameters from MCMC analysis
  Prset           -- Sets the priors for the parameters
  Props           -- Set proposal probabilities
  Quit            -- Quits the program
  Report          -- Controls how model parameters are reported
  Restore         -- Restores taxa
  Root            -- Roots user tree
  Set             -- Sets run conditions and defines active data partition
  Showmatrix      -- Shows current character matrix
  Showmodel       -- Shows model settings
  Showtree        -- Shows user tree
  Sump            -- Summarizes parameters from MCMC analysis
  Sumt            -- Summarizes trees from MCMC analysis
  Taxastat        -- Shows status of taxa
  Taxset          -- Assigns a group of taxa to a set
  Unlink          -- Unlinks parameters across character partitions
  Usertree        -- Defines a single user tree
  Version         -- Shows program version
                                                                                
  Commands that should be in a NEXUS file (data                                 
  block or trees block) include:                                                
                                                                                
  Begin           -- Denotes beginning of block in file
  Dimensions      -- Defines size of character matrix
  End             -- Denotes end of a block in file
  Endblock        -- Alternative way of denoting end of a block
  Format          -- Defines character format in data block
  Matrix          -- Defines matrix of characters in data block
  Translate       -- Defines alternative names for taxa
  Tree            -- Defines a tree from MCMC analysis
                                                                                
  Note that this program supports the use of the shortest unambiguous           
  spelling of the above commands (e.g., "exe" instead of "execute"). 

Installation

Source downloaded from BEST website.

System

64-bit Linux