OrthoFinder-Teaching

From Research Computing Center Wiki
Jump to navigation Jump to search

Category

Bioinformatics

Program On

Teaching

Version

2.1.2

Author / Distributor

OrthoFinder

Description

"OrthoFinder is a fast, accurate and comprehensive analysis tool for comparative genomics. It finds orthologues and orthogroups infers rooted gene trees for all orthogroups and infers a rooted species tree for the species being analysed. OrthoFinder also provides comprehensive statistics for comparative genomic analyses. OrthoFinder is simple to use and all you need to run it is a set of protein sequence files (one per species) in FASTA format." More details are at OrthoFinder

Running Program

The last version of this application is at /usr/local/apps/gb/orthofinder/2.1.2

To use this version, please load the module with

ml orthofinder/2.1.2 

Here is an example of a shell script, sub.sh, to run on the batch queue:

#!/bin/bash
#SBATCH --job-name=j_OrthoFinder
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu
#SBATCH --ntasks=1
#SBATCH --mem=10gb
#SBATCH --time=08:00:00
#SBATCH --output=OrthoFinder.%j.out
#SBATCH --error=OrthoFinder.%j.err

cd $SLURM_SUBMIT_DIR
ml orthofinder/2.1.2
orthofinder.py [options]

In the real submission script, at least all the above underlined values need to be reviewed or to be replaced by the proper values.

Please refer to Running_Jobs_on_the_teaching_cluster, Run X window Jobs and Run interactive Jobs for more details of running jobs at Teaching cluster.


Here is an example of job submission command:

sbatch ./sub.sh 

Documentation

ml orthofinder/2.1.2 
orthofinder.py -h
OrthoFinder version 2.1.2 Copyright (C) 2014 David Emms

SIMPLE USAGE:
Run full OrthoFinder analysis on FASTA format proteomes in <dir>
  orthofinder [options] -f <dir>

Add new species in <dir1> to previous run in <dir2> and run new analysis
  orthofinder [options] -f <dir1> -b <dir2>

OPTIONS:
 -t <int>          Number of parallel sequence search threads [Default = 16]
 -a <int>          Number of parallel analysis threads [Default = 1]
 -M <txt>          Method for gene tree inference. Options 'dendroblast' & 'msa'
                   [Default = dendroblast]
 -S <txt>          Sequence search program [Default = blast]
                   Options: blast, blast_gz, diamond
 -A <txt>          MSA program, requires '-M msa' [Default = mafft]
                   Options: mafft, muscle, mafft
 -T <txt>          Tree inference method, requires '-M msa' [Default = fasttree]
                   Options: mafft, iqtree, fasttree, raxml
 -R <txt>          Tree reconciliation method [Default = of_recon]
                   Options: of_recon, dlcpar, dlcpar_deepsearch
 -s <file>         User-specified rooted species tree
 -I <int>          MCL inflation parameter [Default = 1.5]
 -x <file>         Info for outputting results in OrthoXML format
 -p <dir>          Write the temporary pickle files to <dir>
 -1                Only perform one-way sequence search 
 -n <txt>          Name to append to the results directory
 -h                Print this help text

WORKFLOW STOPPING OPTIONS:
 -op               Stop after preparing input files for BLAST
 -og               Stop after inferring orthogroups
 -os               Stop after writing sequence files for orthogroups
                   (requires '-M msa')
 -oa               Stop after inferring alignments for orthogroups
                   (requires '-M msa')
 -ot               Stop after inferring gene trees for orthogroups 

WORKFLOW RESTART COMMANDS:
 -b  <dir>         Start OrthoFinder from pre-computed BLAST results in <dir>
 -fg <dir>         Start OrthoFinder from pre-computed orthogroups in <dir>
 -ft <dir>         Start OrthoFinder from pre-computed gene trees in <dir>

LICENSE:
 Distributed under the GNU General Public License (GPLv3). See License.md

CITATION:
 When publishing work that uses OrthoFinder please cite:
 Emms D.M. & Kelly S. (2015), Genome Biology 16:157

Back to Top

Installation

Source code is obtained from OrthoFinder

System

64-bit Linux