OrthoFinder-Sapelo2: Difference between revisions

From Research Computing Center Wiki
(Updated to reflect that necessary (default) tools are included in OrthoFinder/2.5.5-foss-2022a. Will update further when OrthoFinder/2.5.5-foss-2023a has been fixed.)
 
(5 intermediate revisions by 2 users not shown)
Line 9: Line 9:


=== Version ===
=== Version ===
2.3.8, 2.3.11, 2.5.2
2.5.4, 2.5.5


=== Author / Distributor ===
=== Author / Distributor ===
Line 23: Line 23:
Please refer to [[Running Jobs on Sapelo2]]
Please refer to [[Running Jobs on Sapelo2]]


*Version 2.3.8, installed at /apps/eb/OrthoFinder/2.3.8-foss-2019b-Python-2.7.16
*Version 2.3.11, installed at /apps/eb/OrthoFinder/2.3.11-intel-2019b-Python-3.7.4
*Version 2.5.2, installed at /apps/eb/OrthoFinder/2.5.2-foss-2019b-Python-3.7.4


'''Version 2.5.4'''


To use the version 2.3.8, please first load the module with
Version 2.5.4, installed at
*/apps/eb/OrthoFinder/2.5.4-foss-2022a/
 
To use version 2.5.4, please first load the module with


<pre class="gscript">
<pre class="gscript">
ml OrthoFinder/2.3.8-foss-2019b-Python-2.7.16
ml OrthoFinder/2.5.4-foss-2022a
</pre>  
</pre>  
</br>
'''Version 2.5.5'''


To use the version 2.3.1, please first load the module with
Version 2.5.5, installed at
 
*/apps/eb/OrthoFinder/2.5.5-foss-2022a/
<pre class="gscript">
ml OrthoFinder/2.3.11-intel-2019b-Python-3.7.4
</pre>


To use the version 2.5.2, please first load the module with
To use version 2.5.5, please first load the module with


<pre class="gscript">
<pre class="gscript">
ml OrthoFinder/2.5.2-foss-2019b-Python-3.7.4
ml OrthoFinder/2.5.5-foss-2022a
</pre>  
</pre>  
</br>
Note that if the option<pre class="gscript">-M msa</pre>is used as shown in the documentation below, this will allow for both multiple sequence alignment building as well as tree inference. The default methods for these processes (MAFFT and FastTree, respectively) are loaded with OrthoFinder/2.5.5-foss-2022a, but if you wish to use alternative methods as described in the documentation, you will need to load the modules for these methods yourself. Please refer to [[Available Toolchains and Toolchain Compatibility]] if you are unsure of what installations of these software will be compatible with our installation of OrthoFinder.


 
Here is an example of a shell script sub.sh to run OrthoFinder v2.5.5 at the batch queue:  
Here is an example of a shell script sub.sh to run orthofinder.py of version 2.5.2 at the batch queue:  
<pre class="gscript">
<pre class="gscript">
#!/bin/bash
#!/bin/bash
Line 53: Line 54:
#SBATCH --partition=batch             
#SBATCH --partition=batch             
#SBATCH --ntasks=1                 
#SBATCH --ntasks=1                 
#SBATCH --cpus-per-task=2       
#SBATCH --cpus-per-task=8     
#SBATCH --mem=10gb                    
#SBATCH --mem=32gb                    
#SBATCH --time=120:00:00           
#SBATCH --time=120:00:00           
#SBATCH --output=log.%j.out     
#SBATCH --output=log.%j.out     
Line 63: Line 64:
cd $SLURM_SUBMIT_DIR
cd $SLURM_SUBMIT_DIR


ml OrthoFinder/2.5.2-foss-2019b-Python-3.7.4
ml OrthoFinder/2.5.5-foss-2022a


orthofinder.py [options]   
orthofinder -t 8 -a 8 [options]   
</pre>
</pre>


Line 76: Line 77:
=== Documentation ===
=== Documentation ===
   
   
<pre class="gcommand">
<pre class="gcommand">
ml OrthoFinder/2.5.2-foss-2019b-Python-3.7.4
ml OrthoFinder/2.5.5-foss-2022a
zhuofei@b1-24 ~$ orthofinder -h
orthofinder -h


OrthoFinder version 2.5.2 Copyright (C) 2014 David Emms
OrthoFinder version 2.5.5 Copyright (C) 2014 David Emms


SIMPLE USAGE:
SIMPLE USAGE:
Line 90: Line 91:


OPTIONS:
OPTIONS:
  -t <int>        Number of parallel sequence search threads [Default = 64]
  -t <int>        Number of parallel sequence search threads [Default = 16]
  -a <int>        Number of parallel analysis threads
  -a <int>        Number of parallel analysis threads
  -d              Input is DNA sequences
  -d              Input is DNA sequences
Line 103: Line 104:
  -s <file>      User-specified rooted species tree
  -s <file>      User-specified rooted species tree
  -I <int>        MCL inflation parameter [Default = 1.5]
  -I <int>        MCL inflation parameter [Default = 1.5]
--fewer-files  Only create one orthologs file per species
  -x <file>      Info for outputting results in OrthoXML format
  -x <file>      Info for outputting results in OrthoXML format
  -p <dir>        Write the temporary pickle files to <dir>
  -p <dir>        Write the temporary pickle files to <dir>

Latest revision as of 11:24, 24 June 2024

Category

Bioinformatics

Program On

Sapelo2

Version

2.5.4, 2.5.5

Author / Distributor

OrthoFinder

Description

"OrthoFinder is a fast, accurate and comprehensive analysis tool for comparative genomics. It finds orthologues and orthogroups infers rooted gene trees for all orthogroups and infers a rooted species tree for the species being analysed. OrthoFinder also provides comprehensive statistics for comparative genomic analyses. OrthoFinder is simple to use and all you need to run it is a set of protein sequence files (one per species) in FASTA format." More details are at OrthoFinder

Running Program

Please refer to Running Jobs on Sapelo2


Version 2.5.4

Version 2.5.4, installed at

  • /apps/eb/OrthoFinder/2.5.4-foss-2022a/

To use version 2.5.4, please first load the module with

ml OrthoFinder/2.5.4-foss-2022a


Version 2.5.5

Version 2.5.5, installed at

  • /apps/eb/OrthoFinder/2.5.5-foss-2022a/

To use version 2.5.5, please first load the module with

ml OrthoFinder/2.5.5-foss-2022a


Note that using the option

-M msa

as shown in the documentation below enables both multiple sequence alignment and tree inference. The default programs for these steps (MAFFT and FastTree, respectively) are loaded with OrthoFinder/2.5.5-foss-2022a. If you wish to use one of the alternative programs listed in the documentation, you will need to load its module yourself. Please refer to Available Toolchains and Toolchain Compatibility if you are unsure which installations of that software are compatible with our installation of OrthoFinder.
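As a rough illustration of the note above, the following sketch checks a chosen MSA program and tree method against the lists printed by orthofinder -h for version 2.5.5 and prints the matching command line. The helper name msa_command and the directory name proteomes are hypothetical and not part of OrthoFinder; the sketch only prints the command and does not load any modules.

```shell
#!/bin/bash
# Sketch: validate -A/-T choices against the lists printed by `orthofinder -h`
# (version 2.5.5) and print the resulting command line.
# msa_command is a hypothetical helper, not part of OrthoFinder.
msa_command() {
    local dir=$1 aligner=${2:-mafft} tree=${3:-fasttree}

    case "$aligner" in
        mafft|muscle) ;;
        *) echo "unknown MSA program: $aligner" >&2; return 1 ;;
    esac
    case "$tree" in
        fasttree|raxml|raxml-ng|iqtree) ;;
        *) echo "unknown tree method: $tree" >&2; return 1 ;;
    esac

    # Only MAFFT and FastTree come with the OrthoFinder module;
    # any other choice needs its own module loaded beforehand.
    if [ "$aligner" != mafft ] || [ "$tree" != fasttree ]; then
        echo "reminder: load the modules for non-default programs first" >&2
    fi

    printf 'orthofinder -f %s -M msa -A %s -T %s\n' "$dir" "$aligner" "$tree"
}

# Example: defaults, then MUSCLE with IQ-TREE (proteomes is a placeholder)
msa_command proteomes
msa_command proteomes muscle iqtree
```

The printed command can then be placed in a job script after the relevant ml lines.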

Here is an example of a shell script, sub.sh, to run OrthoFinder v2.5.5 on the batch partition:

#!/bin/bash
#SBATCH --job-name=j_OrthoFinder   
#SBATCH --partition=batch            
#SBATCH --ntasks=1                  	
#SBATCH --cpus-per-task=8       
#SBATCH --mem=32gb                    
#SBATCH --time=120:00:00           
#SBATCH --output=log.%j.out     
#SBATCH --error=log.%j.err          
#SBATCH --mail-user=username@uga.edu  
#SBATCH --mail-type=ALL   

cd "$SLURM_SUBMIT_DIR"

ml OrthoFinder/2.5.5-foss-2022a

# -t and -a are set to match --cpus-per-task above
orthofinder -t 8 -a 8 [options]


Submit the job with:

sbatch ./sub.sh 
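The example script sets both -t and -a to 8 to match --cpus-per-task=8. One way to keep these values in step, sketched here as an assumption rather than a site recommendation, is to read the thread count from the SLURM_CPUS_PER_TASK environment variable, which Slurm sets when --cpus-per-task is requested. The helper name thread_count is hypothetical.

```shell
#!/bin/bash
# Sketch: derive OrthoFinder thread counts from the Slurm allocation.
# SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is requested;
# fall back to 1 when the script is run outside a job.
thread_count() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

THREADS=$(thread_count)

# Print the command that sub.sh would run; in a real job script,
# drop the echo and replace [options] with actual options.
echo "orthofinder -t $THREADS -a $THREADS [options]"
```

With this approach, changing --cpus-per-task in the header is enough; the orthofinder line does not need to be edited.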

Documentation

ml OrthoFinder/2.5.5-foss-2022a 
orthofinder -h

OrthoFinder version 2.5.5 Copyright (C) 2014 David Emms

SIMPLE USAGE:
Run full OrthoFinder analysis on FASTA format proteomes in <dir>
  orthofinder [options] -f <dir>

Add new species in <dir1> to previous run in <dir2> and run new analysis
  orthofinder [options] -f <dir1> -b <dir2>

OPTIONS:
 -t <int>        Number of parallel sequence search threads [Default = 16]
 -a <int>        Number of parallel analysis threads
 -d              Input is DNA sequences
 -M <txt>        Method for gene tree inference. Options 'dendroblast' & 'msa'
                 [Default = dendroblast]
 -S <txt>        Sequence search program [Default = diamond]
                 Options: blast, diamond, diamond_ultra_sens, blast_gz, mmseqs, blast_nucl
 -A <txt>        MSA program, requires '-M msa' [Default = mafft]
                 Options: mafft, muscle
 -T <txt>        Tree inference method, requires '-M msa' [Default = fasttree]
                 Options: fasttree, raxml, raxml-ng, iqtree
 -s <file>       User-specified rooted species tree
 -I <int>        MCL inflation parameter [Default = 1.5]
 --fewer-files   Only create one orthologs file per species
 -x <file>       Info for outputting results in OrthoXML format
 -p <dir>        Write the temporary pickle files to <dir>
 -1              Only perform one-way sequence search
 -X              Don't add species names to sequence IDs
 -y              Split paralogous clades below root of a HOG into separate HOGs
 -z              Don't trim MSAs (columns>=90% gap, min. alignment length 500)
 -n <txt>        Name to append to the results directory
 -o <txt>        Non-default results directory
 -h              Print this help text

WORKFLOW STOPPING OPTIONS:
 -op             Stop after preparing input files for BLAST
 -og             Stop after inferring orthogroups
 -os             Stop after writing sequence files for orthogroups
                 (requires '-M msa')
 -oa             Stop after inferring alignments for orthogroups
                 (requires '-M msa')
 -ot             Stop after inferring gene trees for orthogroups 

WORKFLOW RESTART COMMANDS:
 -b  <dir>         Start OrthoFinder from pre-computed BLAST results in <dir>
 -fg <dir>         Start OrthoFinder from pre-computed orthogroups in <dir>
 -ft <dir>         Start OrthoFinder from pre-computed gene trees in <dir>

LICENSE:
 Distributed under the GNU General Public License (GPLv3). See License.md

CITATION:
 When publishing work that uses OrthoFinder please cite:
 Emms D.M. & Kelly S. (2019), Genome Biology 20:238

 If you use the species tree in your work then please also cite:
 Emms D.M. & Kelly S. (2017), MBE 34(12): 3267-3278
 Emms D.M. & Kelly S. (2018), bioRxiv https://doi.org/10.1101/267914
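The workflow stopping and restart options in the help text above can be combined to split an analysis into stages, for example stopping after orthogroup inference with -og and later resuming with -fg. The sketch below finds the most recent results directory from an earlier run. It assumes results are written to OrthoFinder/Results_* inside the input directory, which should be checked against your own run (and will differ if -o was used). The helper name latest_results and the directory name proteomes are hypothetical.

```shell
#!/bin/bash
# Sketch: find the newest Results_* directory from a previous run so it can
# be passed to -fg or -ft. The <input>/OrthoFinder/Results_* layout is an
# assumption about the default output location; adjust it if -o was used.
latest_results() {
    local input_dir=$1
    # List matching directories newest first and keep the first one.
    ls -dt "$input_dir"/OrthoFinder/Results_*/ 2>/dev/null | head -n 1
}

# Intended two-stage use (proteomes is a placeholder directory):
#   orthofinder -f proteomes -og                          # stop after orthogroups
#   orthofinder -fg "$(latest_results proteomes)" -M msa  # resume from orthogroups
```

If no earlier results exist, latest_results prints nothing, so it is worth checking for an empty value before resuming.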


Installation

Source code from OrthoFinder

System

64-bit Linux