OrthoFinder-Sapelo2: Difference between revisions

From Research Computing Center Wiki
(Created page with "Category:Sapelo2oldCategory:SoftwareCategory:Bioinformatics === Category === Bioinformatics === Program On === Sapelo2 === Version === 2.1.2, 2.2.7, 2.3.7 ===...")
 
(Updated to reflect that necessary (default) tools are included in OrthoFinder/2.5.5-foss-2022a. Will update further when OrthoFinder/2.5.5-foss-2023a has been fixed.)
 
(9 intermediate revisions by 2 users not shown)
Line 1: Line 1:
[[Category:Sapelo2old]][[Category:Software]][[Category:Bioinformatics]]
[[Category:Sapelo2]][[Category:Software]][[Category:Bioinformatics]]
=== Category ===
=== Category ===


Line 9: Line 9:


=== Version ===
=== Version ===
2.1.2, 2.2.7, 2.3.7
2.5.4, 2.5.5


=== Author / Distributor ===
=== Author / Distributor ===
Line 21: Line 21:
=== Running Program ===
=== Running Program ===


Also refer to [[Running Jobs on Sapelo2]]
Please refer to [[Running Jobs on Sapelo2]]
Also refer to [[Running_Jobs_on_Sapelo2#Running_an_X-windows_application | Run X window Jobs]] and
[[Running_Jobs_on_Sapelo2#Running_an_Interactive_Job | Run interactive Jobs]]


*Version 2.1.2, installed at /usr/local/apps/gb/orthofinder/2.1.2


All required dependencies are installed:
'''Version 2.5.4'''
dlcpar/1.0, MCL/14.137-foss-2016b, FastME/2.1.5-foss-2016b, BLAST+/2.6.0-foss-2016b-Python-2.7.14, MAFFT/7.313-foss-2016b-with-extensions, FastTree/2.1.10-foss-2016b, and DIAMOND/0.9.19-foss-2016b. Python 2.7.14 is loaded in this module.  


To use this version of OrthoFinder, please first load the module with
Version 2.5.4, installed at
*/apps/eb/OrthoFinder/2.5.4-foss-2022a/


<pre class="gscript">
To use version 2.5.4, please first load the module with
module load orthofinder/2.1.2
</pre>
 
*Version 2.2.7, installed at /usr/local/apps/eb/OrthoFinder/2.2.7-foss-2018a-Python-2.7.14
 
All required dependencies are installed:
DLCpar/1.0-foss-2018a-Python-2.7.14, MCL/14.137-GCCcore-6.4.0-Perl-5.26.1, FastME/2.1.6.1-foss-2018a, BLAST+/2.7.1-foss-2018a, MAFFT/7.407-foss-2018a-with-extensions, FastTree/2.1.10-foss-2018a, DIAMOND/0.9.22-foss-2018a, and MMseqs2/5-9375b-foss-2018a. Python 2.7.14 is loaded in this module.
 
To use this version of OrthoFinder, please first load the module with


<pre class="gscript">
<pre class="gscript">
module load OrthoFinder/2.2.7-foss-2018a-Python-2.7.14
ml OrthoFinder/2.5.4-foss-2022a
</pre>  
</pre>  
</br>
'''Version 2.5.5'''


*Version 2.3.7, installed at /usr/local/apps/eb/OrthoFinder/2.3.7-foss-2018a-Python-2.7.14
Version 2.5.5, installed at  
 
*/apps/eb/OrthoFinder/2.5.5-foss-2022a/
OrthoFinder v2.3.7 includes all its dependencies now. No external modules will be loaded except for Python/2.7.14 module. More details at [https://github.com/davidemms/OrthoFinder/releases OrthoFinder].


To use this version of OrthoFinder, please first load the module with
To use version 2.5.5, please first load the module with


<pre class="gscript">
<pre class="gscript">
module load OrthoFinder/2.3.7-foss-2018a-Python-2.7.14
ml OrthoFinder/2.5.5-foss-2022a
</pre>  
</pre>  
</br>
Note that if the option<pre class="gscript">-M msa</pre>is used as shown in the documentation below, both multiple sequence alignment and tree inference are performed. The default methods for these steps (MAFFT and FastTree, respectively) are loaded with OrthoFinder/2.5.5-foss-2022a. If you wish to use alternative methods described in the documentation, you will need to load the modules for those methods yourself. Please refer to [[Available Toolchains and Toolchain Compatibility]] if you are unsure which installations of this software are compatible with our installation of OrthoFinder.


 
Here is an example of a shell script, sub.sh, to run OrthoFinder v2.5.5 in the batch queue:
Here is an example of a shell script sub.sh to run orthofinder.py of version 2.1.2 at the batch queue:  
<pre class="gscript">
<pre class="gscript">
#!/bin/bash
#!/bin/bash
#PBS -N j_OrthoFinder
#SBATCH --job-name=j_OrthoFinder  
#PBS -q batch
#SBATCH --partition=batch          
#PBS -l nodes=1:ppn=1:AMD
#SBATCH --ntasks=1                
#PBS -l walltime=480:00:00
#SBATCH --cpus-per-task=8     
#PBS -l mem=10gb
#SBATCH --mem=32gb                   
#SBATCH --time=120:00:00          
#SBATCH --output=log.%j.out   
#SBATCH --error=log.%j.err         
#SBATCH --mail-user=username@uga.edu 
#SBATCH --mail-type=ALL 


cd $PBS_O_WORKDIR
cd $SLURM_SUBMIT_DIR


module load orthofinder/2.1.2
ml OrthoFinder/2.5.5-foss-2022a


orthofinder.py [options] 1>job.out 2>job.err    
orthofinder -t 8 -a 8 [options]   
</pre>
</pre>


Here is an example of a shell script sub.sh to run orthofinder.py of version 2.2.7 at the batch queue:
<pre class="gscript">
#!/bin/bash
#PBS -N j_OrthoFinder
#PBS -q batch
#PBS -l nodes=1:ppn=1
#PBS -l walltime=480:00:00
#PBS -l mem=10gb
cd $PBS_O_WORKDIR
module load OrthoFinder/2.2.7-foss-2018a-Python-2.7.14
orthofinder.py [options] 1>job.out 2>job.err 
</pre>
Here is an example of a shell script sub.sh to run orthofinder.py of version 2.3.7 at the batch queue:
<pre class="gscript">
#!/bin/bash
#PBS -N j_OrthoFinder
#PBS -q batch
#PBS -l nodes=1:ppn=1
#PBS -l walltime=480:00:00
#PBS -l mem=10gb
cd $PBS_O_WORKDIR
module load OrthoFinder/2.3.7-foss-2018a-Python-2.7.14
module load BLAST+/2.7.1-foss-2018a
module load MAFFT/7.407-foss-2018a-with-extensions
orthofinder.py [options] 1>job.out 2>job.err
</pre>


Here is an example of job submission
Here is an example of job submission
<pre  class="gcommand">
<pre  class="gcommand">
qsub  ./sub.sh  
sbatch ./sub.sh  
</pre>
</pre>


=== Documentation ===
=== Documentation ===
   
   
<pre class="gcommand">
<pre class="gcommand">
ml OrthoFinder/2.5.5-foss-2022a
orthofinder -h


module load orthofinder/2.1.2
OrthoFinder version 2.5.5 Copyright (C) 2014 David Emms
orthofinder.py -h
 
OrthoFinder version 2.1.2 Copyright (C) 2014 David Emms


SIMPLE USAGE:
SIMPLE USAGE:
Line 130: Line 91:


OPTIONS:
OPTIONS:
  -t <int>         Number of parallel sequence search threads [Default = 16]
  -t <int>       Number of parallel sequence search threads [Default = 16]
  -a <int>         Number of parallel analysis threads [Default = 1]
  -a <int>       Number of parallel analysis threads
  -M <txt>         Method for gene tree inference. Options 'dendroblast' & 'msa'
-d              Input is DNA sequences
                  [Default = dendroblast]
  -M <txt>       Method for gene tree inference. Options 'dendroblast' & 'msa'
  -S <txt>         Sequence search program [Default = blast]
                [Default = dendroblast]
                  Options: blast, blast_gz, diamond
  -S <txt>       Sequence search program [Default = diamond]
  -A <txt>         MSA program, requires '-M msa' [Default = mafft]
                Options: blast, diamond, diamond_ultra_sens, blast_gz, mmseqs, blast_nucl
                  Options: mafft, muscle, mafft
  -A <txt>       MSA program, requires '-M msa' [Default = mafft]
  -T <txt>         Tree inference method, requires '-M msa' [Default = fasttree]
                Options: mafft, muscle
                  Options: mafft, iqtree, fasttree, raxml
  -T <txt>       Tree inference method, requires '-M msa' [Default = fasttree]
-R <txt>          Tree reconciliation method [Default = of_recon]
                Options: fasttree, raxml, raxml-ng, iqtree
                  Options: of_recon, dlcpar, dlcpar_deepsearch
  -s <file>       User-specified rooted species tree
  -s <file>         User-specified rooted species tree
  -I <int>       MCL inflation parameter [Default = 1.5]
  -I <int>         MCL inflation parameter [Default = 1.5]
--fewer-files  Only create one orthologs file per species
  -x <file>         Info for outputting results in OrthoXML format
  -x <file>       Info for outputting results in OrthoXML format
  -p <dir>         Write the temporary pickle files to <dir>
  -p <dir>       Write the temporary pickle files to <dir>
  -1               Only perform one-way sequence search  
  -1             Only perform one-way sequence search
  -n <txt>         Name to append to the results directory
-X              Don't add species names to sequence IDs
  -h               Print this help text
-y              Split paralogous clades below root of a HOG into separate HOGs
-z              Don't trim MSAs (columns>=90% gap, min. alignment length 500)
  -n <txt>       Name to append to the results directory
-o <txt>        Non-default results directory
  -h             Print this help text


WORKFLOW STOPPING OPTIONS:
WORKFLOW STOPPING OPTIONS:
  -op               Stop after preparing input files for BLAST
  -op             Stop after preparing input files for BLAST
  -og               Stop after inferring orthogroups
  -og             Stop after inferring orthogroups
  -os               Stop after writing sequence files for orthogroups
  -os             Stop after writing sequence files for orthogroups
                  (requires '-M msa')
                (requires '-M msa')
  -oa               Stop after inferring alignments for orthogroups
  -oa             Stop after inferring alignments for orthogroups
                  (requires '-M msa')
                (requires '-M msa')
  -ot               Stop after inferring gene trees for orthogroups  
  -ot             Stop after inferring gene trees for orthogroups  


WORKFLOW RESTART COMMANDS:
WORKFLOW RESTART COMMANDS:
Line 169: Line 134:
CITATION:
CITATION:
  When publishing work that uses OrthoFinder please cite:
  When publishing work that uses OrthoFinder please cite:
  Emms D.M. & Kelly S. (2015), Genome Biology 16:157
  Emms D.M. & Kelly S. (2019), Genome Biology 20:238
 
If you use the species tree in your work then please also cite:
Emms D.M. & Kelly S. (2017), MBE 34(12): 3267-3278
Emms D.M. & Kelly S. (2018), bioRxiv https://doi.org/10.1101/267914
</pre>
</pre>
[[#top|Back to Top]]
[[#top|Back to Top]]

Latest revision as of 11:24, 24 June 2024

Category

Bioinformatics

Program On

Sapelo2

Version

2.5.4, 2.5.5

Author / Distributor

OrthoFinder

Description

"OrthoFinder is a fast, accurate and comprehensive analysis tool for comparative genomics. It finds orthologues and orthogroups infers rooted gene trees for all orthogroups and infers a rooted species tree for the species being analysed. OrthoFinder also provides comprehensive statistics for comparative genomic analyses. OrthoFinder is simple to use and all you need to run it is a set of protein sequence files (one per species) in FASTA format." More details are at OrthoFinder

Running Program

Please refer to Running Jobs on Sapelo2
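As the Description notes, OrthoFinder expects a directory of protein FASTA files, one per species, so it can help to check the input directory before submitting a job. The following is a minimal sketch and not part of OrthoFinder itself: the function name count_fasta and the directory name proteomes are hypothetical, and the extension list (.fa, .faa, .fasta, .fas, .pep) is an assumption based on the OrthoFinder README.

```shell
#!/bin/bash
# count_fasta DIR: print how many files in DIR look like FASTA inputs.
# Hypothetical helper; the extension list is an assumption.
count_fasta() {
    dir=$1
    n=0
    for f in "$dir"/*.fa "$dir"/*.faa "$dir"/*.fasta "$dir"/*.fas "$dir"/*.pep; do
        # An unmatched glob stays literal, so only count real files.
        if [ -f "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Example: report the number of species files in ./proteomes (hypothetical path).
echo "FASTA files found: $(count_fasta proteomes)"
```

A count of 0 usually means the path is wrong or the files use a different extension.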


Version 2.5.4

Version 2.5.4, installed at

  • /apps/eb/OrthoFinder/2.5.4-foss-2022a/

To use version 2.5.4, please first load the module with

ml OrthoFinder/2.5.4-foss-2022a


Version 2.5.5

Version 2.5.5, installed at

  • /apps/eb/OrthoFinder/2.5.5-foss-2022a/

To use version 2.5.5, please first load the module with

ml OrthoFinder/2.5.5-foss-2022a


Note that if the option -M msa is used as shown in the documentation below, both multiple sequence alignment and tree inference are performed. The default methods for these steps (MAFFT and FastTree, respectively) are loaded with OrthoFinder/2.5.5-foss-2022a. If you wish to use alternative methods described in the documentation, you will need to load the modules for those methods yourself. Please refer to Available Toolchains and Toolchain Compatibility if you are unsure which installations of this software are compatible with our installation of OrthoFinder.
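As a hedged illustration of the note above: before selecting a non-default method with -A or -T, a job script can confirm that the corresponding executable is on PATH after its module has been loaded. In this sketch require_tool is a hypothetical helper, and iqtree is assumed to be the executable name behind the iqtree option; which Sapelo2 module provides it is not specified here.

```shell
#!/bin/bash
# require_tool NAME: succeed only if NAME is an executable command on PATH.
# Hypothetical helper, not part of OrthoFinder.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "error: '$1' not found on PATH; load its module first" >&2
    return 1
}

# Use the alternative tree method only if its executable is available;
# otherwise keep the default that is loaded with the OrthoFinder module.
tree_method=fasttree
if require_tool iqtree; then
    tree_method=iqtree
fi

# Dry run: print the command that would be used.
echo "orthofinder -M msa -T $tree_method [options]"
```

Falling back to the default keeps the job from failing partway through because of a missing executable.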

Here is an example of a shell script, sub.sh, to run OrthoFinder v2.5.5 in the batch queue:

#!/bin/bash
#SBATCH --job-name=j_OrthoFinder   
#SBATCH --partition=batch            
#SBATCH --ntasks=1                  	
#SBATCH --cpus-per-task=8       
#SBATCH --mem=32gb                    
#SBATCH --time=120:00:00           
#SBATCH --output=log.%j.out     
#SBATCH --error=log.%j.err          
#SBATCH --mail-user=username@uga.edu  
#SBATCH --mail-type=ALL   

cd "$SLURM_SUBMIT_DIR"

ml OrthoFinder/2.5.5-foss-2022a

# -t and -a match --cpus-per-task=8 above; replace [options] with your own, e.g. -f <dir>
orthofinder -t 8 -a 8 [options]
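The thread counts in the script above are written out by hand to match --cpus-per-task. One way to keep them in sync is to read the value Slurm exports, as in this minimal sketch. The input directory proteomes/ is a hypothetical name, and the command is only printed rather than executed.

```shell
#!/bin/bash
# SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is requested;
# fall back to 1 so the snippet also works outside a job.
threads="${SLURM_CPUS_PER_TASK:-1}"

# proteomes/ is a hypothetical input directory of FASTA files.
cmd="orthofinder -t $threads -a $threads -f proteomes/"

# Dry run: print the command. In a real job script, run it instead of echoing.
echo "$cmd"
```

With this approach, changing --cpus-per-task in the header is enough to change both -t and -a.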


Here is an example of job submission:

sbatch ./sub.sh 
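For scripted submissions it can be convenient to capture the job ID so the job can be checked afterwards. The sketch below assumes the standard Slurm behaviour that sbatch --parsable prints either "jobid" or "jobid;cluster". job_id_from is a hypothetical helper name.

```shell
#!/bin/bash
# job_id_from OUTPUT: extract the job ID from `sbatch --parsable` output,
# which is either "jobid" or "jobid;cluster".
job_id_from() {
    echo "${1%%;*}"
}

if command -v sbatch >/dev/null 2>&1; then
    out=$(sbatch --parsable ./sub.sh)
    jobid=$(job_id_from "$out")
    echo "Submitted job $jobid"
    # Show the job's current state in the queue.
    squeue -j "$jobid"
else
    echo "sbatch not found; run this on a Slurm login node"
fi
```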

Documentation

ml OrthoFinder/2.5.5-foss-2022a 
orthofinder -h

OrthoFinder version 2.5.5 Copyright (C) 2014 David Emms

SIMPLE USAGE:
Run full OrthoFinder analysis on FASTA format proteomes in <dir>
  orthofinder [options] -f <dir>

Add new species in <dir1> to previous run in <dir2> and run new analysis
  orthofinder [options] -f <dir1> -b <dir2>

OPTIONS:
 -t <int>        Number of parallel sequence search threads [Default = 16]
 -a <int>        Number of parallel analysis threads
 -d              Input is DNA sequences
 -M <txt>        Method for gene tree inference. Options 'dendroblast' & 'msa'
                 [Default = dendroblast]
 -S <txt>        Sequence search program [Default = diamond]
                 Options: blast, diamond, diamond_ultra_sens, blast_gz, mmseqs, blast_nucl
 -A <txt>        MSA program, requires '-M msa' [Default = mafft]
                 Options: mafft, muscle
 -T <txt>        Tree inference method, requires '-M msa' [Default = fasttree]
                 Options: fasttree, raxml, raxml-ng, iqtree
 -s <file>       User-specified rooted species tree
 -I <int>        MCL inflation parameter [Default = 1.5]
 --fewer-files   Only create one orthologs file per species
 -x <file>       Info for outputting results in OrthoXML format
 -p <dir>        Write the temporary pickle files to <dir>
 -1              Only perform one-way sequence search
 -X              Don't add species names to sequence IDs
 -y              Split paralogous clades below root of a HOG into separate HOGs
 -z              Don't trim MSAs (columns>=90% gap, min. alignment length 500)
 -n <txt>        Name to append to the results directory
 -o <txt>        Non-default results directory
 -h              Print this help text

WORKFLOW STOPPING OPTIONS:
 -op             Stop after preparing input files for BLAST
 -og             Stop after inferring orthogroups
 -os             Stop after writing sequence files for orthogroups
                 (requires '-M msa')
 -oa             Stop after inferring alignments for orthogroups
                 (requires '-M msa')
 -ot             Stop after inferring gene trees for orthogroups 

WORKFLOW RESTART COMMANDS:
 -b  <dir>         Start OrthoFinder from pre-computed BLAST results in <dir>
 -fg <dir>         Start OrthoFinder from pre-computed orthogroups in <dir>
 -ft <dir>         Start OrthoFinder from pre-computed gene trees in <dir>

LICENSE:
 Distributed under the GNU General Public License (GPLv3). See License.md

CITATION:
 When publishing work that uses OrthoFinder please cite:
 Emms D.M. & Kelly S. (2019), Genome Biology 20:238

 If you use the species tree in your work then please also cite:
 Emms D.M. & Kelly S. (2017), MBE 34(12): 3267-3278
 Emms D.M. & Kelly S. (2018), bioRxiv https://doi.org/10.1101/267914
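The workflow stopping and restart options in the help text above can be combined to split a long analysis into stages, for example to stay within a time limit. The following is a dry-run sketch under stated assumptions: both paths are hypothetical, and the results directory in particular should be replaced with the one OrthoFinder reports at the end of the first stage.

```shell
#!/bin/bash
# Hypothetical paths: adjust to your own data and to the results directory
# that OrthoFinder reports at the end of the first run.
input_dir=proteomes
prev_results=proteomes/OrthoFinder/Results_example

# Stage 1: stop after inferring orthogroups (-og).
stage1="orthofinder -f $input_dir -og"

# Stage 2: restart from those orthogroups (-fg) and infer gene trees
# from multiple sequence alignments (-M msa).
stage2="orthofinder -fg $prev_results -M msa"

# Dry run: print the two commands rather than executing them.
echo "$stage1"
echo "$stage2"
```

Each stage can go in its own job script, so the second job reuses the orthogroups instead of repeating the sequence search.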

Back to Top

Installation

source code from OrthoFinder

System

64-bit Linux