PASA-Sapelo2: Difference between revisions

From Research Computing Center Wiki
No edit summary
No edit summary
Line 24: Line 24:
* version 2.5.3 with MySQL support is installed as an Apptainer container in /apps/singularity-images/pasa-2.5.3-MySQL/ and the image name is pasa-2.5.3-mysql-production.sif.
* version 2.5.3 with MySQL support is installed as an Apptainer container in /apps/singularity-images/pasa-2.5.3-MySQL/ and the image name is pasa-2.5.3-mysql-production.sif.


To use this version of PASA in a batch job, please follow these setup steps in your current job working directory '''before''' submitting a batch job:
To use this version of PASA with its sample data in a batch job, please follow these setup steps in your current job working directory '''before''' submitting a batch job:


<pre class="gscript">
<pre class="gscript">
cp /apps/singularity-images/funannotate-1.8.17-MySQL-PASA/*.sh .
cp -r /apps/singularity-images/pasa-2.5.3-MySQL/{1_init.sh,2_create_user_and_db.sh,3_cleanup.sh,pasa_conf,sub.sh,sample_data} .
cp -r /apps/singularity-images/funannotate-1.8.17-MySQL-PASA/pasa_conf .
</pre>
</pre>


Line 36: Line 35:


* Three MySQL config scripts: '''1_init.sh''', '''2_create_user_and_db.sh''', and '''3_cleanup.sh'''
* Three MySQL config scripts: '''1_init.sh''', '''2_create_user_and_db.sh''', and '''3_cleanup.sh'''
* A PASA config folder: '''pasa_conf/'''
* A sample batch job submission script: '''sub.sh'''
* A sample batch job submission script: '''sub.sh'''
* A PASA config folder: '''pasa_conf/'''
* Sample data folder provided by PASA : '''sample_data/'''


Below is the sample batch job submission script ('''sub.sh''') to run funannotate in a batch job using 80 CPU cores on the batch partition:  
Below is the sample batch job submission script ('''sub.sh''') to run PASA in a batch job using 20 CPU cores on the batch partition:
Line 43: Line 43:
<div class="gscript2">
<div class="gscript2">
<nowiki>#</nowiki>!/bin/bash<br>
<nowiki>#</nowiki>!/bin/bash<br>
<nowiki>#</nowiki>SBATCH --job-name=<u>funannotate-mysql</u><br>
<nowiki>#</nowiki>SBATCH --job-name=<u>pasa-mysql</u><br>
<nowiki>#</nowiki>SBATCH --partition=batch<br>
<nowiki>#</nowiki>SBATCH --partition=batch<br>
<nowiki>#</nowiki>SBATCH --ntasks=1<br>
<nowiki>#</nowiki>SBATCH --ntasks=1<br>
<nowiki>#</nowiki>SBATCH --cpus-per-task=<u>80</u><br>
<nowiki>#</nowiki>SBATCH --cpus-per-task=<u>20</u><br>
<nowiki>#</nowiki>SBATCH --mem=<u>80gb</u><br>
<nowiki>#</nowiki>SBATCH --mem=<u>80gb</u><br>
<nowiki>#</nowiki>SBATCH --time=<u>48:00:00</u><br>
<nowiki>#</nowiki>SBATCH --time=<u>48:00:00</u><br>
Line 55: Line 55:
cd $SLURM_SUBMIT_DIR
cd $SLURM_SUBMIT_DIR


 
<nowiki>#</nowiki> Initialize and start MySQL from inside of the container
<nowiki>#</nowiki> Initialize and start MySQL in the container


./1_init.sh && ./2_create_user_and_db.sh
./1_init.sh && ./2_create_user_and_db.sh


<nowiki>#</nowiki> Unzip the sample input data genome_sample.fasta.gz


<nowiki>#</nowiki> Verify the installation of dependencies and show their versions (optional)
gunzip sample_data/genome_sample.fasta.gz
 
apptainer exec instance://funannotate-mysql /bin/bash -c "source activate /opt/funannotate && '''funannotate check --show-versions'''"
 
 
<nowiki>#</nowiki> Run a full test with PASA pipeline with MySQL support. You can run your own funannotate command line here.
 
apptainer exec instance://funannotate-mysql /bin/bash -c "source activate /opt/funannotate && '''funannotate test -t all --cpus 80'''"


<nowiki>#</nowiki> Run PASA pipeline with genome_sample.fasta


apptainer exec instance://pasa-mysql /bin/bash -c "source activate /opt/pasa && \$PASAHOME/Launch_PASA_pipeline.pl \
  --CPU <u>20</u> \
  --config ./sample_data/mysql.confs/alignAssembly.config --create --run \
  --ALIGNER gmap --genome ./sample_data/genome_sample.fasta --transcripts ./sample_data/all_transcripts.fasta.clean"


<nowiki>#</nowiki> Clean up on the compute node by shutting down MySQL
<nowiki>#</nowiki> Clean up on the compute node and shut down MySQL


./3_cleanup.sh
./3_cleanup.sh
Line 81: Line 79:


# In the real submission script, at least all the above underlined values in Slurm headers need to be reviewed or to be replaced by the proper values.
# In the real submission script, at least all the above underlined values in Slurm headers need to be reviewed or to be replaced by the proper values.
# In the real submission script, the above the funannotate command lines in '''bold font''' can be replaced by your own funannotate command lines.
# In the real submission script, the PASA command line above can be replaced by your own PASA command lines.
 


Please refer to [[Running Jobs on Sapelo2]] for more information running jobs on the Sapelo2 cluster.
Please refer to [[Running Jobs on Sapelo2]] for more information running jobs on the Sapelo2 cluster.


Here is an example of job submission command:
Here is an example of job submission command:
Line 96: Line 91:
   
   
<pre class="gcommand">
<pre class="gcommand">
https://funannotate.readthedocs.io/en/latest/
https://github.com/PASApipeline/PASApipeline
https://github.com/PASApipeline/PASApipeline/wiki/setting-up-pasa-mysql
https://github.com/PASApipeline/PASApipeline/wiki
</pre>
</pre>
[[#top|Back to Top]]
[[#top|Back to Top]]
Line 103: Line 98:
=== Installation ===
=== Installation ===
   
   
Source code is obtained from https://github.com/nextgenusfs/funannotate
Source code is obtained from https://github.com/PASApipeline/PASApipeline
   
   
=== System ===
=== System ===
64-bit Linux
64-bit Linux

Revision as of 09:38, 5 June 2025

Category

Bioinformatics

Program On

Sapelo2

Version

2.5.3

Author / Distributor

https://github.com/PASApipeline/PASApipeline https://github.com/PASApipeline/PASApipeline/wiki

Description

"PASA, acronym for Program to Assemble Spliced Alignments (and pronounced 'pass-uh'), is a eukaryotic genome annotation tool that exploits spliced alignments of expressed transcript sequences to automatically model gene structures, and to maintain gene structure annotation consistent with the most recently available experimental sequence data. PASA also identifies and classifies all splicing variations supported by the transcript alignments."

More details are at https://github.com/PASApipeline/PASApipeline/wiki

Running Program

  • Version 2.5.3 with MySQL support is installed as an Apptainer container in /apps/singularity-images/pasa-2.5.3-MySQL/. The image file is pasa-2.5.3-mysql-production.sif.

To use this version of PASA with its sample data in a batch job, run the following setup command in your job working directory before submitting the job:

cp -r /apps/singularity-images/pasa-2.5.3-MySQL/{1_init.sh,2_create_user_and_db.sh,3_cleanup.sh,pasa_conf,sub.sh,sample_data} .

Note:

Running the above cp command copies the following setup files and folders to your current job working directory:

  • Three MySQL config scripts: 1_init.sh, 2_create_user_and_db.sh, and 3_cleanup.sh
  • A PASA config folder: pasa_conf/
  • A sample batch job submission script: sub.sh
  • A sample data folder provided by PASA: sample_data/
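
Before submitting, it can help to confirm that every item in the list above actually landed in the working directory. The following is a minimal sketch, not part of the provided setup files; the function name check_pasa_setup is hypothetical, and the item names are taken from the list above.

```shell
#!/bin/bash
# Hypothetical helper: report any expected PASA setup item that is missing
# from a directory (default: the current directory).
# Returns 0 when everything is present, 1 otherwise.
check_pasa_setup() {
    local dir="${1:-.}"
    local missing=0
    local item
    for item in 1_init.sh 2_create_user_and_db.sh 3_cleanup.sh sub.sh pasa_conf sample_data; do
        if [ ! -e "$dir/$item" ]; then
            echo "missing: $item"
            missing=1
        fi
    done
    return "$missing"
}
```

One possible use is to chain it with the submission, for example check_pasa_setup . && sbatch ./sub.sh, so that the job is only submitted when the setup is complete.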

Below is the sample batch job submission script (sub.sh) to run PASA in a batch job using 20 CPU cores on the batch partition:

#!/bin/bash
#SBATCH --job-name=pasa-mysql
#SBATCH --partition=batch
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=20
#SBATCH --mem=80gb
#SBATCH --time=48:00:00
#SBATCH --output=log.%j.out
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu

cd $SLURM_SUBMIT_DIR

# Initialize and start MySQL from inside of the container

./1_init.sh && ./2_create_user_and_db.sh

# Unzip the sample input data genome_sample.fasta.gz

gunzip sample_data/genome_sample.fasta.gz

# Run PASA pipeline with genome_sample.fasta

apptainer exec instance://pasa-mysql /bin/bash -c "source activate /opt/pasa && \$PASAHOME/Launch_PASA_pipeline.pl \
  --CPU 20 \
  --config ./sample_data/mysql.confs/alignAssembly.config --create --run \
  --ALIGNER gmap --genome ./sample_data/genome_sample.fasta --transcripts ./sample_data/all_transcripts.fasta.clean"

# Clean up on the compute node and shut down MySQL

./3_cleanup.sh

Note:

  1. In the real submission script, at least the underlined values in the Slurm headers above (job name, CPU count, memory, and time limit) need to be reviewed and replaced with values appropriate for your job.
  2. In the real submission script, the PASA command line above can be replaced by your own PASA command lines.
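
When the CPU count in the Slurm header is changed, the value passed to --CPU should normally change with it. One way to keep the two in sync is to read SLURM_CPUS_PER_TASK, which Slurm sets in the job environment when --cpus-per-task is given. The sketch below is an illustration only: the function name build_pasa_cmd is hypothetical, the fallback of 1 CPU is an assumption, and the PASA options are copied from the sample script.

```shell
#!/bin/bash
# Hypothetical helper: print the command string to run inside the container,
# with --CPU taken from SLURM_CPUS_PER_TASK (assumed fallback: 1 outside Slurm).
# The single-quoted format keeps $PASAHOME literal so that it is expanded
# by the shell inside the container, not by the job script.
build_pasa_cmd() {
    local cpus="${SLURM_CPUS_PER_TASK:-1}"
    printf 'source activate /opt/pasa && $PASAHOME/Launch_PASA_pipeline.pl --CPU %s --config ./sample_data/mysql.confs/alignAssembly.config --create --run --ALIGNER gmap --genome ./sample_data/genome_sample.fasta --transcripts ./sample_data/all_transcripts.fasta.clean\n' "$cpus"
}
```

In a job script, this could replace the hard-coded command with apptainer exec instance://pasa-mysql /bin/bash -c "$(build_pasa_cmd)".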

Please refer to Running Jobs on Sapelo2 for more information about running jobs on the Sapelo2 cluster.

Here is an example job submission command:

sbatch ./sub.sh 
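
If the same job is submitted a second time from the same directory, the gunzip step in the sample script will report an error, because gunzip removes genome_sample.fasta.gz after the first successful run. A minimal sketch of a rerun-safe alternative follows; the function name ensure_unzipped is hypothetical and is not part of the provided scripts.

```shell
#!/bin/bash
# Hypothetical helper: decompress FILE.gz to FILE only when needed,
# so that a resubmitted job does not fail at the gunzip step.
ensure_unzipped() {
    local gz="$1"
    local plain="${gz%.gz}"
    if [ -f "$plain" ]; then
        # A previous run already produced the uncompressed file.
        echo "already unzipped: $plain"
    elif [ -f "$gz" ]; then
        gunzip "$gz"
    else
        echo "not found: $gz" >&2
        return 1
    fi
}
```

In sub.sh, the call ensure_unzipped sample_data/genome_sample.fasta.gz could stand in for the plain gunzip line.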

Documentation

https://github.com/PASApipeline/PASApipeline
https://github.com/PASApipeline/PASApipeline/wiki


Installation

Source code is obtained from https://github.com/PASApipeline/PASApipeline

System

64-bit Linux