SSPACE-longread-Teaching
Category
Bioinformatics
Program On
Teaching
Version
1-1
Author / Distributor
Description
"SSPACE-LongRead is a stand-alone program for scaffolding pre-assembled contigs using long reads (e.g. PacBio RS reads). Using the long read information, contigs (or scaffolds) are placed in the right order and orientation in so-called super-scaffolds. " More details are at SSPACE-longread
Running Program
The latest version of this application is installed at /usr/local/apps/gb/sspace-longread/1-1
To use this version, please load the module with
ml sspace-longread/1-1
Here is an example of a shell script, sub.sh, to run on the batch queue:
#!/bin/bash
#SBATCH --job-name=j_SSPACE-longread
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu
#SBATCH --ntasks=1
#SBATCH --mem=10gb
#SBATCH --time=08:00:00
#SBATCH --output=SSPACE-longread.%j.out
#SBATCH --error=SSPACE-longread.%j.err
cd $SLURM_SUBMIT_DIR
ml sspace-longread/1-1
perl /usr/local/apps/gb/sspace-longread/1-1/SSPACE-LongRead.pl [options]
In a real submission script, review the values above (such as the job name, email address, memory, time limit, and the [options] placeholder) and replace them with values appropriate for your job.
Please refer to Running_Jobs_on_the_teaching_cluster, Run X window Jobs and Run interactive Jobs for more details on running jobs on the teaching cluster.
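The [options] placeholder in the script above has to be replaced by real arguments. The following is a minimal sketch of how the command line could be assembled from the required -c and -p options plus the optional -b and -t options listed in the program's help. The file names contigs.fasta and pacbio_reads.fasta, the output folder sspace_results, and the helper sspace_cmd are hypothetical and only serve as an illustration. The sketch prints the command instead of running it, so it can be reviewed first.

```shell
#!/bin/bash
# Sketch only: the input and output names used below are hypothetical placeholders.
SSPACE=/usr/local/apps/gb/sspace-longread/1-1/SSPACE-LongRead.pl

# Print the SSPACE-LongRead command line for the given inputs.
# Usage: sspace_cmd <contigs.fasta> <pacbio_reads> <output_dir> <threads>
sspace_cmd() {
    printf 'perl %s -c %s -p %s -b %s -t %s\n' "$SSPACE" "$1" "$2" "$3" "$4"
}

# Use the CPU count Slurm granted to the task, falling back to 1 outside a job.
sspace_cmd contigs.fasta pacbio_reads.fasta sspace_results "${SLURM_CPUS_PER_TASK:-1}"
```

If more than one thread is passed with -t, the job would presumably also need to request the matching number of cores, for example with #SBATCH --cpus-per-task in the submission script.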
Here is an example of job submission command:
sbatch ./sub.sh
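On success, sbatch prints a line of the form "Submitted batch job <jobid>". As a small sketch, assuming the script is named sub.sh as above, the job ID can be captured and used to check the job's state with squeue. The helper job_id_from is a hypothetical name introduced here for illustration.

```shell
#!/bin/bash
# Extract the numeric job ID from the line sbatch prints on success,
# e.g. "Submitted batch job 12345".
job_id_from() {
    echo "$1" | awk '/^Submitted batch job/ {print $4}'
}

# Only attempt a submission where sbatch and the script are available.
if command -v sbatch >/dev/null 2>&1 && [ -f ./sub.sh ]; then
    jid=$(job_id_from "$(sbatch ./sub.sh)")
    echo "Submitted job $jid"
    squeue -j "$jid"    # show the job's current state in the queue
fi
```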
Documentation
ml sspace-longread/1-1
perl /usr/local/apps/gb/sspace-longread/1-1/SSPACE-LongRead.pl -h

Usage SSPACE-LongRead scaffolder version 1-1

perl SSPACE-LongRead.pl -c <contig-sequences> -p <pacbio-reads>

General options:
  -c  Fasta file containing contig sequences used for scaffolding (REQUIRED)
  -p  File containing PacBio CLR sequences to be used scaffolding (REQUIRED)
  -b  Output folder name where the results are stored (optional, default -b 'PacBio_scaffolder_results')

Alignment options:
  -a  Minimum alignment length to allow a contig to be included for scaffolding (default -a 0, optional)
  -i  Minimum identity of the alignment of the PacBio reads to the contig sequences. Alignment below this value will be filtered out (default -i 70, optional)
  -t  The number of threads to run BLASR with
  -g  Minimmum gap between two contigs

Scaffolding options:
  -l  Minimum number of links (PacBio reads) to allow contig-pairs for scaffolding (default -k 3, optional)
  -r  Maximum link ratio between two best contig pairs *higher values lead to least accurate scaffolding* (default -r 0.3, optional)
  -o  Minimum overlap length to merge two contigs (default -o 10, optional)

Other options:
  -k  Store inner-scaffold sequences in a file. These are the long-read sequences spanning over a contig-link (default no output, set '-k 1' to store inner-scaffold sequences. If set, a folder is generated named 'inner-scaffold-sequences'
  -s  Skip the alignment step and use a previous alignment file. Note that the results of a previous run will be overwritten. Set '-s 1' to skip the alignment.
  -h  Prints this help message

ERROR: Please insert a file with contig sequences. You've inserted '' which either does not exist or is not filled in
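The error at the end of the help output indicates that the program refuses to start when the contig file does not exist or is empty. A minimal pre-flight check along these lines can catch that before a job is queued. The helper check_input and the file names contigs.fasta and pacbio_reads.fasta are hypothetical; the check relies only on the shell's -s test (file exists and has a size greater than zero).

```shell
#!/bin/bash
# Return 0 if the file exists and is non-empty; otherwise report it and return 1.
check_input() {
    if [ -s "$1" ]; then
        return 0
    fi
    echo "Input file '$1' is missing or empty" >&2
    return 1
}

# Hypothetical file names; replace with the files passed to -c and -p.
for f in contigs.fasta pacbio_reads.fasta; do
    check_input "$f" || echo "Fix '$f' before submitting the job"
done
```

This only checks that the files are present and non-empty; it does not validate that they are well-formed sequence files.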
Installation
Source code is obtained from SSPACE-longread
System
64-bit Linux