SPAdes-Teaching

From Research Computing Center Wiki
Revision as of 13:07, 10 August 2018 by Yhuang

Category

Bioinformatics

Program On

Teaching

Version

3.11.1

Author / Distributor

SPAdes

Description

"Genome assembler for single-cell and isolates data sets" More details are at SPAdes

Running Program

The latest version of this application is installed at /usr/local/apps/eb/SPAdes/3.11.1-foss-2016b

To use this version, please load the module with

ml SPAdes/3.11.1-foss-2016b 

Here is an example of a shell script, sub.sh, to run on the batch queue:

#!/bin/bash
#SBATCH --job-name=j_SPAdes
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu
#SBATCH --ntasks=1
#SBATCH --mem=10gb
#SBATCH --time=08:00:00
#SBATCH --output=SPAdes.%j.out
#SBATCH --error=SPAdes.%j.err

cd "$SLURM_SUBMIT_DIR"
ml SPAdes/3.11.1-foss-2016b
# Replace [options] with your input files and -o <output_dir>
spades.py [options]

In a real submission script, review at least the values shown above (for example the job name, e-mail address, memory, time limit, and spades.py options) and replace them with values appropriate for your job.
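
The spades.py help text lists defaults of 16 threads and a 250 Gb memory limit, which are larger than the single task and 10gb requested in the script above, so it may help to pass -t and -m explicitly. The following is a minimal sketch, not a tested recipe. It assumes the job also requests cores with --cpus-per-task (not part of the script above) and that Slurm exports SLURM_CPUS_PER_TASK and SLURM_MEM_PER_NODE (in megabytes) inside the job. The function name spades_resource_flags, the read file names, and the output directory are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: turn a CPU count and a memory size in MB into
# the -t/-m flags described in the spades.py help text.
spades_resource_flags() {
    local cpus="${1:-1}"        # fall back to 1 thread
    local mem_mb="${2:-10240}"  # fall back to 10 GB, as in --mem=10gb
    local mem_gb=$(( mem_mb / 1024 ))
    # -m takes whole gigabytes; never pass 0
    if (( mem_gb < 1 )); then
        mem_gb=1
    fi
    echo "-t ${cpus} -m ${mem_gb}"
}

# Assumption: Slurm sets these variables inside the job when
# --cpus-per-task and --mem are requested; otherwise the fallbacks apply.
flags=$(spades_resource_flags "${SLURM_CPUS_PER_TASK:-1}" "${SLURM_MEM_PER_NODE:-10240}")

# Print the command instead of running it; drop the echo in a real job.
# ${flags} is left unquoted on purpose so it splits into separate words.
# reads_1.fastq, reads_2.fastq and spades_out are placeholder names.
echo spades.py -1 reads_1.fastq -2 reads_2.fastq ${flags} -o spades_out
```

Without Slurm variables in the environment, the sketch falls back to 1 thread and 10 Gb, matching the example request.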

Please refer to Running_Jobs_on_the_teaching_cluster, Run X window Jobs and Run interactive Jobs for more details on running jobs on the teaching cluster.


Here is an example of a job submission command:

sbatch ./sub.sh 

Documentation

ml SPAdes/3.11.1-foss-2016b 
spades.py -h
SPAdes genome assembler v3.11.1

Usage: /usr/local/apps/eb/SPAdes/3.11.1-foss-2016b/bin/spades.py [options] -o <output_dir>

Basic options:
-o	<output_dir>	directory to store all the resulting files (required)
--sc			this flag is required for MDA (single-cell) data
--meta			this flag is required for metagenomic sample data
--rna			this flag is required for RNA-Seq data 
--plasmid		runs plasmidSPAdes pipeline for plasmid detection 
--iontorrent		this flag is required for IonTorrent data
--test			runs SPAdes on toy dataset
-h/--help		prints this usage message
-v/--version		prints version

Input data:
--12	<filename>	file with interlaced forward and reverse paired-end reads
-1	<filename>	file with forward paired-end reads
-2	<filename>	file with reverse paired-end reads
-s	<filename>	file with unpaired reads
--pe<#>-12	<filename>	file with interlaced reads for paired-end library number <#> (<#> = 1,2,..,9)
--pe<#>-1	<filename>	file with forward reads for paired-end library number <#> (<#> = 1,2,..,9)
--pe<#>-2	<filename>	file with reverse reads for paired-end library number <#> (<#> = 1,2,..,9)
--pe<#>-s	<filename>	file with unpaired reads for paired-end library number <#> (<#> = 1,2,..,9)
--pe<#>-<or>	orientation of reads for paired-end library number <#> (<#> = 1,2,..,9; <or> = fr, rf, ff)
--s<#>		<filename>	file with unpaired reads for single reads library number <#> (<#> = 1,2,..,9)
--mp<#>-12	<filename>	file with interlaced reads for mate-pair library number <#> (<#> = 1,2,..,9)
--mp<#>-1	<filename>	file with forward reads for mate-pair library number <#> (<#> = 1,2,..,9)
--mp<#>-2	<filename>	file with reverse reads for mate-pair library number <#> (<#> = 1,2,..,9)
--mp<#>-s	<filename>	file with unpaired reads for mate-pair library number <#> (<#> = 1,2,..,9)
--mp<#>-<or>	orientation of reads for mate-pair library number <#> (<#> = 1,2,..,9; <or> = fr, rf, ff)
--hqmp<#>-12	<filename>	file with interlaced reads for high-quality mate-pair library number <#> (<#> = 1,2,..,9)
--hqmp<#>-1	<filename>	file with forward reads for high-quality mate-pair library number <#> (<#> = 1,2,..,9)
--hqmp<#>-2	<filename>	file with reverse reads for high-quality mate-pair library number <#> (<#> = 1,2,..,9)
--hqmp<#>-s	<filename>	file with unpaired reads for high-quality mate-pair library number <#> (<#> = 1,2,..,9)
--hqmp<#>-<or>	orientation of reads for high-quality mate-pair library number <#> (<#> = 1,2,..,9; <or> = fr, rf, ff)
--nxmate<#>-1	<filename>	file with forward reads for Lucigen NxMate library number <#> (<#> = 1,2,..,9)
--nxmate<#>-2	<filename>	file with reverse reads for Lucigen NxMate library number <#> (<#> = 1,2,..,9)
--sanger	<filename>	file with Sanger reads
--pacbio	<filename>	file with PacBio reads
--nanopore	<filename>	file with Nanopore reads
--tslr	<filename>	file with TSLR-contigs
--trusted-contigs	<filename>	file with trusted contigs
--untrusted-contigs	<filename>	file with untrusted contigs
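
As an illustration of the numbered library options above, here is a minimal sketch that builds the --pe<#>-1 and --pe<#>-2 arguments for several paired-end libraries. The function name pe_library_args and all file names are hypothetical, file names are assumed to contain no whitespace, and the sketch only prints the resulting command.

```shell
#!/bin/bash
# Hypothetical helper: given files in the order
#   forward1 reverse1 forward2 reverse2 ...
# print --pe<#>-1/--pe<#>-2 arguments, numbering libraries from 1.
pe_library_args() {
    if (( $# == 0 || $# % 2 != 0 )); then
        echo "expected an even, non-zero number of files" >&2
        return 1
    fi
    if (( $# / 2 > 9 )); then
        echo "the help text allows library numbers 1 to 9 only" >&2
        return 1
    fi
    local n=1
    local args=()
    while (( $# > 0 )); do
        args+=("--pe${n}-1" "$1" "--pe${n}-2" "$2")
        shift 2
        n=$(( n + 1 ))
    done
    echo "${args[@]}"
}

# Placeholder file names for two paired-end libraries.
lib_args=$(pe_library_args libA_R1.fastq libA_R2.fastq libB_R1.fastq libB_R2.fastq)

# Print the command instead of running it; drop the echo in a real job.
echo spades.py ${lib_args} -o spades_out
```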

Pipeline options:
--only-error-correction	runs only read error correction (without assembling)
--only-assembler	runs only assembling (without read error correction)
--careful		tries to reduce number of mismatches and short indels
--continue		continue run from the last available check-point
--restart-from	<cp>	restart run with updated options and from the specified check-point ('ec', 'as', 'k<int>', 'mc')
--disable-gzip-output	forces error correction not to compress the corrected reads
--disable-rr		disables repeat resolution stage of assembling
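
For a job that stops before finishing (for example at the time limit), the --continue option above can resume from the last check-point. The following is a minimal sketch under two assumptions that are not stated in the help text: that an earlier run leaves a spades.log file in the output directory, and that --continue is given together with -o only. The function name spades_resume_or_start and the file names are hypothetical, and the command is printed rather than run.

```shell
#!/bin/bash
# Hypothetical helper: print a resume command if the output directory
# already holds a log from an earlier run, otherwise a fresh command.
# Usage: spades_resume_or_start <output_dir> [spades.py options...]
spades_resume_or_start() {
    local outdir="$1"
    shift
    if [[ -f "${outdir}/spades.log" ]]; then
        # Assumption: --continue needs only -o and reuses the saved options.
        echo "spades.py --continue -o ${outdir}"
    else
        echo "spades.py $* -o ${outdir}"
    fi
}

# Placeholder read files and output directory.
spades_resume_or_start spades_out -1 reads_1.fastq -2 reads_2.fastq --careful
```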

Advanced options:
--dataset	<filename>	file with dataset description in YAML format
-t/--threads	<int>		number of threads
				[default: 16]
-m/--memory	<int>		RAM limit for SPAdes in Gb (terminates if exceeded)
				[default: 250]
--tmp-dir	<dirname>	directory for temporary files
				[default: <output_dir>/tmp]
-k		<int,int,...>	comma-separated list of k-mer sizes (must be odd and
				less than 128) [default: 'auto']
--cov-cutoff	<float>		coverage cutoff value (a positive float number, or 'auto', or 'off') [default: 'off']
--phred-offset	<33 or 64>	PHRED quality offset in the input reads (33 or 64)
				[default: auto-detect]
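
The -k constraint above (odd values less than 128) can be checked before a job is submitted. The following is a minimal sketch; the function name validate_kmers is hypothetical, and it checks only the two conditions stated in the help text.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if every value in a comma-separated
# list is a positive odd integer less than 128.
validate_kmers() {
    local list="$1"
    local k
    local -a values
    [[ -n "$list" ]] || return 1
    IFS=',' read -r -a values <<< "$list"
    for k in "${values[@]}"; do
        # Accept only plain decimal numbers of up to three digits,
        # without leading zeros.
        [[ "$k" =~ ^[1-9][0-9]{0,2}$ ]] || return 1
        (( k % 2 == 1 && k < 128 )) || return 1
    done
    return 0
}

klist="21,33,55"   # example values only
if validate_kmers "$klist"; then
    echo "spades.py -k ${klist} [other options]"
else
    echo "invalid k-mer list: ${klist}" >&2
fi
```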

Installation

Source code was obtained from SPAdes.

System

64-bit Linux