StringTie-Teaching
Category
Bioinformatics
Program On
Teaching
Version
1.3.3
Author / Distributor
Description
"StringTie is a fast and highly efficient assembler of RNA-Seq alignments into potential transcripts." More details are available on the StringTie website.
Running Program
The latest version of this application is installed at /usr/local/apps/eb/StringTie/1.3.3-foss-2016b
To use this version, please load the module with
ml StringTie/1.3.3-foss-2016b
Here is an example of a shell script, sub.sh, to run on the batch queue:
#!/bin/bash
#SBATCH --job-name=j_StringTie
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu
#SBATCH --ntasks=1
#SBATCH --mem=10gb
#SBATCH --time=08:00:00
#SBATCH --output=StringTie.%j.out
#SBATCH --error=StringTie.%j.err
cd $SLURM_SUBMIT_DIR
ml StringTie/1.3.3-foss-2016b
stringtie [options]
In a real submission script, at least the values shown above (for example the job name, mail address, memory, time limit, and the stringtie options) need to be reviewed and replaced with values appropriate for your job.
Please refer to Running_Jobs_on_the_teaching_cluster, Run X window Jobs and Run interactive Jobs for more details on running jobs on the Teaching cluster.
Here is an example of job submission command:
sbatch ./sub.sh
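As an illustration of what might replace the "stringtie [options]" line in sub.sh, here is a minimal sketch of a single-sample assembly. The file names (sample1.sorted.bam, annotation.gtf, sample1.gtf) and the label are hypothetical placeholders, not files provided on the cluster. The sketch assumes the input BAM is coordinate-sorted and that, if more than one thread is wanted, CPUs were requested with #SBATCH --cpus-per-task so that SLURM_CPUS_PER_TASK is set. Only options listed in the stringtie --help output are used.

```shell
#!/bin/bash
# Hypothetical inputs and outputs: replace with your own files.
BAM="sample1.sorted.bam"     # coordinate-sorted RNA-Seq alignments (assumed name)
GUIDE="annotation.gtf"       # reference annotation for -G (assumed name)
OUT="sample1.gtf"            # assembled transcripts written by -o
LABEL="sample1"              # name prefix for output transcripts (-l)

# Use the CPU count granted by SLURM; fall back to 1 outside a job
# or when --cpus-per-task was not requested.
THREADS="${SLURM_CPUS_PER_TASK:-1}"

# Run the assembly only if stringtie is on PATH and the input exists;
# otherwise print the command that would have been run.
if command -v stringtie >/dev/null 2>&1 && [ -f "$BAM" ]; then
    stringtie "$BAM" -G "$GUIDE" -l "$LABEL" -o "$OUT" -p "$THREADS"
else
    echo "dry run: stringtie $BAM -G $GUIDE -l $LABEL -o $OUT -p $THREADS"
fi
```

Tying -p to SLURM_CPUS_PER_TASK keeps the thread count from exceeding what the job was actually allocated.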
Documentation
ml StringTie/1.3.3-foss-2016b
stringtie --help
StringTie v1.3.3 usage:
 stringtie <input.bam ..> [-G <guide_gff>] [-l <label>] [-o <out_gtf>] [-p <cpus>]
  [-v] [-a <min_anchor_len>] [-m <min_tlen>] [-j <min_anchor_cov>] [-f <min_iso>]
  [-C <coverage_file_name>] [-c <min_bundle_cov>] [-g <bdist>] [-u] [-e]
  [-x <seqid,..>] [-A <gene_abund.out>] [-h] {-B | -b <dir_path>}
Assemble RNA-Seq alignments into potential transcripts.
 Options:
 --version : print just the version at stdout and exit
 -G reference annotation to use for guiding the assembly process (GTF/GFF3)
 --rf assume stranded library fr-firststrand
 --fr assume stranded library fr-secondstrand
 -l name prefix for output transcripts (default: STRG)
 -f minimum isoform fraction (default: 0.1)
 -m minimum assembled transcript length (default: 200)
 -o output path/file name for the assembled transcripts GTF (default: stdout)
 -a minimum anchor length for junctions (default: 10)
 -j minimum junction coverage (default: 1)
 -t disable trimming of predicted transcripts based on coverage
    (default: coverage trimming is enabled)
 -c minimum reads per bp coverage to consider for transcript assembly
    (default: 2.5)
 -v verbose (log bundle processing details)
 -g gap between read mappings triggering a new bundle (default: 50)
 -C output a file with reference transcripts that are covered by reads
 -M fraction of bundle allowed to be covered by multi-hit reads (default:0.95)
 -p number of threads (CPUs) to use (default: 1)
 -A gene abundance estimation output file
 -B enable output of Ballgown table files which will be created in the
    same directory as the output GTF (requires -G, -o recommended)
 -b enable output of Ballgown table files but these files will be
    created under the directory path given as <dir_path>
 -e only estimate the abundance of given reference transcripts (requires -G)
 -x do not assemble any transcripts on the given reference sequence(s)
 -u no multi-mapping correction (default: correction enabled)
 -h print this usage message and exit

Transcript merge usage mode:
  stringtie --merge [Options] { gtf_list | strg1.gtf ...}
With this option StringTie will assemble transcripts from multiple
input files generating a unified non-redundant set of isoforms. In this mode
the following options are available:
  -G <guide_gff>   reference annotation to include in the merging (GTF/GFF3)
  -o <out_gtf>     output file name for the merged transcripts GTF
                    (default: stdout)
  -m <min_len>     minimum input transcript length to include in the merge
                    (default: 50)
  -c <min_cov>     minimum input transcript coverage to include in the merge
                    (default: 0)
  -F <min_fpkm>    minimum input transcript FPKM to include in the merge
                    (default: 1.0)
  -T <min_tpm>     minimum input transcript TPM to include in the merge
                    (default: 1.0)
  -f <min_iso>     minimum isoform fraction (default: 0.01)
  -g <gap_len>     gap between transcripts to merge together (default: 250)
  -i               keep merged transcripts with retained introns; by default
                   these are not kept unless there is strong evidence for them
  -l <label>       name prefix for output transcripts (default: MSTRG)
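The merge mode listed above is commonly run between per-sample assembly and a final abundance estimation step. The following is a minimal sketch of that sequence under stated assumptions: the sample names, annotation.gtf, the mergelist.txt file name, and the ballgown/ output layout are all hypothetical, and each sample is assumed to already have a per-sample GTF (sample.gtf) and a sorted BAM (sample.sorted.bam) in the working directory. Only options from the usage text are used.

```shell
#!/bin/bash
# Hypothetical sample names and annotation file: replace with your own.
SAMPLES="sample1 sample2"
GUIDE="annotation.gtf"
THREADS="${SLURM_CPUS_PER_TASK:-1}"

# Step 1: write the list of per-sample GTFs, one path per line.
# This file is passed to --merge as the gtf_list argument.
: > mergelist.txt
for s in $SAMPLES; do
    echo "$s.gtf" >> mergelist.txt
done

# Steps 2 and 3 run only if stringtie is on PATH and the annotation exists.
if command -v stringtie >/dev/null 2>&1 && [ -f "$GUIDE" ]; then
    # Step 2: merge all per-sample assemblies into one non-redundant set.
    stringtie --merge -G "$GUIDE" -o merged.gtf mergelist.txt

    # Step 3: re-estimate abundances against the merged set (-e requires -G)
    # and write Ballgown tables next to each output GTF (-B).
    for s in $SAMPLES; do
        mkdir -p "ballgown/$s"
        stringtie "$s.sorted.bam" -e -B -G merged.gtf \
            -o "ballgown/$s/$s.gtf" -p "$THREADS"
    done
else
    echo "stringtie or $GUIDE not found; only mergelist.txt was written"
fi
```

Writing each sample's output into its own directory keeps the Ballgown table files, which -B places alongside the output GTF, from overwriting one another.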
Installation
Source code is obtained from StringTie
System
64-bit Linux