Proovread-Teaching

From Research Computing Center Wiki

Revision as of 15:35, 10 August 2018

Category

Bioinformatics

Program On

Teaching

Version

2.14.1

Author / Distributor

proovread

Description

"PacBio hybrid error correction through iterative short read consensus". More details are at proovread.

Running Program

The latest version of this application is at /usr/local/apps/eb/proovread/2.14.1-foss-2016b

To use this version, please load the module with:

ml proovread/2.14.1-foss-2016b 

Here is an example of a shell script, sub.sh, to run on the batch queue:

#!/bin/bash
#SBATCH --job-name=j_proovread
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu
#SBATCH --ntasks=12
#SBATCH --mem=10gb
#SBATCH --time=08:00:00
#SBATCH --output=proovread.%j.out
#SBATCH --error=proovread.%j.err

cd $SLURM_SUBMIT_DIR
ml proovread/2.14.1-foss-2016b
proovread -t 12 [options]

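In a real submission script, review at least the following values and replace them with ones that suit your job: the --mail-user address, --ntasks, --mem, --time, the thread count given to proovread -t, and the [options] placeholder.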
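As an illustration of filling in those values, the sketch below writes a complete sub.sh from a handful of parameters. The helper name write_sub_script and the input and output names (long_reads.fq, short_reads.fq, corrected) are hypothetical placeholders. Only the -t, -l, -s and -p flags from the proovread help text are used.

```shell
#!/bin/bash
# Hypothetical helper: writes a proovread submission script with the
# reviewed values filled in. The read file names and the output prefix
# are placeholders, not files that exist on the cluster.
write_sub_script() {
    local out=$1 threads=$2 mem=$3 walltime=$4 email=$5
    cat > "$out" <<EOF
#!/bin/bash
#SBATCH --job-name=j_proovread
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=${email}
#SBATCH --ntasks=${threads}
#SBATCH --mem=${mem}
#SBATCH --time=${walltime}
#SBATCH --output=proovread.%j.out
#SBATCH --error=proovread.%j.err

cd \$SLURM_SUBMIT_DIR
ml proovread/2.14.1-foss-2016b
proovread -t ${threads} -l long_reads.fq -s short_reads.fq -p corrected
EOF
}

# Same values as the example above: 12 threads, 10gb of memory, 8 hours.
write_sub_script sub.sh 12 10gb 08:00:00 username@uga.edu
```

Using one variable for both --ntasks and -t keeps the number of requested cores and the number of proovread threads in step, which is the pairing the example script above uses.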

For more details on running jobs on the teaching cluster, please refer to Running_Jobs_on_the_teaching_cluster, Run X window Jobs, and Run interactive Jobs.


Here is an example of a job submission command:

sbatch ./sub.sh 
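When the submission is scripted, it can help to capture the job ID that sbatch reports so the job can be checked afterwards. The following is a minimal sketch, assuming sbatch prints its usual "Submitted batch job <id>" confirmation line. The function name job_id_from_sbatch_output is made up for this example.

```shell
#!/bin/bash
# Extract the numeric job ID from sbatch's usual confirmation line,
# "Submitted batch job <id>". (Hypothetical helper name.)
job_id_from_sbatch_output() {
    awk '/^Submitted batch job/ {print $4}' <<< "$1"
}

# Only attempt a real submission where sbatch and sub.sh are both available.
if command -v sbatch >/dev/null 2>&1 && [ -f sub.sh ]; then
    job_id=$(job_id_from_sbatch_output "$(sbatch ./sub.sh)")
    if [ -n "$job_id" ]; then
        echo "Submitted job ${job_id}"
        squeue -j "$job_id"    # show the job's current state in the queue
    fi
fi
```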

Documentation

ml proovread/2.14.1-foss-2016b 
proovread -h

Options:
    -l, --long-reads=<FASTA/FASTQ>
        PacBio subreads or other erroneous sequences to be corrected.

    [-u, --unitigs=<FASTA/FASTQ>]
        High confidence unitigs. Can be specified multiple times.

    -s, --short-reads=<FASTA/FASTQ>
        High confidence short reads. Specify mulitple times for multiple
        libs, need to have same format (FASTA/FASTQ and offset 33,64)

          -s lib1.fq -s lib2.fq # use reads from two libs

    [--sam/--bam=<SAM/BAM>]
        External SAM or sorted BAM file instead of --short-reads.

    [-p, --prefix=<STRING>] [proovread]
        Prefix to output files.

    [--overwrite]
        Overwrite output folder if it already exists.

    [-t, --threads] [4]
        Number of threads to use for mapping and consensus.

    [--coverage=<INT>] [50]
        Estimated short read coverage, 50X recommended.

    [-m, --mode] [auto]
        Running mode of the pipeline, see README for details.

    [--create-cfg=<CFGFILENAME>] [<CWD>/proovread_cfg.pm]
        Create a custom config file with advanced options. Does not run the
        pipeline. For details, see config header section and README.

    [-c, --cfg]
        Use custom config file.

    [--lr-qv-offset/--sr-qv-offset=<INT>] [auto]
        Long/short read quality offset, required if auto-detect fails.

    [--ignore-sr-length]
        Don't stop if short reads are longer than 700bp.

    [--keep-temporary-files] [OFF]
        Keep temporary files of each task (BAMs, masked FASTQs, etc.)

    [--sample-run]
        Run the sample data set to test installation.

    [-h, --help]
    [-V, --version]
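The help text notes that -s can be given several times, once per short-read library. The sketch below assembles such a command line with a bash array and prints it for review instead of running it. The function name build_proovread_cmd and all file names are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: build a proovread command line with one -s flag
# per short-read library. Prints the command rather than running it.
build_proovread_cmd() {
    local long_reads=$1 prefix=$2 threads=$3
    shift 3
    local cmd=(proovread -t "$threads" -l "$long_reads" -p "$prefix")
    local lib
    for lib in "$@"; do
        cmd+=(-s "$lib")    # repeat -s for each library, as the help text describes
    done
    printf '%s\n' "${cmd[*]}"
}

build_proovread_cmd pacbio_subreads.fq corrected 12 lib1.fq lib2.fq
# prints: proovread -t 12 -l pacbio_subreads.fq -p corrected -s lib1.fq -s lib2.fq
```

As the help text states, all libraries passed this way need to share the same format and quality offset.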
 


Installation

Source code is obtained from proovread

System

64-bit Linux