Proovread-Teaching

From Research Computing Center Wiki

Category

Bioinformatics

Program On

Teaching

Version

2.14.1

Author / Distributor

proovread

Description

"PacBio hybrid error correction through iterative short read consensus". More details are available from the proovread project.

Running Program

The latest version of this application is installed at /usr/local/apps/eb/proovread/2.14.1-foss-2016b

To use this version, please load the module with

ml proovread/2.14.1-foss-2016b 

Here is an example of a shell script, sub.sh, to run on the batch queue:

#!/bin/bash
#SBATCH --job-name=j_proovread
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu
#SBATCH --ntasks=1
#SBATCH --mem=10gb
#SBATCH --time=08:00:00
#SBATCH --output=proovread.%j.out
#SBATCH --error=proovread.%j.err

cd $SLURM_SUBMIT_DIR
ml proovread/2.14.1-foss-2016b
proovread [options]

In a real submission script, at least the placeholder values above (for example the job name, email address, memory, time limit, and the proovread options) need to be reviewed and replaced with proper values.

Please refer to Running_Jobs_on_the_teaching_cluster, Run X window Jobs, and Run interactive Jobs for more details on running jobs on the Teaching cluster.
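As a more concrete illustration, the sketch below fills in the placeholders using options listed in the Documentation section (-l, -s, -p, -t). The file names long_reads.fq and short_reads.fq, the prefix corrected, and the request for 4 CPU cores are assumptions chosen for illustration, not site recommendations. The script prints the assembled command and only runs it when proovread is found on the PATH.

```shell
#!/bin/bash
#SBATCH --job-name=j_proovread
#SBATCH --partition=batch
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4        # assumption: matches proovread's default of 4 threads
#SBATCH --mem=10gb
#SBATCH --time=08:00:00
#SBATCH --output=proovread.%j.out
#SBATCH --error=proovread.%j.err

# Move to the directory the job was submitted from (current directory if unset).
cd "${SLURM_SUBMIT_DIR:-.}" || exit 1

# Load the module only where the "ml" command is available.
if command -v ml >/dev/null 2>&1; then
    ml proovread/2.14.1-foss-2016b
fi

# Hypothetical input files and output prefix; replace with your own.
LONG_READS="long_reads.fq"
SHORT_READS="short_reads.fq"
PREFIX="corrected"
THREADS="${SLURM_CPUS_PER_TASK:-4}"

# Build the command as an array so file names with spaces stay intact.
cmd=(proovread -l "$LONG_READS" -s "$SHORT_READS" -p "$PREFIX" -t "$THREADS")
echo "Command: ${cmd[*]}"

# Run only if proovread is actually on the PATH.
if command -v proovread >/dev/null 2>&1; then
    "${cmd[@]}"
else
    echo "proovread not found; is the module loaded?" >&2
fi
```

Taking the thread count from SLURM_CPUS_PER_TASK keeps the -t value in step with the cores requested from the scheduler, so only the #SBATCH line needs to change.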


Here is an example of a job submission command:

sbatch ./sub.sh 
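When submissions are scripted, it can help to capture the job ID that sbatch prints (a line of the form "Submitted batch job <id>") so the job can be checked later with squeue. The following is a minimal sketch; the helper name job_id_from_sbatch is hypothetical, and the submission step is skipped when sbatch or ./sub.sh is not available.

```shell
#!/bin/bash

# Hypothetical helper: extract the job ID from sbatch's
# "Submitted batch job <id>" message read on standard input.
job_id_from_sbatch() {
    awk '/^Submitted batch job/ {print $4}'
}

if command -v sbatch >/dev/null 2>&1 && [ -f ./sub.sh ]; then
    jobid=$(sbatch ./sub.sh | job_id_from_sbatch)
    echo "Submitted job ${jobid}"
    # Show the state of this job in the queue.
    squeue -j "$jobid"
else
    echo "sbatch or ./sub.sh not found; nothing submitted" >&2
fi
```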

Documentation

ml proovread/2.14.1-foss-2016b 
proovread -h
[Wed Aug 15 15:42:49 2018] Running proovread-2.14.1 under Perl 5.24.1
[Wed Aug 15 15:42:49 2018] Reading core config
[Wed Aug 15 15:42:49 2018] Reading command line options
Options:
    -l, --long-reads=<FASTA/FASTQ>
        PacBio subreads or other erroneous sequences to be corrected.

    [-u, --unitigs=<FASTA/FASTQ>]
        High confidence unitigs. Can be specified multiple times.

    -s, --short-reads=<FASTA/FASTQ>
        High confidence short reads. Specify multiple times for multiple
        libs, need to have same format (FASTA/FASTQ and offset 33,64)

          -s lib1.fq -s lib2.fq # use reads from two libs

    [--sam/--bam=<SAM/BAM>]
        External SAM or sorted BAM file instead of --short-reads.

    [-p, --prefix=<STRING>] [proovread]
        Prefix to output files.

    [--overwrite]
        Overwrite output folder if it already exists.

    [-t, --threads] [4]
        Number of threads to use for mapping and consensus.

    [--coverage=<INT>] [50]
        Estimated short read coverage, 50X recommended.

    [-m, --mode] [auto]
        Running mode of the pipeline, see README for details.

    [--create-cfg=<CFGFILENAME>] [<CWD>/proovread_cfg.pm]
        Create a custom config file with advanced options. Does not run the
        pipeline. For details, see config header section and README.

    [-c, --cfg]
        Use custom config file.

    [--lr-qv-offset/--sr-qv-offset=<INT>] [auto]
        Long/short read quality offset, required if auto-detect fails.

    [--ignore-sr-length]
        Don't stop if short reads are longer than 700bp.

    [--keep-temporary-files] [OFF]
        Keep temporary files of each task (BAMs, masked FASTQs, etc.)

    [--sample-run]
        Run the sample data set to test installation.

    [--haplo-coverage] [OFF]
        Adjust coverage for reads with low-coverage haplotype.

    [--no-sampling]
        Deactivate sampling, regardless of config settings.

    [-h, --help]
    [-V, --version]
    [--debug]
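To illustrate how several of these options combine, the sketch below assembles a command that passes two short-read libraries by repeating -s, as described above, and sets --coverage and -t explicitly. All file names and the prefix are hypothetical. The command is only printed, so it can be reviewed before being placed in a submission script.

```shell
#!/bin/bash

# Hypothetical inputs; replace with your own files.
long_reads="pacbio_subreads.fq"
short_libs=("lib1.fq" "lib2.fq")

# Start with the long-read input and an output prefix.
args=(-l "$long_reads" -p "pb_corrected")

# Repeat -s once per short-read library (all libraries must share one format).
for lib in "${short_libs[@]}"; do
    args+=(-s "$lib")
done

# Set the estimated short-read coverage and the thread count explicitly.
args+=(--coverage=50 -t 4)

# Print the full command for review instead of running it.
echo "proovread ${args[*]}"
```

Before a full run, proovread --sample-run (listed above) can be used to test the installation.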



Installation

Source code is obtained from the proovread project.

System

64-bit Linux