Canu-Teaching: Difference between revisions

Revision as of 14:40, 15 August 2018

Program On

Teaching

Version

1.5

Author / Distributor

Description

"Canu is a fork of the Celera Assembler, designed for high-noise single-molecule sequencing (such as the PacBio RS II or Oxford Nanopore MinION). Canu is a hierarchical assembly pipeline which runs in four steps:

  Detect overlaps in high-noise sequences using MHAP
  Generate corrected sequence consensus
  Trim corrected sequences
  Assemble trimmed corrected sequences"

More details are at canu

Running Program

The latest version of this application is installed at /usr/local/apps/eb/canu/1.5-foss-2016b

To use this version, please load the module with

ml canu/1.5-foss-2016b

Here is an example of a shell script, sub.sh, to run on the batch queue:

#!/bin/bash
#SBATCH --job-name=j_canu
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu
#SBATCH --ntasks=1
#SBATCH --mem=10gb
#SBATCH --time=08:00:00
#SBATCH --output=canu.%j.out
#SBATCH --error=canu.%j.err

cd $SLURM_SUBMIT_DIR
ml canu/1.5-foss-2016b
canu [options]

In a real submission script, at least the values shown above, such as the job name, mail address, memory and time limits, output file names, and the canu [options], need to be reviewed or replaced with proper values.
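As a rough illustration of what could replace the canu [options] placeholder, the sketch below builds a command line from the required arguments listed by canu -h (-p, -d, genomeSize, and a read type). The prefix, directory, genome size, and reads file name are all hypothetical placeholders, not values taken from this page, and the command is only printed so it can be reviewed before it goes into sub.sh.

```shell
#!/bin/bash
# Sketch only: every value below is a hypothetical placeholder.
prefix="ecoli"             # -p: prefix used to name most output files
outdir="ecoli-assembly"    # -d: assembly directory (created by canu)
genome_size="4.7m"         # best guess of the genome size
reads="reads.fastq.gz"     # hypothetical PacBio raw reads file

# Assemble the command line in the order shown by 'canu -h'.
canu_cmd="canu -p $prefix -d $outdir genomeSize=$genome_size -pacbio-raw $reads"

# Print the command for review before putting it into sub.sh.
echo "$canu_cmd"
```

Once the printed line looks right, it can be pasted into sub.sh in place of canu [options]. For Nanopore data, -nanopore-raw would take the place of -pacbio-raw.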

Please refer to Running_Jobs_on_the_teaching_cluster, Run X window Jobs and Run interactive Jobs for more details on running jobs on the Teaching cluster.

Here is an example of a job submission command:

sbatch ./sub.sh

Documentation

ml canu/1.5-foss-2016b 
canu -h

usage: canu [-version] \
            [-correct | -trim | -assemble | -trim-assemble] \
            [-s <assembly-specifications-file>] \
             -p <assembly-prefix> \
             -d <assembly-directory> \
             genomeSize=<number>[g|m|k] \
            [other-options] \
            [-pacbio-raw | -pacbio-corrected | -nanopore-raw | -nanopore-corrected] *fastq

  By default, all three stages (correct, trim, assemble) are computed.
  To compute only a single stage, use:
    -correct       - generate corrected reads
    -trim          - generate trimmed reads
    -assemble      - generate an assembly
    -trim-assemble - generate trimmed reads and then assemble them

  The assembly is computed in the (created) -d <assembly-directory>, with most
  files named using the -p <assembly-prefix>.

  The genome size is your best guess of the genome size of what is being assembled.
  It is used mostly to compute coverage in reads.  Fractional values are allowed: '4.7m'
  is the same as '4700k' and '4700000'

  A full list of options can be printed with '-options'.  All options
  can be supplied in an optional sepc file.

  Reads can be either FASTA or FASTQ format, uncompressed, or compressed
  with gz, bz2 or xz.  Reads are specified by the technology they were
  generated with:
    -pacbio-raw         <files>
    -pacbio-corrected   <files>
    -nanopore-raw       <files>
    -nanopore-corrected <files>

Complete documentation at http://canu.readthedocs.org/en/latest/
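The help text above states that genomeSize accepts fractional values with a g, m, or k suffix, and that '4.7m', '4700k', and '4700000' are equivalent. The helper below is a minimal sketch of that arithmetic, useful for checking a value before submitting a job. The function name genome_size_to_bases is hypothetical and is not part of canu, and the decimal multipliers (1000, 1000000, 1000000000) are inferred from the equivalence given in the help text.

```shell
#!/bin/bash
# Hypothetical helper (not part of canu): convert a genomeSize value
# such as 4.7m or 4700k into a plain number of bases.
genome_size_to_bases() {
    awk -v s="$1" 'BEGIN {
        n = s + 0                           # leading numeric part, e.g. 4.7
        u = tolower(substr(s, length(s)))   # last character: g, m, k, or a digit
        if (u == "k")      n *= 1000
        else if (u == "m") n *= 1000000
        else if (u == "g") n *= 1000000000
        printf "%.0f\n", n                  # print as a whole number of bases
    }'
}

# The three spellings from the help text should give the same result.
genome_size_to_bases 4.7m
genome_size_to_bases 4700k
genome_size_to_bases 4700000
```

Because the genome size is used mostly to compute read coverage, an estimate that is off by a unit suffix (for example 4.7k instead of 4.7m) would distort that calculation by a factor of a thousand, so a quick check like this can be worthwhile.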

Installation

Source code is obtained from canu

System

64-bit Linux

@@ Line 9: / Line 9: @@
 === Version ===
-1.4
+1.5
 === Author / Distributor ===
@@ Line 21: / Line 21: @@
 === Running Program ===
-The last version of this application is at /usr/local/apps/eb/canu/1.4-foss-2016b
+The last version of this application is at /usr/local/apps/eb/canu/1.5-foss-2016b
 To use this version, please load the module with
 <pre class="gscript">
-ml canu/1.4-foss-2016b
+ml canu/1.5-foss-2016b
 </pre>
@@ Line 43: / Line 43: @@
 cd $SLURM_SUBMIT_DIR<br>
-ml canu/1.4-foss-2016b<br>
+ml canu/1.5-foss-2016b<br>
 canu <u>[options]</u><br>
 </div>
@@ Line 59: / Line 59: @@
 <pre  class="gcommand">
-ml canu/1.4-foss-2016b
+ml canu/1.5-foss-2016b
 canu -h
-usage: canu [-correct | -trim | -assemble | -trim-assemble] \
+usage: canu [-version] \
+            [-correct | -trim | -assemble | -trim-assemble] \
              [-s <assembly-specifications-file>] \
               -p <assembly-prefix> \
               -d <assembly-directory> \
               genomeSize=<number>[g|m|k] \
-             errorRate=0.X \
              [other-options] \
              [-pacbio-raw | -pacbio-corrected | -nanopore-raw | -nanopore-corrected] *fastq
@@ Line 84: / Line 84: @@
    It is used mostly to compute coverage in reads.  Fractional values are allowed: '4.7m'
    is the same as '4700k' and '4700000'
-  The errorRate is not used correctly (we're working on it).  Don't set it
-  If you want to change the defaults, use the various utg*ErrorRate options.
    A full list of options can be printed with '-options'.  All options
@@ Line 100: / Line 97: @@
 Complete documentation at http://canu.readthedocs.org/en/latest/
-ERROR:  Invalid command line option '-h'.  Did you forget quotes around options with spaces?
-ERROR:  Assembly name prefix not supplied with -p.
-ERROR:  Directory not supplied with -d.
-ERROR:  Invalid 'corErrorRate' specified; must be set
-ERROR:  Required parameter 'genomeSize' is not set
