Racon-Sapelo2

From Research Computing Center Wiki
Revision as of 13:11, 6 September 2023 by Chelsea (talk | contribs) (→‎Version 1.5.0)

Category

Bioinformatics

Program On

Sapelo2

Version

1.5.0

Author / Distributor

Please see https://github.com/isovic/racon

Description

From https://github.com/isovic/racon: Racon is a "Consensus module for raw de novo DNA assembly of long uncorrected reads."

Running Program

Also refer to Running Jobs on Sapelo2

For more information on Environment Modules on Sapelo2 please see the Lmod page.

  • Version 1.5.0, compiled with GCCcore/11.2.0 toolchain, installed in /apps/eb/Racon/1.5.0-GCCcore-11.2.0

To use this version of racon, please first load the module with

ml Racon/1.5.0-GCCcore-11.2.0

Note: This version does not work on nodes with AMD Opteron processors. To keep your job from being dispatched to such a node, add the following line to your job submission script:

#SBATCH --constraint=EDR

Sample job submission script (sub.sh) to run racon 1.5.0 in a batch job:

#!/bin/bash
#SBATCH --job-name=raconjob
#SBATCH --partition=batch
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=12
#SBATCH --constraint=EDR
#SBATCH --mem=20gb
#SBATCH --time=24:00:00
#SBATCH --output=%x.%j.out
#SBATCH --error=%x.%j.err

# Run from the directory the job was submitted from
cd "$SLURM_SUBMIT_DIR"

ml Racon/1.5.0-GCCcore-11.2.0

# The -t value must match --cpus-per-task above
racon -t 12 [options]

where [options] must be replaced by the racon options and input files you want to use. Adjust the other job parameters, such as the number of cores, the maximum wall-clock time, the maximum memory, and the job name, to suit your job. If you use the racon option -t to run multiple threads, request the same number of cores with --cpus-per-task. In the example above, --cpus-per-task=12 and racon is invoked with -t 12.
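One way to keep -t and --cpus-per-task from drifting apart is to read the thread count from the SLURM_CPUS_PER_TASK environment variable, which Slurm sets when --cpus-per-task is given. The following is a minimal sketch, not a tested Sapelo2 recipe: the helper name racon_threads, the input file names, and the output file name are hypothetical placeholders.

```shell
# Print the thread count racon should use inside a Slurm job.
# Falls back to 1 (racon's default) when SLURM_CPUS_PER_TASK is unset.
racon_threads() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

# Hypothetical input file names; replace them with your own.
READS="reads.fastq.gz"
OVERLAPS="overlaps.paf"
TARGET="draft_assembly.fasta"

# Show the command that would be run, so it can be checked in the job log.
RACON_CMD="racon -t $(racon_threads) $READS $OVERLAPS $TARGET"
echo "$RACON_CMD"

# In the job script, run the command itself and redirect stdout, e.g.:
# racon -t "$(racon_threads)" "$READS" "$OVERLAPS" "$TARGET" > polished.fasta
```

With this pattern, changing --cpus-per-task in the #SBATCH header is enough; the racon line does not need to be edited.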

Submit the job to the queue with

sbatch sub.sh

Documentation

Please see links from https://github.com/isovic/racon.

[cft07037@d2-13 all]$ ml Racon/1.5.0-GCCcore-11.2.0 
[cft07037@d2-13 all]$ racon -h
usage: racon [options ...] <sequences> <overlaps> <target sequences>

    #default output is stdout
    <sequences>
        input file in FASTA/FASTQ format (can be compressed with gzip)
        containing sequences used for correction
    <overlaps>
        input file in MHAP/PAF/SAM format (can be compressed with gzip)
        containing overlaps between sequences and target sequences
    <target sequences>
        input file in FASTA/FASTQ format (can be compressed with gzip)
        containing sequences which will be corrected

    options:
        -u, --include-unpolished
            output unpolished target sequences
        -f, --fragment-correction
            perform fragment correction instead of contig polishing
            (overlaps file should contain dual/self overlaps!)
        -w, --window-length <int>
            default: 500
            size of window on which POA is performed
        -q, --quality-threshold <float>
            default: 10.0
            threshold for average base quality of windows used in POA
        -e, --error-threshold <float>
            default: 0.3
            maximum allowed error rate used for filtering overlaps
        --no-trimming
            disables consensus trimming at window ends
        -m, --match <int>
            default: 3
            score for matching bases
        -x, --mismatch <int>
            default: -5
            score for mismatching bases
        -g, --gap <int>
            default: -4
            gap penalty (must be negative)
        -t, --threads <int>
            default: 1
            number of threads
        --version
            prints the version number
        -h, --help
            prints the usage
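
The usage text above names three positional inputs in a fixed order: sequences (FASTA/FASTQ), overlaps (MHAP/PAF/SAM), and target sequences (FASTA/FASTQ), each optionally gzip-compressed. A small pre-flight check in the job script can catch swapped arguments before the job spends time in the queue. This is only a sketch: the function name has_expected_extension and the extension lists are assumptions made for illustration, and they are not taken from racon's own format detection.

```shell
# Return 0 if the file name has an extension matching the expected kind.
# kind: "seq" (FASTA/FASTQ) or "ovl" (MHAP/PAF/SAM); a trailing .gz is allowed.
# The extension lists below are illustrative assumptions.
has_expected_extension() {
    kind="$1"
    name="${2%.gz}"    # strip an optional .gz suffix
    case "$kind" in
        seq) case "$name" in *.fasta|*.fa|*.fna|*.fastq|*.fq) return 0 ;; esac ;;
        ovl) case "$name" in *.mhap|*.paf|*.sam) return 0 ;; esac ;;
    esac
    return 1
}

# Example: check arguments in the order racon expects them.
has_expected_extension seq "reads.fastq.gz"       || echo "unexpected sequences file"
has_expected_extension ovl "overlaps.paf"         || echo "unexpected overlaps file"
has_expected_extension seq "draft_assembly.fasta" || echo "unexpected target file"
```

The check looks only at file names, not at file contents, so it guards against argument-order mistakes rather than malformed input.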

Installation

Downloaded from https://github.com/lbcb-sci/racon/

System

64-bit Linux