DBG2OLC-Teaching

From Research Computing Center Wiki
Latest revision as of 12:19, 15 August 2018

Category

Bioinformatics

Program On

Teaching

Version

20170208

Author / Distributor

DBG2OLC

Description

"DBG2OLC: Efficient Assembly of Large Genomes Using Long Erroneous Reads of the Third Generation Sequencing Technologies." More details are at DBG2OLC.

Running Program

The latest version of this application is installed at /usr/local/apps/eb/DBG2OLC/20170208-foss-2016b

To use this version, please load the module with

ml DBG2OLC/20170208-foss-2016b 

Here is an example of a shell script, sub.sh, to run on the batch queue:

#!/bin/bash
#SBATCH --job-name=j_DBG2OLC
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu
#SBATCH --ntasks=1
#SBATCH --mem=10gb
#SBATCH --time=08:00:00
#SBATCH --output=DBG2OLC.%j.out
#SBATCH --error=DBG2OLC.%j.err

cd $SLURM_SUBMIT_DIR
ml DBG2OLC/20170208-foss-2016b
DBG2OLC [options]

In a real submission script, at least the underlined values above (for example, the --time limit and the DBG2OLC [options]) need to be reviewed or replaced with values appropriate for your job.
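
As an illustration of filling in the [options] placeholder, here is a minimal sketch of the command section of sub.sh. It assumes a third-generation (long read) data set and copies its parameter values from the third-gen example in the DBG2OLC help text shown in the Documentation section. The file names contig.fa and reads_file1.fa are placeholders, and the suggested values are only a starting point, not tuned settings.

```shell
#!/bin/bash
# Placeholder input names (assumptions); replace with your own files.
CONTIGS="contig.fa"
READS="reads_file1.fa"

# Arguments copied from the third-gen example in the DBG2OLC help text.
DBG2OLC_ARGS=(LD1 0 Contigs "$CONTIGS" k 17 KmerCovTh 2 MinOverlap 20 AdaptiveTh 0.005 f "$READS")

# Print the full command so it is recorded in the job's .out file.
echo "DBG2OLC ${DBG2OLC_ARGS[*]}"

# Run only if the module has been loaded and the binary is on PATH.
if command -v DBG2OLC >/dev/null 2>&1; then
    DBG2OLC "${DBG2OLC_ARGS[@]}"
else
    echo "DBG2OLC not found; load the module first (ml DBG2OLC/20170208-foss-2016b)" >&2
fi
```

Keeping the arguments in a bash array means that file names containing spaces are passed through intact, and the echoed line gives a record of exactly which parameters a given job used.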

Please refer to Running_Jobs_on_the_teaching_cluster, Run X window Jobs and Run interactive Jobs for more details on running jobs on the teaching cluster.


Here is an example of a job submission command:

sbatch ./sub.sh 

Documentation

ml DBG2OLC/20170208-foss-2016b 
DBG2OLC -h
 Example command: 
For third-gen sequencing: DBG2OLC LD1 0 Contigs contig.fa k 17 KmerCovTh 2 MinOverlap 20 AdaptiveTh 0.005 f reads_file1.fq/fa f reads_file2.fq/fa
For sec-gen sequencing: DBG2OLC LD1 0 Contigs contig.fa k 31 KmerCovTh 0 MinOverlap 50 PathCovTh 1 f reads_file1.fq/fa f reads_file2.fq/fa
Parameters:
MinLen: min read length for a read to be used.
Contigs:  contig file to be used.
k: k-mer size.
LD: load compressed reads information. You can set to 1 if you have run the algorithm for one round and just want to fine tune the following parameters.
PARAMETERS THAT ARE CRITICAL FOR THE PERFORMANCE:
If you have high coverage, set large values to these parameters.
KmerCovTh: k-mer matching threshold for each solid contig. (suggest 2-10)
MinOverlap: min matching k-mers for each two reads. (suggest 10-150)
AdaptiveTh: [Specific for third-gen sequencing] adaptive k-mer threshold for each solid contig. (suggest 0.001-0.02)
PathCovTh: [Specific for Illumina sequencing] occurence threshold for a compressed read. (suggest 1-3)
Author: Chengxi Ye cxy@umd.edu.
last update: Jun 11, 2015.
Loading contigs.
0 k-mers in round 1.
0 k-mers in round 2.
Scoring method: 3
Match method: 1
Loading long read index
0 selected reads.
0 reads loaded.
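
The help text above says that LD can be set to 1 after a first round in order to fine-tune the performance-critical parameters. The following sketch shows one way such a tuning series could be laid out. It only prints the command lines and does not run DBG2OLC. The helper function dbg2olc_cmd is hypothetical, the file names are placeholders, and the flag spelling LD1 follows the example command in the help text (the parameter list there calls it LD).

```shell
#!/bin/bash
# Hypothetical helper: print (not run) a DBG2OLC command line for one round.
#   $1 = LD1 value (0 = first round, 1 = reload compressed reads information)
#   $2 = MinOverlap value to try
dbg2olc_cmd() {
    local ld="$1" min_overlap="$2"
    # contig.fa and reads_file1.fa are placeholder file names.
    echo "DBG2OLC LD1 $ld Contigs contig.fa k 17 KmerCovTh 2 MinOverlap $min_overlap AdaptiveTh 0.005 f reads_file1.fa"
}

# First round, using the values from the third-gen example.
dbg2olc_cmd 0 20

# Later rounds reload the compressed reads and vary MinOverlap
# within the suggested 10-150 range.
for mo in 30 50 100; do
    dbg2olc_cmd 1 "$mo"
done
```

The printed lines can be reviewed and then pasted into sub.sh one at a time.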


Installation

The source code was obtained from DBG2OLC.

System

64-bit Linux