Velvet-Sapelo2

[[Category:Sapelo2]][[Category:Software]][[Category:Bioinformatics]]   
 
=== Category ===
Bioinformatics


=== Program On ===
Sapelo2


=== Version ===
1.2.10
   
   
=== Author / Distributor ===
Velvet: algorithms for de novo short read assembly using de Bruijn graphs. D.R. Zerbino and E. Birney. Genome Research 18:821-829
   
   
=== Description ===
Sequence assembler for very short reads. More information: http://www.ebi.ac.uk/~zerbino/velvet/
   
   
velvetg - de Bruijn graph construction, error removal and repeat resolution

velveth - simple hashing program


=== Running Program ===
Also refer to [[Running Jobs on Sapelo2]]
   
   
Note: Velvet is compiled multi-threaded (compiled with 'LONGSEQUENCES=1' 'MAXKMERLENGTH=191' 'CATEGORIES=2' 'OPENMP=1').


Some long reads can cause segmentation faults when Velvet is built with a high number of categories (e.g. CATEGORIES=99); we suggest using a build whose categories and k-mer length fit your data, to reduce memory usage.
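
If you need to check which Velvet builds (and therefore which CATEGORIES/MAXKMERLENGTH settings) are installed, one quick way, assuming the Lmod module system used on Sapelo2, is to search the module tree:
<pre class="gcommand">
module spider Velvet
</pre>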


*Version 1.2.10, installed in /apps/eb/Velvet/1.2.10-GCC-11.2.0-mt-kmer_191/


To use this version of Velvet, please first load the module with
<pre class="gscript">
module load Velvet/1.2.10-GCC-11.2.0-mt-kmer_191
</pre>


Example of a shell script sub.sh to run Velvet on a batch partition:


<pre class="gscript">
#!/bin/bash
#SBATCH --job-name=velvethk81
#SBATCH --partition=highmem_p
#SBATCH --nodes=1
#SBATCH --ntasks=24
#SBATCH --mem=800gb
#SBATCH --time=168:00:00
#SBATCH --output=velvethk81.out
#SBATCH --error=velvethk81.err
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu

cd $SLURM_SUBMIT_DIR

ml Velvet/1.2.10-GCC-11.2.0-mt-kmer_191

export OMP_THREAD_LIMIT=24
export OMP_NUM_THREADS=24

## Run velveth to hash the reads; -create_binary and -reuse_Sequences speed up
## reruns when testing different k-mer lengths.
velveth velvet-kmers_81 81 -create_binary -reuse_Sequences \
-fastq.gz \
-shortPaired ../shortreads/Spu_genomic/Spu_genomic_trim/*.fq.gz

## Run velvetg to assemble
velvetg velvet-kmers_81 -exp_cov auto -cov_cutoff auto
</pre>


In the above example, 24 in OMP_THREAD_LIMIT and OMP_NUM_THREADS is the number of threads to use. The --ntasks value has to match the "24" used for OMP_THREAD_LIMIT and OMP_NUM_THREADS.
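
As an optional variation, the thread settings can be derived from the Slurm allocation instead of being hard-coded; this is only a sketch and assumes the --ntasks line from the script above:
<pre class="gscript">
# Keep the OpenMP thread count in sync with the Slurm allocation;
# SLURM_NTASKS is set by sbatch from the --ntasks option above.
export OMP_THREAD_LIMIT=$SLURM_NTASKS
export OMP_NUM_THREADS=$SLURM_NTASKS
</pre>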


Example of submission to the queue:
<pre class="gcommand">
sbatch sub.sh
</pre>
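
After submitting, you can check the job's state in the queue with squeue (MyID below is a placeholder for your cluster username):
<pre class="gcommand">
squeue -u MyID
</pre>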


Velvet needs a large amount of memory to run.

For transcriptomic assembly, Velvet is extended by Oases.
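
Related to the memory note above, one way to see how much memory a finished job actually used (and so whether the --mem request can be reduced) is Slurm's accounting command; 1234567 below is a placeholder job ID:
<pre class="gcommand">
sacct -j 1234567 --format=JobID,JobName,Elapsed,MaxRSS,State
</pre>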

=== Documentation ===
<pre class="gcommand">
module load Velvet/1.2.10-GCC-11.2.0-mt-kmer_191
velveth - simple hashing program
Version 1.2.10

Copyright 2007, 2008 Daniel Zerbino (zerbino@ebi.ac.uk)
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Compilation settings:
CATEGORIES = 2
MAXKMERLENGTH = 191
OPENMP
LONGSEQUENCES


Usage:
./velveth directory hash_length {[-file_format][-read_type][-separate|-interleaved] filename1 [filename2 ...]} {...} [options]

        directory       : directory name for output files
        hash_length     : EITHER an odd integer (if even, it will be decremented) <= 191 (if above, will be reduced)
                        : OR: m,M,s where m and M are odd integers (if not, they will be decremented) with m < M <= 191 (if above, will be reduced)
                                and s is a step (even number). Velvet will then hash from k=m to k=M with a step of s
        filename        : path to sequence file or - for standard input

File format options:
        -fasta  -fastq  -raw    -fasta.gz       -fastq.gz       -raw.gz -sam    -bam    -fmtAuto
        (Note: -fmtAuto will detect fasta or fastq, and will try the following programs for decompression : gunzip, pbunzip2, bunzip2

File layout options for paired reads (only for fasta and fastq formats):
        -interleaved    : File contains paired reads interleaved in the one file (default)
        -separate       : Read 2 separate files for paired reads

Read type options:
        -short  -shortPaired
        -short2 -shortPaired2
        -long   -longPaired
        -reference

Options:
        -strand_specific        : for strand specific transcriptome sequencing data (default: off)
        -reuse_Sequences        : reuse Sequences file (or link) already in directory (no need to provide original filenames in this case (default: off)
        -reuse_binary   : reuse binary sequences file (or link) already in directory (no need to provide original filenames in this case (default: off)
        -noHash                 : simply prepare Sequences file, do not hash reads or prepare Roadmaps file (default: off)
        -create_binary          : create binary CnyUnifiedSeq file (default: off)

Synopsis:

- Short single end reads:
        velveth Assem 29 -short -fastq s_1_sequence.txt

- Paired-end short reads (remember to interleave paired reads):
        velveth Assem 31 -shortPaired -fasta interleaved.fna

- Paired-end short reads (using separate files for the paired reads)
        velveth Assem 31 -shortPaired -fasta -separate left.fa right.fa

- Two channels and some long reads:
        velveth Assem 43 -short -fastq unmapped.fna -longPaired -fasta SangerReads.fasta

- Three channels:
        velveth Assem 35 -shortPaired -fasta pe_lib1.fasta -shortPaired2 pe_lib2.fasta -short3 se_lib1.fa

Output:
        directory/Roadmaps
        directory/Sequences
                [Both files are picked up by graph, so please leave them there]
</pre>


[[#top|Back to Top]]


<pre class="gcommand">
module load Velvet/1.2.10-GCC-11.2.0-mt-kmer_191
velvetg --help

Usage:
./velvetg directory [options]

        directory                       : working directory name

Standard options:
        -cov_cutoff <floating-point|auto>       : removal of low coverage nodes AFTER tour bus or allow the system to infer it
                (default: no removal)
        -ins_length <integer>           : expected distance between two paired end reads (default: no read pairing)
        -read_trkg <yes|no>             : tracking of short read positions in assembly (default: no tracking)
        -min_contig_lgth <integer>      : minimum contig length exported to contigs.fa file (default: hash length * 2)
        -amos_file <yes|no>             : export assembly to AMOS file (default: no export)
        -exp_cov <floating point|auto>  : expected coverage of unique regions or allow the system to infer it
                (default: no long or paired-end read resolution)
        -long_cov_cutoff <floating-point>: removal of nodes with low long-read coverage AFTER tour bus
                (default: no removal)

Advanced options:
        -ins_length* <integer>          : expected distance between two paired-end reads in the respective short-read dataset (default: no read pairing)
        -ins_length_long <integer>      : expected distance between two long paired-end reads (default: no read pairing)
        -ins_length*_sd <integer>       : est. standard deviation of respective dataset (default: 10% of corresponding length)
                [replace '*' by nothing, '2' or '_long' as necessary]
        -scaffolding <yes|no>           : scaffolding of contigs used paired end information (default: on)
        -max_branch_length <integer>    : maximum length in base pair of bubble (default: 100)
        -max_divergence <floating-point>: maximum divergence rate between two branches in a bubble (default: 0.2)
        -max_gap_count <integer>        : maximum number of gaps allowed in the alignment of the two branches of a bubble (default: 3)
        -min_pair_count <integer>       : minimum number of paired end connections to justify the scaffolding of two long contigs (default: 5)
        -max_coverage <floating point>  : removal of high coverage nodes AFTER tour bus (default: no removal)
        -coverage_mask <int>    : minimum coverage required for confident regions of contigs (default: 1)
        -long_mult_cutoff <int>         : minimum number of long reads required to merge contigs (default: 2)
        -unused_reads <yes|no>          : export unused reads in UnusedReads.fa file (default: no)
        -alignments <yes|no>            : export a summary of contig alignment to the reference sequences (default: no)
        -exportFiltered <yes|no>        : export the long nodes which were eliminated by the coverage filters (default: no)
        -clean <yes|no>                 : remove all the intermediary files which are useless for recalculation (default : no)
        -very_clean <yes|no>            : remove all the intermediary files (no recalculation possible) (default: no)
        -paired_exp_fraction <double>   : remove all the paired end connections which less than the specified fraction of the expected count (default: 0.1)
        -shortMatePaired* <yes|no>      : for mate-pair libraries, indicate that the library might be contaminated with paired-end reads (default no)
        -conserveLong <yes|no>          : preserve sequences with long reads in them (default no)

Output:
        directory/contigs.fa            : fasta file of contigs longer than twice hash length
        directory/stats.txt             : stats file (tab-spaced) useful for determining appropriate coverage cutoff
        directory/LastGraph             : special formatted file with all the information on the final graph
        directory/velvet_asm.afg        : (if requested) AMOS compatible assembly file
</pre>


[[#top|Back to Top]]

=== Installation ===
Source code downloaded from http://www.ebi.ac.uk/~zerbino/velvet/


Velvet is compiled multi-threaded (compiled with 'LONGSEQUENCES=1' 'MAXKMERLENGTH=191' 'CATEGORIES=2' 'OPENMP=1').
   
   
=== System ===
64-bit Linux
