KmerGenie-Teaching
Category
Bioinformatics
Program On
Teaching
Version
1.7044
Author / Distributor
Description
"KmerGenie estimates the best k-mer length for genome de novo assembly." More details are available on the KmerGenie website.
Running Program
The latest version of this application is installed at /usr/local/apps/eb/KmerGenie/1.7044-foss-2016b
To use this version, please load the module with
ml KmerGenie/1.7044-foss-2016b
Here is an example of a shell script, sub.sh, to run on the batch queue:
#!/bin/bash
#SBATCH --job-name=j_KmerGenie
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu
#SBATCH --ntasks=1
#SBATCH --mem=10gb
#SBATCH --time=08:00:00
#SBATCH --output=KmerGenie.%j.out
#SBATCH --error=KmerGenie.%j.err
cd $SLURM_SUBMIT_DIR
ml KmerGenie/1.7044-foss-2016b
kmergenie <read_file> [options]
In a real submission script, at least the site-specific values above (such as the job name, e-mail address, memory, time limit, and the kmergenie arguments) need to be reviewed and replaced with proper values.
Please refer to Running_Jobs_on_the_teaching_cluster, Run X window Jobs and Run interactive Jobs for more details on running jobs on the Teaching cluster.
Here is an example of a job submission command:
sbatch ./sub.sh
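The script above requests a single task and no explicit CPU count, while kmergenie defaults to the number of cores minus one for -t. A minimal sketch, assuming you add a line such as #SBATCH --cpus-per-task=4 to sub.sh (that line is not part of the script above, and the value 4 is only an illustration), is to pass the allocated CPU count to -t so that the job does not start more threads than it was granted. The helper name kmergenie_threads and the file name reads.fastq are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the number of threads kmergenie should use.
# Slurm sets SLURM_CPUS_PER_TASK when --cpus-per-task is requested;
# fall back to 1 thread when it is unset (for example, outside a job).
kmergenie_threads() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

threads="$(kmergenie_threads)"

# reads.fastq is a placeholder for your own read file.
# This line only prints the command that would be run.
echo "kmergenie reads.fastq -t ${threads}"
```

In sub.sh, the echo line would be replaced by the actual kmergenie call with the same -t value.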
Documentation
ml KmerGenie/1.7044-foss-2016b
kmergenie --help
KmerGenie 1.7044
Usage:
kmergenie <read_file> [options]
Options:
    --diploid      use the diploid model (default: haploid model)
    --one-pass     skip the second pass to estimate k at 2 bp resolution (default: two passes)
    -k <value>     largest k-mer size to consider (default: 121)
    -l <value>     smallest k-mer size to consider (default: 15)
    -s <value>     interval between consecutive kmer sizes (default: 10)
    -e <value>     k-mer sampling value (default: auto-detected to use ~200 MB memory/thread)
    -t <value>     number of threads (default: number of cores minus one)
    -o <prefix>    prefix of the output files (default: histograms)
    --debug        developer output of R scripts
    --orig-hist    legacy histogram estimation method (slower, less accurate)
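As an illustration of how the options above combine, the following sketch wraps a kmergenie call for a diploid sample with a narrower k range. All flags are taken from the help text above; the wrapper name run_kmergenie, the file name reads.fastq, the prefix sample1, and the numeric values are hypothetical choices, not recommendations. The wrapper prints the command and runs it only when kmergenie is on the PATH and the read file exists.

```shell
#!/bin/bash
# Hypothetical wrapper: print the kmergenie command, then run it only
# if the program is available and the read file (first argument) exists.
run_kmergenie() {
    echo "kmergenie $*"
    if command -v kmergenie >/dev/null 2>&1 && [ -f "$1" ]; then
        kmergenie "$@"
    fi
}

reads="reads.fastq"   # placeholder for your own read file
prefix="sample1"      # placeholder prefix for the output files
threads=4             # should not exceed the CPUs allocated to the job

# Diploid model, k from 21 to 101 in steps of 10, one pass only.
run_kmergenie "$reads" --diploid --one-pass -l 21 -k 101 -s 10 -t "$threads" -o "$prefix"
```

Dropping --one-pass restores the default second pass, which the help text describes as estimating k at 2 bp resolution.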
Installation
The source code was obtained from the KmerGenie website.
System
64-bit Linux