GATK-Teaching

From Research Computing Center Wiki
Revision as of 12:42, 10 August 2018 by Yhuang (talk | contribs)

Category

Bioinformatics

Program On

Teaching

Version

3.4-0

Author / Distributor

GATK

Description

"The Genome Analysis Toolkit or GATK is a software package developed at the Broad Institute to analyse next-generation resequencing data. The toolkit offers a wide variety of tools, with a primary focus on variant discovery and genotyping as well as strong emphasis on data quality assurance. Its robust architecture, powerful processing engine and high-performance computing features make it capable of taking on projects of any size." More details are at GATK

Running Program

The latest version of this application is at /usr/local/apps/eb/GATK/3.4-0-Java-1.8.0_144

To use this version, please load the module with

ml GATK/3.4-0-Java-1.8.0_144 
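After loading the module, it can help to confirm that the gatk command is actually on the PATH. The following is a minimal sketch, assuming Lmod provides the ml command as it does on the cluster; check_tool is a hypothetical helper written for this illustration and is not part of the module.

```shell
#!/bin/bash
# check_tool is a hypothetical helper: it reports whether a command
# can be found by the current shell.
check_tool() {
    local tool="$1"
    if command -v "$tool" >/dev/null 2>&1; then
        echo "$tool is available"
    else
        echo "$tool not found; was the module loaded?" >&2
        return 1
    fi
}

# Load the module only where ml exists (for example, on the cluster).
if command -v ml >/dev/null 2>&1; then
    ml GATK/3.4-0-Java-1.8.0_144
fi

# Report the result without aborting the shell if gatk is missing.
check_tool gatk || true
```

If the check reports that gatk was not found, the module most likely did not load in the current shell.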

Here is an example of a shell script, sub.sh, to run on the batch queue:

#!/bin/bash
#SBATCH --job-name=j_GATK
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu
#SBATCH --ntasks=1
#SBATCH --mem=10gb
#SBATCH --time=08:00:00
#SBATCH --output=GATK.%j.out

cd $SLURM_SUBMIT_DIR
ml GATK/3.4-0-Java-1.8.0_144
gatk [options]

In a real submission script, review at least the values shown above (for example the job name, e-mail address, memory, and time limit) and replace them with values appropriate for your job.
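One way to catch sample values that were left unchanged is to search the script for them before submitting. This is only a sketch: check_placeholders is a hypothetical helper, and it looks only for the two placeholders that appear literally in the example above (the sample e-mail address and the "gatk [options]" line).

```shell
#!/bin/bash
# check_placeholders is a hypothetical helper, not part of Slurm or GATK.
# It returns non-zero if the script still contains sample placeholders.
check_placeholders() {
    local script="$1"
    local status=0
    # The sample e-mail address must be replaced with a real one.
    if grep -qF 'username@uga.edu' "$script"; then
        echo "replace the --mail-user address" >&2
        status=1
    fi
    # "gatk [options]" stands in for an actual GATK command line.
    if grep -qF 'gatk [options]' "$script"; then
        echo "replace 'gatk [options]' with a real command" >&2
        status=1
    fi
    return "$status"
}

# Run the check only when sub.sh exists in the current directory.
if [ -f ./sub.sh ] && check_placeholders ./sub.sh; then
    echo "sub.sh has no sample placeholders left"
fi
```

Values such as memory and time limit still need a manual review, since a script cannot tell whether they suit a particular job.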

Please refer to Running_Jobs_on_the_teaching_cluster, Run X window Jobs and Run interactive Jobs for more details on running jobs on the teaching cluster.


Here is an example of a job submission command:

sbatch ./sub.sh 
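To follow a job after submission, the job ID can be captured when sbatch is called. The sketch below assumes the Slurm commands sbatch and squeue are on the PATH, as on a login node; submit_job is a hypothetical wrapper introduced here for illustration.

```shell
#!/bin/bash
# submit_job is a hypothetical wrapper: it submits a script and prints
# only the numeric job ID.
submit_job() {
    local script="$1"
    local out
    # --parsable makes sbatch print "jobid" or "jobid;cluster".
    out=$(sbatch --parsable "$script") || return 1
    # Drop the optional ";cluster" suffix.
    echo "${out%%;*}"
}

if command -v sbatch >/dev/null 2>&1; then
    # Submit, then list the job in the queue.
    jobid=$(submit_job ./sub.sh) && squeue -j "$jobid"
else
    echo "sbatch not found; run this on a cluster login node" >&2
fi
```

With the script above, the job's output would be written to GATK.<jobid>.out in the submission directory, following the --output=GATK.%j.out setting.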

Documentation

ml GATK/3.4-0-Java-1.8.0_144 
gatk -h


Installation

Source code is obtained from GATK

System

64-bit Linux