GLIMMER-Teaching

From Research Computing Center Wiki
Revision as of 15:12, 24 October 2018 by Yhuang

Category

Bioinformatics

Program On

Teaching

Version

3.02b

Author / Distributor

GLIMMER

Description

"Glimmer is a system for finding genes in microbial DNA, especially the genomes of bacteria, archaea, and viruses." More details are at GLIMMER

Running Program

The latest version of this application is installed at /usr/local/apps/eb/GLIMMER/3.02b

To use this version, please load the module with

ml GLIMMER/3.02b-foss-2016b 

Here is an example of a shell script, sub.sh, to run on the batch queue:

#!/bin/bash
#SBATCH --job-name=j_GLIMMER
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu
#SBATCH --ntasks=1
#SBATCH --mem=10gb
#SBATCH --time=08:00:00
#SBATCH --output=GLIMMER.%j.out
#SBATCH --error=GLIMMER.%j.err

cd $SLURM_SUBMIT_DIR
ml GLIMMER/3.02b-foss-2016b
glimmer3 [options]

In a real submission script, review at least the values shown above, such as the job name, email address, memory, time limit, and glimmer3 options, and replace them with values appropriate for your job.
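As a minimal sketch of what the `glimmer3 [options]` line might look like once filled in, the function below assembles a command from the usage text `glimmer3 [options] <sequence-file> <icm-file> <tag>`. The function name `run_glimmer`, the file names, and the option values (`-o 50 -g 110 -t 30`) are illustrative assumptions, not site recommendations, and the ICM file is assumed to have been built already. When glimmer3 or the input files are not available, the sketch only prints the command it would run.

```shell
# Hypothetical helper: run glimmer3 with a few options from `glimmer3 -h`.
# $1 = sequence file, $2 = ICM file, $3 = output tag (<tag>.detail, <tag>.predict)
run_glimmer() {
    seq_file=$1
    icm_file=$2
    tag=$3
    # -o: maximum overlap, -g: minimum gene length, -t: score threshold.
    # The numeric values here are assumptions chosen only for illustration.
    set -- -o 50 -g 110 -t 30 "$seq_file" "$icm_file" "$tag"
    if command -v glimmer3 >/dev/null 2>&1 && [ -f "$seq_file" ] && [ -f "$icm_file" ]; then
        glimmer3 "$@"
    else
        # Dry run: show the command instead of executing it.
        echo "glimmer3 $*"
    fi
}

# Assumed file names; replace with your own data.
run_glimmer genome.fasta genome.icm run1
```

In sub.sh, the call to `run_glimmer` (or the equivalent plain `glimmer3` line) would replace `glimmer3 [options]` after the module is loaded.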

Please refer to Running_Jobs_on_the_teaching_cluster, Run X window Jobs, and Run interactive Jobs for more details on running jobs on the teaching cluster.


Here is an example of a job submission command:

sbatch ./sub.sh 
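Because sub.sh writes to GLIMMER.%j.out and GLIMMER.%j.err, where %j is the Slurm job ID, it can help to capture the job ID at submission time. The following is a sketch only: the helper name `job_id_from_sbatch` is hypothetical, and it assumes sbatch prints its usual "Submitted batch job <id>" message.

```shell
# Hypothetical helper: extract the job ID from sbatch's confirmation line,
# which is assumed to look like "Submitted batch job 12345".
job_id_from_sbatch() {
    awk '/^Submitted batch job/ { print $4 }'
}

# Only submit when sbatch and the script are actually present.
if command -v sbatch >/dev/null 2>&1 && [ -f ./sub.sh ]; then
    job_id=$(sbatch ./sub.sh | job_id_from_sbatch)
    echo "Standard output: GLIMMER.${job_id}.out"
    echo "Standard error:  GLIMMER.${job_id}.err"
    # Show the job's current state in the queue.
    squeue -j "$job_id"
fi
```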

Documentation

ml GLIMMER/3.02b-foss-2016b 
glimmer3 -h
Starting at Wed Oct 24 16:12:42 2018

USAGE:  glimmer3 [options] <sequence-file> <icm-file> <tag>

Read DNA sequences in <sequence-file> and predict genes
in them using the Interpolated Context Model in <icm-file>.
Output details go to file <tag>.detail and predictions go to
file <tag>.predict

Options:
 -A <codon-list>
 --start_codons <codon-list>
    Use comma-separated list of codons as start codons
    Sample format:  -A atg,gtg
    Use -P option to specify relative proportions of use.
    If -P not used, then proportions will be equal
 -b <filename>
 --rbs_pwm <filename>
    Read a position weight matrix (PWM) from <filename> to identify
    the ribosome binding site to help choose start sites
 -C <p>
 --gc_percent <p>
    Use <p> as GC percentage of independent model
    Note:  <p> should be a percentage, e.g., -C 45.2
 -E <filename>
 --entropy <filename>
    Read entropy profiles from <filename>.  Format is one header
    line, then 20 lines of 3 columns each.  Columns are amino acid,
    positive entropy, negative entropy.  Rows must be in order
    by amino acid code letter
 -f
 --first_codon
    Use first codon in orf as start codon
 -g <n>
 --gene_len <n>
    Set minimum gene length to <n>
 -h
 --help
    Print this message
 -i <filename>
 --ignore <filename>
    <filename> specifies regions of bases that are off 
    limits, so that no bases within that area will be examined
 -l
 --linear
    Assume linear rather than circular genome, i.e., no wraparound
 -L <filename>
 --orf_coords <filename>
    Use <filename> to specify a list of orfs that should
    be scored separately, with no overlap rules
 -M
 --separate_genes
    <sequence-file> is a multifasta file of separate genes to
    be scored separately, with no overlap rules
 -o <n>
 --max_olap <n>
    Set maximum overlap length to <n>.  Overlaps this short or shorter
    are ignored.
 -P <number-list>
 --start_probs <number-list>
    Specify probability of different start codons (same number & order
    as in -A option).  If no -A option, then 3 values for atg, gtg and ttg
    in that order.  Sample format:  -P 0.6,0.35,0.05
    If -A is specified without -P, then starts are equally likely.
 -q <n>
 --ignore_score_len <n>
    Do not use the initial score filter on any gene <n> or more
    base long
 -r
 --no_indep
    Don't use independent probability score column
 -t <n>
 --threshold <n>
    Set threshold score for calling as gene to n.  If the in-frame
    score >= <n>, then the region is given a number and considered
    a potential gene.
 -X
 --extend
    Allow orfs extending off ends of sequence to be scored
 -z <n>
 --trans_table <n>
    Use Genbank translation table number <n> for stop codons
 -Z <codon-list>
 --stop_codons <codon-list>
    Use comma-separated list of codons as stop codons
    Sample format:  -Z tag,tga,taa
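
The usage text above states that predictions go to <tag>.predict. As a quick sanity check after a run, the sketch below counts the predicted genes in such a file. It assumes, and this is not stated in the help text, that the file holds one header line starting with ">" per input sequence followed by one line per predicted gene. The function name `count_predictions` and the file name run1.predict are hypothetical.

```shell
# Hypothetical helper: count predicted genes in a <tag>.predict file.
# Assumption: lines starting with ">" are sequence headers and every other
# non-empty line describes one prediction.
count_predictions() {
    awk 'NF > 0 && $0 !~ /^>/ { n++ } END { print n + 0 }' "$1"
}

# Example call with an assumed tag of "run1".
if [ -f run1.predict ]; then
    echo "Predicted genes: $(count_predictions run1.predict)"
fi
```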



Installation

The source code was obtained from the GLIMMER website.

System

64-bit Linux