AUGUSTUS-Teaching
Category
Bioinformatics
Program On
Teaching
Version
3.2.3
Author / Distributor
Description
"AUGUSTUS is a program that predicts genes in eukaryotic genomic sequences" More details are at AUGUSTUS
Running Program
The last version of this application is at /usr/local/apps/eb/AUGUSTUS/3.2.3-foss-2016b-Python-2.7.14
To use this version, please loads the module with
ml AUGUSTUS/3.2.3-foss-2016b-Python-2.7.14
Here is an example of a shell script, sub.sh, to run on at the batch queue:
#!/bin/bash
#SBATCH --job-name=j_AUGUSTUS
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu
#SBATCH --ntasks=1
#SBATCH --mem=10gb
#SBATCH --time=08:00:00
#SBATCH --output=AUGUSTUS.%j.out
cd $SLURM_SUBMIT_DIR
ml
augustus [options]
In the real submission script, at least all the above underlined values need to be reviewed or to be replaced by the proper values.
Please refer to Running_Jobs_on_the_teaching_cluster, Run X window Jobs and Run interactive Jobs for more details of running jobs at Teaching cluster.
Here is an example of job submission command:
sbatch ./sub.sh
Documentation
ml <!-- TEST_MODULE BEGIN --><!-- TEST_MODULE END --> augustus AUGUSTUS (3.2.3) is a gene prediction tool written by M. Stanke, O. Keller, S. König, L. Gerischer and L. Romoth. usage: augustus [parameters] --species=SPECIES queryfilename 'queryfilename' is the filename (including relative path) to the file containing the query sequence(s) in fasta format. SPECIES is an identifier for the species. Use --species=help to see a list. parameters: --strand=both, --strand=forward or --strand=backward --genemodel=partial, --genemodel=intronless, --genemodel=complete, --genemodel=atleastone or --genemodel=exactlyone partial : allow prediction of incomplete genes at the sequence boundaries (default) intronless : only predict single-exon genes like in prokaryotes and some eukaryotes complete : only predict complete genes atleastone : predict at least one complete gene exactlyone : predict exactly one complete gene --singlestrand=true predict genes independently on each strand, allow overlapping genes on opposite strands This option is turned off by default. --hintsfile=hintsfilename When this option is used the prediction considering hints (extrinsic information) is turned on. hintsfilename contains the hints in gff format. --AUGUSTUS_CONFIG_PATH=path path to config directory (if not specified as environment variable) --alternatives-from-evidence=true/false report alternative transcripts when they are suggested by hints --alternatives-from-sampling=true/false report alternative transcripts generated through probabilistic sampling --sample=n --minexonintronprob=p --minmeanexonintronprob=p --maxtracks=n For a description of these parameters see section 4 of README.TXT. --proteinprofile=filename When this option is used the prediction will consider the protein profile provided as parameter. The protein profile extension is described in section 7 of README.TXT. --progress=true show a progressmeter --gff3=on/off output in gff3 format --predictionStart=A, --predictionEnd=B A and B define the range of the sequence for which predictions should be found. --UTR=on/off predict the untranslated regions in addition to the coding sequence. This currently works only for a subset of species. --noInFrameStop=true/false Do not report transcripts with in-frame stop codons. Otherwise, intron-spanning stop codons could occur. Default: false --noprediction=true/false If true and input is in genbank format, no prediction is made. Useful for getting the annotated protein sequences. --uniqueGeneId=true/false If true, output gene identifyers like this: seqname.gN For a complete list of parameters, type "augustus --paramlist". An exhaustive description can be found in the file README.TXT.
Installation
Source code is obtained from AUGUSTUS
System
64-bit Linux