EMBOSS-Teaching
Latest revision as of 13:03, 10 August 2018
Category
Bioinformatics
Program On
Teaching
Version
6.6.0
Author / Distributor
Description
"EMBOSS is" More details are at EMBOSS
Running Program
The latest version of this application is located at /usr/local/apps/eb/EMBOSS/6.6.0-foss-2016b
To use this version, load the module with:
ml EMBOSS/6.6.0-foss-2016b
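To confirm that the module loaded as expected, one option is to check that an EMBOSS program such as getorf is on the PATH. The following is a minimal sketch; the helper name check_tool is hypothetical and is not part of EMBOSS or the module system.

```shell
# check_tool is a hypothetical helper, not part of EMBOSS or the module system.
# It reports whether a program can be found on the current PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at $(command -v "$1")"
    else
        echo "$1 not found; load the module first" >&2
        return 1
    fi
}

# Load the module only where the ml command exists (for example, on the cluster).
if command -v ml >/dev/null 2>&1; then
    ml EMBOSS/6.6.0-foss-2016b
fi

# "|| true" keeps a missing program from aborting a script run with "set -e".
check_tool getorf || true
```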
Here is an example of a shell script, sub.sh, to run on the batch queue:
#!/bin/bash
#SBATCH --job-name=j_EMBOSS
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu
#SBATCH --ntasks=1
#SBATCH --mem=10gb
#SBATCH --time=08:00:00
#SBATCH --output=EMBOSS.%j.out
#SBATCH --error=EMBOSS.%j.err
cd $SLURM_SUBMIT_DIR
ml EMBOSS/6.6.0-foss-2016b
getorf [options]
In a real submission script, at least the underlined values above (for example, the --time limit) must be reviewed and, where necessary, replaced with values appropriate for your job.
Please refer to Running_Jobs_on_the_teaching_cluster, Run X window Jobs and Run interactive Jobs for more details on running jobs on the Teaching cluster.
Here is an example of a job submission command:
sbatch ./sub.sh
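Because sub.sh writes its logs to EMBOSS.%j.out and EMBOSS.%j.err, where Slurm replaces %j with the job ID, it can help to capture that ID at submission time. A minimal sketch, assuming the --parsable option of sbatch is available on the cluster; the helper name emboss_log_names is hypothetical:

```shell
# emboss_log_names is a hypothetical helper: it prints the log file names that
# the --output/--error patterns in sub.sh expand to for a given job ID.
emboss_log_names() {
    echo "EMBOSS.$1.out EMBOSS.$1.err"
}

# Submit only where sbatch exists and sub.sh is present in the current directory.
if command -v sbatch >/dev/null 2>&1 && [ -f ./sub.sh ]; then
    # --parsable prints only the job ID, optionally followed by ";cluster".
    jobid=$(sbatch --parsable ./sub.sh | cut -d';' -f1)
    echo "Submitted job $jobid; logs: $(emboss_log_names "$jobid")"
    # Show the job's current state in the queue.
    squeue -j "$jobid"
fi
```

The log files appear in the directory from which the job was submitted only once the job has started running.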
Documentation
ml EMBOSS/6.6.0-foss-2016b
getorf --help
Find and extract open reading frames (ORFs)
Version: EMBOSS:6.6.0.0

   Standard (Mandatory) qualifiers:
  [-sequence]          seqall     Nucleotide sequence(s) filename and optional
                                  format, or reference (input USA)
  [-outseq]            seqoutall  [<sequence>.<format>] Protein sequence
                                  set(s) filename and optional format (output
                                  USA)

   Additional (Optional) qualifiers:
   -table              menu       [0] Code to use (Values: 0 (Standard); 1
                                  (Standard (with alternative initiation
                                  codons)); 2 (Vertebrate Mitochondrial); 3
                                  (Yeast Mitochondrial); 4 (Mold, Protozoan,
                                  Coelenterate Mitochondrial and
                                  Mycoplasma/Spiroplasma); 5 (Invertebrate
                                  Mitochondrial); 6 (Ciliate Macronuclear and
                                  Dasycladacean); 9 (Echinoderm
                                  Mitochondrial); 10 (Euplotid Nuclear); 11
                                  (Bacterial); 12 (Alternative Yeast Nuclear);
                                  13 (Ascidian Mitochondrial); 14 (Flatworm
                                  Mitochondrial); 15 (Blepharisma
                                  Macronuclear); 16 (Chlorophycean
                                  Mitochondrial); 21 (Trematode
                                  Mitochondrial); 22 (Scenedesmus obliquus);
                                  23 (Thraustochytrium Mitochondrial))
   -minsize            integer    [30] Minimum nucleotide size of ORF to
                                  report (Any integer value)
   -maxsize            integer    [1000000] Maximum nucleotide size of ORF to
                                  report (Any integer value)
   -find               menu       [0] This is a small menu of possible output
                                  options. The first four options are to
                                  select either the protein translation or the
                                  original nucleic acid sequence of the open
                                  reading frame. There are two possible
                                  definitions of an open reading frame: it can
                                  either be a region that is free of STOP
                                  codons or a region that begins with a START
                                  codon and ends with a STOP codon. The last
                                  three options are probably only of interest
                                  to people who wish to investigate the
                                  statistical properties of the regions around
                                  potential START or STOP codons. The last
                                  option assumes that ORF lengths are
                                  calculated between two STOP codons.
                                  (Values: 0 (Translation of regions between
                                  STOP codons); 1 (Translation of regions
                                  between START and STOP codons); 2 (Nucleic
                                  sequences between STOP codons); 3 (Nucleic
                                  sequences between START and STOP codons); 4
                                  (Nucleotides flanking START codons); 5
                                  (Nucleotides flanking initial STOP codons);
                                  6 (Nucleotides flanking ending STOP codons))

   Advanced (Unprompted) qualifiers:
   -[no]methionine     boolean    [Y] START codons at the beginning of protein
                                  products will usually code for Methionine,
                                  despite what the codon will code for when it
                                  is internal to a protein. This qualifier
                                  sets all such START codons to code for
                                  Methionine by default.
   -circular           boolean    [N] Is the sequence circular
   -[no]reverse        boolean    [Y] Set this to be false if you do not wish
                                  to find ORFs in the reverse complement of
                                  the sequence.
   -flanking           integer    [100] If you have chosen one of the options
                                  of the type of sequence to find that gives
                                  the flanking sequence around a STOP or START
                                  codon, this allows you to set the number of
                                  nucleotides either side of that codon to
                                  output. If the region of flanking
                                  nucleotides crosses the start or end of the
                                  sequence, no output is given for this codon.
                                  (Any integer value)

   General qualifiers:
   -help               boolean    Report command line options and exit. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
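As an illustration of how the qualifiers in the help text combine, here is a minimal sketch of a concrete getorf call that could stand in for the `getorf [options]` placeholder in sub.sh. The wrapper name getorf_orfs and the file names input.fasta and orfs.fasta are hypothetical, and the chosen values (bacterial genetic code, ORFs of at least 300 nucleotides, translations between START and STOP codons) are only an example, not a recommendation.

```shell
# Sketch only: getorf_orfs, input.fasta and orfs.fasta are hypothetical names.
getorf_orfs() {
    in_fasta=$1    # nucleotide sequence(s) to scan
    out_fasta=$2   # protein translations of the ORFs that are found

    # Print the command first so it is recorded in the job's .out file.
    echo "getorf -sequence $in_fasta -outseq $out_fasta -table 11 -minsize 300 -find 1"

    # Run it only where getorf is on the PATH (after the module is loaded)
    # and the input file actually exists.
    if command -v getorf >/dev/null 2>&1 && [ -f "$in_fasta" ]; then
        # -table 11   : Bacterial genetic code
        # -minsize 300: report ORFs of at least 300 nucleotides
        # -find 1     : translation of regions between START and STOP codons
        getorf -sequence "$in_fasta" -outseq "$out_fasta" \
               -table 11 -minsize 300 -find 1
    fi
}

getorf_orfs input.fasta orfs.fasta
```

Qualifiers that are left out, such as -maxsize or -[no]reverse, keep the defaults shown in square brackets in the help text.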
Installation
Source code is obtained from EMBOSS
System
64-bit Linux