BLAST-Teaching: Difference between revisions

From Research Computing Center Wiki
Jump to navigation Jump to search
No edit summary
No edit summary
 
(One intermediate revision by one other user not shown)
Line 21: Line 21:
=== Running Program ===
=== Running Program ===


The last version of this application is at /usr/local/apps/eb/BLAST/2.2.26-Linux_x86_64
The last version of this application is at /apps/eb/BLAST/2.2.26-Linux_x86_64


To use this version, please load the module with
To use this version, please load the module with
Line 40: Line 40:
<nowiki>#</nowiki>SBATCH --time=<u>08:00:00</u><br>   
<nowiki>#</nowiki>SBATCH --time=<u>08:00:00</u><br>   
<nowiki>#</nowiki>SBATCH --output=BLAST.%j.out<br>
<nowiki>#</nowiki>SBATCH --output=BLAST.%j.out<br>
<nowiki>#</nowiki>SBATCH --error=BLAST.%j.err<br>
   
   
cd $SLURM_SUBMIT_DIR<br>
cd $SLURM_SUBMIT_DIR<br>
Line 59: Line 60:
<pre  class="gcommand">
<pre  class="gcommand">
ml BLAST/2.2.26-Linux_x86_64  
ml BLAST/2.2.26-Linux_x86_64  
blastall blastall --help
blastall --help


blastall 2.2.26  arguments:
blastall 2.2.26  arguments:

Latest revision as of 13:22, 27 August 2020

Category

Bioinformatics

Program On

Teaching

Version

2.2.26

Author / Distributor

BLAST

Description

"Basic Local Alignment Search Tool, or BLAST, is an algorithm for comparing primary biological sequence information, such as the amino-acid sequences of different proteins or the nucleotides of DNA sequences." More details are at BLAST

Running Program

The last version of this application is at /apps/eb/BLAST/2.2.26-Linux_x86_64

To use this version, please load the module with

ml BLAST/2.2.26-Linux_x86_64 

Here is an example of a shell script, sub.sh, to run on the batch queue:

#!/bin/bash
#SBATCH --job-name=j_BLAST
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu
#SBATCH --ntasks=1
#SBATCH --mem=10gb
#SBATCH --time=08:00:00
#SBATCH --output=BLAST.%j.out
#SBATCH --error=BLAST.%j.err

cd $SLURM_SUBMIT_DIR
ml BLAST/2.2.26-Linux_x86_64
blastall [options]

In the real submission script, at least all the above underlined values need to be reviewed or to be replaced by the proper values.

Please refer to Running_Jobs_on_the_teaching_cluster, Run X window Jobs and Run interactive Jobs for more details of running jobs at Teaching cluster.


Here is an example of job submission command:

sbatch ./sub.sh 

Documentation

ml BLAST/2.2.26-Linux_x86_64 
blastall --help

blastall 2.2.26   arguments:

  -p  Program Name [String]
  -d  Database [String]
    default = nr
  -i  Query File [File In]
    default = stdin
  -e  Expectation value (E) [Real]
    default = 10.0
  -m  alignment view options:
0 = pairwise,
1 = query-anchored showing identities,
2 = query-anchored no identities,
3 = flat query-anchored, show identities,
4 = flat query-anchored, no identities,
5 = query-anchored no identities and blunt ends,
6 = flat query-anchored, no identities and blunt ends,
7 = XML Blast output,
8 = tabular, 
9 tabular with comment lines
10 ASN, text
11 ASN, binary [Integer]
    default = 0
    range from 0 to 11
  -o  BLAST report Output File [File Out]  Optional
    default = stdout
  -F  Filter query sequence (DUST with blastn, SEG with others) [String]
    default = T
  -G  Cost to open a gap (-1 invokes default behavior) [Integer]
    default = -1
  -E  Cost to extend a gap (-1 invokes default behavior) [Integer]
    default = -1
  -X  X dropoff value for gapped alignment (in bits) (zero invokes default behavior)
      blastn 30, megablast 20, tblastx 0, all others 15 [Integer]
    default = 0
  -I  Show GI's in deflines [T/F]
    default = F
  -q  Penalty for a nucleotide mismatch (blastn only) [Integer]
    default = -3
  -r  Reward for a nucleotide match (blastn only) [Integer]
    default = 1
  -v  Number of database sequences to show one-line descriptions for (V) [Integer]
    default = 500
  -b  Number of database sequence to show alignments for (B) [Integer]
    default = 250
  -f  Threshold for extending hits, default if zero
      blastp 11, blastn 0, blastx 12, tblastn 13
      tblastx 13, megablast 0 [Real]
    default = 0
  -g  Perform gapped alignment (not available with tblastx) [T/F]
    default = T
  -Q  Query Genetic code to use [Integer]
    default = 1
  -D  DB Genetic code (for tblast[nx] only) [Integer]
    default = 1
  -a  Number of processors to use [Integer]
    default = 1
  -O  SeqAlign file [File Out]  Optional
  -J  Believe the query defline [T/F]
    default = F
  -M  Matrix [String]
    default = BLOSUM62
  -W  Word size, default if zero (blastn 11, megablast 28, all others 3) [Integer]
    default = 0
  -z  Effective length of the database (use zero for the real size) [Real]
    default = 0
  -K  Number of best hits from a region to keep. Off by default.
If used a value of 100 is recommended.  Very high values of -v or -b is also suggested [Integer]
    default = 0
  -P  0 for multiple hit, 1 for single hit (does not apply to blastn) [Integer]
    default = 0
  -Y  Effective length of the search space (use zero for the real size) [Real]
    default = 0
  -S  Query strands to search against database (for blast[nx], and tblastx)
       3 is both, 1 is top, 2 is bottom [Integer]
    default = 3
  -T  Produce HTML output [T/F]
    default = F
  -l  Restrict search of database to list of GI's [String]  Optional
  -U  Use lower case filtering of FASTA sequence [T/F]  Optional
  -y  X dropoff value for ungapped extensions in bits (0.0 invokes default behavior)
      blastn 20, megablast 10, all others 7 [Real]
    default = 0.0
  -Z  X dropoff value for final gapped alignment in bits (0.0 invokes default behavior)
      blastn/megablast 100, tblastx 0, all others 25 [Integer]
    default = 0
  -R  PSI-TBLASTN checkpoint file [File In]  Optional
  -n  MegaBlast search [T/F]
    default = F
  -L  Location on query sequence [String]  Optional
  -A  Multiple Hits window size, default if zero (blastn/megablast 0, all others 40 [Integer]
    default = 0
  -w  Frame shift penalty (OOF algorithm for blastx) [Integer]
    default = 0
  -t  Length of the largest intron allowed in a translated nucleotide sequence when linking multiple distinct alignments. (0 invokes default behavior; a negative value disables linking.) [Integer]
    default = 0
  -B  Number of concatenated queries, for blastn and tblastn [Integer]  Optional
    default = 0
  -V  Force use of the legacy BLAST engine [T/F]  Optional
    default = F
  -C  Use composition-based score adjustments for blastp or tblastn:
      As first character:
      D or d: default (equivalent to T)
      0 or F or f: no composition-based statistics
      2 or T or t: Composition-based score adjustments as in Bioinformatics 21:902-911,
      1: Composition-based statistics as in NAR 29:2994-3005, 2001
          2005, conditioned on sequence properties
      3: Composition-based score adjustment as in Bioinformatics 21:902-911,
          2005, unconditionally
      For programs other than tblastn, must either be absent or be D, F or 0.
           As second character, if first character is equivalent to 1, 2, or 3:
      U or u: unified p-value combining alignment p-value and compositional p-value in round 1 only
 [String]
    default = D
  -s  Compute locally optimal Smith-Waterman alignments (This option is only
      available for gapped tblastn.) [T/F]
    default = F


Back to Top

Installation

Source code is obtained from BLAST

System

64-bit Linux