RepeatScout-Teaching

From Research Computing Center Wiki

Latest revision as of 13:09, 10 August 2018

Category

Bioinformatics

Program On

Teaching

Version

1.05

Author / Distributor

RepeatScout

Description

"The purpose of the RepeatScout software is to identify repeat family sequences from genomes where hand-curated repeat databases (a la RepBase Update) are not available. In fact, the output of this program can be used as input to RepeatMasker as a way of automatically masking newly-sequenced genomes." More details are at RepeatScout

Running Program

The latest version of this application is at /usr/local/apps/eb/RepeatScout/1.05-foss-2016b

To use this version, please load the module with

ml RepeatScout/1.05-foss-2016b 
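
After loading the module, it can be worth confirming that the executable is actually on the PATH before submitting a job. The following is a minimal sketch, assuming that the module places a RepeatScout executable on the PATH; have_tool is a hypothetical helper name, not part of RepeatScout or the module system.

```shell
#!/bin/bash
# have_tool is a hypothetical helper: succeeds if the named command
# (executable, shell function, or builtin) is available in this shell.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Load the module only where the ml command exists (i.e. on the cluster).
if have_tool ml; then
    ml RepeatScout/1.05-foss-2016b
fi

# Report whether RepeatScout can now be found (assumes the module adds it to PATH).
if have_tool RepeatScout; then
    echo "RepeatScout found at: $(command -v RepeatScout)"
else
    echo "RepeatScout not on PATH; load the module first" >&2
fi
```

On a machine without the module system, the sketch simply reports that RepeatScout is not available.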

Here is an example of a shell script, sub.sh, to run on the batch queue:

#!/bin/bash
#SBATCH --job-name=j_RepeatScout
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu
#SBATCH --ntasks=1
#SBATCH --mem=10gb
#SBATCH --time=08:00:00
#SBATCH --output=RepeatScout.%j.out
#SBATCH --error=RepeatScout.%j.err

cd $SLURM_SUBMIT_DIR
ml RepeatScout/1.05-foss-2016b
RepeatScout [options]

In a real submission script, at least the underlined values above (for example, the time limit 08:00:00) need to be reviewed and, where necessary, replaced with proper values.

Please refer to Running_Jobs_on_the_teaching_cluster, Run X window Jobs and Run interactive Jobs for more details on running jobs on the teaching cluster.


Here is an example of a job submission command:

sbatch ./sub.sh 
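
The job ID reported at submission determines the log file names, because %j in the --output and --error lines of sub.sh expands to the job ID. The following is a sketch only, using sbatch's --parsable option to capture the ID; log_files is a hypothetical helper name introduced for illustration.

```shell
#!/bin/bash
# log_files is a hypothetical helper: given a job ID, print the log file names
# that the --output and --error patterns in sub.sh expand to (%j is the job ID).
log_files() {
    local jobid="$1"
    echo "RepeatScout.${jobid}.out RepeatScout.${jobid}.err"
}

# Submit only where sbatch and sub.sh exist. With --parsable, sbatch prints
# just the job ID (plus ";cluster" on multi-cluster setups, stripped below).
if command -v sbatch >/dev/null 2>&1 && [ -f ./sub.sh ]; then
    jobid=$(sbatch --parsable ./sub.sh)
    jobid=${jobid%%;*}
    if [ -n "$jobid" ]; then
        echo "Submitted job ${jobid}; logs: $(log_files "$jobid")"
    fi
fi
```

The log files appear in the directory from which the job was submitted once the job starts running.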

Documentation

ml RepeatScout/1.05-foss-2016b 
RepeatScout -h
RepeatScout Version 1.0.5

Usage: 
RepeatScout -sequence <seq> -output <out> -freq <freq> -l <l> [opts]
     -L # size of region to extend left or right (10000) 
     -match # reward for a match (+1)  
     -mismatch # penalty for a mismatch (-1) 
     -gap  # penalty for a gap (-5)
     -maxgap # maximum number of gaps allowed (5) 
     -maxoccurrences # cap on the number of sequences to align (10,000) 
     -maxrepeats # stop work after reporting this number of repeats (10000)
     -cappenalty # cap on penalty for exiting alignment of a sequence (-20)
     -tandemdist # of bases that must intervene between two l-mers for both to be counted (500)
     -minthresh # stop if fewer than this number of l-mers are found in the seeding phase (3)
     -minimprovement # amount that a the alignment needs to improve each step to be considered progress (3)
     -stopafter # stop the alignment after this number of no-progress columns (100)
     -goodlength # minimum required length for a sequence to be reported (50)
     -maxentropy # entropy (complexity) threshold for an l-mer to be considered (-.7)
     -v[v[v[v]]] How verbose do you want it to be?  -vvvv is super-verbose.
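
To illustrate how the usage line above maps to a concrete command, here is a small sketch that only prints a command line and does not run RepeatScout. The helper name repeatscout_cmd, the file names genome.fa, repeats.fa and genome.freq, and the l-mer length 15 are all hypothetical placeholders. The -freq file is assumed to have been generated beforehand (the RepeatScout distribution is understood to include a build_lmer_table tool for this; check its README to confirm).

```shell
#!/bin/bash
# repeatscout_cmd is a hypothetical helper: print a RepeatScout command line
# with the four required arguments from the usage text, followed by any
# extra options passed after them.
repeatscout_cmd() {
    local seq="$1" out="$2" freq="$3" l="$4"
    shift 4
    echo "RepeatScout -sequence ${seq} -output ${out} -freq ${freq} -l ${l}${*:+ $*}"
}

# Hypothetical file names; 15 is only an illustrative l-mer length.
# -goodlength is one of the optional flags listed in the usage text.
repeatscout_cmd genome.fa repeats.fa genome.freq 15 -goodlength 100
```

The printed line could then replace the "RepeatScout [options]" placeholder in sub.sh, after adjusting the file names and options for the actual data.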

Installation

Source code is obtained from RepeatScout

System

64-bit Linux