RepeatModeler-Teaching
Category
Bioinformatics
Program On
Teaching
Version
1.0.11
Author / Distributor
Description
"RepeatModeler is a de-novo repeat family identification and modeling package. At the heart of RepeatModeler are two de-novo repeat finding programs ( RECON and RepeatScout ) which employ complementary computational methods for identifying repeat element boundaries and family relationships from sequence data. RepeatModeler assists in automating the runs of RECON and RepeatScout given a genomic database and uses the output to build, refine and classify consensus models of putative interspersed repeats." More details are at RepeatModeler
Running Program
The latest version of this application is installed at /usr/local/apps/eb/RepeatModeler/1.0.11-foss-2016b
To use this version, please load the module with
ml RepeatModeler/1.0.11-foss-2016b
Here is an example of a shell script, sub.sh, to run on the batch queue:
#!/bin/bash
#SBATCH --job-name=j_RepeatModeler
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu
#SBATCH --ntasks=1
#SBATCH --mem=10gb
#SBATCH --time=08:00:00
#SBATCH --output=RepeatModeler.%j.out
#SBATCH --error=RepeatModeler.%j.err
cd $SLURM_SUBMIT_DIR
ml RepeatModeler/1.0.11-foss-2016b
RepeatModeler [options]
In a real submission script, at least the placeholder values above (such as the job name, e-mail address, resource requests, and [options]) need to be reviewed or replaced with proper values.
For more details on running jobs on the teaching cluster, please refer to Running_Jobs_on_the_teaching_cluster, Run X window Jobs and Run interactive Jobs.
Here is an example of a job submission command:
sbatch ./sub.sh
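The sample script requests a single task and leaves the RepeatModeler options open. If you use the -pa option listed in the program help, the cores requested from Slurm should cover both the parallel BLAST searches and the RepeatModeler script itself. The following is a minimal sketch, assuming you add a line such as `#SBATCH --cpus-per-task=4` to sub.sh. `pa_for_cores` is a hypothetical helper and `mygenome` is a placeholder database name; neither comes from RepeatModeler or from this page.

```shell
#!/bin/bash
# pa_for_cores is a hypothetical helper, not part of RepeatModeler.
# It reserves one core for the RepeatModeler script and leaves the rest
# for the parallel BLAST searches, following the -pa help text
# (10 cores -> "-pa 9"). It never returns less than 1.
pa_for_cores() {
    local cores=$1
    if [ "$cores" -gt 1 ]; then
        echo $(( cores - 1 ))
    else
        echo 1
    fi
}

# Slurm sets SLURM_CPUS_PER_TASK when --cpus-per-task is requested;
# assume a single core when the variable is unset.
pa=$(pa_for_cores "${SLURM_CPUS_PER_TASK:-1}")

# Print the command line instead of running it, so the value can be checked.
# "mygenome" is a placeholder database name.
echo "RepeatModeler -database mygenome -pa ${pa}"
```

Once the printed line looks right, the echo in sub.sh can be replaced by the command itself.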
Documentation
ml RepeatModeler/1.0.11-foss-2016b
RepeatModeler -h
No database indicated
NAME
RepeatModeler - Model repetitive DNA
SYNOPSIS
RepeatModeler [-options] -database <XDF Database>
DESCRIPTION
The options are:
-h(elp)
Detailed help
-database
The prefix name of a XDF formatted sequence database containing the
genomic sequence to use when building repeat models. The database
may be created with the WUBlast "xdformat" utility or with the
RepeatModeler wrapper script "BuildXDFDatabase".
-engine <abblast|wublast|ncbi>
The name of the search engine we are using. I.e abblast/wublast or
ncbi (rmblast version).
-pa #
Specify the number of shared-memory processors available to this
program. RepeatModeler will use the processors to run BLAST searches
in parallel. i.e on a machine with 10 cores one might use 1 core for
the script and 9 cores for the BLAST searches by running with "-pa
9".
-recoverDir <Previous Output Directory>
If a run fails in the middle of processing, it may be possible
recover some results and continue where the previous run left off.
Simply supply the output directory where the results of the failed
run were saved and the program will attempt to recover and continue
the run.
-srand #
Optionally set the seed of the random number generator to a known
value before the batches are randomly selected ( using Fisher Yates
Shuffling ). This is only useful if you need to reproduce the sample
choice between runs. This should be an integer number.
SEE ALSO
RepeatMasker, WUBlast
COPYRIGHT
Copyright 2005-2017 Institute for Systems Biology
AUTHOR
Robert Hubley <rhubley@systemsbiology.org>
Arian Smit <asmit@systemsbiology.org>
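The options in the help text above can be combined into a typical workflow: build a sequence database, run RepeatModeler on it, and resume the run if it fails. The sketch below prints the commands by default rather than executing them. Several things in it are assumptions: `BuildDatabase` is, as far as I know, the database helper shipped with RepeatModeler 1.0.x, but the help text names `BuildXDFDatabase`, so check which one is on your PATH after loading the module. `-engine ncbi` assumes the installation uses RMBlast. The helper functions, `mygenome`, `mygenome.fa`, and `RM_previous_run` are hypothetical names used only for illustration.

```shell
#!/bin/bash
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 in the environment to execute them for real.
DRY_RUN=${DRY_RUN:-1}

# Print a command and, outside dry-run mode, execute it.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# build_and_model DB FASTA PA SEED   (hypothetical helper)
# Builds the database, then models repeats with a fixed random seed so
# the batch sampling can be reproduced between runs (-srand).
build_and_model() {
    local db=$1 fasta=$2 pa=$3 seed=$4
    # Assumption: BuildDatabase is available after loading the module.
    run BuildDatabase -name "$db" -engine ncbi "$fasta" || return 1
    run RepeatModeler -database "$db" -engine ncbi -pa "$pa" -srand "$seed"
}

# resume_model DB PA PREVIOUS_DIR   (hypothetical helper)
# Continues a failed run from its output directory (-recoverDir).
resume_model() {
    local db=$1 pa=$2 prev=$3
    run RepeatModeler -database "$db" -engine ncbi -pa "$pa" -recoverDir "$prev"
}

# Placeholder values; replace them with your own database name, FASTA
# file, core count, seed, and the output directory of the failed run.
build_and_model mygenome mygenome.fa 3 42
resume_model mygenome 3 RM_previous_run
```

Keeping the dry run as the default makes it easy to review the exact command lines before submitting the job.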
Installation
The source code was obtained from RepeatModeler
System
64-bit Linux