RepeatModeler-Teaching
Category
Bioinformatics
Program On
Teaching
Version
1.0.11
Author / Distributor
Description
"RepeatModeler is a de-novo repeat family identification and modeling package. At the heart of RepeatModeler are two de-novo repeat finding programs ( RECON and RepeatScout ) which employ complementary computational methods for identifying repeat element boundaries and family relationships from sequence data. RepeatModeler assists in automating the runs of RECON and RepeatScout given a genomic database and uses the output to build, refine and classify consensus models of putative interspersed repeats." More details are at RepeatModeler
Running Program
The latest version of this application is installed at /usr/local/apps/eb/RepeatModeler/1.0.11-foss-2016b
To use this version, please load the module with
ml RepeatModeler/1.0.11-foss-2016b
Here is an example of a shell script, sub.sh, to run on the batch queue:
#!/bin/bash
#SBATCH --job-name=j_RepeatModeler
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu
#SBATCH --ntasks=1
#SBATCH --mem=10gb
#SBATCH --time=08:00:00
#SBATCH --output=RepeatModeler.%j.out
cd $SLURM_SUBMIT_DIR
ml RepeatModeler/1.0.11-foss-2016b
RepeatModeler [options]
In a real submission script, review at least the #SBATCH values above and the RepeatModeler [options] placeholder, and replace them with values appropriate for your job.
Please refer to Running_Jobs_on_the_teaching_cluster, Run X window Jobs and Run interactive Jobs for more details on running jobs on the teaching cluster.
Here is an example of a job submission command:
sbatch ./sub.sh
Documentation
ml RepeatModeler/1.0.11-foss-2016b
RepeatModeler
RepeatModeler -h
No database indicated
NAME
    RepeatModeler - Model repetitive DNA
SYNOPSIS
    RepeatModeler [-options] -database <XDF Database>
DESCRIPTION
    The options are:
    -h(elp)
        Detailed help
    -database
        The prefix name of a XDF formatted sequence database containing the genomic sequence to use when building repeat models. The database may be created with the WUBlast "xdformat" utility or with the RepeatModeler wrapper script "BuildXDFDatabase".
    -engine <abblast|wublast|ncbi>
        The name of the search engine we are using. I.e abblast/wublast or ncbi (rmblast version).
    -pa #
        Specify the number of shared-memory processors available to this program. RepeatModeler will use the processors to run BLAST searches in parallel. i.e on a machine with 10 cores one might use 1 core for the script and 9 cores for the BLAST searches by running with "-pa 9".
    -recoverDir <Previous Output Directory>
        If a run fails in the middle of processing, it may be possible recover some results and continue where the previous run left off. Simply supply the output directory where the results of the failed run were saved and the program will attempt to recover and continue the run.
    -srand #
        Optionally set the seed of the random number generator to a known value before the batches are randomly selected ( using Fisher Yates Shuffling ). This is only useful if you need to reproduce the sample choice between runs. This should be an integer number.
SEE ALSO
    RepeatMasker, WUBlast
COPYRIGHT
    Copyright 2005-2017 Institute for Systems Biology
AUTHOR
    Robert Hubley <rhubley@systemsbiology.org>
    Arian Smit <asmit@systemsbiology.org>
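The -pa guidance in the help text above (keep one core for the script, give the rest to the BLAST searches) could be sketched in a submission script roughly as follows. This is only an illustration under assumptions: it presumes the job requests cores with an added "#SBATCH --cpus-per-task=N" line so that Slurm sets SLURM_CPUS_PER_TASK, the helper name pa_for_cpus and the database prefix mydb are hypothetical, and the final line only prints the command rather than running it.

```shell
#!/bin/bash
# Hypothetical helper: turn a core count into a -pa value,
# keeping one core for the RepeatModeler script itself.
pa_for_cpus() {
    if [ "$1" -gt 1 ]; then
        echo $(( $1 - 1 ))
    else
        echo 1
    fi
}

# SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is requested;
# fall back to 1 core when it is not set.
pa=$(pa_for_cpus "${SLURM_CPUS_PER_TASK:-1}")

# "mydb" is a placeholder database prefix; -engine ncbi is one of the
# engine values listed in the help text. Remove "echo" to actually run.
echo RepeatModeler -engine ncbi -pa "${pa}" -database mydb
```

Printing the command first makes it easy to check the derived -pa value in the job's output file before committing to a long run.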
Installation
Source code is obtained from RepeatModeler
System
64-bit Linux