BUSCO-Sapelo2
Revision as of 14:24, 22 October 2020
Category
Bioinformatics
Program On
Sap2test
Version
4.0.5, 4.0.6
Author / Distributor
Description
"BUSCO - Benchmarking sets of Universal Single-Copy Orthologs." More details are at BUSCO
Running Program
Version 4.0.5
- Version 4.0.5 is installed at /apps/eb/BUSCO/4.0.5-foss-2019b-Python-3.7.4
BLAST+ v2.9.0 is loaded with this application. This version of BLAST+ allows BUSCO to use multiple cores. AUGUSTUS v3.3.3 is also loaded, with AUGUSTUS_CONFIG_PATH set correctly.
To use this version, please load the module with
ml BUSCO/4.0.5-foss-2019b-Python-3.7.4
Before running the program, please copy the config file config.ini to your current working folder, then edit the input file value and any other values in it as needed:
cp /apps/eb/BUSCO/4.0.5-foss-2019b-Python-3.7.4/config/config.ini config.ini
vim config.ini
export BUSCO_CONFIG_FILE=config.ini
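The setup steps above could also be wrapped in a small helper so that an already edited config.ini is not overwritten by a second copy. This is only a sketch: the function name prepare_busco_config is hypothetical, the default template path is the one quoted above, and exporting an absolute path is a design choice, not a requirement stated on this page.

```shell
#!/bin/bash
# Hypothetical helper (not part of BUSCO or of the cluster software).
# Copies the template config into the current folder only if no config.ini
# exists yet, then exports BUSCO_CONFIG_FILE as an absolute path.
prepare_busco_config() {
    # $1 (optional): path to the template config; defaults to the path given above
    local template="${1:-/apps/eb/BUSCO/4.0.5-foss-2019b-Python-3.7.4/config/config.ini}"

    if [ ! -f config.ini ]; then
        cp "$template" config.ini || return 1
    fi

    # An absolute path keeps the variable valid even if the job changes directory
    export BUSCO_CONFIG_FILE="$PWD/config.ini"
}
```

After sourcing the function, calling prepare_busco_config in the working folder and then editing config.ini (for example with vim) reproduces the manual steps.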
Here is an example of a shell script, sub.sh, to run on the batch queue:
#!/bin/bash
#SBATCH --job-name=busco              # Job name
#SBATCH --partition=batch             # Partition (queue) name
#SBATCH --ntasks=1                    # Run a single task
#SBATCH --cpus-per-task=4             # Number of CPU cores per task
#SBATCH --mem=10gb                    # Job memory request
#SBATCH --time=48:00:00               # Time limit hrs:min:sec
#SBATCH --output=log.%j.out           # Standard output log
#SBATCH --error=log.%j.err            # Standard error log
#SBATCH --mail-type=END,FAIL          # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=username@uga.edu  # Where to send mail

cd $SLURM_SUBMIT_DIR
ml BUSCO/4.0.5-foss-2019b-Python-3.7.4  # load BUSCO/4.0.5 module
time busco --config ./config.ini --cpu 4 [options]
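In the script above, the value 4 appears both in --cpus-per-task and in --cpu, so the two have to be changed together. One possible way to keep them in sync is to read the Slurm variable SLURM_CPUS_PER_TASK, which Slurm sets when --cpus-per-task is requested. The helper name busco_cpus and the fallback value of 1 are assumptions made for this sketch.

```shell
#!/bin/bash
# Hypothetical helper: report the number of threads BUSCO should use.
# SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is given;
# the fallback of 1 (assumed here) covers runs outside a Slurm job.
busco_cpus() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

# In sub.sh, the last line could then be written as:
#   time busco --config ./config.ini --cpu "$(busco_cpus)" [options]
echo "BUSCO would use $(busco_cpus) thread(s)"
```

With this change, only the #SBATCH --cpus-per-task line needs to be edited when the core count changes.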
Version 4.0.5
- Version 4.0.5 is also installed as a singularity image at /usr/local/singularity-images/busco-4.0.5.simg
To run BUSCO v4.0.5 included in this singularity image:
singularity exec /usr/local/singularity-images/busco-4.0.5.simg run_busco [options]
To get busco help info:
singularity exec /usr/local/singularity-images/busco-4.0.5.simg run_busco -h
To check busco version info:
singularity exec /usr/local/singularity-images/busco-4.0.5.simg run_busco -v
To check other programs included in this singularity image:
singularity exec /usr/local/singularity-images/busco-4.0.5.simg ls /usr/local/bin
singularity exec /usr/local/singularity-images/busco-4.0.5.simg ls /augustus
singularity exec /usr/local/singularity-images/busco-4.0.5.simg ls /ncbi-blast-2.2.31+/bin
singularity exec /usr/local/singularity-images/busco-4.0.5.simg ls /hmmer-3.2.1
Version 4.0.6
- Version 4.0.6 is installed in /usr/local/apps/eb/BUSCO/4.0.6-foss-2019b-Python-3.7.4
To use this version of busco, please first load the module with
module load BUSCO/4.0.6-foss-2019b-Python-3.7.4
This module will load other modules that this version of busco depends on.
Sample job submission script (sub.sh) to run busco version 4.0.5:
#PBS -S /bin/bash
#PBS -q batch
#PBS -N jobname
#PBS -l nodes=1:ppn=1
#PBS -l walltime=24:00:00
#PBS -l mem=10gb

cd $PBS_O_WORKDIR
singularity exec /usr/local/singularity-images/busco-4.0.5.simg run_busco [options]
where [options] must be replaced by the options (command and arguments) you want to use. Other job parameters, such as the maximum wall clock time, the maximum memory, the number of cores per node, and the job name, should also be adjusted to suit your run.
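As an illustration of what [options] might look like, the sketch below assembles the four core flags (-i, -l, -o, -m) and rejects mode names that are not among the six listed in the help text in the Documentation section. That help text is for BUSCO 3.0.2, so the flags should be checked with -h for the version you actually run. The function name busco_options and the file, lineage, and run names are placeholders, not values from this page.

```shell
#!/bin/bash
# Hypothetical helper: build a BUSCO option string from four arguments.
#   $1 input FASTA file, $2 lineage, $3 output name (not a path), $4 mode
# Values containing spaces are not supported by this simple sketch.
busco_options() {
    case "$4" in
        geno|genome|tran|transcriptome|prot|proteins) ;;   # modes named in the help text
        *) echo "invalid mode: $4" >&2; return 1 ;;
    esac
    echo "-i $1 -l $2 -o $3 -m $4"
}

# Placeholder values: replace with your own input, lineage, and run name
busco_options assembly.fasta path/to/lineage busco_run genome
```

The printed string would take the place of [options] in the submission script.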
Here is an example of a job submission command:
qsub ./sub.sh
Documentation
ml busco/3.0.2
python /usr/local/apps/gb/busco/3.0.2/scripts/run_BUSCO.py -h

usage: python BUSCO.py -i [SEQUENCE_FILE] -l [LINEAGE] -o [OUTPUT_NAME] -m [MODE] [OTHER OPTIONS]

Welcome to BUSCO 3.0.2: the Benchmarking Universal Single-Copy Ortholog assessment tool.
For more detailed usage information, please review the README file provided with this distribution and the BUSCO user guide.

optional arguments:
  -i FASTA FILE, --in FASTA FILE
        Input sequence file in FASTA format. Can be an assembled genome or transcriptome (DNA), or protein sequences from an annotated gene set.
  -c N, --cpu N
        Specify the number (N=integer) of threads/cores to use.
  -o OUTPUT, --out OUTPUT
        Give your analysis run a recognisable short name. Output folders and files will be labelled with this name. WARNING: do not provide a path
  -e N, --evalue N
        E-value cutoff for BLAST searches. Allowed formats, 0.001 or 1e-03 (Default: 1e-03)
  -m MODE, --mode MODE
        Specify which BUSCO analysis mode to run. There are three valid modes:
        - geno or genome, for genome assemblies (DNA)
        - tran or transcriptome, for transcriptome assemblies (DNA)
        - prot or proteins, for annotated gene sets (protein)
  -l LINEAGE, --lineage_path LINEAGE
        Specify location of the BUSCO lineage data to be used. Visit http://busco.ezlab.org for available lineages.
  -f, --force
        Force rewriting of existing files. Must be used when output files with the provided name already exist.
  -r, --restart
        Restart an uncompleted run. Not available for the protein mode
  -sp SPECIES, --species SPECIES
        Name of existing Augustus species gene finding parameters. See Augustus documentation for available options.
  --augustus_parameters AUGUSTUS_PARAMETERS
        Additional parameters for the fine-tuning of Augustus run. For the species, do not use this option. Use single quotes as follow: '--param1=1 --param2=2', see Augustus documentation for available options.
  -t PATH, --tmp_path PATH
        Where to store temporary files (Default: ./tmp/)
  --limit REGION_LIMIT
        How many candidate regions (contig or transcript) to consider per BUSCO (default: 3)
  --long
        Optimization mode Augustus self-training (Default: Off) adds considerably to the run time, but can improve results for some non-model organisms
  -q, --quiet
        Disable the info logs, displays only errors
  -z, --tarzip
        Tarzip the output folders likely to contain thousands of files
  --blast_single_core
        Force tblastn to run on a single core and ignore the --cpu argument for this step only. Useful if inconsistencies when using multiple threads are noticed
  -v, --version
        Show this version and exit
  -h, --help
        Show this help message and exit
Installation
Source code is obtained from BUSCO
System
64-bit Linux