BUSCO-Teaching
Category
Bioinformatics
Program On
Teaching
Version
3.0.2
Author / Distributor
Description
"Benchmarking sets of Universal Single-Copy Orthologs". More details are at BUSCO.
Running Program
The latest version of this application is installed at /usr/local/apps/gb/BUSCO/3.0.2
Bacteria and Eukaryota data sets are located at /usr/local/apps/gb/BUSCO/3.0.2.data
To use this version, please load the module with
ml BUSCO/3.0.2
Before running the program, copy the Augustus and BUSCO config files to your working directory, then set the input file and any other required values in the BUSCO config file, config.ini:
cp -R /usr/local/apps/eb/AUGUSTUS/3.2.3-foss-2016b-Python-2.7.14/config config_augustus
export AUGUSTUS_CONFIG_PATH=config_augustus
cp /usr/local/apps/gb/BUSCO/3.0.2/config/config.ini config.ini
vi config.ini
export BUSCO_CONFIG_FILE=config.ini
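The setup above can be checked before a job is submitted. The following is a minimal sketch, not part of BUSCO: check_busco_env is a hypothetical helper name, and the writability test assumes that BUSCO's Augustus steps write into the copied config directory (presumably the reason a private copy is made).

```shell
#!/bin/bash
# Hypothetical helper (not shipped with BUSCO): confirm that the two
# variables exported above point at usable paths.
check_busco_env() {
    local status=0
    # The Augustus config copy must exist as a directory and be writable.
    if [ ! -d "${AUGUSTUS_CONFIG_PATH:-}" ] || [ ! -w "${AUGUSTUS_CONFIG_PATH:-}" ]; then
        echo "AUGUSTUS_CONFIG_PATH is not a writable directory: '${AUGUSTUS_CONFIG_PATH:-}'" >&2
        status=1
    fi
    # The BUSCO config file must exist as a regular file.
    if [ ! -f "${BUSCO_CONFIG_FILE:-}" ]; then
        echo "BUSCO_CONFIG_FILE is not a file: '${BUSCO_CONFIG_FILE:-}'" >&2
        status=1
    fi
    return "$status"
}

# Example: run after the export commands, in the same shell:
#   check_busco_env && echo "BUSCO environment looks usable"
```

Because both variables are set to relative paths, the check (and the job itself) should be run from the directory that holds config_augustus and config.ini.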
Here is an example of a shell script, sub.sh, to run on the batch queue:
#!/bin/bash
#SBATCH --job-name=j_BUSCO
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu
#SBATCH --ntasks=1
#SBATCH --mem=10gb
#SBATCH --time=08:00:00
#SBATCH --output=BUSCO.%j.out
#SBATCH --error=BUSCO.%j.err
cd $SLURM_SUBMIT_DIR
ml BUSCO/3.0.2
python /usr/local/apps/gb/BUSCO/3.0.2/scripts/run_BUSCO.py [options]
In a real submission script, at least the placeholder values above (such as the job name, e-mail address, resource requests, and [options]) need to be reviewed and replaced with proper values.
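As an illustration of what could replace [options], here is a sketch that assembles a command line from the -i, -l, -o, -m and -c options listed in the help output. The helper name busco_command and the example file, lineage and run names are hypothetical. The function only prints the command so that it can be reviewed first.

```shell
#!/bin/bash
# Hypothetical helper: prints a BUSCO command line instead of running it.
# Arguments: input FASTA, lineage directory, run name, mode, [threads].
busco_command() {
    local input="$1" lineage="$2" run_name="$3" mode="$4"
    # Use the fifth argument if given, else Slurm's CPU count, else 1.
    local threads="${5:-${SLURM_CPUS_PER_TASK:-1}}"
    echo python /usr/local/apps/gb/BUSCO/3.0.2/scripts/run_BUSCO.py \
        -i "$input" -l "$lineage" -o "$run_name" -m "$mode" -c "$threads"
}

# Example (assembly.fasta, LINEAGE_NAME and my_busco_run are placeholders):
#   busco_command assembly.fasta \
#       /usr/local/apps/gb/BUSCO/3.0.2.data/LINEAGE_NAME my_busco_run geno
```

Slurm sets SLURM_CPUS_PER_TASK only when --cpus-per-task is requested. The example script above does not request it, so the sketch falls back to a single thread unless a thread count is passed explicitly.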
Please refer to Running_Jobs_on_the_teaching_cluster, Run X window Jobs and Run interactive Jobs for more details on running jobs on the teaching cluster.
Here is an example of a job submission command:
sbatch ./sub.sh
Documentation
ml BUSCO/3.0.2
python /usr/local/apps/gb/BUSCO/3.0.2/scripts/run_BUSCO.py -h

usage: python BUSCO.py -i [SEQUENCE_FILE] -l [LINEAGE] -o [OUTPUT_NAME] -m [MODE] [OTHER OPTIONS]

Welcome to BUSCO 3.0.2: the Benchmarking Universal Single-Copy Ortholog assessment tool.
For more detailed usage information, please review the README file provided with this distribution and the BUSCO user guide.

optional arguments:
  -i FASTA FILE, --in FASTA FILE
                        Input sequence file in FASTA format. Can be an assembled genome or transcriptome (DNA), or protein sequences from an annotated gene set.
  -c N, --cpu N         Specify the number (N=integer) of threads/cores to use.
  -o OUTPUT, --out OUTPUT
                        Give your analysis run a recognisable short name. Output folders and files will be labelled with this name. WARNING: do not provide a path
  -e N, --evalue N      E-value cutoff for BLAST searches. Allowed formats, 0.001 or 1e-03 (Default: 1e-03)
  -m MODE, --mode MODE  Specify which BUSCO analysis mode to run.
                        There are three valid modes:
                        - geno or genome, for genome assemblies (DNA)
                        - tran or transcriptome, for transcriptome assemblies (DNA)
                        - prot or proteins, for annotated gene sets (protein)
  -l LINEAGE, --lineage_path LINEAGE
                        Specify location of the BUSCO lineage data to be used.
                        Visit http://busco.ezlab.org for available lineages.
  -f, --force           Force rewriting of existing files. Must be used when output files with the provided name already exist.
  -r, --restart         Restart an uncompleted run. Not available for the protein mode
  -sp SPECIES, --species SPECIES
                        Name of existing Augustus species gene finding parameters. See Augustus documentation for available options.
  --augustus_parameters AUGUSTUS_PARAMETERS
                        Additional parameters for the fine-tuning of Augustus run. For the species, do not use this option.
                        Use single quotes as follow: '--param1=1 --param2=2', see Augustus documentation for available options.
  -t PATH, --tmp_path PATH
                        Where to store temporary files (Default: ./tmp/)
  --limit REGION_LIMIT  How many candidate regions (contig or transcript) to consider per BUSCO (default: 3)
  --long                Optimization mode Augustus self-training (Default: Off) adds considerably to the run time, but can improve results for some non-model organisms
  -q, --quiet           Disable the info logs, displays only errors
  -z, --tarzip          Tarzip the output folders likely to contain thousands of files
  --blast_single_core   Force tblastn to run on a single core and ignore the --cpu argument for this step only. Useful if inconsistencies when using multiple threads are noticed
  -v, --version         Show this version and exit
  -h, --help            Show this help message and exit
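Two constraints in the help text above are easy to miss: -m accepts only six mode spellings, and -o must be a plain name rather than a path. The following sketch checks both before a job is submitted. validate_busco_args is a hypothetical helper, not something BUSCO provides, and BUSCO performs its own argument checking as well.

```shell
#!/bin/bash
# Hypothetical pre-flight check based on the help text above.
# Arguments: mode (value for -m), run name (value for -o).
validate_busco_args() {
    local mode="$1" run_name="$2"
    # -m: only the short and long spellings listed in the help are accepted.
    case "$mode" in
        geno|genome|tran|transcriptome|prot|proteins) ;;
        *) echo "invalid mode: '$mode'" >&2; return 1 ;;
    esac
    # -o: must be a short name, so reject empty values and anything with a slash.
    case "$run_name" in
        ""|*/*) echo "run name must be non-empty and must not be a path: '$run_name'" >&2; return 1 ;;
    esac
    return 0
}

# Example:
#   validate_busco_args geno my_busco_run && sbatch ./sub.sh
```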
Installation
Source code is obtained from BUSCO
System
64-bit Linux