BUSCO-Sapelo2: Difference between revisions

From Research Computing Center Wiki
(18 intermediate revisions by 4 users not shown)
Latest revision as of 09:21, 9 May 2024

Category

Bioinformatics

Program On

Sapelo2

Version

2.0, 4.0.5, 5.4.7, 5.5.0

Author / Distributor

BUSCO

Description

"BUSCO - Benchmarking sets of Universal Single-Copy Orthologs." More details are at BUSCO

Running Program

Version 5.4.7

  • Version 5.4.7 is installed at /apps/eb/BUSCO/5.4.7-foss-2022a

BLAST+ v2.13.0 is loaded with this application; this version of BLAST+ lets BUSCO use multiple cores. AUGUSTUS v3.5.0 is also loaded, with AUGUSTUS_CONFIG_PATH set correctly.

To use this version of BUSCO, please load the module with

ml BUSCO/5.4.7-foss-2022a

Before running the program, copy the BUSCO config file config.ini to your current working folder and edit the input file value and any other values as needed. Also copy the AUGUSTUS config folder to the same place:

cp -r /apps/eb/AUGUSTUS/3.5.0-foss-2022a/config config_augustus
export AUGUSTUS_CONFIG_PATH=config_augustus

cp /apps/eb/BUSCO/5.4.7-foss-2022a/config/config.ini config.ini
vim config.ini
export BUSCO_CONFIG_FILE=config.ini
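The exports above use relative paths, which only resolve while the shell stays in the working folder. The following is a minimal sketch that records absolute paths instead, assuming the two cp commands above were run in the current directory. The variable name workdir is introduced here for illustration only.

```shell
# Record absolute paths so the settings still resolve after a later "cd".
# Assumes config_augustus/ and config.ini are in the current directory.
workdir=$(pwd)
export AUGUSTUS_CONFIG_PATH="${workdir}/config_augustus"
export BUSCO_CONFIG_FILE="${workdir}/config.ini"
echo "AUGUSTUS_CONFIG_PATH=${AUGUSTUS_CONFIG_PATH}"
echo "BUSCO_CONFIG_FILE=${BUSCO_CONFIG_FILE}"
```

The example sub.sh on this page takes the same approach by prefixing both values with ${PWD}.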

Version 5.5.0

  • Version 5.5.0 is installed at /apps/eb/BUSCO/5.5.0-foss-2022a

To use this version of BUSCO, please first load the module with

ml BUSCO/5.5.0-foss-2022a

BLAST+ v2.13.0 is loaded with this application; this version of BLAST+ lets BUSCO use multiple cores. AUGUSTUS v3.5.0 is also loaded, with AUGUSTUS_CONFIG_PATH set correctly.

Before running the program, copy the BUSCO config file config.ini to your current working folder and edit the input file value and any other values as needed. Also copy the AUGUSTUS config folder to the same place:

cp -r /apps/eb/AUGUSTUS/3.5.0-foss-2022a/config config_augustus
export AUGUSTUS_CONFIG_PATH=config_augustus

cp /apps/eb/BUSCO/5.5.0-foss-2022a/config/config.ini config.ini
vim config.ini
export BUSCO_CONFIG_FILE=config.ini
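As an alternative to editing config.ini interactively in vim, the values can be set from the command line. The sketch below is not part of BUSCO. It assumes the copied config.ini lists its settings as "key = value" lines that may be commented out with a leading ";" (for example ";in = ..." or ";cpu = ..."), so check your copy of the file before relying on it. The helper name set_ini_value and the input path are hypothetical.

```shell
# set_ini_value FILE KEY VALUE
# Rewrites the line "KEY = ..." (or its commented form ";KEY = ...") as
# "KEY = VALUE". The key must already be present in FILE; nothing is appended.
# A backup of the original file is kept as FILE.bak.
set_ini_value() {
    ini_file=$1
    ini_key=$2
    # Escape \, & and | so the value is safe inside the sed replacement.
    ini_value=$(printf '%s' "$3" | sed 's/[\\&|]/\\&/g')
    sed -i.bak "s|^;\{0,1\}[[:space:]]*${ini_key}[[:space:]]*=.*|${ini_key} = ${ini_value}|" "$ini_file"
}

# Example calls (hypothetical input path):
#   set_ini_value config.ini in /scratch/username/genome.fasta
#   set_ini_value config.ini cpu 4
```

The pattern is anchored at the start of the line, so setting "out" leaves an "out_path" line untouched.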


Example shell script sub.sh to run BUSCO/5.5.0 on the batch partition:

#!/bin/bash
#SBATCH --job-name=busco                # Job name
#SBATCH --partition=batch               # Partition (queue) name
#SBATCH --ntasks=1                      # Run a single task	
#SBATCH --cpus-per-task=4               # Number of CPU cores per task
#SBATCH --mem=10gb                      # Job memory request
#SBATCH --time=48:00:00                 # Time limit hrs:min:sec
#SBATCH --output=log.%j.out             # Standard output log
#SBATCH --error=log.%j.err              # Standard error log
#SBATCH --export=NONE                   # Don't export user's explicit env variables to compute node
#SBATCH --mail-type=END,FAIL            # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=username@uga.edu    # Where to send mail	

cd $SLURM_SUBMIT_DIR

ml BUSCO/5.5.0-foss-2022a  # load BUSCO v5.5.0 module

export AUGUSTUS_CONFIG_PATH=${PWD}/config_augustus
export BUSCO_CONFIG_FILE=${PWD}/config.ini

time busco --config ./config.ini --cpu 4 [options]

where [options] needs to be replaced by the options (command and arguments) you want to use. Other parameters of the job, such as the time limit, maximum memory, number of cores, and the job name, need to be modified appropriately as well.
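When the number of cores is changed, --cpus-per-task and --cpu have to be changed together. One way to keep them in sync, sketched here as a suggestion, is to read the value that Slurm exports as SLURM_CPUS_PER_TASK. The variable name busco_cpus is introduced for illustration.

```shell
# Derive the thread count from Slurm instead of hard-coding it.
# SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is given;
# fall back to 1 when it is absent (e.g. when testing outside a job).
busco_cpus="${SLURM_CPUS_PER_TASK:-1}"
echo "BUSCO will be started with --cpu ${busco_cpus}"

# In sub.sh the run line would then read (kept as a comment here because
# [options] still has to be filled in):
#   time busco --config ./config.ini --cpu "${busco_cpus}" [options]
```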


Version 2.0, Singularity Image

  • Version 2.0 is installed as a singularity image at /apps/singularity-images/busco-2.0.simg

Version 4.0.5, Singularity Image

  • Version 4.0.5 is installed as a singularity image at /apps/singularity-images/busco-4.0.5.simg

To run this singularity image:

singularity exec /apps/singularity-images/busco-4.0.5.simg run_busco [options]

To get busco help info:

singularity exec /apps/singularity-images/busco-4.0.5.simg run_busco -h

To check busco version info:

singularity exec /apps/singularity-images/busco-4.0.5.simg run_busco -v

To check the other programs included in this BUSCO singularity image:

singularity exec /apps/singularity-images/busco-4.0.5.simg ls /usr/local/bin
singularity exec /apps/singularity-images/busco-4.0.5.simg ls /augustus
singularity exec /apps/singularity-images/busco-4.0.5.simg ls /ncbi-blast-2.2.31+/bin
singularity exec /apps/singularity-images/busco-4.0.5.simg ls /hmmer-3.2.1
singularity exec /apps/singularity-images/busco-4.0.5.simg ls /prodigal

Before running the BUSCO singularity container, copy the AUGUSTUS config folder to your current working folder:

cp -r /apps/eb/AUGUSTUS/3.3.3-foss-2019b/config config_augustus
export AUGUSTUS_CONFIG_PATH=config_augustus

Also copy the BUSCO config file config.ini from the singularity image to your current working folder, and edit the input file value and any other values as needed:

singularity exec /apps/singularity-images/busco-4.0.5.simg cp /busco/config/config.ini .
vim config.ini
export BUSCO_CONFIG_FILE=config.ini


Example shell script sub.sh to run BUSCO/4.0.5 singularity container:

#!/bin/bash
#SBATCH --job-name=busco              # Job name
#SBATCH --partition=batch             # Partition (queue) name
#SBATCH --ntasks=1                    # Run a single task	
#SBATCH --cpus-per-task=4             # Number of CPU cores per task
#SBATCH --mem=10gb                    # Job memory request
#SBATCH --time=24:00:00               # Time limit hrs:min:sec
#SBATCH --output=log.%j.out           # Standard output log
#SBATCH --error=log.%j.err            # Standard error log
#SBATCH --export=NONE                 # Don't export user's explicit env variables to compute node
#SBATCH --mail-type=END,FAIL          # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=username@uga.edu  # Where to send mail	

cd $SLURM_SUBMIT_DIR

export AUGUSTUS_CONFIG_PATH=${PWD}/config_augustus
export BUSCO_CONFIG_FILE=${PWD}/config.ini

time singularity exec --bind ./config_augustus/:/augustus/config /apps/singularity-images/busco-4.0.5.simg run_busco --config ./config.ini --cpu 4 [options]

where [options] needs to be replaced by the options (command and arguments) you want to use. Other parameters of the job, such as the maximum wall clock time, maximum memory, the number of cores per node, and the job name, need to be modified appropriately as well.


Here is an example of a job submission command:

sbatch sub.sh 
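Both example scripts expect config_augustus and config.ini in the submission folder. A small check before submitting can catch a missing file early. This is a sketch only, and the function name preflight is hypothetical.

```shell
# preflight DIR: succeed only if DIR holds the files the example sub.sh expects.
# DIR defaults to the current directory. Missing items are reported on stderr.
preflight() {
    pf_dir=${1:-.}
    pf_status=0
    [ -d "$pf_dir/config_augustus" ] || { echo "missing: $pf_dir/config_augustus" >&2; pf_status=1; }
    [ -f "$pf_dir/config.ini" ]      || { echo "missing: $pf_dir/config.ini" >&2; pf_status=1; }
    [ -f "$pf_dir/sub.sh" ]          || { echo "missing: $pf_dir/sub.sh" >&2; pf_status=1; }
    return "$pf_status"
}

# Usage: submit only when the check passes.
#   preflight . && sbatch sub.sh
```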

Documentation

[cft07037@b1-24 bin]$ ml BUSCO/5.5.0-foss-2022a
[cft07037@b1-24 bin]$ busco -h
usage: busco -i [SEQUENCE_FILE] -l [LINEAGE] -o [OUTPUT_NAME] -m [MODE] [OTHER OPTIONS]

Welcome to BUSCO 5.5.0: the Benchmarking Universal Single-Copy Ortholog assessment tool.
For more detailed usage information, please review the README file provided with this distribution and the BUSCO user guide. Visit this page https://gitlab.com/ezlab/busco#how-to-cite-busco to see how to cite BUSCO

optional arguments:
  -i SEQUENCE_FILE, --in SEQUENCE_FILE
                        Input sequence file in FASTA format. Can be an assembled genome or transcriptome (DNA), or protein sequences from an annotated gene set. Also possible to use a path to a directory containing multiple input files.
  -o OUTPUT, --out OUTPUT
                        Give your analysis run a recognisable short name. Output folders and files will be labelled with this name. The path to the output folder is set with --out_path.
  -m MODE, --mode MODE  Specify which BUSCO analysis mode to run.
                        There are three valid modes:
                        - geno or genome, for genome assemblies (DNA)
                        - tran or transcriptome, for transcriptome assemblies (DNA)
                        - prot or proteins, for annotated gene sets (protein)
  -l LINEAGE, --lineage_dataset LINEAGE
                        Specify the name of the BUSCO lineage to be used.
  --augustus            Use augustus gene predictor for eukaryote runs
  --augustus_parameters --PARAM1=VALUE1,--PARAM2=VALUE2
                        Pass additional arguments to Augustus. All arguments should be contained within a single string with no white space, with each argument separated by a comma.
  --augustus_species AUGUSTUS_SPECIES
                        Specify a species for Augustus training.
  --auto-lineage        Run auto-lineage to find optimum lineage path
  --auto-lineage-euk    Run auto-placement just on eukaryote tree to find optimum lineage path
  --auto-lineage-prok   Run auto-lineage just on non-eukaryote trees to find optimum lineage path
  -c N, --cpu N         Specify the number (N=integer) of threads/cores to use.
  --config CONFIG_FILE  Provide a config file
  --contig_break n      Number of contiguous Ns to signify a break between contigs. Default is n=10.
  --datasets_version DATASETS_VERSION
                        Specify the version of BUSCO datasets, e.g. odb10
  --download [dataset ...]
                        Download dataset. Possible values are a specific dataset name, "all", "prokaryota", "eukaryota", or "virus". If used together with other command line arguments, make sure to place this last.
  --download_base_url DOWNLOAD_BASE_URL
                        Set the url to the remote BUSCO dataset location
  --download_path DOWNLOAD_PATH
                        Specify local filepath for storing BUSCO dataset downloads
  -e N, --evalue N      E-value cutoff for BLAST searches. Allowed formats, 0.001 or 1e-03 (Default: 1e-03)
  -f, --force           Force rewriting of existing files. Must be used when output files with the provided name already exist.
  -h, --help            Show this help message and exit
  --limit N             How many candidate regions (contig or transcript) to consider per BUSCO (default: 3)
  --list-datasets       Print the list of available BUSCO datasets
  --long                Optimization Augustus self-training mode (Default: Off); adds considerably to the run time, but can improve results for some non-model organisms
  --metaeuk_parameters "--PARAM1=VALUE1,--PARAM2=VALUE2"
                        Pass additional arguments to Metaeuk for the first run. All arguments should be contained within a single string with no white space, with each argument separated by a comma.
  --metaeuk_rerun_parameters "--PARAM1=VALUE1,--PARAM2=VALUE2"
                        Pass additional arguments to Metaeuk for the second run. All arguments should be contained within a single string with no white space, with each argument separated by a comma.
  --miniprot            Use miniprot gene predictor for eukaryote runs
  --offline             To indicate that BUSCO cannot attempt to download files
  --out_path OUTPUT_PATH
                        Optional location for results folder, excluding results folder name. Default is current working directory.
  -q, --quiet           Disable the info logs, displays only errors
  -r, --restart         Continue a run that had already partially completed.
  --scaffold_composition
                        Writes ACGTN content per scaffold to a file scaffold_composition.txt
  --tar                 Compress some subdirectories with many files to save space
  --update-data         Download and replace with last versions all lineages datasets and files necessary to their automated selection
  -v, --version         Show this version and exit
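
As an illustration of how the options listed above combine, the sketch below assembles a genome-mode command line. The input file name, run name, and lineage (bacteria_odb10) are placeholders chosen for this example; run busco --list-datasets to see which lineage names are actually available. The command is only executed when busco and the input file are both present, otherwise it is printed.

```shell
# Placeholder values; replace them with your own input, lineage and run name.
input=genome.fasta
lineage=bacteria_odb10
run_name=busco_genome_run
busco_args="-i ${input} -l ${lineage} -o ${run_name} -m genome --cpu 4"

if command -v busco >/dev/null 2>&1 && [ -f "$input" ]; then
    # Word splitting of $busco_args is intended; the values contain no spaces.
    busco $busco_args
else
    echo "Not running (busco or ${input} not found). Command would be:"
    echo "busco ${busco_args}"
fi
```

In sub.sh, the part after "busco" takes the place of [options] and the hard-coded --cpu 4.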


Installation

Source code is obtained from BUSCO

System

64-bit Linux