Bactopia-Sapelo2
Category
Bioinformatics
Program On
Sapelo2
Version
3.2.0
Author / Distributor
Please see https://bactopia.github.io/v3.1.0/
Description
Bactopia is a flexible pipeline for complete analysis of bacterial genomes. This module can be loaded directly: module load Bactopia/3.2.0-conda
Running Program
To load Bactopia, use
ml Bactopia/3.2.0-conda
Please ensure you do not have any base conda environments active when running this program.
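One way to confirm that no conda environment is active before loading the module is sketched below. It assumes that conda exports the CONDA_DEFAULT_ENV variable while an environment (including base) is active; the function name no_conda_env_active is a hypothetical helper, not part of Bactopia or Sapelo2.

```shell
#!/usr/bin/env bash
# Return 0 if no conda environment appears to be active, 1 otherwise.
# Assumption: conda sets CONDA_DEFAULT_ENV while an environment is active.
no_conda_env_active() {
    if [ -n "${CONDA_DEFAULT_ENV:-}" ]; then
        echo "Active conda environment: ${CONDA_DEFAULT_ENV}; run 'conda deactivate' first." >&2
        return 1
    fi
    return 0
}

# Usage on Sapelo2 (shown as a comment, not executed here):
#   no_conda_env_active && ml Bactopia/3.2.0-conda
```

If the check fails, running "conda deactivate" until the prompt no longer shows an environment name should clear it.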
For errors such as "FileNotFoundError: [Errno 2] No such file or directory: '/home/MyID/.bactopia/conda/bioconda--tb-profiler-6.6.3/share/snpeff-5.2-1/snpEff.config'"
Bactopia expects a snpEff.config file in /home/MyID/.bactopia/conda/bioconda--tb-profiler-6.6.3/share/snpeff-5.2-1/. If Bactopia installed a different version of snpeff (that is, not 5.2-1), you can create a symbolic link to your actual config file.
First, check which versions of snpeff are installed by listing the share directory and looking for directories named snpeff-<version>:
ls /home/MyID/.bactopia/conda/bioconda--tb-profiler-6.6.3/share/
There should be one directory for version 5.2-1 and another for a different version. Your snpEff.config file should exist inside the directory that is not version 5.2-1. The 5.2-1 directory will not contain a snpEff.config file until you create the link.
Then create a symbolic link in the snpeff-5.2-1 directory that points to the actual snpEff.config file in the other snpeff-<version> directory (change MyID to your own MyID and <version> to the other snpeff version that is not 5.2-1):
ln -s /home/MyID/.bactopia/conda/bioconda--tb-profiler-6.6.3/share/snpeff-<version>/snpEff.config /home/MyID/.bactopia/conda/bioconda--tb-profiler-6.6.3/share/snpeff-5.2-1/snpEff.config
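The symbolic-link fix above can also be scripted so that the other snpeff-<version> directory is found automatically. This is a minimal sketch: link_snpeff_config is a hypothetical helper (not part of Bactopia), and it assumes the share directory layout described above, with an existing snpeff-5.2-1 directory and exactly one other snpeff-<version> directory holding snpEff.config.

```shell
#!/usr/bin/env bash
# Link snpEff.config from whichever snpeff-<version> directory contains it
# into snpeff-5.2-1, where Bactopia looks for it.
# link_snpeff_config is a hypothetical helper, not part of Bactopia.
link_snpeff_config() {
    local share_dir="$1"
    local target="${share_dir}/snpeff-5.2-1/snpEff.config"
    local dir

    # Nothing to do if the expected file (or link) already exists.
    if [ -e "$target" ]; then
        echo "Already present: $target"
        return 0
    fi

    for dir in "$share_dir"/snpeff-*/; do
        dir="${dir%/}"
        # Skip the directory that is missing the config.
        [ "$dir" = "${share_dir}/snpeff-5.2-1" ] && continue
        if [ -f "${dir}/snpEff.config" ]; then
            ln -s "${dir}/snpEff.config" "$target" || return 1
            echo "Linked ${dir}/snpEff.config -> $target"
            return 0
        fi
    done

    echo "No snpEff.config found under ${share_dir}/snpeff-*" >&2
    return 1
}

# Usage (replace MyID with your own MyID; shown as a comment, not executed here):
#   link_snpeff_config /home/MyID/.bactopia/conda/bioconda--tb-profiler-6.6.3/share
```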
For errors such as "ERROR Delly failed, skipping INFO Running snpEff ERROR Writing to /tmp/bcftools.gTh9nG", which may appear in the log as "[02:10:40] ERROR Writing to /tmp/bcftools.gTh9nG utils.py:550"
Edit your snpEff.config file so that the 'data.dir' path points to your actual database directory that contains Mycobacterium_tuberculosis_h37rv (which may be something like /home/MyID/.bactopia/conda/bioconda--tb-profiler-6.6.3/share/snpeff-5.2-1/data/).
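The data.dir edit described above can be made with a text editor or, as sketched here, with GNU sed. The sketch assumes the config file already contains a line beginning with "data.dir" followed by "=", and that the new path contains no "|" or "&" characters; set_snpeff_data_dir is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Point data.dir in a snpEff.config file at a given database directory.
# set_snpeff_data_dir is a hypothetical helper, not part of snpEff or Bactopia.
set_snpeff_data_dir() {
    local config="$1" data_dir="$2"

    # Keep a backup so the original setting can be restored.
    cp -p "$config" "${config}.bak" || return 1

    # Replace the whole data.dir line. '|' is used as the sed delimiter
    # because paths contain '/'. --follow-symlinks (GNU sed) edits the real
    # file when the config is a symbolic link instead of replacing the link.
    sed -i --follow-symlinks "s|^data\.dir[[:space:]]*=.*|data.dir = ${data_dir}|" "$config" || return 1

    # Show the resulting setting for confirmation.
    grep '^data\.dir' "$config"
}

# Usage (replace MyID with your own MyID; shown as a comment, not executed here):
#   share=/home/MyID/.bactopia/conda/bioconda--tb-profiler-6.6.3/share
#   set_snpeff_data_dir "$share/snpeff-5.2-1/snpEff.config" "$share/snpeff-5.2-1/data/"
```

Before running it, confirm that the directory you pass actually contains Mycobacterium_tuberculosis_h37rv.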
For errors such as "ModuleNotFoundError: No module named 'pkg_resources' "
Ensure that the version of setuptools in your tbprofiler environment is 80.x or lower. You can check this by starting an interactive session, loading Bactopia, activating your tbprofiler environment with "source activate ~/.bactopia/conda/bioconda--tb-profiler-6.6.3", and then running "conda list". Scroll to the setuptools entry to view its version and, if needed, change it with "conda install setuptools=80.*".
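The version check above can be reduced to a small test on the major version number. This is a sketch only: setuptools_version_ok is a hypothetical helper, and the commented usage assumes that python inside the activated tbprofiler environment can import setuptools.

```shell
#!/usr/bin/env bash
# Succeed when a setuptools version string has a major version of 80 or lower.
# setuptools_version_ok is a hypothetical helper, not part of conda or Bactopia.
setuptools_version_ok() {
    local version="$1"
    local major="${version%%.*}"   # text before the first dot, e.g. 80 from 80.9.0
    [ "$major" -le 80 ]
}

# Usage inside an interactive session (shown as comments, not executed here):
#   ml Bactopia/3.2.0-conda
#   source activate ~/.bactopia/conda/bioconda--tb-profiler-6.6.3
#   ver=$(python -c 'import setuptools; print(setuptools.__version__)')
#   setuptools_version_ok "$ver" || conda install "setuptools=80.*"
```

Quoting "setuptools=80.*" keeps the shell from expanding the asterisk against file names in the current directory.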
Documentation
[cft07037@b1-24 ~]$ ml Bactopia/3.2.0-conda
[cft07037@b1-24 ~]$ bactopia --help
Nextflow 25.10.4 is available - Please consider updating your version to it
N E X T F L O W ~ version 23.10.1
Launching `/apps/eb/Bactopia/3.2.0-conda/share/bactopia-3.2.0/main.nf` [friendly_golick] DSL2 - revision: 0cd9f79ba7
---------------------------------------------
_ _ _
| |__ __ _ ___| |_ ___ _ __ (_) __ _
| '_ \ / _` |/ __| __/ _ \| '_ \| |/ _` |
| |_) | (_| | (__| || (_) | |_) | | (_| |
|_.__/ \__,_|\___|\__\___/| .__/|_|\__,_|
|_|
bactopia v3.2.0
Bactopia is a flexible pipeline for complete analysis of bacterial genomes
---------------------------------------------
Typical pipeline command:
bactopia --fastqs samples.txt --datasets datasets/ --species 'Staphylococcus aureus' -profile singularity
Required Parameters
Processing Multiple Samples
--samples [string] A FOFN (via bactopia prepare) with sample names and paths to FASTQ/FASTAs to process
Processing A Single Sample
--r1 [string] First set of compressed (gzip) Illumina paired-end FASTQ reads (requires --r2 and --sample)
--r2 [string] Second set of compressed (gzip) Illumina paired-end FASTQ reads (requires --r1 and --sample)
--se [string] Compressed (gzip) Illumina single-end FASTQ reads (requires --sample)
--ont [string] Compressed (gzip) Oxford Nanopore FASTQ reads (requires --sample)
--hybrid [boolean] Create hybrid assembly using Unicycler. (requires --r1, --r2, --ont and --sample)
--short_polish [boolean] Create hybrid assembly from long-read assembly and short read polishing. (requires --r1, --r2, --ont and
--sample)
--sample [string] Sample name to use for the input sequences
Downloading from SRA/ENA or NCBI Assembly
Note: Error free Illumina reads are simulated for assemblies
--accessions [string] A file containing ENA/SRA Experiment accessions or NCBI Assembly accessions to processed
--accession [string] Sample name to use for the input sequences
Processing an Assembly
Note: Error free Illumina reads are simulated for assemblies
--assembly [string] A assembled genome in compressed FASTA format. (requires --sample)
--check_samples [boolean] Validate the input FOFN provided by --samples
Dataset Parameters
--species [string] Name of species for species-specific dataset to use
--ask_merlin [boolean] Ask Merlin to execute species specific Bactopia tools based on Mash distances
--coverage [integer] Reduce samples to a given coverage, requires a genome size [default: 100]
--genome_size [string] Expected genome size (bp) for all samples, required for read error correction and read subsampling [default:
0]
--use_bakta [boolean] Use Bakta for annotation, instead of Prokka
QC Parameters
--use_bbmap [boolean] Illumina reads will be QC'd using BBMap
--use_porechop [boolean] Use Porechop to remove adapters from ONT reads
Assembler Parameters
--nanohq [boolean] For Flye, use '--nano-hq' instead of --nano-raw
AMRFinder+ Parameters
--organism [string] Taxonomy group to run additional screens against
--amrfinder_noplus [boolean] Disable running AMRFinder+ with the --plus option
--amrfinder_opts [string] Extra AMRFinder+ options in quotes.
--amrfinder_db [string] A custom AMRFinder+ database to use, either a tarball or a folder
MLST Parameters
--scheme [string] Don't autodetect, force this scheme on all inputs
--minid [integer] Minimum DNA percent identity of full allelle to consider 'similar' [default: 95]
--mincov [integer] Minimum DNA percent coverage to report partial allele at all [default: 10]
--minscore [integer] Minimum score out of 100 to match a scheme [default: 50]
--nopath [boolean] Strip filename paths from FILE column
--mlst_db [string] A custom MLST database to use, either a tarball or a directory
Prokka Parameters
--proteins [string] FASTA file of trusted proteins to first annotate from
--prodigal_tf [string] Training file to use for Prodigal
--prokka_coverage [integer] Minimum coverage on query protein [default: 80]
Optional Parameters
--outdir [string] Base directory to write results to [default: bactopia]
Nextflow Profile Parameters
--datasets_cache [string] Directory where downloaded datasets should be stored. [default: <BACTOPIA_DIR>/data/datasets]
Helpful Parameters
--wf [string] Specify which workflow or Bactopia Tool to execute [default: bactopia]
--list_wfs [boolean] List the available workflows and Bactopia Tools to use with '--wf'
--help_all [boolean] An alias for --help --show_hidden_params
--version [boolean] Display version text.
!! Hiding 110 params, use --show_hidden_params (or --help_all) to show them !!
--------------------------------------------------------------------
If you use bactopia for your analysis please cite:
* Bactopia
https://doi.org/10.1128/mSystems.00190-20
* The nf-core framework
https://doi.org/10.1038/s41587-020-0439-x
* Software dependencies
https://bactopia.github.io/acknowledgements/
--------------------------------------------------------------------
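As an illustration of the single-sample options listed in the help text above (--sample, --r1, --r2 and --outdir), the following sketch prints a paired-end command line rather than running it. The helper name bactopia_single_sample_cmd, the sample name and the FASTQ file names are placeholders chosen for this example, and it assumes the arguments contain no spaces.

```shell
#!/usr/bin/env bash
# Print (rather than run) a single-sample paired-end Bactopia command.
# bactopia_single_sample_cmd is a hypothetical helper; it only formats text.
bactopia_single_sample_cmd() {
    local sample="$1" r1="$2" r2="$3"
    local outdir="${4:-bactopia}"   # 'bactopia' is the default --outdir in the help text
    printf 'bactopia --sample %s --r1 %s --r2 %s --outdir %s\n' \
        "$sample" "$r1" "$r2" "$outdir"
}

# Example with placeholder names:
bactopia_single_sample_cmd sample01 sample01_R1.fastq.gz sample01_R2.fastq.gz
```

The printed line can be reviewed and then pasted into a job script after the "ml Bactopia/3.2.0-conda" line.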
System
64-bit Linux