Bactopia-Sapelo2

From Research Computing Center Wiki

Category

Bioinformatics

Program On

Sapelo2

Version

3.2.0

Author / Distributor

Please see https://bactopia.github.io/v3.1.0/

Description

Bactopia is a flexible pipeline for complete analysis of bacterial genomes. This module can be loaded directly: module load Bactopia/3.2.0-conda

Running Program

To load Bactopia, use

ml Bactopia/3.2.0-conda

Please ensure you do not have any base conda environments active when running this program.
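
One way to check for an active conda environment before loading the module is sketched below. It assumes conda sets the CONDA_DEFAULT_ENV variable on activation (this is also the case for the base environment); the function name conda_env_active is made up for this example.

```shell
# Succeed when a conda environment appears to be active in this shell.
# Assumption: conda exports CONDA_DEFAULT_ENV when an environment is activated.
conda_env_active() {
    [ -n "${CONDA_DEFAULT_ENV:-}" ]
}

if conda_env_active; then
    echo "Active conda environment: ${CONDA_DEFAULT_ENV}. Run 'conda deactivate' before loading Bactopia."
else
    echo "No conda environment active."
fi
```

If environments were activated on top of each other, "conda deactivate" may need to be run more than once.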


For errors such as "FileNotFoundError: [Errno 2] No such file or directory: '/home/MyID/.bactopia/conda/bioconda--tb-profiler-6.6.3/share/snpeff-5.2-1/snpEff.config'"

Bactopia expects a snpEff.config file in /home/MyID/.bactopia/conda/bioconda--tb-profiler-6.6.3/share/snpeff-5.2-1/. If Bactopia installed a different version of snpeff (i.e. not 5.2-1), you can create a symbolic link from the expected location to your actual config file.

First, check which version of snpeff is installed by running "ls /home/MyID/.bactopia/conda/bioconda--tb-profiler-6.6.3/share/" and looking for directories named snpeff-<version>. There should be one for version 5.2-1 and another for a different version. Your snpEff.config file should be in the directory that is not version 5.2-1; the 5.2-1 directory will not contain a snpEff.config file until you create the link.

Then create a symbolic link in the snpeff-5.2-1 directory that points to the actual snpEff.config file in the other snpeff-<version> directory. Change MyID to your own MyID and <version> to the other snpeff version that is not 5.2-1:

ln -s /home/MyID/.bactopia/conda/bioconda--tb-profiler-6.6.3/share/snpeff-<version>/snpEff.config /home/MyID/.bactopia/conda/bioconda--tb-profiler-6.6.3/share/snpeff-5.2-1/snpEff.config
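
The lookup and link steps can also be scripted. The following is a minimal sketch, not an official fix: the function name link_snpeff_config is hypothetical, and it assumes the share directory is given as an absolute path and that exactly one other snpeff-<version> directory holds the real snpEff.config.

```shell
# link_snpeff_config SHARE_DIR [EXPECTED_VERSION]
# Link SHARE_DIR/snpeff-EXPECTED_VERSION/snpEff.config to the snpEff.config
# found in another snpeff-<version> directory. SHARE_DIR must be absolute so
# the resulting symbolic link resolves correctly.
link_snpeff_config() {
    share_dir="$1"
    expected="${2:-5.2-1}"
    target="$share_dir/snpeff-$expected/snpEff.config"

    # Nothing to do if the expected config already exists.
    if [ -e "$target" ]; then
        echo "Already present: $target"
        return 0
    fi

    # Search every snpeff-<version> directory for a real config file.
    for dir in "$share_dir"/snpeff-*/; do
        candidate="${dir}snpEff.config"
        if [ -f "$candidate" ]; then
            mkdir -p "$share_dir/snpeff-$expected"
            ln -s "$candidate" "$target"
            echo "Linked $target -> $candidate"
            return 0
        fi
    done

    echo "No snpEff.config found under $share_dir" >&2
    return 1
}
```

Assuming your home directory is /home/MyID, it could be called as: link_snpeff_config "$HOME/.bactopia/conda/bioconda--tb-profiler-6.6.3/share"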


For errors such as "ERROR    Delly failed, skipping     INFO     Running snpEff       ERROR Writing to /tmp/bcftools.gTh9nG"

In the log, the final message looks like this:

  [02:10:40] ERROR    Writing to /tmp/bcftools.gTh9nG                 utils.py:550

Edit your snpEff.config file so that the 'data.dir' setting points to the database directory that actually contains Mycobacterium_tuberculosis_h37rv (for example, /home/MyID/.bactopia/conda/bioconda--tb-profiler-6.6.3/share/snpeff-5.2-1/data/).
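
The 'data.dir' edit can be made in any text editor. As an alternative, the sketch below shows one way to inspect and rewrite the setting from the command line. The function names show_data_dir and set_data_dir are made up for this example, and it assumes the config contains a line of the form "data.dir = <path>" and that the new path contains no '|' or '&' characters.

```shell
# show_data_dir CONFIG: print the current data.dir line of a snpEff config.
show_data_dir() {
    grep -E '^[[:space:]]*data\.dir[[:space:]]*=' "$1"
}

# set_data_dir CONFIG NEW_DIR: keep a backup as CONFIG.bak, then rewrite
# the data.dir line so it points to NEW_DIR. Other lines are left unchanged.
set_data_dir() {
    config="$1"
    new_dir="$2"
    cp "$config" "$config.bak"
    sed "s|^[[:space:]]*data\.dir[[:space:]]*=.*|data.dir = $new_dir|" "$config.bak" > "$config"
}
```

Run show_data_dir on your snpEff.config first to see the current value, then call set_data_dir with the directory that contains Mycobacterium_tuberculosis_h37rv.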


For errors such as "ModuleNotFoundError: No module named 'pkg_resources'"

Ensure that the version of setuptools in your tbprofiler environment is 80.x or lower. To check, start an interactive session, load Bactopia, and activate your tbprofiler environment with "source activate ~/.bactopia/conda/bioconda--tb-profiler-6.6.3". Then run "conda list" and scroll to the setuptools entry. If the version is too new, change it with "conda install setuptools=80.*".
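
The version comparison itself can be scripted. This is a minimal sketch with a hypothetical helper named setuptools_ok; it only compares the major version number against 80, following the guidance above.

```shell
# setuptools_ok VERSION: succeed when the major version is 80 or lower.
# Example inputs: 80.9.0 (ok), 82.0.0 (too new).
setuptools_ok() {
    major="${1%%.*}"
    [ "$major" -le 80 ]
}
```

Inside the activated tbprofiler environment, the installed version can be passed in with: setuptools_ok "$(python -c 'import setuptools; print(setuptools.__version__)')" && echo "setuptools version is fine"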

Documentation

 
[cft07037@b1-24 ~]$ ml Bactopia/3.2.0-conda
[cft07037@b1-24 ~]$ bactopia --help
Nextflow 25.10.4 is available - Please consider updating your version to it
N E X T F L O W  ~  version 23.10.1
Launching `/apps/eb/Bactopia/3.2.0-conda/share/bactopia-3.2.0/main.nf` [friendly_golick] DSL2 - revision: 0cd9f79ba7


---------------------------------------------
   _                _              _             
  | |__   __ _  ___| |_ ___  _ __ (_) __ _       
  | '_ \ / _` |/ __| __/ _ \| '_ \| |/ _` |   
  | |_) | (_| | (__| || (_) | |_) | | (_| |      
  |_.__/ \__,_|\___|\__\___/| .__/|_|\__,_| 
                            |_|                  
  bactopia v3.2.0
  Bactopia is a flexible pipeline for complete analysis of bacterial genomes 
---------------------------------------------
Typical pipeline command:

  bactopia --fastqs samples.txt --datasets datasets/ --species 'Staphylococcus aureus' -profile singularity

Required Parameters
  Processing Multiple Samples
  --samples                           [string]  A FOFN (via bactopia prepare) with sample names and paths to FASTQ/FASTAs to process

  Processing A Single Sample
  --r1                                [string]  First set of compressed (gzip) Illumina paired-end FASTQ reads (requires --r2 and --sample)
  --r2                                [string]  Second set of compressed (gzip) Illumina paired-end FASTQ reads (requires --r1 and --sample)
  --se                                [string]  Compressed (gzip) Illumina single-end FASTQ reads  (requires --sample)
  --ont                               [string]  Compressed (gzip) Oxford Nanopore FASTQ reads  (requires --sample)
  --hybrid                            [boolean] Create hybrid assembly using Unicycler.  (requires --r1, --r2, --ont and --sample)
  --short_polish                      [boolean] Create hybrid assembly from long-read assembly and short read polishing.  (requires --r1, --r2, --ont and 
                                                --sample) 
  --sample                            [string]  Sample name to use for the input sequences

  Downloading from SRA/ENA or NCBI Assembly
  Note: Error free Illumina reads are simulated for assemblies
  --accessions                        [string]  A file containing ENA/SRA Experiment accessions or NCBI Assembly accessions to processed
  --accession                         [string]  Sample name to use for the input sequences

  Processing an Assembly
  Note: Error free Illumina reads are simulated for assemblies
  --assembly                          [string]  A assembled genome in compressed FASTA format. (requires --sample)
  --check_samples                     [boolean] Validate the input FOFN provided by --samples

Dataset Parameters
  --species                           [string]  Name of species for species-specific dataset to use
  --ask_merlin                        [boolean] Ask Merlin to execute species specific Bactopia tools based on Mash distances
  --coverage                          [integer] Reduce samples to a given coverage, requires a genome size [default: 100]
  --genome_size                       [string]  Expected genome size (bp) for all samples, required for read error correction and read subsampling [default: 
                                                0] 
  --use_bakta                         [boolean] Use Bakta for annotation, instead of Prokka

QC Parameters
  --use_bbmap                         [boolean] Illumina reads will be QC'd using BBMap
  --use_porechop                      [boolean] Use Porechop to remove adapters from ONT reads

Assembler Parameters
  --nanohq                            [boolean] For Flye, use '--nano-hq' instead of --nano-raw

AMRFinder+ Parameters
  --organism                          [string]  Taxonomy group to run additional screens against
  --amrfinder_noplus                  [boolean] Disable running AMRFinder+ with the --plus option
  --amrfinder_opts                    [string]  Extra AMRFinder+ options in quotes.
  --amrfinder_db                      [string]  A custom AMRFinder+ database to use, either a tarball or a folder

MLST Parameters
  --scheme                            [string]  Don't autodetect, force this scheme on all inputs
  --minid                             [integer] Minimum DNA percent identity of full allelle to consider 'similar' [default: 95]
  --mincov                            [integer] Minimum DNA percent coverage to report partial allele at all [default: 10]
  --minscore                          [integer] Minimum score out of 100 to match a scheme [default: 50]
  --nopath                            [boolean] Strip filename paths from FILE column
  --mlst_db                           [string]  A custom MLST database to use, either a tarball or a directory

Prokka Parameters
  --proteins                          [string]  FASTA file of trusted proteins to first annotate from
  --prodigal_tf                       [string]  Training file to use for Prodigal
  --prokka_coverage                   [integer] Minimum coverage on query protein [default: 80]

Optional Parameters
  --outdir                            [string]  Base directory to write results to [default: bactopia]

Nextflow Profile Parameters
  --datasets_cache                    [string]  Directory where downloaded datasets should be stored. [default: <BACTOPIA_DIR>/data/datasets]

Helpful Parameters
  --wf                                [string]  Specify which workflow or Bactopia Tool to execute [default: bactopia]
  --list_wfs                          [boolean] List the available workflows and Bactopia Tools to use with '--wf'
  --help_all                          [boolean] An alias for --help --show_hidden_params
  --version                           [boolean] Display version text.

!! Hiding 110 params, use --show_hidden_params (or --help_all) to show them !!
--------------------------------------------------------------------
If you use bactopia for your analysis please cite:

* Bactopia
  https://doi.org/10.1128/mSystems.00190-20

* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x

* Software dependencies
  https://bactopia.github.io/acknowledgements/
--------------------------------------------------------------------



System

64-bit Linux