Mothur-Sapelo2

From Research Computing Center Wiki
Revision as of 10:30, 24 February 2022 by Keeko (talk | contribs)

Category

Bioinformatics

Program On

Sapelo2

Version

1.43.0, 1.45.2, 1.47.0

Author / Distributor

Please see Mothur

Description

"Mothur is a single piece of open-source, expandable software to fill the bioinformatics needs of the microbial ecology community. The functionality of different software including dotur, sons, treeclimber, s-libshuff, unifrac, and others have been incorporated in Mothur. In addition to improving the flexibility of these software, a number of other features including calculators and visualization tools are available with Mothur." More details, documentation, and tutorials are available at the Mothur Wiki.

Running Program

Also refer to Running Jobs on Sapelo2

For more information on Environment Modules on Sapelo2 please see the Lmod page.

The latest version of this application is at /apps/eb/Mothur/1.47.0-GCCcore-8.3.0

To use this version, please load the module with

ml Mothur/1.47.0-GCCcore-8.3.0

To run Mothur in "batch mode", collect your Mothur commands into a command file and use that file in a batch job. More information about creating Mothur batch files can be found here: https://mothur.org/wiki/batch_mode/

Some Mothur commands can use multiple cores. To use multiple cores, set the processors parameter for each such command in the Mothur command file. You must also set the --cpus-per-task option in your submission script to the same number of processors that you request in your Mothur command file.

The following is an example of a Mothur command file that requests 8 CPUs. This file is called Mother_Commandfile.txt; it assembles paired-end reads and prepares them for analysis.

make.file(inputdir=./MiSeq_SOP, type=gz, prefix=stability)
make.contigs(file=current, processors=8)
screen.seqs(fasta=current, group=current, maxambig=0, maxlength=275)
unique.seqs()
count.seqs(name=current, group=current)
align.seqs(fasta=current, reference=silva.v4.fasta)
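Because the processors value in the command file and the --cpus-per-task value in the submission script are set in two different places, they can drift apart. The following is a minimal sketch of a consistency check, not part of Mothur or Slurm: the function name check_processors is hypothetical, and it assumes that each processors setting is written literally as processors=N in the command file.

```shell
#!/bin/bash
# Hypothetical helper (not provided by Mothur or Slurm): compare every
# processors=N setting in a Mothur command file with an expected core count.
check_processors() {
    local cmdfile="$1"    # path to the Mothur command file
    local expected="$2"   # number of cores requested from Slurm
    local status=0
    local n
    # grep -o prints each "processors=N" match on its own line;
    # cut keeps only the number after the equals sign.
    for n in $(grep -o 'processors=[0-9][0-9]*' "$cmdfile" | cut -d= -f2); do
        if [ "$n" -ne "$expected" ]; then
            echo "Mismatch: processors=$n but $expected cores requested" >&2
            status=1
        fi
    done
    return "$status"
}

# Possible use inside a job script (sketch only). Slurm sets
# SLURM_CPUS_PER_TASK when --cpus-per-task is given:
#   check_processors Mother_Commandfile.txt "${SLURM_CPUS_PER_TASK:-1}" || exit 1
```

A command file that contains no processors setting passes the check, since there is nothing to compare.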

The following is an example job submission script (sub.sh) that runs the above Mothur command file:

#!/bin/bash
#SBATCH --job-name=Mothurtest         # Job name
#SBATCH --partition=batch             # Partition (queue) name
#SBATCH --ntasks=1                    # Run a single task
#SBATCH --cpus-per-task=8             # Use 8 CPU cores (matches processors=8)
#SBATCH --mem=10gb                    # Job memory request
#SBATCH --time=02:00:00               # Time limit hrs:min:sec
#SBATCH --output=%x_%j.out            # Standard output log
#SBATCH --error=%x_%j.err             # Standard error log

#SBATCH --mail-type=END,FAIL          # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=username@uga.edu  # Where to send mail	

cd $SLURM_SUBMIT_DIR
module load Mothur/1.47.0-GCCcore-8.3.0
mothur Mother_Commandfile.txt
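An alternative to keeping the two numbers in sync by hand is to write the command file from within the job, taking the core count from Slurm. This is only a sketch under stated assumptions: the function name write_commandfile and the output file name commands.generated.txt are hypothetical, and only the first two commands of the example above are reproduced.

```shell
#!/bin/bash
# Hypothetical helper: write a Mothur command file whose processors value
# comes from the second argument, so it cannot differ from the Slurm request.
write_commandfile() {
    local outfile="$1"   # where to write the command file
    local ncpus="$2"     # core count to place in processors=
    cat > "$outfile" <<EOF
make.file(inputdir=./MiSeq_SOP, type=gz, prefix=stability)
make.contigs(file=current, processors=${ncpus})
EOF
}

# Possible use inside a job script (sketch only, left commented out because
# it needs Slurm and the Mothur module):
#   write_commandfile commands.generated.txt "${SLURM_CPUS_PER_TASK:-1}"
#   mothur commands.generated.txt
```

With this approach, changing --cpus-per-task in the submission script is the only edit needed to rescale the job.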

Here is an example of a job submission command:

sbatch ./sub.sh 
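The %x_%j patterns in sub.sh expand to the job name and job ID, so the log files of the example job are named after Mothurtest and the ID that sbatch reports. The sketch below shows one way to work out those names; the function name log_names is hypothetical, and the commented-out lines assume a single-cluster setup in which sbatch --parsable prints only the job ID.

```shell
#!/bin/bash
# Hypothetical helper: build the log file names that the %x_%j.out and
# %x_%j.err patterns expand to, given a job name and a job ID.
log_names() {
    local jobname="$1"
    local jobid="$2"
    echo "${jobname}_${jobid}.out ${jobname}_${jobid}.err"
}

# Possible use on the cluster (sketch only, left commented out because it
# needs Slurm):
#   jobid=$(sbatch --parsable ./sub.sh)
#   jobid=${jobid%%;*}        # drop a trailing ";cluster" part, if present
#   log_names Mothurtest "$jobid"
#   squeue -j "$jobid"        # show the job's state in the queue
```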

Documentation

ml Mothur/1.47.0-GCCcore-8.3.0
mothur --help
Linux version

Using ReadLine,Boost,HDF5,GSL
mothur v.1.47.0
Last updated: 1/21/22
by
Patrick D. Schloss

Department of Microbiology & Immunology

University of Michigan
http://www.mothur.org

When using, please cite:
Schloss, P.D., et al., Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol, 2009. 75(23):7537-41.

Distributed under the GNU General Public License

Type 'help()' for information on the commands that are available

For questions and analysis support, please visit our forum at https://forum.mothur.org

Type 'quit()' to exit program

[NOTE]: Setting random seed to 19760620.

Script Mode



mothur > help()

NOTE: sens.spec assumes that only unique sequences were used to generate the distance matrix.


Clustering commmands include: cluster, cluster.classic, cluster.fit, cluster.split, mgcluster, phylotype

General commmands include: get.current, get.dists, make.biom, make.file, make.group, make.lefse, merge.count, merge.files, merge.groups, remove.dists, rename.file, set.current, set.dir, set.logfile, set.seed, system

Hypothesis Testing commmands include: amova, anosim, clearcut, cooccurrence, corr.axes, deunique.tree, homova, indicator, kruskal.wallis, libshuff, mantel, nmds, otu.association, parsimony, pca, pcoa, phylo.diversity, unifrac.unweighted, unifrac.weighted

OTU-Based Approaches commmands include: biom.info, classify.svm, collect.shared, collect.single, create.database, dist.shared, estimator.single, filter.shared, get.communitytype, get.coremicrobiome, get.group, get.groups, get.label, get.otus, get.otulist, get.oturep, get.otus, get.rabund, get.relabund, get.sabund, get.sharedseqs, heatmap.bin, heatmap.sim, lefse, list.otus, list.otus, make.clr, make.shared, merge.otus, metastats, normalize.shared, otu.hierarchy, primer.design, rarefaction.shared, rarefaction.single, remove.groups, remove.otus, remove.otus, remove.rare, sens.spec, sparcc, split.abund, summary.shared, summary.single, tree.shared, venn

Phylotype Analysis commmands include: classify.otu, classify.seqs, classify.tree, get.lineage, merge.taxsummary, remove.lineage, summary.tax

Sequence Processing commmands include: align.check, align.seqs, bin.seqs, chimera.bellerophon, chimera.ccode, chimera.check, chimera.perseus, chimera.pintail, chimera.slayer, chimera.uchime, chimera.vsearch, chop.seqs, cluster.fragments, consensus.seqs, count.groups, count.seqs, degap.seqs, deunique.seqs, dist.seqs, fastq.info, filter.seqs, get.mimarkspackage, get.seqs, list.seqs, make.contigs, make.fastq, make.lookup, make.sra, count.seqs, merge.sfffiles, pairwise.seqs, pcr.seqs, pre.cluster, remove.seqs, rename.seqs, reverse.seqs, screen.seqs, seq.error, sff.multiple, sff.info, shhh.flows, shhh.seqs, sort.seqs, split.groups, sra.info, sub.sample, summary.qual, summary.seqs, trim.flows, trim.seqs, unique.seqs

For more information about a specific command type 'commandName(help)' i.e. 'cluster(help)'


Common Questions: 

1. How do I cite mothur?
	Schloss, P.D., et al., Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol, 2009. 75(23):7537-41.

2. Do you have an example analysis?
	Yes, https://mothur.org/wiki/454_SOP and https://mothur.org/wiki/MiSeq_SOP highlight some of the things you can do with mothur.

3. Do you offer workshops?
	Yes! Please see our https://mothur.org/wiki/Workshops page for more information.

4. What are mothur's file types?
	Mothur uses and creates many file types. Including fasta, name, group, design, count, list, rabund, sabund, shared, relabund, oligos, taxonomy, constaxonomy, phylip, column, flow, qfile, file, biom and tree. You can find out more about these formats here: https://www.mothur.org/wiki/File_Types.

5. Is there a list of all of mothur's commands?
	Yes! You can find it here, http://www.mothur.org/wiki/Category:Commands.

6. Why does the cutoff change when I cluster with average neighbor?
	This is a product of using the average neighbor algorithm with a sparse distance matrix. When you run cluster, the algorithm looks for pairs of sequences to merge in the rows and columns that are getting merged together. Let's say you set the cutoff to 0.05. If one cell has a distance of 0.03 and the cell it is getting merged with has a distance above 0.05 then the cutoff is reset to 0.03, because it's not possible to merge at a higher level and keep all the data. All of the sequences are still there from multiple phyla. Incidentally, although we always see this, it is a bigger problem for people that include sequences that do not fully overlap.



Installation

Source code is obtained from the Mothur GitHub repository.

System

64-bit Linux