Picard-Teaching
Latest revision as of 14:43, 15 August 2018
Category
Bioinformatics
Program On
Teaching
Version
2.16.0
Author / Distributor
Description
"A set of tools (in Java) for working with next generation sequencing data in the BAM format." More details are at picard
Running Program
The latest version of this application is at /usr/local/apps/eb/picard/2.16.0-Java-1.8.0_144
To use this version, please load the module with:
ml picard/2.16.0-Java-1.8.0_144
Here is an example of a shell script, sub.sh, to run on the batch queue:
#!/bin/bash
#SBATCH --job-name=j_picard
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu
#SBATCH --ntasks=1
#SBATCH --mem=10gb
#SBATCH --time=08:00:00
#SBATCH --output=picard.%j.out
#SBATCH --error=picard.%j.err
cd $SLURM_SUBMIT_DIR
ml picard/2.16.0-Java-1.8.0_144
java -jar /usr/local/apps/eb/picard/2.16.0-Java-1.8.0_144/picard.jar [options]
In a real submission script, at least the underlined values above need to be reviewed and replaced with proper values.
Please refer to Running_Jobs_on_the_teaching_cluster, Run X window Jobs and Run interactive Jobs for more details on running jobs on the Teaching cluster.
Here is an example of a job submission command:
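As an illustration of what could replace [options], the following is a minimal sketch that calls the SortSam tool from the program list. The file names input.bam and sorted.bam and the run_picard helper are assumptions made for this example, not part of the cluster setup. The helper only prints the command unless DRY_RUN=0 is set, so the line can be checked before it goes into sub.sh.

```shell
# Sketch only: input.bam and sorted.bam are hypothetical file names.
PICARD_JAR=/usr/local/apps/eb/picard/2.16.0-Java-1.8.0_144/picard.jar

# Picard 2.x takes the tool name first, then KEY=VALUE arguments.
# With DRY_RUN=1 (the default here) the command is printed, not executed.
run_picard() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "java -jar $PICARD_JAR $*"
    else
        java -jar "$PICARD_JAR" "$@"
    fi
}

# Sort a BAM file by genomic coordinate.
run_picard SortSam I=input.bam O=sorted.bam SORT_ORDER=coordinate
```

Inside the batch script itself, the printed java -jar line (or the helper with DRY_RUN=0) would take the place of the [options] line.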
sbatch ./sub.sh
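The submission step can also be sketched so that the job ID is captured and the log file names are known in advance. This assumes the --output and --error patterns from the sub.sh example above; log_files is a hypothetical helper written for this sketch, not a Slurm command.

```shell
# Sketch: submit sub.sh and work out which log files the job will write.
# log_files is a hypothetical helper, not part of Slurm.
log_files() {
    # %j in the #SBATCH --output/--error patterns expands to the job ID.
    echo "picard.$1.out picard.$1.err"
}

if command -v sbatch >/dev/null 2>&1; then
    # --parsable makes sbatch print only the job ID (optionally followed by ;cluster).
    jobid=$(sbatch --parsable ./sub.sh | cut -d';' -f1)
    echo "Submitted job $jobid; logs: $(log_files "$jobid")"
    squeue -j "$jobid"    # show the job's current state in the queue
else
    echo "sbatch not found; run this on the cluster"
fi
```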
Documentation
ml picard/2.16.0-Java-1.8.0_144
java -jar /usr/local/apps/eb/picard/2.16.0-Java-1.8.0_144/picard.jar -h

To execute picard run: java -jar $EBROOTPICARD/picard.jar
[yhuang@hn-teach 3.0]$ java -jar /usr/local/apps/eb/picard/2.16.0-Java-1.8.0_144/picard.jar -h
USAGE: PicardCommandLine <program name> [-h]

Available Programs:
--------------------------------------------------------------------------------------
Alpha Tools: Tools that are currently UNSUPPORTED until further testing and maturation.
    CollectIndependentReplicateMetrics  (Experimental) Estimates the rate of independent replication of reads within a bam.
    CollectWgsMetricsWithNonZeroCoverage  (Experimental) Collect metrics about coverage and performance of whole genome sequencing (WGS) experiments.
    UmiAwareMarkDuplicatesWithMateCigar  Identifies duplicate reads using information from read positions and UMIs.
--------------------------------------------------------------------------------------
Fasta: Tools for manipulating FASTA, or related data.
    CreateSequenceDictionary  Creates a sequence dictionary for a reference sequence.
    ExtractSequences  Subsets intervals from a reference sequence to a new FASTA file.
    NonNFastaSize  Counts the number of non-N bases in a fasta file.
    NormalizeFasta  Normalizes lines of sequence in a FASTA file to be of the same length.
--------------------------------------------------------------------------------------
Fingerprinting Tools: Tools for manipulating fingerprints, or related data.
    CheckFingerprint  Computes a fingerprint from the supplied input (SAM/BAM or VCF) file and compares it to the provided genotypes
    ClusterCrosscheckMetrics  Clusters the results of a CrosscheckFingerprints run by LOD score.
    CrosscheckFingerprints  Checks if all fingerprints appear to come from the same individual.
    CrosscheckReadGroupFingerprints  DEPRECATED: USE CrosscheckFingerprints. Checks if all read groups appear to come from the same individual.
--------------------------------------------------------------------------------------
Illumina Tools: Tools for manipulating data specific to Illumina sequencers.
    CheckIlluminaDirectory  Asserts the validity for specified Illumina basecalling data.
    CollectIlluminaBasecallingMetrics  Collects Illumina Basecalling metrics for a sequencing run.
    CollectIlluminaLaneMetrics  Collects Illumina lane metrics for the given BaseCalling analysis directory.
    ExtractIlluminaBarcodes  Tool determines the barcode for each read in an Illumina lane.
    IlluminaBasecallsToFastq  Generate FASTQ file(s) from Illumina basecall read data.
    IlluminaBasecallsToSam  Transforms raw Illumina sequencing data into an unmapped SAM or BAM file.
    MarkIlluminaAdapters  Reads a SAM or BAM file and rewrites it with new adapter-trimming tags.
--------------------------------------------------------------------------------------
Interval Tools: Tools for manipulating Picard interval lists.
    BedToIntervalList  Converts a BED file to a Picard Interval List.
    IntervalListToBed  Converts an Picard IntervalList file to a BED file.
    IntervalListTools  Manipulates interval lists.
    LiftOverIntervalList  Lifts over an interval list from one reference build to another.
    ScatterIntervalsByNs  Writes an interval list based on splitting a reference by Ns.
--------------------------------------------------------------------------------------
Metrics: Tools for reporting metrics on various data types.
    AccumulateVariantCallingMetrics  Combines multiple Variant Calling Metrics files into a single file
    CollectAlignmentSummaryMetrics  Produces a summary of alignment metrics from a SAM or BAM file.
    CollectBaseDistributionByCycle  Chart the nucleotide distribution per cycle in a SAM or BAM file
    CollectGcBiasMetrics  Collect metrics regarding GC bias.
    CollectHiSeqXPfFailMetrics  Classify PF-Failing reads in a HiSeqX Illumina Basecalling directory into various categories.
    CollectHsMetrics  Collects hybrid-selection (HS) metrics for a SAM or BAM file.
    CollectInsertSizeMetrics  Collect metrics about the insert size distribution of a paired-end library.
    CollectJumpingLibraryMetrics  Collect jumping library metrics.
    CollectMultipleMetrics  Collect multiple classes of metrics.
    CollectOxoGMetrics  Collect metrics to assess oxidative artifacts.
    CollectQualityYieldMetrics  Collect metrics about reads that pass quality thresholds and Illumina-specific filters.
    CollectRawWgsMetrics  Collect whole genome sequencing-related metrics.
    CollectRnaSeqMetrics  Produces RNA alignment metrics for a SAM or BAM file.
    CollectRrbsMetrics  Collects metrics from reduced representation bisulfite sequencing (Rrbs) data.
    CollectSequencingArtifactMetrics  Collect metrics to quantify single-base sequencing artifacts.
    CollectTargetedPcrMetrics  Calculate PCR-related metrics from targeted sequencing data.
    CollectVariantCallingMetrics  Collects per-sample and aggregate (spanning all samples) metrics from the provided VCF file
    CollectWgsMetrics  Collect metrics about coverage and performance of whole genome sequencing (WGS) experiments.
    CompareMetrics  Compare two metrics files.
    ConvertSequencingArtifactToOxoG  Extract OxoG metrics from generalized artifacts metrics.
    EstimateLibraryComplexity  Estimates the numbers of unique molecules in a sequencing library.
    MeanQualityByCycle  Collect mean quality by cycle.
    QualityScoreDistribution  Chart the distribution of quality scores.
--------------------------------------------------------------------------------------
Miscellaneous Tools: A set of miscellaneous tools.
    BaitDesigner  Designs oligonucleotide baits for hybrid selection reactions.
    FifoBuffer  FIFO buffer used to buffer input and output streams with a customizable buffer size
--------------------------------------------------------------------------------------
SAM/BAM: Tools for manipulating SAM, BAM, or related data.
    AddCommentsToBam  Adds comments to the header of a BAM file.
    AddOrReplaceReadGroups  Replace read groups in a BAM file.
    BamIndexStats  Generate index statistics from a BAM file
    BamToBfq  Create BFQ files from a BAM file for use by the maq aligner.
    BuildBamIndex  Generates a BAM index ".bai" file.
    CalculateReadGroupChecksum  Creates a hash code based on the read groups (RG).
    CheckTerminatorBlock  Asserts the provided gzip file's (e.g., BAM) last block is well-formed; RC 100 otherwise
    CleanSam  Cleans the provided SAM/BAM, soft-clipping beyond-end-of-reference alignments and setting MAPQ to 0 for unmapped reads
    CompareSAMs  Compare two input ".sam" or ".bam" files.
    DownsampleSam  Downsample a SAM or BAM file.
    FastqToSam  Converts a FASTQ file to an unaligned BAM or SAM file.
    FilterSamReads  Subset read data from a SAM or BAM file
    FixMateInformation  Verify mate-pair information between mates and fix if needed.
    GatherBamFiles  Concatenate one or more BAM files as efficiently as possible
    MarkDuplicates  Identifies duplicate reads.
    MarkDuplicatesWithMateCigar  Identifies duplicate reads, accounting for mate CIGAR.
    MergeBamAlignment  Merge alignment data from a SAM or BAM with data in an unmapped BAM file.
    MergeSamFiles  Merges multiple SAM and/or BAM files into a single file.
    PositionBasedDownsampleSam  Downsample a SAM or BAM file to retain a subset of the reads based on the reads location in each tile in the flowcell.
    ReorderSam  Reorders reads in a SAM or BAM file to match ordering in reference
    ReplaceSamHeader  Replaces the SAMFileHeader in a SAM or BAM file.
    RevertOriginalBaseQualitiesAndAddMateCigar  Reverts the original base qualities and adds the mate cigar tag to read-group BAMs
    RevertSam  Reverts SAM or BAM files to a previous state.
    SamFormatConverter  Convert a BAM file to a SAM file, or a SAM to a BAM
    SamToFastq  Converts a SAM or BAM file to FASTQ.
    SetNmAndUqTags  DEPRECATED: Use SetNmMdAndUqTags instead.
    SetNmMdAndUqTags  Fixes the NM, MD, and UQ tags in a SAM file.
    SortSam  Sorts a SAM or BAM file.
    SplitSamByLibrary  Splits a SAM or BAM file into individual files by library
    SplitSamByNumberOfReads  Splits a SAM or BAM file to multiple BAMs.
    ValidateSamFile  Validates a SAM or BAM file.
    ViewSam  Prints a SAM or BAM file to the screen
--------------------------------------------------------------------------------------
Unit Testing: Unit testing
    SimpleMarkDuplicatesWithMateCigar  (Experimental) Examines aligned records in the supplied SAM or BAM file to locate duplicate molecules.
--------------------------------------------------------------------------------------
VCF/BCF: Tools for manipulating VCF, BCF, or related data.
    FilterVcf  Hard filters a VCF.
    FindMendelianViolations  Finds mendelian violations of all types within a VCF
    FixVcfHeader  Replaces or fixes a VCF header.
    GatherVcfs  Gathers multiple VCF files from a scatter operation into a single VCF file
    GenotypeConcordance  Evaluate genotype concordance between callsets.
    LiftoverVcf  Lifts over a VCF file from one reference build to another.
    MakeSitesOnlyVcf  Creates a VCF bereft of genotype information from an input VCF or BCF
    MergeVcfs  Merges multiple VCF or BCF files into one VCF file or BCF
    RenameSampleInVcf  Renames a sample within a VCF or BCF.
    SortVcf  Sorts one or more VCF files.
    SplitVcfs  Splits SNPs and INDELs into separate files.
    UpdateVcfSequenceDictionary  Takes a VCF and a second file that contains a sequence dictionary and updates the VCF with the new sequence dictionary.
    VcfFormatConverter  Converts VCF to BCF or BCF to VCF.
    VcfToIntervalList  Converts a VCF or BCF file to a Picard Interval List.
--------------------------------------------------------------------------------------
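The usage line in the listing above, PicardCommandLine <program name> [-h], suggests that each tool can be asked for its own option list. The following is a small sketch of that, using the $EBROOTPICARD variable mentioned in the module message. The picard_help helper and the fallback to the install path are assumptions made for this example.

```shell
# Sketch: ask a single tool for its own option list.
# Falls back to the install path if the module has not set EBROOTPICARD.
PICARD_JAR="${EBROOTPICARD:-/usr/local/apps/eb/picard/2.16.0-Java-1.8.0_144}/picard.jar"

# picard_help is a hypothetical helper, not part of Picard.
picard_help() {
    if command -v java >/dev/null 2>&1 && [ -f "$PICARD_JAR" ]; then
        java -jar "$PICARD_JAR" "$1" -h
    else
        # Off the cluster (no java or no jar), only show the command.
        echo "would run: java -jar $PICARD_JAR $1 -h"
    fi
}

picard_help MarkDuplicates
```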
Installation
Source code is obtained from picard
System
64-bit Linux