DMRIharmonization-Sapelo2
Revision as of 08:31, 17 September 2024
Category
Engineering
Program On
Sapelo2
Version
20240227
Author / Distributor
See https://github.com/pnlbwh/dMRIharmonization
Description
"dMRIharmonization repository is developed by Tashrif Billah, Sylvain Bouix, Suheyla Cetin Karayumak, and Yogesh Rathi, Brigham and Women's Hospital (Harvard Medical School)." More details are at https://github.com/pnlbwh/dMRIharmonization.
Running Program
- Version 20240227 is installed as a Python virtual environment on Sapelo2 at /apps/gb/dMRIharmonization/20240227
To use it, load the module and activate its environment with:
ml dMRIharmonization/20240227
source ${EBROOTDMRIHARMONIZATION}/harmonization/bin/activate
source ${EBROOTDMRIHARMONIZATION}/../env.sh
To deactivate the environment, run:
deactivate
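Before running anything, it can help to confirm that the activation steps above took effect. The following is a minimal sketch, not part of dMRIharmonization or the Sapelo2 module: the function name `check_harmonization_env` is hypothetical, and it assumes that activating the virtual environment sets `VIRTUAL_ENV` (standard behaviour of Python's `activate` script) and that `harmonization.py` ends up on `PATH`, as the job script on this page implies.

```shell
#!/bin/bash
# Hypothetical helper (an assumption for illustration, not shipped with the module).
# Returns 0 only if a Python virtual environment is active and
# harmonization.py can be found on PATH.
check_harmonization_env() {
    if [ -z "${VIRTUAL_ENV:-}" ]; then
        echo "No virtual environment is active (VIRTUAL_ENV is empty)." >&2
        return 1
    fi
    if ! command -v harmonization.py >/dev/null 2>&1; then
        echo "harmonization.py was not found on PATH." >&2
        return 1
    fi
    echo "Environment looks ready: ${VIRTUAL_ENV}"
}
```

After sourcing this function, calling `check_harmonization_env` right after the activation commands prints a confirmation or explains which step appears to be missing.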
Below is an example of a job submission script (sub.sh) to run harmonization.py on the batch queue:
#!/bin/bash
#SBATCH --job-name=dc_h
#SBATCH --partition=batch
#SBATCH --mem=160G
#SBATCH --nodes=1
#SBATCH --ntasks=24
#SBATCH --cpus-per-task=1
#SBATCH --time=7-00
#SBATCH --constraint="Genoa|Milan"
#SBATCH --mail-type=ALL
#SBATCH --mail-user=jbrown95@uga.edu

cd $SLURM_SUBMIT_DIR

ml purge
ml dMRIharmonization/20240227
source ${EBROOTDMRIHARMONIZATION}/harmonization/bin/activate
source ${EBROOTDMRIHARMONIZATION}/../env.sh

export OMP_NUM_THREADS=1

site=dallas

harmonization.py \
    --tar_list full_inputs/${site}_all.csv \
    --tar_name ${site} \
    --template ${site}_to_chicago_template/ \
    --nshm 8 \
    --nzero 10 \
    --nproc 24 \
    --process
In a real submission script, review the example values above (such as the job name, resource requests, e-mail address, site name, and input paths) and replace them with values appropriate for your job.
Please use the --constraint="Genoa|Milan" header in your job submission script for a quicker job start time and optimal job performance.
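The example script sets the task count twice (`--ntasks=24` and `--nproc 24`) and assumes the input list exists. A possible way to keep the two numbers in sync and fail early on a missing list is sketched below. The helper names `resolve_nproc` and `require_input_list` are hypothetical; the sketch relies only on `SLURM_NTASKS`, which Slurm sets inside a job when `--ntasks` is given.

```shell
#!/bin/bash
# Illustrative pre-flight helpers; both function names are hypothetical.

# Print the task count granted by Slurm, falling back to 1 outside a job.
resolve_nproc() {
    echo "${SLURM_NTASKS:-1}"
}

# Return non-zero if the given input list is missing or empty.
require_input_list() {
    local list="$1"
    if [ ! -s "$list" ]; then
        echo "Input list '$list' is missing or empty." >&2
        return 1
    fi
}
```

Inside sub.sh, these could be used as `require_input_list "full_inputs/${site}_all.csv" || exit 1` before the harmonization.py call, with `--nproc "$(resolve_nproc)"` in place of the literal 24, so that changing `--ntasks` does not require a second edit.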
Here is an example of a job submission command:
sbatch ./sub.sh
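When the job ID is needed afterwards (for example, to check the queue), `sbatch --parsable` prints just the ID, optionally followed by `;cluster`. The sketch below wraps that; the function names `parse_job_id` and `submit_and_report` are hypothetical, and `submit_and_report` assumes `sbatch` is available on `PATH`.

```shell
#!/bin/bash
# Hypothetical wrappers around sbatch; names are assumptions for illustration.

# Extract the job ID from `sbatch --parsable` output,
# which has the form "jobid" or "jobid;cluster".
parse_job_id() {
    local out="$1"
    echo "${out%%;*}"
}

# Submit a script and print its job ID; requires sbatch on PATH.
submit_and_report() {
    local script="$1" out
    out=$(sbatch --parsable "$script") || return 1
    parse_job_id "$out"
}
```

A typical use would be `jobid=$(submit_and_report ./sub.sh) && squeue -j "$jobid"`, which submits the script and then lists that job in the queue.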
Documentation
For usage details and the full list of options, see https://github.com/pnlbwh/dMRIharmonization. With the environment activated as described under Running Program, the built-in help can be displayed with:
harmonization.py --help
Installation
Source code is downloaded from https://github.com/pnlbwh/dMRIharmonization
System
64-bit Linux