DMRIharmonization-Sapelo2

From Research Computing Center Wiki

Latest revision as of 10:01, 17 September 2024



Program On




Author / Distributor



"dMRIharmonization repository is developed by Tashrif Billah, Sylvain Bouix, Suheyla Cetin Karayumak, and Yogesh Rathi, Brigham and Women's Hospital (Harvard Medical School)." More details are at

Running Program

  • Version 20240227 is installed as a Python virtual environment on Sapelo2 at /apps/gb/dMRIharmonization/20240227

To use it, please load the module and activate its virtual environment with:

ml dMRIharmonization/20240227
source ${EBROOTDMRIHARMONIZATION}/harmonization/bin/activate
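
If the module has not been loaded, the EBROOTDMRIHARMONIZATION variable is empty and the source line above fails. The following is a minimal sketch of a guard around the activation step; the env_status variable and its messages are illustrative names, not part of the installation:

```shell
# Activate the environment only when the module has set EBROOTDMRIHARMONIZATION
# and the activate script exists; otherwise record why activation was skipped.
activate_script="${EBROOTDMRIHARMONIZATION:-}/harmonization/bin/activate"
if [ -n "${EBROOTDMRIHARMONIZATION:-}" ] && [ -f "$activate_script" ]; then
    . "$activate_script"
    env_status="activated"
else
    env_status="module not loaded"
fi
echo "$env_status"
```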

To deactivate the environment, please run:
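
The command itself is not shown on this page. For a standard Python virtual environment, the activate script defines a shell function named deactivate, so a minimal sketch, assuming that convention applies to this installation, could look like this (venv_state is an illustrative variable name):

```shell
# Leave the virtual environment only if one is active in this shell.
if command -v deactivate >/dev/null 2>&1; then
    deactivate
    venv_state="deactivated"
else
    venv_state="no active environment"
fi
echo "$venv_state"
```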


Below is an example of a job submission script that runs the program with 24 parallel processes on a single compute node in the batch partition:

#!/bin/bash
#SBATCH --job-name=test_dMRIharmonization
#SBATCH --partition=batch          
#SBATCH --nodes=1
#SBATCH --ntasks=24
#SBATCH --cpus-per-task=1
#SBATCH --mem=100G
#SBATCH --time=7-00:00:00
#SBATCH --constraint="Genoa|Milan"

#SBATCH --mail-type=ALL     
#SBATCH --mail-user=<yourMyID>


ml purge
ml dMRIharmonization/20240227
source ${EBROOTDMRIHARMONIZATION}/harmonization/bin/activate

export OMP_NUM_THREADS=1

--nproc 24 <your other options and arguments>


In your actual submission script, please ensure that you request the appropriate computing resources for your job. For example, you can request CPU cores for running parallel processes using the Slurm headers, such as --ntasks=24 and --cpus-per-task=1.

Please note:

  • Use the header --constraint="Genoa|Milan" in your job submission script for optimal job performance.
  • The value for the header --ntasks, e.g., --ntasks=24, should match the number specified for the --nproc option on your command line, i.e., --nproc=24.
  • We highly recommend setting --cpus-per-task=1 and adding the line export OMP_NUM_THREADS=1 to your job submission script.
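
One way to keep --nproc and --ntasks in step is to read the process count from SLURM_NTASKS, an environment variable that Slurm sets inside a job when --ntasks is requested. The sketch below is illustrative: NPROC is a name chosen here, and the fallback of 24 is an assumption that only applies when the lines are run outside a Slurm job.

```shell
# Use the task count granted by Slurm; fall back to 24 outside a job.
NPROC="${SLURM_NTASKS:-24}"

# One thread per process, matching --cpus-per-task=1.
export OMP_NUM_THREADS=1

# Print the option that would be passed on the command line.
echo "--nproc ${NPROC}"
```

With this pattern, changing --ntasks in the Slurm header is enough; the value given to --nproc follows automatically.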

Here is an example of a job submission command:

sbatch ./ 


Documentation

ml dMRIharmonization/20240227
source ${EBROOTDMRIHARMONIZATION}/harmonization/bin/activate

dMRIharmonization (2018) pipeline is written by-

Brigham and Women's Hospital/Harvard Medical School,

See details at
Submit issues at

Template creation, harmonization, and debugging

Usage: [SWITCHES] 

    -h, --help                          Prints this help message and quits
    --help-all                          Prints help messages of all sub-commands and quits
    -v, --version                       Prints the program's version and quits

    --bvalMap VALUE:str                 specify a bmax to scale bvalues into
    --create                            turn on this flag to create template
    --debug                             turn on this flag to debug harmonized data (valid only with --process)
    --denoise                           turn on this flag to denoise voxel data
    --force                             turn on this flag to overwrite existing data
    --harm_list VALUE:ExistingFile      harmonized csv/txt file with first column for dwi and 2nd column for mask: dwi1,mask1\n dwi2,mask2\n...
    --nproc VALUE:str                   number of processes/threads to use (-1 for all available, may slow down your system); the default is 4
    --nshm VALUE:str                    spherical harmonic order; the default is -1
    --nzero VALUE:str                   number of zero padding for denoising skull region during signal reconstruction; the default is 10
    --process                           turn on this flag to harmonize
    --ref_list VALUE:ExistingFile       reference csv/txt file with first column for dwi and 2nd column for mask: dwi1,mask1\n dwi2,mask2\n...
    --ref_name VALUE:str                reference site name
    --resample VALUE:str                voxel size MxNxO to resample into
    --stats                             print statistics of all sites, useful for recomputing --debug statistics separately
    --tar_list VALUE:ExistingFile       target csv/txt file with first column for dwi and 2nd column for mask: dwi1,mask1\n dwi2,mask2\n...; required
    --tar_name VALUE:str                target site name; required
    --template VALUE:str                template directory; required
    --travelHeads                       travelling heads
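
The --ref_list, --tar_list, and --harm_list options each expect an existing csv/txt file with the dwi path in the first column and the mask path in the second. The following is a minimal sketch of writing such a file and checking its layout; the image paths, the site name site_a, and the variable names are hypothetical placeholders:

```shell
# Hypothetical image and mask paths; replace them with your own data.
list_file="$(mktemp)"
printf '%s\n' \
    '/path/to/site_a/sub01_dwi.nii.gz,/path/to/site_a/sub01_mask.nii.gz' \
    '/path/to/site_a/sub02_dwi.nii.gz,/path/to/site_a/sub02_mask.nii.gz' \
    > "$list_file"

# Count the lines that do not have exactly two comma-separated columns.
bad_lines="$(awk -F',' 'NF != 2 { n++ } END { print n + 0 }' "$list_file")"
echo "malformed lines: ${bad_lines}"

# Options that would point the program at this list (site name is illustrative).
echo "--ref_list ${list_file} --ref_name site_a"
```

A check like this catches a missing mask column before the job is submitted, which is cheaper than discovering it after the job has waited in the queue.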



Installation

Source code is downloaded from


System

64-bit Linux