DMRIharmonization-Sapelo2: Difference between revisions

From Research Computing Center Wiki

Revision as of 09:52, 17 September 2024


Category

Engineering

Program On

Sapelo2

Version

20240227

Author / Distributor

See https://github.com/pnlbwh/dMRIharmonization

Description

"dMRIharmonization repository is developed by Tashrif Billah, Sylvain Bouix, Suheyla Cetin Karayumak, and Yogesh Rathi, Brigham and Women's Hospital (Harvard Medical School)." More details are at https://github.com/pnlbwh/dMRIharmonization.

Running Program

  • Version 20240227 is installed as a Python virtual environment on Sapelo2 at /apps/gb/dMRIharmonization/20240227

To use it, please load the module and activate its env with:

ml dMRIharmonization/20240227
source ${EBROOTDMRIHARMONIZATION}/harmonization/bin/activate
source ${EBROOTDMRIHARMONIZATION}/../env.sh
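
The two source commands above rely on EBROOTDMRIHARMONIZATION, which is only defined once the module has been loaded. The following is a minimal sketch of a guard for that, assuming a Bash shell; the function name require_module_root is hypothetical and is not part of dMRIharmonization or the module system.

```shell
#!/bin/bash
# Hypothetical helper (not part of dMRIharmonization): print the path of the
# activate script, or fail with a hint if the module has not been loaded yet.
require_module_root() {
    if [ -z "${EBROOTDMRIHARMONIZATION:-}" ]; then
        echo "EBROOTDMRIHARMONIZATION is not set; run 'ml dMRIharmonization/20240227' first" >&2
        return 1
    fi
    echo "${EBROOTDMRIHARMONIZATION}/harmonization/bin/activate"
}

# Possible use (commented out so the sketch has no side effects):
# activate_script=$(require_module_root) && source "$activate_script"
```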

To deactivate its env, please do:

deactivate

Below is an example of a job submission script (sub.sh) that runs harmonization.py with 24 parallel processes on a single compute node in the batch partition:

#!/bin/bash
#SBATCH --job-name=test_dMRIharmonization           
#SBATCH --partition=batch          
#SBATCH --nodes=1
#SBATCH --ntasks=24
#SBATCH --cpus-per-task=1
#SBATCH --mem=100G
#SBATCH --time=7-00:00:00
#SBATCH --constraint="Genoa|Milan"

#SBATCH --mail-type=ALL     
#SBATCH --mail-user=<yourMyID>@uga.edu

cd $SLURM_SUBMIT_DIR

ml purge
ml dMRIharmonization/20240227
source ${EBROOTDMRIHARMONIZATION}/harmonization/bin/activate
source ${EBROOTDMRIHARMONIZATION}/../env.sh

export OMP_NUM_THREADS=1

harmonization.py --nproc 24 <your other options and arguments>

In your actual submission script, please ensure that you request the appropriate computing resources for your job. For example, you can request CPU cores for running parallel processes using the Slurm headers, such as --ntasks=24 and --cpus-per-task=1.

Please note:

  • Use the header --constraint="Genoa|Milan" in your job submission script for optimal job performance.
  • The value for the header --ntasks, e.g., --ntasks=24, should match the number specified for the --nproc option on your command line, i.e., --nproc=24.
  • We highly recommend setting --cpus-per-task=1 and exporting OMP_NUM_THREADS=1 by including export OMP_NUM_THREADS=1 in your job submission script.
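
One way to keep --nproc in step with --ntasks is to read the task count from the environment instead of typing it twice. The sketch below assumes the script runs under Slurm, which sets SLURM_NTASKS when --ntasks is requested; the helper name resolve_nproc is hypothetical, and the fallback of 4 mirrors the default reported by harmonization.py -h.

```shell
#!/bin/bash
# Hypothetical helper: print the process count to pass to --nproc.
# Uses SLURM_NTASKS when it is set (inside a job that requested --ntasks),
# otherwise the supplied fallback, which itself defaults to 4.
resolve_nproc() {
    local fallback="${1:-4}"
    echo "${SLURM_NTASKS:-$fallback}"
}

NPROC=$(resolve_nproc)
export OMP_NUM_THREADS=1   # one thread per process, as recommended above

# In sub.sh the final command could then read:
# harmonization.py --nproc "${NPROC}" <your other options and arguments>
```

With this in place, changing --ntasks in the header is enough; the command line follows automatically.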

Here is an example of a job submission command:

sbatch ./sub.sh 
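
To follow the job after submission, it can help to capture the job ID. The sketch below assumes the standard Slurm client tools (sbatch, squeue, sacct) are available on the login node; the helper name job_id_from_parsable is hypothetical. It relies on sbatch --parsable, which prints the job ID, optionally followed by a semicolon and the cluster name.

```shell
#!/bin/bash
# Hypothetical helper: extract the job ID from `sbatch --parsable` output,
# which has the form "jobid" or "jobid;cluster".
job_id_from_parsable() {
    echo "${1%%;*}"
}

# Possible use on the cluster (commented out so the sketch has no side effects):
# jobid=$(job_id_from_parsable "$(sbatch --parsable ./sub.sh)")
# squeue -j "$jobid"                                      # state while pending or running
# sacct -j "$jobid" --format=JobID,State,Elapsed,MaxRSS   # accounting once it has finished
```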

Documentation

ml dMRIharmonization/20240227 
source ${EBROOTDMRIHARMONIZATION}/harmonization/bin/activate
source ${EBROOTDMRIHARMONIZATION}/../env.sh
harmonization.py -h

===============================================================================
dMRIharmonization (2018) pipeline is written by-

TASHRIF BILLAH
Brigham and Women's Hospital/Harvard Medical School
tbillah@bwh.harvard.edu, tashrifbillah@gmail.com

===============================================================================
See details at https://github.com/pnlbwh/dMRIharmonization
Submit issues at https://github.com/pnlbwh/dMRIharmonization/issues
View LICENSE at https://github.com/pnlbwh/dMRIharmonization/blob/master/LICENSE
===============================================================================

Template creation, harmonization, and debugging

Usage:
    harmonization.py [SWITCHES] 

Meta-switches:
    -h, --help                          Prints this help message and quits
    --help-all                          Prints help messages of all sub-commands and quits
    -v, --version                       Prints the program's version and quits

Switches:
    --bvalMap VALUE:str                 specify a bmax to scale bvalues into
    --create                            turn on this flag to create template
    --debug                             turn on this flag to debug harmonized data (valid only with --process)
    --denoise                           turn on this flag to denoise voxel data
    --force                             turn on this flag to overwrite existing data
    --harm_list VALUE:ExistingFile      harmonized csv/txt file with first column for dwi and 2nd column for mask: dwi1,mask1\n dwi2,mask2\n...
    --nproc VALUE:str                   number of processes/threads to use (-1 for all available, may slow down your system); the default is 4
    --nshm VALUE:str                    spherical harmonic order; the default is -1
    --nzero VALUE:str                   number of zero padding for denoising skull region during signal reconstruction; the default is 10
    --process                           turn on this flag to harmonize
    --ref_list VALUE:ExistingFile       reference csv/txt file with first column for dwi and 2nd column for mask: dwi1,mask1\n dwi2,mask2\n...
    --ref_name VALUE:str                reference site name
    --resample VALUE:str                voxel size MxNxO to resample into
    --stats                             print statistics of all sites, useful for recomputing --debug statistics separately
    --tar_list VALUE:ExistingFile       target csv/txt file with first column for dwi and 2nd column for mask: dwi1,mask1\n dwi2,mask2\n...; required
    --tar_name VALUE:str                target site name; required
    --template VALUE:str                template directory; required
    --travelHeads                       travelling heads
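
The --ref_list and --tar_list switches expect a csv/txt file with one "dwi,mask" pair per line. As an illustration only, the sketch below builds such a file from a directory. The naming pattern (<case>_dwi.nii.gz next to <case>_mask.nii.gz), the helper name make_image_list, and the site names and paths in the commented command are all assumptions; adapt them to your data and check the upstream documentation before combining switches.

```shell
#!/bin/bash
# Hypothetical helper: write one "dwi,mask" line for every *_dwi.nii.gz in a
# directory that has a matching *_mask.nii.gz beside it.
make_image_list() {
    local dir="$1" out="$2" dwi mask
    : > "$out"                               # start with an empty list file
    for dwi in "$dir"/*_dwi.nii.gz; do
        [ -e "$dwi" ] || continue            # glob matched nothing
        mask="${dwi%_dwi.nii.gz}_mask.nii.gz"
        if [ -f "$mask" ]; then
            echo "${dwi},${mask}" >> "$out"
        else
            echo "skipping ${dwi}: no mask found" >&2
        fi
    done
}

# Possible use, with placeholder site names and paths:
# make_image_list /path/to/siteA ref_list.csv
# make_image_list /path/to/siteB tar_list.csv
# harmonization.py --ref_list ref_list.csv --ref_name siteA \
#     --tar_list tar_list.csv --tar_name siteB \
#     --template ./template --nproc 24 --create --process
```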

Back to Top

Installation

Source code is downloaded from https://github.com/pnlbwh/dMRIharmonization

System

64-bit Linux