Cactus-Sapelo2

From Research Computing Center Wiki

Category

Bioinformatics

Program On

Sapelo2

Version

2.6.7, 2.6.9, 2.7.0

Author / Distributor

Please see https://github.com/ComparativeGenomicsToolkit/cactus

Description

From https://github.com/ComparativeGenomicsToolkit/cactus: "Cactus is a reference-free whole-genome alignment program, as well as a pangenome graph construction toolkit."

Running Program

Also refer to Running Jobs on Sapelo2

For more information on Environment Modules on Sapelo2 please see the Lmod page.

  • Version 2.6.7 is installed as a module called Cactus/2.6.7-GCCcore-11.3.0-Python-3.10.4
  • Version 2.6.9 is installed as a singularity image at /apps/singularity-images/cactus_v2.6.9.sif
  • Version 2.7.0 is installed as a singularity image at /apps/singularity-images/cactus_v2.7.0.sif
  • Version 2.7.0 with GPU support is installed as a singularity image at /apps/singularity-images/cactus_v2.7.0-gpu.sif
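As an illustration of the list above, the following sketch shows how the module can be loaded and how the image path for a containerized version can be looked up. The function name cactus_image is hypothetical and is not part of the Sapelo2 installation; the module name and image paths are the ones listed above.

```shell
# Version 2.6.7 is a module; on Sapelo2 (where Lmod is available) it is loaded with:
#   module load Cactus/2.6.7-GCCcore-11.3.0-Python-3.10.4

# Hypothetical helper: print the Singularity image path for a containerized version.
# Only the versions listed on this page are accepted.
cactus_image() {
    case "$1" in
        2.6.9|2.7.0|2.7.0-gpu)
            echo "/apps/singularity-images/cactus_v$1.sif" ;;
        *)
            echo "No Singularity image listed for version '$1'" >&2
            return 1 ;;
    esac
}
```

For example, cactus_image 2.7.0 prints the path of the version 2.7.0 image, which can then be passed to apptainer exec.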


To run the commands in the singularity containers, please use an overlay for the /tmp partition. This can be done with the following steps:

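# Node-local scratch directory for this job
export CACTUS_TMPDIR=/lscratch/$USER/cactus-$SLURM_JOB_ID
# The overlay image needs upper/ and work/ directories at its root
mkdir -p -m 700 $CACTUS_TMPDIR/upper $CACTUS_TMPDIR/work
# Create a 300 MB file and format it as ext3, populated from $CACTUS_TMPDIR
truncate -s 300M jobStore.img
apptainer exec /apps/singularity-images/cactus_v2.7.0.sif mkfs.ext3 -d $CACTUS_TMPDIR jobStore.img

# Directory bound to /tmp inside the container, and the Cactus work directory
mkdir -m 700 -p $CACTUS_TMPDIR/tmp
mkdir cactus_wd

# Run Cactus with the writable overlay and the scratch directory mounted as /tmp
apptainer exec --cleanenv --overlay jobStore.img --bind $CACTUS_TMPDIR/tmp:/tmp \
	--env PYTHONNOUSERSITE=1 /apps/singularity-images/cactus_v2.7.0.sif cactus-pangenome \
	--workDir=cactus_wd [options]

# Remove the scratch directory when the run is finished
cd /lscratch/$USER
rm -r -f cactus-$SLURM_JOB_ID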

In this example, the cactus-pangenome command can be replaced by the cactus command.
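The scratch directory in the steps above is only removed if the commands run to completion. One possible way to remove it even when the Cactus command fails is a shell trap, sketched below. The function names make_cactus_tmp and remove_cactus_tmp are hypothetical helpers, not part of Cactus or Sapelo2. Unlike the steps above, the sketch creates tmp/ together with upper/ and work/, so an empty tmp/ directory would also be copied into the overlay image by mkfs.ext3 -d; this is assumed to be harmless.

```shell
# Hypothetical helpers; BASE is the scratch root and TAG identifies the job.

# make_cactus_tmp BASE TAG
# Create BASE/cactus-TAG with the upper/, work/ and tmp/ subdirectories
# and print the path of the new directory.
make_cactus_tmp() {
    mkdir -p -m 700 "$1/cactus-$2/upper" "$1/cactus-$2/work" "$1/cactus-$2/tmp" || return 1
    echo "$1/cactus-$2"
}

# remove_cactus_tmp BASE TAG
# Remove BASE/cactus-TAG; refuses to run if either argument is empty.
remove_cactus_tmp() {
    rm -r -f -- "${1:?}/cactus-${2:?}"
}

# Intended use inside a job (left as comments so that sourcing this
# file has no side effects):
#   export CACTUS_TMPDIR=$(make_cactus_tmp "/lscratch/$USER" "$SLURM_JOB_ID")
#   trap 'remove_cactus_tmp "/lscratch/$USER" "$SLURM_JOB_ID"' EXIT
```

With the trap in place, the shell runs the cleanup when the script exits, whether or not the apptainer command succeeded.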


Sample job submission script (sub.sh) to run Cactus version 2.7.0:

#!/bin/bash
#SBATCH --job-name=testcactus         # Job name
#SBATCH --partition=batch             # Partition (queue) name
#SBATCH --ntasks=1                    # Run on a single CPU
#SBATCH --mem=5gb                     # Job memory request
#SBATCH --time=02:00:00               # Time limit hrs:min:sec
#SBATCH --output=%x.%j.out            # Standard output log
#SBATCH --error=%x.%j.err             # Standard error log

cd $SLURM_SUBMIT_DIR

export CACTUS_TMPDIR=/lscratch/$USER/cactus-$SLURM_JOB_ID
mkdir -p -m 700 $CACTUS_TMPDIR/upper $CACTUS_TMPDIR/work
truncate -s 300M jobStore.img
apptainer exec /apps/singularity-images/cactus_v2.7.0.sif mkfs.ext3 -d $CACTUS_TMPDIR jobStore.img

mkdir -m 700 -p $CACTUS_TMPDIR/tmp
mkdir cactus_wd

apptainer exec --cleanenv --overlay jobStore.img --bind $CACTUS_TMPDIR/tmp:/tmp \
	--env PYTHONNOUSERSITE=1 /apps/singularity-images/cactus_v2.7.0.sif cactus-pangenome \
	--workDir=cactus_wd  ./js ./evolverPrimates.txt --outDir primates-pg --outName primates-pg \
	--reference simChimp --vcf --giraffe --gfa --gbz

cd /lscratch/$USER
rm -r -f cactus-$SLURM_JOB_ID

The sample command and arguments shown here must be replaced by the ones you want to use. Other job parameters, such as the maximum wall clock time, the maximum memory, the number of cores per node, and the job name, must be adjusted as well.
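The script is submitted with sbatch sub.sh. Options given on the sbatch command line take precedence over the matching #SBATCH lines in the script, so resources can also be adjusted at submission time. The sketch below wraps this in a hypothetical function, submit_cactus, with a hypothetical DRY_RUN variable that prints the command instead of running it; the resource values in the usage example are placeholders, not recommendations.

```shell
# Hypothetical wrapper around sbatch for the sample script sub.sh.
# Arguments: wall time (hrs:min:sec), memory, CPUs per task.
# With DRY_RUN=1 the command is printed instead of being executed.
submit_cactus() {
    # Rebuild the argument list as the full sbatch command line.
    set -- sbatch --time="$1" --mem="$2" --cpus-per-task="$3" sub.sh
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

For example, DRY_RUN=1 submit_cactus 24:00:00 64gb 16 prints the sbatch command that would request 24 hours, 64 GB of memory, and 16 cores for sub.sh.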

Documentation

Please see https://github.com/ComparativeGenomicsToolkit/cactus

Installation

The Singularity images were built from the Docker container provided by the authors. For example:

singularity pull docker://quay.io/comparative-genomics-toolkit/cactus:v2.7.0

singularity pull docker://quay.io/comparative-genomics-toolkit/cactus:v2.7.0-gpu

System

64-bit Linux