Running Jobs on the teaching cluster
Using the Queueing System
The login node for the teaching cluster should be used for text editing and job submission. No jobs should be run directly on the login node. Processes that use too much CPU or RAM on the login node may be terminated by GACRC staff, or automatically, in order to keep the cluster running properly. Jobs should be run using the Slurm queueing system. The queueing system should be used to run both interactive and batch jobs.
Batch Queues on the teaching cluster
There are different queues defined on the teaching cluster. Users are required to specify, in the job submission script or as job submission command line arguments, the queue and the resources needed by the job, so that it can be assigned to compute nodes with sufficient available resources (number of cores, amount of memory, GPU cards, etc.).
The table below summarizes the queues defined and the compute nodes that they target:
Queue Name | Node Type | Number of Nodes | Description | Notes |
---|---|---|---|---|
batch | Intel | 37 | 12-core, 48GB RAM, Intel Xeon | Regular nodes. |
batch | Intel | 2 | 8-core, 48GB RAM, Intel Xeon | Regular nodes. |
highmem | AMD | 2 | 48-core, 128GB RAM, AMD Opteron | For high memory jobs. |
highmem | Intel | 3 | 8-core, 192GB RAM, Intel Xeon | For high memory jobs. |
gpu | GPU, M2070 | 1 | 12-core, 48GB RAM, Intel Xeon, 8 NVIDIA M2070 GPUs | For GPU-enabled jobs. |
interq | AMD | 3 | 32-core, 64GB RAM, AMD Opteron | For interactive jobs. |
Job submission Scripts
Users are required to specify the number of cores, the amount of memory, the queue name, and the maximum wallclock time needed by the job.
Header lines
Basic job submission script
At a minimum, the job submission script needs to have the following header lines:
#!/bin/bash
#SBATCH --partition=batch
#SBATCH --job-name=test
#SBATCH --ntasks=2
#SBATCH --time=2:00:00
#SBATCH --mem=2gb
Commands to run your application should be added after these header lines.
Header lines explained
- #!/bin/bash : used to specify that the script runs in the /bin/bash shell
- #SBATCH --partition=batch : used to specify the partition (queue) name, e.g. batch
- #SBATCH --job-name=test : used to specify the name of the job, e.g. test
- #SBATCH --ntasks=2 : used to specify the number of tasks (e.g. 2).
- #SBATCH --time=2:00:00 : used to specify the maximum allowed wall clock time for the job, in HH:MM:SS or D-HH:MM:SS format (here 2 hours)
- #SBATCH --mem=2gb : used to specify the maximum memory allowed for the job (e.g. 2GB)
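Once the header lines and the commands to run your application are in place, the script is submitted with the sbatch command. For example, assuming the script above is saved as sub.sh:

sbatch sub.sh

Slurm then prints the job ID of the submitted job, e.g. Submitted batch job 12345.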
Below are some of the most commonly used queueing system options to configure the job.
Options to request resources for the job
- -t, --time=time
Set a limit on the total run time. Acceptable formats include "minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes" and "days-hours:minutes:seconds"
- --mem=MB
Maximum amount of memory per node that the job will need, in megabytes
- --mem-per-cpu=MB
Memory required per allocated CPU, in megabytes
- -N, --nodes=num
Number of nodes required. The default is one node
- -n, --ntasks=num
Maximum number of tasks to launch. The default is one task per node
- --ntasks-per-node=ntasks
Request that ntasks tasks be invoked on each node
- -c, --cpus-per-task=ncpus
Request ncpus CPU cores per task. Without this option, one core is allocated per task
Please try to request resources for your job as accurately as possible, because this allows your job to be dispatched to run at the earliest opportunity and it helps the system allocate resources efficiently to start as many jobs as possible, benefiting all users.
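For example, a hypothetical job that runs a single task with four cores and 8GB of memory for at most one day could combine these options as follows (the values shown are illustrative, not recommendations):

#SBATCH --time=1-00:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=8gb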
Options to manage job notification and output
- -J jobname
Give the job a name. The default is the filename of the job script. Within the job, $SLURM_JOB_NAME expands to the job name
- -o path/for/stdout
Send stdout to path/for/stdout. The default filename is slurm-${SLURM_JOB_ID}.out, e.g. slurm-12345.out, in the directory from which the job was submitted
- -e path/for/stderr
Send stderr to path/for/stderr.
- --mail-user=yourUGAMyID@uga.edu
Send email notification to the address you specified when certain events occur.
- --mail-type=type
The value of type can be set to NONE, BEGIN, END, FAIL, or ALL.
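For example, the following header lines name the job myjob, write stdout and stderr to files labeled with the job ID (%j expands to the job ID), and send email when the job starts and ends; replace MyID with your UGA MyID:

#SBATCH -J myjob
#SBATCH -o myjob.%j.out
#SBATCH -e myjob.%j.err
#SBATCH --mail-user=MyID@uga.edu
#SBATCH --mail-type=BEGIN,END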
Options to set Array Jobs
If you wish to run an application binary or script with, for example, different input files, you might find it convenient to use an array job. To create an array job with, e.g., 10 elements, use
#SBATCH -a 0-9
or
#SBATCH --array=0-9
The index of each element in an array job is stored in the variable SLURM_ARRAY_TASK_ID, and SLURM_ARRAY_JOB_ID expands to the job ID of the array job as a whole. Each array element runs as an independent job, so multiple elements can run concurrently if resources are available.
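For example, a minimal sketch of an array job script in which each element reads its own input file (assuming input files named input_0 through input_9 and a binary a.out in the submission directory):

#!/bin/bash
#SBATCH --partition=batch
#SBATCH --job-name=myarrayjob
#SBATCH --ntasks=1
#SBATCH --time=4:00:00
#SBATCH --mem=2gb
#SBATCH --array=0-9

cd $SLURM_SUBMIT_DIR
# each element processes the input file matching its array index
./a.out < input_$SLURM_ARRAY_TASK_ID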
Option to set job dependency
You can set a job dependency with the option -d or --dependency=dependency-list. For example, to specify that a job should only start after the job with job ID 1234 finishes, add the following header line to that job's submission script:
#SBATCH --dependency=afterok:1234
Having this header line in the job submission script will ensure that the job is only dispatched to run after job 1234 has completed successfully.
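Dependencies can also be set on the command line at submission time. A minimal sketch, assuming two job scripts first.sh and second.sh: the --parsable option makes sbatch print only the job ID, which is then passed to the dependent submission.

# submit the first job and capture its job ID
jid=$(sbatch --parsable first.sh)
# submit the second job, to start only after the first completes successfully
sbatch --dependency=afterok:$jid second.sh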
Other content of the script
Following the header lines, users can include commands to change to the working directory, to load the modules needed to run the application, and to invoke the application. For example, to use the directory from which the job is submitted as the working directory (where to find input files or binaries), add the line
cd $SLURM_SUBMIT_DIR
You can then load the needed modules. For example, if you are running an R program, then include the line
ml R/3.4.4-foss-2016b-X11-20160819-GACRC
Then invoke your application. For example, if you are running an R program called add.R which is in your job submission directory, use
R CMD BATCH add.R
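Putting these pieces together, a complete submission script for this R job might look like the following sketch, which reuses the header lines from the basic example above:

#!/bin/bash
#SBATCH --partition=batch
#SBATCH --job-name=testR
#SBATCH --ntasks=1
#SBATCH --time=2:00:00
#SBATCH --mem=2gb

cd $SLURM_SUBMIT_DIR
ml R/3.4.4-foss-2016b-X11-20160819-GACRC
R CMD BATCH add.R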
Environment Variables exported by batch jobs
When a batch job is started, a number of variables are introduced into the job's environment that can be used by the batch script in making decisions, creating output files, and so forth. Some of these variables are listed in the following table:
Variable | Description |
---|---|
SLURM_ARRAY_JOB_ID | Job ID of an array job |
SLURM_ARRAY_TASK_ID | Value of the job array index for this job |
SLURM_CPUS_ON_NODE | Number of CPUs on the allocated node |
SLURM_CPUS_PER_TASK | Number of CPUs requested per task. Only set if the --cpus-per-task option is specified |
SLURM_JOB_ID | Unique Slurm job ID |
SLURM_JOB_NAME | Name of the job, as specified by the user |
SLURM_JOB_CPUS_PER_NODE | Count of processors available to the job on this node |
SLURM_JOB_NODELIST | List of nodes allocated to the job |
SLURM_JOB_NUM_NODES | Total number of nodes in the job's resource allocation |
SLURM_JOB_PARTITION | Name of the partition (i.e. queue) in which the job is running |
SLURM_NTASKS | Same as -n, --ntasks |
SLURM_NTASKS_PER_NODE | Number of tasks requested per node. Only set if the --ntasks-per-node option is specified |
SLURM_SUBMIT_DIR | The directory from which sbatch was invoked |
SLURM_TASKS_PER_NODE | Number of tasks to be initiated on each node |
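For example, a minimal sketch showing how a script can use these variables to label its run and its output file:

echo "Job $SLURM_JOB_ID ($SLURM_JOB_NAME) running on $SLURM_JOB_NODELIST"
# tag the output file with the job ID so repeated runs do not overwrite it
./a.out > output.$SLURM_JOB_ID.txt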
Sample job submission scripts
OpenMPI
Sample job submission script (sub.sh) to run an OpenMPI application:
#PBS -S /bin/bash
#PBS -q batch
#PBS -N testjob
#PBS -l nodes=2:ppn=48:AMD
#PBS -l walltime=48:00:00
#PBS -l mem=2gb
#PBS -M username@uga.edu
#PBS -m abe

cd $PBS_O_WORKDIR
ml OpenMPI/2.1.1-GCC-6.4.0-2.28

echo
echo "Job ID: $PBS_JOBID"
echo "Queue:  $PBS_QUEUE"
echo "Cores:  $PBS_NP"
echo "Nodes:  $(cat $PBS_NODEFILE | sort -u | tr '\n' ' ')"
echo "mpirun: $(which mpirun)"
echo

mpirun ./a.out > outputfile
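This sample and the ones below use Torque/PBS directives. A rough Slurm equivalent of the same OpenMPI job, sketched under the assumption that the batch partition and the same module are used (node and core counts are illustrative):

#!/bin/bash
#SBATCH --partition=batch
#SBATCH --job-name=testjob
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=12
#SBATCH --time=48:00:00
#SBATCH --mem=2gb

cd $SLURM_SUBMIT_DIR
ml OpenMPI/2.1.1-GCC-6.4.0-2.28
mpirun ./a.out > outputfile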
OpenMP
Sample job submission script (sub.sh) to run a program that uses OpenMP with 10 threads:
#PBS -S /bin/bash
#PBS -q batch
#PBS -N testjob
#PBS -l nodes=1:ppn=10
#PBS -l walltime=48:00:00
#PBS -l mem=30gb
#PBS -M username@uga.edu
#PBS -m abe

cd $PBS_O_WORKDIR
export OMP_NUM_THREADS=10

echo
echo "Job ID: $PBS_JOBID"
echo "Queue:  $PBS_QUEUE"
echo "Cores:  $PBS_NP"
echo "Nodes:  $(cat $PBS_NODEFILE | sort -u | tr '\n' ' ')"
echo

time ./a.out > outputfile
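A rough Slurm equivalent of this OpenMP job, again a sketch assuming the batch partition; tying OMP_NUM_THREADS to SLURM_CPUS_PER_TASK keeps the thread count in step with the resource request:

#!/bin/bash
#SBATCH --partition=batch
#SBATCH --job-name=testjob
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=10
#SBATCH --time=48:00:00
#SBATCH --mem=30gb

cd $SLURM_SUBMIT_DIR
# use as many threads as cores allocated to the task
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
time ./a.out > outputfile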
A high memory job
Sample job submission script (sub.sh) to run an application that needs to use an Intel HIGHMEM node:
#PBS -S /bin/bash
#PBS -q highmem_q
#PBS -N testjob
#PBS -l nodes=1:ppn=12:Intel
#PBS -l walltime=48:00:00
#PBS -l mem=400gb
#PBS -M username@uga.edu
#PBS -m abe

cd $PBS_O_WORKDIR
ml Velvet
velvetg [options] > outputfile
If the application can run either on an Intel or an AMD HIGHMEM node:
#PBS -S /bin/bash
#PBS -q highmem_q
#PBS -N testjob
#PBS -l nodes=1:ppn=12
#PBS -l walltime=48:00:00
#PBS -l mem=400gb
#PBS -M username@uga.edu
#PBS -m abe

cd $PBS_O_WORKDIR
ml Velvet
velvetg [options] > outputfile
GPU/CUDA
Sample job submission script (sub.sh) to run a GPU-enabled (e.g. CUDA) application:
#PBS -S /bin/bash
#PBS -q gpu_q
#PBS -N testjob
#PBS -l nodes=1:ppn=4:gpus=1
#PBS -l walltime=48:00:00
#PBS -l mem=2gb
#PBS -M username@uga.edu
#PBS -m abe

cd $PBS_O_WORKDIR
ml CUDA/9.0.176

echo
echo "Job ID: $PBS_JOBID"
echo "Queue:  $PBS_QUEUE"
echo "Cores:  $PBS_NP"
echo "Nodes:  $(cat $PBS_NODEFILE | sort -u | tr '\n' ' ')"
echo

time ./a.out > outputfile
Note the additional gpus=1 option in the header line. This option should be used to request the number of GPU cards for the job (e.g. to request 2 GPU cards, use gpus=2).
The GPU devices allocated to a job are listed in a file whose name is stored in the queueing system environment variable PBS_GPUFILE. You can print this file name with the following command (add it to your job submission script):
echo $PBS_GPUFILE
To get a list of the numbers of the GPU devices allocated to your job, separated by a blank space, use the command:
CUDADEV=$(cat $PBS_GPUFILE | rev | cut -d"u" -f1)
echo "List of devices allocated to this job:"
echo $CUDADEV
To remove the blank space between two device numbers in the CUDADEV variable above, use the command:
CUDADEV=$(cat $PBS_GPUFILE | rev | cut -d"u" -f1)
GPULIST=$(echo $CUDADEV | sed 's/ //')
echo "List of devices allocated to this job (no blank spaces between devices):"
echo $GPULIST
Some GPU/CUDA applications require that a list of the GPU devices be given as an argument to the application. If the application needs a blank space separated device number list, use the $CUDADEV variable as an argument. If no blank space is allowed in the list, you can use the $GPULIST variable as an argument to the application.
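For instance, with a hypothetical CUDA binary gpu_app that takes the device list as an argument (both the binary name and its --devices flag are illustrative, not a real tool):

./gpu_app --devices $CUDADEV   # device numbers separated by blank spaces
./gpu_app --devices $GPULIST   # device numbers with no separator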
An MPI/OpenMP hybrid job
Sample job submission script (sub.sh) to run a parallel job that uses 3 MPI processes with OpenMPI, with each MPI process running 12 threads:
#PBS -S /bin/bash
#PBS -j oe
#PBS -q batch
#PBS -N testhybrid
#PBS -l nodes=3:ppn=12:AMD
#PBS -l mem=60g
#PBS -l walltime=4:00:00
#PBS -M username@uga.edu
#PBS -m abe

ml OpenMPI/2.1.1-GCC-6.4.0-2.28

echo
echo "Job ID: $PBS_JOBID"
echo "Queue:  $PBS_QUEUE"
echo "Cores:  $PBS_NP"
echo "Nodes:  $(cat $PBS_NODEFILE | sort -u | tr '\n' ' ')"
echo "mpirun: $(which mpirun)"
echo

cd $PBS_O_WORKDIR
export OMP_NUM_THREADS=12

perl /usr/local/bin/makehostlist.pl $PBS_NODEFILE $PBS_NUM_PPN $PBS_JOBID
mpirun -machinefile host.$PBS_JOBID.list ./a.out
Running an array job
Sample job submission script (sub.sh) to submit an array job with 10 elements. In this example, each array job element will run the a.out binary using an input file called input_0, input_1, ..., input_9.
#PBS -S /bin/bash
#PBS -j oe
#PBS -q batch
#PBS -N myarrayjob
#PBS -l nodes=1:ppn=1
#PBS -l walltime=4:00:00
#PBS -t 0-9

cd $PBS_O_WORKDIR
time ./a.out < input_$PBS_ARRAYID
How to submit a job to the batch queue
With the resource requirements specified in the job submission script (sub.sh), submit your job with
qsub scriptname
For example
qsub sub.sh
Once the job is submitted, the Job ID of the job (e.g. 123456.pbs.scm) will be printed on the screen.
Discovering if the queue is busy
To check if the queue is busy or which node is free to accept the job, the following command can help:
mdiag -n -v
For example, to check if the highmem_q queue is busy:
mdiag -n -v | grep highmem_q
How to open an interactive session
An interactive session on a compute node can be started with the command
qlogin
This command will start an interactive session with one core on a node with feature inter, using the s_interq queue, and with a walltime limit of 12h. The interactive session will open on either an AMD node or an Intel node.
The qlogin command is an alias for
qsub -I -q s_interq -l walltime=12:00:00 -l nodes=1:ppn=1 -l mem=2gb
so it can be used to start an interactive session on a node with feature inter and with a walltime of 12h.
If you wish to start an interactive session on an AMD node, you can use the command
qlogin_amd
If you wish to start an interactive session on an Intel node, you can use the command
qlogin_intel
If you would like to start an interactive session with a different walltime limit or with more cores (e.g. to test a small parallel job),
please use the command below and select appropriate values for the walltime and the ppn value. For example, this command:
qsub -I -q s_interq -l walltime=02:00:00 -l nodes=1:ppn=4:inter -l mem=2gb
will start an interactive session with 4 cores, a walltime limit of 2h (choose appropriately), using the s_interq queue, and on a node with feature inter.
To start an interactive session on your lab's buy-in node, please select the queue that targets your lab's node(s). For example:
qsub -I -l walltime=12:00:00 -l nodes=1:ppn=1 -l mem=2gb -q abclab_q
A typical use of an interactive session is code compilation, so that the binaries generated are optimized for the compute node type (e.g. nodes with the inter feature, which are identical to nodes with the AMD feature).
How to run an interactive job with Graphical User Interface capabilities
If you want to run an application as an interactive job and have its graphical user interface displayed on the terminal of your local machine, you need to enable X-forwarding when you ssh into the login node. For information on how to do this, please see questions 10 and 11 in the Frequently Asked Questions page.
Then start an interactive session, but add the option -X to the qsub command. For example:
qsub -I -X -q s_interq -l walltime=12:00:00 -l nodes=1:ppn=1 -l mem=2gb
where the walltime has been set to 12h, the memory set to 2gb (choose appropriately), and the queue selected was s_interq, which targets interactive nodes with either Intel or AMD feature.
The xqlogin command is an alias for
qsub -I -X -q s_interq -l walltime=12:00:00 -l nodes=1:ppn=1 -l mem=2gb
so it can be used to start an interactive session with X-forwarding enabled and with a walltime of 12h.
Once a shell prompt on an interactive node is returned, you can invoke the application. If it has a GUI, that should be displayed on your local machine (laptop or desktop).
How to run a singularity application
Some applications are installed as Singularity containers under /usr/local/singularity-images.
The file names follow an application-version format; for example, /usr/local/singularity-images/trinity-2.5.1--0.simg is the image for Trinity version 2.5.1.
For information on Singularity please visit: http://singularity.lbl.gov/
Singularity containers have been configured to access the user's home directory ($HOME), the lustre1 directory (/lustre1), and the lscratch directory (/lscratch). The temp directory (/tmp) is inside the container.
All environment variables set before executing the singularity command are available inside the container.
The examples below all use Trinity.
To find the installed location of the application:
singularity exec /usr/local/singularity-images/trinity-2.5.1--0.simg which Trinity
/usr/local/bin/Trinity
singularity exec /usr/local/singularity-images/trinity-2.5.1--0.simg ls -al /usr/local/bin/Trinity
lrwxrwxrwx 1 root root 28 Dec 9 04:04 /usr/local/bin/Trinity -> ../opt/trinity-2.5.1/Trinity
All the content of the application can be listed with:
singularity exec /usr/local/singularity-images/trinity-2.5.1--0.simg ls /usr/local/opt/trinity-2.5.1
To run the application in a batch job, use a submission script such as:
#PBS -S /bin/bash
#PBS -N j_s_trinity
#PBS -q highmem_q
#PBS -l nodes=1:ppn=1
#PBS -l walltime=480:00:00
#PBS -l mem=100gb

cd $PBS_O_WORKDIR
singularity exec /usr/local/singularity-images/trinity-2.5.1--0.simg COMMAND OPTION
where COMMAND should be replaced by the specific command and options, such as:
#PBS -S /bin/bash
#PBS -N j_s_trinity
#PBS -q highmem_q
#PBS -l nodes=1:ppn=16
#PBS -l walltime=480:00:00
#PBS -l mem=100gb

cd $PBS_O_WORKDIR
singularity exec /usr/local/singularity-images/trinity-2.5.1--0.simg Trinity --seqType <string> --max_memory 100G --CPU 8 --no_version_check 1>job.out 2>job.err
To run in an interactive session, for example:
qsub -I -l nodes=1:ppn=1 -l mem=40gb -l walltime=12:00:00 -q s_interq
singularity exec /usr/local/singularity-images/trinity-2.5.1--0.simg Trinity --seqType <string> --max_memory 40G --CPU 1 --no_version_check 1>job.out 2>job.err
How to run a job from the compute node's local disk (/lscratch)
Each compute node has a file system called /lscratch, which resides on the node's local solid state drive (SSD). Single node jobs that need to perform a lot of input and output to disk can benefit from running from /lscratch. In order to run a job from /lscratch, we recommend that the following steps be done in the job submission script:
1. Create a directory in /lscratch for the job.
2. Copy all files that the job needs in order to run into this newly created directory in /lscratch.
3. Change into this /lscratch directory.
4. Load the modules and run the application.
5. Copy the results back to the global scratch area (/lustre1) or to the /project area, as appropriate.
6. Delete all files used or generated by this job from /lscratch.
Note that the /lscratch file system resides on the node where the job is running; it is not directly accessible from the login node.
The job submission script should include a header line to specify how much space in /lscratch the job will use per core:
#PBS -l gres=lscratch:N
where N should be replaced by the number of KB that the job will use in /lscratch per core (not the total amount in /lscratch that the job needs). For example, to specify needing 20GB of space per core, use:
#PBS -l gres=lscratch:20000000
Note that you cannot use 20gb to replace 20000000 in this option.
Important Note: The total amount of "lscratch" allocated to a job will be the value specified with the gres=lscratch option times the number of cores requested with the ppn option.
Sample job submission script to run a PartitionFinder job from /lscratch (in this example the job needs 20GB of total space in /lscratch):
#PBS -S /bin/bash
#PBS -N jobname
#PBS -q batch
#PBS -l nodes=1:ppn=4
#PBS -l walltime=120:00:00
#PBS -l mem=20gb
#PBS -l gres=lscratch:5000000

# create a unique directory in /lscratch for this job
mkdir -p /lscratch/${USER}/$PBS_JOBID

# change into your current working directory, from where the job was submitted (e.g. in /lustre1)
cd $PBS_O_WORKDIR

# copy any files needed for this job to the lscratch dir, for example
cp partition_finder.cfg /lscratch/${USER}/$PBS_JOBID

# change into the lscratch dir
cd /lscratch/${USER}/$PBS_JOBID

# load the module(s) needed for this job
ml PartitionFinder/2.1.1-foss-2016b-Python-2.7.14

# command to run the application
python $EBROOTPARTITIONFINDER/PartitionFinder.py ./ ./partition_finder.cfg --raxml -p 4

# copy the results back to /lustre1; replace "results" by the name of the files
# you wish to copy back
cp results $PBS_O_WORKDIR

# delete all files left over in lscratch
rm -r -f /lscratch/${USER}/$PBS_JOBID
In the example above, the total amount of /lscratch allocated to this job is 5000000 * 4 = 20000000 = 20GB.
How to check on running or pending jobs
To list all running and pending jobs (by all users), use the command
qstat
To list all your running and pending jobs, use the command
qstat_me
or
qstat -u MyID
where MyID needs to be replaced by your UGA MyID.
To list all array elements of array jobs, add the -t option to qstat:
qstat -u MyID -t
For detailed information on how to monitor your jobs, please see Monitoring Jobs on Sapelo2.
How to delete a running or pending job
To delete one of your running or pending jobs, use the command
qdel jobid
For example, to delete a job with Job ID 123456.pbs.scm use
qdel 123456
Standard error and standard output files of a job
By default, the standard output and the standard error of the job will be written into files called jobname.oJobid and jobname.eJobid, respectively, where jobname is the name of the job and Jobid is the job id number. If you want the standard error to be written into the standard output file, please add the header line
#PBS -j oe
These files are written to disk (in your working directory) while the job is running. However, we still encourage users to write the application's standard output into a separate file. If the application writes to standard output, you can redirect its stdout and stderr to a file, e.g. output.txt:
./application >output.txt 2>&1
where the name output.txt can be replaced by a file name of choice.
How to check resource utilization of a finished job
1. You can request that an email be sent to you when the job finishes, by adding these two header lines to the job submission script:
#PBS -M username@uga.edu
#PBS -m ae
where username@uga.edu should be replaced by your email address (not necessarily a UGAMail address).
The email message will include the resource utilization of the job.
2. Within 24 hours of a job's completion, you can use the command
qstat -f jobid
to check on the resource utilization (such as wall clock time, amount of memory, etc).