Migrating from Torque to Slurm
Revision as of 10:03, 11 February 2020
Later this year, GACRC will implement the Simple Linux Utility for Resource Management (Slurm) software for job scheduling and resource management on Sapelo2. Slurm will replace the Torque (PBS) resource manager and Moab scheduling system currently in use.
How is Slurm different from Torque?
Job Submission Options
As with Torque, job options and resource requests in Slurm can be set either in the job submission script or as options to the job submission command. The syntax for requesting resources is different, however; the table below summarizes some frequently used options.
Option | Torque (qsub) | Slurm (sbatch) |
---|---|---|
Script directive | #PBS | #SBATCH |
Job name | -N <name> | --job-name=<name> <br> -J <name> |
Queue | -q <queue> | --partition=<queue> |
Wall time limit | -l walltime=<hh:mm:ss> | --time=<hh:mm:ss> |
Node count | -l nodes=<count> | --nodes=<count> <br> -N <count> |
Process count per node | -l ppn=<count> | --ntasks-per-node=<count> |
Core count (per process) | | --cpus-per-task=<cores> |
Memory limit | -l mem=<limit> | --mem=<limit> (memory per node, in megabytes, MB) |
Minimum memory per processor | -l pmem=<limit> | --mem-per-cpu=<memory> |
Request GPUs | -l gpus=<count> | --gres=gpu:<count> |
Request specific nodes | -l nodes=<node>[,node2[,...]] | -w, --nodelist=<node>[,node2[,...]] <br> -F, --nodefile=<node file> |
Job array | -t <array indices> | --array <indexes> <br> -a <indexes> <br> where <indexes> is a range (0-15), a list (0,6,16-32), or a range with a step (0-15:4) |
Standard output file | -o <file path> | --output=<file path> (directory must already exist) |
Standard error file | -e <file path> | --error=<file path> (directory must already exist) |
Combine stdout/stderr to stdout | -j oe | --output=<combined out and err file path> |
Copy environment | -V | --export=ALL (default) <br> --export=NONE to not export the environment |
Copy environment variable | -v <variable[=value][,variable2=value2[,...]]> | --export=<variable[=value][,variable2=value2[,...]]> |
Job dependency | -W depend=after:jobID[:jobID...] <br> -W depend=afterok:jobID[:jobID...] <br> -W depend=afternotok:jobID[:jobID...] <br> -W depend=afterany:jobID[:jobID...] | --dependency=after:jobID[:jobID...] <br> --dependency=afterok:jobID[:jobID...] <br> --dependency=afternotok:jobID[:jobID...] <br> --dependency=afterany:jobID[:jobID...] |
Request event notification | -m <events> | --mail-type=<events> <br> Note: multiple mail-type requests may be given as a comma-separated list, e.g. --mail-type=BEGIN,END,NONE,FAIL,REQUEUE |
Email address | -M <email address> | --mail-user=<email address> |
Defer job until the specified time | -a <date/time> | --begin=<date/time> |
Node exclusive job | qsub -n | --exclusive |
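To illustrate how the directive mapping applies to an existing job script, here is a minimal sketch of a shell filter that rewrites a few simple #PBS lines as #SBATCH lines. The function name torque_to_slurm and the job name, queue name, and wall time in the usage example are hypothetical, chosen only for illustration. The sketch covers only directives with one option per line and a value that carries over unchanged (job name, queue, wall time, output and error files, email address); options such as -l nodes=...:ppn=..., -l mem=... and -m need manual conversion because their value formats differ between the two systems.

```shell
#!/bin/bash
# Hypothetical helper (not part of Torque or Slurm): reads a job script on
# stdin and rewrites a handful of simple #PBS directives as #SBATCH ones.
# Lines that do not match any rule are passed through unchanged.
torque_to_slurm() {
    sed -E \
        -e 's/^#PBS[[:space:]]+-N[[:space:]]+/#SBATCH --job-name=/' \
        -e 's/^#PBS[[:space:]]+-q[[:space:]]+/#SBATCH --partition=/' \
        -e 's/^#PBS[[:space:]]+-l[[:space:]]+walltime=/#SBATCH --time=/' \
        -e 's/^#PBS[[:space:]]+-o[[:space:]]+/#SBATCH --output=/' \
        -e 's/^#PBS[[:space:]]+-e[[:space:]]+/#SBATCH --error=/' \
        -e 's/^#PBS[[:space:]]+-M[[:space:]]+/#SBATCH --mail-user=/'
}

# Usage example with a hypothetical job name and queue name.
printf '%s\n' \
    '#PBS -N testjob' \
    '#PBS -q batch' \
    '#PBS -l walltime=01:00:00' \
    | torque_to_slurm
# prints:
# #SBATCH --job-name=testjob
# #SBATCH --partition=batch
# #SBATCH --time=01:00:00
```

Any #PBS lines that remain after running the filter are the ones to convert by hand using the table above.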
Job Environment and Environment Variables
Common Job Commands
How to Submit and Manage Jobs
How to Monitor Jobs
Valid Job States