Job Submission partitions on Sapelo2: Difference between revisions

From Research Computing Center Wiki
Line 11: Line 11:
! scope="col" | Partition Name
! scope="col" | Partition Name
! scope="col" | Time limit
! scope="col" | Time limit
! scope="col" | Max jobs
! scope="col" | Max jobs running
! scope="col" | Max jobs able to be submitted
! scope="col" | Notes
! scope="col" | Notes
|-
|-
|-
|-
| batch || 7 days || 250 || Regular nodes.
| batch || 7 days || 250 || 10,000 || Regular nodes.
|-
|-
| batch_30d || 30 days || 2 || Regular nodes. A given user can have up to one job running at a time here, plus one pending, or two pending and none running. A user's attempt to submit a third job into this partition will be rejected.
| batch_30d || 30 days || 1 || 2 || Regular nodes. A given user can have up to one job running at a time here, plus one pending, or two pending and none running. A user's attempt to submit a third job into this partition will be rejected.
|-
|-
| highmem_p || 7 days || 15 || For high memory jobs
| highmem_p || 7 days || 15 || 100 || For high memory jobs
|-
|-
| highmem_30d_p || 30 days || 2 || For high memory jobs. A given user can have up to one job running at a time here, plus one pending, or two pending and none running. A user's attempt to submit a third job into this partition will be rejected.
| highmem_30d_p || 30 days || 1 || 2 || For high memory jobs. A given user can have up to one job running at a time here, plus one pending, or two pending and none running. A user's attempt to submit a third job into this partition will be rejected.
|-
|-
| gpu_p || 7 days || 18 || For GPU-enabled jobs.
| gpu_p || 7 days || 18 || 20 || For GPU-enabled jobs.
|-
|-
| gpu_30d_p || 30 days || 2 || For GPU-enabled jobs. A given user can have up to one job running at a time here, plus one pending, or two pending and none running. A user's attempt to submit a third job into this partition will be rejected.
| gpu_30d_p || 30 days || 2 || 2 || For GPU-enabled jobs. A given user can have up to one job running at a time here, plus one pending, or two pending and none running. A user's attempt to submit a third job into this partition will be rejected.
|-
|-
| inter_p || 2 days || 3 || Regular nodes, for interactive jobs.
| inter_p || 2 days || 3 || 20 || Regular nodes, for interactive jobs.
|-
|-
| '''name'''_p || style="text-align: center" colspan="2"| variable  || Partitions that target different groups' buy-in nodes. The '''name''' string is specific to each group.  
| '''name'''_p || style="text-align: center" colspan="2"| variable  || Partitions that target different groups' buy-in nodes. The '''name''' string is specific to each group.  

Revision as of 10:37, 19 May 2021


Batch partitions (queues) defined on Sapelo2

Several partitions are defined on Sapelo2. The Slurm queueing system refers to queues as partitions. In the job submission script, or as command-line arguments to the job submission command, users must specify the partition and the resources the job needs, so that the job can be assigned to compute node(s) with enough available resources (such as number of cores, amount of memory, and GPU cards). Please note that Slurm will not allow a job to be submitted if no resources match your request. Please refer to Migrating from Torque to Slurm for more information about the Slurm queueing system.
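As a minimal sketch of what such a request looks like, the Python helper below assembles the `#SBATCH` header lines of a submission script. The function name `sbatch_header` and the example values are hypothetical and not part of any Sapelo2 tooling; `--job-name`, `--partition`, `--ntasks`, `--cpus-per-task`, `--mem` and `--time` are standard `sbatch` options.

```python
def sbatch_header(partition, cpus_per_task, mem_gb, time_limit, job_name="example"):
    """Return the header of a Slurm job script as one string.

    Hypothetical helper for illustration only: it formats the request,
    it does not check it against any partition limit.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --partition={partition}",          # which partition (queue) to use
        "#SBATCH --ntasks=1",                        # a single task in this sketch
        f"#SBATCH --cpus-per-task={cpus_per_task}",  # cores for that task
        f"#SBATCH --mem={mem_gb}G",                  # memory per node, in GB
        f"#SBATCH --time={time_limit}",              # wall time, D-HH:MM:SS
    ]
    return "\n".join(lines)


# Example request: 4 cores, 16 GB and one day on the batch partition.
print(sbatch_header("batch", cpus_per_task=4, mem_gb=16, time_limit="1-00:00:00"))
```

The same options can instead be passed on the command line, for example `sbatch --partition=batch --mem=16G script.sh`, where `script.sh` is a placeholder name.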

The following partitions are defined on the Sapelo2 cluster:

{| class="wikitable"
|-
! scope="col" | Partition Name
! scope="col" | Time limit
! scope="col" | Max jobs running
! scope="col" | Max jobs able to be submitted
! scope="col" | Notes
|-
| batch || 7 days || 250 || 10,000 || Regular nodes.
|-
| batch_30d || 30 days || 1 || 2 || Regular nodes. A given user can have up to one job running at a time here, plus one pending, or two pending and none running. A user's attempt to submit a third job into this partition will be rejected.
|-
| highmem_p || 7 days || 15 || 100 || For high memory jobs.
|-
| highmem_30d_p || 30 days || 1 || 2 || For high memory jobs. A given user can have up to one job running at a time here, plus one pending, or two pending and none running. A user's attempt to submit a third job into this partition will be rejected.
|-
| gpu_p || 7 days || 18 || 20 || For GPU-enabled jobs.
|-
| gpu_30d_p || 30 days || 2 || 2 || For GPU-enabled jobs. A given user can have up to one job running at a time here, plus one pending, or two pending and none running. A user's attempt to submit a third job into this partition will be rejected.
|-
| inter_p || 2 days || 3 || 20 || Regular nodes, for interactive jobs.
|-
| '''name'''_p || style="text-align: center" colspan="3" | variable || Partitions that target different groups' buy-in nodes. The '''name''' string is specific to each group.
|}
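
The job limits in the table above can be read as a simple rule, sketched below in Python. This is an illustration under two assumptions that go beyond the table: that the limits apply per user, and that "submitted" counts running plus pending jobs (which matches the batch_30d note). The names `LIMITS`, `can_submit` and `can_start` are hypothetical; Slurm enforces the real limits itself.

```python
# (max jobs running, max jobs submitted), copied from the table above.
# The group-specific name_p partitions are omitted because their limits vary.
LIMITS = {
    "batch": (250, 10000),
    "batch_30d": (1, 2),
    "highmem_p": (15, 100),
    "highmem_30d_p": (1, 2),
    "gpu_p": (18, 20),
    "gpu_30d_p": (2, 2),
    "inter_p": (3, 20),
}


def can_submit(partition, running, pending):
    """True if one more job would still be accepted into the partition.

    Assumption: submitted jobs = running + pending jobs of the same user.
    """
    _, max_submitted = LIMITS[partition]
    return running + pending < max_submitted


def can_start(partition, running):
    """True if a pending job may start given the number already running."""
    max_running, _ = LIMITS[partition]
    return running < max_running


# batch_30d: one running plus one pending fills the quota,
# so a third submission is rejected.
print(can_submit("batch_30d", running=1, pending=0))
print(can_submit("batch_30d", running=1, pending=1))
```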


When defining the resources for your job, make sure you stay within the bounds of the resources available in the partition you are using. The table below lists the resources available on each type of node; the values shown in red are the maximum for the corresponding partition.

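{| class="wikitable"
|-
! scope="col" | Partition Name
! scope="col" | # of Nodes
! scope="col" | Max Mem (GB)/Node
! scope="col" | Max Cores/Node
! scope="col" | Processor Type
! scope="col" | GPU Cards/Node
|-
| batch, batch_30d || 93 || 120 || 64 || AMD EPYC || N/A
|-
| batch, batch_30d || 49 || 120 || 32 || AMD EPYC || N/A
|-
| batch, batch_30d || 68 || 120 || 48 || AMD Opteron || N/A
|-
| batch, batch_30d || 2 || 250 || 48 || AMD Opteron || N/A
|-
| batch, batch_30d || 42 || 180 || 32 || Intel Xeon Skylake || N/A
|-
| batch, batch_30d || 32 || 58 || 28 || Intel Xeon Broadwell || N/A
|-
| highmem_p, highmem_30d_p || 18 || 500 || 32 || AMD EPYC || N/A
|-
| highmem_p, highmem_30d_p || 6 || 500 || 48 || AMD Opteron || N/A
|-
| highmem_p, highmem_30d_p || 4 || 990 || 64 || AMD EPYC || N/A
|-
| highmem_p, highmem_30d_p || 4 || 990 || 28 || Intel Xeon Broadwell || N/A
|-
| highmem_p, highmem_30d_p || 1 || 990 || 48 || AMD Opteron || N/A
|-
| gpu_p, gpu_30d_p || 3 || 180 || 32 || Intel Xeon Skylake || 1 NVIDIA P100
|-
| gpu_p, gpu_30d_p || 2 || 120 || 16 || Intel Xeon || 8 NVIDIA K40m
|-
| gpu_p, gpu_30d_p || 1 || 90 || 12 || Intel Xeon || 7 NVIDIA K20Xm
|-
| '''name'''_p || style="text-align: center" colspan="5" | variable
|}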
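
To illustrate staying within these bounds, the Python sketch below checks whether a single-node request for memory and cores fits on at least one node type of a partition. It is a simplified illustration, not how Slurm schedules jobs: `NODE_TYPES` and `fits_on_some_node` are hypothetical names, and only a subset of rows from the table above is included.

```python
# (max memory in GB, max cores) per node type.
# A subset of the rows in the table above, for illustration only.
NODE_TYPES = {
    "batch": [(120, 64), (180, 32), (58, 28)],
    "gpu_p": [(180, 32), (120, 16), (90, 12)],
}


def fits_on_some_node(partition, mem_gb, cores):
    """True if at least one listed node type can hold the whole request.

    Assumption: the job runs on a single node, so memory and cores
    must both fit on the same node type.
    """
    return any(mem_gb <= max_mem and cores <= max_cores
               for max_mem, max_cores in NODE_TYPES[partition])


# 150 GB with 32 cores fits the 180 GB / 32-core nodes in batch.
print(fits_on_some_node("batch", mem_gb=150, cores=32))
# 150 GB with 64 cores fits no listed node type: the 64-core nodes
# have only 120 GB, and the 180 GB nodes have only 32 cores.
print(fits_on_some_node("batch", mem_gb=150, cores=64))
```

A request that fails a check like this is the kind Slurm would refuse at submission time, since no node matches it.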