Job submission partitions on Sapelo2
Overview
This page describes the Slurm partitions available on the Sapelo2 cluster, including job limits and the resources available in each partition.
In Slurm, queues are called partitions. When you submit a job, you must request both:
- the partition to use, and
- the resources your job needs, such as CPU cores, memory, or GPU devices.
Slurm will reject a job submission if no nodes match the resources you request. For background on Slurm, see Migrating from Torque to Slurm.
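For illustration, a minimal batch script header that names both a partition and the resources the job needs might look like the sketch below; the job name, resource values, and script filename are placeholders, not recommendations.

```bash
#!/bin/bash
#SBATCH --job-name=myjob        # placeholder job name
#SBATCH --partition=batch       # partition (queue) to submit to
#SBATCH --ntasks=1              # run a single task
#SBATCH --cpus-per-task=4       # CPU cores for that task
#SBATCH --mem=16G               # memory for the job
#SBATCH --time=02:00:00         # wall-clock limit (hh:mm:ss)

# Load modules and run your program here.
```

Submit the script with `sbatch myjob.sh`; if the requested combination of cores, memory, or GPUs does not exist on any node in the partition, Slurm rejects the submission.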
How to use this page
Use the first table to choose a partition based on job type and time limit.
Use the second table to confirm that your requested resources fit within the hardware available in that partition.
Partition limits
| Partition name | Time limit | Maximum running jobs per user | Maximum submitted jobs per user | Intended use and notes |
|---|---|---|---|---|
| batch | 7 days | 250 | 10,000 | Standard partition for regular compute jobs on general-purpose nodes. |
| batch_30d | 30 days | 1 | 2 | Standard partition for long-running jobs on regular nodes. A user may have one running job and one pending job, or two pending jobs and no running job. A third submission to this partition will be rejected. |
| highmem_p | 7 days | 6 | 100 | High-memory partition for jobs that require more memory than standard nodes provide. |
| highmem_30d_p | 30 days | 1 | 2 | High-memory partition for long-running jobs. A user may have one running job and one pending job, or two pending jobs and no running job. A third submission to this partition will be rejected. |
| hugemem_p | 7 days | 4 | 4 | Huge-memory partition for jobs needing up to 3 TB of memory. |
| hugemem_30d_p | 30 days | 4 | 4 | Huge-memory partition for long-running jobs needing up to 3 TB of memory. |
| gpu_p | 7 days | 6 | 20 | GPU-enabled partition for jobs that require one or more GPUs. |
| gpu_30d_p | 30 days | 2 | 2 | GPU-enabled partition for long-running jobs. A user may have one running job and one pending job, or two pending jobs and no running job. A third submission to this partition will be rejected. |
| inter_p | 2 days | 3 | 20 | Interactive partition for interactive jobs on regular nodes. |
| name_p | Variable | Variable | Variable | Partition for a specific group's buy-in nodes. Replace name with the group-specific partition prefix. |
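To confirm the limits currently configured for a partition, you can query Slurm directly; the exact fields shown depend on the Slurm version and site configuration.

```bash
# Show the configured time limit and other settings for the batch partition
scontrol show partition batch

# List your own running and pending jobs in that partition
squeue -u $USER --partition=batch
```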
Resource limits by partition
Before submitting a job, make sure your requested memory, CPU cores, and GPU count fit within the limits of the partition you choose.
In the table below, the phrase "partition maximum" identifies the largest per-node resource values available within that partition; it replaces color-only emphasis so that the information is available to all users.
| Partition | Number of nodes | Memory per node (GB) | CPU cores per node | Processor type | GPU configuration | Notes |
|---|---|---|---|---|---|---|
| batch, batch_30d | 16 | 740 | 128 | AMD EPYC Genoa (4th gen) | None | Partition maximum for memory and cores is available on this node type. |
| batch, batch_30d | 120 | 500 | 128 | AMD EPYC Milan (3rd gen) | None | Partition maximum for cores is also available on this node type. |
| batch, batch_30d | 4 | 250 | 64 | AMD EPYC Milan (3rd gen) | None | Standard-capacity general-purpose nodes. |
| batch, batch_30d | 2 | 120 | 64 | AMD EPYC Milan (3rd gen) | None | Standard-capacity general-purpose nodes. |
| batch, batch_30d | 123 | 120 | 64 | AMD EPYC Rome (2nd gen) | None | Standard-capacity general-purpose nodes. |
| batch, batch_30d | 25 | 120 | 32 | AMD EPYC Naples (1st gen) | None | Lower-core-count general-purpose nodes. |
| batch, batch_30d | 40 | 180 | 32 | Intel Xeon Skylake | None | Lower-core-count general-purpose nodes. |
| highmem_p, highmem_30d_p | 10 | 500 | 32 | AMD EPYC Naples (1st gen) | None | High-memory nodes. |
| highmem_p, highmem_30d_p | 2 | 990 | 128 | AMD EPYC Milan (3rd gen) | None | Partition maximum for memory and cores is available on this node type. |
| highmem_p, highmem_30d_p | 12 | 990 | 32 | AMD EPYC Milan (3rd gen) | None | High-memory nodes with fewer available cores per node. |
| hugemem_p, hugemem_30d_p | 3 | 3000 | 48 | AMD EPYC Genoa (4th gen) | None | Partition maximum for memory and cores is available on this node type. |
| hugemem_p, hugemem_30d_p | 2 | 2000 | 32 | AMD EPYC Rome (2nd gen) | None | Huge-memory nodes with lower maximums than the partition peak. |
| gpu_p, gpu_30d_p | 2 | 180 | 32 | Intel Xeon Skylake | 1 NVIDIA P100 | Older GPU nodes. |
| gpu_p, gpu_30d_p | 2 | 120 | 64 | AMD EPYC Rome (2nd gen) | 1 NVIDIA V100S | Single-GPU nodes with 64 cores. |
| gpu_p, gpu_30d_p | 14 | 1000 | 64 | AMD EPYC Milan (3rd gen) | 4 NVIDIA A100 | Partition maximum for memory is available on this node type. |
| gpu_p, gpu_30d_p | 12 | 1000 | 64 | Intel Xeon Sapphire Rapids | 4 NVIDIA H100 | Partition maximum for memory is available on this node type. |
| gpu_p, gpu_30d_p | 12 | 740 | 128 | AMD EPYC Genoa (4th gen) | 4 NVIDIA L4 | Partition maximum for cores is available on this node type. |
| name_p | Variable | Variable | Variable | Variable | Variable | Resource limits depend on the group's buy-in nodes. |
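To check the per-node resources Slurm reports for a partition yourself, `sinfo` can print node counts, cores, memory, and generic resources (GRES, such as GPUs); the GRES strings it prints are site-specific.

```bash
# Node count, CPU cores per node, memory per node (MB), and GRES for the gpu_p partition
sinfo --partition=gpu_p --format="%D %c %m %G"
```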
Choosing a partition
A general rule of thumb is:
- Use batch for most non-GPU jobs.
- Use batch_30d only when your job genuinely needs a longer wall time.
- Use highmem_p or highmem_30d_p when your memory requirements exceed what standard nodes provide.
- Use hugemem_p or hugemem_30d_p for jobs that need very large memory allocations, including jobs approaching 3 TB of memory.
- Use gpu_p or gpu_30d_p for GPU jobs.
- Use inter_p for interactive work.
- Use name_p only if your group has access to a buy-in partition with that name.
Example Slurm directives
The examples below show common ways to request a partition.
Regular compute job
#SBATCH --partition=batch
#SBATCH --time=2-00:00:00
#SBATCH --cpus-per-task=16
#SBATCH --mem=64G
GPU job
#SBATCH --partition=gpu_p
#SBATCH --time=1-00:00:00
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=8
#SBATCH --mem=64G
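If the GPU nodes expose their model as a typed GRES (check the GRES column from `sinfo`), you can request a specific model instead of a generic GPU; the type string A100 below is only an assumption and must match whatever the site actually defines.

```bash
#SBATCH --gres=gpu:A100:1   # one GPU of type "A100"; the type name is site-specific
```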
High-memory job
#SBATCH --partition=highmem_p
#SBATCH --time=12:00:00
#SBATCH --cpus-per-task=16
#SBATCH --mem=700G
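Interactive job

For interactive work in inter_p, a standard Slurm approach is to start a shell with `srun`; treat this as a generic sketch, since sites often provide their own wrapper command for interactive sessions.

```bash
# Request an interactive shell with 4 cores and 8 GB of memory for 2 hours
srun --partition=inter_p --cpus-per-task=4 --mem=8G --time=02:00:00 --pty bash
```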
Terms used on this page
- Partition: A Slurm queue that determines which nodes your job may run on.
- Time limit: The maximum wall-clock runtime allowed for a job in that partition.
- Running jobs: Jobs currently executing for a user in that partition.
- Submitted jobs: Total jobs a user may have in the partition, including running and pending jobs.
- Buy-in nodes: Nodes purchased by a specific group and made available through a group-specific partition.