Job Submission partitions on Sapelo2: Difference between revisions

Revision as of 08:32, 2 June 2023

Batch partitions (queues) defined on Sapelo2

Several partitions are defined on Sapelo2. The Slurm queueing system refers to queues as partitions. Users must specify the partition and the resources the job needs, either in the job submission script or as command-line arguments to the job submission command, so that the job can be assigned to compute node(s) with enough available resources (such as number of cores, amount of memory, or GPU cards). Please note that Slurm will not allow a job to be submitted if no resources match your request. Please refer to Migrating from Torque to Slurm for more information about the Slurm queueing system.
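As a rough illustration of specifying the partition and resources in a job submission script, the sketch below renders the `#SBATCH` header lines of a script. The function name `render_sbatch_header` and the example values (4 cores, 16 GB, 2 days) are hypothetical and chosen only for illustration; the `--partition`, `--ntasks`, `--cpus-per-task`, `--mem`, and `--time` options are standard `sbatch` options.

```python
def render_sbatch_header(partition, ntasks, cpus_per_task, mem_gb, time_limit):
    """Return the #SBATCH header lines for a job script as one string.

    Hypothetical helper for illustration; it only formats text and does
    not submit anything to Slurm.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --partition={partition}",          # partition (queue) name
        f"#SBATCH --ntasks={ntasks}",                # number of tasks
        f"#SBATCH --cpus-per-task={cpus_per_task}",  # cores per task
        f"#SBATCH --mem={mem_gb}G",                  # memory per node, in GB
        f"#SBATCH --time={time_limit}",              # wall time, days-hh:mm:ss
    ]
    return "\n".join(lines)


# Example request: the batch partition, 1 task with 4 cores, 16 GB, 2 days.
header = render_sbatch_header("batch", 1, 4, 16, "2-00:00:00")
print(header)
```

The same options can instead be passed as command-line arguments to `sbatch`, in which case they do not need to appear in the script.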

The following partitions are defined on the Sapelo2 cluster:

Partition Name	Time limit	Max jobs running	Max jobs able to be submitted	Notes
batch	7 days	250	10,000	Regular nodes.
batch_30d	30 days	1	2	Regular nodes. A given user can have up to one job running at a time here, plus one pending, or two pending and none running. A user's attempt to submit a third job into this partition will be rejected.
highmem_p	7 days	15	100	For high memory jobs
highmem_30d_p	30 days	1	2	For high memory jobs. A given user can have up to one job running at a time here, plus one pending, or two pending and none running. A user's attempt to submit a third job into this partition will be rejected.
hugemem_p	7 days	4	4	For jobs needing up to 2TB of memory
hugemem_30d_p	30 days	4	4	For jobs needing up to 2TB of memory
gpu_p	7 days	18	20	For GPU-enabled jobs.
gpu_30d_p	30 days	2	2	For GPU-enabled jobs. A given user can have up to one job running at a time here, plus one pending, or two pending and none running. A user's attempt to submit a third job into this partition will be rejected.
inter_p	2 days	3	20	Regular nodes, for interactive jobs.
name_p	variable			Partitions that target different groups' buy-in nodes. The name string is specific to each group.
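The limits in the table above can be sketched as a simple pre-submission check. This is only an illustration: the `PARTITION_LIMITS` dictionary and the `can_submit` function are hypothetical names, the numbers are copied from the table, and the check assumes that "Max jobs able to be submitted" counts a user's running and pending jobs together, which matches the behavior described for batch_30d. The group-specific name_p partitions are omitted because their limits vary.

```python
# (time limit in days, max running jobs, max submitted jobs),
# copied from the partition table above.
PARTITION_LIMITS = {
    "batch": (7, 250, 10000),
    "batch_30d": (30, 1, 2),
    "highmem_p": (7, 15, 100),
    "highmem_30d_p": (30, 1, 2),
    "hugemem_p": (7, 4, 4),
    "hugemem_30d_p": (30, 4, 4),
    "gpu_p": (7, 18, 20),
    "gpu_30d_p": (30, 2, 2),
    "inter_p": (2, 3, 20),
}


def can_submit(partition, running, pending, days_requested):
    """Return True if one more job would stay within the table's limits.

    Assumption: the submission limit applies to running + pending jobs.
    """
    time_limit, _max_running, max_submitted = PARTITION_LIMITS[partition]
    if days_requested > time_limit:
        return False  # requested wall time exceeds the partition's limit
    return running + pending < max_submitted


# One job running in batch_30d: one more may be submitted (it will pend).
print(can_submit("batch_30d", running=1, pending=0, days_requested=10))
# One running and one pending: a third submission would be rejected.
print(can_submit("batch_30d", running=1, pending=1, days_requested=10))
```

A job that passes this check may still wait in the queue: the "Max jobs running" column limits how many of a user's jobs run at once, not how many can be accepted.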

When defining the resources for your job, make sure you stay within the bounds of the resources available in the partition you are using. The table below outlines the resources available per node type; the red values are the maximum for the corresponding partition.

Partition Name	# of Nodes	Max Mem(GB)/Node	Max Cores/Node	Processor Type	GPU Cards/Node
batch, batch_30d	119	500	128	AMD EPYC Milan (3rd gen)	N/A
	4	250	64	AMD EPYC Milan (3rd gen)
	2	120	64	AMD EPYC Milan (3rd gen)
	123		64	AMD EPYC Rome (2nd gen)
	64		32	AMD EPYC Naples (1st gen)
	42	180	32	Intel Xeon Skylake
highmem_p, highmem_30d_p	18	500	32	AMD EPYC Naples (1st gen)
	4	990	64	AMD EPYC Naples (1st gen)
	4	990	28	Intel Xeon Broadwell
hugemem_p, hugemem_30d_p	2	2000	32	AMD EPYC Rome (2nd gen)
gpu_p, gpu_30d_p	4	180	32	Intel Xeon Skylake	1 NVIDIA P100
	2	120	16	Intel Xeon	8 NVIDIA K40m
	5	1000	64	AMD EPYC Milan (3rd gen)	4 NVIDIA A100
name_p	variable
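To illustrate staying within the per-node bounds, the following sketch checks whether a single-node request for cores and memory fits at least one node type listed in the table above. The names `NODE_TYPES` and `fits_some_node` are hypothetical, the values are copied from the table, and node types whose memory is not listed in the table are skipped because the sketch cannot confirm a fit for them. Each _30d partition uses the same node types as its 7-day counterpart.

```python
# (max memory in GB per node, max cores per node) for each node type,
# copied from the table above. None means the table lists no memory value.
NODE_TYPES = {
    "batch": [(500, 128), (250, 64), (120, 64), (None, 64), (None, 32), (180, 32)],
    "highmem_p": [(500, 32), (990, 64), (990, 28)],
    "hugemem_p": [(2000, 32)],
    "gpu_p": [(180, 32), (120, 16), (1000, 64)],
}


def fits_some_node(partition, cores, mem_gb):
    """Return True if some listed node type can hold the whole request."""
    for max_mem, max_cores in NODE_TYPES[partition]:
        if max_mem is None:
            continue  # memory not listed in the table; cannot confirm a fit
        if cores <= max_cores and mem_gb <= max_mem:
            return True
    return False


# 64 cores with 990 GB matches one highmem_p node type in the table.
print(fits_some_node("highmem_p", cores=64, mem_gb=990))
# 129 cores exceeds the largest core count listed for batch nodes.
print(fits_some_node("batch", cores=129, mem_gb=100))
```

The table gives maximum values per node, so a request that passes this check is within the listed bounds but is not guaranteed to start immediately.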

@@ Line 94: / Line 94: @@
 | 2 || 120 || 16 || Intel Xeon || 8 NVIDIA K40m
 |-
-|1
+|5
 |style="color:red" |'''1000'''
 |style="color:red" |'''64'''
