Systems
Sapelo2
Sapelo2 is a Linux cluster that runs a 64-bit Rocky 8.8 operating system and is managed using Warewulf. Several virtual login nodes are available, each with Intel Xeon Gold 6230 processors, 16 cores, and 32GB of RAM. The queueing system on Sapelo2 is Slurm.
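As a minimal sketch of how a job could be submitted to Slurm on this cluster, a batch script might look like the following; the partition name (batch), module name, and resource values are illustrative assumptions rather than site-specific settings.

#!/bin/bash
#SBATCH --job-name=testjob         # job name shown by squeue
#SBATCH --partition=batch          # assumed partition name; check the cluster documentation
#SBATCH --ntasks=1                 # run a single task
#SBATCH --cpus-per-task=4          # with 4 CPU cores
#SBATCH --mem=8G                   # and 8GB of memory
#SBATCH --time=01:00:00            # wall-clock limit of 1 hour

cd "$SLURM_SUBMIT_DIR"             # start in the directory the job was submitted from
module load Python                 # hypothetical module name; see "module avail" for installed modules
python myscript.py                 # hypothetical user script

Such a script would be submitted with "sbatch <scriptname>" and its status checked with "squeue -u <username>".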
Internodal communication among the compute nodes and between these nodes and the storage systems serving the home directories and the scratch directories is provided by an EDR Infiniband network (100Gbps).
The cluster currently comprises the following resources:
Regular nodes
- 14 compute nodes with AMD EPYC (Genoa 4th gen) processors (128 cores and 745GB of RAM per node)
- 120 compute nodes with AMD EPYC (Milan 3rd gen) processors (128 cores and 512GB of RAM per node)
- 4 compute nodes with AMD EPYC (Milan 3rd gen) processors (64 cores and 256GB of RAM per node)
- 2 compute nodes with AMD EPYC (Milan 3rd gen) processors (64 cores and 128GB of RAM per node)
- 123 compute nodes with AMD EPYC (Rome 2nd gen) processors (64 cores and 128GB of RAM per node)
- 50 compute nodes with AMD EPYC (Naples 1st gen) processors (32 cores and 128GB of RAM per node)
- 42 compute nodes with Intel Xeon Skylake processors (32 cores and 192GB of RAM per node)
High memory nodes (3TB/node)
- 3 compute nodes with AMD EPYC (Genoa 4th gen) processors (48 cores and 3TB of RAM per node)
High memory nodes (2TB/node)
- 2 compute nodes with AMD EPYC (Rome 2nd gen) processors (32 cores and 2TB of RAM per node)
High memory nodes (1TB/node)
- 2 compute nodes with AMD EPYC (Milan 3rd gen) processors (128 cores and 1TB of RAM per node)
- 12 compute nodes with AMD EPYC (Milan 3rd gen) processors (32 cores and 1TB of RAM per node)
- 2 compute nodes with AMD EPYC (Naples 1st gen) processors (64 cores and 1TB of RAM per node)
- 1 compute node with Intel Xeon Broadwell processors (28 cores and 1TB of RAM)
High memory nodes (512GB/node)
- 18 compute nodes with AMD EPYC (Naples 1st gen) processors (32 cores and 512GB of RAM per node)
GPU nodes
- 12 compute nodes with Intel Xeon Sapphire Rapids processors (64 cores and 1TB of RAM per node) and 4x NVIDIA H100 GPU cards per node
- 12 compute nodes with AMD EPYC (Genoa 4th gen) processors (128 cores and 745GB of RAM per node) and 4x NVIDIA L4 GPU cards per node
- 14 compute nodes with AMD EPYC (Milan 3rd gen) processors (64 cores and 1TB of RAM per node) and 4x NVIDIA A100 GPU cards per node
- 2 compute nodes with Intel Xeon Skylake processors (32 cores and 187GB of RAM per node) and 1x NVIDIA P100 GPU card per node
Buy-in nodes
- Various configurations
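As a hedged sketch of how a job might request one of the GPU nodes listed above, the relevant Slurm directives could look like the following; the partition name (gpu_p) and the GPU type string (A100) are assumptions that depend on the site's Slurm and gres configuration.

#!/bin/bash
#SBATCH --job-name=gputest
#SBATCH --partition=gpu_p          # assumed name of the GPU partition
#SBATCH --gres=gpu:1               # request one GPU of any available type
##SBATCH --gres=gpu:A100:1         # commented alternative: request a specific GPU type, if defined
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=32G
#SBATCH --time=02:00:00

nvidia-smi                         # report the GPU(s) allocated to the job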
Connecting to Sapelo2
Transferring Files
Disk Storage
Software on Sapelo2
Available Toolchains and Toolchain Compatibility
Code Compilation on Sapelo2
Running Jobs on Sapelo2
Monitoring Jobs on Sapelo2
Migrating from Torque to Slurm
Training material
To help users familiarize themselves with Slurm and the cluster environment, we have prepared training videos that are available from the GACRC's Kaltura channel at https://kaltura.uga.edu/channel/GACRC/176125031 (login with MyID and password is required). Training sessions and slides are available at https://wiki.gacrc.uga.edu/wiki/Training
Teaching cluster
The teaching cluster is a Linux cluster that runs a 64-bit Rocky 8.8 operating system. The login node is a VM with 4 cores (Intel Xeon Gold 6230 processor) and 16GB of RAM. An EDR Infiniband network (100Gbps) provides internodal communication among the compute nodes, and between the compute nodes and the storage systems serving the home directories and the work directories.
The cluster currently comprises the following resources:
Regular nodes:
- 10 compute nodes with AMD EPYC (Naples 1st gen) processors (32 cores and 128GB of RAM per node)
High-memory nodes:
- 2 compute nodes with AMD EPYC (Naples 1st gen) processors (64 cores and 1TB of RAM per node)
GPU nodes:
- 1 compute node with Intel Xeon Skylake processors (32 cores and 192GB of RAM) and 1x NVIDIA P100 GPU card
The queueing system on the teaching cluster is Slurm.
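For short interactive work, an interactive shell can be requested from Slurm with srun; the partition name (batch) and the resource values below are illustrative assumptions, not teaching-cluster settings.

srun --partition=batch --ntasks=1 --cpus-per-task=1 --mem=2G --time=00:30:00 --pty bash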
Connecting to the teaching cluster
Transferring Files
Software Installed on the teaching cluster
The teaching cluster has access to the same software stack installed on Sapelo2.