Rocky 8 Transition Guide

From Research Computing Center Wiki
Revision as of 10:59, 28 July 2023 by Shtsai (talk | contribs)

Introduction

As part of our August 29-31, 2023 maintenance window, GACRC will upgrade the Linux operating system on the Sapelo2 cluster from CentOS 7 to Rocky 8.

Why is a major Operating System (OS) update necessary?

  • Existing OS is End of Life - No further full version updates are being released for the existing operating system, and newer versions of some software applications are not supported by the current OS version.
  • Bringing New Nodes Online - Because development of the existing OS has stopped, some of the latest generation of compute node hardware cannot run it; that hardware needs newer drivers than this OS provides. The new hardware and architectures that we will be bringing online soon require this OS update.
  • Security Improvements - Major OS updates like this one are needed to keep the cluster as up to date as possible.
  • Why Rocky 8? - A good portion of HPC centers are adopting it, which means there is a good amount of community support.

What does this mean for you and your workflows?

Overview

  • We are not changing anything from the data storage standpoint. All existing /home, /scratch, /work, /project spaces will retain the existing data.
  • The compiler toolchains and many software packages will be updated to newer versions.
  • Because this is a major OS update, we need to recompile all applications and ensure that they work with the new version of the OS.
  • We will have as comprehensive a software suite available on the new OS as possible, but some less widely used applications and older software versions will not be immediately available.
  • Because software modules will be reinstalled and updated, all pending jobs will be canceled during the maintenance window. This prevents job failures caused by changes in module names after the maintenance.

Storage

There will be no changes to the storage system during this maintenance window. All existing /home, /scratch, /work, /project, and /db spaces will be available after the maintenance and will retain their existing data.


Queueing System

The Slurm queueing system will be updated from version 21.08.8 to version 23.02.2. Most compute nodes available on the CentOS 7 system will continue to be available after the transition to Rocky 8 and the Slurm partitions will remain the same.


Software

Warning

Because this is a major change in the operating system, most user software built on CentOS 7 will not work and will need to be rebuilt. Even if a program runs without being rebuilt, the change in the underlying libraries may affect code execution and results. Users should therefore test and verify that their codes produce the expected results on the new operating system.
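One way to approach this verification is to save numeric results from a run on CentOS 7 and compare them against a run on Rocky 8, allowing for small floating-point differences. The following is a minimal sketch, not a GACRC-provided tool. The function names, the tolerance, and the sample values are all hypothetical, and it assumes the results can be reduced to a flat list of numbers.

```python
import math


def read_numbers(path):
    """Read whitespace-separated floats from a text file (assumed output format)."""
    with open(path) as handle:
        return [float(token) for token in handle.read().split()]


def compare_results(reference, candidate, rel_tol=1e-9, abs_tol=0.0):
    """Return (index, reference, candidate) for every value outside the tolerance."""
    if len(reference) != len(candidate):
        raise ValueError("result sets have different lengths")
    mismatches = []
    for index, (ref, new) in enumerate(zip(reference, candidate)):
        if not math.isclose(ref, new, rel_tol=rel_tol, abs_tol=abs_tol):
            mismatches.append((index, ref, new))
    return mismatches


# Hypothetical values standing in for saved CentOS 7 and Rocky 8 outputs.
centos7_values = [1.0, 2.5, 3.14159265]
rocky8_values = [1.0, 2.5000000001, 3.2]

# Only the third value differs by more than the chosen relative tolerance.
print(compare_results(centos7_values, rocky8_values, rel_tol=1e-6))
```

An appropriate tolerance depends on the application. Bit-for-bit identical output is often not a realistic expectation after a compiler and library change, so the threshold should reflect the precision the science actually requires.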

Compiler toolchains

The base compiler toolchains used to build software libraries and applications on the cluster will be updated, as newer versions are able to generate more optimized code for newer computer hardware and newer software versions.

Base compiler toolchains on CentOS 7:

  • GCCcore/8.3.0, GCC/8.3.0, gompi/2019b, foss/2019b
  • GCCcore/10.2.0, GCC/10.2.0, gompi/2020b, foss/2020b
  • CUDA versions 10.2 and 11.1
  • OpenMPI versions 3.1.4 and 4.0.5

Base compiler toolchains on Rocky 8:

  • GCCcore/11.2.0, GCC/11.2.0, gompi/2021b, foss/2021b
  • GCCcore/11.3.0, GCC/11.3.0, gompi/2022a, foss/2022a
  • CUDA versions 11.4, 11.7, and 12.0
  • OpenMPI versions 4.1.2 and 4.1.4
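
Job scripts that load modules built with the CentOS 7 toolchains listed above will need updating. The sketch below scans the text of a job script for such references. It is illustrative only: the function name and sample script are hypothetical, and it assumes modules are loaded with `module load` or the `ml` shorthand, one command per line.

```python
import re

# Toolchains taken from the CentOS 7 list above.
CENTOS7_TOOLCHAINS = (
    "GCCcore/8.3.0", "GCC/8.3.0", "gompi/2019b", "foss/2019b",
    "GCCcore/10.2.0", "GCC/10.2.0", "gompi/2020b", "foss/2020b",
)

# Assumption: modules are loaded with "module load ..." or "ml ...".
MODULE_LINE = re.compile(r"^\s*(?:module\s+load|ml)\s+(.+)$")


def uses_old_toolchain(module):
    """True if the module is a CentOS 7 toolchain or was built with one."""
    for toolchain in CENTOS7_TOOLCHAINS:
        # In module names the toolchain appears as "-name-version".
        suffix = "-" + toolchain.replace("/", "-")
        if module == toolchain or suffix in module:
            return True
    return False


def find_old_modules(script_text):
    """Return (line number, module) pairs that reference CentOS 7 toolchains."""
    hits = []
    for lineno, line in enumerate(script_text.splitlines(), start=1):
        match = MODULE_LINE.match(line)
        if not match:
            continue
        for module in match.group(1).split():
            if uses_old_toolchain(module):
                hits.append((lineno, module))
    return hits


# Hypothetical job script used only to exercise the function.
sample_script = """#!/bin/bash
#SBATCH --ntasks=1
module load BLAST+/2.12.0-gompi-2020b
ml foss/2019b
module load BLAST+/2.13.0-gompi-2022a
"""

for lineno, module in find_old_modules(sample_script):
    print(f"line {lineno}: {module}")
```

A module that is flagged here will not necessarily have a direct replacement on Rocky 8; the available versions should be checked on the new system.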

Centrally installed modules

Centrally installed software modules will continue to have the format Name/Version-Toolchain, but for most software packages the Version and Toolchain will be updated.

Sample module on CentOS 7:

BLAST+/2.12.0-gompi-2020b

Sample module on Rocky 8:

BLAST+/2.13.0-gompi-2022a
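
To make the Name/Version-Toolchain format concrete, the following sketch splits the two sample module names into their parts and reports which parts changed. The `ModuleSpec` type and `parse_module` function are hypothetical helpers, and the parsing assumes the version itself contains no hyphen, which holds for the samples above but may not hold for every module.

```python
from typing import NamedTuple


class ModuleSpec(NamedTuple):
    name: str
    version: str
    toolchain: str


def parse_module(module):
    """Split 'Name/Version-Toolchain' into its three parts."""
    name, slash, rest = module.partition("/")
    # Assumption: the first hyphen after the slash separates version and toolchain.
    version, dash, toolchain = rest.partition("-")
    if not (slash and dash and name and version and toolchain):
        raise ValueError(f"not in Name/Version-Toolchain form: {module!r}")
    return ModuleSpec(name, version, toolchain)


old = parse_module("BLAST+/2.12.0-gompi-2020b")
new = parse_module("BLAST+/2.13.0-gompi-2022a")

# List the fields that differ between the CentOS 7 and Rocky 8 samples.
changed = [field for field in ModuleSpec._fields
           if getattr(old, field) != getattr(new, field)]
print(old)
print(new)
print("changed:", changed)
```

For the sample modules, the name stays the same while both the version and the toolchain change, which is the pattern to expect for most centrally installed software.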

Conda environments

Python packages

R packages

Singularity containers