Best Practices on Sapelo2

From Research Computing Center Wiki
Revision as of 15:26, 3 June 2020 by Shtsai (talk | contribs)
Jump to navigation Jump to search


Data Workflow Management

Home directories have a per user quota and have snapshots. Snapshots are like backups in that they are read-only moment-in-time captures of files and directories which can be used to restore files that may have been accidentally deleted or overwritten. If files are created and deleted with frequency, the snapshots will grow and might end up using a lot of space in the overall home file system.

The recommended data workflow is to have files in the home directory *change* as little as possible. These should be databases, applications that you use frequently but do not need to modify that often and other things that you, primarily, read from. Think of snapshots as the memory of the files that were stored there - no matter if you add, change or delete the files, the total sum of that activity will build up over time and may exceed your quota.

The recommended data workflow will have jobs write output files, including intermediate data, such as checkpoint files, and final results into the /scratch file system. Final results should then be transferred out of the /scratch file system, if these are not needed for other jobs that are being submitted soon. Any intermediate files and data not needed for other jobs to be submitted soon should be deleted immediately from the /scratch area.

Files that are needed for many jobs continuously or at some time intervals, possibly by multiple users within a group, such as reference data and model data, can be stored in the group's /work directory.

Single node jobs that do not need to write large output files, but that need to access the files often (for example, to write small amounts of data into disk), can benefit from using a compute node's local hard drive, /lscratch. Jobs that use /lscratch should request the amount of space in /lscratch.

The recommended data workflow is to have data not needed for current jobs, but that are still needed for future jobs on the cluster, be transferred into the project file system and deleted from the /scratch area.