Disk Storage
Revision as of 16:00, 18 April 2013
Storage Overview
Network attached storage systems at the GACRC are organized into three tiers based on speed and capacity. Our fastest network attached storage system is a Panasas ActiveStor 12, which exports 156TB of storage. It is mounted on every node at /panfs and divided into two categories: home and scratch.
Home Directories
All users have a default 300GB quota (i.e., maximum limit) on their home directory; however, justifiable requests for quotas up to 2TB can be made by contacting the GACRC IT Manager (currently Greg Derda: derda@uga.edu). Keeping data in the home directory to avoid archival storage fees is not a justifiable reason. Requests for home quotas greater than 2TB must be submitted by the PI of a lab group and approved by the GACRC advisory committee (via the IT Manager). Users may create lab directories for data that is shared by a lab group, but those directories count against the quota of the creating user. An example of this, for the "abclab" users, would be: /home/abclab/labdata. Home directories are backed up.
Snapshots
Lab home directories are snapshotted. Snapshots are like backups in that they are read-only, point-in-time captures of files and directories. Files can be copied out of a snapshot to restore ones that were accidentally deleted or overwritten.
Home directories have snapshots taken once a day and maintained for 4 days, giving the user the ability to retrieve old files for up to 4 days after they have deleted them.
Any directory on the /home filesystem contains a completely invisible directory named ".snapshot". This directory cannot be listed with ls or viewed by any program at all. Only the "cd" command can be used to enter this directory. Users of /home directories may retrieve files from these snapshots by using the "cd" command and copying files from the appropriate snapshot to any location they would like.
Note: Any user, from any home directory, can access the snapshots of that directory to restore files.
For example:
[cecombs@sites test]$ pwd
/home/rccstaff/cecombs/test
[cecombs@sites ~]$ cd test
[cecombs@sites test]$ ls
Main.java
[cecombs@sites test]$ rm -rf Main.java
[cecombs@sites test]$ cd .snapshot
[cecombs@sites .snapshot]$ ls
2013.04.16.00.00.01.daily 2013.04.17.00.00.01.daily 2013.04.18.00.00.01.daily
[cecombs@sites .snapshot]$ cd 2013.04.18.00.00.01.daily/
[cecombs@sites 2013.04.18.00.00.01.daily]$ ls
Main.java
[cecombs@sites 2013.04.18.00.00.01.daily]$ cp Main.java /home/rccstaff/cecombs/test
[cecombs@sites 2013.04.18.00.00.01.daily]$ cd /home/rccstaff/cecombs/test
[cecombs@sites test]$ ls
Main.java
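The manual restore steps above could be wrapped in a small helper. The following is only a sketch: the function name restore_from_snapshot is hypothetical and is not a tool provided by the GACRC. It assumes that the contents of ".snapshot" can be listed once the path is named explicitly (as the "ls" inside ".snapshot" in the session above suggests) and that snapshot names sort chronologically, as the timestamped names shown above do.

```shell
#!/bin/bash
# restore_from_snapshot DIR FILE   (hypothetical helper, not a GACRC command)
# Copies FILE from the newest snapshot under DIR/.snapshot that contains it
# back into DIR. Assumes snapshot directory names sort chronologically.
restore_from_snapshot() {
    local dir="$1" file="$2" snap latest=""

    # The glob expands in sorted order, so the last match is the newest snapshot.
    for snap in "$dir"/.snapshot/*/; do
        if [ -e "$snap$file" ]; then
            latest="$snap"
        fi
    done

    if [ -z "$latest" ]; then
        echo "no snapshot of $dir contains $file" >&2
        return 1
    fi

    # -p preserves the original timestamps and permissions of the restored file.
    cp -p "$latest$file" "$dir/$file"
}
```

For the session above, the equivalent call would be: restore_from_snapshot /home/rccstaff/cecombs/test Main.java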
Scratch
Scratch directories are for dynamic data. Scratch space is specifically designed to handle datasets that grow and shrink on demand. The GACRC does not and cannot snapshot scratch directories, because the amount of data that changes between snapshots is too great and snapshots would only slow the file systems down.
eScratch
eScratch directories are for ephemeral datasets, commonly the output of large calculations that needs to be stored in a temporary place for a short period of time. Any user can make escratch directories for their work. Ephemeral scratch directories on the GACRC cluster reside on a Panasas ActiveStor 12 storage cluster.
Making an eScratch Directory
Researchers who need to use scratch space can type
make_escratch
and a sub-directory will be created, and the user will be told the path to the sub-directory, e.g. /panfs/pstor.storage/escratch1/jsmith_Oct_22. The life span of the directory will be one week longer than the longest duration queue, which is currently 30 days (i.e., life span = 37 days). At that time, the directory and its contents will be deleted. Users can create one escratch directory per day if needed. The total space a user can use on scratch (all scratch directories combined) is 4TB. The scratch directories are not backed up.
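To make the life span arithmetic concrete, the sketch below computes the deletion date for a directory created on a given day. It is an illustration only: escratch_expiry is a hypothetical function name, the 30-day and 7-day figures are the ones quoted above and may change, and the "-d" option assumes GNU date.

```shell
#!/bin/bash
# escratch_expiry YYYY-MM-DD   (hypothetical helper, assumes GNU date)
# Prints the date on which an escratch directory created on the given day
# would reach the end of its life span.
escratch_expiry() {
    local created="$1"
    local longest_queue_days=30   # longest duration queue, per the text above
    local grace_days=7            # one week longer than the longest queue
    local lifespan=$(( longest_queue_days + grace_days ))   # 37 days

    # -u avoids daylight-saving shifts affecting the date arithmetic.
    date -u -d "$created +$lifespan days" +%F
}
```

For example, escratch_expiry 2013-04-18 prints 2013-05-25.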
lscratch
lscratch stands for local scratch and is available on every node in the zcluster.
lscratch information
All /lscratch filesystems on every node have these properties:
- lscratch is by far the fastest filesystem at the GACRC; however, the lscratch directory is only available on the node that a job gets scheduled to.
- The lscratch filesystem resides on the local hard drive of the node.
- It occupies the disk space left over after the OS is installed.
- /lscratch sizes vary, because nodes have different-sized disks.
- Not accessible from other nodes
- Every user has a directory on every node, /lscratch/<username>
lscratch Guidelines
This is a list of guidelines for /lscratch usage:
- Do not count on any lscratch sizes above 10GB unless you know the size of the local hard drive and target that node specifically (e.g., qsub -l h=compute-15-36)
- You will be responsible for migrating your data from the node after your job finishes. The job itself can transfer the data.
- Make sure that your output goes to: /lscratch/<username> (e.g., /lscratch/cecombs)
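The guidelines above can be sketched as two helpers for use inside a job script. This is a minimal sketch under stated assumptions, not an official GACRC template: the function names has_free_kb and run_in_lscratch are hypothetical, and the sketch assumes bash plus the standard df, awk, mktemp, and cp utilities on the compute nodes.

```shell
#!/bin/bash
# Hypothetical helpers for a job script; not provided by the GACRC.

# has_free_kb DIR KB
# Succeeds if the filesystem holding DIR has at least KB kilobytes available.
has_free_kb() {
    local avail
    # df -P gives one POSIX-format line per filesystem; column 4 is "Available".
    avail="$(df -Pk "$1" | awk 'NR==2 {print $4}')"
    [ "$avail" -ge "$2" ]
}

# run_in_lscratch SCRATCH_ROOT DEST COMMAND [ARGS...]
# Runs COMMAND inside a fresh directory under SCRATCH_ROOT, then copies
# everything it produced to DEST and removes the scratch copy.
run_in_lscratch() {
    local scratch_root="$1" dest="$2"
    shift 2
    local workdir status

    workdir="$(mktemp -d "$scratch_root/job.XXXXXX")" || return 1

    # Run the command in a subshell so the caller's working directory is unchanged.
    ( cd "$workdir" && "$@" )
    status=$?

    # Migrate the data off the node; keep the scratch copy if the copy fails.
    mkdir -p "$dest" && cp -r "$workdir"/. "$dest"/ || return 1
    rm -rf "$workdir"
    return "$status"
}
```

A job might call has_free_kb "/lscratch/$USER" 10485760 (10GB expressed in KB) before starting, and then run_in_lscratch "/lscratch/$USER" "$HOME/results" followed by the absolute path of the program to run. The command executes from inside the scratch directory, so relative paths in the command are resolved there.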
Quotas
To see how much space you are consuming on the home and scratch file systems, please use the command
quota_rep
Overflow/Archival Storage
Some labs also subscribe to archival storage space, which is mounted on the zcluster login node and on the copy nodes as /oflow (note that /oflow is not mounted on the compute nodes). The archival storage system is for long-term storage of large, static datasets.
This filesystem is snapshotted. The snapshots are available only from the mount point, under the hidden ".zfs" directory (e.g., /oflow/jlmlab/.zfs).
Please contact the GACRC staff to request Overflow storage.