Policies: Difference between revisions
Revision as of 12:29, 13 January 2020
Introduction to GACRC Policies
The following policies are subject to revision, especially as the GACRC grows in scope and services. Your comments and questions will be useful to our policy formulation and refinement and are actively solicited by the GACRC Advisory Committee.
The GACRC computational infrastructure, including its servers, clusters, data stores, and other related devices, is for the exclusive use of authorized users.
Anyone using these systems expressly consents to abide by the policies of the University of Georgia and the Georgia Advanced Computing Resource Center and, accordingly, is subject to account termination and/or immediate disconnection from GACRC resources.
GACRC Resource Usage
The computational resources of the Georgia Advanced Computing Resource Center are to be used in direct support of research programs at the University of Georgia. Support is also provided for classes that teach computational methods and provide training for high performance computing. The GACRC reserves the right to restrict access to its resources for course work if such work is deemed to negatively affect authorized research activities.
GACRC policies supplement UGA’s Policies on the Use of Computers, found at: http://eits.uga.edu/access_and_security/infosec/pols_regs/policies/aup
GACRC Eligibility and Access
Access to and use of the computing facilities managed by the Georgia Advanced Computing Resource Center are limited to persons affiliated with the University of Georgia or associated with research projects sponsored by UGA.
Affiliation in this context means faculty, research staff and supervised students of the University of Georgia. Faculty includes persons holding permanent or temporary appointments as well as adjunct faculty, instructors and visiting faculty while in residence at the University. It also includes those persons with faculty status such as research associates, research scientists, post-doctoral researchers and academic and service professionals. Staff includes all those non-faculty persons employed directly by the University in a research-support role. Graduate and undergraduate students who are members of faculty research labs are eligible for accounts as well. For directly affiliated users, accounts on the GACRC computers will remain active as long as the individuals hold the above status.
Access by non-UGA researchers and their students, affiliated to higher-education institutions or non-profit research organizations, for work on research projects conducted in collaboration with UGA Faculty is possible under the guidelines established by the Office of the Vice President for Research and the Office of International Education. A request for access can be forwarded to the GACRC by the UGA Faculty, providing details of the collaboration on a joint research project. Such Affiliate users will be considered part of the UGA Faculty’s group and will be under the Faculty’s responsibility. Affiliate users’ access will be granted for a fixed period of time, according to the expected length of the collaborative project. Renewal of affiliate accounts will be required annually.
All accounts will remain active no more than 30 days following a status change (i.e., leaving the university). Graduate instructional accounts will only remain active for the duration of the semester in which they are actually needed. HOME and PROJECT directories will be archived for at least 90 days, but no longer than 180 days after an account becomes inactive.
Requests for access by individuals other than those listed above should be directed to the Director of the Center for consideration with the GACRC Advisory Committee.
Access will be granted to a specific GACRC resource after appropriate training is undertaken with GACRC staff. Existing access to other GACRC resources is not a sufficient criterion for access to a new resource. No exceptions will be made to the training requirement.
GACRC Identity Management
The procedures for validating the identity of account users are described below.
UGA Users and Faculty Lab Groups
- A UGA Faculty must first establish a GACRC group account using the instructions provided on the GACRC website (http://gacrc.uga.edu/accounts). The UGA Faculty may choose whether or not to obtain a GACRC user account affiliated with his/her group account.
- All directly affiliated persons wanting an account must apply for access to the GACRC using the instructions provided on the GACRC website (http://gacrc.uga.edu/accounts). The applicant must authenticate to the form using his/her MyID. The applicant must specify to which group he/she belongs. Verification with the UGA Faculty responsible for the group’s account will be used in case identity or affiliation needs to be verified.
- Upon acceptance of the application, the user will be notified via e-mail. The applicant’s UGA MyID and password will be used to log into the requested GACRC resources.
Affiliate Users
- A recognized Affiliate user must be sponsored by a UGA Faculty member through an established GACRC group account. The UGA Faculty involved in an established collaboration with the Affiliate user must apply on behalf of the applicant by contacting the GACRC staff.
- A request will be made by the GACRC to EITS to allocate to the Affiliate a UGA MyID.
- Upon acceptance of the Affiliate user application, the Affiliate will be notified via e-mail. The Affiliate’s UGA MyID and password will be used to log into the requested GACRC resources.
Protection of Passwords
As described in UGA’s Password Policy, an account holder must never divulge their MyID and password to a third party. Only authorized account holders may access the resources of the GACRC. If a third party is found to be using an account holder’s login, with or without the permission of the account holder, the account holder’s access privileges may be revoked at the sole discretion of the GACRC Manager or Director. Enforcement of this policy is the responsibility of the Office of the Vice President for Information Technology’s Division of Information Security.
More information is found at the following EITS website:
http://eits.uga.edu/access_and_security/infosec/pols_regs/policies/passwords/
GACRC Storage Usage
Some working definitions
Snapshot - Copies of files that are stored on the same storage system as the original files. Snapshots are primarily used to recover files that have been accidentally deleted or corrupted within the recent past. Users are able to manage the file recovery tasks. Snapshots are not maintained beyond a defined rotation schedule, i.e., some number of hourly, daily, weekly, and monthly snapshots are kept on the storage system.
Backup - Copies of files and/or snapshots kept on a storage system (disk/tape) other than the one that the original files reside on. Backups are primarily used to recover files following a catastrophic failure of the original file or storage system. Backups require administrators to perform file system recovery tasks. Like snapshots, backups have a defined rotation schedule.
Archive - Copies of files that are not currently being accessed, on a resilient storage system dedicated to reliable long-term storage. Archives can be tape-based or disk-based, and typically part of a disaster recovery plan. The files may be copies of original data which is stored elsewhere (individual groups having their own copies), or the archive storage system may be fed by a dedicated "backup" storage system.
Active Projects - Projects that have on-going computational work being performed with files that are regularly created, accessed or modified.
Policy Statement for SCRATCH File System
The SCRATCH file system resides on a high-performance storage device and is to be used solely for temporary storage of files in use by actively running compute jobs. Files are to be removed from SCRATCH when a job completes; for example, they can be copied to the PROJECT file system. The SCRATCH file system is not backed up in any way and no snapshots are taken. The SCRATCH file system is mounted under /scratch on all the compute nodes, login nodes and data transfer nodes.
Any file that has not been accessed or modified by a compute job within the past 30 days will be automatically deleted from the SCRATCH file system. Measures circumventing this policy will be monitored and actively discouraged.
There is no storage size quota for SCRATCH usage. Space is only limited by the physical size of the scratch space being used. If usage across the entire file system is more than 80% of total capacity, the GACRC will take additional measures to reduce usage to a more suitable level. Among possible actions, the GACRC may request or force users to clean up their SCRATCH directories, or temporarily reduce the 30-day limit to a lower limit.
In order to help users identify old files in their /scratch directory, we generate one file per user every morning. This file, /usr/local/var/lustre_stats/$USER.over30d.files.lst (where $USER refers to the user's UGA MyID), lists all the files in the user's /scratch directory that have not been accessed in over 30 days, giving the full path, the last accessed date, and the size of each file. Because the /scratch file system only updates on-disk file access times every 5 days, the last accessed time reported in that list file is approximate and could be off by up to 5 days. The purging system accounts for this and will make sure the access time falls within the purge window before deleting the file.
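For users who want to check a directory themselves between the daily reports, the same idea can be sketched in Python. This is only an illustration and not the GACRC's purge tool: the function name `find_stale_files` is hypothetical, and it reads the access time reported by `os.stat`, which on /scratch may lag by up to 5 days as described above.

```python
import os
import time

PURGE_AGE_DAYS = 30          # purge window stated in the policy
SECONDS_PER_DAY = 86400


def find_stale_files(root, max_age_days=PURGE_AGE_DAYS, now=None):
    """Return (path, days_since_access, size_bytes) for each file under
    root whose last access time is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)  # do not follow symbolic links
            except OSError:
                continue               # file vanished or is unreadable
            if info.st_atime < cutoff:
                age_days = (now - info.st_atime) / SECONDS_PER_DAY
                stale.append((path, age_days, info.st_size))
    return stale
```

Calling `find_stale_files` on a directory under /scratch returns the files that are candidates for purging; the exact directory layout under /scratch is not specified here, so the caller supplies the path.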
Policy Statement for WORK File System
The WORK file system resides on a high-performance storage device and is to be used for storing files that are frequently used by the group for computation. The WORK file system is NOT subject to the 30-day purge policy. The filesystem usage is controlled using a quota on the size and number of files that can be stored in a lab group's WORK area. Initially each group is given a 500GB and 100,000-file quota. The WORK file system is not backed up in any way and no snapshots are taken. The WORK filesystem is mounted under /work on all the compute nodes, login nodes and data transfer nodes. Each lab group has a directory under the /work directory.
Although the WORK file system is exempt from the 30-day purge, if space consumption on the storage appliance becomes too high, we reserve the right to ask users to clean up their WORK area. If users do not respond in a timely fashion, we will purge files, beginning with the oldest ones. Please do not use the WORK area to store files long term.
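A group can estimate how close it is to the initial WORK quota with a short script. The sketch below is an approximation under stated assumptions: it treats 500GB as 500 × 10^9 bytes and counts only regular directory entries, whereas the file system's own quota accounting may count blocks and directories differently. The names `work_usage` and `quota_report` are hypothetical.

```python
import os

WORK_QUOTA_BYTES = 500 * 10**9   # 500GB initial quota (decimal GB assumed)
WORK_QUOTA_FILES = 100_000       # initial file-count quota


def work_usage(root):
    """Walk root and return (total_bytes, total_files)."""
    total_bytes = 0
    total_files = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total_bytes += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue  # file vanished or is unreadable
            total_files += 1
    return total_bytes, total_files


def quota_report(total_bytes, total_files,
                 quota_bytes=WORK_QUOTA_BYTES, quota_files=WORK_QUOTA_FILES):
    """Express usage as a percentage of each quota."""
    return {
        "bytes_used_pct": 100 * total_bytes / quota_bytes,
        "files_used_pct": 100 * total_files / quota_files,
        "over_quota": total_bytes > quota_bytes or total_files > quota_files,
    }
```

For example, `quota_report(*work_usage(path))` with `path` set to the group's directory under /work gives a rough picture of both limits at once.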
Policy Statement for HOME File System
The HOME file system resides on a high-performance storage device and is used for long-term storage of files, typically programs and scripts, needed for analysis on the GACRC computing cluster.
All users have 100GB allocated for their HOME usage. Groups may request a separate 100GB allocation for a directory under /usr/local/lab/, for shared use of common applications, libraries, and scripts.
HOME directories will have daily, weekly and up to 3 monthly snapshots kept on the same storage unit to protect against accidental file deletion. Currently, the GACRC is not able to perform any backup of the HOME file system onto another storage device. Users are strongly encouraged to make their own copies of critical files, while accepting any risks associated with HOME usage. Appropriate communications will take place once a backup service is enabled.
Snapshot retention, data purge and quota allocation policies are subject to change based on available storage capacity, users’ demand, equipment condition and availability, as well as any other conditions that might affect the provision of the HOME service.
Policy Statement for PROJECT File System
The PROJECT file system resides on lower-performance/higher-capacity storage devices, accessible by all GACRC login and data transfer nodes. PROJECT will not be accessible on Sapelo2's compute nodes. This space is to be used by groups for storage of active projects using Sapelo2. PROJECT should not be seen as a long-term repository, as it is not designed as such. Once a project is completed, data should be moved from the PROJECT space to user-managed storage, freeing up capacity for the next active project.
Access to the PROJECT file system is not supported through NFS to a destination outside of the Boyd Data Center, or through the use of the Samba or CIFS protocols. Transfer protocols available through the data transfer nodes include secure ftp, scp, rsync, and GridFTP, amongst others.
More info is found at https://wiki.gacrc.uga.edu/wiki/Transferring_Files.
Each group can request a PROJECT volume with an initial 1TB allocation, accessible by all users ascribed to the group, where the sharing of files will be enabled. Users are encouraged to consider their PROJECT space as the primary area to transfer compute job inputs/outputs. Additional space can be requested by a Faculty on behalf of his/her group, in increments of 1TB.
The GACRC reserves the right to establish a cost-recovery rate for PROJECT storage beyond the initial 1TB allocation. Appropriate communications will take place in such an event.
PROJECT directories will have daily, weekly and up to 3 monthly snapshots kept on the same storage unit to protect against accidental file deletion. Currently, the GACRC is not able to perform any backup of the PROJECT file system onto another storage device. Users are strongly encouraged to make their own copies of critical files, while accepting any risks associated with PROJECT usage. Appropriate communications will take place once a backup service is enabled.
Snapshot retention, data purge and quota allocation policies are subject to change based on available storage capacity, users’ demand, equipment condition and availability, as well as any other conditions that might affect the provision of the PROJECT service.
GACRC Software Policy
The GACRC maintains a collection of program libraries and software packages to support research computing activities across diverse research domains. While a user can install a software package in their own environment, for the sake of general access across groups, and an appropriate deployment with current libraries, compilers and other dependencies, we strongly recommend that GACRC staff be asked to perform the installation or upgrade.
Any software that requires a signed license or contract, even if it is a click-through agreement, must absolutely be reviewed and handled by the Office of Legal Affairs before being signed by an appropriate signature authority. After the license or contract is accepted and the software is made available, GACRC users must fully comply and use the software in a way that does not violate any terms of the license or contract. Further information on licensing issues can be found at the following EITS website:
http://eits.uga.edu/access_and_security/infosec/pols_regs/policies/aup/eula
As a matter of policy, the GACRC will not purchase any commercial software for the use of a single group or a small number of groups. Commercial software currently purchased and maintained by the GACRC is of general interest and applicability to the whole UGA research community. The GACRC will, however, install and maintain group-purchased commercial software that complies with the above comments on licenses and contracts.
Security
To minimize disruption of service, protect data integrity, conserve facility resources and maximize the effectiveness of staff support, the GACRC maintains strict security requirements for access to GACRC resources. Over time, the enforcement of these requirements will become increasingly strict, with the goal of preventing any access to the GACRC resources by any person or any device that is not in strict compliance with these requirements.
Operating Systems
Any computer accessing the GACRC for any purpose must run a currently supported operating system, updated to the latest version and update (patch) levels.
Anti-Virus Software
Any computer accessing the GACRC for any purpose must meet minimum levels of anti-virus protection. Any computer used by an account holder must have anti-virus software from a source approved by UGA’s Office of Information Security, must have that virus protection activated, and must have automatic updates activated.
More information can be found at the following EITS website:
http://eits.uga.edu/access_and_security/infosec/protect_your_computer
Suspiciously Behaving Software
Any software that behaves in a suspicious manner may at any time be terminated and/or deleted from GACRC resources at the sole discretion of the GACRC’s systems administrator(s), manager, Director, or EITS information security staff.
Suspiciously Behaving Networks and Devices
Any connection from any device to the GACRC may be terminated at any time, if the device or the connection or a network to which the device is attached appears to be not in compliance with UGA’s security requirements, is behaving suspiciously, or if a threat emerges requiring termination for intrusion prevention at the sole discretion of the GACRC’s systems administrator(s), manager, Director, or EITS information security staff.
Account Holder Responsibility
The account holder is responsible for diligently monitoring their account and compliance with the GACRC’s operating system, intrusion and virus protection standards. The account holder will be duly notified if GACRC personnel determine that minimum security requirements are not met. Specific actions will be requested of the account holder and compliance to these will be expected in a timely fashion. An account holder’s privileges to use GACRC facilities may be terminated by the GACRC Manager or Director at any time, without notice if, in the opinion of either, the account holder is reluctant or averse to practicing diligence in meeting the GACRC’s minimum requirements for intrusion and/or anti-viral protection.
Resolving Disagreements about Revocation of Privileges or Provisioning Resources
The Director of the Georgia Advanced Computing Resource Center has full authority to revoke a user's privileges or deny the request of a new resource allocation. The decision to revoke a user's privileges will be based on, but not limited to, abuses of the UGA Policies on the Use of Computers and/or abuses of the UGA Password Policy.
If an account holder is denied a request for provisioning of GACRC resources or resource privileges are revoked, the account holder’s Department Head may appeal to the Vice President for Research and the Vice President for Information Technology. Their decision will be informed by the Director of the GACRC, the Chief Technology Officer as well as the Associate Chief Information Officer for Information Security. The decision of the Vice President for Research and Vice President for Information Technology is final.
System Maintenance and Downtime
Planned Maintenance
The GACRC has instituted monthly maintenance windows in order to perform maintenance operations that require system operations to be reduced or interrupted.
The schedule will be as follows:
- The last Wednesday of each month from 10AM to 4PM will be reserved for partial cluster maintenance.
- Twice a year, a two-day shut-down of GACRC services will be scheduled for more complex maintenance operations. These will occur on the last Tuesday and Wednesday of the months of January and July.
These maintenance windows represent periods when the GACRC may choose to drain the queues of running jobs and suspend access to the Sapelo2 cluster, as well as storage devices for maintenance purposes. Interruptions will be kept as brief as possible.
The GACRC will notify all users at least 10 days in advance that a maintenance window will be in effect. The notification will describe the nature and extent (partial or full) of the interruptions of cluster and/or storage services. If a maintenance window has to be extended for unavoidable technical reasons, users will be informed accordingly.
The impact of the outages will vary, and the GACRC will do its best to preserve pending and running jobs, which is often possible. Nevertheless, users will need to plan their job submissions around the maintenance windows.
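To plan job submissions around the monthly window, the date of the last Wednesday of each month can be computed directly. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the actual dates announced by the GACRC always take precedence over anything computed this way.

```python
import calendar
from datetime import date, timedelta

WEDNESDAY = 2  # date.weekday(): Monday is 0, Wednesday is 2


def last_weekday_of_month(year, month, weekday):
    """Return the date of the last given weekday in the given month."""
    last_day = calendar.monthrange(year, month)[1]
    d = date(year, month, last_day)
    # Step back from the end of the month to the requested weekday.
    return d - timedelta(days=(d.weekday() - weekday) % 7)


def monthly_maintenance_dates(year):
    """Last Wednesday of each month of the given year."""
    return [last_weekday_of_month(year, m, WEDNESDAY) for m in range(1, 13)]


def latest_notice_date(window_start, notice_days=10):
    """Latest date on which a notice still gives the stated lead time."""
    return window_start - timedelta(days=notice_days)
```

A job whose requested wall time would run past 10AM on one of these dates may be affected by the window, so it is worth checking before submitting long jobs late in the month.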
Unplanned Maintenance and System Outage
From time to time, hardware, software, and/or environmental factors may cause a system or subsystem to malfunction, causing disruption to service. Also, there may be circumstances or events related to possible security issues or intrusions which will cause GACRC staff to take systems offline while the nature of the apparent breach is analyzed and appropriate action is taken.
Whenever possible, account holders will be notified by e-mail of these outages in advance, but that may not always be possible. Account holders will be notified by e-mail if the disruption should last more than 30 minutes.
GACRC staff will strive to preserve the work and/or prevent disruption of jobs in process during such outages. However, there may be circumstances which cause disruption of jobs and loss of data. Users are encouraged to implement methods in their code which minimize the effect of unplanned interruption of a job’s execution, such as checkpoints. Users are also strongly encouraged to maintain copies of files of importance.
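As an illustration of the checkpointing mentioned above, the sketch below periodically saves a job's state to a file and resumes from it on restart. It is a generic pattern and not a GACRC-provided facility: the state layout, the function names, and the placeholder computation are all hypothetical, and real applications will usually rely on the checkpoint support of their own software.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically: write a temporary file in the same
    directory, then rename it over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
        handle.flush()
        os.fsync(handle.fileno())
    os.replace(tmp_path, path)


def load_checkpoint(path, default):
    """Return the saved state, or default if no checkpoint exists."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return default


def run(path, total_steps, checkpoint_every=100):
    """Run a placeholder computation, resuming from the last checkpoint."""
    state = load_checkpoint(path, {"step": 0, "total": 0})
    while state["step"] < total_steps:
        state["total"] += state["step"]   # stand-in for real work
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state
```

Writing to a temporary file and renaming it means an interruption during the save leaves the previous checkpoint intact. The checkpoint file should live on a file system suited to the job's lifetime, keeping in mind the purge and quota policies described earlier.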
Regulatory Compliance
The GACRC as an infrastructure and service provider does NOT currently warrant that its practices or facilities meet government-mandated requirements for the storage and protection of sensitive, private or classified information. Users may not store such information on GACRC facilities. In other words, data that falls under HIPAA, FERPA, FISMA or similar regulatory requirements, may not be stored, computed against or otherwise transacted through, or with, GACRC infrastructure.
The GACRC and its users must comply with all existing Federal export control regulations for services and infrastructure. Research groups must agree to NOT install or use any software or data that falls under Export Control regulations. More information on the subject of Export Control is available at the following OVPR website:
http://research.uga.edu/export-control/
Copyrighted materials are prohibited without proper authorization. Additionally, illegal content is prohibited.
Non-compliance with any such Federal requirements might impact GACRC operations or delivery of services and could place the GACRC and UGA at risk. If a research group is found to be in non-compliance, then account access will be immediately suspended, while an investigation by EITS’s Information Security division is instigated.
Disclosure
Research groups that are involved in activities that store protected data on GACRC infrastructure must immediately contact the GACRC Director in order to address the issue. Depending on circumstances, accommodations might be possible for such activities.
Research Data Compliance
Research Data Management is a critical factor in both obtaining federal funding from the NSF, NIH, DoD and other federal agencies, and in the conduct of funded research. Responsibility in maintaining and preserving research data, as detailed in data management plans submitted to Federal funding agencies, is entirely placed upon the research faculty, post docs, and graduate students conducting the research. The GACRC will help by providing information and assistance, but will not be responsible to ensure compliance with a project’s data management plan.
During the phase of proposal writing, arrangements can be discussed and agreed upon as to the GACRC playing an active role, and ensuring the provision of specific services towards the compliance of a data management plan. Depending on the complexity or the nature of the proposed services, the GACRC might require the purchase of specific hardware/software and/or the availability of a %FTE salary and benefits.
More information on data management plans can be found here.