Globus

From Research Computing Center Wiki
Revision as of 21:51, 29 September 2021 by Shtsai

Introduction

The GACRC, on behalf of UGA, recently procured an institutional standard subscription to Globus for secure, reliable management of UGA's research data. Globus is a high-performance data-transfer platform that allows you to perform and/or automate:

  • Data transfers between servers in your group.
  • Data transfers between a server and your laptop.
  • Sharing data with researchers at other institutions.
  • Sharing data with the world.

Transfers run unattended and are faster than SCP/SFTP. Data verification is on by default, and a transfer that is disrupted restarts or continues automatically.
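To make the idea of data verification concrete, here is a minimal sketch of a checksum comparison between a source file and its copy. Globus performs its own check; this page does not say which checksum algorithm it uses, so SHA-256 is an assumption for illustration, and the function names `file_sha256` and `same_content` are hypothetical helpers.

```python
import hashlib


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(source_path, copy_path):
    """Return True when both files have identical checksums."""
    return file_sha256(source_path) == file_sha256(copy_path)
```

A check like this can be run by hand on a file copied with another tool; with Globus the equivalent verification happens as part of the transfer.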



Getting Started

If you are a first-time user of Globus, you will need to create an Identity Account. At a minimum, you will need to set up your identity using the University of Georgia organizational login in order to access UGA systems.

  • Go to https://www.globus.org and choose Login in the upper right corner.
  • Search for University of Georgia in the "Use your existing organizational login" box.
  • Choose Continue and you will be forwarded to a UGA Single Sign-On (SSO) login page. You will also need to authenticate with Duo (two-factor authentication).


When you log in for the first time using an existing organizational login associated with UGA:

  • Globus will ask if you would like to link to an existing account. If you have already used another account with Globus in the past, you can choose "Link to an existing account". Otherwise, click "Continue" to proceed.
  • You will need to accept Globus Terms of Service and Privacy Policy and click Continue to proceed.
  • You will need to give Globus permission to use your identity to access information and perform actions (like file transfers) on your behalf.

You will not be prompted for these three steps after your first login.

Detailed information with screenshots is provided by Globus on their Getting Started page.




Access GACRC Storage

GACRC maintains a UGA GACRC Collection that can be used to access the Sapelo2 /home, /scratch, /work, and /project file systems.

After you have logged in to Globus, click the File Manager link at the top-left of the window.

In the Collection search box, enter GACRC and you should see UGA GACRC Collection in the list.

Globus-UGA-GACRC-Collection.png


Select the UGA GACRC Collection and authenticate with SSO to access your files on Sapelo2. By default, you will open your home directory on Sapelo2. For example:


Globus-sapelo2-home.png

To access your /scratch or /project areas, enter the full path in the Path field under the Collection name (for example, enter /project/abclab) and then press the Enter or Return key on your keyboard for the change in the path to take effect.
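As a rough sanity check before typing a path into the Path field, the sketch below normalizes a path and confirms that it falls under one of the Sapelo2 file systems listed above. The helper name `normalize_collection_path` is hypothetical, and the check only looks at the path's prefix; it does not confirm that the directory exists or that you have access to it.

```python
import posixpath

# File systems this page lists as reachable through the UGA GACRC Collection.
GACRC_ROOTS = ("/home", "/scratch", "/work", "/project")


def normalize_collection_path(path):
    """Return a cleaned absolute path, or raise ValueError if it is not
    a full path under one of the listed file systems."""
    if not path.startswith("/"):
        raise ValueError(f"not a full path: {path!r}")
    # Collapse repeated slashes, '.' and '..' segments, and trailing slashes.
    clean = posixpath.normpath("/" + path.lstrip("/"))
    for root in GACRC_ROOTS:
        if clean == root or clean.startswith(root + "/"):
            return clean
    raise ValueError(f"outside the listed file systems: {path!r}")
```

For example, `normalize_collection_path("/project/abclab/")` returns `/project/abclab`, while a relative path such as `project/abclab` is rejected.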



Transfer Data Between GACRC Storage and Desktops/Laptops

There are two ways to use Globus to transfer files between GACRC storage and desktops or laptops. The best method depends on the number and size of the files to be transferred. In both cases, use a browser to log in to https://www.globus.org, as described above, and open the UGA GACRC Collection in the File Manager panel.
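To get a feel for which method fits a given folder, the sketch below counts the files and total bytes under a local directory and suggests one of the two methods. This page gives no numeric cut-offs, so the thresholds `max_files` and `max_bytes` are hypothetical defaults to adjust to your own experience, and both function names are illustrative.

```python
import os


def summarize_tree(root):
    """Return (file_count, total_bytes) for all regular files under root."""
    file_count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            # Skip broken symlinks and other non-regular entries.
            if os.path.isfile(full_path):
                file_count += 1
                total_bytes += os.path.getsize(full_path)
    return file_count, total_bytes


def suggest_method(file_count, total_bytes,
                   max_files=20, max_bytes=500 * 1024 ** 2):
    """Suggest a transfer method; the thresholds are assumptions, not GACRC policy."""
    if file_count <= max_files and total_bytes <= max_bytes:
        return "browser Upload/Download"
    return "Globus Connect Personal"
```

Calling `suggest_method(*summarize_tree("results"))` on a hypothetical local folder named `results` returns one of the two method names as a string.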

Small number of files or small sizes downloaded or uploaded

Uploads - Once you open the UGA GACRC Collection in the File Manager panel, navigate to the destination path (e.g. your Sapelo2 /scratch directory) and select the "Upload" button to upload a file from your local machine to that path.

Downloads - Once you open the UGA GACRC Collection in the File Manager panel, navigate to the directory where the files are located and select the files you want to download. Then select the "Download" button to download the file(s) to your local machine.

Here is a sample screenshot showing how to download a file called analysis.txt from the user's home directory on Sapelo2 to the local machine:

Globus-sapelo2-download-file.png

You do not need to install Globus Connect Personal on your local machine to use the Download and Upload features, for example to transfer files from the UGA GACRC Collection to your local machine. Please note that not all Collections offer the Download and Upload features. If they are not available, follow the instructions in the Many files or large files section below.

Many files or large files

Install Globus Connect Personal (GCP) and create a Globus endpoint on your local machine. Globus Connect Personal allows faster and more reliable file transfers. Information on how to install GCP on your local machine and create an endpoint is available on the Globus Connect Personal page.

Once you have installed GCP and created an endpoint on your local machine, navigate to the UGA GACRC Collection in the File Manager panel, select the files or directories you wish to transfer, and click on the double panel icon (top right), as illustrated here:


Globus-UGA-GACRC-collection-filetransfer1.png


On the right panel, enter the name of the Collection on your endpoint (shtsai-imac in this example) and select the path to transfer the data to (or from). By default, file integrity is checked after the transfer completes. You can enable other options (e.g. file encryption on transfer) by opening the Transfer & Sync Options menu and selecting the options you wish to use. Once all the settings are chosen and you are ready to perform the file transfer, click the Start button. You can check the status of the transfer in the Activity panel.


Globus-UGA-GACRC-collection-filetransfer2.png


Transferring files between Sapelo2 file systems

To transfer files between two file systems on Sapelo2, navigate to the File Manager two-panel view and select UGA GACRC Collection on both panels. On one panel, enter the path to the location of the files (e.g. your /project directory); on the other panel, enter the location you want to transfer the files to (e.g. your /scratch directory). Select the files to transfer and click Start. You can check a report of the transfer in the Activity panel.

Example:

Globus-filetransfer-project-scratch.png
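For automating a transfer like this one, a minimal sketch using the Globus Python SDK (`globus-sdk`, a separate third-party package) might look like the following. Several things here are assumptions rather than GACRC instructions: the collection UUID is a placeholder (the real value is not given on this page), `build_items` and `submit_transfer` are hypothetical helper names, the access token is assumed to have been obtained through a Globus login flow with whatever consents the collection requires, and the `TransferData` keyword arguments should be checked against your installed SDK version.

```python
import posixpath

# Placeholder only: substitute the UUID of the UGA GACRC Collection,
# which is not listed on this page.
GACRC_COLLECTION_ID = "00000000-0000-0000-0000-000000000000"


def build_items(source_dir, dest_dir, names):
    """Pair each file name with its full source and destination path."""
    return [
        (posixpath.join(source_dir, name), posixpath.join(dest_dir, name))
        for name in names
    ]


def submit_transfer(access_token, items, label="project to scratch"):
    """Submit one transfer task within the collection and return its task ID."""
    import globus_sdk  # third-party; imported here so build_items works without it

    client = globus_sdk.TransferClient(
        authorizer=globus_sdk.AccessTokenAuthorizer(access_token)
    )
    task = globus_sdk.TransferData(
        source_endpoint=GACRC_COLLECTION_ID,
        destination_endpoint=GACRC_COLLECTION_ID,
        label=label,
        verify_checksum=True,  # request an integrity check after the copy
    )
    for source_path, dest_path in items:
        task.add_item(source_path, dest_path)
    response = client.submit_transfer(task)
    return response["task_id"]
```

With hypothetical paths, `build_items("/project/abclab/run1", "/scratch/myid/run1", ["reads.fq"])` yields the pair of paths that `submit_transfer` would hand to Globus. The status of the submitted task can then be followed in the Activity panel, as with transfers started in the browser.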




Access Storage Not Hosted by GACRC

Globus can be used to access, share, transfer, or manage data stored on devices outside of the GACRC. The steps described above can also be used to access and transfer data from any endpoint that has been shared with you (for example, by a collaborator at another institution). You can either search for an endpoint in the File Manager panel, or go to the Endpoints panel and select Shared with you to see a list of Collections shared with you.

You can transfer files into your Sapelo2 file systems by using the UGA GACRC Collection and selecting the appropriate directory (e.g. your /home, /scratch, or /project directories).




Sharing Data

Globus can be used to share data from your local machine or your GACRC /project area.

Sharing data from GACRC storage (/project)

You can allow collaborators to download or transfer files from your /project area into their local endpoints/collections. Your collaborators do not have to be associated with an institution that has a Globus subscription. They can log in to https://www.globus.org using a Globus ID, a Google account, an ORCID iD, or an institutional account if their institution has a Globus subscription. They will also need to install Globus Connect Personal on their desktop or laptop.

Steps to share a /project folder with a collaborator:

1. In the File Manager panel, open the UGA GACRC Project Sharing collection.

2. Navigate to the folder you wish to share (e.g. /abclab/shareddir) and select it.

3. Select Share and create a shared collection.

4. Add permissions to share the collection with your collaborators (they will need to have a Globus account, as indicated above).

Note that your collaborators will not be able to transfer files into your shared Collection on /project, even if you choose to add write permission while sharing the Collection.

If you would like your collaborators to transfer data into your GACRC storage for you to use on the cluster, please contact GACRC and we will set up a space that they can write to.

Sharing data from your desktop or laptop

If you need to share data from your desktop or laptop with research collaborators, add “University of Georgia” as your Globus Plus Sponsor in your Globus Account Settings. Please be sure your UGAMail address (yourMyID@uga.edu) is set as the contact email address in your profile. Note that this is not required for the normal Collections/Endpoints provided by GACRC.

  • Click the username menu at the top right.
  • Choose Globus Plus.
  • Choose get Globus Plus or, if you are already a Plus member with another institution, choose add another provider.
  • Find and select the radio button for “University of Georgia” in the list. GACRC will be notified of your request after you click Continue.

Once your request is approved, Globus Plus will allow you to create shared links to your own Globus Connect Personal client. These are also known as guest collections. Please note that if you allow write access to your Collection, your local hard drive can be inadvertently filled or files stored there could be deleted. Please see the Globus Connect Personal page for more information.

Sharing data from a multi-user system

If the data are stored (or will be stored) on a multi-user system, you can install Globus Connect Server and create an endpoint on this system. Please contact GACRC to request that your endpoint be associated with UGA's Globus subscription. Once your endpoint is listed in UGA's subscription, you will be able to share it.




Documentation

Official Globus Documentation

Key Resource List

How-to