Borg backup
Introduction
Borg is a deduplicating backup program. It also supports compression with several algorithms, including lz4, zstd, zlib, and lzma.
Borg is installed on all xfer nodes, and you can use it to archive your files. If you have a large amount of data, deduplication and compression can reduce the storage your archives need. This is especially true for large genomic datasets.
Example: archiving a Lustre project directory
Initializing a Borg repository
Borg stores backup data in a repository, which is a special directory in your filesystem. You must initialize a repository before writing backup data to it. You can create the repository on any filesystem; in this example I am storing it in my project filesystem. Run the following command to initialize the repository. The --encryption none option creates the repository without encryption.
$ borg init --encryption none /project/gclab/raj76/my_project
If successful, the command prints nothing.
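Because a successful init is silent, you may want to confirm that the repository is usable before backing up to it. The following is a minimal sketch, assuming the repository path from the example above; check_repo is a hypothetical helper name, not part of Borg. It relies on borg list, which lists the archives in a repository and exits with a non-zero status if the path is not a valid repository. A freshly initialized repository has no archives, so nothing is listed.

```shell
#!/bin/sh
# Repository path from the example above; replace with your own.
REPO=/project/gclab/raj76/my_project

# check_repo is a hypothetical helper, not a borg command.
check_repo() {
    if ! command -v borg >/dev/null 2>&1; then
        echo "borg not found in PATH" >&2
        return 127
    fi
    # 'borg list' fails with a non-zero status if $1 is not a valid repository.
    borg list "$1"
}

if check_repo "$REPO"; then
    echo "repository looks usable: $REPO"
else
    echo "repository check failed for: $REPO" >&2
fi
```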
The next step is to create a backup in the repository. Run the following command to create a backup with deduplication and fast (lz4) compression.
$ borg create -s --compression auto,lz4 /project/gclab/raj76/my_project::lustre1-{now} /lustre1/raj76/my_project
In the above command, the part after :: (lustre1-{now}) is the name of the archive that this backup creates. The {now} placeholder tells Borg to use the current timestamp as part of the archive name, which makes archives easier to identify later. The -s option prints statistics about the archive once it has been created, and --compression auto,lz4 applies lz4 only to data that a quick heuristic judges to be compressible. Run the command in a screen or tmux session, because it can take a while for large datasets.
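One way to run the backup inside tmux is sketched below, assuming the paths from the example above. The session name and the log file location are hypothetical choices, not site conventions. The log file is used because a detached tmux session closes when its command exits, which would otherwise discard the statistics printed by -s.

```shell
#!/bin/sh
# Paths from the example above; replace with your own.
REPO=/project/gclab/raj76/my_project
SRC=/lustre1/raj76/my_project
SESSION=borg-backup              # hypothetical tmux session name
LOG="${HOME}/borg-backup.log"    # hypothetical log file for borg's output

# {now} is passed to borg unchanged; borg, not the shell, fills in the timestamp.
# Both stdout and stderr go to the log so the -s statistics are kept.
CMD="borg create -s --compression auto,lz4 '${REPO}::lustre1-{now}' '${SRC}' > '${LOG}' 2>&1"

if command -v tmux >/dev/null 2>&1 && command -v borg >/dev/null 2>&1; then
    # Start a detached session that keeps running after you log out.
    tmux new-session -d -s "$SESSION" "$CMD"
    echo "Backup started in tmux session '$SESSION'; output goes to $LOG"
else
    echo "tmux or borg not found; the command would be:"
    echo "$CMD"
fi
```

While the backup is running you can reattach with tmux attach -t borg-backup. Once it has finished, borg list /project/gclab/raj76/my_project shows the archives in the repository, including the timestamped name produced by {now}.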