Borg backup

From Research Computing Center Wiki
Revision as of 13:41, 17 September 2018 by Raj76

Introduction

The Borg backup software (https://borgbackup.readthedocs.io/en/stable/index.html) is a deduplicating backup program. It also supports compression with several codecs, such as lz4, zlib, and lzma.

Borg is installed on all xfer nodes, and you can use it to archive your files. If you have a large amount of data, deduplication and compression can reduce the storage it needs. This is especially true for large genomic datasets.

Example: archiving a Lustre project directory

Initializing a borg repository

Borg stores backup data in a repository, which is a special directory in your filesystem. You have to initialize a repository before writing backup data to it. You can create the repository on any filesystem. In this example I am storing the repository in my project filesystem. Run the following command to initialize the repository. The --encryption none option creates the repository without encryption.

$ borg init --encryption none /project/gclab/raj76/my_project

If successful, the command prints nothing.
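
Because a successful init is silent, it can help to confirm that the repository is readable before starting a long backup. The following is a minimal sketch, not part of the original procedure: check_repo is a hypothetical helper name, and it relies only on borg list, which prints one line per archive and prints nothing for a newly initialized repository.

```shell
#!/bin/sh
# Hypothetical helper (not part of borg): report whether a path is a
# readable Borg repository.
check_repo() {
    repo="$1"
    if ! command -v borg >/dev/null 2>&1; then
        echo "borg not found in PATH" >&2
        return 1
    fi
    # borg list prints one line per archive; a new repository prints nothing.
    # Only stdout is discarded, so any borg error message stays visible.
    if borg list "$repo" >/dev/null; then
        echo "repository OK: $repo"
    else
        echo "not a readable borg repository: $repo" >&2
        return 1
    fi
}

# Repository path taken from the example above.
check_repo /project/gclab/raj76/my_project || true
```

The helper returns a non-zero status on failure, so it can be used as a guard in front of the backup command.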

The next step is to create a backup in the repository. Run the following command to create a backup with deduplication and fast compression.

$ borg create -s --compression auto,lz4 /project/gclab/raj76/my_project::lustre1-{now} /lustre1/raj76/my_project

In the above command, ::lustre1-{now} is the name of the archive that this backup creates. The {now} placeholder tells Borg to use the current timestamp as part of the archive name, which is useful later for identifying archives. The -s option prints statistics about the new archive when the backup finishes. Run the command in a screen or tmux session, because it can take a while for large datasets.
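
For repeated backups, the command above can be wrapped in a small script. The sketch below only prints the command so that it can be reviewed before it is run; backup_cmd and the REPO, SRC, and PREFIX variables are hypothetical names, with defaults copied from the example on this page.

```shell
#!/bin/sh
# Hypothetical wrapper around the borg create command shown above.
# REPO, SRC and PREFIX default to the values used in this page's example
# and can be overridden from the environment.
REPO="${REPO:-/project/gclab/raj76/my_project}"
SRC="${SRC:-/lustre1/raj76/my_project}"
PREFIX="${PREFIX:-lustre1}"

backup_cmd() {
    # Print the command instead of running it, so it can be checked first.
    # {now} is left unexpanded on purpose: borg fills it in, not the shell.
    printf '%s\n' "borg create -s --compression auto,lz4 ${REPO}::${PREFIX}-{now} ${SRC}"
}

backup_cmd
```

To run the backup in tmux, start a session with tmux new -s borg-backup, run the printed command, and detach with Ctrl-b d. You can reattach later with tmux attach -t borg-backup. Once the backup finishes, borg list followed by the repository path shows the archives by name.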