 

BioHPC Lab:
User Guide

 


Backups Overview

To use our backup system, you need to purchase backup storage (the pricing is listed online), then choose which files and directories are to be backed up and which are to be skipped. The process is described in detail below. Backups are executed daily during the night, so any changes made to the configuration will be reflected in the backups the next day. The backup servers are located in Weill Hall.

Purchase Backup Storage





  • First-time users must start by purchasing backup storage by clicking on the "Purchase Backup Credit Account" button at the bottom of the "My Storage" page.
  • Backup storage is purchased in 1 TB-year increments, as is our main storage. How long your purchased storage lasts depends on the backup size, similar to the storage-quota relation (see the bottom of the main storage page for details). For example, if you purchase 1 TB-year of backup storage and your backup size is 0.5 TB, the purchase will last 2 years. If your backup size is 2 TB, the 1 TB-year purchase will expire after 6 months.
  • Backup storage usage is calculated and reported every day on your "My Storage" page, and your remaining backup storage and expiration date are recomputed accordingly.
  • A default name is given to your new Backup Credit Account after you accept the Purchase and an invoice is created (the name may be changed after the purchase from the status table on the "My Storage" page).
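
    The relation between purchased TB-years and backup size can be sketched as a simple division. This is only an illustration of the arithmetic in the examples above, assuming the backup size stays constant over time; the function name is hypothetical and is not part of any BioHPC tool.

```python
def backup_credit_duration_years(purchased_tb_years: float, backup_size_tb: float) -> float:
    """Return how many years a purchase lasts at a constant backup size.

    Assumption: usage is charged in proportion to backup size, so
    duration = purchased TB-years / backup size in TB.
    """
    if backup_size_tb <= 0:
        raise ValueError("backup size must be positive")
    return purchased_tb_years / backup_size_tb


# The two cases from the text: 1 TB-year at 0.5 TB and at 2 TB.
print(backup_credit_duration_years(1.0, 0.5))  # 2.0 (years)
print(backup_credit_duration_years(1.0, 2.0))  # 0.5 (years, i.e. 6 months)
```

    In practice the backup size changes from day to day, which is why the expiration date shown on the "My Storage" page is recomputed daily.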


Add Directories/Files to Backup Storage


  • Click on the "Add or modify Backup Storage" button at the bottom of the "My Storage" page.
  • This will bring up the "Add Backup Storage" Page.


  • Here you enter the directory you want to back up in the text box and click on the Add button.
  • Use the default "Server: Network Storage" to back up directories that start with /home.
  • You may change the "Server: Network Storage" text box to the name of any specific server which you can access and where you have files that require backup. This typically applies to BioHPC Lab hosted servers.
  • Once a Directory has been added you will be able to "Manage Excludes" and edit the "Retention", "Frequency" and "MinSave" Values.


Exclude Directories/Files from Backup Directories

  • Click on the "Manage Excludes" button to list the Directories/Files in the "Backup Directories".
  • Click on the "Exclude" checkbox to remove the Directories/Files from the Backup Process.
  • Repeat the "Add Directories/Files to Backup Storage" step for all directories you would like to add to your Backup Account.
  • Check the "My Storage" page regularly for current Backup Credit Account Status. You will be notified by e-mail when your purchased backup storage is about to expire.

How Does Backup Work

    Using the procedure described above, the user specifies one or more directories they wish to back up. Each such directory becomes a backup root. A typical example would be your home directory, although it is also possible to specify other directories located on hosted servers. Each backup root will be backed up entirely (recursively, with all files and subdirectories) except for subdirectories or files that are explicitly excluded.

    When a given directory is backed up for the first time, the entire directory (except exclusions) will be copied to the backup server, i.e., its current snapshot will be created reflecting the directory's state at backup time. The next time the backup runs, this current snapshot will be updated, i.e., files removed, added, or changed by the user in the source directory in the meantime will also be removed from, added to, or changed in the current snapshot. However, the files that have been removed, as well as previous versions of those that changed, will be saved on the backup server in a backup snapshot labeled with the backup date and time. The backup snapshot contains only files that have been changed or removed by the user from the source directory since the previous backup cycle. Subsequent backup cycles will update the current snapshot, create new backup snapshots, and remove the older ones.

    Thus, the backup server will always contain the current snapshot, reflecting the state of the directory at the time of the latest backup, plus a number of dated backup snapshots containing files changed or removed between previous backup cycles. Multiple snapshots facilitate retrieval of old versions of files whenever needed. The maximum age of the backup snapshots to be kept is configurable by the user.
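
    The relationship between the current snapshot and a dated backup snapshot can be illustrated with a small in-memory model. This is a simplified sketch, not the actual backup software: directories are represented as dictionaries mapping file paths to contents, and the function name is hypothetical. Whether the real system creates an empty dated snapshot when nothing has changed is not specified here.

```python
def run_backup_cycle(source: dict, current: dict) -> dict:
    """Bring `current` in line with `source` and return the backup snapshot.

    The returned dict holds the previous versions of files that were
    changed or removed in `source` since the last cycle.
    """
    backup_snapshot = {}
    for path, old_content in current.items():
        # A file goes to the backup snapshot if it vanished or changed.
        if path not in source or source[path] != old_content:
            backup_snapshot[path] = old_content
    # The current snapshot becomes an exact copy of the source.
    current.clear()
    current.update(source)
    return backup_snapshot


source = {"a.txt": "v1", "b.txt": "v1"}
current = {}

first = run_backup_cycle(source, current)   # first backup: everything is copied
source["a.txt"] = "v2"                      # user changes a file,
del source["b.txt"]                         # removes another,
source["c.txt"] = "v1"                      # and adds a new one
second = run_backup_cycle(source, current)

print(first)    # {}
print(second)   # {'a.txt': 'v1', 'b.txt': 'v1'}
print(current)  # {'a.txt': 'v2', 'c.txt': 'v1'}
```

    Note that the newly added file c.txt appears only in the current snapshot: a backup snapshot stores old versions, so a file that has never been changed or removed has nothing to contribute to it.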

Meaning of the Backup Parameters

    Backup is controlled by three parameters, set individually for each backup root directory:

  • Retention: age (in days) of the oldest version of the directory to be kept
  • Frequency: backup frequency (e.g., setting this to 3 means backup of this directory will be run every 3 days)
  • MinSave: minimum number of old versions of the directory that are always saved, regardless of age (prevents all previous versions from being erased if the original directory is not changed for longer than Retention days)
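
    One way to read how the three parameters interact is sketched below. This is an interpretation of the descriptions above, not the actual implementation: the function names are hypothetical, and the exact rules used by the backup server (for example, whether the age limit is inclusive) are assumptions.

```python
from datetime import date


def snapshots_to_keep(snapshot_dates, today, retention_days, min_save):
    """Return the backup snapshot dates that survive pruning, newest first.

    Assumed rule: keep a snapshot if it is at most `retention_days` old,
    or if it is among the `min_save` newest snapshots.
    """
    newest_first = sorted(snapshot_dates, reverse=True)
    kept = []
    for index, snap_date in enumerate(newest_first):
        young_enough = (today - snap_date).days <= retention_days
        if young_enough or index < min_save:
            kept.append(snap_date)
    return kept


def is_backup_due(last_backup, today, frequency_days):
    """Assumed rule: a backup runs once `frequency_days` have passed."""
    return (today - last_backup).days >= frequency_days


today = date(2024, 3, 1)
dates = [date(2023, 12, 1), date(2024, 1, 1), date(2024, 2, 28)]

# Retention of 30 days alone would keep only the 2024-02-28 snapshot;
# MinSave of 2 additionally protects the 2024-01-01 snapshot.
kept = snapshots_to_keep(dates, today, retention_days=30, min_save=2)
print(kept)  # [datetime.date(2024, 2, 28), datetime.date(2024, 1, 1)]

print(is_backup_due(date(2024, 2, 27), today, frequency_days=3))  # True
```

    The example shows why MinSave matters: with a short Retention and a directory that rarely changes, every old version could otherwise age out.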

Typical Backup Scenarios

    Depending on your needs, you may consider two basic backup strategies: "back up most, exclude some" and "back up some, exclude most".

  • back up most, exclude some: Specify some top-level directory (such as your home directory) as backup root, possibly with a few exclusions. The advantage is that all changes you make to this directory (except excluded parts) will be reflected in the backup without any extra effort on your part. However, if you add some large files which you did not really intend to back up and forget to exclude them, they will be copied to the backup server and you will be charged for the space-time they occupy.
  • back up some, exclude most: Back up only one (or more) individual subdirectories of your home directory, the content of which you consider most important. To do this, you need to specify this subdirectory (rather than your entire home directory) as backup root. The advantage is that changes you make outside of the backup root will not clutter the backup. However, if any of these changes are important and you forget to copy or move the affected files into the backup root, they will not be reflected in the backup.

Accessing Your Backup

    Backup directories exported from backup server are mounted on our login nodes, cbsulogin and cbsulogin2. Each user-specified backup root has a corresponding location under /backups/backup1 on both login nodes. This location reflects the owner, source server, and backup root. The picture below shows three examples, with different parts of the path color-coded for clarity.


    The first two backup roots are within home directories located on Network Storage. The last backup root is located on a hosted server cbsubscb02.

    Each of these locations is, in turn, organized in current snapshot and backup snapshot directories. For example, listing the content of the first of the directories above will show



    The directory current contains the current snapshot, whereas the bak_* directories (each marked with the date) contain files changed or deleted between the date of the directory and the backup cycle preceding it.
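
    To look for all saved versions of a file, you can check the current directory and every bak_* directory under the backup location. The sketch below is a minimal example of that search, assuming only the current and bak_* naming described above; the function name is hypothetical, and the dated directory names and file paths in the demo are made up, since the actual date format is not shown here.

```python
import tempfile
from pathlib import Path


def find_versions(backup_location, relative_path):
    """List existing copies of `relative_path` in `current` and all `bak_*` dirs."""
    location = Path(backup_location)
    snapshot_dirs = [location / "current"] + sorted(location.glob("bak_*"))
    return [d / relative_path for d in snapshot_dirs if (d / relative_path).exists()]


# Demo on a throwaway directory tree (names below are invented for illustration).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for snap in ("current", "bak_2024-01-02", "bak_2024-01-05"):
        (root / snap / "project").mkdir(parents=True)
    (root / "current" / "project" / "notes.txt").write_text("newest")
    (root / "bak_2024-01-05" / "project" / "notes.txt").write_text("older")

    found = [p.relative_to(root).as_posix()
             for p in find_versions(root, "project/notes.txt")]

print(found)  # ['current/project/notes.txt', 'bak_2024-01-05/project/notes.txt']
```

    On the login nodes, `backup_location` would be the directory under /backups/backup1 that corresponds to your backup root.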

 
