Incremental Backups under FreeBSD 4.x using Rsync

CAIA Technical Report 020927A

Grenville Armitage

September 27th, 2002

Introduction

This report summarizes the technology I used to provide automated, incremental backups for Swinburne's Centre for Advanced Internet Architectures (CAIA).

Living without a backup scheme is like not having car or home insurance. It works until the first disaster and then you wish you'd installed it. I kept putting it off because I didn't want to consume huge amounts of disk space keeping multiple, replicated copies of key directories. Then one day I stumbled across a web page by Mike Rubel discussing how the open source package 'rsync' could be leveraged to create a free, open-source mechanism for doing incremental backups under Linux. I made my own version based on his ideas and customised for the FreeBSD 4.x environment we use here at CAIA. This report is for the benefit of anyone curious enough to try and replicate our scheme.

Problem statement

In a nutshell here's the problem we're solving:

1. We have a bunch of users on a lab server, called mordor. The server runs FreeBSD 4.6.

2. We have a second FreeBSD 4.6 machine, backup1, with good IP connectivity to mordor.

3. Backups are desired every hour during the main part of each day, once a day for a week, and once a week for a month.

4. The backups should be incremental - minimising storage requirements by only keeping 'diffs' from one backup to the next.

5. Users on mordor should be able to access their backups - going back hours, days, or weeks - without system administrator intervention.

Points 3, 4 and 5 define the core technical problem. I need to do incremental backups, at various times of the day, and allow independent, user-controlled access to those backups.

Cron is your friend

The first rule for simple, automated tasks under unix is that cron is your friend. Under FreeBSD you can, as a regular user, create your own set of cron jobs with 'crontab -e'. As root you can create a set of system cron jobs (which will run as root or any other user you specify) simply by editing /etc/crontab.

So, requirement 3 can be met by three rules in /etc/crontab set to handle hourly, daily, and weekly backup snapshots. For example, assume you have a shell script called 'updateusers.sh' that takes a single parameter 'hourly', 'daily', or 'weekly' to indicate what level of backup is required. Further assume you required:

hourly backups at 45 minutes past every hour between 10:45am and 1:45am every day
daily backups at 3:15am every day
weekly backups at 3:30am every Saturday

The following /etc/crontab lines would achieve the stated goals:

45      0,1,10-23 *    *       *       root    /bin/csh /root/bin/updateusers.sh hourly
15      3       *       *       *       root    /bin/csh /root/bin/updateusers.sh daily
30      3       *       *       Sat     root    /bin/csh /root/bin/updateusers.sh weekly

There is no need to do anything more than edit /etc/crontab - the cron daemon re-reads /etc/crontab every minute to determine what has changed.
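As a quick cross-check of the hourly rule, the sketch below expands the hour field "0,1,10-23" into the individual run times. It is written in POSIX sh rather than csh, and expand_field is a hypothetical helper name, not part of cron. It only understands comma-separated numbers and low-high ranges, which is all the example needs.

```shell
#!/bin/sh
# expand_field FIELD (hypothetical helper): print each value matched by a
# crontab field made of comma-separated numbers and low-high ranges.
# Step values (*/2) and names (Sat) are deliberately not handled.
expand_field() {
    old_ifs=$IFS
    IFS=,
    set -- $1            # split the field on commas
    IFS=$old_ifs
    for part in "$@"; do
        case $part in
            *-*)
                lo=${part%-*}
                hi=${part#*-}
                while [ "$lo" -le "$hi" ]; do
                    echo "$lo"
                    lo=$((lo + 1))
                done
                ;;
            *)
                echo "$part"
                ;;
        esac
    done
}

# List the times at which the hourly rule fires (minute field is 45).
for h in $(expand_field "0,1,10-23"); do
    printf '%02d:45\n' "$h"
done

# Count the hourly runs per day.
hourly_runs=$(expand_field "0,1,10-23" | wc -l | tr -d ' ')
echo "hourly runs per day: $hourly_runs"
```

The gap between 01:45 and 10:45 is where the daily (3:15am) and weekly (3:30am) jobs run.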

Rsync: synchronising directories between machines

From the rsync homepage: "rsync is an open source utility that provides fast incremental file transfer. rsync is freely available under the GNU General Public License." For our purposes, rsync is great for synchronising two instances of the same directory tree that may be located on separate machines.

FreeBSD 4.6 includes rsync as a pre-compiled package (which is what I used), or you can compile it fresh from the ports collection. The package is available at /packages/All/rsync-2.5.5_1.tgz on the first CD ROM, or use "pkg_add -r rsync" to pull the latest copy from the online packages repository. The port can be found under /usr/ports/net/rsync.

The package version, rsync-2.5.5_1, uses ssh by default to establish a secure, encrypted communication path between source and target hosts. This means it is reasonable to do backups over public or relatively untrusted IP network connections.

Our example environment is:

Home directories on mordor reside under /home (e.g. /home/gja, /home/frank, etc....)

Primary backups are held on backup1 under /backups/home.0 (e.g. /backups/home.0/gja, etc....)

rsync is executed on mordor as follows:

/usr/local/bin/rsync -a --update --delete /home/ backup1:/backups/home.0

This causes /backups/home.0 on backup1 to be updated to reflect the set of directories and files under /home on mordor. The update process will recurse down each branch of the directory tree under /home on mordor and:

delete entries under backup1:/backups/home.0 that no longer exist under mordor:/home

add entries to backup1:/backups/home.0 that have appeared under mordor:/home

update entries in backup1:/backups/home.0 that have changed under mordor:/home

backup1:/backups/home.0 contains a snapshot of mordor:/home at the time rsync was called.

(The source directory is specified as "/home/" with a trailing slash to ensure rsync synchronises the sub-directories of /home, rather than /home itself. If the trailing slash had been omitted rsync would create /backups/home.0/home/gja rather than /backups/home.0/gja, etc... on backup1.)
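The effect of the trailing slash can be tried safely on a single machine before pointing rsync at a real backup host. The following is a minimal sketch in POSIX sh that assumes rsync is installed locally; the scratch directory names (with_slash, without_slash) are made up for the illustration.

```shell
#!/bin/sh
# Compare "rsync -a SRC/ DST" with "rsync -a SRC DST" using local scratch
# directories. Nothing here touches a real /home or a remote host.
if command -v rsync >/dev/null 2>&1; then
    work=$(mktemp -d /tmp/slashdemoXXXXXX)
    mkdir -p "$work/home/gja" "$work/with_slash" "$work/without_slash"
    echo "thesis notes" > "$work/home/gja/notes.txt"

    # Trailing slash: the contents of home are copied into the target.
    rsync -a "$work/home/" "$work/with_slash"

    # No trailing slash: the home directory itself is created in the target.
    rsync -a "$work/home" "$work/without_slash"

    with_slash=$(ls "$work/with_slash")
    without_slash=$(ls "$work/without_slash")
    echo "with trailing slash the target contains:    $with_slash"
    echo "without trailing slash the target contains: $without_slash"
else
    echo "rsync is not installed; nothing to demonstrate"
fi
```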

Running rsync as root

rsync must run with root privileges on both mordor and backup1 for this to work properly, otherwise it might not be able to read all the files under /home nor write the correct ownerships and permissions on backup1. This requires that remote root logins via ssh are enabled on backup1 (mordor is pushing content to backup1).

Create a public/private RSA1 key pair for root (using ssh-keygen), add the public key to /root/.ssh/authorized_keys on backup1 and the private key under /root/.ssh/identity on mordor. Edit the ssh daemon configuration file on backup1 (typically at /etc/ssh/sshd_config) by adding the following line:

PermitRootLogin without-password

Finally, re-initialize the sshd daemon by sending it a SIGHUP signal ('kill -HUP <processID>' to the processID of the running sshd). This will allow root login via ssh from anywhere to backup1, but only if the originator is in possession of the correct private key (password-only logins are disabled for root).

[You could also use ssh version 2 key pairs without changing the basic functionality.]

Incremental backups: using cpio

Running rsync is only part of the solution - it keeps backup1:/backups/home.0 snapshots up to date, but doesn't address the need for users to retrieve older/previous snapshots. For this the cpio command comes in handy.

The key to incremental snapshots is making copies of previous snapshots before they are updated by rsync, and doing so in a way that takes up minimal space on disk. The solution is to copy the primary snapshot's directory tree structure without making redundant copies of the actual files themselves. To do this we make use of a unix-ism known as hard-links.

To summarise: unix directories are lists of hard-links to the actual files on disk. A single file can appear in multiple directories, yet only use space once on disk because each directory entry (filename) is simply a link to the shared copy of the file itself. When a file is deleted (e.g. with "rm") the directory entry is deleted - the file itself is not removed and forgotten until there are no more directory entries linking to the file in question.
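The behaviour is easy to confirm by hand. The sketch below (POSIX sh, scratch directory and file names invented for the example) gives one file two names, shows that both names report the same inode number, and then removes one name to show the contents survive through the other.

```shell
#!/bin/sh
# Two directory entries, one file on disk.
work=$(mktemp -d /tmp/linkdemoXXXXXX)
mkdir "$work/dir_a" "$work/dir_b"
echo "hello" > "$work/dir_a/notes.txt"

# Create a second name (hard link) for the same file.
ln "$work/dir_a/notes.txt" "$work/dir_b/notes.txt"

# ls -i prints the inode number first; both names should report the same one.
inode_a=$(ls -i "$work/dir_a/notes.txt" | awk '{print $1}')
inode_b=$(ls -i "$work/dir_b/notes.txt" | awk '{print $1}')

# Removing one name only removes that directory entry.
rm "$work/dir_a/notes.txt"
survivor=$(cat "$work/dir_b/notes.txt")

echo "inode via dir_a: $inode_a, inode via dir_b: $inode_b"
echo "content still reachable via dir_b: $survivor"
```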

[Adapted from Mike's web page...] The following command pipeline uses cpio to create a hard-link replica of /backups/home.0 into /backups/home.1

( cd /backups/home.0 && find . -print | cpio -dplm /backups/home.1 )

Now /backups/home.1 looks identical to /backups/home.0, while taking up virtually no extra space on disk (except for the replicated directory tables themselves).

The real trick comes next. When rsync is run again to update /backups/home.0, it un-links any files that it needs to delete or modify in /backups/home.0 before actually performing the deletion or update. The nett result is that every file in common between /backups/home.0 and /backups/home.1 appears only once on disk, and extra disk space is only consumed when files change between consecutive runs of rsync.
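To see why the un-linking matters, the sketch below contrasts two ways of updating a file that is hard-linked into an older snapshot. It is an illustration of the principle in POSIX sh, not a reproduction of rsync's internals: replacing the directory entry (here by writing a new file and renaming it into place) leaves the old snapshot alone, while overwriting in place changes every snapshot that shares the file. All file names are invented for the example.

```shell
#!/bin/sh
work=$(mktemp -d /tmp/snapdemoXXXXXX)
mkdir "$work/home.0" "$work/home.1"

# Case 1: replace the directory entry, as an un-link followed by a fresh copy.
echo "version 1" > "$work/home.0/report.txt"
ln "$work/home.0/report.txt" "$work/home.1/report.txt"
echo "version 2" > "$work/home.0/report.txt.new"
mv "$work/home.0/report.txt.new" "$work/home.0/report.txt"
current=$(cat "$work/home.0/report.txt")
previous=$(cat "$work/home.1/report.txt")

# Case 2: overwrite in place. The shell redirection truncates and rewrites
# the shared file, so the older snapshot changes too.
echo "version 1" > "$work/home.0/log.txt"
ln "$work/home.0/log.txt" "$work/home.1/log.txt"
echo "version 2" > "$work/home.0/log.txt"
clobbered=$(cat "$work/home.1/log.txt")

echo "replace entry:  home.0 has '$current', home.1 keeps '$previous'"
echo "write in place: home.1 now has '$clobbered'"
```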

Finally, a replicated directory can be moved (with 'mv') without affecting the hard-links inside. For example, to create and maintain a sequence of four hourly backup snapshots on backup1 we would perform the following sequence every hour:

On backup1:

rm -rf /backups/home.3

mv /backups/home.2 /backups/home.3

mv /backups/home.1 /backups/home.2

( cd /backups/home.0 && find . -print | cpio -dplm /backups/home.1 )

On mordor:

/usr/local/bin/rsync -a --update --delete /home/ backup1:/backups/home.0

Every hour the 4th most recent snapshot is deleted, the 3rd and 2nd most recent snapshots are shuffled back one position, and a new links-only copy of /backups/home.0 is created in /backups/home.1. Because we're only re-creating the directory structure, this process is also far quicker than actually copying files.

/backups/home.0 is then re-synchronised with mordor:/home, only unlinking and updating files that have changed in the previous hour.
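The hourly sequence above can be rehearsed in a scratch directory before it is trusted with real data. The sketch below is written in POSIX sh and makes two assumptions worth stating: link_copy and rotate_snapshots are hypothetical helper names, and link_copy uses find and ln in place of cpio so that it runs where cpio is not installed (unlike "cpio -dplm" it does not preserve directory permissions or modification times). A rename stands in for the rsync update.

```shell
#!/bin/sh
# link_copy SRC DST (hypothetical helper): rebuild SRC's directory tree under
# DST and hard-link every non-directory into it. DST must be an absolute path.
link_copy() {
    lc_src=$1
    lc_dst=$2
    mkdir -p "$lc_dst" || return 1
    (
        cd "$lc_src" || exit 1
        find . -type d -exec sh -c \
            'dst=$1; shift; for d in "$@"; do mkdir -p "$dst/$d"; done' \
            sh "$lc_dst" {} + &&
        find . ! -type d -exec sh -c \
            'dst=$1; shift; for f in "$@"; do ln "$f" "$dst/$f"; done' \
            sh "$lc_dst" {} +
    )
}

# rotate_snapshots PREFIX LAST (hypothetical helper): delete PREFIX.LAST,
# shuffle the remaining snapshots back one place, then make a links-only
# copy of PREFIX.0 in PREFIX.1.
rotate_snapshots() {
    rs_prefix=$1
    rs_n=$2
    rm -rf "$rs_prefix.$rs_n"
    while [ "$rs_n" -gt 1 ]; do
        rs_prev=$((rs_n - 1))
        if [ -d "$rs_prefix.$rs_prev" ]; then
            mv "$rs_prefix.$rs_prev" "$rs_prefix.$rs_n"
        fi
        rs_n=$rs_prev
    done
    link_copy "$rs_prefix.0" "$rs_prefix.1"
}

# Rehearsal: two "hours" with one file changing in between.
work=$(mktemp -d /tmp/rotdemoXXXXXX)
mkdir -p "$work/home.0/gja"
echo "draft 1" > "$work/home.0/gja/paper.txt"

rotate_snapshots "$work/home" 3          # hour 1: home.1 links to draft 1

# Stand-in for rsync: replace the file rather than overwrite it in place.
echo "draft 2" > "$work/home.0/gja/paper.txt.new"
mv "$work/home.0/gja/paper.txt.new" "$work/home.0/gja/paper.txt"

rotate_snapshots "$work/home" 3          # hour 2: home.1 becomes home.2

newest=$(cat "$work/home.1/gja/paper.txt")
older=$(cat "$work/home.2/gja/paper.txt")
echo "home.1 holds '$newest', home.2 holds '$older'"
```

Passing 3 as the last index keeps the four snapshots home.0 to home.3 described above.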

[Note: If you have access to the standard GNU 'cp' utility (e.g under Linux), you can replace the entire "find... cpio..." line with 'cp -al /backups/home.0 /backups/home.1'. The BSD-derived 'cp' utility doesn't have the same options, so we use cpio.]

Making daily and weekly snapshots

My goal is for daily and weekly snapshots to have the form /backups/home.day.0, /backups/home.day.1, etc... and /backups/home.week.0, /backups/home.week.1, etc.... This requires repeated use of the cpio sequence on backup1, and does not require any additional rsync activity between backup1 and mordor.

The daily and weekly snapshots are shuffled in a similar manner to the hourly snapshots described above, except that the final step involves a links-only copy from /backups/home.1 rather than an rsync from mordor:/home/. (I don't recommend copying from /backups/home.0, in case you overlap a concurrent hourly rsync to /backups/home.0 that might be running overtime).

Although the shuffling of snapshot directories needs to be done on backup1, it can be controlled from mordor. Place a suitable shell script on backup1, and have the cron job on mordor call the script on backup1 using ssh. For example, if the backup1 script was called /root/bin/shuffle_local the following line might be appropriate inside the updateusers.sh script on mordor:

/usr/bin/ssh root@backup1 /root/bin/shuffle_local /backups/home <time>

(where <time> is "hourly", "daily", or "weekly".)

Making backups available to users

Traditional backup schemes involving tapes or specialized drives often require administrator intervention when a user wants to go back a few hours, days, or weeks. This is sub-optimal for small research labs where administration is often a part-time role of one of the staff or research students.

The following solution seems to work well:

1. Enable NFS server support on backup1

2. Set the permissions on /backups so that only root can read/search/write ('chmod 700 /backups')

3. Export /backups read-only from backup1, with mordor explicitly listed as the only allowed remote host

4. Enable NFS client support on mordor

5. Mount backup1:/backups onto mordor:/backups (either permanently, or through the amd automounter service)

The nett result is that users logged in on mordor will see their current home directory contents under /home and simultaneously have read-only access to past snapshots under /backups/home.N, /backups/home.day.N, and /backups/home.week.N.

Step 2 is useful if backup1 serves other purposes inside your lab and has other users logged in at various times. Step 3 ensures that users cannot accidentally scribble on their past snapshots (rather an important protection, since making accidental file alterations is often why the user is searching through their backups in the first place).
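For readers who have not set up NFS on FreeBSD before, the fragments below sketch what the export and mount steps might look like. They are assumptions based on stock FreeBSD 4.x conventions rather than a copy of the CAIA configuration, so check the variable names and options against rc.conf(5), exports(5) and fstab(5) on your own system.

```text
# On backup1, in /etc/rc.conf (assumed FreeBSD 4.x variable names)
portmap_enable="YES"
nfs_server_enable="YES"

# On backup1, in /etc/exports: read-only export, mordor is the only client
/backups -ro mordor

# On mordor, in /etc/rc.conf
nfs_client_enable="YES"

# On mordor, in /etc/fstab, for a permanent read-only mount
backup1:/backups   /backups   nfs   ro   0   0
```

Listing the export as read-only on backup1 and mounting it read-only on mordor gives two independent layers of protection against accidental writes.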

Caveats

Although rsync unlinks and re-copies files that have changed, it does not un-link a file if only the file's permission/ownership information has changed. Thus if a file's permissions change from one hour to the next, the new permissions will be written into /backups/home.0 and immediately inherited by that file's ancestors in /backups/home.1, /backups/home.2, etc....

Even though rsync uses ssh to establish a secure IP link to backup1, the use of regular NFS to allow direct access to the backup snapshots opens up a security hole (NFS traffic can be sniffed in transit). If mordor and backup1 were separated by an untrustworthy IP link, I would consider removing the ability for users to access their backup snapshots from mordor itself. An alternative would be for every mordor user to have an account on backup1, sufficient for them to login with ssh and access their snapshots locally on backup1.

There are always risks of allowing remote control of one computer by another - this scheme makes backup1 vulnerable to root logins if mordor is compromised and root's private key is obtained. Note that backup1 should be the only additional machine that runs this risk. The public/private key pair used to allow root login to backup1 should never be used for any other security arrangements in your lab.
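The first caveat, about permission changes showing up in older snapshots, follows directly from how hard links work: the mode bits belong to the file, not to the directory entry. A minimal sketch in POSIX sh, using invented file names:

```shell
#!/bin/sh
work=$(mktemp -d /tmp/permdemoXXXXXX)
mkdir "$work/home.0" "$work/home.1"
echo "data" > "$work/home.0/results.txt"
chmod 644 "$work/home.0/results.txt"

# home.1 is a links-only "snapshot" of home.0.
ln "$work/home.0/results.txt" "$work/home.1/results.txt"
before=$(ls -l "$work/home.1/results.txt" | cut -c1-10)

# Change permissions on the current copy only...
chmod 600 "$work/home.0/results.txt"

# ...and the snapshot's entry reports the new mode as well.
after=$(ls -l "$work/home.1/results.txt" | cut -c1-10)

echo "snapshot mode before: $before"
echo "snapshot mode after:  $after"
```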

Storage Requirements

backup1 requires at least as much storage as mordor has allocated for user home directories. How much more space backup1 requires depends on the expected churn in every user's home directories. Nevertheless, the extra space requirement will invariably be far less than would be required if we were making actual copies of the home directories every hour, day, and week.
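One way to see how much extra space each snapshot really costs is to give du all the snapshot directories in a single invocation: du counts a hard-linked file only once, so each later directory reports just the space not already attributed to an earlier one. The sketch below shows the idea in POSIX sh on scratch data; the directory names are invented, and the exact figures depend on the filesystem.

```shell
#!/bin/sh
work=$(mktemp -d /tmp/dudemoXXXXXX)
mkdir "$work/home.0" "$work/home.1"

# Roughly 1 MB of sample data in the current snapshot.
dd if=/dev/zero of="$work/home.0/big.dat" bs=1024 count=1024 2>/dev/null

# The previous snapshot shares the file through a hard link.
ln "$work/home.0/big.dat" "$work/home.1/big.dat"

# One du run over both directories: the shared file is charged to home.0 only.
usage=$(du -sk "$work/home.0" "$work/home.1")
first_kb=$(echo "$usage" | awk 'NR == 1 {print $1}')
second_kb=$(echo "$usage" | awk 'NR == 2 {print $1}')

echo "home.0 accounts for ${first_kb} KB"
echo "home.1 adds only    ${second_kb} KB"
```

On a real backup host the equivalent would be something like "du -sk /backups/home.*", run as root.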

Conclusion

I've used this scheme for the past two months at work, and at home (to regularly back up my laptop to my home fileserver). It seems to do the job, and required no proprietary software. Nice.

Appendix

In case the preceding general discussion is not sufficient to create your own automated backup scheme, here are the two scripts I created for use at Swinburne's Centre for Advanced Internet Architectures. The primary script, updateusers.sh, runs on mordor and calls a secondary script, caia_shuffle_local, on backup1.

On mordor: /etc/crontab

The following /etc/crontab lines on mordor cause updates to occur at the appropriate time:

45      0,1,10-23 *    *       *       root    /bin/csh /root/bin/updateusers.sh hourly
15      3       *       *       *       root    /bin/csh /root/bin/updateusers.sh daily
30      3       *       *       Sat     root    /bin/csh /root/bin/updateusers.sh weekly

There is no need to do anything more than edit /etc/crontab - the cron daemon re-reads /etc/crontab every minute to determine what has changed.

On mordor: /root/bin/updateusers.sh

#!/bin/csh
#
# updateusers.sh
# Version 0.2
# Date: 080502_A
#
# Copyright (c) Grenville Armitage, August 2002
# garmitage@swin.edu.au
# Centre for Advanced Internet Architectures
# Swinburne University of Technology
#
#
# Script to use rsync to create a backup of mordor's
# /home directories (users), saving the backups on backup1
#
# This script is presumed to be run FROM mordor to push the required
# files to backup1.
#
# Backups are stored on $TARGET_HOST under a sequence of
# directories with prefix $TARGET_DIR
#
# Hourly: ${TARGET_DIR}.0, ${TARGET_DIR}.1, etc
# Daily: ${TARGET_DIR}.day.0, ${TARGET_DIR}.day.1, etc
# Weekly: ${TARGET_DIR}.week.0, ${TARGET_DIR}.week.1, etc
#
# Command line options:
#
# cmd <hourly|daily|weekly> [debug]
#
# We rely on shuffle_local to perform snapshot directory
# rotations on the remote machine. Yes, this could be done by
# the remote machine, but triggering it here ensures everything
# stops if this local machine dies for some period of time
#

# Validate command line option
setenv WHENTIME $1
if (! (($WHENTIME == "weekly")||($WHENTIME == "daily")||($WHENTIME == "hourly"))) then
        echo Problem with 1st parameter $WHENTIME - must be weekly, daily, or hourly
        exit 1
endif

setenv DEBUG $2

setenv SRC /home/
setenv TARGET_DIR /backups/mordor.caia.swin.edu.au/home
setenv TARGET_HOST backup1.swin.edu.au
setenv TARGET ${TARGET_HOST}:${TARGET_DIR}

# Confirm target host can be reached

/sbin/ping -c 2 $TARGET_HOST >& /dev/null

if ( $status != "0" ) then
        if ($DEBUG == "debug") then
                echo Host $TARGET_HOST is down or non-existent.
                echo No backup done `date`
        endif
        exit 1
endif

# Target can be reached, trigger a rotation of the snapshot directories
# on $TARGET_HOST

if ($DEBUG == "debug") then
        echo `date`
        echo $TARGET_HOST is up, rotating $WHENTIME snapshots.
        setenv DEVNULL "/dev/stdout"
else
        setenv DEVNULL "/dev/null"
endif

/usr/bin/ssh root@$TARGET_HOST /root/bin/caia_shuffle_local $TARGET_DIR 6 $WHENTIME $DEBUG >& $DEVNULL

if ( $status != "0" ) then
        if ($WHENTIME == "hourly") then
                echo Problems with snapshot rotation, rsync skipped on `date`
        else
                echo Problems with snapshot rotation on `date`
        endif
        exit 1
endif

# If this is not an hourly rotation, we're complete once the
# remote "caia_shuffle_local" is complete.

if ($WHENTIME != "hourly") then
        echo $WHENTIME rotation completed on `date` > $DEVNULL
        exit 0
endif

#
# Create exclude file of directories/files we don't want to mirror
#
setenv TMPFILE `mktemp /tmp/updateusersXXXXXXXX`
cat << EOM > $TMPFILE
quota.user
EOM

# We're doing an hourly rotation, which involves an rsync
# of ${TARGET}.0 against the $SRC directories

echo Rsync of user /home directories began on `date` > $DEVNULL

/usr/local/bin/rsync -a --update --delete --exclude-from=$TMPFILE $SRC ${TARGET}.0 >& $DEVNULL

echo Rsync of user /home directories completed on `date` > $DEVNULL

rm $TMPFILE
#

On backup1: /root/bin/caia_shuffle_local

#!/bin/csh
#
# ** caia_shuffle_local
# Version 0.1
# Date: 080402_A
#
# Copyright (c) Grenville Armitage, August 2002
# garmitage@swin.edu.au
# Centre for Advanced Internet Architectures
# Swinburne University of Technology
#
#
# Script to create a single LOCAL snapshot backup of my
# most recent hourly snapshot. This script assumes we're
# running on the machine on which the hourly backups
# reside, so remote access to anywhere is not required
#
# <cmd> <target directory> <number> <hourly|daily|weekly> ["debug"]
#
# <target directory> is the path name of the backup
# directory, without trailing "/". Hourly
# snapshots are presumed to be of the form
# <target>.0, <target>.1, etc; daily backups are
# in <target>.day.0, <target>.day.1, etc; weekly
# have the form <target>.week.0, etc.
#
# 2nd param indicates the number of snapshots we are
# keeping at this level
#
# 3rd parameter indicates whether to update the daily from
# the hourly, or the weekly from the hourly snapshots
#
# (if optional 4th parameter is "debug" then script issues
# warnings if there are errors, otherwise script is silent)
#
# This tool is presumed to be used in conjunction with
# a separate 'rsync' process that keeps the hourly.0
# snapshot up to date. We are called with "hourly" to
# rotate hourly.1 through hourly.N, but when called
# with "daily" or "weekly" we rotate day.0 through day.N
# (or week.0 to week.N) respectively and then copy
# over hourly.1 into day.0 or week.0 as required.
# [NB. To avoid potential race conditions with any
# long running, concurrent hourly update, the daily
# and weekly rotations copy the *previous* hour's
# snapshot, hourly.1, rather than the current one
# at hourly.0 which might be still being updated by
# rsync under extreme conditions.]
#
# The trick is that we use "cpio" to copy links rather
# than actual file contents. This minimises disk space
# usage. Subsequent use of 'rsync' ensures new file
# storage is only allocated for files that change from
# one hourly snapshot to the next. (The rsync is currently
# handled outside this program.)
#

if ( $1 == "" ) then
        echo Missing target directory
        exit 1
endif
setenv TARGET_DIR $1

if ( $2 == "" ) then
        echo Missing number of rotations
        exit 1
endif

if ( ($2 < "1") || ($2 > "6")) then
        echo Rotations must be between 1 and 6
        exit 1
endif
setenv ROTATIONS $2

setenv WHENTIME $3
if ($WHENTIME == "weekly") then
        setenv TARGET ${TARGET_DIR}.week
        setenv TARGET_H ${TARGET_DIR}.1
        setenv ENDCNT   0
else if ($WHENTIME == "daily") then
        setenv TARGET ${TARGET_DIR}.day
        setenv TARGET_H ${TARGET_DIR}.1
        setenv ENDCNT   0
else if ($WHENTIME == "hourly") then
        setenv TARGET ${TARGET_DIR}
        setenv TARGET_H ${TARGET_DIR}.0
        setenv ENDCNT   1
else
        echo Problem with 3rd parameter $WHENTIME - must be weekly, daily, or hourly
        exit 1
endif

setenv DEBUG $4

#
# First sanity check that there's actually an hourly snapshot
# that we can copy from in the first place (although if we're
# doing an hourly snapshot, just create an empty $TARGET_H on
# the assumption that it is about to be filled in anyway.)

if ( ! -e $TARGET_H ) then
        if ( $WHENTIME == "hourly" ) then
                mkdir -p $TARGET_H
        else
                if ($DEBUG == "debug") echo Hourly snapshot $TARGET_H does not exist
                exit 1
        endif
endif

if ( $DEBUG == "debug" ) then
        echo `date`
        echo Rotating $WHENTIME snapshots.
endif

#

@ END = $2

if ( -e ${TARGET}.$END ) rm -rf ${TARGET}.$END

while ( $END > $ENDCNT )
        @ END_LESS_ONE = $END - 1
        if ( -e ${TARGET}.$END_LESS_ONE ) mv ${TARGET}.$END_LESS_ONE ${TARGET}.$END
        @ END = $END_LESS_ONE
end

#
# Explicitly re-create an empty place to copy the current hourly snapshot
#
mkdir -p ${TARGET}.${ENDCNT}

#
# Create a copy of the directory entries in ${TARGET_H}
# but do not create independent copies of the files themselves
# (this line is roughly equivalent to the GNU 'cp -al <src> <dst>')
#
if ( $DEBUG == "debug" ) then
        ( cd $TARGET_H && find . -print | cpio -dplm ${TARGET}.${ENDCNT} )
else
        ( cd $TARGET_H && find . -print | cpio -dplm ${TARGET}.${ENDCNT} ) >& /dev/null
endif

if ( $DEBUG == "debug" ) echo Rotation completed `date`

exit 0