Taking snapshots

I have a fairly large amount of data (source code, DB dumps, docs etc.) that I keep on either my workstation or a file server. I use software RAID on both systems, either mirroring (RAID1) or striping with parity (RAID5), and that obviously saves me from fatal disk errors. But it doesn't prevent me from losing data when I'm a total moron, or when some application goes bad.

So a while ago, using some favorite tools ([http://search.yahoo.com/|Yahoo search] obviously, for those of you who know me), I went looking to see what was out there. I found this very [http://www.mikerubel.org/computers/rsync_snapshots/|informative article] on the topic. Doing some more searches, I then found a little nugget called [http://www.rsnapshot.org/|rsnapshot]. This tool pretty much automates everything necessary to perform hourly, daily, weekly, monthly or whatever type of snapshots you wish to do.

Configuring rsnapshot is pretty straightforward, and it comes with a good template configuration file that you can tune and tweak. It's assumed you have a recent version of rsync installed, and SSH properly set up and running if you are doing snapshots over the wire. I'll describe a few of the configurations that I've used. First off, you need to provide some basic information about how rsnapshot should behave, where to store snapshots etc. __Note__: this is for a Linux system:

{CODE()}
snapshot_root /export/.snapshots/
cmd_cp /bin/cp
cmd_rsync /usr/bin/rsync
cmd_ssh /usr/bin/ssh
link_dest 1
verbose 3
loglevel 3
{CODE}

Next we need to decide what types of snapshots we want, and what sort of retention to keep. I've decided to skip hourly snapshots and do daily, weekly, monthly, quarterly and yearly ones, keeping 6 daily, 3 weekly, 3 monthly, 3 quarterly and 9 yearly snapshots (yeah ...).

{CODE()}
interval daily 6
interval weekly 3
interval monthly 3
interval quarterly 3
interval yearly 9
{CODE}
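To make the retention settings a bit more concrete: rsnapshot names each snapshot directory after its interval plus a counter, with 0 being the most recent (daily.0, daily.1 and so on). The sketch below just prints the directory names that a set of interval lines would eventually produce. The function name list_snapshot_dirs is made up for this illustration and is not part of rsnapshot.

```shell
#!/bin/sh
# list_snapshot_dirs ROOT
# Hypothetical helper: reads rsnapshot "interval NAME COUNT" lines on stdin
# and prints the snapshot directories that retention would keep under ROOT.
list_snapshot_dirs() {
    root="${1%/}"                      # drop a trailing slash, if any
    while read -r keyword name count; do
        # Ignore everything that is not an interval line.
        [ "$keyword" = "interval" ] || continue
        i=0
        while [ "$i" -lt "$count" ]; do
            echo "$root/$name.$i"      # e.g. /export/.snapshots/daily.0
            i=$((i + 1))
        done
    done
}
```

For example, piping the interval lines from the configuration file into it (say with grep '^interval' on your config, wherever you keep it) and passing /export/.snapshots/ as ROOT lists daily.0 through daily.5, weekly.0 through weekly.2, and so on.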

Finally, we need to specify which directories to make snapshots of, possibly from a remote server. In my case, I do snapshots over the network only, to keep all snapshots on a RAID5 device.

{CODE()}
# Workstation
backup root@ws1.ogre.com:/etc/ ws1/etc/ exclude_file=/admin/etc/ws1.exclude
backup root@ws1.ogre.com:/export/ ws1/export/ exclude_file=/admin/etc/ws1.exclude

# Web/mail server
backup root@s1.ogre.com:/etc/ s1/etc/ exclude_file=/admin/etc/s1.exclude
backup root@s1.ogre.com:/data/ s1/server/ exclude_file=/admin/etc/s1.exclude
{CODE}

This part of the configuration is a bit finicky. In particular, you have to use a <TAB> character between the destination directory (e.g. ws1/etc/) and any extra options you want to pass to rsync. Any <SPACE> characters will actually become part of the directory name, which was kind of a surprise to me.
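Since a stray space is easy to miss by eye, a quick check of the backup lines can help. This is only a rough sketch: the function name check_backup_tabs is made up, and the heuristic (fewer than three tab-separated fields, or a destination field that contains a space followed by something like option=) is my own assumption about what a mistake usually looks like, not anything rsnapshot itself provides.

```shell
#!/bin/sh
# check_backup_tabs FILE
# Hypothetical helper: print "line number: line" for every backup line that
# looks like it uses spaces where tabs are needed. Returns 0 if none are
# found, 1 otherwise.
check_backup_tabs() {
    awk -F '\t' '
        # Only look at lines that start with the "backup" keyword.
        /^backup[ \t]/ {
            # Too few tab-separated fields, or an option glued onto the
            # destination field with a space instead of a tab.
            if (NF < 3 || $3 ~ / [a-z_]+=/) {
                printf "%d: %s\n", NR, $0
                bad = 1
            }
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

Running it against your configuration file prints nothing when all backup lines look sane, and the offending lines otherwise.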

With this all configured, you're pretty much set to go: just run rsnapshot out of your crontab at the desired frequency. In my case, since I do daily snapshots (and not hourly), I just added a daily cron job, like:

{CODE()}
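#!/bin/sh
# Daily cron job: decide which rsnapshot intervals to run today.

rsnap='/usr/local/bin/rsnapshot'

do_weekly=0
do_monthly=0
do_quarterly=0
do_yearly=0

# %u gives the ISO weekday number: 1 = Monday ... 7 = Sunday.
wday_num=$(/bin/date '+%u')
if [ "$wday_num" -eq 7 ]; then
    do_weekly=1
    # A Sunday on the 25th or later is taken to be the last Sunday of the month.
    day_num=$(/bin/date '+%d')
    if [ "$day_num" -ge 25 ]; then
        do_monthly=1
        m_num=$(/bin/date '+%m')
        # Quarter ends: March, June, September, December.
        if [ "$m_num" -eq 3 ] || [ "$m_num" -eq 6 ] || [ "$m_num" -eq 9 ] || [ "$m_num" -eq 12 ]; then
            do_quarterly=1
        fi
        if [ "$m_num" -eq 12 ]; then
            do_yearly=1
        fi
    fi
fi

# Run the longer intervals first: each one promotes the oldest snapshot of
# the next shorter interval before that interval rotates it away.
if [ "$do_yearly" -eq 1 ]; then
    echo "Saving yearly snapshot"
    $rsnap yearly
fi
if [ "$do_quarterly" -eq 1 ]; then
    echo "Saving quarterly snapshot"
    $rsnap quarterly
fi
if [ "$do_monthly" -eq 1 ]; then
    echo "Saving monthly snapshot"
    $rsnap monthly
fi
if [ "$do_weekly" -eq 1 ]; then
    echo "Saving weekly snapshot"
    $rsnap weekly
fi

echo "Doing daily snapshot"
$rsnap daily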
{CODE}

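This runs a daily snapshot every day. On Sundays it first runs a weekly snapshot, and on a Sunday that falls on the 25th or later (in most months the last Sunday) a monthly one as well, plus a quarterly one in March, June, September and December and a yearly one in December. Note that in a month with fewer than 31 days the last Sunday can fall before the 25th, in which case that month's snapshot is skipped. I also NFS export my snapshot directory (/export/.snapshots), __read-only__, so that I can easily get to it from all my machines.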
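The day-of-month test in the script (25 or later) is only an approximation of "last Sunday of the month". A stricter test could be sketched as follows. It assumes GNU date (for the -d option), which should be fine on Linux, and the function name is_last_sunday is made up for illustration. I have not wired it into the cron script above.

```shell
#!/bin/sh
# is_last_sunday [YYYY-MM-DD]
# Hypothetical helper: succeeds (exit 0) when the given date, or today if
# no argument is given, is a Sunday and the date one week later falls in a
# different month. Requires GNU date for -d.
is_last_sunday() {
    day="${1:-$(date '+%Y-%m-%d')}"

    # %u is the ISO weekday number, 7 = Sunday. TZ=UTC sidesteps DST quirks.
    [ "$(TZ=UTC date -d "$day" '+%u')" -eq 7 ] || return 1

    this_month=$(TZ=UTC date -d "$day" '+%m')
    next_week_month=$(TZ=UTC date -d "$day +7 days" '+%m')

    # If next Sunday is in another month, this one is the last of its month.
    [ "$this_month" != "$next_week_month" ]
}
```

In the cron script, a call like "if is_last_sunday; then do_monthly=1; fi" could stand in for the day_num comparison, at the cost of depending on GNU date.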

Hacking: