Friday, January 29, 2010

Backup on ZFS, part 1

One of the nice things about having systems on ZFS is that the disk failures of the last few days didn't cost me any noticeable downtime. Pulling and replacing disks - without hot-swappable hardware - and the system upgrade those failures inspired still cost time, as do hardware failures that leave a system unbootable. But in general, disk problems on ZFS file systems are minor: you notice the disk is no longer in service, decide how to deal with it, and then do so.

Part of that is having reliable backups, and ZFS makes even that easier. The best example is of course the OpenSolaris "Time Slider" tool, which uses the ZFS snapshot feature to let you recover old versions of files. Snapshots also make backups to other disks - suitable for taking offsite, for instance - easier to deal with.

As disks have gotten cheap, it's become common to keep backups online. A typical home-grown backup script will use something like rsync to copy files to the destination disk or file server. To make old versions available, it will then play games with a copy of the directory tree and hard links or symlinks, creating an image of the tree at that time without duplicating files that haven't changed between backups.
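
As an illustration of that home-grown approach, here is a minimal sketch built on rsync's --link-dest option, which hard-links unchanged files against a previous copy. The function name linked_backup, the "latest" symlink and the dated directory layout are assumptions made for this sketch, not taken from any particular script.

```sh
#!/bin/sh
# Hypothetical helper: copy SRC into DEST_ROOT/STAMP, hard-linking files that
# are unchanged since the previous run (found through DEST_ROOT/latest).
# DEST_ROOT should be an absolute path. Set ECHO=echo for a dry run.
linked_backup() {
    src=$1
    dest_root=$2
    stamp=$3
    run=${ECHO:-}
    if [ -e "$dest_root/latest" ]
    then
        # Unchanged files become hard links into the previous tree.
        $run rsync --archive --delete --link-dest="$dest_root/latest" "$src/" "$dest_root/$stamp/"
    else
        # First run: nothing to link against, so make a plain copy.
        $run rsync --archive --delete "$src/" "$dest_root/$stamp/"
    fi &&
    # Re-point the "latest" symlink at the tree just written.
    $run rm -f "$dest_root/latest" &&
    $run ln -s "$stamp" "$dest_root/latest"
}
```

Something like `linked_backup /home /backups/home "$(date +%Y-%m-%d)"` would then leave one directory per day, each looking like a full copy while sharing unchanged files on disk.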

Snapshots can go one better. If your copy software will write just the changed blocks in a file, instead of recreating the entire file, then the blocks that haven't changed in a file will also be shared across snapshots. Better yet, the snapshot can be created by running one command - a "zfs snapshot backuppool/mybackup@name", where "name" is whatever you want to call the snapshot - on the system the backup resides on.
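
A minimal sketch of that copy-then-snapshot sequence follows. It assumes rsync is the copy software and adds rsync's --inplace option, since rsync otherwise builds a new copy of each changed file and renames it into place, which rewrites every block. The function names, the date-based snapshot name and the example dataset are assumptions for illustration.

```sh
#!/bin/sh
# Hypothetical helper: build a snapshot name from a dataset and a date
# (today's date if none is given).
snapshot_name() {
    printf '%s@%s\n' "$1" "${2:-$(date +%Y-%m-%d)}"
}

# Update DEST from SRC in place, then freeze the result with one snapshot.
# Set ECHO=echo to print the commands instead of running them.
backup_and_snapshot() {
    src=$1
    dest=$2
    dataset=$3
    run=${ECHO:-}
    # --inplace updates existing files directly; --no-whole-file keeps the
    # delta algorithm on even for local copies.
    $run rsync --archive --delete --inplace --no-whole-file "$src/" "$dest/" &&
    $run zfs snapshot "$(snapshot_name "$dataset")"
}
```

With ECHO=echo set, `backup_and_snapshot /home /export/backups/home backuppool/mybackup` only prints the rsync and zfs commands it would run, which is a cheap way to check the arguments first.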

The final nicety is that even systems without the hardware oomph for ZFS - it was designed for 64-bit CPUs with a gigabyte of RAM - or with an OS that doesn't support ZFS can take advantage of this in their backups. Here's the script I use for my local backups. While I use it in production, it's not up to product status, in that it's really intended for use by relatively astute system admins. In particular, there's no nice error reporting, no simple tools for either complete restores or simple file recovery, etc. Those shouldn't be hard to build on top of this, but the script as it stands is good enough for my use.
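
As a hint of what a simple file recovery tool might look like, here is a rough sketch that relies on the read-only .zfs/snapshot directory ZFS exposes at the root of each file system. It assumes each backup is a single file system mounted at $BACKUP_DEST/<name> with snapshots named by date, as in the script; the function names snapshot_path and list_versions are made up for this example.

```sh
#!/bin/sh
BACKUP_DEST=${BACKUP_DEST:-/export/backups}

# Print where FILE (an absolute path on the original host) lives inside the
# snapshot of backup NAME taken on DATE.
snapshot_path() {
    printf '%s/%s/.zfs/snapshot/%s%s\n' "$BACKUP_DEST" "$1" "$2" "$3"
}

# List every snapshot copy of FILE in backup NAME. Date-named snapshots
# sort oldest first because the shell expands the glob in sorted order.
list_versions() {
    for snap in "$BACKUP_DEST/$1/.zfs/snapshot"/*
    do
        [ -e "$snap$2" ] && printf '%s\n' "$snap$2"
    done
    return 0
}
```

Recovering a file is then an ordinary copy, for example `cp -p "$(snapshot_path myhost 2010-01-29 /etc/hosts)" /tmp/hosts.restored`, where myhost stands in for whatever name the backup was made under.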

As with the previous script, the goal is more to get people thinking about how to leverage ZFS for these types of chores. If you've already done that and have tools available, provide a link in the comments and I'll pull it into the body so you get the traffic. If you feel moved to productize this script - the same applies.
#!/bin/sh

BACKUP_DEST=/export/backups
BACKUP_FS=external/export/backups
BACKUP_HOST=backups
BACKUP_USER=operator

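# Set DEBUG to any non-empty value to print the rsync and snapshot commands
# instead of running them.
if [ -z "$DEBUG" ]
then
    ECHO=""
else
    ECHO=echo
fi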

# Pick the file systems to back up, plus any OS-specific rsync flags.
case $(uname) in
    Darwin)
        dump_list=$(df -T ufs,hfs | awk 'NR != 1 { print $NF }')
        extra_flags="--extended-attributes"
        hostname=$(hostname -s) ;;
    FreeBSD)
        dump_list=$(mount -p -t ufs,zfs | awk '{ print $2 }')
        extra_flags="--acls --xattrs"
        hostname=$(hostname -s) ;;
    SunOS)
        # Skip anything in the "external" pool, which holds the backups.
        dump_list=$(/usr/gnu/bin/df -P -t zfs -t ufs | awk 'NR != 1 && !/^external/ { print $NF }')
        extra_flags=""
        hostname=$(hostname) ;;
esac

# With arguments: the first names the backup, the rest are the directories.
if [ $# -eq 0 ]
then
    dump_name=$hostname
else
    dump_name=$1; shift
    dump_list="$*"
fi

for dir in $dump_list
do
    case $dir in
        /tmp*) echo "Skipping $dir" ;;
        # $extra_flags is left unquoted on purpose so it splits into separate options.
        *) $ECHO rsync --verbose --archive --hard-links --delete --one-file-system --no-whole-file $extra_flags --exclude /.zfs "$dir" "$BACKUP_DEST/$dump_name$dir" ;;
    esac
done

# Snapshot the backup file system, either locally or over ssh on the backup host.
SNAPSHOT_COMMAND="/usr/sbin/zfs snapshot -r $BACKUP_FS/$dump_name@$(date +%F)"
if [ "$BACKUP_HOST" = "$hostname" ]
then
    $ECHO $SNAPSHOT_COMMAND
else
    $ECHO su $BACKUP_USER -c "ssh $BACKUP_HOST 'pfexec $SNAPSHOT_COMMAND'"
fi