Incremental Backups

Bennett Todd -- bet at zachary.mc.duke.edu
Wed Jun 5 04:40:00 AEST 1991


Well, I guess I've got a different approach to solving this problem.
We've got a couple of Exabyte 8200 drives on one of the (diskless)
workstations. Every night I stuff in two tapes. We are backing up about
16G of user data, from all over the network. We have Suns, Stellar
GS1000s, MicroVAXes, SGI Irises, and I forget what-all else.

I find, for each machine, a command that will allow me to do a full or
an incremental dump of a single filesystem, emitting the results to stdout.
On the Iris I made some trivial scripts with find(1) and cpio(1).
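
For concreteness, the Iris scripts amount to something like this -- a
rough sketch only, with a made-up filesystem path and timestamp file,
not the exact commands:

    # Full dump of one filesystem, archive written to stdout;
    # leave a timestamp file behind for the next incremental
    find /usr/people -print | cpio -oc
    touch /usr/adm/lastfull.people

    # Incremental: only files changed since that timestamp
    find /usr/people -newer /usr/adm/lastfull.people -print | cpio -oc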

I have a master database that contains one record for each filesystem.
Each record contains the hostname, partition name, filesystem name (for
comments), size (in megabytes), command to take a full dump, and command
to take an incremental dump. I generate the size from the output of
df(1), adding used+avail. I run this database through a perl program
that generates a series of databases -- at this point it generates 15 of
them. Each database describes a single tape. The fulls currently take
7 tapes. For each tape of the full dump, the perl script generates an
incremental database describing an incremental dump of every filesystem
that isn't in that full. So that's 14 databases. The 15th is a complete
incremental.
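
A record in that master database might look roughly like the sketch
below -- the field layout and commands are just an illustration, not
the real format -- and the size field comes straight out of df(1):

    # Hypothetical record layout, one line per filesystem:
    #   host:partition:comment:size_MB:full_command:incremental_command

    # Size in megabytes as used + avail from df(1), skipping the header
    # line (column positions and the -k flag vary between these systems)
    df -k /usr | awk 'NR == 2 { print int(($3 + $4) / 1024) }'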

I have a script that will read one of these databases and write a dump
tape, dumping over the network via rsh piped into dd. I am doing a
two-week rotation. Every night of the first week, and Monday and Tuesday
of the second week, I write two tapes -- one of the full tapes and the
complementary incremental. Wednesday through Friday of the second week I
do incrementals.
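
The per-tape write boils down to something like the following for each
filesystem in the tape's database -- the hostname, devices, dump flags,
and block size here are only examples, not the actual script:

    # Run the remote dump command, catch the byte stream locally, and
    # write it to the Exabyte with a fixed block size; the no-rewind
    # device lets several filesystems land on the same tape.
    rsh sun1 'dump 0f - /dev/rsd0g' | dd of=/dev/nrst0 bs=126b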

Everything is backed up fully twice a month, everything is backed up one
way or another every night, and I never have to sit around shuffling
tapes. We are using Fuji P6-120MP tapes, which we can get for <$5 ea. As
easy as they are to store, and as cheap as they are, we never recycle
them -- we keep them all forever.

As for the issue of dumping live filesystems, I ignore it. I've never
heard anyone claim that doing dumps this way can corrupt the
*filesystem*, just the dump. As far as I know, files changing during
the dump won't hurt anything (the individual files may just not be
dumped correctly); the only thing that can corrupt the entire dump is a
subdirectory being deleted and its inode recycled as a file later on.

Given a choice between taking dumps more often (with a small but
non-zero probability of an error in the dump), and slightly increasing
the reliability while making it *much* more disruptive to take the dump,
I know which way I'll go.

-Bennett
bet at orion.mc.duke.edu
