Why idle backups??

Chris Torek chris at mimsy.umd.edu
Thu Nov 1 10:58:21 AEST 1990


In article <32749 at sparkyfs.istc.sri.com> zwicky at sparkyfs.istc.sri.com
(Elizabeth Zwicky) answers the `subject' question.  Five articles later...

In <KOPPENH.90Oct24113316 at dia.informatik.uni-stuttgart.de>
koppenh at informatik.uni-stuttgart.de (Andreas Koppenhoefer), and in
<339 at gallium.UUCP> garyb at gallium.UUCP (Gary Blumenstein), ask for the
Purdue mods mentioned.

Equivalent mods are already included in recent versions of `dump' (as
distributed by Berkeley since 4.3-tahoe if not earlier, and Sun since
4.0.3 if not earlier, and presumably DEC by 2001 if not earlier :-) ).
The actual changes are:

 1. Add a `dirdump' routine to dumptraverse.c; use this in
    dumpmain for pass III (directory dump) by changing

	pass(dump, dirmap);

    to

	pass(dirdump, dirmap);

    where dirdump(ip) simply calls dump(ip) if and only if
    (ip->di_mode & IFMT) == IFDIR.

    This prevents `restore' from seeing a regular file in the middle
    of the directory listing, which hopelessly confuses old versions
    of restore (and possibly new ones as well).  Such things happen if
    a directory is deleted and its inode reused as a regular file before
    dump manages to reach it.  (More on this below.)

 2. Add code to dump() (also in dumptraverse.c) to skip a file if its
    mode (ip->di_mode) is 0, i.e., the inode is no longer in use.  This
    happens whenever a file or directory is deleted and the inode is
    *not* reused.
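The two changes above can be sketched in C.  This is an illustrative
stand-in, not the actual dump source: `struct dinode' is reduced to the
one field that matters here, dump() is a stub that just counts calls,
and only the IFMT/IFDIR values are the traditional octal ones.

```c
/* Hedged sketch of the two dumptraverse.c changes.  The real struct
   dinode and dump() are far larger; only the mode logic is shown. */
#include <assert.h>

#define IFMT  0170000   /* mask for the file-type bits of di_mode */
#define IFDIR 0040000   /* directory */

struct dinode {
    unsigned short di_mode;  /* type + permissions; 0 if inode is free */
    /* ... remaining on-disk fields omitted ... */
};

static int dumped;           /* counts dumps, for this illustration only */

static void dump(struct dinode *ip)
{
    if (ip->di_mode == 0)    /* change 2: inode freed since pass I/II -- */
        return;              /* skip it entirely */
    dumped++;                /* the real dump() writes TS_INODE + blocks */
}

/* change 1: pass III calls dirdump, which refuses non-directories, so a
   reused inode can never inject a regular file into the directory area */
static void dirdump(struct dinode *ip)
{
    if ((ip->di_mode & IFMT) == IFDIR)
        dump(ip);
}
```

With these two filters, a directory deleted and reused as a regular
file is simply skipped in pass III rather than confusing restore.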

In <1990Oct24.210312.3271 at cubmol.bio.columbia.edu>
ping at cubmol.bio.columbia.edu (Shiping Zhang) asks how to put a complete
backup onto no more than one tape.  This is easily accomplished by
buying an 8mm Exabyte drive, unless you have disks that hold more than
2 GB.  (DAT drives will also work but hold less data, and the things
cost more.  New Exabyte hardware that stores over 4 GB per tape is now,
or will soon be, available as well.)

In <1990Oct24.151840.25570 at ccad.uiowa.edu> emcguire at ccad.uiowa.edu (Ed
McGuire) asks about validating a dump.  This is difficult, as Elizabeth
Zwicky describes in <32757 at sparkyfs.istc.sri.com>:

>1) Some individual file may be missing or damaged; without
>attempting to restore that particular file, you will never know.

It would not be difficult, although restore does not do this now, to
write a program that compares the maps at the front with the inode
special records to verify that all files exist on the tape.  Files
that were removed and not replaced, or directory files that were removed
and were replaced with something other than another directory, will of
course be `missing'.
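Such a checker might look like the following sketch.  The bitmap layout
(one bit per inode, inode 1 in bit 0) is an assumption for
illustration, not the exact dump tape encoding, and the driver that
would read TS_BITS and TS_INODE records off the tape is omitted.

```c
/* Compare the TS_BITS inode map from the front of the tape against the
   set of inodes actually delivered in TS_INODE records. */
#include <assert.h>

#define MAXINO 1024

static unsigned char bits_map[MAXINO / 8];   /* from the TS_BITS record */
static unsigned char seen_map[MAXINO / 8];   /* TS_INODE records found  */

static int tstbit(unsigned char *map, int ino)
{
    return (map[(ino - 1) / 8] >> ((ino - 1) % 8)) & 1;
}

static void markbit(unsigned char *map, int ino)
{
    map[(ino - 1) / 8] |= 1 << ((ino - 1) % 8);
}

/* call once per TS_INODE record read from the tape */
static void saw_inode(int ino) { markbit(seen_map, ino); }

/* after the whole tape is read: how many inodes did the map promise
   that no TS_INODE record delivered?  (the `missing' files) */
static int missing_count(void)
{
    int ino, missing = 0;
    for (ino = 1; ino <= MAXINO; ino++)
        if (tstbit(bits_map, ino) && !tstbit(seen_map, ino))
            missing++;
    return missing;
}
```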

>2) Some individual file may be damaged so that any attempt to
>read it confuses restore permanently.

Any such thing points to a bug in restore.  Restore should be (but
perhaps is not) able to recover from such things.  Naturally, such a
damaged file will itself not be recoverable.

These events [>1)] and [>2)] are most likely to happen when a file
changes size while that file is being dumped.  (Dump reads the inode,
then the direct block contents, then the indirect blocks and their
contents, all the while assuming that this data is valid.)  This should
merely cause the tape data to be invalid, and should not give restore
fits.  Note that restoring such a file could breach security: e.g., the
sequence of events could be:
 A. dump discovers a 100 MB file
 B. dump begins dumping the file
 C. the file is truncated
 D. the blocks for that file are allocated to a new, high-security
    (mode 0600) file owned by someone else
 E. dump finishes dumping the file.
The resulting tape holds up to 100 MB of high-security file contents
attached to the original user id.  When restored, the 100 MB file
`reappears' but its contents differ from the original.

>(Since restore doesn't tell you what it's trying to restore, only what
>it has finished restoring, if you run into one of these when trying to
>restore, you get to play binary search, doing add and extracts on
>subsets of your original file list until you have everything but the
>bad one. Ick.)

Actually, you can run a `restore iv', add what you like, `extract', and
note the name and/or inode number of the last file printed.  Then run
`restore t | sort -n' and look at the next higher inode number.  This is
the file that is causing restore to hang up.  (`restore rv' will also work.
Be sure to use a CRT so as not to waste paper.)

>3) At some point, some file may be screwed enough to corrupt
>everything after it ...

Again, this should never happen (but probably can).  In particular, this
used to happen with the 4.2BSD dump/restore when the pass(dump, dirmap)
wound up dumping a regular file (see `1.' near top of this article);
this has been fixed.

>4) There may be physical write or read errors on the tape.

Good hardware will detect these while the tape is being written, though
of course marginal defects may escape notice the first few times.

In another article which I foolishly forgot to note, Dick Karpinski
suggests that dump ought to be able to (slowly) produce a correct dump
even when the file system is active, perhaps (my interpretation) by
using some other algorithm.  The answer to this is `no and yes': it
could, but only by using a staging area at least as large as the final
backup, and potentially unbounded time.  The reason for this is simple,
though the details are not.

The tapes produced by dump are intended to be a complete snapshot of
the state of the file system, but are ordered so that restores are not
too difficult, without being ordered so strongly that dumps are slow.
(Some may argue with the latter statement. :-) )  To this end, the
contents of an infinitely long tape are:

 A. A `TS_TAPE' record naming dump time, level, etc.

 B. A bitmap of clear inodes (i.e., those that are not holding any file,
    of any kind).  This is used to tell which files have been removed
    since the previous dump (so that `restore r' can put things back as
    they were).  This is prefixed by a `TS_CLRI' record.

 C. A bitmap of set inodes (those that are holding files).  This is
    prefixed by a `TS_BITS' record.

 D. All the directories needed to produce complete path names to all the
    files on the tape.  These are a series of (TS_INODE,blocks,TS_ADDR,
    blocks,TS_ADDR,blocks,...) records, where each TS_INODE or TS_ADDR
    record contains enough information to tell how many `blocks' appear
    on the tape.  (Holes in files result in non-written `blocks', i.e.,
    a file consisting entirely of a hole has only TS_INODE and perhaps
    TS_ADDR records.)

 E. All the files being dumped (see item C above).

 F. A `TS_END' record.
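The layout A-F above can be sketched as a sequence of record types.
The TS_* numbers below are, as best I recall, the traditional ones from
<protocols/dumprestore.h>; the ok_order() checker is purely
illustrative and ignores the internal directories-before-files
ordering, which is invisible at the record-type level.

```c
/* Record types on a dump tape, in the traditional numbering. */
#include <assert.h>

#define TS_TAPE  1   /* A: dump date, level, etc. */
#define TS_INODE 2   /* D/E: a file or directory, followed by blocks */
#define TS_BITS  3   /* C: map of inodes holding files */
#define TS_ADDR  4   /* D/E: continuation of the previous TS_INODE */
#define TS_END   5   /* F: end of the dump */
#define TS_CLRI  6   /* B: map of inodes cleared since the last dump */

/* Does types[0..n-1] follow the single-tape layout A..F?
   TS_TAPE, TS_CLRI, TS_BITS, then TS_INODE/TS_ADDR records, TS_END. */
static int ok_order(const int *types, int n)
{
    int i = 0;
    if (n < 4 || types[i++] != TS_TAPE) return 0;
    if (types[i++] != TS_CLRI) return 0;
    if (types[i++] != TS_BITS) return 0;
    while (i < n - 1) {
        if (types[i] == TS_INODE || types[i] == TS_ADDR)
            i++;
        else
            return 0;
    }
    return types[n - 1] == TS_END;
}
```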

The boundary between directories and files is defined implicitly by the
first non-directory on the tape.  This is why the `dirdump' routine is
so important for active dumps.  Restore would have to be made much
smarter to recover from `embedded' files in the directory area, and
would still have to read the entire dump, not just the directory part,
to be sure it got them all.

If a dump requires more than one tape, each tape after the first begins
with a TS_TAPE record followed by the same bitmap as in C above.  (In
theory this allows restore to `pick up' in the middle.  In actuality, a
data block which sufficiently resembles a TS_INODE record will fool a
restore that is doing this.  The 4.3-reno dump has a DR_NEWHEADER flag
and new fields in the TS_TAPE record that tell how far restore has to
go to get to a real TS_INODE record, which avoids this problem.)

Dump decides which files (including directories) to dump by checking
the inode times (atime, mtime, ctime, although the ctime alone should
suffice).  It reads a bunch of inodes from the raw disk device and
pokes through them, reads another bunch, etc., until it has read them
all.  Each file that must be dumped sets a bit in the `files to dump'
map.  This is `pass I (regular files)'.
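The pass I test amounts to comparing inode times against the previous
dump's date.  In this sketch the previous-dump date is a parameter (in
the real program it comes from /etc/dumpdates), and the inode is again
reduced to the fields being inspected:

```c
/* Sketch of the pass I decision: does this inode need dumping? */
#include <assert.h>

struct din {
    long di_mtime;   /* last data modification */
    long di_ctime;   /* last inode change (covers chmod, chown, ...) */
};

static int changed_since(const struct din *ip, long ddate)
{
    /* ctime alone should suffice, as noted above, since any change to
       a file also updates its ctime; mtime is checked as well */
    return ip->di_mtime >= ddate || ip->di_ctime >= ddate;
}
```

Every inode for which this returns nonzero gets its bit set in the
`files to dump' map.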

Next dump scans through all the inodes again, this time checking to see
if it needs to add any parent directories so as to reach the marked
inodes.  It loops doing this secondary scan until nothing more is
marked.  This is `pass II (directories)', and this is why pass II is
usually run three or four times.  (To make it run lots of times, mkdir
a a/b a/b/c a/b/c/d a/b/c/d/e a/b/c/d/e/f a/b/c/d/e/f/g a/b/c/d/e/f/g/h
and do a full backup, then touch a/b/c/d/e/f/g/h/i and do an
incremental backup.)  I added a hack, included in the latest BSD dump,
that avoids pass II entirely if all directories are being dumped (this
speeds up all level 0 dumps).  (To make it pretty, it still claims to
run pass II.  You can tell that you have this version by the fact that
`dump 0 ...' prints `pass I', runs for a while, then prints `pass II'
and `pass III' without pausing in between.)  If a file with several
links changes, all directories leading to it are put on the tape.
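The pass II fixed-point loop can be sketched as follows.  The parent[]
array is a toy stand-in for what dump actually recovers by reading
directory contents off the raw device; the point is the repeated scan
in ascending inode order, which is why a deep chain of directories
costs one extra scan per level:

```c
/* Sketch of pass II: mark parent directories until nothing changes. */
#include <assert.h>

#define NINO 16

static int marked[NINO];   /* `files to dump' map from pass I */
static int parent[NINO];   /* parent[ino] = parent directory's inode */

static int pass2(void)
{
    int ino, changed, passes = 0;
    do {
        changed = 0;
        passes++;
        for (ino = 1; ino < NINO; ino++)
            if (marked[ino] && !marked[parent[ino]]) {
                marked[parent[ino]] = 1;   /* needed for the path */
                changed = 1;
            }
    } while (changed);
    return passes;   /* number of scans, including the final clean one */
}
```

Since inodes are scanned in ascending order and a parent usually has a
lower inode number than its children, each scan only propagates the
mark up one level, which reproduces the mkdir-chain behavior described
above.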

In pass III, dump actually writes all those directories it marked in
passes I and II to the tape, and in pass IV, dump writes all the other
files it marked (including devices and symlinks).

In order to make a consistent backup, dump would have to:

 1. Scan the disk for files to back up.
 2. Write the backup to a staging area.
 3. Use file-system calls (lstat()) to check up on everything
    written to the staging area.
 4. For each file changed since part 1, replace its backup in the
    staging area, and add any new directories required.  For each
    file deleted since part 1, effectively remove it from the staging
    area.  Repeat from 3. until no files have changed or been removed.
 5. Dump the staging area to the backup device.  The date of this
    dump would be the time at which the final scan in step 3 (the
    one that found no changes) began.

A much simpler method would be to freeze activity on the file system
being dumped.  A `freezefs' system call is being contemplated.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 405 2750)
Domain:	chris at cs.umd.edu	Path:	uunet!mimsy!chris


