the /usr partition

Rayan Zachariassen rayan at ai.toronto.edu
Wed Jan 25 19:13:58 AEST 1989


> If it is mounted r/o, then a
> crash that doesn't even have a chance to sync (such as a power outage or a
> "L1-A" followed by "b") won't leave the partition in an inconsistent
> state.  Of course, if you aren't writing to it anyway, it probably doesn't
> matter.  --wnl

We had a series of Very Bad crashes that would cause the disk blocks under
the disk heads at the time of the crash to be nulled.  For a while, every
2-3 days we would lose root, /usr, /var, /tmp, and two to five other
partitions depending on the system activity at the time of the crash (that
tends to happen when the filesystem root inode and a few hundred others
get stomped on).  We are very grateful we had a way of booting that system
that didn't use its main disks, and that we had a stable twin machine
right next to it so we could dd /usr across the net when it got trashed.
The point is, just because software people think mounting a partition
read-only means it won't be corrupted doesn't mean it won't be corrupted
(insert snide remark about hardware people :-).

I wonder how the robust filesystems of the future will deal with failures
that ``can't happen'' according to the software semantics or
``improbable'' hardware failures.  Having to trash an entire partition,
just because all copies of some piece of data go bad, shouldn't be
acceptable in that context.

Paranoia is good for you!

rayan

[[ Being a system manager type, I completely agree with you about
paranoia.  A speck of dust that gets wedged between the flying head and
the disk surface really isn't going to care much about the fact that you
have that partition mounted read-only.  But I was referring more to the
fact that with a partition mounted read-only the cache in the kernel will
never be inconsistent with the disk contents.  The most common cause of
trashed file systems is an unflushed cache (because Unix's cache is not
write-through).  That means that *any* crash (not just those caused by the
disk drivers) can corrupt the disk if the kernel never gets the chance to
"sync" the disks.  By the way, has any vendor ever considered adding
write-through to Unix's file system caching code?  (Now I'll probably get
dozens of messages telling me that it was put in V5R1 500 years ago and
what rock did I just crawl out from under and I should stop putting these
editorial comments in if they're just going to be wrong all the time :-).
--wnl ]]
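
To make the write-back point in the note above concrete, here is a toy
Python model.  It is only a sketch, not real kernel code: the Disk and
BufferCache classes, the run() helper, the block numbers, and the block
contents are all invented for illustration.  It assumes a cache that
holds modified blocks in memory until sync() is called, and compares it
with a hypothetical write-through variant that pushes every write to the
disk immediately.

```python
class Disk:
    """Toy stable storage: block number -> contents."""

    def __init__(self):
        self.blocks = {}


class BufferCache:
    """Toy buffer cache sitting in front of a Disk (illustrative only)."""

    def __init__(self, disk, write_through=False):
        self.disk = disk
        self.write_through = write_through
        self.dirty = {}  # blocks modified in memory but not yet on disk

    def write(self, blockno, data):
        if self.write_through:
            # Write-through: the disk is updated before write() returns.
            self.disk.blocks[blockno] = data
        else:
            # Write-back: the block waits in memory for the next sync().
            self.dirty[blockno] = data

    def sync(self):
        """Flush every dirty block to the disk."""
        self.disk.blocks.update(self.dirty)
        self.dirty.clear()

    def crash(self):
        """Power loss: anything that lived only in memory is gone."""
        self.dirty.clear()


def run(write_through):
    """Do two related writes, crash before the second is synced,
    and return what actually reached the disk."""
    disk = Disk()
    cache = BufferCache(disk, write_through)
    cache.write(10, "directory entry: foo -> inode 7")
    cache.sync()                       # a periodic sync catches this one
    cache.write(20, "inode 7: allocated")
    cache.crash()                      # crash before the next sync
    return disk.blocks


if __name__ == "__main__":
    print("write-back:   ", run(write_through=False))
    print("write-through:", run(write_through=True))
```

In the write-back run the crash leaves the directory entry on disk
pointing at an inode that was never written; in the write-through run
both blocks survive.  A partition mounted read-only never calls write()
at all, so there is nothing dirty to lose in either mode, which is the
case wnl was describing.  None of this helps against the hardware
failures rayan describes.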



More information about the Comp.sys.sun mailing list