Another reason I hate NFS: Silent data loss!

der Mouse mouse at thunder.mcrcim.mcgill.edu
Sat Jun 22 23:33:34 AEST 1991


In article <truesdel.677362688 at sun418>, truesdel at nas.nasa.gov (David A. Truesdell) writes:
> brnstnd at kramden.acf.nyu.edu (Dan Bernstein) writes:
>>> In article <27226 at adm.brl.mil> mike at BRL.MIL (Mike Muuss) writes:
>>> NFS is designed as a reliable protocol.  I have pounded more than
>>> 250 NFS requests/sec against a fileserver, and no data loss.
>>> Things you should check are the number of retransmits you
>>> authorized in /etc/fstab, [...]
>> If the number of retransmits runs out, the writing process
>> ``should'' get an error.  Otherwise the implementation is
>> (obviously) buggy.
> Why ``should'' it?  Your writes probably put their data into the
> buffer cache just fine, it's the subsequent flushing of the buffer
> cache that failed.  And guess what?  The write had probably already
> returned by then.
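
To make that failure mode concrete, here is a sketch in C (the path
is invented, and exactly where -- or whether -- the error surfaces is
implementation-dependent): write() succeeds against the local buffer
cache, and any error from the failed flush shows up, at best, later
at fsync() or close().

	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <unistd.h>
	#include <fcntl.h>

	int
	main(void)
	{
		char buf[8192];
		int fd;

		/* /n/server/file is a made-up path on an NFS mount. */
		fd = open("/n/server/file", O_WRONLY | O_CREAT, 0644);
		if (fd < 0) {
			perror("open");
			exit(1);
		}

		memset(buf, 'x', sizeof buf);

		/* This usually just dirties pages in the local buffer
		 * cache, so it can succeed even with the server down. */
		if (write(fd, buf, sizeof buf) < 0)
			perror("write");

		/* The error from the failed asynchronous flush, if it
		 * is reported at all, surfaces here, long after the
		 * write() above returned success. */
		if (fsync(fd) < 0)
			perror("fsync");
		if (close(fd) < 0)
			perror("close");
		return 0;
	}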

Consider a real disk.  What happens if a real disk doesn't respond when
the kernel writes a buffer from the buffer cache to it?

Right.  The kernel panics.

So a case could be made that if the number of retransmits runs out
(where a hard mount could be considered as specifying infinite
retransmission), the kernel should panic.

Unfortunately, fileservers die much more often than disks do.  The
current behavior is a compromise between preserving disk semantics and
practicality.
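
For what it's worth, the knobs live in /etc/fstab; something like the
following (server name and paths invented, and the exact option
syntax varies between vendors):

	# hard mount: retransmit forever; processes block until the
	# server comes back, so no write error is ever returned
	server:/export/home  /home  nfs  rw,hard,intr               0 0

	# soft mount: give up after `retrans' tries of `timeo' tenths
	# of a second each; the failing operation returns EIO
	server:/export/scr   /scr   nfs  rw,soft,retrans=5,timeo=11 0 0

A hard mount preserves disk-like semantics at the cost of hanging; a
soft mount admits the error, but makes silent loss possible whenever
an application ignores the return values of write() and close().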

(No, I don't particularly like NFS either.  For us, unfortunately, it
is pretty much the only game in town.)

					der Mouse

			old: mcgill-vision!mouse
			new: mouse at larry.mcrcim.mcgill.edu


