Another reason I hate NFS: Silent data loss!

David A. Truesdell truesdel at nas.nasa.gov
Thu Jun 20 06:18:08 AEST 1991


brnstnd at kramden.acf.nyu.edu (Dan Bernstein) writes:

>In article <27226 at adm.brl.mil> mike at BRL.MIL ( Mike Muuss) writes:
>> NFS is designed as a reliable protocol.  I have pounded more than 250
>> NFS requests/sec against a fileserver, and no data loss.

>In this case the 20 requests came in under 1/50 of a second (somewhat
>smaller, I think, but I don't have good measuring tools). I can't
>sustain this load from one Sun, but a single burst was enough to lose
>data.

>> Things you
>> should check are the number of retransmits you authorized in /etc/fstab,

>If the number of retransmits runs out, the writing process ``should''
>get an error. Otherwise the implementation is (obviously) buggy.

Why ``should'' it?  Your writes probably put their data into the buffer cache
just fine; it was the subsequent flushing of the buffer cache that failed.  And
guess what?  The write() call had probably already returned by then, so there
was no call left to hand the error back to.  Or do you always use O_SYNC when
opening files for writing?
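
For anyone who wants the error reported, here is a minimal sketch in C of what
that looks like, assuming a POSIX system.  The helper name write_checked() is
made up for illustration; the calls it uses (open, write, fsync, close) are
the standard ones.  O_SYNC asks each write() to wait until the data has been
handed to stable storage, and the fsync() and close() return values are
checked as well, since a deferred write error can surface at either point.
How faithfully a given NFS client honours any of this is implementation
dependent, so treat it as a starting point rather than a guarantee.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes from buf to path.
 * Returns 0 on success, -1 on any failure (errno is left set),
 * including failures reported only by fsync() or close(). */
int write_checked(const char *path, const void *buf, size_t len)
{
    /* O_SYNC: each write() returns only after the data has been
     * pushed out, so a failed flush shows up as a failed write(). */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0644);
    if (fd < 0)
        return -1;

    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, retry the write */
            int saved = errno;
            close(fd);
            errno = saved;          /* report the write error, not close's */
            return -1;
        }
        p += n;                     /* short write: advance and continue */
        len -= (size_t)n;
    }

    /* Largely redundant with O_SYNC, but this is the call to check
     * if the file was opened without it. */
    if (fsync(fd) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }

    /* close() can also report a deferred write error. */
    if (close(fd) < 0)
        return -1;

    return 0;
}
```

A caller would do something like
`if (write_checked("/mnt/nfs/out.dat", data, n) < 0) perror("write_checked");`
where the path is only a placeholder.  The price is that every write now waits
on the server, which is exactly why most programs don't do it.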
--
T.T.F.N.,
dave truesdell (truesdel at nas.nasa.gov)
"Carpe Noctem"


