(was slashes, now NFS devices)

Robert Thurlow thurlow at convex.com
Sun Mar 17 13:07:59 AEST 1991


More from David Zink:

>I do the following command:
>$ cat /bin/* > P1 & ; sleep 1 ; ls -l
>The ls hangs until the cat finishes.

No, it doesn't hang on our system, but it does take a long time.
You're getting an awful lot of contention for the lock on that file, so
the stat(2) is not going to be an instant thing.

And guess what?  This is exactly the way local disk works (try it, but
scale up your data size accordingly).  The only difference is the
performance bottleneck due to the network.  If you crippled your I/O
subsystem, you'd see similar things.  Until we get new networks that
are two orders of magnitude faster, this is likely to remain the case.
And even then, misdirected packets, lossy networks and dead servers
are going to have to be dealt with.
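
For anyone who wants to try the local-disk comparison, here is a minimal
sketch in Python.  It writes to a file from a background thread and
records the slowest stat() seen while the write is in flight.  The
function name, the chunk sizes and the file name P1 are illustrative
assumptions.  The numbers will vary widely between systems, and on a
fast local disk the worst case may be too small to notice.

```python
import os
import tempfile
import threading
import time


def time_stat_during_write(path, chunk=1 << 20, chunks=32):
    """Write chunks * chunk bytes to path in a background thread and
    return the slowest os.stat() latency (seconds) seen meanwhile."""
    def writer():
        block = b"\0" * chunk
        with open(path, "wb") as f:
            for _ in range(chunks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())   # force the data out to the disk

    # Create the file first so stat() never races with its creation.
    open(path, "wb").close()
    t = threading.Thread(target=writer)
    t.start()
    worst = 0.0
    while t.is_alive():
        start = time.perf_counter()
        os.stat(path)
        worst = max(worst, time.perf_counter() - start)
    t.join()
    return worst


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        worst = time_stat_during_write(os.path.join(d, "P1"))
        print(f"slowest stat(): {worst * 1000:.3f} ms")
```

Pointing the path at an NFS mount instead of a local temporary
directory should show the same effect with the network cost added.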

What people don't seem to realize until they work with this stuff a bit
is that you're taking a kernel that assumes it has absolute dominion
over a disk that will never fail and trying to make it understand that
yes, the disk might not be exclusively yours, and yes, the disk might
just never give you that block.  This is a damned tough challenge to
meet.  Some problems can be addressed by changing the way things are
done, but you're still up against a big brick wall in some areas.
Thank _God_ NFS doesn't permit transparent access to remote devices;
the assumptions there would make the disk I/O system assumptions look
tame.

[Open-unlink failures:]
>They both open the same file (already a bug) and one unlinks it.  Poof,
>stale file handle. (I could as easily say the problem is the non-use of O_EXCL)

Exactly.  I hammer on Sun about this point: this lack is definitely a
protocol bug that needs to be fixed, and if only it were, I think I
could make my kernel Do The Right Thing in under a day.  This is the
thing that annoys me most about NFS.
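
As a point of reference, here is a hedged Python sketch of the local
semantics that the open-unlink case fails to preserve, plus the O_EXCL
creation mentioned in the quote.  The function names are made up for
illustration.  On a local Unix filesystem the read after unlink
succeeds.  The ESTALE branch is expected to fire only when the file
lives on NFS and some other client removed it.

```python
import errno
import os
import tempfile


def read_after_unlink(path):
    """Open path, unlink it, then read through the still-open descriptor.

    Returns the bytes read, or None if the read fails with ESTALE
    (the stale file handle case)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.unlink(path)            # the name is gone; locally the inode lives on
        try:
            return os.read(fd, 4096)
        except OSError as e:
            if e.errno == errno.ESTALE:
                return None        # server no longer knows this file
            raise
    finally:
        os.close(fd)


def create_exclusive(path):
    """Create path only if it does not already exist; True on success."""
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    except FileExistsError:
        return False
    os.close(fd)
    return True
```

Whether the exclusive create is atomic over NFS depends on the
protocol version and the implementation, so treat create_exclusive()
as a local-filesystem illustration rather than a cross-client lock.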

>And I wish you wise guys would stop telling me 'If you don't like it, write
>better.'  I have not the time, and I suspect that even if I did write one it
>would never achieve much distribution due to the livable solutions already
>in place.

NFS cannot be all that I want until a partitioned kernel with
redundancy and fault-tolerance comes along, one that is obviously the
best direction for the majority to choose as its next Unix base (i.e.,
Unix System VII Release 69, based on both Mach and Chorus).  Sun has
something that works this century, and it basically lets us get work
done we _could not_ do otherwise without great expense.  Does it need
improvement?  Yes.  Is it a pain in the butt at times?  Yes.  Do
current implementations have bugs?  Definitely.  I am very interested
in talking about those bugs and trying to fix them, and I get impatient
with people who only want to bitch, especially those who I feel
haven't done enough thinking about the issues.  I am more compassionate
about the problems of naive end users than I am about someone who has
an insufficiently-researched great idea about how he'd do it, if he
only had the time.

>I have heard no argument in favor of NFS that did not have the same form
>as the arguments I always hear in favor of Basic and Cobol.

Don't use NFS.  Just try to live without it.  And when you do use it
(because you'll have to sometimes), report the bugs you find so they
can be fixed.

>P.S.  If NFS need hold no state on the server, what is a .nfsXXX file?
>"I know, I know, it's not part of the protocol."

Think:  if the server crashes, the file will still be there.  If the
client crashes, the process that cared won't be running.  The state
has no impact on correctness in either case.  NFS "statelessness" is
not exactly what the marketing word implies; if a different word had
been coined, you might be just as confused or more so.  I think that
people who take the word "statelessness" at face value get the confusion
they deserve.
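
For readers who haven't met a .nfsXXX file, here is a toy model in
Python of the client-side trick usually called "silly rename".  It is
not kernel code, and the class name and the naming scheme for the
hidden file are assumptions for illustration.  When a process unlinks
a file it still has open, the client renames the file to a hidden name
instead of removing it, and removes it for real on the last close.

```python
import os
import tempfile
import uuid


class SillyRenameFile:
    """Toy model of a client that hides, rather than removes, an open file."""

    def __init__(self, path):
        self.path = path
        self.hidden = None             # set once unlink() has been called
        self.fh = open(path, "rb")

    def unlink(self):
        # Rename to a .nfsXXX-style name instead of removing outright,
        # so the server still has a file for the open handle to refer to.
        directory = os.path.dirname(self.path) or "."
        self.hidden = os.path.join(directory, ".nfs" + uuid.uuid4().hex[:8])
        os.rename(self.path, self.hidden)

    def read(self):
        return self.fh.read()

    def close(self):
        self.fh.close()
        if self.hidden is not None:
            os.remove(self.hidden)     # last close: now really delete it
            self.hidden = None
```

If the client dies between unlink() and close(), the hidden file is
simply left behind on the server, where it harms nothing.  If the
server reboots, the hidden file is still on disk when it comes back.
Either way the only state that matters lives on the client.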

Rob T
--
Rob Thurlow, thurlow at convex.com
An employee and not a spokesman for Convex Computer Corp., Dallas, TX



More information about the Comp.unix.internals mailing list