On the silliness of close() giving EDQUOT (NFS out of space)

Jeff Anton jeff at ingres.com
Tue Oct 23 05:48:15 AEST 1990


|>  When you've written a filesystem as successful as NFS, which works as well
|>as AFS, which doesn't have the close() problem we're discussing, I'll gladly
|>use it, and I'll gladly try to get other people to use it, and I'll thank you
|>a thousand times.  But until then, I'll stick with the people who *have* done
|>it.  And I'll admit that I may not be able to see forever into the future and
|>tell that we will never ever be able to justify a filesystem that doesn't
|>detect quota problems on write().
|>
|>-- 
|>Jonathan Kamens			              USnail:
|>MIT Project Athena				11 Ashford Terrace
|>jik at Athena.MIT.EDU				Allston, MA  02134
|>Office: 617-253-8085			      Home: 617-782-0710

NFS, though successful, is hardly a system to present as a good example
of robust out-of-space error reporting.  Its ENOSPC handling is less than
optimal even in a buffered situation.  NFS 'remembers' that a write caused
an out-of-space condition and refuses to query the server about further
writes until the file is closed and reopened.  (The close() returns
ENOSPC as well.)  There are two problems with this optimization.  First,
you can't seek backwards in the file to write an indication that you ran
out of space, because you can't even overwrite blocks that are already
allocated: the no-further-writes rule disallows it, so you can't recover
from ENOSPC even if you checked for it.  Second, if two processes have
the file open, they have to communicate and close the file together to
clear the no-further-writes condition.  I think this behavior exists to
keep the stupid program which ignores errors from write() from killing
NFS server and network performance.  A simple fix would be to have
lseek() clear the no-further-writes condition, on the grounds that after
seeking a write might succeed.
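
To make the recovery case concrete, here is a minimal sketch in C of a
program that checks write() and, on ENOSPC or EDQUOT, seeks back to
overwrite an already allocated byte with a "truncated" marker.  The file
layout (a status byte at offset 0) and the names write_all, mark_status
and save_with_marker are hypothetical, invented for illustration.  On a
local filesystem the overwrite would normally go through; under the
no-further-writes behavior described above it is the mark_status() call
that would fail.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical file layout: byte 0 is a status flag, data follows.
 * '?' = write in progress, 'C' = complete, 'T' = truncated (no space). */

/* Write exactly len bytes; 0 on success, -1 with errno set on failure. */
static int write_all(int fd, const char *buf, size_t len)
{
    assert(buf != NULL);
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, retry */
            return -1;
        }
        if (n == 0) {               /* no progress; avoid spinning forever */
            errno = EIO;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Seek back and overwrite the status byte, which is already allocated. */
static int mark_status(int fd, char status)
{
    if (lseek(fd, 0, SEEK_SET) == (off_t)-1)
        return -1;
    return write_all(fd, &status, 1);
}

/* Returns 0 if all data was saved, 1 if space ran out but the file was
 * marked truncated, -1 on any other failure (including a failed close). */
int save_with_marker(const char *path, const char *data, size_t len)
{
    int rc = 0;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    if (write_all(fd, "?", 1) < 0) {
        rc = -1;
    } else if (write_all(fd, data, len) < 0) {
        if (errno == ENOSPC || errno == EDQUOT)
            rc = (mark_status(fd, 'T') == 0) ? 1 : -1;
        else
            rc = -1;
    } else if (mark_status(fd, 'C') < 0) {
        rc = -1;
    }

    if (close(fd) < 0)              /* the error may only show up here */
        rc = -1;
    return rc;
}
```

A caller would treat a return of 1 as "data lost, but the file says so",
and -1 as "the state of the file is unknown".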

Also, what do you do if close() reports an error like ENOSPC?  Did close()
release the file descriptor or not?  You would have to fstat() it to
determine whether it is closed.  And if it was not closed, how would you
close it?  We need documentation!
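
The fstat() probe can be sketched as follows.  This is only an
illustration of the check described above, not a documented recovery
procedure; the helper names fd_is_open and close_checked are made up
here.  It assumes a single-threaded program: with threads, another
open() could reuse the descriptor number between the close() and the
fstat(), and the probe would then report on the wrong file.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if fd still names an open file, 0 if fstat() says EBADF.
 * Any other fstat() failure is taken to mean the descriptor is open. */
int fd_is_open(int fd)
{
    struct stat st;
    return fstat(fd, &st) == 0 || errno != EBADF;
}

/* Close fd and report what happened.  Returns the result of close();
 * *still_open tells the caller whether the descriptor survived a
 * failed close().  errno is preserved from the close() call. */
int close_checked(int fd, int *still_open)
{
    assert(still_open != NULL);

    int rc = close(fd);
    int saved = errno;

    *still_open = (rc != 0) ? fd_is_open(fd) : 0;
    if (rc != 0) {
        fprintf(stderr, "close(%d) failed: %s; descriptor %s\n",
                fd, strerror(saved),
                *still_open ? "is still open" : "was released");
        errno = saved;
    }
    return rc;
}
```

Even with this check, the second question stands: if the descriptor is
still open, the only thing left to try is another close(), with no
promise that it behaves any differently.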

This might be a simple bug, but no vendor has admitted it.  It's just
another performance vs. reliability trade-off.  (O_SYNC doesn't help.)
(Actually, I've not tested for this bug in the last year or so, so it
might be fixed in some places.)
					Jeff Anton



More information about the Comp.unix.internals mailing list