nfs/svn/apache strange interaction

J. Bruce Fields bfields at fieldses.org
Sat Apr 26 21:37:30 EDT 2008


On Sat, Apr 26, 2008 at 05:33:41PM -0700, Trond Myklebust wrote:
> 
> On Wed, 2008-04-23 at 13:26 -0400, J. Bruce Fields wrote:
> 
> > Trond, why would a single ESTALE error on a PUTFH cause the client to
> > fail the whole syscall?  Shouldn't it retry at least in some cases?  (We
> > don't know the rest of the operations in the compound yet.)
> 
> Why should it retry anything at all? The server just told it that access
> to that file has been revoked. It is entirely correct to treat ESTALE as
> a fatal error.

I'm not sure what exactly happened in this case, but say it was
something like:

	1. application opens and closes /some/file
	2. On the server, /some/file is deleted, and a new file is
	   created in its place.
	3. Application tries to stat /some/file
	4. Client assumes that /some/file still points at the same
	   filehandle it did in step 1, issues a stat on that filehandle,
	   and gets ESTALE.
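
A toy model of the sequence above, in Python. This is only a sketch and
not real NFS client code: ToyServer, ToyClient, and the retry_lookup flag
are hypothetical names invented for illustration. It shows the difference
between failing outright on ESTALE and redoing the lookup once.

```python
import errno


class ToyServer:
    """Hypothetical stand-in for an NFS server: maps paths to filehandles."""

    def __init__(self):
        self._next_fh = 1
        self._by_path = {}
        self._live = set()

    def create(self, path):
        # Replacing an existing file invalidates its old filehandle.
        old = self._by_path.get(path)
        if old is not None:
            self._live.discard(old)
        fh = self._next_fh
        self._next_fh += 1
        self._by_path[path] = fh
        self._live.add(fh)
        return fh

    def lookup(self, path):
        return self._by_path[path]

    def getattr(self, fh):
        if fh not in self._live:
            raise OSError(errno.ESTALE, "Stale file handle")
        return {"fh": fh}


class ToyClient:
    """Hypothetical client that caches path -> filehandle lookups."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def stat(self, path, retry_lookup):
        fh = self.cache.get(path)
        if fh is None:
            fh = self.cache[path] = self.server.lookup(path)
        try:
            return self.server.getattr(fh)
        except OSError as e:
            if e.errno != errno.ESTALE or not retry_lookup:
                raise
            # The cached handle is stale: redo the lookup once, then
            # stat the fresh handle. A second ESTALE is not caught.
            fh = self.cache[path] = self.server.lookup(path)
            return self.server.getattr(fh)
```

With retry_lookup=False the stat in step 3 fails with ESTALE, as in step
4; with retry_lookup=True the client picks up the new filehandle instead.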

Retrying the stat would obviously be pointless, but redoing the lookup
wouldn't be.  Uh, OK, this is just rehashing the previous discussion
raised by Peter Staubach's patches.  Looking back at some of that:

	http://www.ussg.iu.edu/hypermail/linux/kernel/0801.2/1467.html

So it's possible this could be due in part to 1-second ctime resolution
on the server?  I wonder if it'd be possible for Guillaume to verify
whether the same problem is reproducible on xfs, for example.
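
To illustrate why timestamp resolution could matter, here is a small
sketch. It assumes (and this is an assumption, not something confirmed in
this thread) that the client decides whether a cached lookup is still good
by comparing the directory's change time against the value it saw earlier.
The function names and event times are made up for illustration; times
are in milliseconds.

```python
def truncate(ts_ms, resolution_ms):
    """Round a timestamp down to the server's timestamp granularity."""
    return ts_ms - (ts_ms % resolution_ms)


def cache_looks_valid(cached_ctime, current_ctime):
    """Assumed revalidation rule: unchanged ctime means unchanged directory."""
    return cached_ctime == current_ctime


# Hypothetical events, both inside the same wall-clock second.
first_change_ms = 100_200   # directory changes; client then caches a lookup
second_change_ms = 100_700  # file is replaced on the server


def stale_cache_trusted(resolution_ms):
    cached = truncate(first_change_ms, resolution_ms)
    current = truncate(second_change_ms, resolution_ms)
    return cache_looks_valid(cached, current)
```

Under these assumptions, stale_cache_trusted(1000) is true (the second
change is invisible at 1-second resolution), while stale_cache_trusted(1)
is false because a finer-grained timestamp exposes the change.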

Or perhaps he could try Peter's patches:

	http://www.ussg.iu.edu/hypermail/linux/kernel/0801.2/1357.html

if only to confirm that the problem is actually with out-of-date lookup
results.

--b.

