[NFS] Missing handling for NFS4ERR_OLD_STATEID in nfs4_handle_exception?
Frank Filz
ffilzlnx at us.ibm.com
Thu Apr 12 13:11:02 EDT 2007
On Thu, 2007-04-12 at 07:59 -0400, Jeff Layton wrote:
> This looks pretty much correct to me as-is. If we set ret=0 on
> -NFS4ERR_OLD_STATEID, then the caller won't get back an error code. This
> makes an assumption that every caller of nfs4_handle_exception is
> looping based on exception->retry. I'm not sure if that's a safe
> assumption. A better idea *might* be to fix up nfs4_map_errors not to
> throw the warning for some errors < -1000, but still return an error.
nfs4_map_errors should warn about errors, because it's a last defense
against leaking NFS4 error numbers to the rest of the kernel (that
doesn't recognize them). So before calling nfs4_map_errors(), the error
code should already be converted to an errno code.
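To illustrate the "last defense" role, here is a minimal userspace sketch, not the kernel's actual code. The name map_errors_sketch is hypothetical, and using -1000 as the cutoff is taken from the threshold mentioned in the quoted text. The value 10024 for NFS4ERR_OLD_STATEID is the protocol's numeric status code.

```c
#include <errno.h>
#include <stdio.h>

/* NFSv4 status codes are all well above 1000. */
#define NFS4ERR_OLD_STATEID 10024

/*
 * Hypothetical stand-in for nfs4_map_errors(): ordinary errno values
 * pass through untouched; anything below -1000 is assumed to be an
 * unhandled NFSv4 status, so it is logged and collapsed to -EIO.
 */
static int map_errors_sketch(int err)
{
	if (err < -1000) {
		fprintf(stderr, "could not handle NFSv4 error %d\n", -err);
		return -EIO;
	}
	return err;
}
```

Under this sketch, a caller that passes in -NFS4ERR_OLD_STATEID gets the warning plus -EIO, while one that has already converted to an errno such as -EACCES gets it back unchanged.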
It looked to me like every caller of nfs4_handle_exception() does loop
on exception->retry, and in that case, does not look at the error
returned.
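The calling pattern described above can be sketched in userspace roughly as follows. Every name here (exception_sketch, do_op_sketch, handle_exception_sketch, caller_sketch) is a hypothetical stand-in rather than the kernel's own, and the operation is rigged to fail twice purely for illustration.

```c
#define NFS4ERR_OLD_STATEID 10024

/* Hypothetical stand-in for the kernel's exception state. */
struct exception_sketch {
	int retry;
};

static int attempts;

/* Hypothetical operation: returns OLD_STATEID twice, then succeeds. */
static int do_op_sketch(void)
{
	return (++attempts < 3) ? -NFS4ERR_OLD_STATEID : 0;
}

/*
 * Stand-in for nfs4_handle_exception(): flags a retry for OLD_STATEID
 * but hands the raw error code back to the caller.
 */
static int handle_exception_sketch(int err, struct exception_sketch *exc)
{
	exc->retry = 0;
	if (err == -NFS4ERR_OLD_STATEID)
		exc->retry = 1;
	return err;
}

/* The caller loops while retry is set and never inspects err until the loop ends. */
static int caller_sketch(void)
{
	struct exception_sketch exc = { 0 };
	int err;

	do {
		err = handle_exception_sketch(do_op_sketch(), &exc);
	} while (exc.retry);
	return err;
}
```

As long as retry is set, the value returned by the handler is overwritten on the next pass, which is why the return value for a retried error would not matter to a caller written this way.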
> This sounds sort of like addressing the symptom and not the real
> problem, however. The real question ought to be why you're getting
> OLD_STATEID errors back from the server here. There can be legit
> reasons, but these errors ought to be fairly rare. I generally only have
> seen them when processes are signalled while RPC requests are in flight.
Sure, understanding why we are getting them is important, but since it
appears that handling might be missing, we may be seeing more of them
than expected.
> Also, it seems like when we hit -NFS4ERR_DELAY, we want to retry but
> only if the delay didn't hit an error. It looks like it only returns
> error if process was signalled while in nfs4_delay, and then we want to
> pass an -ERESTARTSYS back up the call chain (and not retry). So I think
> that's also correct as-is.
I agree that's correct.
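A rough userspace sketch of that behaviour, with hypothetical names throughout (delay_sketch stands in for nfs4_delay(), and the signalled flag simulates a signal arriving during the sleep):

```c
#include <errno.h>

#define NFS4ERR_DELAY 10008
#ifndef ERESTARTSYS
#define ERESTARTSYS 512	/* kernel-internal errno, absent from userspace headers */
#endif

struct exception_sketch {
	int retry;
};

/* Hypothetical delay: 0 if the sleep completed, -ERESTARTSYS if signalled. */
static int delay_sketch(int signalled)
{
	return signalled ? -ERESTARTSYS : 0;
}

/* Retry only when the delay itself succeeded; otherwise pass its error up. */
static int handle_delay_sketch(int err, int signalled,
			       struct exception_sketch *exc)
{
	int ret = err;

	exc->retry = 0;
	if (err == -NFS4ERR_DELAY) {
		ret = delay_sketch(signalled);
		if (ret == 0)
			exc->retry = 1;
	}
	return ret;
}
```

With signalled set, the sketch returns -ERESTARTSYS and leaves retry clear, so the caller's loop exits and the error goes back up the call chain.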
Frank