[pnfs] [PATCH 9/9] pnfs client prevent race in sequence slot

William A. (Andy) Adamson andros at citi.umich.edu
Thu Sep 27 15:00:20 EDT 2007


On 9/26/07, Trond Myklebust <trond.myklebust at fys.uio.no> wrote:
>
> On Wed, 2007-09-26 at 17:26 -0400, William A. (Andy) Adamson wrote:
> > Here is what is happening on the failure of the Connectathon lock test
> > #12.
> >
> > The client produces a SEQUENCE:PUTFH:GETATTR (nfs4_proc_getattr)
> > compound followed by a SEQUENCE:PUTFH:LOCKU (nfs4_proc_unlck).
> >
> > Wireshark shows the nfs4_proc_getattr as succeeding, and the
> > nfs4_proc_unlck as failing with BAD_SESSION.
> >
> > The nfs4_proc_getattr rpc task catches a signal (from the test) and
> > returns -ERESTARTSYS, having not called decode, which results in the
> > nfs4_proc_getattr local variable nfs41_sequence_res.status being left
> > unset - so whatever garbage is in the uninitialized
> > nfs41_sequence_res.status is what is passed to nfs41_sequence_done.
> > This happens to be non-zero, and is interpreted by
> > nfs41_proc_sequence_done() as an error, which means that the sequence
> > number for the slot is decremented, and the next rpc (nfs4_proc_unlck)
> > will send the same sequence number and get an error (BAD_SESSION on
> > our server).
> >
> > The client does not know if the rpc succeeded, because it never
> > decoded the reply. But, in order to process the SEQUENCE operation on
> > the nfs4_proc_getattr and not get out of sync with the server, the
> > client needs to know the status of the SEQUENCE operation sent by the
> > server.
> >
> > Suggestions?
>
> I suggest bringing this question up on the ietf channel. I think this
> question is of interest to more than just Linux...
>
> That said, how about the following suggestion:
>
> If we have to interrupt an RPC call, then we immediately fire off an
> asynchronous RPC call with a single SEQUENCE call that uses the _same_
> sa_sequenceid as the synchronous RPC call that was cancelled (and drops
> all the other arguments).


OK, but can't we avoid sending yet another RPC by looking at what the
server has already sent (or not)? We do have all the information.
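
For reference, the failure in the quoted description can be modeled with
the minimal sketch below. It is not the kernel code: struct seq_res,
struct client_slot, sequence_done(), interrupted_call() and the
SEQ_STATUS_UNSET sentinel are hypothetical stand-ins. It assumes the slot's
sequence number is bumped when the call is sent and taken back when the
done handler sees an error.

```c
#include <assert.h>

/* Hypothetical sentinel meaning "the SEQUENCE reply was never decoded". */
#define SEQ_STATUS_UNSET (-1)

/* Simplified stand-ins, not the kernel structures. */
struct seq_res { int status; };
struct client_slot { unsigned int seq_nr; int in_doubt; };

/* Model of the done handler: seq_nr was bumped when the call was sent. */
static void sequence_done(struct client_slot *slot, const struct seq_res *res)
{
    assert(slot && res);
    if (res->status == SEQ_STATUS_UNSET) {
        slot->in_doubt = 1;      /* no reply decoded: leave seq_nr alone */
        return;
    }
    if (res->status != 0)
        slot->seq_nr--;          /* treated as a SEQUENCE error: reuse the id */
}

/* Interrupted call: decode never runs, so status keeps its initial value. */
static struct client_slot interrupted_call(int initial_status)
{
    struct client_slot slot = { 5, 0 };
    struct seq_res res = { initial_status };

    slot.seq_nr++;               /* id consumed by the outgoing call */
    sequence_done(&slot, &res);
    return slot;
}
```

With a garbage non-zero initial status the slot falls back to the old
sequence number, which is the behaviour seen in the trace. With the
sentinel the slot is only marked as in doubt; that alone does not tell the
client whether the server processed the call.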

> AFAICS, the server should then reply either with an NFS4_OK or
> NFS4ERR_SEQ_FALSE_RETRY. In either case, we should then be guaranteed
> that the sa_sequenceid has been bumped by exactly 1 irrespective of
> whether or not the server processed the synchronous RPC call.
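
To check my reading of the proposal, here is a rough model of the server
side of a slot. Everything in it is an assumption for illustration: the
enum values are placeholders rather than protocol constants, and struct
server_slot, server_sequence() and probe_after_interrupt() are hypothetical
names. It assumes the server compares a repeated sequence id against the
compound it cached for that id.

```c
#include <assert.h>

/* Illustrative status codes; the values are placeholders. */
enum seq_status { SEQ_OK, SEQ_FALSE_RETRY, SEQ_MISORDERED };

enum { COMPOUND_GETATTR = 1, COMPOUND_SEQUENCE_ONLY = 2 };

/* Simplified server-side slot: last sequence id executed and a
 * hypothetical fingerprint of the compound that used it. */
struct server_slot {
    unsigned int last_seqid;
    int last_compound;
};

static enum seq_status server_sequence(struct server_slot *s,
                                       unsigned int seqid, int compound)
{
    assert(s);
    if (seqid == s->last_seqid + 1) {   /* new request: execute and cache */
        s->last_seqid = seqid;
        s->last_compound = compound;
        return SEQ_OK;
    }
    if (seqid == s->last_seqid)         /* id already used on this slot */
        return compound == s->last_compound ? SEQ_OK : SEQ_FALSE_RETRY;
    return SEQ_MISORDERED;
}

/* Send a SEQUENCE-only probe reusing the interrupted call's id and
 * return the server's slot sequence id afterwards. */
static unsigned int probe_after_interrupt(int server_saw_original,
                                          enum seq_status *st)
{
    struct server_slot s = { 4, 0 };
    unsigned int seqid = 5;             /* id used by the interrupted call */

    if (server_saw_original)
        server_sequence(&s, seqid, COMPOUND_GETATTR);
    *st = server_sequence(&s, seqid, COMPOUND_SEQUENCE_ONLY);
    return s.last_seqid;
}
```

In this model the probe gets SEQ_OK if the server never saw the original
call and SEQ_FALSE_RETRY if it did, and the slot ends at the same sequence
id in both cases.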

> Cheers
>    Trond