[pnfs] [PATCH 9/9] pnfs client prevent race in sequence slot

Rahul Iyer iyer at netapp.com
Thu Sep 27 17:29:26 EDT 2007


Benny Halevy wrote:
> William A. (Andy) Adamson wrote:
>   
>> On 9/27/07, Trond Myklebust <trond.myklebust at fys.uio.no> wrote:
>>
>>     On Thu, 2007-09-27 at 15:12 -0400, Trond Myklebust wrote:
>>     > On Thu, 2007-09-27 at 15:00 -0400, William A. (Andy) Adamson wrote:
>>     > >
>>     > >
>>     > > On 9/26/07, Trond Myklebust <trond.myklebust at fys.uio.no> wrote:
>>     > >         If we have to interrupt an RPC call, then we immediately
>>     > >         fire off an asynchronous RPC call with a single SEQUENCE
>>     > >         call that uses the _same_ sa_sequence id as the
>>     > >         synchronous RPC call that was cancelled (and drops all
>>     > >         the other arguments).
>>     > >
>>     > > ok. but can't we get around sending yet another rpc by looking
>>     > > at what the server has already sent (or not)?  we do have all
>>     > > the information.
>>     >
>>     > I don't understand. If you have a reply, then you're not
>>     > interrupting an RPC call, however the fact that you don't have a
>>     > reply means nothing: assuming that an RPC call was actually sent
>>     > by the client then the server may or may not have received it (we
>>     > just don't know until we have a reply).
>>
>>     Oh... I think I see your point. Are you perhaps suggesting that we
>>     just look for the NFS4ERR_SEQ_MISORDERED reply, and use that to
>>     indicate that the server didn't see our previous RPC call on that
>>     slot? That might work too.
>>
>>
>> yes, just remember in the slot that the RPC with seq# was interrupted 
>> and reset the slot sequence number iff MISORDERED err on seq#+1.
>>     
> The problem with doing this is that you might be racing with the
> original call in case it was delayed in the network for some obscure
> reason and you could get MISORDERED or DELAY on the retry.  In the DELAY
> case you'll need to poll the server until it finishes processing the
> previous request, then you should get MISORDERED again and you can try
> seq#+1 yet again.
>
>   
Wouldn't it be far simpler to tear down the session and re-create it? 
That's what the new 2.6.23 sessions implementation does when it receives 
anything that leads it to believe that it may be out of sync with the 
server.
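
For illustration only, here is a toy Python model of the two recovery
options discussed in this thread: resetting the slot sequence number when
seq#+1 comes back MISORDERED, versus tearing the session down. All names
(ToyServerSlot, ClientSlot, next_call_with_reset, next_call_with_teardown)
are hypothetical and this is not the 2.6.23 client code. It assumes the
usual slot rule (the server accepts last+1, replays last, and rejects
anything else), and it does not model the delayed-original-call race or
the DELAY reply mentioned above.

```python
OK = "NFS4_OK"
MISORDERED = "NFS4ERR_SEQ_MISORDERED"


class ToyServerSlot:
    """Server view of one slot: accept last+1, replay last, reject the rest."""

    def __init__(self):
        self.last_seq = 0

    def sequence(self, seq):
        if seq == self.last_seq + 1:
            self.last_seq = seq      # new request, executed
            return OK
        if seq == self.last_seq:
            return OK                # retransmission, answered from the reply cache
        return MISORDERED


class ClientSlot:
    """Client view of the same slot (hypothetical fields)."""

    def __init__(self):
        self.last_sent = 0           # highest sequence number sent on this slot
        self.interrupted = False     # True if the last call was cancelled before a reply


def interrupted_call(slot, server, server_saw_it):
    """Model a cancelled RPC: the client cannot tell whether it reached the server."""
    slot.last_sent += 1
    slot.interrupted = True
    if server_saw_it:
        server.sequence(slot.last_sent)


def next_call_with_reset(slot, server):
    """Try seq#+1; on MISORDERED after an interrupt, fall back to the interrupted seq#."""
    seq = slot.last_sent + 1
    status = server.sequence(seq)
    if status == MISORDERED and slot.interrupted:
        seq = slot.last_sent         # server never saw the interrupted call
        status = server.sequence(seq)
    if status == OK:
        slot.last_sent = seq
        slot.interrupted = False
    return status, seq


def next_call_with_teardown(slot, server):
    """Try seq#+1; on MISORDERED, discard the session and start over."""
    seq = slot.last_sent + 1
    if server.sequence(seq) == MISORDERED:
        # Fresh objects stand in for destroying and re-creating the session.
        server, slot = ToyServerSlot(), ClientSlot()
        seq = 1
        server.sequence(seq)
    slot.last_sent = seq
    slot.interrupted = False
    return server, slot, seq
```

The reset variant needs the extra per-slot "interrupted" flag and a second
round trip; the teardown variant needs no per-slot bookkeeping but throws
away every slot in the session, not just the one that went out of sync.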
-Rahul

