[pnfs] [nfs41] 10 patches for async session reset

Benny Halevy bhalevy at panasas.com
Thu Jun 12 11:47:29 EDT 2008


On Jun. 12, 2008, 17:01 +0300, "William A. (Andy) Adamson" <androsadamson at gmail.com> wrote:
> On Thu, Jun 12, 2008 at 7:40 AM, Benny Halevy <bhalevy at panasas.com> wrote:
> 
>> Andy, sorry, but after reading the patchset I think the method
>> you implemented could be simplified if the destroy session path
>> didn't spawn a kthread for session_destroyer().
>>
>> nfs41_destroy_session could just mark the session for destroy
>> and as expired and that will function as a barrier.
>> That should be the only thing to do on the various cleanup paths.
>> (including nfs41_sequence_call_done which starts session recovery
>> today)
>>
>> Recovery should be done only on the setup_sequence path, and it should
>> not start until the slot table is drained; until then it should sleep
>> on the slot_tbl_waitq.  When all slots are done, recovery can start.
>> For the reset case it should just start by sending a destroy_session
>> before recovering it.
>>
>> Can this work?
> 
> 
> I see what you mean, the code could be simplified. Don't free the session on
> reset, just re-initialize it and therefore use the slot_tbl_waitq instead of
> the destroy_waitq. Combine the destroy thread with the recover thread which
> will now have three tasks:
> 
> 1) NFS41_SESSION_EXPIRED (set in the exchange_id path): just call
>    create_session.
> 2) NFS41_SESSION_RESET (a new state set by the error handlers): call
>    destroy session, wait for completion, then call create session.
> 3) NFS41_SESSION_DESTROY (set in the kill_super path): just call
>    destroy session.

Exactly.
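
To make the three cases concrete, here is a minimal user-space sketch in C.
It is not the kernel code: the struct, the counters, and the helper names
(session_recover, create_session, destroy_session, busy_slots) are
hypothetical stand-ins, and only the three state names come from this
thread. It assumes recovery refuses to run until the slot table has
drained, and that the caller would sleep on slot_tbl_waitq in that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state flags, modeled on the names used in this thread. */
enum session_state {
    SESSION_OK,
    SESSION_EXPIRED,  /* set in the exchange_id path */
    SESSION_RESET,    /* set by the error handlers */
    SESSION_DESTROY   /* set in the kill_super path */
};

/* Hypothetical, simplified session; the real structure is not shown here. */
struct session {
    enum session_state state;
    unsigned int busy_slots;  /* outstanding RPCs still holding a slot */
    unsigned int creates;     /* how many times create_session ran */
    unsigned int destroys;    /* how many times destroy_session ran */
    bool valid;               /* true while a server-side session exists */
};

/* Stand-ins for the CREATE_SESSION / DESTROY_SESSION round trips. */
static void create_session(struct session *s)
{
    s->creates++;
    s->valid = true;
}

static void destroy_session(struct session *s)
{
    s->destroys++;
    s->valid = false;
}

/*
 * Called from the setup_sequence path. Returns false when the slot table
 * is not yet drained (the caller would sleep on slot_tbl_waitq and retry);
 * returns true once there is nothing left to do for the current state.
 */
static bool session_recover(struct session *s)
{
    assert(s != NULL);

    if (s->state == SESSION_OK)
        return true;            /* nothing to recover */
    if (s->busy_slots > 0)
        return false;           /* barrier: wait for slots to drain */

    switch (s->state) {
    case SESSION_EXPIRED:       /* case 1: create only */
        create_session(s);
        break;
    case SESSION_RESET:         /* case 2: destroy, then create */
        destroy_session(s);
        create_session(s);
        break;
    case SESSION_DESTROY:       /* case 3: destroy only, stay destroyed */
        destroy_session(s);
        return true;
    default:
        break;
    }
    s->state = SESSION_OK;
    return true;
}
```

The point of the sketch is that a single entry point can cover expiry,
reset, and teardown, with the drained slot table as the only barrier.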

> 
> It should work. I'll try it. Thanks for the review. I'll pay more attention
> to indentation...

perfect ;-)

Benny

> 
> -->Andy
> 
>>
>> Benny
>>
>> On Jun. 11, 2008, 21:18 +0300, "Adamson, Andy" <William.Adamson at netapp.com>
>> wrote:
>>> The first patch in this series fixes a bug in create session. All of the
>>> rest implement asynchronous session reset for the case where a session
>>> gets an op_sequence operation error. This code places any new rpc_task
>>> that would look for a slot on a new queue, moves any rpc_tasks waiting
>>> for a slot on the forward channel to the same queue, waits for the
>>> current outstanding rpc's to complete (i.e. for the slots to clear),
>>> destroys the session, initializes a new session, and wakes up all the
>>> tasks on the new queue. The session recovery code is then called by the
>>> next rpc_task, create session is then called, and voila! The session is
>>> reset.
>>>
>>> I wrote a simple add-on to the pynfs server that returned
>>> NFS4ERR_MISORDERED every 50th OP_SEQUENCE call. This code passes basic and
>>> general connectathon tests against this server. I'll work on adding the
>>> pynfs changes to the python distro.
>>> -->Andy