[pnfs] [PATCH 4/12] use kref for nfsd nfs4_client

Benny Halevy bhalevy at panasas.com
Thu May 10 01:24:44 EDT 2007


William A. (Andy) Adamson wrote:
> For layout recalls (that are accessed not under the state_lock)
> 
> 
> when would any layout recalls be accessed outside of the state_lock?
> delegation recalls are accessed outside of the state lock only when a
> non-nfsd local process performs an action on a file with a delegation that
> causes break_lease to be called - e.g. the integration of delegations with
> file leases.

Currently layout recall is synchronous and there might be many
of them so I'm calling it outside the state_lock.  I can make the
call asynchronous and initiate the rpc(s) under the state_lock
and then release it.  This is nice since I see no reason to even
grab the lock on the async response as the layoutrecall just needs
to be put (unreferenced) and possibly freed, without touching the
nfs4_client.cl_layoutrecalls list.

> On 5/9/07, Benny Halevy <bhalevy at panasas.com> wrote:
>> J. Bruce Fields wrote:
>>> On Tue, May 08, 2007 at 10:26:12PM +0300, Benny Halevy wrote:
>>>
>>>> So the first step can be justified in mainline as in
>> nfsd4_probe_callback's
>>>> out_rpciod error case atomic_dec(&clp->cl_count) won't free up the
>> client
>>>> structure if it was expired concurrently.  expire_client shuts down the
>>>> callback rpc path so that may wait for nfsd4_probe_callback to finish
>>>> but I'm not sure.  At any rate, calling put_nfs4_client() there is
>>>> more robust.
>>>>
>>> Yes, the destruction of the rpc_client shouldn't succeed until any
>>> outstanding tasks are done.  But I agree that the reference counting
>>> there is at best odd.
>>>
>>> Actually, the caller (setclientid_confirm) also works with this client
>>> without a reference; if the client could be expired at any time, then
>>> that would already be a problem for it.  So I suppose it's just
>>> depending on the state lock to prevent the client expiration.  We're
>>> taking the reference in nfsd4_probe_callback() because we know it may be
>>> held after the state lock is dropped.
>>>
>> I think that the synchronization between expire_client and nfsd4_cb_recall
>> is broken if the callback is done not under the state_lock, even if the
>> caller of the callback has a reference on the client since the rpc_clnt in
>> clp->cl_callback.cb_client can go away before the callback rpc starts.
> 
> 
> good point. bump both the rpc_clnt->cl_count as well as the
> nfs4_client->cl_count in nfsd_break_deleg_cb() prior to the kthread_run and
> then call put_nfs4_client() and rpc_destroy_client() at the end of
> nfs4_cb_recall().
> 
> I'm not sure yet what would be the best way to solve that.  I think
>> that any state that can be accessed not under the state lock must be
>> cleaned up only in free_nfs4_client which is called only when the
>> structure's reference count goes to zero, and any access outside the
>> state_lock must take a reference count on the client.
> 
> 
> For layout recalls (that are accessed not under the state_lock)
> 
> 
> when would any layout recalls be accessed outside of the state_lock?
> delegation recalls are accessed outside of the state lock only when a
> non-nfsd local process performs an action on a file with a delegation that
> causes break_lease to be called - e.g. the integration of delegations with
> file leases.
> 
> we need a refcount for each nfs4_layoutrecall, before the callback
>> rpc starts, while under the state_lock the refcount is 2.
>> One reference is dropped after the rpc completes, the
>> other drops on layoutercall_done().  The latter happens
>> either on rpc error, on a final layoutreturn, or on
>> expire_client.  The way I currently implemented it
>> layoutrecall_done is always being called under the state_lock
>> (the invariant is that all lists are manipulated under the state_lock)
>> and that it is idempotent so if expire_client already went
>> through layoutrecall_done and the callback path calls it again,
>> the second call is just a no-op.
>>
>> Benny
>>
>>> I think that's the real reason the big state lock makes me
>>> uncomfortable--there's so many places like this where we aren't doing
>>> the kind of reference counting and locking that I'm used to elsewhere,
>>> because we're counting on mutual exclusion between these huge
>>> complicated bits of code.
>>>
>>> --b.
>>>
>> _______________________________________________
>> pNFS mailing list
>> pNFS at linux-nfs.org
>> http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
>>
>>
> 


-- 
Benny Halevy
Panasas Inc.
Accelerating Time to Results(TM) with Clustered Storage

www.panasas.com
bhalevy at panasas.com
Tel/Fax: +972-3-647-8340
Mobile: +972-54-802-8340


More information about the pNFS mailing list