[pnfs] [PATCH 4/12] use kref for nfsd nfs4_client

Benny Halevy bhalevy at panasas.com
Wed May 9 14:19:19 EDT 2007


J. Bruce Fields wrote:
> On Tue, May 08, 2007 at 10:26:12PM +0300, Benny Halevy wrote:
>   
>> So the first step can be justified in mainline as in nfsd4_probe_callback's
>> out_rpciod error case atomic_dec(&clp->cl_count) won't free up the client
>> structure if it was expired concurrently.  expire_client shuts down the 
>> callback rpc path so that may wait for nfsd4_probe_callback to finish
>> but I'm not sure.  At any rate, calling put_nfs4_client() there is
>> more robust.
>>     
>
> Yes, the destruction of the rpc_client shouldn't succeed until any
> outstanding tasks are done.  But I agree that the reference counting
> there is at best odd.
>
> Actually, the caller (setclientid_confirm) also works with this client
> without a reference; if the client could be expired at any time, then
> that would already be a problem for it.  So I suppose it's just
> depending on the state lock to prevent the client expiration.  We're
> taking the reference in nfsd4_probe_callback() because we know it may be
> held after the state lock is dropped.
>   
I think that the synchronization between expire_client and nfsd4_cb_recall
is broken if the callback is done not under the state_lock, even if the
caller of the callback has a reference on the client since the rpc_clnt in
clp->cl_callback.cb_client can go away before the callback rpc starts.
I'm not sure yet what would be the best way to solve that.  I think
that any state that can be accessed not under the state lock must be
cleaned up only in free_nfs4_client which is called only when the
structure's reference count goes to zero, and any access outside the
state_lock must take a reference count on the client.

For layout recalls (that are accessed not under the state_lock)
we need a refcount for each nfs4_layoutrecall, before the callback
rpc starts, while under the state_lock the refcount is 2.
One reference is dropped after the rpc completes, the
other drops on layoutercall_done().  The latter happens
either on rpc error, on a final layoutreturn, or on
expire_client.  The way I currently implemented it
layoutrecall_done is always being called under the state_lock
(the invariant is that all lists are manipulated under the state_lock)
and that it is idempotent so if expire_client already went
through layoutrecall_done and the callback path calls it again,
the second call is just a no-op.

Benny

> I think that's the real reason the big state lock makes me
> uncomfortable--there's so many places like this where we aren't doing
> the kind of reference counting and locking that I'm used to elsewhere,
> because we're counting on mutual exclusion between these huge
> complicated bits of code.
>
> --b.
>   



More information about the pNFS mailing list