[pnfs] [PATCH 1/1] nfsd41: Fix callback probe reference countbug

J. Bruce Fields bfields at fieldses.org
Tue Jan 15 14:24:02 EST 2008


On Tue, Jan 15, 2008 at 11:12:46AM -0800, Labiaga, Ricardo wrote:
> Let me explain a bit more...   fs/nfsd/nfs4callback.c:nfs4_cb_null() 
> invokes put_nfs4_client() but atomic_inc() was never called prior to 
> the call to rpc_call_async().  This results in an incorrect attempt 
> to free the nfs4_client structure. 

OK; so sounds like a totally different problem from what we've got in
upstream; apologies for the confusion.

--b.

> 
> - ricardo
> 
> 
> > -----Original Message-----
> > From: Labiaga, Ricardo 
> > Sent: Tuesday, January 15, 2008 10:42 AM
> > To: J. Bruce Fields
> > Cc: pnfs at linux-nfs.org
> > Subject: Re: [pnfs] [PATCH 1/1] nfsd41: Fix callback probe 
> > reference countbug
> > 
> > Hi Bruce,
> > 
> > This particular problem was introduced in Benny's NFSv4.1 
> > tree only.  I
> > can reproduce it with a simple CREATE_SESSION with the
> > SESSION4_BACK_CHAN bit set.  It's not really a race, it happens
> > everytime.  We haven't seen it because the 2.6.24 client currently
> > doesn't ask for a backchannel yet.
> > 
> > - ricardo
> > 
> > 
> > > -----Original Message-----
> > > From: J. Bruce Fields [mailto:bfields at fieldses.org] 
> > > Sent: Tuesday, January 15, 2008 8:57 AM
> > > To: Labiaga, Ricardo
> > > Cc: pnfs at linux-nfs.org
> > > Subject: Re: [pnfs] [PATCH 1/1] nfsd41: Fix callback probe 
> > > reference countbug
> > > 
> > > On Mon, Jan 14, 2008 at 04:36:05PM -0800, Ricardo Labiaga wrote:
> > > > The server hits a BUG_ON() condition trying to free the 
> > nfs4_client
> > > > structure during the callback probe.  It is releasing a reference
> > > > to the structure with put_nfs4_client() without having first
> > > > acquired a reference to it.
> > > > 
> > > > Fix by holding a reference in nfsd4*_probe_callback() before
> > > > calling rpc_call_async().  In the vanilla NFSv4 callback probe,
> > > > a reference is indeed acquired and a new kthread executes a
> > > > synchronous RPC call.  This is equivalent to what we now do by
> > > > holding a reference and scheduling an async RPC.  The reference
> > > > is released after the async thread is done processing.
> > > 
> > > I believe these sorts of races should be gone in the 
> > nfs-server-stable
> > > branch
> > > 
> > > 	git://linux-nfs.org/~bfields/linux.git
> > > 
> > > If you have a good test case and got a chance to test with 
> > > that kernel,
> > > that could be helpful....
> > > 
> > > --b.
> > > 
> > > > 
> > > > Signed-off-by: Ricardo Labiaga <ricardo.labiaga at netapp.com>
> > > > ---
> > > >  fs/nfsd/nfs4callback.c |   14 ++++++++++++++
> > > >  1 files changed, 14 insertions(+), 0 deletions(-)
> > > > 
> > > > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > > > index 7fb7786..22c4ca0 100644
> > > > --- a/fs/nfsd/nfs4callback.c
> > > > +++ b/fs/nfsd/nfs4callback.c
> > > > @@ -563,6 +563,12 @@ nfsd4_probe_callback(struct nfs4_client *clp)
> > > >  		goto out_err;
> > > >  	}
> > > >  
> > > > +	/*
> > > > +	 * The task holds a reference to the nfs4_client struct.
> > > > +	 * It is released in the callback function upon success.
> > > > +	 */
> > > > +	atomic_inc(&clp->cl_count);
> > > > +
> > > >  	status = rpc_call_async(cb->cb_client, &msg, RPC_TASK_ASYNC,
> > > >  				&nfs4_cb_null_ops, NULL);
> > > >  
> > > > @@ -573,6 +579,7 @@ nfsd4_probe_callback(struct nfs4_client *clp)
> > > >  	return;
> > > >  
> > > >  out_release_clp:
> > > > +	atomic_dec(&clp->cl_count);
> > > >  	rpc_shutdown_client(cb->cb_client);
> > > >  out_err:
> > > >  	cb->cb_client = NULL;
> > > > @@ -645,6 +652,12 @@ nfsd41_probe_callback(struct 
> > nfs4_client *clp)
> > > >  	dprintk("NFSD: %s: clp %p cb_client %p\n", __FUNCTION__,
> > > >  		clp, clp->cl_callback.cb_client);
> > > >  
> > > > +	/*
> > > > +	 * The task holds a reference to the nfs4_client struct.
> > > > +	 * It is released in the callback function upon success.
> > > > +	 */
> > > > +	atomic_inc(&clp->cl_count);
> > > > +
> > > >  	status = rpc_call_async(cb->cb_client, &msg, RPC_TASK_ASYNC,
> > > >  				&nfs4_cb_null_ops, NULL);
> > > >  
> > > > @@ -655,6 +668,7 @@ nfsd41_probe_callback(struct nfs4_client *clp)
> > > >  	return;
> > > >  
> > > >  out_release_clp:
> > > > +	atomic_dec(&clp->cl_count);
> > > >  	rpc_shutdown_client(cb->cb_client);
> > > >  out_err:
> > > >  	cb->cb_client = NULL;
> > > > -- 
> > > > 1.5.3.3
> > > > _______________________________________________
> > > > pNFS mailing list
> > > > pNFS at linux-nfs.org
> > > > http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
> > > 
> > _______________________________________________
> > pNFS mailing list
> > pNFS at linux-nfs.org
> > http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
> > 


More information about the pNFS mailing list