[pnfs] [PATCH 1/1] nfsd41: Fix callback probe reference countbug
J. Bruce Fields
bfields at fieldses.org
Tue Jan 15 14:24:02 EST 2008
On Tue, Jan 15, 2008 at 11:12:46AM -0800, Labiaga, Ricardo wrote:
> Let me explain a bit more... fs/nfsd/nfs4callback.c:nfs4_cb_null()
> invokes put_nfs4_client() but atomic_inc() was never called prior to
> the call to rpc_call_async(). This results in an incorrect attempt
> to free the nfs4_client structure.
OK; so sounds like a totally different problem from what we've got in
upstream; apologies for the confusion.
--b.
>
> - ricardo
>
>
> > -----Original Message-----
> > From: Labiaga, Ricardo
> > Sent: Tuesday, January 15, 2008 10:42 AM
> > To: J. Bruce Fields
> > Cc: pnfs at linux-nfs.org
> > Subject: Re: [pnfs] [PATCH 1/1] nfsd41: Fix callback probe
> > reference countbug
> >
> > Hi Bruce,
> >
> > This particular problem was introduced in Benny's NFSv4.1
> > tree only. I
> > can reproduce it with a simple CREATE_SESSION with the
> > SESSION4_BACK_CHAN bit set. It's not really a race, it happens
> > everytime. We haven't seen it because the 2.6.24 client currently
> > doesn't ask for a backchannel yet.
> >
> > - ricardo
> >
> >
> > > -----Original Message-----
> > > From: J. Bruce Fields [mailto:bfields at fieldses.org]
> > > Sent: Tuesday, January 15, 2008 8:57 AM
> > > To: Labiaga, Ricardo
> > > Cc: pnfs at linux-nfs.org
> > > Subject: Re: [pnfs] [PATCH 1/1] nfsd41: Fix callback probe
> > > reference countbug
> > >
> > > On Mon, Jan 14, 2008 at 04:36:05PM -0800, Ricardo Labiaga wrote:
> > > > The server hits a BUG_ON() condition trying to free the
> > nfs4_client
> > > > structure during the callback probe. It is releasing a reference
> > > > to the structure with put_nfs4_client() without having first
> > > > acquired a reference to it.
> > > >
> > > > Fix by holding a reference in nfsd4*_probe_callback() before
> > > > calling rpc_call_async(). In the vanilla NFSv4 callback probe,
> > > > a reference is indeed acquired and a new kthread executes a
> > > > synchronous RPC call. This is equivalent to what we now do by
> > > > holding a reference and scheduling an async RPC. The reference
> > > > is released after the async thread is done processing.
> > >
> > > I believe these sorts of races should be gone in the
> > nfs-server-stable
> > > branch
> > >
> > > git://linux-nfs.org/~bfields/linux.git
> > >
> > > If you have a good test case and got a chance to test with
> > > that kernel,
> > > that could be helpful....
> > >
> > > --b.
> > >
> > > >
> > > > Signed-off-by: Ricardo Labiaga <ricardo.labiaga at netapp.com>
> > > > ---
> > > > fs/nfsd/nfs4callback.c | 14 ++++++++++++++
> > > > 1 files changed, 14 insertions(+), 0 deletions(-)
> > > >
> > > > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > > > index 7fb7786..22c4ca0 100644
> > > > --- a/fs/nfsd/nfs4callback.c
> > > > +++ b/fs/nfsd/nfs4callback.c
> > > > @@ -563,6 +563,12 @@ nfsd4_probe_callback(struct nfs4_client *clp)
> > > > goto out_err;
> > > > }
> > > >
> > > > + /*
> > > > + * The task holds a reference to the nfs4_client struct.
> > > > + * It is released in the callback function upon success.
> > > > + */
> > > > + atomic_inc(&clp->cl_count);
> > > > +
> > > > status = rpc_call_async(cb->cb_client, &msg, RPC_TASK_ASYNC,
> > > > &nfs4_cb_null_ops, NULL);
> > > >
> > > > @@ -573,6 +579,7 @@ nfsd4_probe_callback(struct nfs4_client *clp)
> > > > return;
> > > >
> > > > out_release_clp:
> > > > + atomic_dec(&clp->cl_count);
> > > > rpc_shutdown_client(cb->cb_client);
> > > > out_err:
> > > > cb->cb_client = NULL;
> > > > @@ -645,6 +652,12 @@ nfsd41_probe_callback(struct
> > nfs4_client *clp)
> > > > dprintk("NFSD: %s: clp %p cb_client %p\n", __FUNCTION__,
> > > > clp, clp->cl_callback.cb_client);
> > > >
> > > > + /*
> > > > + * The task holds a reference to the nfs4_client struct.
> > > > + * It is released in the callback function upon success.
> > > > + */
> > > > + atomic_inc(&clp->cl_count);
> > > > +
> > > > status = rpc_call_async(cb->cb_client, &msg, RPC_TASK_ASYNC,
> > > > &nfs4_cb_null_ops, NULL);
> > > >
> > > > @@ -655,6 +668,7 @@ nfsd41_probe_callback(struct nfs4_client *clp)
> > > > return;
> > > >
> > > > out_release_clp:
> > > > + atomic_dec(&clp->cl_count);
> > > > rpc_shutdown_client(cb->cb_client);
> > > > out_err:
> > > > cb->cb_client = NULL;
> > > > --
> > > > 1.5.3.3
> > > > _______________________________________________
> > > > pNFS mailing list
> > > > pNFS at linux-nfs.org
> > > > http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
> > >
> > _______________________________________________
> > pNFS mailing list
> > pNFS at linux-nfs.org
> > http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
> >
More information about the pNFS
mailing list