[pnfs] [PATCH 9/9] pnfs client prevent race in sequence slot

Iyer, Rahul Rahul.Iyer at netapp.com
Wed Sep 26 19:02:29 EDT 2007


 

> -----Original Message-----
> From: Benny Halevy [mailto:bhalevy at panasas.com] 
> Sent: Wednesday, September 26, 2007 3:26 PM
> To: William A. (Andy) Adamson
> Cc: pnfs at linux-nfs.org; andros at umich.edu
> Subject: Re: [pnfs] [PATCH 9/9] pnfs client prevent race in 
> sequence slot
> 
> William A. (Andy) Adamson wrote:
> > Here is what is happening on the failure of the 
> Connectathon lock test #12.
> > 
> > The client produces a SEQUENCE:PUTFH:GETATTR (nfs4_proc_getattr) 
> > compond followed by a SEQUENCE:PUTFH:LOCKU (nfs4_proc_unlck).
> > 
> > Wireshark shows the nfs4_proc_getattr as succeeding, and the 
> > nfs4_proc_unlck as failing with BAD_SESSION.
> > 
> > The nfs4_proc_getattr rpc task catches a signal (from the test) and 
> > returns -ERESTARTSYS, having not called decode, which 
> results in the 
> > nfs4_proc_getattr local variable nfs41_sequence_res.status 
> being left 
> > unset - so whatever garbage is in the un-initalized 
> > nfs41_sequence_res.status is what is passed to nfs41_sequence_done. 
> > This happens to be non-zero, and is interpreted by 
> > nfs41_proc_sequence_done() as an error, which means that 
> the sequence 
> > number for the slot is decremented, and the next rpc 
> (nfs4_proc_unlk) 
> > will send the same sequence number and get an error 
> (BAD_SESSION on our server).
> > 
> > The client does not know if the rpc succeeded, because it never 
> > decoded the reply. But, in order to process the SEQUENCE 
> operation on 
> > the nfs4_proc_getattr and not get out of sync with the server, the 
> > client needs to know the status of the SEQUENCE operation 
> sent by the server.
> > 
> > Suggestions?
> 
> I'm not sure how hard it is to implement but I think you want 
> the application thread to get the signal and return but have 
> an internal thread take over the rpc book-keeping to completion.
>
There are two things wrong with the sequence_done code:
1. It doesn't check for the error code before performing the decrement.
So, in this case, something like a lookup failure could also result in
the sequence number not getting incremented and thus resulting in a
terrible mess. So, ideally, we should switch on the error code. 

2. It needs to catch the signal events and set the status to something
like EINTR.

Regards
Rahul
 
> > 
> > -->Andy
> > 
> > 
> > On 9/26/07, *Trond Myklebust* <trond.myklebust at fys.uio.no 
> > <mailto:trond.myklebust at fys.uio.no>> wrote:
> > 
> >     On Wed, 2007-09-26 at 08:45 -0400, andros at umich.edu
> >     <mailto:andros at umich.edu> wrote:
> >     > From: Andy Adamson <andros at umich.edu 
> <mailto:andros at umich.edu>>
> >     >
> >     > Move the lock to cover reading the sequence list in 
> rpc_sequence
> >     > which allowed multiple rpc's to be in flight with the 
> same slot
> >     > sequence id.
> > 
> >     Sorry. This patch is just wrong. There is no reason to 
> take the spinlock
> >     when just testing for equality here.
> > 
> >     Trond
> > 
> >     > Signed-off by: Andy Adamson< andros at umich.edu
> >     <mailto:andros at umich.edu>>
> >     > ---
> >     >  fs/nfs/nfs4state.c |    4 ++--
> >     >  1 files changed, 2 insertions(+), 2 deletions(-)
> >     >
> >     > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> >     > index 4963454..cbc08af 100644
> >     > --- a/fs/nfs/nfs4state.c
> >     > +++ b/fs/nfs/nfs4state.c
> >     > @@ -773,15 +773,15 @@ int nfs_wait_on_sequence(struct 
> nfs_seqid
> >     *seqid, struct rpc_task *task)
> >     >       struct rpc_sequence *sequence = 
> seqid->sequence->sequence;
> >     >       int status = 0;
> >     >
> >     > +     spin_lock(&sequence->lock);
> >     >       if (sequence->list.next == &seqid->list)
> >     >               goto out;
> >     > -     spin_lock(&sequence->lock);
> >     >       if (sequence->list.next != &seqid->list) {
> >     >               rpc_sleep_on(&sequence->wait, task, NULL, NULL);
> >     >               status = -EAGAIN;
> >     >       }
> >     > -     spin_unlock(&sequence->lock);
> >     >  out:
> >     > +     spin_unlock(&sequence->lock);
> >     >       return status;
> >     >  }
> >     >
> > 
> >     _______________________________________________
> >     pNFS mailing list
> >     pNFS at linux-nfs.org <mailto:pNFS at linux-nfs.org>
> >     http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
> > 
> > 
> > 
> > 
> ----------------------------------------------------------------------
> > --
> > 
> > _______________________________________________
> > pNFS mailing list
> > pNFS at linux-nfs.org
> > http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
> 
> 
> --
> Benny Halevy
> Software Architect
> Tel/Fax: +972-3-647-8340
> Mobile: +972-54-802-8340
> US:      +1-412-203-3187
> bhalevy at panasas.com
>  
> Panasas, Inc.
> Leader in Parallel Storage
> www.panasas.com
> _______________________________________________
> pNFS mailing list
> pNFS at linux-nfs.org
> http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
> 


More information about the pNFS mailing list