[pnfs] [nfsv4] layout stateid handling/processing updates

Benny Halevy bhalevy at panasas.com
Mon Mar 24 08:33:09 EDT 2008


On Mar. 24, 2008, 14:23 +0200, "Noveck, Dave" <Dave.Noveck at netapp.com> wrote:
> STALE_STATEID is not appropriate and was obsoleted in v4.1 for good
> reason.  In v4.1, stateids are unique within the context of a specific
> client (determined by the current sessionid).  The point is that if you
> have something from an earlier server instance, you will find that out
> when the sessionid is presented.  If SEQUENCE is ok, no stateids can be
> STALE, although they can be BAD, OLD, ADMIN_REVOKED etc.

Got it.  Thanks.

> 
> With regard to OLD and BAD depending on seqid, the current plan within
> the working group as I understand it is not to check for these for
> layout stateids.  Spencer is putting together the text but if you have
> an objection, it needs to be raised with the working group.

I've just sent a response to Spencer's latest email about layout
stateids with a proposal to add a couple paragraphs on this.

Benny

> 
> -----Original Message-----
> From: Benny Halevy [mailto:bhalevy at panasas.com] 
> Sent: Monday, March 24, 2008 7:57 AM
> To: William A. (Andy) Adamson; Noveck, Dave
> Cc: pnfs at linux-nfs.org
> Subject: Re: [pnfs] [nfsv4] layout stateid handling/processing updates
> 
> Here are the current definitions of the relevant error codes:
> 
> 15.1.5.2.  NFS4ERR_BAD_STATEID (Error Code 10026)
> 
>    A stateid does not properly designate any valid state.  See
>    Section 8.2.4 and Section 8.2.3 for a discussion of how stateids are
>    validated.
> 
> 15.1.5.5.  NFS4ERR_OLD_STATEID (Error Code 10024)
> 
>    A stateid with a non-zero seqid value does not match the current seqid
>    for the state designated by the user.
> 
> 15.1.16.5.  NFS4ERR_STALE_STATEID (Error Code 10023)
> 
>    A stateid generated by an earlier server instance was used.
> 
> I think that NFS4ERR_STALE_STATEID and NFS4ERR_BAD_STATEID are
> appropriate if the client uses the wrong opaque part of the layout
> stateid.
> 
> NFS4ERR_OLD_STATEID makes sense to return for layout ops that were sent
> before a CB_LAYOUTRECALL and are processed after it.  Wraparound
> processing is important here to determine what is "before" or "after" a
> particular seqid.
> 
> To prevent a situation where LAYOUTRETURNs sent before a CB_LAYOUTRECALL
> are mistaken to be in response to the CB_LAYOUTRECALL the server
> shouldn't just accept any seqid.  During normal operation, without any
> layout recalls, the server must allow only seqids which are in the range
> of the current seqid (as maintained by the server) minus some threshold
> to the current seqid, all modulo |seqid| (2^32).  If the seqid is too
> old or too new the server can return either NFS4ERR_OLD_STATEID or
> NFS4ERR_BAD_STATEID, respectively.
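
To make the window check above concrete, here is a rough sketch of what a
server might do.  The function name, the string return values and the
"threshold" parameter are made up for illustration, and splitting the
out-of-window seqids into "too old" and "too new" at half the 32-bit
range is my own assumption, not something the draft specifies.

```python
SEQID_MOD = 1 << 32  # seqid is treated as a 32-bit unsigned counter


def classify_seqid(op_seqid, cur_seqid, threshold):
    """Classify the seqid of a layout operation (hypothetical helper).

    Accepts seqids in [cur_seqid - threshold, cur_seqid], all modulo 2^32.
    """
    # How far the operation's seqid lags the server's current seqid,
    # computed modulo 2^32 so that wraparound is handled.
    behind = (cur_seqid - op_seqid) % SEQID_MOD
    if behind <= threshold:
        return "NFS4_OK"
    # Assumption: outside the window, a seqid that is closer behind than
    # ahead counts as too old; anything else counts as too new.
    if behind < SEQID_MOD // 2:
        return "NFS4ERR_OLD_STATEID"
    return "NFS4ERR_BAD_STATEID"
```

With a current seqid of 1 and a threshold of 3, a seqid of 0xFFFFFFFF is
still inside the window because the subtraction is done modulo 2^32.
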
> 
> Benny
> 
> On Mar. 21, 2008, 19:42 +0200, "William A. (Andy) Adamson"
> <andros at citi.umich.edu> wrote:
>> On Fri, Mar 21, 2008 at 1:32 PM, Noveck, Dave <Dave.Noveck at netapp.com>
> wrote:
>>
>> 	> I think I missed a bad stateid case
>> 	>
>> 	>       - operation layout stateid sequence number is greater
> than the
>> 	server stored value.
>> 	
>> 	
>> 	Fun with wraparound.  I assume this only applies if the seqid has not
>> 	wrapped around.
>>
>>
>> good point.
>>
>>
>>
>>
>> 	I think you could enforce this, but it seems odd to just pick this one
>> 	constraint to enforce.  If the server does not use the value, and I
>> 	don't see where it does, why put this constraint on it, just so the
>> 	server can, in this one case, say "gotcha"?
>>
>>
>> The reason comes from Spencer noting that the client must send a
> layout stateid.seqid that was valid in the past, which in turn comes
> from section 12.5.5.2.1.3 which describes the server processing of the
> operation stateid.seqid WRT the CB_LAYOUTRECALL layout stateid.seqid.
> This is where the server uses the seqid value.
>> -->Andy
>>
>>
>>
>>
>> 	-----Original Message-----
>> 	From: William A. (Andy) Adamson [mailto:andros at citi.umich.edu]
>> 	
>> 	Sent: Friday, March 21, 2008 12:57 PM
>> 	To: Noveck, Dave
>> 	Cc: pnfs at linux-nfs.org
>> 	Subject: Re: [nfsv4] layout stateid handling/processing updates
>> 	
>> 	
>> 	
>> 	
>> 	
>> 	On Fri, Mar 21, 2008 at 12:44 PM, William A. (Andy) Adamson
>> 	<andros at citi.umich.edu> wrote:
>> 	
>> 	here is what I come up with:
>> 	
>> 	NFS4ERR_STALE_STATEID should apply to the layout stateid (?)
>> 	NFS4ERR_BAD_STATEID:
>> 	    - initial open/lock/delegation stateid is not valid
>> 	    - layout stateid sequence number is 0
>> 	    - opaque portion of layout stateid changed (layout stateid
> can't be
>> 	found....)
>> 	
>> 	I think I missed a bad stateid case
>> 	
>> 	       - operation layout stateid sequence number is greater
> than the
>> 	server stored value.
>> 	
>> 	
>> 	-->Andy
>> 	
>> 	
>> 	
>> 	
>> 	NFS4ERR_OLD_STATEID does not apply to layout stateid.
>> 	
>> 	
>> 	
>> 	On Fri, Mar 21, 2008 at 11:59 AM, Noveck, Dave
> <Dave.Noveck at netapp.com>
>> 	wrote:
>> 	
>> 	I don't get it.
>> 	
>> 	Are you saying that the processing rules that you sent out on 3/7 are
>> 	not enforced by the server and I can send any random seqid I choose
>> 	(except zero perhaps)?
>> 	
>> 	The client can't use any old seqid:
>> 	
>> 	
>> 	---- from Spencer ----
>> 	
>> 	 Once the client receives a layout stateid, it MUST use the
> correct
>> 	  "seqid" for subsequent LAYOUTGET or LAYOUTRETURN operations.
> The
>> 	  correct "seqid" is defined as the highest "seqid" value from
> fully
>> 	  processed responses to LAYOUTGET or LAYOUTRETURN operations or
> fully
>> 	  processed arguments of a CB_LAYOUTRECALL operation.
>> 	
>> 	---------
>> 	
>> 	But, the server can't check that the above MUST is followed by
> the
>> 	client.
>> 	
>> 	
>> 	
>> 	 But if any other value is OK, why isn't zero OK?
>> 	
>> 	I guess the server can use ordering info as well?
>> 	
>> 	
>> 	---- from Spencer ----
>> 	
>> 	  The fundamental
>> 	  requirement in client processing is that the "seqid" is used to
>> 	  provide the ordering of processing.
>> 	
>> 	---------
>> 	
>> 	-->Andy
>> 	
>> 	
>> 	
>> 	
>> 	If that is so, then you would need to say so explicitly and that
>> 	OLD_STATEID (or BAD_STATEID because of a seqid other than zero) is not
>> 	returned.
>> 	
>> 	The section you mention just lists cases in which using the current
>> 	stateid is permissible and doesn't seem to say anything about things
>> 	which are impermissible.
>> 	
>> 	
>> 	-----Original Message-----
>> 	From: Spencer Shepler [mailto:Spencer.Shepler at Sun.COM]
>> 	Sent: Friday, March 21, 2008 11:31 AM
>> 	To: NFSv4
>> 	Subject: Re: [nfsv4] layout stateid handling/processing updates
>> 	
>> 	
>> 	
>> 	On Mar 21, 2008, at 9:58 AM, Noveck, Dave wrote:
>> 	
>> 	> Caveat: I haven't got a chance to read Spencer's email on this
> subject
>> 	> in detail;
>> 	>
>> 	> My expectation though was that OLD_STATEID would follow the
> rules
>> 	> Spencer set out in his mail of 3/7:
>> 	>
>> 	>> The processing rules would be something like this:
>> 	>> If the client is sending layoutgets, the stateid.seqid for
>> 	>> the requests must be greater than or equal to the last value
>> 	>> returned for a layoutreturn.  If the client is sending
>> 	>> layoutreturns, the stateid.seqid must be greater than or
> equal
>> 	>> to the last received layoutget or the value provided in the
>> 	>> cb_layoutrecall.
>> 	>
>> 	> My expectation was that if the seqid you sent did not follow the
>> 	> appropriate "equal or greater" clause, you would get OLD_STATEID
>> 	> while if it is greater than the last one currently given out, you
>> 	> would get BAD_STATEID.  Chapter 8 would say that the rules for seqid
>> 	> on layouts are slightly different from those for IO, opens,
>> 	> byte-range locks, etc. and have an xref to chapter 12 where this was
>> 	> discussed.
>> 	>
>> 	> Appropriate wordsmithing of the above to deal with the fact that
>> 	> seqids are subject to wraparound is left as an exercise for anybody
>> 	> who doesn't pass the buck quickly enough :-)
>> 	
>> 	
>> 	Having reviewed all of the layout stateid material in the I-D I
> came
>> 	to the conclusion that seqid checking/validation the server
> needed
>> 	to do was limited to what is stated in:
>> 	      12.5.5.2.1.  Layout Recall and Return Sequencing
>> 	that I sent out (mainly unchanged).  Otherwise, the server just
>> 	processes the requests as they arrive and ensures that the seqid
>> 	is updated for each response.  It is then the client's responsibility
>> 	to sort out the responses to deal with overlapping
>> 	layoutget/layoutreturn processing.
>> 	
>> 	My apologies for not making this clear.
>> 	
>> 	I would suggest this wording be added to the second paragraph of
>> 	"12.5.3.  Layout Stateid"
>> 	
>> 	 Note the seqid processing for layout stateids differs from that of
>> 	 the other stateid types.  To support the parallelism desired, seqid
>> 	 processing rules are limited to those specified in the pNFS chapters
>> 	 (mainly this section and Section 12.5.5.2).
>> 	
>> 	
>> 	Spencer
>> 	
>> 	
>> 	> -----Original Message-----
>> 	> From: William A. (Andy) Adamson [mailto:andros at citi.umich.edu]
>> 	> Sent: Friday, March 21, 2008 10:05 AM
>> 	> To: Spencer Shepler
>> 	> Cc: NFSv4
>> 	> Subject: Re: [nfsv4] layout stateid handling/processing
> updates
>> 	>
>> 	>
>> 	> What about NFS4ERR_OLD_STATEID? If the client sends parallel
>> 	> LAYOUTGET/RETURN operations, they could all have the same
> stateid. The
>> 	> server will process the first one, and all remaining will have
> old
>> 	> stateids. Should the server ignore this error for layout
> stateids?
>> 	>
>> 	> --->Andy
>> 	>
>> 	>
>> 	> On Wed, Mar 19, 2008 at 3:06 PM, Spencer Shepler
>> 	> <Spencer.Shepler at sun.com> wrote:
>> 	>
>> 	> Included is the updated/proposed text to clarify layout
> stateid
>> 	> handling.
>> 	> The diffs were a little difficult to read so I have included
> the
>> 	> sections
>> 	> that have been updated.
>> 	>
>> 	> Spencer
>> 	>
>> 	> ----
>> 	>
>> 	> 12.5.3.  Layout Stateid
>> 	>
>> 	>    As with all other stateids, the layout stateid consists of a "seqid"
>> 	>    and "other" field.  Once a layout stateid is established, the
>> 	>    "other" field will stay constant unless the stateid is revoked, or
>> 	>    the client returns all layouts on the file and the server disposes
>> 	>    of the stateid.  The "seqid" field is initially set to one, and is
>> 	>    never zero on any NFSv4.1 operation that uses layout stateids,
>> 	>    whether it is a fore channel or backchannel operation.  After the
>> 	>    layout stateid is established, the server increments by one the
>> 	>    value of the "seqid" in each subsequent LAYOUTGET and LAYOUTRETURN
>> 	>    response, and in each CB_LAYOUTRECALL request.
>> 	>
>> 	>    Given the design goal of pNFS to provide parallelism, the
> layout
>> 	>    stateid differs from other stateid types in that the client is
>> 	>    expected to send LAYOUTGET and LAYOUTRETURN operations in parallel.
>> 	>    Since the server is incrementing the "seqid" value on each
> layout
>> 	>    operation, the client may determine the order of operation
>> 	> processing
>> 	>    by inspecting the "seqid" value.  In the case of
> overlapping ranges
>> 	>    in the client's requests, the ordering information will
> provide the
>> 	>    client the knowledge of which layout ranges are held.
> Additional
>> 	>    layout stateid sequencing requirements are provided in
>> 	>    Section 12.5.5.2.
>> 	>
>> 	>    Once the client receives a layout stateid, it MUST use the
> correct
>> 	>    "seqid" for subsequent LAYOUTGET or LAYOUTRETURN
> operations.  The
>> 	>    correct "seqid" is defined as the highest "seqid" value
> from fully
>> 	>    processed responses to LAYOUTGET or LAYOUTRETURN operations
> or
>> 	> fully
>> 	>    processed arguments of a CB_LAYOUTRECALL operation.
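
One way a client could keep track of the "correct" seqid described above
is sketched here.  The class and method names are invented for the
example, and comparing seqids with 32-bit wraparound (rather than a plain
"highest") is my assumption; the quoted text does not say how wraparound
is handled.

```python
SEQID_MOD = 1 << 32  # seqid is treated as a 32-bit unsigned counter


def seqid_newer(a, b):
    """True if seqid a is later than seqid b, allowing for wraparound."""
    return a != b and (a - b) % SEQID_MOD < SEQID_MOD // 2


class ClientLayoutStateid:
    """Hypothetical client-side record of one file's layout stateid."""

    def __init__(self, other, seqid):
        self.other = other  # opaque part, constant once established
        self.seqid = seqid  # latest seqid that was fully processed

    def note_fully_processed(self, seqid):
        # Call only after the LAYOUTGET/LAYOUTRETURN reply or the
        # CB_LAYOUTRECALL arguments have been fully processed.
        if seqid_newer(seqid, self.seqid):
            self.seqid = seqid

    def stateid_for_next_op(self):
        # The stateid to send in the next LAYOUTGET or LAYOUTRETURN.
        return (self.seqid, self.other)
```

Replies that finish processing out of order do not move the recorded
seqid backwards, so parallel operations always go out with the latest
fully processed value.
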
>> 	>
>> 	>    The client's receipt of a "seqid" is not sufficient for
> subsequent
>> 	>    use.  The client must fully process the operations before
> the
>> 	> "seqid"
>> 	>    can be used.  For LAYOUTGET results, if the client is not
> using the
>> 	>    forgetful model (Section 12.5.5.1), it MUST first update
> its record
>> 	>    of what ranges of the file's layout it has before using the
> seqid.
>> 	>    For LAYOUTRETURN results, the client MUST delete the range
> from its
>> 	>    record of what ranges of the file's layout it had before
> using the
>> 	>    seqid.  For CB_LAYOUTRECALL arguments, the client MUST send
> a
>> 	>    response to the recall before using the seqid.  The fundamental
>> 	>    requirement in client processing is that the "seqid" is used to
>> 	>    provide the ordering of processing.  LAYOUTGET results may
> be
>> 	>    processed in parallel.  LAYOUTRETURN results may be
> processed in
>> 	>    parallel.  LAYOUTGET and LAYOUTRETURN responses may be
> processed in
>> 	>    parallel as long as the ranges do not overlap.
>> 	>    CB_LAYOUTRECALL requests MUST be processed in "seqid" order at all
>> 	>    times.
>> 	>
>> 	>    Once a client has no more layouts on a file, the layout
> stateid
>> 	> is no
>> 	>    longer valid, and MUST NOT be used.  Any attempt to use
> such a
>> 	> layout
>> 	>    stateid will result in NFS4ERR_BAD_STATEID.
>> 	>
>> 	>
>> 	> ------
>> 	>
>> 	> 12.5.5.2.1.  Layout Recall and Return Sequencing
>> 	>
>> 	>    One critical issue with regard to layout operations
> sequencing
>> 	>    concerns callbacks.  The protocol must defend against races
> between
>> 	>    the reply to a LAYOUTGET or LAYOUTRETURN operation and a
> subsequent
>> 	>    CB_LAYOUTRECALL.  A client MUST NOT process a
> CB_LAYOUTRECALL that
>> 	>    implies one or more outstanding LAYOUTGET or LAYOUTRETURN
>> 	> operations
>> 	>    to which the client has not yet received a reply.  The
> client
>> 	> detects
>> 	>    such a CB_LAYOUTRECALL by examining the "seqid" field of
> the
>> 	> recall's
>> 	>    layout stateid.  If the "seqid" is not one higher than what
> the
>> 	>    client currently has recorded, and the client has at least
> one
>> 	>    LAYOUTGET and/or LAYOUTRETURN operation outstanding, the
> client
>> 	> knows
>> 	>    the server sent the CB_LAYOUTRECALL after sending a
> response to an
>> 	>    outstanding LAYOUTGET or LAYOUTRETURN.  The client MUST
> wait before
>> 	>    processing such a CB_LAYOUTRECALL until it processes all
> replies
>> 	> for
>> 	>    outstanding LAYOUTGET and LAYOUTRETURN operations for the
>> 	>    corresponding file with seqid less than the seqid given by
>> 	>    CB_LAYOUTRECALL (lor_stateid, see Section 20.3.)
>> 	>
>> 	>    In addition to the seqid-based mechanism, Section 2.10.5.3
>> 	> describes
>> 	>    the sessions mechanism for allowing the client to detect
> callback
>> 	>    race conditions and delay processing such a
> CB_LAYOUTRECALL.  The
>> 	>    server MAY reference conflicting operations in the
> CB_SEQUENCE that
>> 	>    precedes the CB_LAYOUTRECALL.  Because the server has
> already sent
>> 	>    replies for these operations before issuing the callback,
> the
>> 	> replies
>> 	>    may race with the CB_LAYOUTRECALL.  The client MUST wait
> for all
>> 	> the
>> 	>    referenced calls to complete and update its view of the
> layout
>> 	> state
>> 	>    before processing the CB_LAYOUTRECALL.
>> 	>
>> 	> 12.5.5.2.1.1.  Get/Return Sequencing
>> 	>
>> 	>    The protocol allows the client to send concurrent LAYOUTGET
> and
>> 	>    LAYOUTRETURN operations to the server.  The protocol does
> not
>> 	> provide
>> 	>    any means for the server to process the requests in the
> same
>> 	> order in
>> 	>    which they were created.  However, through the use of the
> "seqid"
>> 	>    field in the layout stateid, the client can determine the
> order in
>> 	>    which parallel outstanding operations were processed by the
> server.
>> 	>    Thus, when a layout retrieved by an outstanding LAYOUTGET
> operation
>> 	>    intersects with a layout returned by an outstanding
> LAYOUTRETURN on
>> 	>    the same file, the order in which the two conflicting
> operations
>> 	> are
>> 	>    processed determines the final state of the overlapping
> layout.
>> 	> The
>> 	>    order is determined by the "seqid" returned in each
> operation: the
>> 	>    operation with the higher seqid was executed later.
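
As a small illustration of the ordering rule above, the client can
replay the replies in seqid order to work out whether it still holds an
overlapping byte.  The tuple layout and the function name are made up
for this sketch, and it ignores seqid wraparound to keep it short.

```python
def offset_held(replies, offset):
    """Decide whether a byte offset is covered by a held layout.

    replies: iterable of (seqid, op, start, end) tuples, where op is
    "LAYOUTGET" or "LAYOUTRETURN" and [start, end) is the byte range.
    Assumes the seqids in the batch have not wrapped around.
    """
    held = False
    # Replay in the order the server executed the operations: the
    # operation with the higher seqid was executed later.
    for seqid, op, start, end in sorted(replies, key=lambda r: r[0]):
        if start <= offset < end:
            held = (op == "LAYOUTGET")
    return held
```

If a LAYOUTGET reply carries seqid 4 and a LAYOUTRETURN reply for the
same range carries seqid 5, the range is no longer held, whatever order
the replies arrived in.
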
>> 	>
>> 	>    It is permissible for the client to send in parallel
> multiple
>> 	>    LAYOUTGET operations for the same file or multiple
> LAYOUTRETURN
>> 	>    operations for the same file, and a mix of both.
>> 	>
>> 	>    It is permissible for the client to use the current stateid (see
>> 	>    Section 16.2.3.1.2) for LAYOUTGET operations for example
> when
>> 	>    compounding LAYOUTGETs or compounding OPEN and LAYOUTGETs.
> It is
>> 	>    also permissible to use the current stateid when
> compounding
>> 	>    LAYOUTRETURNs.
>> 	>
>> 	>    It is permissible for the client to use the current stateid when
>> 	>    combining
>> 	>    LAYOUTRETURN and LAYOUTGET operations for the same file in
> the same
>> 	>    COMPOUND request since the server MUST process these in
> order.
>> 	>    However, if a client does send such COMPOUND requests, it
> MUST NOT
>> 	>    have more than one outstanding for the same file at the
> same time
>> 	> and
>> 	>    MUST NOT have other LAYOUTGET or LAYOUTRETURN operations
>> 	> outstanding
>> 	>    at the same time for that same file.
>> 	>
>> 	> ----
>> 	>
>> 	> 12.5.5.2.1.2.  Client Considerations
>> 	>
>> 	>    Consider a pNFS client that has sent a LAYOUTGET and before
> it
>> 	>    receives the reply to LAYOUTGET, it receives a
> CB_LAYOUTRECALL for
>> 	>    the same file with an overlapping range.  There are two
>> 	>    possibilities, which the client can distinguish via the
> layout
>> 	>    stateid in the recall.
>> 	>
>> 	>    1.  The server processed the LAYOUTGET before issuing the
>> 	> recall, so
>> 	>        the LAYOUTGET must be waited for because it may be
> carrying
>> 	>        layout information that will need to be returned to
> deal with
>> 	> the
>> 	>        CB_LAYOUTRECALL.
>> 	>
>> 	>    2.  The server sent the callback before receiving the
> LAYOUTGET.
>> 	> The
>> 	>        server will not respond to the LAYOUTGET until the
>> 	>        CB_LAYOUTRECALL is processed.
>> 	>
>> 	>    If these possibilities cannot be distinguished, a deadlock
> could
>> 	>    result, as the client must wait for the LAYOUTGET response
> before
>> 	>    processing the recall in the first case, but that response
> will not
>> 	>    arrive until after the recall is processed in the second
> case.
>> 	> Note
>> 	>    that in the first case, the "seqid" in the layout stateid
> of the
>> 	>    recall is two greater than what the client has recorded and
> in the
>> 	>    second case, the "seqid" is one greater than what the
> client has
>> 	>    recorded.  This allows the client to disambiguate between
> the two
>> 	>    cases.  The client thus knows precisely which possibility
> applies.
>> 	>
>> 	>    In case 1 the client knows it needs to wait for the
> LAYOUTGET
>> 	>    response before processing the recall (or the client can
> return
>> 	>    NFS4ERR_DELAY).
>> 	>
>> 	>    In case 2 the client will not wait for the LAYOUTGET
> response
>> 	> before
>> 	>    processing the recall, because waiting would cause
> deadlock.
>> 	>    Therefore, the action at the client will only require
> waiting in
>> 	> the
>> 	>    case that the client has not yet seen the server's earlier
>> 	> responses
>> 	>    to the LAYOUTGET operation(s).
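
The two cases above boil down to a check on how far the recall's seqid
is ahead of what the client has recorded.  This is only a sketch: the
function name and the return strings are invented, and the modulo 2^32
subtraction is my assumption about how wraparound would be handled.

```python
SEQID_MOD = 1 << 32  # seqid is treated as a 32-bit unsigned counter


def recall_action(recorded_seqid, recall_seqid, outstanding_ops):
    """Decide what to do with an incoming CB_LAYOUTRECALL.

    outstanding_ops is the number of LAYOUTGET/LAYOUTRETURN operations
    on the file that have not yet been answered.
    """
    ahead = (recall_seqid - recorded_seqid) % SEQID_MOD
    if ahead == 1 or outstanding_ops == 0:
        # Case 2: the server sent the recall before handling any of our
        # outstanding operations, so waiting for them would deadlock.
        return "PROCESS_NOW"
    # Case 1: the server already replied to at least one outstanding
    # operation; wait for those replies or answer NFS4ERR_DELAY.
    return "WAIT_OR_DELAY"
```

With a recorded seqid of 7 and one LAYOUTGET outstanding, a recall with
seqid 8 is processed right away while a recall with seqid 9 has to wait
for the LAYOUTGET reply.
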
>> 	>
>> 	>    The recall process can be considered completed when the
> final
>> 	>    LAYOUTRETURN operation for the recalled range is completed.
> The
>> 	>    LAYOUTRETURN uses the layout stateid (with seqid) specified
> in
>> 	>    CB_LAYOUTRECALL.  If the client uses multiple LAYOUTRETURNs
> in
>> 	>    processing the recall, the first LAYOUTRETURN will use the
> layout
>> 	>    stateid as specified in CB_LAYOUTRECALL.  Subsequent
> LAYOUTRETURNs
>> 	>    will use the highest seqid as is the usual case.
>> 	>
>> 	> _______________________________________________
>> 	> nfsv4 mailing list
>> 	> nfsv4 at ietf.org
>> 	> https://www.ietf.org/mailman/listinfo/nfsv4
>> 	
>> 	
>> 	
>>
>>
>>
> 


