miscellaneous server patches

Neil Brown neilb at suse.de
Thu May 3 21:43:13 EDT 2007


On Wednesday May 2, bfields at fieldses.org wrote:
> 
> None of this is urgent.  The first three are just random bits I've had
> sitting in my tree, and that you already have in at least one case.
> 
> The fourth shares the grace period calculation between lockd and nfsd4,
> closing the obvious race that could occur when they end their grace
> periods at different times.  I've been holding off just because the
> nfsv4 server reboot recovery code is in some sort of limbo--we've been
> meaning to replace it, but the replacement isn't ready yet.
> 
> But any replacement is going to do something very similar, and I think
> it'd be useful to have this dealt with when we deal with the HA failover
> stuff.
> 
> This actually still leaves open a small window where a lockd client
> could get a lock before the nfsv4 grace period is completely finished;
> fixing that:
> 
> 	http://linux-nfs.org/cgi-bin/gitweb.cgi?p=bfields-2.6.git;a=commit;h=cf8c96007eb55851fbbaf53cdacab63d626b7c0a
> 
> requires some more locking, and seems like overkill for now.
> 

How important is it to get the grace periods synchronised anyway?

The clients don't really know when it is going to end - they just know
how long it is and at some point they find out that it has started.
Then they just slam out lock reclaim requests and hope they get them
all in in time.

To ensure a consistent picture for the clients to see, we just need to
ensure that we don't return success for any reclaim request after we
have permitted any (possibly conflicting) lock (or IO for NFSv4 and
mandatory locks).

So we just need to set a bit to say "grace is really over" - either a
system-wide bit or a per-file bit.

When we get a reclaim request, we check the bit: if it isn't set, we
allow the reclaim; if it is set, we refuse it.
When we get a non-reclaim conflicting request, if the bit is clear and
the grace period is not up, we reject the request, else we set the bit
and process the request.
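
A minimal sketch of that check for the system-wide-bit variant, in
plain user-space C rather than kernel code.  Everything named here
(struct grace_state, check_reclaim, check_normal, the verdict enum) is
made up for illustration and does not exist in lockd or nfsd; real
code would also need locking around the bit.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical state; these names are not taken from lockd or nfsd. */
struct grace_state {
	time_t grace_end;	/* when the nominal grace period expires */
	bool grace_over;	/* the "grace is really over" bit */
};

enum lock_verdict {
	LOCK_ALLOW,		/* process the request */
	LOCK_DENY_GRACE,	/* normal request refused: still in grace */
	LOCK_DENY_NO_GRACE	/* reclaim refused: grace is really over */
};

/* Reclaim request: allowed for as long as the bit is still clear. */
static enum lock_verdict check_reclaim(const struct grace_state *gs)
{
	return gs->grace_over ? LOCK_DENY_NO_GRACE : LOCK_ALLOW;
}

/*
 * Non-reclaim (possibly conflicting) request: refused while the bit is
 * clear and the grace period has not run out; otherwise the bit is set
 * and the request goes ahead.
 */
static enum lock_verdict check_normal(struct grace_state *gs, time_t now)
{
	if (!gs->grace_over && now < gs->grace_end)
		return LOCK_DENY_GRACE;
	gs->grace_over = true;
	return LOCK_ALLOW;
}
```

Note that in this sketch a late reclaim still succeeds after grace_end
as long as no ordinary request has been let through yet, and that once
the bit is set no reclaim can succeed, which is the consistency
property described above.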

Obviously every part of our code needs to have roughly the same idea
about the length of the grace period, so the NFSv4 server doesn't lie
too much to the client, but I don't think we need that much
synchronisation - just one bit somewhere.

Related topic:  I think it would be good to move all the purely
server-side parts of lockd into fs/nfsd.  This would include all the
grace handling, the list of open files, and a lot of code for handling
incoming lock requests.

The core lockd thread and the host monitoring stuff would need to stay
in fs/lockd/ as that is shared by client and server, but lots of stuff
could move.

I imagine nfsd passing an array of "struct svc_procedure" to lockd
which lists all the procedures that nfsd handles (well, one array for
each protocol version).  The nfs client could pass a similar list, and
there would be some that the lockd core handles (like sm_notify).
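
A rough user-space sketch of how such a registration might look.  This
is not the kernel's struct svc_procedure or any existing lockd
interface: struct proc_entry, lockd_register, lockd_dispatch, the
procedure numbers and the toy handlers are all hypothetical stand-ins,
chosen only to show one table per owner and per protocol version.

```c
#include <stddef.h>

#define NVERS	2	/* hypothetical number of protocol versions */
#define NPROCS	3	/* hypothetical number of procedures */

/* Hypothetical procedure numbers, not the real NLM ones. */
enum { PROC_LOCK, PROC_GRANTED, PROC_SM_NOTIFY };

typedef int (*proc_handler)(const void *arg, void *res);

/* Stand-in for a procedure table entry; NULL handler = not handled. */
struct proc_entry {
	proc_handler handler;
};

enum proc_owner { OWNER_CORE, OWNER_SERVER, OWNER_CLIENT, OWNER_MAX };

/* One table pointer per owner and per protocol version. */
struct lockd_core {
	const struct proc_entry *tables[OWNER_MAX][NVERS];
};

/* Called by nfsd, the nfs client, or the core to hand over a table. */
static void lockd_register(struct lockd_core *core, enum proc_owner who,
			   unsigned int vers, const struct proc_entry *table)
{
	if (who < OWNER_MAX && vers < NVERS)
		core->tables[who][vers] = table;
}

/* Route a request to whichever owner listed a handler; -1 if none. */
static int lockd_dispatch(const struct lockd_core *core, unsigned int vers,
			  unsigned int proc, const void *arg, void *res)
{
	if (vers >= NVERS || proc >= NPROCS)
		return -1;
	for (int who = 0; who < OWNER_MAX; who++) {
		const struct proc_entry *t = core->tables[who][vers];
		if (t && t[proc].handler)
			return t[proc].handler(arg, res);
	}
	return -1;
}

/* Toy handlers that only return a marker value. */
static int toy_server_lock(const void *a, void *r) { (void)a; (void)r; return 1; }
static int toy_client_granted(const void *a, void *r) { (void)a; (void)r; return 2; }
static int toy_core_sm_notify(const void *a, void *r) { (void)a; (void)r; return 3; }

static const struct proc_entry server_v0[NPROCS] = {
	[PROC_LOCK] = { toy_server_lock },
};
static const struct proc_entry client_v0[NPROCS] = {
	[PROC_GRANTED] = { toy_client_granted },
};
static const struct proc_entry core_v0[NPROCS] = {
	[PROC_SM_NOTIFY] = { toy_core_sm_notify },
};
```

With tables like these, a procedure that nobody has registered for a
given version simply falls through to an error, so the server-side
handlers can be absent when only the client is in use.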

Doing this would move all the grace handling into the one module and
probably make any synchronisation issue a lot easier to manage.

Anyone feel like trying?

NeilBrown

