[PATCH 0/3] [RFC] knfsd: convert to kthread API and remove signaling for shutdown
Neil Brown
neilb at suse.de
Mon May 19 23:13:09 EDT 2008
On Monday May 19, jlayton at redhat.com wrote:
>
> I find all of the locking in rpc/nlm/nfs/nfsd *very* confusing. It's
> always been explained to me that it's best to not use locking to
> protect a section of code, but rather data structures. This makes it
> easier to get the locking right when the code changes.
Absolutely.
>
> This is the problem we have now with the BKL. So much of
> rpc/nlm/nfs/nfsd runs under it that it's nearly impossible to tell what
> it's intended to actually protect. If we're going to start a push
> toward BKL removal, my humble request is that we try to be as explicit
> as possible about what locks protect what data structures.
>
> So that's my question about this patch -- exactly what data is this new
> mutex intended to protect?
>
Fair question.
The new nfsd_mutex primarily protects "nfsd_serv", both the pointer
itself and various members of the structure that it sometimes points
to.
In particular, ->sv_nrthreads but also to some extent sv_temp_socks
and sv_permsocks.
Also nfsdstats.th_cnt.
If (out side the lock) nfsd_serv is non-NULL, then it must point to
a properly initialised 'struct svc_serv' with ->sv_nrthreads > 0, and
that number of nfsd threads must exist and each must listed in
->sp_all_threads for exactly on ->sv_pools[].
The particular thing that the lock is protecting is transitions
between 0 and non-0. On these transitions, the svc_serv struct needs
to be either created and initialised, or freed up.
Having said all that, I think I see a race.
When a new rqst is created at put on sp_all_threads, ->rq_task it not
set and doesn't get set until the thread runs and gets the mutex. So
there is a brief hole when ->rq_task isn't set. I don't know if that
can cause a problem, but it feels wrong.
When you switch to kthreads, you use the fact that kthread_create
returns a task_struct, and assign that to ->rq_task in
__svc_create_thread instead of nfsd, which will close that hole.
If you look in nfsctl.c, you will probably be able to find plenty of
places where nfsd_serv is dereferenced without any locking. These are
all wrong and need fixing.
NeilBrown
More information about the NFSv4
mailing list