possible module refcount leak with auth_gss
J. Bruce Fields
bfields at fieldses.org
Mon Dec 8 12:37:06 EST 2008
On Mon, Dec 08, 2008 at 10:28:55AM -0500, Jeff Layton wrote:
> We had someone report a bug against Fedora: they were seeing very
> high module reference counts for some krb5-related modules on their NFS
> server. For instance:
>
> # lsmod
> Module                  Size  Used by
> des_generic            25216  52736
> cbc                    12160  52736
> rpcsec_gss_krb5        15632  26370
>
> ...the cbc and des_generic each have roughly 2 module references per
> rpcsec_gss_krb5 refcount so I'm assuming that the "lynchpin" here is
> the rpcsec_gss_krb5 refcount which seems to be increasing w/o bound.
You may want to see this discussion:
http://marc.info/?t=122819524700001&r=1&w=2
And these patches:
http://marc.info/?l=linux-nfs&m=122843371318602&w=2
In addition to increasing the timeouts on those cache entries, perhaps
we could flush the contexts on rmmod? Or change the reference counting
somehow--e.g., take a reference only in the presence of export cache
entries that mention krb5, and destroy contexts when the last such goes
away?
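
To make the second idea concrete, here is a toy model in Python. It is
only a sketch of the counting scheme, not kernel code: ModuleRef,
ExportTable and all of their methods are hypothetical names invented for
the illustration. The assumption is that the first krb5 export pins the
mechanism module once, individual contexts take no module reference, and
removing the last krb5 export destroys the contexts and drops the pin.

```python
class ModuleRef:
    """Stand-in for a module's reference count (hypothetical, toy model)."""

    def __init__(self, name):
        self.name = name
        self.refcount = 0

    def get(self):
        self.refcount += 1

    def put(self):
        if self.refcount == 0:
            raise RuntimeError("unbalanced put on " + self.name)
        self.refcount -= 1


class ExportTable:
    """Pins the mechanism module only while krb5 exports exist."""

    def __init__(self, mech):
        self.mech = mech
        self.krb5_exports = set()
        self.contexts = set()

    def add_export(self, path, flavor):
        if flavor != "krb5" or path in self.krb5_exports:
            return
        if not self.krb5_exports:
            self.mech.get()  # first krb5 export takes the single reference
        self.krb5_exports.add(path)

    def remove_export(self, path):
        if path not in self.krb5_exports:
            return
        self.krb5_exports.remove(path)
        if not self.krb5_exports:
            self.contexts.clear()  # last krb5 export gone: destroy contexts
            self.mech.put()        # and release the module

    def establish_context(self, handle):
        if not self.krb5_exports:
            raise RuntimeError("no krb5 export configured")
        self.contexts.add(handle)  # note: no per-context module reference
```

With this scheme the module refcount is bounded by one no matter how
many contexts clients establish, at the cost of tearing down live
contexts when the last krb5 export is removed.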
Also to check: a recent client should be sending destroy_ctx calls on
unmount, and a recent server should be acting on them. Perhaps there's
a bug there. I'd do an unmount, watch the wire to make sure the
destroy_ctx calls are really going across (they'll look like NFSv4 NULL
calls, with the interesting fields in the cred in the rpc header). Then
take a close look at the destroy_ctx code (see the second occurrence of
RPC_GSS_PROC_DESTROY in svcauth_gss_accept(), around line 1126).
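
While testing the destroy_ctx path, it may help to measure the refcount
change across each mount/unmount cycle rather than eyeballing lsmod. A
minimal sketch in Python follows. It assumes the usual /proc/modules
layout, where the first three whitespace-separated fields are module
name, size and reference count; the function names are made up for this
example. Lines whose refcount field is not a number (it shows as "-" on
kernels built without module unloading) are skipped.

```python
def parse_modules(text):
    """Return {module_name: refcount} from /proc/modules-style text."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # blank or malformed line
        name, _size, refs = fields[:3]
        if not refs.isdigit():
            continue  # refcount not reported for this module
        counts[name] = int(refs)
    return counts


def refcount_growth(before, after, watched):
    """Return the refcount change for each watched module."""
    return {m: after.get(m, 0) - before.get(m, 0) for m in watched}


def snapshot(path="/proc/modules"):
    """Read and parse the current module list (Linux only)."""
    with open(path) as f:
        return parse_modules(f.read())
```

On the server, take one snapshot() before the client mounts, another
after it unmounts, and pass both to refcount_growth() with the three
module names above. A nonzero result after the destroy_ctx calls have
been seen on the wire would point at the server side.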
--b.
>
> I've been able to reproduce this fairly easily by setting up an NFS
> server with a krb5 authenticated export. If I then mount that and
> immediately unmount it from a client, the refcount on rpcsec_gss_krb5 on
> the server increases by 1. For instance:
>
> First mount and unmount:
> Module                  Size  Used by
> cbc                    12288  2
> rpcsec_gss_krb5        19208  1
> des_generic            25344  2
>
> Second mount and unmount:
> Module                  Size  Used by
> cbc                    12288  4
> rpcsec_gss_krb5        19208  2
> des_generic            25344  4
>
> Third mount and unmount:
> Module                  Size  Used by
> cbc                    12288  6
> rpcsec_gss_krb5        19208  3
> des_generic            25344  6
>
> ...while that's an easy way to reproduce it, there may be other ways to
> make it grow.
>
> Some printk debugging shows that the references are increased as a
> result of rsc_parse(). From my (rather naive) look at this code, it
> looks like each entry in the rsc_cache holds a module reference.
>
> I'm guessing that when these cache entries are released, the module
> references also get released, but I haven't been successful in making
> that occur. It seems like the module references are never put, so either
> the entries are never getting flushed out of the cache or the module
> references aren't being properly released by this code. There's no
> "content" file for this cache though, so it's hard to tell whether the
> cache is populated at any given time.
>
> Either way, this seems likely to be a bug. There doesn't seem to be a
> way to make the refcounts go down again once they've been increased. Can
> anyone confirm whether this is working as intended? If not, do you have
> any idea where the problem may be, or how to approach tracking this
> down? Unfortunately, I'm finding this code to be very hard to follow.
>
> Any help or suggestions appreciated...
>
> Thanks,
> --
> Jeff Layton <jlayton at redhat.com>