PATCH - Locking issues and memory leak

J. Bruce Fields bfields at fieldses.org
Mon Dec 3 10:48:21 EST 2007


On Tue, Nov 27, 2007 at 08:48:44PM -0500, bfields wrote:
> On Wed, Nov 28, 2007 at 08:42:23AM +0900, Steven Wilton wrote:
> > > -----Original Message-----
> > > From: J. Bruce Fields [mailto:bfields at fieldses.org] 
> > > Sent: Tuesday, 27 November 2007 11:20 PM
> > > To: Steven Wilton
> > > Cc: Benny Halevy; Trond Myklebust; nfsv4 at linux-nfs.org
> > > Subject: Re: PATCH - Locking issues and memory leak
> > > 
> > > On Tue, Nov 27, 2007 at 02:41:23PM +0900, Steven Wilton wrote:
> > > > I may have spoken up too early...  The patch does appear to fix the
> > > > memory leak on the test server, and does get rid of the sequence error
> > > > messages from the kernel logs on the production server, however the NFS
> > > > server still looks like it's leaking nfsd4_stateowners and size-32
> > > > objects in the output of slabtop.
> > > 
> > > What's the slabtop data?  (And how are you sure it represents a leak?)
> > > 
> > 
> > 1014496 1014444  99%    0.03K   9058      112     36232K size-32
> > 1012122 1012122 100%    0.41K 112458        9    449832K nfsd4_stateowners
> >  299174  242530  81%    0.20K  15746       19     62984K dentry_cache
> > 
> > These are the top 3 entries.  The top 2 are increasing over time, and
> > will eventually consume all the available memory in the system.  In
> > contrast, the other server (same clients, same kernel version, still
> > acquiring/releasing locks, but different and limited applications
> > performing file operations and more NFS operations being performed) has
> > the following values in the output of slabtop:
> > 
> >  34832  20023  57%    0.03K    311      112      1244K size-32
> >  18117  17862  98%    0.41K   2013        9      8052K nfsd4_stateowners
> 
> Got it, thanks.
> 
> ...
> 
> > I'll try to figure out a test case that reproduces the memory leaks.  Is
> > it possible to get a quick explanation of how the stateowner slab memory
> > is freed / managed in the nfs4 code?  The only place that I could see it
> > happening obviously was in nfs4_free_stateowner(), but that only looks
> > like it's used as a callback in nfsd4_encode_lock_denied().
> 
> Most of the calls to nfs4_free_stateowner() are by way of
> nfs4_put_stateowner().
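
To illustrate the put/free relationship mentioned above, here is a minimal
user-space sketch of a reference-counted object whose free routine is only
reached through its put routine.  It is not the nfsd code: struct owner,
owner_alloc(), owner_get(), owner_put(), owner_free() and the owners_freed
counter are all hypothetical names, and a plain int stands in for whatever
reference count the kernel actually uses.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a stateowner; not the real nfsd structure. */
struct owner {
	int refcount;		/* number of holders still using the object */
};

/* Illustration only: counts how many objects have really been freed. */
static int owners_freed;

/* Allocate an object; the creator holds the first reference. */
static struct owner *owner_alloc(void)
{
	struct owner *so = malloc(sizeof(*so));

	if (so)
		so->refcount = 1;
	return so;
}

/* Take an additional reference. */
static void owner_get(struct owner *so)
{
	so->refcount++;
}

/* Only ever called from owner_put(), when the last reference goes away. */
static void owner_free(struct owner *so)
{
	owners_freed++;
	free(so);
}

/*
 * Drop one reference and free the object when the count reaches zero.
 * If any holder forgets its put, the count never reaches zero and the
 * object is never freed.
 */
static void owner_put(struct owner *so)
{
	assert(so->refcount > 0);
	if (--so->refcount == 0)
		owner_free(so);
}
```

With this pattern a single missing put on some code path is enough to keep an
object allocated indefinitely, which is one way a cache can keep growing even
though the free routine itself is correct.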

OK, sorry for neglecting this.  I guess the most likely way to help make
progress is to get a testcase, but I'm not sure how to suggest we find
one.

It's unfortunately possible, as I've said before, to DoS our server just
by opening a lot of files and never closing them--we need to impose some
limits.  But assuming your clients aren't buggy, it seems unlikely
they're really holding over a million open files.

--b.

