[PATCH] nfsd: handle empty list in move_to_close_lru

J. Bruce Fields bfields at fieldses.org
Wed Jan 23 18:36:17 EST 2008


On Thu, Jan 24, 2008 at 01:30:06AM +0200, Benny Halevy wrote:
> J. Bruce Fields wrote:
>> On Wed, Jan 23, 2008 at 03:01:42PM +0200, Benny Halevy wrote:
>>   
>>> Apparently, fs/nfsd/nfs4state.c:move_to_close_lru
>>> may be called when sop->so_close_lru is empty.
>>> Without returning early, list_move_tail on the empty
>>> list crashes.
>>>     
>>
>> The list_move functions are just list_del's followed by list_add's, and
>> list_del's should be completely safe on empty lists.
>>
>> How did you decide the bad pointer dereference was in move_to_close_lru()?
>>   
>
> I was looking at the disassembled code at nfsd4_close+0xd3.
> I can send you the assembly code snippet tomorrow.

Sure, thanks!  I take it you don't have a way to reproduce this?  (Do
you know what was happening at the time?)  And this is a kernel with a
lot of 4.1 patches?

--b.
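
For readers following the list_del/list_add point quoted above, here is a minimal userspace sketch of the list primitives involved. It is modeled on include/linux/list.h but is not the kernel source verbatim; the helpers entry_is_poisoned() and demo_list_move_tail() are hypothetical names added for illustration. It shows that moving a self-linked ("empty") entry is harmless, while an entry that has already been through list_del() carries poison pointers, so a later move would write through them.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head (illustrative only). */
struct list_head {
	struct list_head *next, *prev;
};

/* Poison values as defined in include/linux/poison.h for kernels of this era. */
#define LIST_POISON1 ((struct list_head *)0x00100100)
#define LIST_POISON2 ((struct list_head *)0x00200200)

static inline void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static inline int list_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Unlink whatever sits between prev and next. */
static inline void __list_del(struct list_head *prev, struct list_head *next)
{
	next->prev = prev;
	prev->next = next;
}

/* Unlink an entry and poison its pointers so reuse faults loudly. */
static inline void list_del(struct list_head *entry)
{
	__list_del(entry->prev, entry->next);
	entry->next = LIST_POISON1;
	entry->prev = LIST_POISON2;
}

static inline void list_add_tail(struct list_head *new_entry,
				 struct list_head *head)
{
	struct list_head *prev = head->prev;

	head->prev = new_entry;
	new_entry->next = head;
	new_entry->prev = prev;
	prev->next = new_entry;
}

/* Unlink from the current position, then append to head. */
static inline void list_move_tail(struct list_head *entry,
				  struct list_head *head)
{
	__list_del(entry->prev, entry->next);
	list_add_tail(entry, head);
}

/* Hypothetical helper: was this entry last touched by list_del()? */
static inline int entry_is_poisoned(const struct list_head *entry)
{
	return entry->next == LIST_POISON1 && entry->prev == LIST_POISON2;
}

/* Hypothetical demo of the two cases discussed in the thread. */
void demo_list_move_tail(void)
{
	struct list_head close_lru, entry;

	INIT_LIST_HEAD(&close_lru);
	INIT_LIST_HEAD(&entry);		/* "empty": points at itself */

	/* Moving a self-linked entry only rewrites its own pointers. */
	list_move_tail(&entry, &close_lru);
	assert(close_lru.next == &entry && close_lru.prev == &entry);
	assert(entry.next == &close_lru && entry.prev == &close_lru);

	/* After list_del() the entry is poisoned, not self-linked. */
	list_del(&entry);
	assert(list_empty(&close_lru));
	assert(entry_is_poisoned(&entry));
	/* A second list_move_tail(&entry, ...) here would store through
	 * LIST_POISON1 and fault, so it is deliberately not executed. */
}
```

The register values in the oops quoted in this thread (RDX 0000000000100100, RAX 0000000000200200, fault address 0000000000100108) match these poison constants. That would be consistent with the entry having been removed with list_del() earlier rather than being merely empty, though this is an inference and is not stated anywhere in the thread.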

>
> Benny
>
>> --b.
>>
>>   
>>> Here's an oops trace for example: (note that move_to_close_lru
>>> is inlined by the compiler into nfsd4_close)
>>>
>>> Jan 23 12:37:16 bh-testlin1 kernel: Unable to handle kernel paging request at 0000000000100108 RIP:
>>> Jan 23 12:37:16 bh-testlin1 kernel:  [<ffffffff8824c604>] :nfsd:nfsd4_close+0xd3/0x123
>>> Jan 23 12:37:16 bh-testlin1 kernel: PGD 73827067 PUD 79cee067 PMD 0
>>> Jan 23 12:37:16 bh-testlin1 kernel: Oops: 0002 [1] SMP
>>> Jan 23 12:37:16 bh-testlin1 kernel: CPU 0
>>> Jan 23 12:37:16 bh-testlin1 kernel: Modules linked in: panfs(P) panlayoutdriver vmnet(P) parport_pc parport vmmon(P) nfsd auth_rpcgss exportfs autofs4 nfs lockd nfs_acl sunrpc ipv6 video output sbs sbshc battery ac sr_mod k8temp i2c_nforce2 hwmon i2c_core pcspkr forcedeth button cdrom pata_amd ata_generic sata_nv libata sd_mod scsi_mod ext3 jbd mbcache ehci_hcd ohci_hcd uhci_hcd
>>> Jan 23 12:37:16 bh-testlin1 kernel: Pid: 2163, comm: nfsd Tainted: P        2.6.24-rc8-panlayout #8
>>> Jan 23 12:37:16 bh-testlin1 kernel: RIP: 0010:[<ffffffff8824c604>]  [<ffffffff8824c604>] :nfsd:nfsd4_close+0xd3/0x123
>>> Jan 23 12:37:16 bh-testlin1 kernel: RSP: 0018:ffff81007d947dc0  EFLAGS: 00010246
>>> Jan 23 12:37:16 bh-testlin1 kernel: RAX: 0000000000200200 RBX: ffff810028118000 RCX: ffffffff8824c5d0
>>> Jan 23 12:37:16 bh-testlin1 kernel: RDX: 0000000000100100 RSI: ffffe20000909cd0 RDI: ffff810028118058
>>> Jan 23 12:37:16 bh-testlin1 kernel: RBP: ffff81007c524290 R08: 0000000000000166 R09: ffff810029516000
>>> Jan 23 12:37:16 bh-testlin1 kernel: R10: 0000000000000000 R11: ffff81007c524290 R12: ffff81007c4f0400
>>> Jan 23 12:37:16 bh-testlin1 kernel: R13: 0000000000000000 R14: ffff81007c4f0400 R15: ffff81007c4c4000
>>> Jan 23 12:37:16 bh-testlin1 kernel: FS:  00002ab8fd1ea6f0(0000) GS:ffffffff81371000(0000) knlGS:00000000f3a2cb90
>>> Jan 23 12:37:16 bh-testlin1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>>> Jan 23 12:37:16 bh-testlin1 kernel: CR2: 0000000000100108 CR3: 0000000073838000 CR4: 00000000000006e0
>>> Jan 23 12:37:16 bh-testlin1 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> Jan 23 12:37:16 bh-testlin1 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> Jan 23 12:37:16 bh-testlin1 kernel: Process nfsd (pid: 2163, threadinfo ffff81007d946000, task ffff81007d944000)
>>> Jan 23 12:37:16 bh-testlin1 kernel: Stack:  0000000000000000 ffffffff88264ec0 ffff810029516000 ffffffff88264da0
>>> Jan 23 12:37:16 bh-testlin1 kernel:  ffff81007c525000 ffff81007c524000 ffff81007c4f0400 ffffffff8823fef0
>>> Jan 23 12:37:16 bh-testlin1 kernel:  ffffffff882650f8 ffff81007c524288 ffff81007b8892c0 ffff81007c4c4000
>>> Jan 23 12:37:16 bh-testlin1 kernel: Call Trace:
>>> Jan 23 12:37:16 bh-testlin1 kernel:  [<ffffffff8823fef0>] :nfsd:nfsd4_proc_compound+0x2b1/0x476
>>> Jan 23 12:37:16 bh-testlin1 kernel:  [<ffffffff88231245>] :nfsd:nfsd_dispatch+0xde/0x1b6
>>> Jan 23 12:37:16 bh-testlin1 kernel:  [<ffffffff88187b9d>] :sunrpc:svc_process_common+0x2fc/0x5bd
>>> Jan 23 12:37:16 bh-testlin1 kernel:  [<ffffffff88188cce>] :sunrpc:svc_process+0x101/0x143
>>> Jan 23 12:37:16 bh-testlin1 kernel:  [<ffffffff88231819>] :nfsd:nfsd+0x1a1/0x2bc
>>> Jan 23 12:37:16 bh-testlin1 kernel:  [<ffffffff8100ccd8>] child_rip+0xa/0x12
>>> Jan 23 12:37:16 bh-testlin1 kernel:  [<ffffffff88231678>] :nfsd:nfsd+0x0/0x2bc
>>> Jan 23 12:37:16 bh-testlin1 kernel:  [<ffffffff8100ccce>] child_rip+0x0/0x12
>>> Jan 23 12:37:16 bh-testlin1 kernel:
>>> Jan 23 12:37:16 bh-testlin1 kernel:
>>> Jan 23 12:37:16 bh-testlin1 kernel: Code: 48 89 42 08 48 8b 35 69 53 02 00 48 89 10 48 c7 c2 70 19 27
>>> Jan 23 12:37:16 bh-testlin1 kernel: RIP  [<ffffffff8824c604>] :nfsd:nfsd4_close+0xd3/0x123
>>> Jan 23 12:37:16 bh-testlin1 kernel:  RSP <ffff81007d947dc0>
>>> Jan 23 12:37:16 bh-testlin1 kernel: CR2: 0000000000100108
>>> Jan 23 12:37:16 bh-testlin1 kernel: ---[ end trace 90ea1dfbd28e9e52 ]---
>>>
>>> Signed-off-by: Benny Halevy <bhalevy at panasas.com>
>>> ---
>>>  fs/nfsd/nfs4state.c |    3 +++
>>>  1 files changed, 3 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>>> index 27f284f..d181817 100644
>>> --- a/fs/nfsd/nfs4state.c
>>> +++ b/fs/nfsd/nfs4state.c
>>> @@ -1739,6 +1739,9 @@ move_to_close_lru(struct nfs4_stateowner *sop)
>>>  {
>>>  	dprintk("NFSD: move_to_close_lru nfs4_stateowner %p\n", sop);
>>>  
>>> +	if (list_empty(&sop->so_close_lru))
>>> +		return;
>>> +
>>>  	list_move_tail(&sop->so_close_lru, &close_lru);
>>>  	sop->so_time = get_seconds();
>>>  }
>>> -- 
>>> 1.5.3.3
>>>
>>>     
>

