[PATCH] nfsd: handle empty list in move_to_close_lru
J. Bruce Fields
bfields at fieldses.org
Wed Jan 23 18:36:17 EST 2008
On Thu, Jan 24, 2008 at 01:30:06AM +0200, Benny Halevy wrote:
> J. Bruce Fields wrote:
>> On Wed, Jan 23, 2008 at 03:01:42PM +0200, Benny Halevy wrote:
>>
>>> Apparently, fs/nfsd/nfs4state.c:move_to_close_lru
>>> may be called when sop->so_close_lru is empty.
>>> Without returning early, list_move_tail on the empty
>>> list crashes.
>>>
>>
>> The list_move functions are just list_del's followed by list_add's, and
>> list_del's should be completely safe on empty lists.
>>
>> How did you decide the bad pointer dereference was in move_to_close_lru()?
>>
>
> I was looking at the disassembled code at nfsd4_close+0xd3.
> I can send you the assembly code snippet tomorrow.
Sure, thanks! I take it you don't have a way to reproduce this? (Do
you know what was happening at the time?) And this is a kernel with a
lot of 4.1 patches?
--b.
>
> Benny
>
>> --b.
>>
>>
>>> Here's an oops trace for example: (note that move_to_close_lru
>>> is inlined by the compiler into nfsd4_close)
>>>
>>> Jan 23 12:37:16 bh-testlin1 kernel: Unable to handle kernel paging request at 0000000000100108 RIP:
>>> Jan 23 12:37:16 bh-testlin1 kernel: [<ffffffff8824c604>] :nfsd:nfsd4_close+0xd3/0x123
>>> Jan 23 12:37:16 bh-testlin1 kernel: PGD 73827067 PUD 79cee067 PMD 0
>>> Jan 23 12:37:16 bh-testlin1 kernel: Oops: 0002 [1] SMP
>>> Jan 23 12:37:16 bh-testlin1 kernel: CPU 0
>>> Jan 23 12:37:16 bh-testlin1 kernel: Modules linked in: panfs(P) panlayoutdriver vmnet(P) parport_pc parport vmmon(P) nfsd auth_rpcgss exportfs autofs4 nfs lockd nfs_acl sunrpc ipv6 video output sbs sbshc battery ac sr_mod k8temp i2c_nforce2 hwmon i2c_core pcspkr forcedeth button cdrom pata_amd ata_generic sata_nv libata sd_mod scsi_mod ext3 jbd mbcache ehci_hcd ohci_hcd uhci_hcd
>>> Jan 23 12:37:16 bh-testlin1 kernel: Pid: 2163, comm: nfsd Tainted: P 2.6.24-rc8-panlayout #8
>>> Jan 23 12:37:16 bh-testlin1 kernel: RIP: 0010:[<ffffffff8824c604>] [<ffffffff8824c604>] :nfsd:nfsd4_close+0xd3/0x123
>>> Jan 23 12:37:16 bh-testlin1 kernel: RSP: 0018:ffff81007d947dc0 EFLAGS: 00010246
>>> Jan 23 12:37:16 bh-testlin1 kernel: RAX: 0000000000200200 RBX: ffff810028118000 RCX: ffffffff8824c5d0
>>> Jan 23 12:37:16 bh-testlin1 kernel: RDX: 0000000000100100 RSI: ffffe20000909cd0 RDI: ffff810028118058
>>> Jan 23 12:37:16 bh-testlin1 kernel: RBP: ffff81007c524290 R08: 0000000000000166 R09: ffff810029516000
>>> Jan 23 12:37:16 bh-testlin1 kernel: R10: 0000000000000000 R11: ffff81007c524290 R12: ffff81007c4f0400
>>> Jan 23 12:37:16 bh-testlin1 kernel: R13: 0000000000000000 R14: ffff81007c4f0400 R15: ffff81007c4c4000
>>> Jan 23 12:37:16 bh-testlin1 kernel: FS: 00002ab8fd1ea6f0(0000) GS:ffffffff81371000(0000) knlGS:00000000f3a2cb90
>>> Jan 23 12:37:16 bh-testlin1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>>> Jan 23 12:37:16 bh-testlin1 kernel: CR2: 0000000000100108 CR3: 0000000073838000 CR4: 00000000000006e0
>>> Jan 23 12:37:16 bh-testlin1 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> Jan 23 12:37:16 bh-testlin1 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> Jan 23 12:37:16 bh-testlin1 kernel: Process nfsd (pid: 2163, threadinfo ffff81007d946000, task ffff81007d944000)
>>> Jan 23 12:37:16 bh-testlin1 kernel: Stack: 0000000000000000 ffffffff88264ec0 ffff810029516000 ffffffff88264da0
>>> Jan 23 12:37:16 bh-testlin1 kernel: ffff81007c525000 ffff81007c524000 ffff81007c4f0400 ffffffff8823fef0
>>> Jan 23 12:37:16 bh-testlin1 kernel: ffffffff882650f8 ffff81007c524288 ffff81007b8892c0 ffff81007c4c4000
>>> Jan 23 12:37:16 bh-testlin1 kernel: Call Trace:
>>> Jan 23 12:37:16 bh-testlin1 kernel: [<ffffffff8823fef0>] :nfsd:nfsd4_proc_compound+0x2b1/0x476
>>> Jan 23 12:37:16 bh-testlin1 kernel: [<ffffffff88231245>] :nfsd:nfsd_dispatch+0xde/0x1b6
>>> Jan 23 12:37:16 bh-testlin1 kernel: [<ffffffff88187b9d>] :sunrpc:svc_process_common+0x2fc/0x5bd
>>> Jan 23 12:37:16 bh-testlin1 kernel: [<ffffffff88188cce>] :sunrpc:svc_process+0x101/0x143
>>> Jan 23 12:37:16 bh-testlin1 kernel: [<ffffffff88231819>] :nfsd:nfsd+0x1a1/0x2bc
>>> Jan 23 12:37:16 bh-testlin1 kernel: [<ffffffff8100ccd8>] child_rip+0xa/0x12
>>> Jan 23 12:37:16 bh-testlin1 kernel: [<ffffffff88231678>] :nfsd:nfsd+0x0/0x2bc
>>> Jan 23 12:37:16 bh-testlin1 kernel: [<ffffffff8100ccce>] child_rip+0x0/0x12
>>> Jan 23 12:37:16 bh-testlin1 kernel:
>>> Jan 23 12:37:16 bh-testlin1 kernel:
>>> Jan 23 12:37:16 bh-testlin1 kernel: Code: 48 89 42 08 48 8b 35 69 53 02 00 48 89 10 48 c7 c2 70 19 27
>>> Jan 23 12:37:16 bh-testlin1 kernel: RIP [<ffffffff8824c604>] :nfsd:nfsd4_close+0xd3/0x123
>>> Jan 23 12:37:16 bh-testlin1 kernel: RSP <ffff81007d947dc0>
>>> Jan 23 12:37:16 bh-testlin1 kernel: CR2: 0000000000100108
>>> Jan 23 12:37:16 bh-testlin1 kernel: ---[ end trace 90ea1dfbd28e9e52 ]---
>>>
>>> Signed-off-by: Benny Halevy <bhalevy at panasas.com>
>>> ---
>>> fs/nfsd/nfs4state.c | 3 +++
>>> 1 files changed, 3 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>>> index 27f284f..d181817 100644
>>> --- a/fs/nfsd/nfs4state.c
>>> +++ b/fs/nfsd/nfs4state.c
>>> @@ -1739,6 +1739,9 @@ move_to_close_lru(struct nfs4_stateowner *sop)
>>>  {
>>>  	dprintk("NFSD: move_to_close_lru nfs4_stateowner %p\n", sop);
>>>
>>> +	if (list_empty(&sop->so_close_lru))
>>> +		return;
>>> +
>>>  	list_move_tail(&sop->so_close_lru, &close_lru);
>>>  	sop->so_time = get_seconds();
>>>  }
>>> --
>>> 1.5.3.3
>>>
>>>
>
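To make the point about list_del on an empty list concrete, here is a
minimal userspace sketch. It imitates the kernel's list helpers rather than
including the real headers, so treat it as an approximation: the struct and
function bodies are simplified copies, and the two check functions
(move_empty_entry_is_safe, deleted_entry_is_poisoned) and
poisoned_del_fault_addr are hypothetical names made up for illustration.
The poison constants are the values from include/linux/poison.h of that era.
It also shows why a second delete of an already-deleted entry would fault
near 0x100100, which is at least consistent with RDX, RAX and CR2 in the
trace above, though that is an observation and not a diagnosis.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace imitation of the kernel's struct list_head (not the real header). */
struct list_head { struct list_head *next, *prev; };

/* Poison values written by list_del(), as in include/linux/poison.h. */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0x00100100)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0x00200200)

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h; h->prev = h; }

static int list_empty(const struct list_head *h) { return h->next == h; }

static void __list_del(struct list_head *prev, struct list_head *next)
{
	next->prev = prev;	/* this store faults if next is LIST_POISON1 */
	prev->next = next;
}

static void list_del(struct list_head *entry)
{
	__list_del(entry->prev, entry->next);
	entry->next = LIST_POISON1;
	entry->prev = LIST_POISON2;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	struct list_head *prev = head->prev;

	head->prev = entry;
	entry->next = head;
	entry->prev = prev;
	prev->next = entry;
}

/* list_move_tail is an unlink followed by an add at the tail. */
static void list_move_tail(struct list_head *entry, struct list_head *head)
{
	__list_del(entry->prev, entry->next);
	list_add_tail(entry, head);
}

/* Returns 1 if moving an initialized, self-linked (empty) entry works. */
static int move_empty_entry_is_safe(void)
{
	struct list_head lru, entry;

	INIT_LIST_HEAD(&lru);
	INIT_LIST_HEAD(&entry);
	list_move_tail(&entry, &lru);	/* the unlink step is a no-op here */
	return lru.next == &entry && lru.prev == &entry &&
	       entry.next == &lru && entry.prev == &lru;
}

/* Returns 1 if list_del() leaves the poison values in the entry. */
static int deleted_entry_is_poisoned(void)
{
	struct list_head lru, entry;

	INIT_LIST_HEAD(&lru);
	list_add_tail(&entry, &lru);
	list_del(&entry);
	return entry.next == LIST_POISON1 && entry.prev == LIST_POISON2 &&
	       list_empty(&lru);
}

/* Address the first store in __list_del() would hit for a poisoned entry. */
static uintptr_t poisoned_del_fault_addr(void)
{
	return (uintptr_t)LIST_POISON1 + offsetof(struct list_head, prev);
}
```

Both check functions are expected to return 1 when called from a small test
main. On a 64-bit build, where the prev pointer sits at offset 8,
poisoned_del_fault_addr() works out to 0x100108.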