[pnfs] callback problem
William A. (Andy) Adamson
andros at citi.umich.edu
Wed May 23 17:17:08 EDT 2007
On 5/23/07, Iyer, Rahul <Rahul.Iyer at netapp.com> wrote:
>
>
>
> > -----Original Message-----
> > From: William A. (Andy) Adamson [mailto:andros at citi.umich.edu]
> > Sent: Wednesday, May 23, 2007 2:08 PM
> > To: Iyer, Rahul
> > Cc: pnfs at linux-nfs.org
> > Subject: Re: [pnfs] callback problem
> >
> > hi rahul
> >
> > what branch?
> >
> It's the 4.1-sessions branch. Also, oddly enough, I'm not hitting the
> mempool error.
i got the mempool error when exchange_id failed in setup_session. the "Too
many channel attr bitmaps" message comes from nfsd4_decode_create_session,
which implies that exchange_id succeeded.
-->Andy
> This time around, I had the v4.1 mount fail and the v4.0
> mount succeed. The server did have a NULL pointer dereference though.
> Regards
> Rahul.
>
> >
> >
> > On 5/23/07, Iyer, Rahul <Rahul.Iyer at netapp.com> wrote:
> >
> > Hi Andy,
> > I get a NULL pointer dereference on the server side.
> >
> > Here's what I see on the server side:
> >
> > <start>
> > Too many channel attr bitmaps!
> > Unable to handle kernel NULL pointer dereference at 00000000000000b0 RIP:
> > [<ffffffff880568e2>] :nfsd:cmp_creds+0x2/0x10
> > PGD 1fc581067 PUD 1fc55f067 PMD 0
> > Oops: 0000 [1] PREEMPT SMP
> > CPU 1
> > Modules linked in: nfs nfsd exportfs lockd sunrpc
> > Pid: 4910, comm: nfsd Not tainted 2.6.18.3 #47
> > RIP: 0010:[<ffffffff880568e2>] [<ffffffff880568e2>] :nfsd:cmp_creds+0x2/0x10
> > RSP: 0018:ffff8101fce01a78 EFLAGS: 00010286
> > RAX: 0000000000000000 RBX: ffff8101fe7bf448 RCX: 0000000000000000
> > RDX: 2222222222222222 RSI: ffff8101fe7bf448 RDI: 00000000000000b0
> > RBP: ffff8101fce01b10 R08: ffffffff88079990 R09: ffffffff88079990
> > R10: ffff8101fce01a90 R11: ffff8101fe7e9800 R12: ffff8101fcf23080
> > R13: ffff8101fe7bf400 R14: ffff8101fce01a90 R15: 0000000000000003
> > FS: 00002baa68a666d0(0000) GS:ffff8101ffc01340(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > CR2: 00000000000000b0 CR3: 00000001fcab6000 CR4: 00000000000006e0
> > Process nfsd (pid: 4910, threadinfo ffff8101fce00000, task ffff8101fde60140)
> > Stack: ffffffff8805a14d ffff8101fe7e9800 190c1eac80244b79 3864663632396533
> > 3436393164623462 6335633234653932 3730383534633231 ffff8101fe7bf400
> > ffff8101fcf23000 ffff810100000024 ffff8101fce5c088 590e5b2216a85446
> > Call Trace:
> > [<ffffffff8805a14d>] :nfsd:nfsd4_exchange_id+0x16d/0x2a0
> > [<ffffffff8804fd5a>] :nfsd:nfsd4_proc_compound+0x15ba/0x1820
> > [<ffffffff8024465f>] __lock_acquire+0xc3f/0xd20
> > [<ffffffff80487e5f>] release_sock+0x2f/0xe0
> > [<ffffffff88009fd9>] :sunrpc:svc_sock_enqueue+0x49/0x2c0
> > [<ffffffff80242ba0>] check_usage+0x40/0x2b0
> > [<ffffffff80244d7b>] trace_hardirqs_on+0x11b/0x150
> > [<ffffffff8024465f>] __lock_acquire+0xc3f/0xd20
> > [<ffffffff8800a22d>] :sunrpc:svc_sock_enqueue+0x29d/0x2c0
> > [<ffffffff8800ff08>] :sunrpc:sunrpc_cache_lookup+0x98/0x170
> > [<ffffffff804f39e0>] _read_unlock+0x30/0x60
> > [<ffffffff8800ff08>] :sunrpc:sunrpc_cache_lookup+0x98/0x170
> > [<ffffffff8800c801>] :sunrpc:ip_map_lookup+0xc1/0xd0
> > [<ffffffff8805557a>] :nfsd:nfs4svc_decode_compoundargs+0xfaa/0x1080
> > [<ffffffff8803e121>] :nfsd:nfsd_dispatch+0x101/0x1e0
> > [<ffffffff880093fa>] :sunrpc:svc_process+0x46a/0x740
> > [<ffffffff8803e6f8>] :nfsd:nfsd+0x1d8/0x380
> > [<ffffffff8020ad2c>] child_rip+0xa/0x12
> > [<ffffffff804f3bbb>] _spin_unlock_irq+0x2b/0x60
> > [<ffffffff8020a33c>] restore_args+0x0/0x30
> > [<ffffffff8803e520>] :nfsd:nfsd+0x0/0x380
> > [<ffffffff8020ad22>] child_rip+0x0/0x12
> >
> >
> > Code: 39 07 55 48 89 e5 c9 0f 94 c0 0f b6 c0 c3 48 8b 05 b1 a4 02
> > RIP [<ffffffff880568e2>] :nfsd:cmp_creds+0x2/0x10
> > RSP <ffff8101fce01a78>
> > CR2: 00000000000000b0
> >
> > =====================================
> > [ BUG: lock held at task exit time! ]
> > -------------------------------------
> > nfsd/4910 is exiting with locks still held!
> > 2 locks held by nfsd/4910:
> > #0: (hash_sem){..--}, at: [<ffffffff88045200>] exp_readlock+0x10/0x20 [nfsd]
> > #1: (client_mutex){--..}, at: [<ffffffff804f2015>] mutex_lock+0x25/0x30
> >
> > stack backtrace:
> >
> > Call Trace:
> > [<ffffffff80241ab9>] debug_check_no_locks_held+0x89/0xa0
> > [<ffffffff8022acf3>] do_exit+0x923/0x990
> > [<ffffffff8021d4b3>] do_page_fault+0x7e3/0x8e0
> > [<ffffffff8024465f>] __lock_acquire+0xc3f/0xd20
> > [<ffffffff8026fae8>] kfree+0xf8/0x110
> > [<ffffffff804f1fbb>] __mutex_lock_slowpath+0x23b/0x270
> > [<ffffffff80244b79>] mark_held_locks+0x79/0xa0
> > [<ffffffff8020ab5d>] error_exit+0x0/0x96
> > [<ffffffff880568e2>] :nfsd:cmp_creds+0x2/0x10
> > [<ffffffff8805a14d>] :nfsd:nfsd4_exchange_id+0x16d/0x2a0
> > [<ffffffff8804fd5a>] :nfsd:nfsd4_proc_compound+0x15ba/0x1820
> > [<ffffffff8024465f>] __lock_acquire+0xc3f/0xd20
> > [<ffffffff80487e5f>] release_sock+0x2f/0xe0
> > [<ffffffff88009fd9>] :sunrpc:svc_sock_enqueue+0x49/0x2c0
> > [<ffffffff80242ba0>] check_usage+0x40/0x2b0
> > [<ffffffff80244d7b>] trace_hardirqs_on+0x11b/0x150
> > [<ffffffff8024465f>] __lock_acquire+0xc3f/0xd20
> > [<ffffffff8800a22d>] :sunrpc:svc_sock_enqueue+0x29d/0x2c0
> > [<ffffffff8800ff08>] :sunrpc:sunrpc_cache_lookup+0x98/0x170
> > [<ffffffff804f39e0>] _read_unlock+0x30/0x60
> > [<ffffffff8800ff08>] :sunrpc:sunrpc_cache_lookup+0x98/0x170
> > [<ffffffff8800c801>] :sunrpc:ip_map_lookup+0xc1/0xd0
> > [<ffffffff8805557a>] :nfsd:nfs4svc_decode_compoundargs+0xfaa/0x1080
> > [<ffffffff8803e121>] :nfsd:nfsd_dispatch+0x101/0x1e0
> > [<ffffffff880093fa>] :sunrpc:svc_process+0x46a/0x740
> > [<ffffffff8803e6f8>] :nfsd:nfsd+0x1d8/0x380
> > [<ffffffff8020ad2c>] child_rip+0xa/0x12
> > [<ffffffff804f3bbb>] _spin_unlock_irq+0x2b/0x60
> > [<ffffffff8020a33c>] restore_args+0x0/0x30
> > [<ffffffff8803e520>] :nfsd:nfsd+0x0/0x380
> > [<ffffffff8020ad22>] child_rip+0x0/0x12
> >
> >
> > <end>
> >
> > Regards
> > Rahul
> >
> >
> > > -----Original Message-----
> > > From: William A. (Andy) Adamson [mailto:andros at citi.umich.edu]
> > > Sent: Wednesday, May 23, 2007 1:42 PM
> > > To: Marc Eshel
> > > Cc: Iyer, Rahul; pnfs at linux-nfs.org
> > > Subject: Re: [pnfs] callback problem
> > >
> > >
> > >
> > > On 5/23/07, Marc Eshel <eshel at almaden.ibm.com> wrote:
> > >
> > > Some more observations; I will continue to debug.
> > >
> > > On the client side I got the following messages after mount, and 2 of
> > > them were incorrect. The first one, "Couldn't mount....", is wrong
> > > because the mount did complete and I was able to use it. The second one,
> > > "started cb service!", is wrong because the callback failed, since the
> > > IP address was 0.0.0.0 on the server side.
> > >
> > > [root at fin30 ~]# mount -t nfs4 fin18:/ /mnt
> > > Message from syslogd at fin30 at Wed May 23 11:34:03 2007 ...
> > > fin30 kernel: started cb service!
> > > Message from syslogd at fin30 at Wed May 23 11:34:05 2007 ...
> > > fin30 kernel: Couldn't mount using minorversion 1
> > > Message from syslogd at fin30 at Wed May 23 11:34:05 2007 ...
> > > fin30 kernel: started cb service!
> > >
> > >
> > >
> > > On the server side I had to comment out the list_move_tail line,
> > > since it was crashing with a null pointer dereference.
> > >
> > > static inline void
> > > renew_client(struct nfs4_client *clp)
> > > {
> > >         /*
> > >          * Move client to the end to the LRU list.
> > >          */
> > >         dprintk("renewing client (clientid %08x/%08x)\n",
> > >                 clp->cl_clientid.cl_boot,
> > >                 clp->cl_clientid.cl_id);
> > >         //??? list_move_tail(&clp->cl_lru, &client_lru);
> > >         clp->cl_time = get_seconds();
> > > }
> > >
> > >
> > > i think there is a race on the server regarding the nfs4_client.
> > > i've seen failures in:
> > >
> > > 1) find_confirmed_client_by_str, called by nfsd4_exchange_id
> > > 2) nfsd4_sequence, after renew_client had succeeded.
> > >
> > > -->Andy
> > >
> > >
> > >
> > >
> >
> >
> >
> >
> >
>
>