[pnfs] [PATCH] nfs41: Correctly use new session in exception handling
William A. (Andy) Adamson
androsadamson at gmail.com
Tue May 27 12:48:30 EDT 2008
Hi Benny
I can't reproduce the problem. Using todays tree, origin/nfs41 branch on
client and server, I attempt to mount a non-existing path: the server does
not export a /unix4.
mount -t nfs4 192.168.1.105:/unix4 /mnt
What I see is a successful EXCHANGE_ID, CREATE_SESSION (with a successful
callback channel established), SEQ/PUTROOTFH/GETFH/GETATTR, a couple of more
successful SEQ/PUTFH/GETATTRS to establish the root, then a
PUTFH/LOOKUP/GETATTR for /unix4 that fails, followed by a DESTROY_SESSION.
Are you doing something different?
Here is my /etc/exports file:
/var/lib/nfs/v4root *(fsid=0,rw,insecure,no_root_squash,nohide,crossmnt)
/var/lib/nfs/v4root/spnfs *(rw,insecure,no_root_squash,nohide,crossmnt)
/var/lib/nfs/v4root/unix *(rw,insecure,no_root_squash,nohide,crossmnt)
with
/export on /var/lib/nfs/v4root/spnfs type none (rw,bind)
/unix on /var/lib/nfs/v4root/unix type none (rw,bind)
-->Andy
On Sun, May 25, 2008 at 1:19 PM, Benny Halevy <bhalevy at panasas.com> wrote:
> Andy, I think that this approach is better than what we had
> before but bare in mind that it breaks the server when
> the exported path is not found.
>
> I get the following oops when the client mounts a non-existing path:
> (with this patch applied)
>
> May 25 18:48:01 bh-testlin1 kernel: BUG: unable to handle kernel NULL
> pointer dereference at 00000000000000a0
> May 25 18:48:01 bh-testlin1 kernel: IP: [<ffffffff88263233>]
> :sunrpc:bc_send_request+0xc0/0x1e2
> May 25 18:48:01 bh-testlin1 kernel: PGD 78800067 PUD 78811067 PMD 0
> May 25 18:48:01 bh-testlin1 kernel: Oops: 0000 [1] SMP
> May 25 18:48:01 bh-testlin1 kernel: CPU 0
> May 25 18:48:01 bh-testlin1 kernel: Modules linked in: nfsd auth_rpcgss
> exportfs autofs4 nfs lockd nfs_acl sunrpc ipv6 dm_mirror dm_multipath dm_mod
> video output sbs sbshc battery ac kvm_amd kvm snd_hda_intel button snd_pcm
> k8temp i2c_nforce2 i2c_core snd_timer pcspkr snd_page_alloc snd_hwdep sr_mod
> snd soundcore hwmon forcedeth serio_raw sg cdrom pata_amd ata_generic
> pata_acpi sata_nv libata sd_mod scsi_mod ext3 jbd mbcache ehci_hcd ohci_hcd
> uhci_hcd [last unloaded: scsi_wait_scan]
> May 25 18:48:01 bh-testlin1 kernel: Pid: 3271, comm: nfs4_cb_probe Not
> tainted 2.6.25-nfs41 #39
> May 25 18:48:01 bh-testlin1 kernel: RIP: 0010:[<ffffffff88263233>]
> [<ffffffff88263233>] :sunrpc:bc_send_request+0xc0/0x1e2
> May 25 18:48:01 bh-testlin1 kernel: RSP: 0018:ffff810078991d10 EFLAGS:
> 00010246
> May 25 18:48:01 bh-testlin1 kernel: RAX: 0000000000000000 RBX:
> 0000000000000074 RCX: 0000000000000005
> May 25 18:48:01 bh-testlin1 kernel: RDX: ffff8100788ec008 RSI:
> 0000000000000057 RDI: ffff8100788ec008
> May 25 18:48:01 bh-testlin1 kernel: RBP: ffff8100788ec008 R08:
> ffff81007880eb60 R09: ffff81007880eb20
> May 25 18:48:01 bh-testlin1 kernel: R10: 0000000000000000 R11:
> ffffffff88267550 R12: ffff81007893e000
> May 25 18:48:01 bh-testlin1 kernel: R13: ffff81007893e008 R14:
> 0000000000000000 R15: ffff8100798ee340
> May 25 18:48:01 bh-testlin1 kernel: FS: 00007f05577b26f0(0000)
> GS:ffffffff8139c000(0000) knlGS:0000000000000000
> May 25 18:48:01 bh-testlin1 kernel: CS: 0010 DS: 0018 ES: 0018 CR0:
> 000000008005003b
> May 25 18:48:01 bh-testlin1 kernel: CR2: 00000000000000a0 CR3:
> 0000000078810000 CR4: 00000000000006e0
> May 25 18:48:01 bh-testlin1 kernel: DR0: 0000000000000000 DR1:
> 0000000000000000 DR2: 0000000000000000
> May 25 18:48:01 bh-testlin1 kernel: DR3: 0000000000000000 DR6:
> 00000000ffff0ff0 DR7: 0000000000000400
> May 25 18:48:01 bh-testlin1 kernel: Process nfs4_cb_probe (pid: 3271,
> threadinfo ffff810078990000, task ffff810076157360)
> May 25 18:48:01 bh-testlin1 kernel: Stack: 00000000ffffffff
> ffff810079d1c258 0000000000000286 0000000000000000
> May 25 18:48:01 bh-testlin1 kernel: 0000000000000074 ffff81007880eb20
> ffff81007880e800 ffff81007893e000
> May 25 18:48:01 bh-testlin1 kernel: ffff81007e1f0200 ffff81007893e0b8
> 0000000000000000 ffffffff8826197b
> May 25 18:48:01 bh-testlin1 kernel: Call Trace:
> May 25 18:48:01 bh-testlin1 kernel: [<ffffffff8826197b>] ?
> :sunrpc:xprt_transmit+0xd3/0x1c1
> May 25 18:48:01 bh-testlin1 kernel: [<ffffffff8832873c>] ?
> :nfsd:nfs4_xdr_enc_cb_null+0x0/0x50
> May 25 18:48:01 bh-testlin1 kernel: [<ffffffff8825f4ab>] ?
> :sunrpc:call_transmit+0x200/0x23f
> May 25 18:48:01 bh-testlin1 kernel: [<ffffffff88265f3f>] ?
> :sunrpc:__rpc_execute+0x7b/0x21a
> May 25 18:48:01 bh-testlin1 kernel: [<ffffffff8825fc62>] ?
> :sunrpc:rpc_run_task+0x4f/0x56
> May 25 18:48:01 bh-testlin1 kernel: [<ffffffff8825fcfe>] ?
> :sunrpc:rpc_call_sync+0x3e/0x5b
> May 25 18:48:01 bh-testlin1 kernel: [<ffffffff88328e8f>] ?
> :nfsd:do_probe_callback+0x250/0x2c6
> May 25 18:48:01 bh-testlin1 kernel: [<ffffffff88328c3f>] ?
> :nfsd:do_probe_callback+0x0/0x2c6
> May 25 18:48:01 bh-testlin1 kernel: [<ffffffff81043ddd>] ?
> kthread+0x47/0x76
> May 25 18:48:01 bh-testlin1 kernel: [<ffffffff8100cac8>] ?
> child_rip+0xa/0x12
> May 25 18:48:01 bh-testlin1 kernel: [<ffffffff81043d96>] ?
> kthread+0x0/0x76
> May 25 18:48:01 bh-testlin1 kernel: [<ffffffff8100cabe>] ?
> child_rip+0x0/0x12
> May 25 18:48:01 bh-testlin1 kernel:
> May 25 18:48:01 bh-testlin1 kernel:
> May 25 18:48:01 bh-testlin1 kernel: Code: 49 8b 5d 08 48 63 c7 c7 44 24 14
> 00 80 00 00 48 39 d8 b8 00
> 00 00 00 0f 45 44 24 14 89 44 24 14 48 8b 6d 08 49 8b 47 10 48 89 ef <4c>
> 8b a0 a0 00 00 00 e8 61 f2
> db f8 48 c1 e8 0c 89 ea 48 be 00
> May 25 18:48:01 bh-testlin1 kernel: RIP [<ffffffff88263233>]
> :sunrpc:bc_send_request+0xc0/0x1e2
> May 25 18:48:01 bh-testlin1 kernel: RSP <ffff810078991d10>
> May 25 18:48:01 bh-testlin1 kernel: CR2: 00000000000000a0
> May 25 18:48:01 bh-testlin1 kernel: ---[ end trace e76d1803ca214c10 ]---
>
> On May. 23, 2008, 21:29 +0300, andros at netapp.com wrote:
> > From: Andy Adamson <andros at netapp.com>
> >
> > We will need to handle several layers of session recovery depending on
> the
> > error. As a last resort, we will need to use the big hammer; destroy
> > an existing session and create a new one.
> >
> > The current code creates a new session without destroying the old one.
> > Destroy the existing session and initialize a new session before calling
> > nfs41_recover_session_sync.
> >
> > Signed-off-by: Andy Adamson<andros at netapp.com>
> > ---
> > fs/nfs/client.c | 3 ++-
> > fs/nfs/internal.h | 4 ++++
> > fs/nfs/nfs41_session_recovery.c | 25 +++++++++++++++++++++++++
> > fs/nfs/nfs4proc.c | 4 ++--
> > include/linux/nfs41_session_recovery.h | 1 +
> > 5 files changed, 34 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/nfs/client.c b/fs/nfs/client.c
> > index a014f32..d1161ef 100644
> > --- a/fs/nfs/client.c
> > +++ b/fs/nfs/client.c
> > @@ -1116,7 +1116,7 @@ error:
> > /*
> > * Allocate and initialize a session if required, including its
> backchannel.
> > */
> > -static int nfs4_init_session(struct nfs_client *clp, struct nfs4_session
> **spp,
> > +int nfs4_init_session(struct nfs_client *clp, struct nfs4_session **spp,
> > struct rpc_clnt *clnt)
> > {
> > int error = 0;
> > @@ -1148,6 +1148,7 @@ static int nfs4_init_session(struct nfs_client
> *clp, struct nfs4_session **spp,
> > }
> > return error;
> > }
> > +EXPORT_SYMBOL(nfs4_init_session);
> > #endif /* CONFIG_NFS_V4_1 */
> >
> > /*
> > diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
> > index 22228bc..735dbb7 100644
> > --- a/fs/nfs/internal.h
> > +++ b/fs/nfs/internal.h
> > @@ -66,6 +66,10 @@ struct nfs_parsed_mount_data {
> > };
> >
> > /* client.c */
> > +#ifdef CONFIG_NFS_V4_1
> > +extern int nfs4_init_session(struct nfs_client *, struct nfs4_session
> **,
> > + struct rpc_clnt *);
> > +#endif /* CONFIG_NFS_V4_1 */
> > extern struct rpc_program nfs_program;
> >
> > extern void nfs_put_client(struct nfs_client *);
> > diff --git a/fs/nfs/nfs41_session_recovery.c
> b/fs/nfs/nfs41_session_recovery.c
> > index 46ddb55..cf02704 100644
> > --- a/fs/nfs/nfs41_session_recovery.c
> > +++ b/fs/nfs/nfs41_session_recovery.c
> > @@ -22,6 +22,7 @@
> > #include <linux/nfs_fs_sb.h>
> > #include <linux/nfs41_session_recovery.h>
> > #include "nfs4_fs.h"
> > +#include "internal.h"
> >
> > #define NFSDBG_FACILITY NFSDBG_PROC
> >
> > @@ -191,4 +192,28 @@ int nfs41_recover_session_sync(struct rpc_clnt
> *clnt, struct nfs_client *clp,
> > }
> > EXPORT_SYMBOL(nfs41_recover_session_sync);
> >
> > +/*
> > + * nfs41_new_session()
> > + *
> > + * Big Hammer. Destroy existing session and create a new session
> > + */
> > +int nfs41_new_session(struct nfs_server *server)
> > +{
> > + int ret;
> > +
> > + dprintk(" --> %s\n", __func__);
> > +
> > + nfs4_put_session(&server->session);
> > + ret = nfs4_init_session(server->nfs_client, &server->session,
> > + server->client);
> > + if (ret)
> > + goto out;
> > + ret = nfs41_recover_session_sync(server->client,
> server->nfs_client,
> > + server->session);
> > +out:
> > + dprintk(" <-- %s returns %d\n", __func__, ret);
> > + return ret;
> > +}
> > +EXPORT_SYMBOL(nfs41_new_session);
> > +
> > #endif /* CONFIG_NFS_V4_1 */
> > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > index ccec2e0..9c12bfb 100644
> > --- a/fs/nfs/nfs4proc.c
> > +++ b/fs/nfs/nfs4proc.c
> > @@ -3416,8 +3416,8 @@ static int nfs4_handle_exception(const struct
> nfs_server *server, int errorcode,
> > case -NFS4ERR_RETRY_UNCACHED_REP:
> > case -NFS4ERR_TOO_MANY_OPS:
> > case -NFS4ERR_OP_NOT_IN_SESSION:
> > - ret =
> nfs41_recover_session_sync(server->session->clnt,
> > - server->nfs_client, server->session);
> > +
> > + ret = nfs41_new_session((struct nfs_server
> *)server);
> > if (!ret) {
> > exception->retry = 1;
> > break;
> > diff --git a/include/linux/nfs41_session_recovery.h
> b/include/linux/nfs41_session_recovery.h
> > index 2f1df6f..5364b34 100644
> > --- a/include/linux/nfs41_session_recovery.h
> > +++ b/include/linux/nfs41_session_recovery.h
> > @@ -36,6 +36,7 @@ int nfs41_set_session_valid(struct nfs4_session *);
> > int nfs41_recover_session(struct nfs_client *, struct nfs4_session *);
> > int nfs41_recover_session_sync(struct rpc_clnt *, struct nfs_client *,
> > struct nfs4_session *);
> > +int nfs41_new_session(struct nfs_server *);
> >
> > #endif /* CONFIG_NFS_V4_1 */
> > #endif /* __NFS41_SESSION_RECOVERY_H__ */
>
> _______________________________________________
> pNFS mailing list
> pNFS at linux-nfs.org
> http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://linux-nfs.org/pipermail/pnfs/attachments/20080527/b250b90f/attachment.htm
More information about the pNFS
mailing list