[pnfs] [PATCH] nfs41: Correctly use new session in exception handling

William A. (Andy) Adamson andros at citi.umich.edu
Tue May 27 14:10:53 EDT 2008


I turned all debug off on both client and server, still was unable to
reproduce the problem, got the same results as below.... :(

-->Andy

On Tue, May 27, 2008 at 12:53 PM, Benny Halevy <bhalevy at panasas.com> wrote:

> I forgot to mention that I couldn't reproduce this with debug printing
> on...
> Did you try with nfsd_debug by any chance?
> Benny
>
> William A. (Andy) Adamson wrote:
> > Hi Benny
> >
> > I can't reproduce the problem. Using todays tree, origin/nfs41 branch
> > on client and server, I attempt to mount a non-existing path: the
> > server does not export a /unix4.
> >
> > mount -t nfs4 192.168.1.105:/unix4 /mnt
> >
> > What I see is a successful EXCHANGE_ID, CREATE_SESSION (with a
> > successful callback channel established), SEQ/PUTROOTFH/GETFH/GETATTR,
> > a couple of more successful SEQ/PUTFH/GETATTRS to establish the root,
> > then a PUTFH/LOOKUP/GETATTR for /unix4 that fails, followed by a
> > DESTROY_SESSION.
> >
> > Are you doing something different?
> >
> > Here is my /etc/exports file:
> >
> > /var/lib/nfs/v4root *(fsid=0,rw,insecure,no_root_squash,nohide,crossmnt)
> > /var/lib/nfs/v4root/spnfs *(rw,insecure,no_root_squash,nohide,crossmnt)
> > /var/lib/nfs/v4root/unix *(rw,insecure,no_root_squash,nohide,crossmnt)
> >
> > with
> > /export on /var/lib/nfs/v4root/spnfs type none (rw,bind)
> > /unix on /var/lib/nfs/v4root/unix type none (rw,bind)
> >
> >
> > -->Andy
> >
> > On Sun, May 25, 2008 at 1:19 PM, Benny Halevy <bhalevy at panasas.com
> > <mailto:bhalevy at panasas.com>> wrote:
> >
> >     Andy, I think that this approach is better than what we had
> >     before but bare in mind that it breaks the server when
> >     the exported path is not found.
> >
> >     I get the following oops when the client mounts a non-existing path:
> >     (with this patch applied)
> >
> >     May 25 18:48:01 bh-testlin1 kernel: BUG: unable to handle kernel
> >     NULL pointer dereference at 00000000000000a0
> >     May 25 18:48:01 bh-testlin1 kernel: IP: [<ffffffff88263233>]
> >     :sunrpc:bc_send_request+0xc0/0x1e2
> >     May 25 18:48:01 bh-testlin1 kernel: PGD 78800067 PUD 78811067 PMD 0
> >     May 25 18:48:01 bh-testlin1 kernel: Oops: 0000 [1] SMP
> >     May 25 18:48:01 bh-testlin1 kernel: CPU 0
> >     May 25 18:48:01 bh-testlin1 kernel: Modules linked in: nfsd
> >     auth_rpcgss exportfs autofs4 nfs lockd nfs_acl sunrpc ipv6
> >     dm_mirror dm_multipath dm_mod video output sbs sbshc battery ac
> >     kvm_amd kvm snd_hda_intel button snd_pcm k8temp i2c_nforce2
> >     i2c_core snd_timer pcspkr snd_page_alloc snd_hwdep sr_mod snd
> >     soundcore hwmon forcedeth serio_raw sg cdrom pata_amd ata_generic
> >     pata_acpi sata_nv libata sd_mod scsi_mod ext3 jbd mbcache ehci_hcd
> >     ohci_hcd uhci_hcd [last unloaded: scsi_wait_scan]
> >     May 25 18:48:01 bh-testlin1 kernel: Pid: 3271, comm: nfs4_cb_probe
> >     Not tainted 2.6.25-nfs41 #39
> >     May 25 18:48:01 bh-testlin1 kernel: RIP: 0010:[<ffffffff88263233>]
> >      [<ffffffff88263233>] :sunrpc:bc_send_request+0xc0/0x1e2
> >     May 25 18:48:01 bh-testlin1 kernel: RSP: 0018:ffff810078991d10
> >      EFLAGS: 00010246
> >     May 25 18:48:01 bh-testlin1 kernel: RAX: 0000000000000000 RBX:
> >     0000000000000074 RCX: 0000000000000005
> >     May 25 18:48:01 bh-testlin1 kernel: RDX: ffff8100788ec008 RSI:
> >     0000000000000057 RDI: ffff8100788ec008
> >     May 25 18:48:01 bh-testlin1 kernel: RBP: ffff8100788ec008 R08:
> >     ffff81007880eb60 R09: ffff81007880eb20
> >     May 25 18:48:01 bh-testlin1 kernel: R10: 0000000000000000 R11:
> >     ffffffff88267550 R12: ffff81007893e000
> >     May 25 18:48:01 bh-testlin1 kernel: R13: ffff81007893e008 R14:
> >     0000000000000000 R15: ffff8100798ee340
> >     May 25 18:48:01 bh-testlin1 kernel: FS:  00007f05577b26f0(0000)
> >     GS:ffffffff8139c000(0000) knlGS:0000000000000000
> >     May 25 18:48:01 bh-testlin1 kernel: CS:  0010 DS: 0018 ES: 0018
> >     CR0: 000000008005003b
> >     May 25 18:48:01 bh-testlin1 kernel: CR2: 00000000000000a0 CR3:
> >     0000000078810000 CR4: 00000000000006e0
> >     May 25 18:48:01 bh-testlin1 kernel: DR0: 0000000000000000 DR1:
> >     0000000000000000 DR2: 0000000000000000
> >     May 25 18:48:01 bh-testlin1 kernel: DR3: 0000000000000000 DR6:
> >     00000000ffff0ff0 DR7: 0000000000000400
> >     May 25 18:48:01 bh-testlin1 kernel: Process nfs4_cb_probe (pid:
> >     3271, threadinfo ffff810078990000, task ffff810076157360)
> >     May 25 18:48:01 bh-testlin1 kernel: Stack:  00000000ffffffff
> >     ffff810079d1c258 0000000000000286 0000000000000000
> >     May 25 18:48:01 bh-testlin1 kernel:  0000000000000074
> >     ffff81007880eb20 ffff81007880e800 ffff81007893e000
> >     May 25 18:48:01 bh-testlin1 kernel:  ffff81007e1f0200
> >     ffff81007893e0b8 0000000000000000 ffffffff8826197b
> >     May 25 18:48:01 bh-testlin1 kernel: Call Trace:
> >     May 25 18:48:01 bh-testlin1 kernel:  [<ffffffff8826197b>] ?
> >     :sunrpc:xprt_transmit+0xd3/0x1c1
> >     May 25 18:48:01 bh-testlin1 kernel:  [<ffffffff8832873c>] ?
> >     :nfsd:nfs4_xdr_enc_cb_null+0x0/0x50
> >     May 25 18:48:01 bh-testlin1 kernel:  [<ffffffff8825f4ab>] ?
> >     :sunrpc:call_transmit+0x200/0x23f
> >     May 25 18:48:01 bh-testlin1 kernel:  [<ffffffff88265f3f>] ?
> >     :sunrpc:__rpc_execute+0x7b/0x21a
> >     May 25 18:48:01 bh-testlin1 kernel:  [<ffffffff8825fc62>] ?
> >     :sunrpc:rpc_run_task+0x4f/0x56
> >     May 25 18:48:01 bh-testlin1 kernel:  [<ffffffff8825fcfe>] ?
> >     :sunrpc:rpc_call_sync+0x3e/0x5b
> >     May 25 18:48:01 bh-testlin1 kernel:  [<ffffffff88328e8f>] ?
> >     :nfsd:do_probe_callback+0x250/0x2c6
> >     May 25 18:48:01 bh-testlin1 kernel:  [<ffffffff88328c3f>] ?
> >     :nfsd:do_probe_callback+0x0/0x2c6
> >     May 25 18:48:01 bh-testlin1 kernel:  [<ffffffff81043ddd>] ?
> >     kthread+0x47/0x76
> >     May 25 18:48:01 bh-testlin1 kernel:  [<ffffffff8100cac8>] ?
> >     child_rip+0xa/0x12
> >     May 25 18:48:01 bh-testlin1 kernel:  [<ffffffff81043d96>] ?
> >     kthread+0x0/0x76
> >     May 25 18:48:01 bh-testlin1 kernel:  [<ffffffff8100cabe>] ?
> >     child_rip+0x0/0x12
> >     May 25 18:48:01 bh-testlin1 kernel:
> >     May 25 18:48:01 bh-testlin1 kernel:
> >     May 25 18:48:01 bh-testlin1 kernel: Code: 49 8b 5d 08 48 63 c7 c7
> >     44 24 14 00 80 00 00 48 39 d8 b8 00
> >      00 00 00 0f 45 44 24 14 89 44 24 14 48 8b 6d 08 49 8b 47 10 48 89
> >     ef <4c> 8b a0 a0 00 00 00 e8 61 f2
> >      db f8 48 c1 e8 0c 89 ea 48 be 00
> >     May 25 18:48:01 bh-testlin1 kernel: RIP  [<ffffffff88263233>]
> >     :sunrpc:bc_send_request+0xc0/0x1e2
> >     May 25 18:48:01 bh-testlin1 kernel:  RSP <ffff810078991d10>
> >     May 25 18:48:01 bh-testlin1 kernel: CR2: 00000000000000a0
> >     May 25 18:48:01 bh-testlin1 kernel: ---[ end trace
> >     e76d1803ca214c10 ]---
> >
> >     On May. 23, 2008, 21:29 +0300, andros at netapp.com
> >     <mailto:andros at netapp.com> wrote:
> >     > From: Andy Adamson <andros at netapp.com <mailto:andros at netapp.com>>
> >     >
> >     > We will need to handle several layers of session recovery
> >     depending on the
> >     > error. As a last resort, we will need to use the big hammer;
> destroy
> >     > an existing session and create a new one.
> >     >
> >     > The current code creates a new session without destroying the
> >     old one.
> >     > Destroy the existing session and initialize a new session before
> >     calling
> >     > nfs41_recover_session_sync.
> >     >
> >     > Signed-off-by: Andy Adamson<andros at netapp.com
> >     <mailto:andros at netapp.com>>
> >     > ---
> >     >  fs/nfs/client.c                        |    3 ++-
> >     >  fs/nfs/internal.h                      |    4 ++++
> >     >  fs/nfs/nfs41_session_recovery.c        |   25
> >     +++++++++++++++++++++++++
> >     >  fs/nfs/nfs4proc.c                      |    4 ++--
> >     >  include/linux/nfs41_session_recovery.h |    1 +
> >     >  5 files changed, 34 insertions(+), 3 deletions(-)
> >     >
> >     > diff --git a/fs/nfs/client.c b/fs/nfs/client.c
> >     > index a014f32..d1161ef 100644
> >     > --- a/fs/nfs/client.c
> >     > +++ b/fs/nfs/client.c
> >     > @@ -1116,7 +1116,7 @@ error:
> >     >  /*
> >     >   * Allocate and initialize a session if required, including its
> >     backchannel.
> >     >   */
> >     > -static int nfs4_init_session(struct nfs_client *clp, struct
> >     nfs4_session **spp,
> >     > +int nfs4_init_session(struct nfs_client *clp, struct
> >     nfs4_session **spp,
> >     >                            struct rpc_clnt *clnt)
> >     >  {
> >     >       int error = 0;
> >     > @@ -1148,6 +1148,7 @@ static int nfs4_init_session(struct
> >     nfs_client *clp, struct nfs4_session **spp,
> >     >       }
> >     >       return error;
> >     >  }
> >     > +EXPORT_SYMBOL(nfs4_init_session);
> >     >  #endif /* CONFIG_NFS_V4_1 */
> >     >
> >     >  /*
> >     > diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
> >     > index 22228bc..735dbb7 100644
> >     > --- a/fs/nfs/internal.h
> >     > +++ b/fs/nfs/internal.h
> >     > @@ -66,6 +66,10 @@ struct nfs_parsed_mount_data {
> >     >  };
> >     >
> >     >  /* client.c */
> >     > +#ifdef CONFIG_NFS_V4_1
> >     > +extern int nfs4_init_session(struct nfs_client *, struct
> >     nfs4_session **,
> >     > +                                     struct rpc_clnt *);
> >     > +#endif /* CONFIG_NFS_V4_1 */
> >     >  extern struct rpc_program nfs_program;
> >     >
> >     >  extern void nfs_put_client(struct nfs_client *);
> >     > diff --git a/fs/nfs/nfs41_session_recovery.c
> >     b/fs/nfs/nfs41_session_recovery.c
> >     > index 46ddb55..cf02704 100644
> >     > --- a/fs/nfs/nfs41_session_recovery.c
> >     > +++ b/fs/nfs/nfs41_session_recovery.c
> >     > @@ -22,6 +22,7 @@
> >     >  #include <linux/nfs_fs_sb.h>
> >     >  #include <linux/nfs41_session_recovery.h>
> >     >  #include "nfs4_fs.h"
> >     > +#include "internal.h"
> >     >
> >     >  #define NFSDBG_FACILITY              NFSDBG_PROC
> >     >
> >     > @@ -191,4 +192,28 @@ int nfs41_recover_session_sync(struct
> >     rpc_clnt *clnt, struct nfs_client *clp,
> >     >  }
> >     >  EXPORT_SYMBOL(nfs41_recover_session_sync);
> >     >
> >     > +/*
> >     > + * nfs41_new_session()
> >     > + *
> >     > + * Big Hammer. Destroy existing session and create a new session
> >     > + */
> >     > +int nfs41_new_session(struct nfs_server *server)
> >     > +{
> >     > +     int ret;
> >     > +
> >     > +     dprintk(" --> %s\n", __func__);
> >     > +
> >     > +     nfs4_put_session(&server->session);
> >     > +     ret = nfs4_init_session(server->nfs_client, &server->session,
> >     > +                             server->client);
> >     > +     if (ret)
> >     > +             goto out;
> >     > +     ret = nfs41_recover_session_sync(server->client,
> >     server->nfs_client,
> >     > +                             server->session);
> >     > +out:
> >     > +     dprintk(" <-- %s returns %d\n", __func__, ret);
> >     > +     return ret;
> >     > +}
> >     > +EXPORT_SYMBOL(nfs41_new_session);
> >     > +
> >     >  #endif /* CONFIG_NFS_V4_1 */
> >     > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> >     > index ccec2e0..9c12bfb 100644
> >     > --- a/fs/nfs/nfs4proc.c
> >     > +++ b/fs/nfs/nfs4proc.c
> >     > @@ -3416,8 +3416,8 @@ static int nfs4_handle_exception(const
> >     struct nfs_server *server, int errorcode,
> >     >               case -NFS4ERR_RETRY_UNCACHED_REP:
> >     >               case -NFS4ERR_TOO_MANY_OPS:
> >     >               case -NFS4ERR_OP_NOT_IN_SESSION:
> >     > -                     ret =
> >     nfs41_recover_session_sync(server->session->clnt,
> >     > -                             server->nfs_client, server->session);
> >     > +
> >     > +                     ret = nfs41_new_session((struct nfs_server
> >     *)server);
> >     >                       if (!ret) {
> >     >                               exception->retry = 1;
> >     >                               break;
> >     > diff --git a/include/linux/nfs41_session_recovery.h
> >     b/include/linux/nfs41_session_recovery.h
> >     > index 2f1df6f..5364b34 100644
> >     > --- a/include/linux/nfs41_session_recovery.h
> >     > +++ b/include/linux/nfs41_session_recovery.h
> >     > @@ -36,6 +36,7 @@ int nfs41_set_session_valid(struct
> >     nfs4_session *);
> >     >  int nfs41_recover_session(struct nfs_client *, struct
> >     nfs4_session *);
> >     >  int nfs41_recover_session_sync(struct rpc_clnt *, struct
> >     nfs_client *,
> >     >                              struct nfs4_session *);
> >     > +int nfs41_new_session(struct nfs_server *);
> >     >
> >     >  #endif       /* CONFIG_NFS_V4_1 */
> >     >  #endif       /* __NFS41_SESSION_RECOVERY_H__ */
> >
> >     _______________________________________________
> >     pNFS mailing list
> >     pNFS at linux-nfs.org <mailto:pNFS at linux-nfs.org>
> >     http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
> >
> >
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://linux-nfs.org/pipermail/pnfs/attachments/20080527/d615738a/attachment-0001.htm 


More information about the pNFS mailing list