A Handful of 2.6.21.5 bugs

John McCorquodale mcq at cacr.caltech.edu
Thu Jun 14 10:53:05 EDT 2007


Guys,

Since the NFS v3 client is having so much trouble at the moment and nobody is
working on it, I converted our compute cluster over to v4.  These are diskless
2-socket 2-core x86_64 nodes that mount / via nfs4, clients and servers running
2.6.21.5.  Been running about 10 hours now and am noticing a bounded nonempty
set of errors showing up.  Nothing has yet interrupted service, which is
excellent news.  Figured you guys might like the reports anyway, since you're
printing the errors for a reason.  :)

Feel free to tell me to do things for you if you need more information to
progress on any problems you're interested in.

Cheers,

-mcq

---

The keys of the form 'DOMAIN user N' for nfs4.idtoname should render N as an
unsigned integer, not as a signed one.

If you have a file in the server's exported (ext3) fs with a big owner UID, say
4294967294, when it's accessed from the client, a query comes out of
nfs4.idtoname that looks like:

  domain user -2

And when you respond with that key, you get Invalid Argument presumably due to
the minus sign.  Thus, you can't respond to the query and the client hangs on
the access until the client-side idmap timeout expires.
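
A minimal C sketch of the difference, not taken from the kernel or idmapd
sources: format_key_signed() and format_key_unsigned() are hypothetical
helpers, and the 'DOMAIN user N' layout is simply copied from the query above.
Rendering the same 32-bit ID through a signed conversion reproduces the -2,
while an unsigned conversion keeps 4294967294:

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: renders the ID as a signed 32-bit value.
 * The cast is implementation-defined for IDs above INT32_MAX; on the usual
 * two's-complement platforms 4294967294 comes out as -2. */
int format_key_signed(char *buf, size_t len, const char *domain, uint32_t id)
{
    int n = snprintf(buf, len, "%s user %" PRId32, domain, (int32_t)id);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: renders the ID as an unsigned 32-bit value, so large
 * IDs never pick up a minus sign. */
int format_key_unsigned(char *buf, size_t len, const char *domain, uint32_t id)
{
    int n = snprintf(buf, len, "%s user %" PRIu32, domain, id);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Both helpers return 0 on success and -1 if the key does not fit in the buffer.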

---

rm'ing really big files returns an error (but succeeds).  Both client and
server were rebooted after creating the file and before rm'ing it:

$ ls -las biffy
52480056 -rw-r--r-- 1 mcq mcq 53687091200 Jun 14 01:09 biffy

$ rm biffy
rm: cannot remove `biffy': No such file or directory

$ ls biffy
ls: biffy: No such file or directory

---

Recurring trickle of errors on client dmesg:

nfs_delegation_claim_locks: unhandled error -10038.
NFS: v4 server returned a bad sequence-id error!
NFS: v4 server returned a bad sequence-id error!
NFS: v4 server returned a bad sequence-id error!
nfs_delegation_claim_locks: unhandled error -10038.
NFS: v4 server returned a bad sequence-id error!
nfs_delegation_claim_locks: unhandled error -10038.
NFS: v4 server returned a bad sequence-id error!
nfs_delegation_claim_locks: unhandled error -10038.
NFS: v4 server returned a bad sequence-id error!
NFS: v4 server returned a bad sequence-id error!
nfs_delegation_claim_locks: unhandled error -10038.
nfs_delegation_claim_locks: unhandled error -10038.
NFS: v4 server returned a bad sequence-id error!
NFS: v4 server returned a bad sequence-id error!

Note that there are locking-related problems; things will try to get a lock,
fail and retry (reporting 'someone else has the lock') when clearly someone
else does NOT have the lock.  It appears that the someone may be the locker
itself (could be a general kind of replay problem like the rm above?).  It's
nondeterministic as far as I can tell.

---

Constant trickle of errors in server dmesg:

NFSD: preprocess_seqid_op: bad seqid (expected 194, got 195)
NFSD: preprocess_seqid_op: bad seqid (expected 204, got 205)
NFSD: preprocess_seqid_op: bad seqid (expected 204, got 205)
NFSD: preprocess_seqid_op: bad seqid (expected 204, got 205)
NFSD: preprocess_seqid_op: bad seqid (expected 204, got 205)
NFSD: preprocess_seqid_op: bad seqid (expected 6, got 7)
NFSD: preprocess_seqid_op: bad seqid (expected 6, got 7)
NFSD: preprocess_seqid_op: bad seqid (expected 6, got 7)
NFSD: preprocess_seqid_op: bad seqid (expected 204, got 205)
NFSD: preprocess_seqid_op: bad seqid (expected 194, got 195)
NFSD: preprocess_seqid_op: bad seqid (expected 194, got 195)
NFSD: preprocess_seqid_op: bad seqid (expected 194, got 195)
NFSD: preprocess_seqid_op: bad seqid (expected 204, got 205)
NFSD: preprocess_seqid_op: bad seqid (expected 204, got 205)
NFSD: preprocess_seqid_op: bad seqid (expected 204, got 205)
NFSD: preprocess_seqid_op: bad seqid (expected 204, got 205)
NFSD: preprocess_seqid_op: bad seqid (expected 204, got 205)
NFSD: preprocess_seqid_op: bad seqid (expected 204, got 205)
NFSD: preprocess_seqid_op: bad seqid (expected 204, got 205)
NFSD: preprocess_seqid_op: bad seqid (expected 204, got 205)
NFSD: preprocess_seqid_op: bad seqid (expected 41, got 42)
NFSD: preprocess_seqid_op: bad seqid (expected 41, got 42)
NFSD: preprocess_seqid_op: bad seqid (expected 41, got 42)
NFSD: preprocess_seqid_op: bad seqid (expected 41, got 42)
NFSD: preprocess_seqid_op: bad seqid (expected 41, got 42)
NFSD: preprocess_seqid_op: bad seqid (expected 194, got 195)
NFSD: preprocess_seqid_op: bad seqid (expected 898, got 899)
NFSD: preprocess_seqid_op: bad seqid (expected 898, got 899)
NFSD: preprocess_seqid_op: bad seqid (expected 898, got 899)
...
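
For reference, a rough C sketch of the kind of check that produces messages
like these.  It assumes the usual NFSv4 rule that a state owner's next request
must carry the seqid the server expects, that the immediately preceding seqid
is treated as a replay, and that anything else is rejected as a bad seqid.
check_seqid() and the enum are hypothetical names for illustration, not the
actual nfsd code:

```c
#include <stdint.h>

enum seqid_verdict {
    SEQID_OK,      /* request carries the seqid the server expects */
    SEQID_REPLAY,  /* request repeats the previous seqid */
    SEQID_BAD      /* anything else: reported as a bad seqid */
};

/* Hypothetical helper: classify an incoming seqid against the expected one.
 * The subtraction is plain unsigned arithmetic; how the protocol handles
 * wraparound is not modelled here. */
enum seqid_verdict check_seqid(uint32_t expected, uint32_t got)
{
    if (got == expected)
        return SEQID_OK;
    if (got == expected - 1)
        return SEQID_REPLAY;
    return SEQID_BAD;
}
```

Under that assumption, every line above (e.g. expected 194, got 195) is the
client running exactly one seqid ahead of what the server has recorded.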

---

server oops (this occurred right after I drastically changed the system time,
which may be a coincidence):

Unable to handle kernel NULL pointer dereference at 0000000000000010 RIP:
 [<ffffffff805bdd9e>] xprt_adjust_timeout+0x6e/0x100
PGD 22610d067 PUD 22515c067 PMD 0
Oops: 0000 [1] SMP
CPU 3
Modules linked in:
Pid: 1614, comm: nfsv4-recall Not tainted 2.6.21.5 #1
RIP: 0010:[<ffffffff805bdd9e>]  [<ffffffff805bdd9e>] xprt_adjust_timeout+0x6e/0x100
RSP: 0018:ffff810136d0de10  EFLAGS: 00010202
RAX: 00000001000258a7 RBX: ffff81023711c6c0 RCX: 0000000000000018
RDX: 0000000000000010 RSI: ffff810136d0ddf0 RDI: ffff8102260fc000
RBP: ffff8102260fc000 R08: ffff810136d0c000 R09: ffff81013780be90
R10: 0000000000000000 R11: 0000000000000002 R12: 0000000000000000
R13: ffff81013c10b800 R14: ffffffff80242090 R15: ffff810137a15a80
FS:  00002b0a206846f0(0000) GS:ffff81013c153940(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000010 CR3: 0000000224de6000 CR4: 00000000000006e0
Process nfsv4-recall (pid: 1614, threadinfo ffff810136d0c000, task ffff810137543000)
Stack:  ffff81023711c6c0 ffff81013c10b800 ffff81023711c790 ffffffff805bd03e
 ffff81023711c6c0 ffff81023711c6c0 0000000000000000 ffffffff805c1fad
 ffff81023711c6c0 00000000fffffff4 ffff810136d0deb0 ffffffff805bc479
Call Trace:
 [<ffffffff805bd03e>] call_timeout+0x1e/0x100
 [<ffffffff805c1fad>] __rpc_execute+0x7d/0x240
 [<ffffffff805bc479>] rpc_call_sync+0x79/0xa0
 [<ffffffff8031421e>] nfsd4_cb_recall+0xce/0x140
 [<ffffffff8030e1b0>] do_recall+0x0/0x20
 [<ffffffff80242090>] keventd_create_kthread+0x0/0x90
 [<ffffffff8030e1ca>] do_recall+0x1a/0x20
 [<ffffffff802421f9>] kthread+0xd9/0x120
 [<ffffffff8020a7a8>] child_rip+0xa/0x12
 [<ffffffff80242090>] keventd_create_kthread+0x0/0x90
 [<ffffffff80242120>] kthread+0x0/0x120
 [<ffffffff8020a79e>] child_rip+0x0/0x12


Code: 49 8b 44 24 10 49 8d 9c 24 18 03 00 00 c7 87 10 01 00 00 00
RIP  [<ffffffff805bdd9e>] xprt_adjust_timeout+0x6e/0x100
 RSP <ffff810136d0de10>
CR2: 0000000000000010

---

Since this is the filesystem on a private backend net of a cluster, it's
running AUTH_UNIX and complex idmapping is unnecessary.

I have simple, homegrown idmappers running that just take N -> "N" and
"N" -> N (with clamping to 0..65534).  It was necessary to write this, because
you can't do files-based idmapping when /etc is itself in nfs4: this eventually
leads to deadlock when the idmapper needs to be invoked in order for the
idmapper to read the files.  The main idmapper could be fixed to avoid this,
but it's still a massively overcomplex piece of code with a ridiculous bulk
of library dependencies for use in an embedded fashion.
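
A minimal C sketch of that identity mapping, for illustration only.  The
function names, the return conventions, and the choice to clamp oversized
values down to 65534 (rather than reject them) are assumptions; the pipe I/O
that a real mapper needs is left out entirely:

```c
#include <errno.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define ID_CLAMP_MAX 65534u   /* upper end of the 0..65534 range */

/* N -> "N": render a numeric ID as its decimal string, clamped to the range.
 * Returns 0 on success, -1 if the buffer is too small. */
int id_to_name(uint32_t id, char *buf, size_t len)
{
    int n;

    if (id > ID_CLAMP_MAX)
        id = ID_CLAMP_MAX;
    n = snprintf(buf, len, "%" PRIu32, id);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* "N" -> N: parse a decimal string into an ID, clamped to the range.
 * Returns 0 on success, -1 if the name is not a plain decimal number. */
int name_to_id(const char *name, uint32_t *id)
{
    char *end;
    unsigned long long v;

    /* Require a leading digit: rejects "", signs and leading whitespace. */
    if (name[0] < '0' || name[0] > '9')
        return -1;
    errno = 0;
    v = strtoull(name, &end, 10);
    if (*end != '\0')
        return -1;              /* trailing junk such as "12abc" */
    if (errno == ERANGE || v > ID_CLAMP_MAX)
        v = ID_CLAMP_MAX;       /* assumption: clamp instead of failing */
    *id = (uint32_t)v;
    return 0;
}
```

Neither direction touches any file or library beyond libc, which is the point:
nothing here can block on the filesystem that is waiting for the answer.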

I would have thought that N->"N" and v.v. would have been a default hardcoded
behavior for when nothing has the idmapping pipes (idmap, nfs4.XtoY) open,
which would eliminate the need for me to push more silly little programs
around during boot.

What do you guys think of that idea?

