[pnfs] Query about {p,}nfs4_commit_done() calls in rpc_ops...
Mc Carthy, Fergal
Fergal.McCarthy at hp.com
Mon Sep 18 12:25:17 EDT 2006
While doing some recent testing with NFSV4_FILES_LAYOUT I encountered
errors like the following:
NFS: server zen4 error: fileid changed
fsid 0:1b: expected fileid 0xc007, got 0x808009
In my test environment the DS fileids are not necessarily (and usually are
not) the same as the MDS fileid, which may be why I'm seeing this.
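To make the suspected cause concrete, here is a toy model of the kind of check that would print such a message. It is only a sketch: it assumes the client compares the fileid it cached from the MDS against the fileid carried in an RPC reply. The names toy_inode and toy_check_fileid are hypothetical and are not taken from the kernel source.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the client's cached inode state. */
struct toy_inode {
	uint64_t fileid;	/* fileid learned from the MDS */
};

/*
 * Compare the cached fileid with the one carried in a reply.
 * Returns 0 on a match, or -1 after logging a mismatch.
 */
static int toy_check_fileid(const struct toy_inode *inode, uint64_t reply_fileid)
{
	assert(inode != NULL);
	if (inode->fileid != reply_fileid) {
		printf("fileid changed: expected fileid 0x%llx, got 0x%llx\n",
		       (unsigned long long)inode->fileid,
		       (unsigned long long)reply_fileid);
		return -1;
	}
	return 0;
}
```

With a cached fileid of 0xc007, a reply carrying 0xc007 passes silently, while a reply carrying 0x808009 (as a DS might return in my setup) is reported as a mismatch.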
After a little investigation I found that the problem only presented
itself when I was writing multiple stripes of data. I added a
dump_stack() call just before the point where this error message is
printed and saw nfs4_commit_done() in the call stack. Some further debug
messages showed that the RPCs for which it was being invoked appeared to
be targeted at my DS nodes, and not my MDS node.
I then noticed that the set of pNFS override rpc_ops (pnfs_v4_clientops)
includes the nfs4_commit_done() routine as its commit_done() method,
whereas the standard NFS rpc_ops (nfs_v4_clientops) include
pnfs4_commit_done() as their commit_done() method.
I just tested with the following change, which reverses the placement of
these two functions in the respective rpc_ops tables, and I'm not seeing
the error messages anymore...
--- fs/nfs/nfs4proc.c (revision 26)
+++ fs/nfs/nfs4proc.c (working copy)
@@ -4075,7 +4103,7 @@ struct nfs_rpc_ops nfs_v4_clientops = {
 	.write_setup = nfs4_proc_write_setup,
 	.write_done = nfs4_write_done,
 	.commit_setup = nfs4_proc_commit_setup,
-	.commit_done = pnfs4_commit_done,
+	.commit_done = nfs4_commit_done,
 	.file_open = nfs_open,
 	.file_release = nfs_release,
 	.lock = nfs4_proc_lock,
@@ -4124,7 +4152,7 @@ struct nfs_rpc_ops pnfs_v4_clientops = {
 	.write_setup = nfs4_proc_write_setup,
 	.write_done = pnfs4_write_done,
 	.commit_setup = pnfs4_proc_commit_setup,
-	.commit_done = nfs4_commit_done,
+	.commit_done = pnfs4_commit_done,
 	.file_open = nfs_open,
 	.file_release = nfs_release,
 	.lock = nfs4_proc_lock,
[Line numbers differ because I've dropped some debug changes]
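As an illustration of why the placement matters, here is a toy model of dispatch through an ops table. It is a sketch under assumptions, not the real kernel code: I am assuming that the plain handler validates reply attributes against the MDS inode, and that the pNFS handler skips that validation for replies from a DS. I have not confirmed what pnfs4_commit_done() actually does. All toy_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Which server a toy RPC reply came from. */
enum toy_server { TOY_MDS, TOY_DS };

struct toy_reply {
	enum toy_server from;
	int attrs_checked;	/* set when a handler validates attributes */
};

/* Toy analogue of nfs4_commit_done(): always validates attributes. */
static void toy_nfs4_commit_done(struct toy_reply *r)
{
	r->attrs_checked = 1;
}

/* Toy analogue of pnfs4_commit_done(): ASSUMED to skip DS replies. */
static void toy_pnfs4_commit_done(struct toy_reply *r)
{
	r->attrs_checked = (r->from == TOY_MDS);
}

/* Minimal ops table holding only the commit_done method. */
struct toy_rpc_ops {
	void (*commit_done)(struct toy_reply *);
};

/* Tables wired the way the patch above proposes. */
static const struct toy_rpc_ops toy_v4_ops = {
	.commit_done = toy_nfs4_commit_done,
};
static const struct toy_rpc_ops toy_pnfs_ops = {
	.commit_done = toy_pnfs4_commit_done,
};

/* Returns 1 if a DS reply would be validated against the MDS inode. */
static int toy_commit_would_warn(const struct toy_rpc_ops *ops,
				 enum toy_server from)
{
	struct toy_reply r = { .from = from, .attrs_checked = 0 };

	assert(ops != NULL && ops->commit_done != NULL);
	ops->commit_done(&r);
	return r.from == TOY_DS && r.attrs_checked;
}
```

In this model, a DS commit completed through toy_pnfs_ops does not trigger validation, while the same reply completed through toy_v4_ops does, which corresponds to the pre-patch situation where the pNFS table pointed at nfs4_commit_done().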
So my question is whether this is a typo of some sort, or was there a
reason for this being in the code?
Fergal.
--
Fergal.McCarthy at HP.com