[pnfs] Fwd: [PATCH 0/14] linux-pnfs-2.6-latest pNFS client read I/O - Draft 13
William A. (Andy) Adamson
andros at citi.umich.edu
Thu Nov 15 14:34:46 EST 2007
On Nov 15, 2007 1:22 PM, Benny Halevy <bhalevy at panasas.com> wrote:
>
> On Nov. 15, 2007, 19:51 +0200, "William A. (Andy) Adamson" <andros at citi.umich.edu> wrote:
> > ummm! i didn't 'cc the list!!!!
> >
> >
> > ---------- Forwarded message ----------
> > From: William A. (Andy) Adamson <andros at citi.umich.edu>
> > Date: Nov 15, 2007 12:38 PM
> > Subject: Re: [pnfs] [PATCH 0/14] linux-pnfs-2.6-latest pNFS client
> > read I/O - Draft 13
> > To: Benny Halevy <bhalevy at panasas.com>
> >
> >
> > On Nov 15, 2007 6:12 AM, Benny Halevy <bhalevy at panasas.com> wrote:
> >> On Nov. 14, 2007, 19:23 +0200, Trond Myklebust <trond.myklebust at fys.uio.no> wrote:
> >>> On Wed, 2007-11-14 at 12:11 -0500, William A. (Andy) Adamson wrote:
> >>>
> >>>> for example in fs/nfs/read.c: nfs_read_rpcsetup()
> >>>>
> >>>> ifdef CONFIG_PNFS
> >>>> /* XXX pnfs_try_to_read_data should never return an error less than 0.
> >>>> * ret == 0 means pnfs read succees.
> >>>> * ret == 1 means do an NFSv4.1 read to the MDS.
> >>>> */
> >>>> if ((pnfs_try_to_read_data(data, call_ops)) <= 0)
> >>>> return;
> >>>> #endif /* CONFIG_PNFS*/
> >>>>
> >>>> nfs_initiate_read(data, NFS_CLIENT(inode), call_ops);
> >>>> }
> >>> The above is a seriously brilliant comment! Are we perhaps missing a
> >>> BUG_ON() there?
> >> Actually, I think it's just wrong.
> >>> 0 means do MDS read
> >> = 0 means success
> >> < 0 means pnfs_readpages failed and returned an error
> >> in this case the 2.6.18.3 code base returned an error from nfs_read_rpcsetup
> >> which is sort of legitimate (not that I like doing the I/O form within
> >> nfs_read_rpcsetup, but that's a separate issue)
> >> in the future we will want to retry the I/O via the MDS.
> >
> > the future is now! a failure from pnfs_try_to_read means try via the
> > MDS. there is no reason to put this off any longer. the 2.6.18.3 code
> > is wrong.
> >
> >> Anyhow, this is a good example why having a thorough process for porting
> >> the patches is so important. Making changes in the porting process is
> >> inevitable since the environment changed (a lot), however changing the
> >> functionality and/or fixing bugs (or what we think are bugs) in the
> >> process is quite risky.
> >
> > this is a bug. there is no reason to propagate bugs. instead, i
>
> sorry, I disagree. Returning an error to the app or data server error
> is not a bug. It's sub-optimal but it's a correct behavior
>
> > propose that in the "porting process" we take the time to improve the
> > code and handle all the issues that we have ignored in order to get a
>
> Andy, really I'm not just trying to be stubborn. I carry my scars of trying
> to make more than one change at a time and that is just error-prone.
Benny - you will get scars either way! We are already making more
than one change at a time. The sessions code is a complete rewrite.
The nfs client I/O path is rewritten. The rpc layer has changed.....
etc etc etc. What do you mean? The problem exists, and 'porting
existing functionality' will not be 'one change at a time'.
> I'm not asking to port bugs, but rather to postpone any improvement
> until after the porting of all existing functionality is done.
Even if the existing functionality is wrong or a hack?? At the very
least, we should separate the good code from the 'temporaray hack'
code, and not munge it all together into one giant merge patch.
>
> > prototype. the 2.6.18.3 code does not many error cases. the 2.6-latest
> > code should.
> >
> >> I tried pulling the changes from linux-pnfs to linux-pnfs-2.6
> >> in order to establish a depot for the patches we they can be manipulated
> >> more easily in git by resetting a branch in linux-pnfs-2.6 to v2.6.18
> >> which is a common ancestor of both trees. This went ok, but attempting
> >> to rebase the patchset onto a more recent point went horribly wrong.
> >>
> >> Basically every merge point in which sessions patches were pushed to
> >> the master branch presented conflicts when trying to do the rebase and
> >> the merge has to be replayed (and since we didn't enabled rerere there
> >> doesn't seem to be an easy way to do that since git-show of the merge
> >> commit is insufficient to resolve the merge conflict. Is there
> >> an easy way to do this with git that I'm not aware of?
> >
> > there is no reason to worry about this. the 2.6.18.3 code is no good.
> > it gave us experience - fine. but it has glaring problems that should
> > be fixed, not propoagated.
>
> one question still, how can you make sure nothing was dropped along the way?
Because we read the 2.6.18.3 code and move it forward one logical
chunk at a time, creating a patch stream that has some possibility of
being useful for kernel submission. Your approach leaves us in the
same horrible position of a giant pnfs client merge patch.
Futhermore, a lot of the 2.6.18.3 code won't be useful until there is
a back channel. I think we should port forward the code that can run
on 2.6-latest, and get it right.
I presetented for review the code needed to read via pNFS. A logical
chunk that can be tested. Write will follow. This will let us review
GETDEVICELIST, GETDEVICEINFO, LAYOUTGET, LAYOUTRETURN, and the read
and write I/O path, including failure to the MDS. I'm sure we can
identify and move forward all 2.6.18.3 features that relate to this
functionallity.
If you need it, we could add a patch to hack the nfs read path to
return errors from nfs_read_rpcsetup and friends all the way back to
nfs_readpages. This should be it's own patch with an approrpiate
comment, so that it can easily be identified and removed at a later
date. It should not be included unidentified in a giant merge patch.
-->Andy
>
> >
> >> After discussing this with Boaz we came to the conclusion that
> >> using rebase rather than merge should do a better job in integrating
> >> the pnfs patches on top of the sessions patches as the each committed
> >> patch after the rebase will be in sync with the branch so I'm going to
> >> retry this by applying the sessions patches onto a 2.6.18.3 branch
> >> (off of linux-pnfs-2.6) and having a sub-branch of that for the pnfs-patches
> >> which I'll rebase at the merge points we have in the 2.6.18.3 tree so that
> >> we'll have a coherent set of patches to play with.
> >
> > i disagree with a merge. if you want all the bugs, use 2.6.18.3.
>
> I just tried that as a technique to bring the original patches into
> a branch of the new tree (which is based off of v2.6.18 to start with)
> just for reference.
>
> >
> > -->Andy
> >
> >
> >> Benny
> >>
> >>> Trond
> >>>
> >>
> >> --
> >> Benny Halevy
> >> Software Architect
> >> Tel/Fax: +972-3-647-8340
> >> Mobile: +972-54-802-8340
> >> US: +1-412-203-3187
> >> bhalevy at panasas.com
> >>
> >> Panasas, Inc.
> >> Leader in Parallel Storage
> >> www.panasas.com
> >>
> >>
>
> > _______________________________________________
> > pNFS mailing list
> > pNFS at linux-nfs.org
> > http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
>
>
More information about the pNFS
mailing list