[pnfs] [PATCH 0/14] linux-pnfs-2.6-latest pNFS client read I/O - Draft 13
William A. (Andy) Adamson
andros at citi.umich.edu
Wed Nov 14 12:11:50 EST 2007
On Nov 13, 2007 5:27 AM, Benny Halevy <bhalevy at panasas.com> wrote:
> On Nov. 12, 2007, 21:29 +0200, "William A. (Andy) Adamson" <andros at citi.umich.edu> wrote:
> > I thought it best to implement the draft-ietf-nfsv4-minorversion1-13
> > version of pNFS in the new tree. We can update to a later version post
> > Vancouver IETF.
>
> What you sent looks pretty good, yet it doesn't include everything
> we currently have in the 2.6.18.3-based tree (e.g. commit
> 1ce266d8 and friends that introduce has_layout()), so I guess you've
> reimplemented everything rather than porting the patches?
i look at it as porting, but i guess you could call it a
reimplementation because so much has changed on the client between
2.6.18 and 2.6.24.
i ported/re-implemented the minimal feature set to enable "use nfs
cache" pnfs read I/O. the same for the write path will follow.
WRT has_layout, it is declared as an op in linux-pnfs-2.6-latest and
called by virtual_update_layout(), so what do you mean? it's just not
ported forward for the filelayout driver yet.
i did not claim to have ported the whole client forward.
i look at this as an opportunity to review our pnfs code, which is in
no shape to even begin to think of presenting for kernel inclusion. we
need to improve the quality of the pnfs code - paying particular
attention to failure cases.
for example in fs/nfs/read.c: nfs_read_rpcsetup()
#ifdef CONFIG_PNFS
/* XXX pnfs_try_to_read_data should never return an error less than 0.
 * ret == 0 means pnfs read success.
 * ret == 1 means do an NFSv4.1 read to the MDS.
 */
if ((pnfs_try_to_read_data(data, call_ops)) <= 0)
	return;
#endif /* CONFIG_PNFS */
nfs_initiate_read(data, NFS_CLIENT(inode), call_ops);
}
this call is designed for all layoutdrivers to try pnfs, and if pnfs
fails, to try nfs through the MDS. the idea is that the layoutdriver
should return on failure with the nfs_read_data structure all setup
for nfs read via MDS.
the filelayoutdriver does not do this. filelayout_read_pagelist()
calls nfs_initiate_read upon a failure to get a layout, or failure to
find or communicate with a data server.
it also pays no attention to the rsize used to gather the pages. this
needs to be fixed.
we need to have a generic failure path used by all layoutdrivers, and
we should code this next.
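a minimal user-space sketch of what such a generic failure path could look like. everything in it is hypothetical and only for illustration: the status enum, struct read_req (a stand-in for nfs_read_data), the try_read_fn hook, and the two toy drivers are not names from the pnfs tree. the point is only the contract: a layoutdriver never issues the MDS read itself, it reports back, and one common caller does the fallback.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes mirroring the 0 / 1 convention described above. */
enum pnfs_try_status {
	PNFS_ATTEMPTED     = 0,	/* layout driver issued the read itself */
	PNFS_NOT_ATTEMPTED = 1,	/* caller must read through the MDS */
};

/* Simplified stand-in for struct nfs_read_data (assumption, not kernel code). */
struct read_req {
	int via_pnfs;	/* set once a layout driver has taken the request */
	int via_mds;	/* set once the request is sent through the MDS */
};

/* Hypothetical per-driver hook; by contract it never returns a value < 0. */
typedef enum pnfs_try_status (*try_read_fn)(struct read_req *req);

/* Toy driver that has a layout and a reachable data server. */
enum pnfs_try_status driver_ok(struct read_req *req)
{
	req->via_pnfs = 1;
	return PNFS_ATTEMPTED;
}

/*
 * Toy driver that cannot get a layout or reach a data server.
 * It does not start an MDS read on its own; it leaves the request
 * untouched and reports that pNFS was not attempted.
 */
enum pnfs_try_status driver_fail(struct read_req *req)
{
	(void)req;
	return PNFS_NOT_ATTEMPTED;
}

/* Single generic path shared by all drivers: try pNFS, else fall back. */
void read_rpcsetup(struct read_req *req, try_read_fn try_read)
{
	assert(req != NULL);

	if (try_read != NULL && try_read(req) == PNFS_ATTEMPTED)
		return;

	/* Stands in for nfs_initiate_read() to the MDS. */
	req->via_mds = 1;
}
```

calling read_rpcsetup() with driver_fail leaves via_pnfs at 0 and sets via_mds, so the fallback decision lives in exactly one place instead of inside each layoutdriver.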
what happens if a data server / osd storage device / block disk goes down
or is network partitioned? etc....
the session forward channel code is OK, but we need to get all the
failure cases working in the sessions code, and then address how pNFS
reacts.
> What else is missing? How can we make sure everything
> we have in the 2.6.18.3 client was ported to 2.6-latest?
a lot is missing. i don't know that we want/need everything in
2.6.18.3 ported to 2.6-latest.
>
> Regarding which draft to implement, we're going to talk about draft-16 in
> Vancouver so I think we should at least take a stab in implementing it so
> we can come up with some implementation experience.
there is no time between now and vancouver (2 weeks) to gain draft-16
experience.
-->Andy
> I'll be glad to
> help as much as I can.
>
> Benny
>
>
> >
> > These apply against the 2.6.24-rc2 latest...
> >
> > This code is close enough for review. I need to debug a server umount bug.
> >
> > This set of 14 patches
> > - refactors the client sessions code to enable use by data servers
> > including re-establishing a session to a data server.
> > - replaces unimplemented pnfs nfs_rpc_ops with nfs versions
> > - implements/updates GETDEVICELIST/GETDEVICEINFO
> > - refactors data server lookup
> > - obtains a layout at the beginning of the nfs
> > address_space_operations nfs_readpages().
> > - implements pNFS filelayout read
> > - implements LAYOUTRETURN
> > - some cleanup
> >
> > With these patches, the nfslayoutdriver passes connectathon basic,
> > special, and general tests as well as iozone testing of large read
> > I/O.
> >
> > Issues:
> > - there is a server bug in expire_layout where UNLOCKED STATE is hit.
> > - the renew_state code can not find an open_stateowner and therefore
> > can not find a cred at EXCHANGE_ID time.
> > which means that no keep alive sequence op is sent.
> >
> > I have write and LAYOUTCOMMIT working, but the patches are not organized.
> > Please look these over. I'll continue on with the rest next week after SC07.
> >
> > -->Andy
> > _______________________________________________
> > pNFS mailing list
> > pNFS at linux-nfs.org
> > http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
> >
>
>