[pnfs] Need for coherent layout driver interface
William A. (Andy) Adamson
andros at citi.umich.edu
Fri Mar 14 16:09:58 EDT 2008
On Fri, Mar 14, 2008 at 3:03 PM, Dean Hildebrand <seattleplus at gmail.com>
wrote:
> Excerpt from block patch:
> > --- a/include/linux/nfs4_pnfs.h
> > +++ b/include/linux/nfs4_pnfs.h
> > @@ -141,6 +141,10 @@ struct layoutdriver_io_operations {
> > int (*write_begin) (struct pnfs_layout_segment *lseg, struct page *page,
> > loff_t pos, unsigned count,
> > struct pnfs_fsdata **fsdata);
> > + /* Hook into nfs_create_request, for setting wb_private */
> > + void (*new_request)(struct pnfs_layout_segment *lseg,
> > + struct nfs_page *req, loff_t pos, unsigned count,
> > + struct pnfs_fsdata *fsdata);
> > void (*free_request_data) (struct nfs_page *);
>
> > void (*free_fsdata) (struct pnfs_fsdata *data);
>
> I have a problem with these additions to the layout driver i/o
> interface. This isn't a complaint necessarily at just these new ops for
> the block driver (although they make the problem much much worse), but
> our I/O interface in general. I would like a freeze on the
> layoutdriver_io_operations struct (without the block patches) until we
> can figure it out. If we don't do it now, it will become onerous to
> change the layout drivers later on.
It is a good idea to review the interface.
>
>
> Posing as a new layout driver implementor for a second, I have the
> following functions that I might want to implement to do writes:
> write_begin
> new_request
> flush_one
> free_request_data
> write_pagelist
> free_fsdata
> free_request_data
>
> What is the motivation of each one? Why would I implement one and not
> another? Why are their names so generic, e.g., what type of new
> request?
Good questions. The file layout driver introduced a new private field in
struct nfs_page: wb_private. It stores the data server info during its
write_pagelist() so that its flush_one() does not have to look it up again.
The free_request_data() callout is called when the struct nfs_page is
freed. The file layout driver doesn't need a new_request callout
because it sets wb_private in the layout driver.
So, the type of request is layout driver specific private data passed in
struct nfs_page.
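
To make the wb_private lifecycle concrete, here is a minimal user-space sketch. It is not the kernel code: struct nfs_page is reduced to the one field discussed, and struct ds_info, lookup_ds() and the demo_* functions are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for the kernel's struct nfs_page; only the field
 * discussed here is modelled. */
struct nfs_page {
    void *wb_private;   /* layout-driver private data */
};

/* Hypothetical per-request data a file layout driver might cache. */
struct ds_info {
    int ds_index;       /* which data server this request targets */
};

int lookups;            /* counts how often the mock lookup runs */

/* Mock of a data server lookup that we want to perform only once. */
struct ds_info *lookup_ds(int ds_index)
{
    struct ds_info *ds = malloc(sizeof(*ds));
    if (ds) {
        ds->ds_index = ds_index;
        lookups++;
    }
    return ds;
}

/* write_pagelist-like step: resolve the data server once and cache it. */
int demo_write_pagelist(struct nfs_page *req, int ds_index)
{
    if (!req->wb_private)
        req->wb_private = lookup_ds(ds_index);
    return req->wb_private ? 0 : -1;
}

/* flush_one-like step: reuse the cached pointer, no second lookup. */
int demo_flush_one(const struct nfs_page *req)
{
    const struct ds_info *ds = req->wb_private;
    return ds ? ds->ds_index : -1;
}

/* free_request_data-like step: runs when the nfs_page is released. */
void demo_free_request_data(struct nfs_page *req)
{
    free(req->wb_private);
    req->wb_private = NULL;
}
```

The point of the sketch is only that the private pointer is set in one callout, read in another, and released in a third, which is why a driver that fills wb_private itself has no use for a separate new_request hook.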
> Why is there no symmetry, e.g., write_begin/end exists but not
> read_begin/end?
There is no need for read begin/end because data is read in block-sized
units. The write_begin/end hooks for the block layout exist to fill in data
(either read the data or place zeros) around the part of a block that is
written, and to bail back to NFS if the block I/O fails.
Also, gathering writes in memory to flush all at once asynchronously means
extra code, and means that the read and write paths will not be symmetric WRT
the interface.
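
As an illustration of the "fill around a partial block write" job, here is a small user-space sketch. It covers only the zero-fill case (not reading existing data, and not the fallback to NFS), and the 4096-byte block size and the name sketch_zero_fill_around() are assumptions made for the example, not anything in the block patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BLOCK_SIZE 4096u   /* assumed block size for the example */

/*
 * Zero the parts of one block that a partial write does not cover.
 * The write covers [off, off + count) inside the block; everything
 * before and after that range is zero-filled.
 * Returns the number of bytes zeroed, or -1 if the range does not fit.
 */
long sketch_zero_fill_around(unsigned char *block, size_t off, size_t count)
{
    size_t end;

    if (off > SKETCH_BLOCK_SIZE || count > SKETCH_BLOCK_SIZE - off)
        return -1;

    end = off + count;
    memset(block, 0, off);                             /* head of block */
    memset(block + end, 0, SKETCH_BLOCK_SIZE - end);   /* tail of block */
    return (long)(SKETCH_BLOCK_SIZE - count);
}
```

A read has no equivalent step because it already moves whole blocks, which is the asymmetry described above.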
>
> ...now as me... We are beginning to create a completely incoherent
> interface. We need to view this interface as an abstraction layer--
> have a goal of abstracting NFS logic out of the interface e.g., a single
> write operation instead of flush_one AND write_pagelist.
write_pagelist populates the NFS page cache. flush_one flushes the page
cache after (potentially) many write_pagelist calls. Two calls are needed
for the two different jobs.
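
A toy model of the two jobs, as a sketch only: struct gather_cache, the fixed-size pending array, and the gather_* function names are all made up for illustration and do not correspond to the real nfs page cache structures.

```c
#include <assert.h>
#include <stddef.h>

#define GATHER_MAX_PENDING 8   /* arbitrary limit for the example */

/* Toy stand-in for cached, not-yet-written pages. */
struct gather_cache {
    int pending[GATHER_MAX_PENDING];  /* page indices awaiting flush */
    size_t npending;                  /* how many entries are queued */
    size_t flushed;                   /* total pages flushed so far */
};

/* write_pagelist-like job: add pages to the cache, write nothing yet. */
int gather_write_pagelist(struct gather_cache *c, const int *pages, size_t n)
{
    size_t i;

    if (n > GATHER_MAX_PENDING - c->npending)
        return -1;                    /* no room left in the toy cache */
    for (i = 0; i < n; i++)
        c->pending[c->npending++] = pages[i];
    return 0;
}

/* flush_one-like job: push out everything gathered so far in one go. */
size_t gather_flush_one(struct gather_cache *c)
{
    size_t n = c->npending;

    c->flushed += n;
    c->npending = 0;
    return n;
}
```

Several gather_write_pagelist() calls followed by a single gather_flush_one() is the pattern meant here: the first call only accumulates, the second does the actual flush.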
> The functions
> in pnfs.c should be able to package up the i/o requests from the nfs
> layer and use a simplified interface to the layout driver.
>
> We need a symmetric and coherent interface for both the read and write
> path. (something simple like 3 functions for read and 3 functions for
> write: begin_read/write, read/write_pagelist, end_read/write)
>
> What do people think? Something to talk about at the next call? What
> are the requirements of this I/O interface? Based on your input, I'd
> like to start looking into what can be done.
I think it is a good idea. Part of this discussion should be I/O recovery
through the MDS, which influences the interface.
-->Andy
> Dean