[pnfs] [PATCH 11/14] pnfs: pnfs_write_begin
Fredric Isaman
iisaman at citi.umich.edu
Thu Apr 3 09:41:34 EDT 2008
On Wed, 2 Apr 2008, Dean Hildebrand wrote:
>
>
> Fredric Isaman wrote:
>>
>>
>> On Wed, 2 Apr 2008, Dean Hildebrand wrote:
>>
>>>
>>>
>>> Fredric Isaman wrote:
>>>>
>>>>
>>>> On Mon, 31 Mar 2008, Dean Hildebrand wrote:
>>>>
>>>>>
>>>>>>>>
>>>>>>>> diff --git a/include/linux/nfs4_pnfs.h b/include/linux/nfs4_pnfs.h
>>>>>>>> index 1844838..f711df2 100644
>>>>>>>> --- a/include/linux/nfs4_pnfs.h
>>>>>>>> +++ b/include/linux/nfs4_pnfs.h
>>>>>>>> @@ -48,6 +48,12 @@ struct pnfs_layout_type {
>>>>>>>> u8 ld_data[]; /* layout driver private data */
>>>>>>>> };
>>>>>>>> +struct pnfs_fsdata {
>>>>>>>> + struct layoutdriver_io_operations *ld_io_ops;
>>>>>>>> + u8 ld_fsdata[];
>>>>>>>> +};
>>>>>>>>
>>>>>>> I'm searching through the block and pnfs code and I can't seem to find
>>>>>>> any uses of either of ld_fsdata or ld_io_ops. What is the meaning of
>>>>>>> the internals of this structure? All I see in the block code is
>>>>>>> pointer comparisons of the fsdata stuff.
>>>>>>
>>>>>> The reason for the pnfs_fsdata structure is that when free() needs to
>>>>>> be called, there is no longer any reference to the layout driver, so
>>>>>> the only way to know which free code to call is to store some sort of
>>>>>> pointer to it in the data structure itself. See pnfs_free_fsdata().
>>>>>> Note that there is a similar problem with wb_private, thus the need to
>>>>>> set wb_ops for use in pnfs_free_request_data().
>>>>> Your justification for having variables in pnfs_fsdata is dubious for
>>>>> multiple reasons:
>>>>> 1) nfs_write_end calls pnfs_write_end_cleanup calls pnfs_free_fsdata
>>>>> which uses pnfs_fsdata.ld_io_iops. If you simply pass the nfs_server
>>>>> into pnfs_write_end_clean up from nfs_write_end then ld_io_ops is not
>>>>> required.
>>>>> ------> If ld_io_ops is not required, and ld_fsdata is never used, then
>>>>> struct pnfs_fsdata does not require any internal variables.
>>>>> 2) No layout driver implements free_fsdata, so it is currently not
>>>>> required anyways.
>>>>>
>>>>> Therefore, if a struct doesn't require internal arguments, it can be
>>>>> replaced with a single bit or a boolean (as you agree below)
>>>>>
>>>>> Side note: wb_private is disappearing as per our conf call. It is no
>>>>> longer needed once flush_one and writepagelist are merged.
>>>>>>
>>>>>>> Is the only purpose of creating and using pnfs_fsdata just to
>>>>>>> determine if pNFS is being used?
>>>>>>
>>>>>> Not quite, it's purpose is to remember if nfs_write_begin properly set
>>>>>> up the page for use by the block driver.
>>>>> So this is a boolean?
>>>>>>
>>>>>>> That seems to be the only purpose in bl_do flush. It seems file layout
>>>>>>> (nfs_page.wb_private) and block (fsdata) are competing for ways to
>>>>>>> store layout driver specific information. We will be removing the
>>>>>>> need for wb_private in the file layout driver once we merge flush_one
>>>>>>> and write_pagelist.
>>>>>>>
>>>>>>> For block, is the goal to remember, for each page, whether it should
>>>>>>> use pnfs or not? If so, would it be easier to use the wb_private
>>>>>>> field (ie use nfs_find_page_request in nfs_write_begin, pass the
>>>>>>> nfs_page structure to pnfs_write_begin, and then set a flag in the
>>>>>>> nfs_page)? In fact, couldn't we just use a flag in the nfs_page?
>>>>>>>
>>>>>>
>>>>>> No. The problem is that none of these structures exist yet. The
>>>>>> nfs_page request is not created until nfs_write_end, and at that time
>>>>>> we need to know, based on information from nfs_write_begin (passed
>>>>>> through fsdata), what flags/fields to set in nfs_page. Note that this
>>>>>> communication between write_begin and write_end is exactly what fsdata
>>>>>> is there for.
>>>>> Ahh I see, on the existing pages that need to be flushed have a
>>>>> nfs_page. Ok, so we are back to setting a non-dynamic memory boolean on
>>>>> fsdata. No free memory cleanup required.
>>>>>>
>>>>>>> I'd like to figure out a way to remove pnfs_fsdata and its associated
>>>>>>> messiness (just like we will remove wb_private.)
>>>>>>
>>>>>> Admittedly, since I am currently just passing a single bit of data, I
>>>>>> could possibly just grab a page flag. However, I think the fsdata
>>>>>> approach is more general, and follows naturally from the current code
>>>>>> design.
>>>>> If all you need to do is set a single bit of data (using fsdata) then do
>>>>> that instead and remove free_fsdata. I'd prefer not to add interface
>>>>> functions that predict how others might used random bits of code. At
>>>>> this point, we have a write_begin and a write_end, which according to
>>>>> yourself, is good enough for all known layout drivers. Let's let the
>>>>> next layout driver try and make the arguments for adding dynamic data
>>>>> and extra interface ops.
>>>>> Dean
>>>>
>>>> I see your point, and won't push too hard, other than to say that it
>>>> feels very wrong to make available use of void *, without a corresponding
>>>> way to free it. At the very least, extensive comments warning of the
>>>> omission should be added.
>>> Maybe I still don't understand the point of fsdata, but if the point of
>>> fsdata is to transfer info from write_begin to write_end, it should
>>> allocated in write_begin and freed in write_end.
>>> Dean
>>
>>
>> Which is precisely what pnfs_write_begin and pnfs_write_end_cleanup are
>> there to do.
> You are worried about dynamic data, but as of the current moment we still
> don't need any in the layout driver, so I think I might have a compromise.
> 1. change the contents of pnfs_fsdata to be a enum of type:
> a. READ
> b. WRITE, so far ok to use pnfs
> c. WRITE, must not use pnfs
>
> 2. Allocate pnfs_fsdata in _pnfs_write_begin, pass it to layout driver
> 3. Only pass pnfs_fsdata* to write_begin instead of a pnfs_fsdata**
> 4. layout driver can modify the enum in pnfs_fsdata as it wishes
> 5. de-alloc pnfs_fsdata in pnfs_write_end_cleanup
> 6. eliminate layout driver interface free_fsdata function
>
> then I think we already agreed on
> 7. eliminate layout driver interface function *new_request* and move
> functionality from block driver into _pnfs_modify_new_request
> - btw, this has several benefits. One is simplifying the LD interface.
> Another benefit which is just as important is maintaining the separation of
> the nfs code and the layout driver (layering if you will), i.e., the nfs_page
> structure is an nfs-specific data structure and should not be interpreted by
> the layout driver.
>
> and at the conf call
> 8. eliminate LD interface functions free_request_data and flush_one by
> merging the layout driver flush_one and write_pagelist functions.
>
> Dean
>
Seems reasonable.
Fred
More information about the pNFS
mailing list