[pnfs] [PATCH 11/14] pnfs: pnfs_write_begin
Dean Hildebrand
seattleplus at gmail.com
Wed Apr 2 19:52:15 EDT 2008
Fredric Isaman wrote:
>
>
> On Wed, 2 Apr 2008, Dean Hildebrand wrote:
>
>>
>>
>> Fredric Isaman wrote:
>>>
>>>
>>> On Mon, 31 Mar 2008, Dean Hildebrand wrote:
>>>
>>>>
>>>>>>>
>>>>>>> diff --git a/include/linux/nfs4_pnfs.h b/include/linux/nfs4_pnfs.h
>>>>>>> index 1844838..f711df2 100644
>>>>>>> --- a/include/linux/nfs4_pnfs.h
>>>>>>> +++ b/include/linux/nfs4_pnfs.h
>>>>>>> @@ -48,6 +48,12 @@ struct pnfs_layout_type {
>>>>>>> u8 ld_data[]; /* layout driver private data */
>>>>>>> };
>>>>>>> +struct pnfs_fsdata {
>>>>>>> + struct layoutdriver_io_operations *ld_io_ops;
>>>>>>> + u8 ld_fsdata[];
>>>>>>> +};
>>>>>>>
>>>>>> I'm searching through the block and pnfs code and I can't seem to
>>>>>> find any uses of either of ld_fsdata or ld_io_ops. What is the
>>>>>> meaning of the internals of this structure? All I see in the
>>>>>> block code is pointer comparisons of the fsdata stuff.
>>>>>
>>>>> The reason for the pnfs_fsdata structure is that when free() needs
>>>>> to be called, there is no longer any reference to the layout
>>>>> driver, so the only way to know which free code to call is to
>>>>> store some sort of pointer to it in the data structure itself.
>>>>> See pnfs_free_fsdata(). Note that there is a similar problem with
>>>>> wb_private, thus the need to set wb_ops for use in
>>>>> pnfs_free_request_data().
>>>> Your justification for having variables in pnfs_fsdata is dubious
>>>> for multiple reasons:
>>>> 1) nfs_write_end calls pnfs_write_end_cleanup calls
>>>> pnfs_free_fsdata which uses pnfs_fsdata.ld_io_iops. If you simply
>>>> pass the nfs_server into pnfs_write_end_clean up from nfs_write_end
>>>> then ld_io_ops is not required.
>>>> ------> If ld_io_ops is not required, and ld_fsdata is never used,
>>>> then struct pnfs_fsdata does not require any internal variables.
>>>> 2) No layout driver implements free_fsdata, so it is currently not
>>>> required anyways.
>>>>
>>>> Therefore, if a struct doesn't require internal arguments, it can
>>>> be replaced with a single bit or a boolean (as you agree below)
>>>>
>>>> Side note: wb_private is disappearing as per our conf call. It is
>>>> no longer needed once flush_one and writepagelist are merged.
>>>>>
>>>>>> Is the only purpose of creating and using pnfs_fsdata just to
>>>>>> determine if pNFS is being used?
>>>>>
>>>>> Not quite, it's purpose is to remember if nfs_write_begin properly
>>>>> set up the page for use by the block driver.
>>>> So this is a boolean?
>>>>>
>>>>>> That seems to be the only purpose in bl_do flush. It seems file
>>>>>> layout (nfs_page.wb_private) and block (fsdata) are competing for
>>>>>> ways to store layout driver specific information. We will be
>>>>>> removing the need for wb_private in the file layout driver once
>>>>>> we merge flush_one and write_pagelist.
>>>>>>
>>>>>> For block, is the goal to remember, for each page, whether it
>>>>>> should use pnfs or not? If so, would it be easier to use the
>>>>>> wb_private field (ie use nfs_find_page_request in
>>>>>> nfs_write_begin, pass the nfs_page structure to pnfs_write_begin,
>>>>>> and then set a flag in the nfs_page)? In fact, couldn't we just
>>>>>> use a flag in the nfs_page?
>>>>>>
>>>>>
>>>>> No. The problem is that none of these structures exist yet. The
>>>>> nfs_page request is not created until nfs_write_end, and at that
>>>>> time we need to know, based on information from nfs_write_begin
>>>>> (passed through fsdata), what flags/fields to set in nfs_page.
>>>>> Note that this communication between write_begin and write_end is
>>>>> exactly what fsdata is there for.
>>>> Ahh I see, on the existing pages that need to be flushed have a
>>>> nfs_page. Ok, so we are back to setting a non-dynamic memory
>>>> boolean on fsdata. No free memory cleanup required.
>>>>>
>>>>>> I'd like to figure out a way to remove pnfs_fsdata and its
>>>>>> associated messiness (just like we will remove wb_private.)
>>>>>
>>>>> Admittedly, since I am currently just passing a single bit of
>>>>> data, I could possibly just grab a page flag. However, I think
>>>>> the fsdata approach is more general, and follows naturally from
>>>>> the current code design.
>>>> If all you need to do is set a single bit of data (using fsdata)
>>>> then do that instead and remove free_fsdata. I'd prefer not to add
>>>> interface functions that predict how others might used random bits
>>>> of code. At this point, we have a write_begin and a write_end,
>>>> which according to yourself, is good enough for all known layout
>>>> drivers. Let's let the next layout driver try and make the
>>>> arguments for adding dynamic data and extra interface ops.
>>>> Dean
>>>
>>> I see your point, and won't push too hard, other than to say that it
>>> feels very wrong to make available use of void *, without a
>>> corresponding way to free it. At the very least, extensive comments
>>> warning of the omission should be added.
>> Maybe I still don't understand the point of fsdata, but if the point
>> of fsdata is to transfer info from write_begin to write_end, it
>> should allocated in write_begin and freed in write_end.
>> Dean
>
>
> Which is precisely what pnfs_write_begin and pnfs_write_end_cleanup
> are there to do.
You are worried about dynamic data, but as of the current moment we
still don't need any in the layout driver, so I think I might have a
compromise.
1. change the contents of pnfs_fsdata to be a enum of type:
a. READ
b. WRITE, so far ok to use pnfs
c. WRITE, must not use pnfs
2. Allocate pnfs_fsdata in _pnfs_write_begin, pass it to layout driver
3. Only pass pnfs_fsdata* to write_begin instead of a pnfs_fsdata**
4. layout driver can modify the enum in pnfs_fsdata as it wishes
5. de-alloc pnfs_fsdata in pnfs_write_end_cleanup
6. eliminate layout driver interface free_fsdata function
then I think we already agreed on
7. eliminate layout driver interface function *new_request* and move
functionality from block driver into _pnfs_modify_new_request
- btw, this has several benefits. One is simplifying the LD
interface. Another benefit which is just as important is maintaining
the separation of the nfs code and the layout driver (layering if you
will), i.e., the nfs_page structure is an nfs-specific data structure
and should not be interpreted by the layout driver.
and at the conf call
8. eliminate LD interface functions free_request_data and flush_one by
merging the layout driver flush_one and write_pagelist functions.
Dean
More information about the pNFS
mailing list