[pnfs] [PATCH 11/14] pnfs: pnfs_write_begin
Fredric Isaman
iisaman at citi.umich.edu
Wed Apr 2 13:25:51 EDT 2008
On Wed, 2 Apr 2008, Dean Hildebrand wrote:
>
>
> Fredric Isaman wrote:
>>
>>
>> On Mon, 31 Mar 2008, Dean Hildebrand wrote:
>>
>>>
>>>>>>
>>>>>> diff --git a/include/linux/nfs4_pnfs.h b/include/linux/nfs4_pnfs.h
>>>>>> index 1844838..f711df2 100644
>>>>>> --- a/include/linux/nfs4_pnfs.h
>>>>>> +++ b/include/linux/nfs4_pnfs.h
>>>>>> @@ -48,6 +48,12 @@ struct pnfs_layout_type {
>>>>>> u8 ld_data[]; /* layout driver private data */
>>>>>> };
>>>>>> +struct pnfs_fsdata {
>>>>>> + struct layoutdriver_io_operations *ld_io_ops;
>>>>>> + u8 ld_fsdata[];
>>>>>> +};
>>>>>>
>>>>> I'm searching through the block and pnfs code and I can't seem to find
>>>>> any uses of either of ld_fsdata or ld_io_ops. What is the meaning of
>>>>> the internals of this structure? All I see in the block code is pointer
>>>>> comparisons of the fsdata stuff.
>>>>
>>>> The reason for the pnfs_fsdata structure is that when free() needs to be
>>>> called, there is no longer any reference to the layout driver, so the
>>>> only way to know which free code to call is to store some sort of pointer
>>>> to it in the data structure itself. See pnfs_free_fsdata(). Note that
>>>> there is a similar problem with wb_private, thus the need to set wb_ops
>>>> for use in pnfs_free_request_data().
>>> Your justification for having variables in pnfs_fsdata is dubious for
>>> multiple reasons:
>>> 1) nfs_write_end calls pnfs_write_end_cleanup calls pnfs_free_fsdata which
>>> uses pnfs_fsdata.ld_io_iops. If you simply pass the nfs_server into
>>> pnfs_write_end_clean up from nfs_write_end then ld_io_ops is not required.
>>> ------> If ld_io_ops is not required, and ld_fsdata is never used, then
>>> struct pnfs_fsdata does not require any internal variables.
>>> 2) No layout driver implements free_fsdata, so it is currently not
>>> required anyways.
>>>
>>> Therefore, if a struct doesn't require internal arguments, it can be
>>> replaced with a single bit or a boolean (as you agree below)
>>>
>>> Side note: wb_private is disappearing as per our conf call. It is no
>>> longer needed once flush_one and writepagelist are merged.
>>>>
>>>>> Is the only purpose of creating and using pnfs_fsdata just to determine
>>>>> if pNFS is being used?
>>>>
>>>> Not quite, it's purpose is to remember if nfs_write_begin properly set up
>>>> the page for use by the block driver.
>>> So this is a boolean?
>>>>
>>>>> That seems to be the only purpose in bl_do flush. It seems file layout
>>>>> (nfs_page.wb_private) and block (fsdata) are competing for ways to store
>>>>> layout driver specific information. We will be removing the need for
>>>>> wb_private in the file layout driver once we merge flush_one and
>>>>> write_pagelist.
>>>>>
>>>>> For block, is the goal to remember, for each page, whether it should use
>>>>> pnfs or not? If so, would it be easier to use the wb_private field (ie
>>>>> use nfs_find_page_request in nfs_write_begin, pass the nfs_page
>>>>> structure to pnfs_write_begin, and then set a flag in the nfs_page)? In
>>>>> fact, couldn't we just use a flag in the nfs_page?
>>>>>
>>>>
>>>> No. The problem is that none of these structures exist yet. The
>>>> nfs_page request is not created until nfs_write_end, and at that time we
>>>> need to know, based on information from nfs_write_begin (passed through
>>>> fsdata), what flags/fields to set in nfs_page. Note that this
>>>> communication between write_begin and write_end is exactly what fsdata is
>>>> there for.
>>> Ahh I see, on the existing pages that need to be flushed have a nfs_page.
>>> Ok, so we are back to setting a non-dynamic memory boolean on fsdata. No
>>> free memory cleanup required.
>>>>
>>>>> I'd like to figure out a way to remove pnfs_fsdata and its associated
>>>>> messiness (just like we will remove wb_private.)
>>>>
>>>> Admittedly, since I am currently just passing a single bit of data, I
>>>> could possibly just grab a page flag. However, I think the fsdata
>>>> approach is more general, and follows naturally from the current code
>>>> design.
>>> If all you need to do is set a single bit of data (using fsdata) then do
>>> that instead and remove free_fsdata. I'd prefer not to add interface
>>> functions that predict how others might used random bits of code. At this
>>> point, we have a write_begin and a write_end, which according to yourself,
>>> is good enough for all known layout drivers. Let's let the next layout
>>> driver try and make the arguments for adding dynamic data and extra
>>> interface ops.
>>> Dean
>>
>> I see your point, and won't push too hard, other than to say that it feels
>> very wrong to make available use of void *, without a corresponding way to
>> free it. At the very least, extensive comments warning of the omission
>> should be added.
> Maybe I still don't understand the point of fsdata, but if the point of
> fsdata is to transfer info from write_begin to write_end, it should allocated
> in write_begin and freed in write_end.
> Dean
Which is precisely what pnfs_write_begin and pnfs_write_end_cleanup are
there to do.
Fred
More information about the pNFS
mailing list