[pnfs] [PATCH 11/14] pnfs: pnfs_write_begin

Fredric Isaman iisaman at citi.umich.edu
Thu Apr 3 09:41:34 EDT 2008



On Wed, 2 Apr 2008, Dean Hildebrand wrote:

>
>
> Fredric Isaman wrote:
>> 
>> 
>> On Wed, 2 Apr 2008, Dean Hildebrand wrote:
>> 
>>> 
>>> 
>>> Fredric Isaman wrote:
>>>> 
>>>> 
>>>> On Mon, 31 Mar 2008, Dean Hildebrand wrote:
>>>> 
>>>>> 
>>>>>>>> 
>>>>>>>> diff --git a/include/linux/nfs4_pnfs.h b/include/linux/nfs4_pnfs.h
>>>>>>>> index 1844838..f711df2 100644
>>>>>>>> --- a/include/linux/nfs4_pnfs.h
>>>>>>>> +++ b/include/linux/nfs4_pnfs.h
>>>>>>>> @@ -48,6 +48,12 @@ struct pnfs_layout_type {
>>>>>>>>      u8 ld_data[];            /* layout driver private data */
>>>>>>>>  };
>>>>>>>>  +struct pnfs_fsdata {
>>>>>>>> +    struct layoutdriver_io_operations *ld_io_ops;
>>>>>>>> +    u8 ld_fsdata[];
>>>>>>>> +};
>>>>>>>> 
>>>>>>> I'm searching through the block and pnfs code and I can't seem to find 
>>>>>>> any uses of either of ld_fsdata or ld_io_ops.  What is the meaning of 
>>>>>>> the internals of this structure?  All I see in the block code is 
>>>>>>> pointer comparisons of the fsdata stuff.
>>>>>> 
>>>>>> The reason for the pnfs_fsdata structure is that when free() needs to 
>>>>>> be called, there is no longer any reference to the layout driver, so 
>>>>>> the only way to know which free code to call is to store some sort of 
>>>>>> pointer to it in the data structure itself.  See pnfs_free_fsdata(). 
>>>>>> Note that there is a similar problem with wb_private, thus the need to 
>>>>>> set wb_ops for use in pnfs_free_request_data().
>>>>> Your justification for having variables in pnfs_fsdata is dubious for 
>>>>> multiple reasons:
>>>>> 1) nfs_write_end calls pnfs_write_end_cleanup calls pnfs_free_fsdata 
>>>>> which uses pnfs_fsdata.ld_io_iops.  If you simply pass the nfs_server 
>>>>> into pnfs_write_end_clean up from nfs_write_end then ld_io_ops is not 
>>>>> required.
>>>>> ------> If ld_io_ops is not required, and ld_fsdata is never used, then 
>>>>> struct pnfs_fsdata does not require any internal variables.
>>>>> 2) No layout driver implements free_fsdata, so it is currently not 
>>>>> required anyways.
>>>>> 
>>>>> Therefore, if a struct doesn't require internal arguments, it can be 
>>>>> replaced with a single bit or a boolean (as you agree below)
>>>>> 
>>>>> Side note: wb_private is disappearing as per our conf call.  It is no 
>>>>> longer needed once flush_one and writepagelist are merged.
>>>>>> 
>>>>>>> Is the only purpose of creating and using pnfs_fsdata just to 
>>>>>>> determine if pNFS is being used?
>>>>>> 
>>>>>> Not quite, it's purpose is to remember if nfs_write_begin properly set 
>>>>>> up the page for use by the block driver.
>>>>> So this is a boolean?
>>>>>> 
>>>>>>> That seems to be the only purpose in bl_do flush. It seems file layout 
>>>>>>> (nfs_page.wb_private) and block (fsdata) are competing for ways to 
>>>>>>> store layout driver specific information.  We will be removing the 
>>>>>>> need for wb_private in the file layout driver once we merge flush_one 
>>>>>>> and write_pagelist.
>>>>>>> 
>>>>>>> For block, is the goal to remember, for each page, whether it should 
>>>>>>> use pnfs or not?  If so, would it be easier to use the wb_private 
>>>>>>> field (ie use nfs_find_page_request in nfs_write_begin, pass the 
>>>>>>> nfs_page structure to pnfs_write_begin, and then set a flag in the 
>>>>>>> nfs_page)?  In fact, couldn't we just use a flag in the nfs_page?
>>>>>>> 
>>>>>> 
>>>>>> No.  The problem is that none of these structures exist yet.  The 
>>>>>> nfs_page request is not created until nfs_write_end, and at that time 
>>>>>> we need to know, based on information from nfs_write_begin (passed 
>>>>>> through fsdata), what flags/fields to set in nfs_page.  Note that this 
>>>>>> communication between write_begin and write_end is exactly what fsdata 
>>>>>> is there for.
>>>>> Ahh I see, on the existing pages that need to be flushed have a 
>>>>> nfs_page. Ok, so we are back to setting a non-dynamic memory boolean on 
>>>>> fsdata.  No free memory cleanup required.
>>>>>> 
>>>>>>> I'd like to figure out a way to remove pnfs_fsdata and its associated 
>>>>>>> messiness (just like we will remove wb_private.)
>>>>>> 
>>>>>> Admittedly, since I am currently just passing a single bit of data, I 
>>>>>> could possibly just grab a page flag.  However, I think the fsdata 
>>>>>> approach is more general, and follows naturally from the current code 
>>>>>> design.
>>>>> If all you need to do is set a single bit of data (using fsdata) then do 
>>>>> that instead and remove free_fsdata.  I'd prefer not to add interface 
>>>>> functions that predict how others might used random bits of code.  At 
>>>>> this point, we have a write_begin and a write_end, which according to 
>>>>> yourself, is good enough for all known layout drivers.  Let's let the 
>>>>> next layout driver try and make the arguments for adding dynamic data 
>>>>> and extra interface ops.
>>>>> Dean
>>>> 
>>>> I see your point, and won't push too hard, other than to say that it 
>>>> feels very wrong to make available use of void *, without a corresponding 
>>>> way to free it.  At the very least, extensive comments warning of the 
>>>> omission should be added.
>>> Maybe I still don't understand the point of fsdata, but if the point of 
>>> fsdata is to transfer info from write_begin to write_end, it should 
>>> allocated in write_begin and freed in write_end.
>>> Dean
>> 
>> 
>> Which is precisely what pnfs_write_begin and pnfs_write_end_cleanup are 
>> there to do.
> You are worried about dynamic data, but as of the current moment we still 
> don't need any in the layout driver, so I think I might have a compromise. 
> 1. change the contents of pnfs_fsdata to be a enum of type:
> a. READ
> b. WRITE, so far ok to use pnfs
> c. WRITE, must not use pnfs
>
> 2. Allocate pnfs_fsdata in  _pnfs_write_begin, pass it to layout driver
> 3. Only pass pnfs_fsdata* to write_begin instead of a  pnfs_fsdata**
> 4. layout driver can modify the enum in pnfs_fsdata as it wishes
> 5. de-alloc pnfs_fsdata in pnfs_write_end_cleanup
> 6. eliminate layout driver interface free_fsdata function
>
> then I think we already agreed on
> 7. eliminate layout driver interface function *new_request* and move 
> functionality from block driver into _pnfs_modify_new_request
>   - btw, this has several benefits.  One is simplifying the LD interface. 
> Another benefit which is just as important is maintaining the separation of 
> the nfs code and the layout driver (layering if you will), i.e., the nfs_page 
> structure is an nfs-specific data structure and should not be interpreted by 
> the layout driver.
>
> and at the conf call
> 8. eliminate LD interface functions free_request_data and flush_one by 
> merging the layout driver flush_one and write_pagelist functions.
>
> Dean
>


Seems reasonable.

 	Fred



More information about the pNFS mailing list