[pnfs] Issues implementing LAYOUTRETURN
Benny Halevy
bhalevy at panasas.com
Wed Jul 12 19:57:59 EDT 2006
Initially, the thinking was to implement the layout cache at the layout
driver layer, but I think there are advantages to having a unified, generic
layout cache, in particular for scheduling the layout recall process, in
which possibly several layout segments will have to be returned while the
returns need to be serialized with in-flight layout operations.
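
To make the recall scheduling concrete, here is a minimal userspace sketch
of a generic per-inode segment list, assuming segments are keyed by
(offset, length, iomode) and carry a count of in-flight operations. All
names here (struct layout_seg, recall_scan, the IOMODE_* values) are
hypothetical and are not taken from the current client code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical iomode values, for illustration only. */
enum { IOMODE_READ = 1, IOMODE_RW = 2, IOMODE_ANY = 3 };

/* Hypothetical generic cache entry: one per layout segment of an inode. */
struct layout_seg {
    uint64_t offset;
    uint64_t length;          /* UINT64_MAX means "to end of file" */
    uint32_t iomode;
    unsigned inflight;        /* I/O operations currently using the segment */
    struct layout_seg *next;
};

/* Exclusive end of a byte range, saturating instead of wrapping. */
static uint64_t seg_end(uint64_t offset, uint64_t length)
{
    uint64_t end = offset + length;
    return end < offset ? UINT64_MAX : end;
}

/* Does the segment match the iomode and overlap [off, off + len)? */
static int seg_overlaps(const struct layout_seg *s,
                        uint64_t off, uint64_t len, uint32_t iomode)
{
    if (iomode != IOMODE_ANY && s->iomode != iomode)
        return 0;
    return s->offset < seg_end(off, len) &&
           off < seg_end(s->offset, s->length);
}

/*
 * Walk the list for a recall of [off, off + len) with the given iomode.
 * Returns the number of matching segments that can be returned right
 * away; *busy receives the number that still have in-flight operations
 * and whose return therefore has to wait.
 */
static size_t recall_scan(const struct layout_seg *head, uint64_t off,
                          uint64_t len, uint32_t iomode, size_t *busy)
{
    size_t ready = 0;

    assert(busy != NULL);
    *busy = 0;
    for (const struct layout_seg *s = head; s != NULL; s = s->next) {
        if (!seg_overlaps(s, off, len, iomode))
            continue;
        if (s->inflight > 0)
            (*busy)++;
        else
            ready++;
    }
    return ready;
}
```

A real implementation would need locking and would defer the busy
segments until their operations drain; the sketch only shows why the
generic layer needs the range, the iomode and the in-flight state in one
place.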
Keep in mind that object-based layouts need to be treated differently
than files and blocks, as explained in draft-ietf-nfsv4-pnfs-obj-01.txt,
since each layout segment is immutable and can't be split or merged
with neighboring segments (extents).
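
As an illustration of the difference, the sketch below computes which
byte range would be returned for one cached segment when a recall covers
only part of it. It assumes, as one reading of the immutability rule,
that a splittable (files/blocks) segment returns just the overlap while
an object segment is returned whole. The names struct byte_range and
range_to_return are hypothetical.

```c
#include <stdint.h>

/* Hypothetical byte range: [offset, offset + length). */
struct byte_range {
    uint64_t offset;
    uint64_t length;
};

/* Exclusive end of a range, saturating instead of wrapping. */
static uint64_t range_end(struct byte_range r)
{
    uint64_t end = r.offset + r.length;
    return end < r.offset ? UINT64_MAX : end;
}

/*
 * Decide what to return for segment `seg` when `recall` arrives.
 * Returns 0 if the two do not overlap. Otherwise returns 1 and fills
 * *out with the overlap (splittable != 0) or with the whole segment
 * (splittable == 0, i.e. the segment is immutable).
 */
static int range_to_return(struct byte_range seg, struct byte_range recall,
                           int splittable, struct byte_range *out)
{
    uint64_t seg_e = range_end(seg);
    uint64_t rec_e = range_end(recall);
    uint64_t lo = seg.offset > recall.offset ? seg.offset : recall.offset;
    uint64_t hi = seg_e < rec_e ? seg_e : rec_e;

    if (lo >= hi)
        return 0;               /* no overlap, nothing to return */

    if (splittable) {
        out->offset = lo;       /* return only the recalled part */
        out->length = hi - lo;
    } else {
        *out = seg;             /* immutable: return the whole segment */
    }
    return 1;
}
```

With a segment covering bytes 100..199 and a recall of bytes 150..249,
the splittable case yields (offset 150, length 50) and the immutable case
yields (offset 100, length 100).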
In summary, any design decision we make should take into account the layout
recall process...
Benny
Iyer, Rahul wrote:
>Hi!
>Currently, even though the LAYOUTRETURN RPC is implemented, it is not
>called anywhere. I was planning to add code to make these calls to the
>code. However, I'm faced with a problem.
>
>The arguments for the LAYOUTRETURN call are as defined below:
>struct nfs4_pnfs_layoutreturn_arg {
> __u64 clientid;
> __u64 offset;
> __u64 length;
> __u32 iomode;
> __u32 reclaim;
> __u32 type;
> struct inode* inode;
>};
>
>Unfortunately, from all the places LAYOUTRETURN needs to be called, all
>we have is the inode. So, all we can get is the struct pnfs_layout_type.
>Since the offset, length and iomode are parts of the opaque layout,
>there's no way for a client to fill in these arguments. There is a
>function pnfs_return_layout() which calls LAYOUTRETURN. It hardcodes the
>iomode to write and the offset and length respectively to 0 and
>0xffffffff (this is probably more correctly expressed as ~0, since
>length is __u64, not __u32), but this will obviously not work with
>extents.
>
>In order to solve this issue, there could be two possible schemes:
>1. Add a call similar to setup_layoutcommit to the layoutdriver ops.
>2. Move the offset, length and iomode fields from the opaque layout into
>the generic layout (struct pnfs_layout_type).
>
>While both will work, approach 1 will still pose a problem later on.
>Currently, the code seems to assume whole file layouts. However, if we
>do move to an extent based layout scheme in future, then we would be
>faced with a dilemma on how to maintain the layout cache. The current
>nfs_inode->current_layout will have to be replaced with a list of
>layouts available for the inode (file) and this will mostly be indexed
>by the (offset, length and iomode) fields, which are currently not
>available. The alternative, of course, would be to have the layout
>driver handle cache management, in which case approach 1 will do fine.
>
>I was wondering what people thought on this issue.
>Thanks
>Rahul