[pnfs] [PATCH 13/29] pnfs: read path set ds_size

Benny Halevy bhalevy at panasas.com
Tue Jan 1 11:48:41 EST 2008


On Dec. 28, 2007, 10:45 +0200, Benny Halevy <bhalevy at panasas.com> wrote:
> From: Andy Adamson <andros at umich.edu>
> 
> Use the pNFS ds_rsize for read pageio.
> 
> When using the nfs page cache for pNFS I/O, it is necessary to set up pages
> using the rsize of the storage device instead of the rsize of the MDS.
> 
> In 2.6.18.3, there are several read I/O paths through the client code and
> the rsize was used in each of these paths to set up the nfs page cache. Since
> several functions used the rsize, an rpc_ops->rsize function makes sense.
> 
> In 2.6.24, there is only one read I/O path through the client code. The
> interface to the nfs page cache has been re-written. It is now shared between
> the read and write code paths, and moved entirely to fs/nfs/pagelist.c.
> 
> As a result, there is now only one place to set the rsize.
> 
> This patch removes the rpc_ops->rsize function pointer and
> sets the rsize to the pNFS ds_size when pNFS is being used.
> 
> pNFS is not being used if layoutget fails, or the count is below the
> threshold.
> 
> Signed-off-by: Andy Adamson <andros at umich.edu>
> Signed-off-by: Benny Halevy <bhalevy at panasas.com>
> ---
>  fs/nfs/nfs4proc.c |    1 -
>  fs/nfs/pnfs.c     |   48 ++++++++++++++++++++++++++++++++++++++----------
>  fs/nfs/pnfs.h     |   23 ++++++++++++++++++++++-
>  fs/nfs/read.c     |    7 +++++++
>  4 files changed, 67 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 2294754..3613960 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -5312,7 +5312,6 @@ const struct nfs_rpc_ops pnfs_v41_clientops = {
>  	.file_open      = nfs_open,
>  	.file_release   = nfs_release,
>  	.lock		= nfs4_proc_lock,
> -	.rsize		= pnfs_rsize,
>  	.wsize		= pnfs_wsize,
>  	.rpages		= pnfs_rpages,
>  	.wpages		= pnfs_wpages,
> diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
> index 33de1e9..c269cf6 100644
> --- a/fs/nfs/pnfs.c
> +++ b/fs/nfs/pnfs.c
> @@ -640,6 +640,44 @@ check:
>  		return 0;
>  }
>  
> +/*
> + * rsize is already set by caller to MDS rsize.
> + */
> +void
> +pnfs_set_ds_size(struct inode *inode,
> +		 struct nfs_open_context *ctx,
> +		 struct list_head *pages,
> +		 loff_t offset,
> +		 size_t *rsize)
> +{
> +	struct nfs_server *nfss = NFS_SERVER(inode);
> +	struct page *page;
> +	size_t count = 0;
> +	int status = 0;
> +
> +	dprintk("--> %s inode %p ctx %p pages %p offset %lu\n",
> +		__func__, inode, ctx, pages, (unsigned long)offset);
> +
> +	if (!pnfs_enabled_sb(nfss))
> +		return;
> +
> +	/* Calculate the total read-ahead count */
> +	list_for_each_entry(page, pages, lru)
> +		count += pnfs_page_length(page, inode);

Andy, I'm not sure why we need to do all of this...
I presume that the byte count on the last page may be the whole page
and extend over the file read size and you want to get the layout
only up to the file size, is that so?

Anyway, the pages are consecutive, so why do you need to walk the
list rather than just rely on the number of pages, the offset, and
the file size? (The current calculation also seems to have a bug,
since it doesn't take the offset into the first page into account...)

How about doing something like this:
- pass nr_pages to pnfs_set_ds_size and compute the count from it:

	loff_t i_size = i_size_read(inode);
	loff_t end_offset = (offset & PAGE_CACHE_MASK) + nr_pages * PAGE_CACHE_SIZE;

	if (end_offset > i_size)
		end_offset = i_size;
	count = end_offset - offset;
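
To make the arithmetic above easy to check outside the kernel, here is a
minimal userspace sketch of the same calculation. It is only an
illustration, not the patch itself: sketch_read_count, SKETCH_PAGE_SIZE and
SKETCH_PAGE_MASK are hypothetical names standing in for the kernel's
PAGE_CACHE_SIZE and PAGE_CACHE_MASK, a 4096-byte page is assumed, and the
guard for an offset at or past end of file is an addition that the snippet
above does not have.

```c
#include <stdint.h>

/* Assumed userspace stand-ins for PAGE_CACHE_SIZE and PAGE_CACHE_MASK. */
#define SKETCH_PAGE_SIZE ((int64_t)4096)
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/*
 * Hypothetical helper: number of bytes covered by nr_pages consecutive
 * pages, starting at byte 'offset' and clamped to the file size 'i_size'.
 * Assumes offset and i_size are non-negative.
 */
static int64_t sketch_read_count(int64_t offset, unsigned long nr_pages,
				 int64_t i_size)
{
	/* End of the last page: start of the first page plus nr_pages pages. */
	int64_t end_offset = (offset & SKETCH_PAGE_MASK) +
			     (int64_t)nr_pages * SKETCH_PAGE_SIZE;

	/* Do not count bytes past the end of the file. */
	if (end_offset > i_size)
		end_offset = i_size;

	/* Added guard (assumption): nothing to read at or past EOF. */
	if (end_offset <= offset)
		return 0;

	return end_offset - offset;
}
```

For example, with offset 100, nr_pages 2 and a 100000-byte file this gives
8092 bytes (8192 - 100), so the offset into the first page is accounted for.
With offset 4096, nr_pages 1 and a 5000-byte file it gives 904 bytes, since
the count stops at the file size.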

- Benny

