Fw: generic_file_buffered_write(): deadlock on vectored write patch in 2.6.17.7

Bryce Harrington bryce at osdl.org
Sat Sep 16 02:47:47 EDT 2006


On Fri, Sep 15, 2006 at 01:30:49PM -0700, Andrew Morton wrote:
> Bryce, I think William has identified the source of the NFS rewrite
> performance regression.

Cool, yes this fixes that re-write glitch:

    http://crucible.osdl.org/runs/2083/test_output/iozone.sys.log.png

For reference here is one without William's patch:

    http://crucible.osdl.org/runs/2081/test_output/iozone.sys.log.png

The re-write line has come back inline with the write performance line
as expected.

The read and re-read performance is still lower compared with 2.6.17*.
Do you know of other patches that would be worth experimenting with?

Thanks,
Bryce

> Begin forwarded message:
> 
> Date: Tue, 12 Sep 2006 16:15:30 -0400
> From: William Xu <wxu at max-t.com>
> To: akpm at osdl.org
> Subject: generic_file_buffered_write(): deadlock on vectored write patch in 2.6.17.7
> 
> 
> Hi Andrew,
> 
> I found a regression of NFS performance in the latest kernel and traced 
> it down to this patch:
> "generic_file_buffered_write(): deadlock on vectored write" checked in 
> 2.6.17.7.
> 
> With this patch, the NFS rewrite performance is very slow. I found that 
> in my NFS rewrite
> tests, the size of segments ( cur_iov->iov_len - iov_base) could be very 
> small.  With this patch,
> the kernel tends to call write operations (prepare_write, commit_write) 
> for each segment,
> while without this patch, the kernel calls write operations for the rest 
> of a page
> (bytes=PAGE_CACHE_SIZE - offset).
> 
> Since the purpose of this patch is to avoid the deadlock during writes, 
> I wonder if we can just
> prefault all user pages to avoid the same deadlock. The attached patch 
> does exactly that. By this,
> it seems to me that we can avoid both deadlock and performance regression.
> 
> Please tell me what you think about the patch.
> 
> Thanks a lot,
> william
> 

> --- linux-2.6.17.6/mm/filemap.c.orig	2006-07-15 15:00:43.000000000 -0400
> +++ linux-2.6.17.6/mm/filemap.c	2006-09-12 15:11:22.440667029 -0400
> @@ -1971,6 +1971,21 @@
>  }
>  EXPORT_SYMBOL(generic_file_direct_write);
>  
> +static void fault_in_pages_readable_from_user_iovec(const struct iovec *iov, size_t base, size_t bytes)
> +{
> +	while (bytes) {
> +		unsigned long	maxlen = iov->iov_len - base;
> +		char __user	*buf = iov->iov_base + base;
> +
> +		base = 0;
> +		if (maxlen > bytes)
> +			maxlen = bytes;
> +		fault_in_pages_readable(buf, maxlen);
> +		bytes -=maxlen;
> +		iov++;
> +	}
> +}
> +
>  ssize_t
>  generic_file_buffered_write(struct kiocb *iocb, const struct iovec *iov,
>  		unsigned long nr_segs, loff_t pos, loff_t *ppos,
> @@ -2004,7 +2019,6 @@
>  	do {
>  		unsigned long index;
>  		unsigned long offset;
> -		unsigned long maxlen;
>  		size_t copied;
>  
>  		offset = (pos & (PAGE_CACHE_SIZE -1)); /* Within page */
> @@ -2019,10 +2033,7 @@
>  		 * same page as we're writing to, without it being marked
>  		 * up-to-date.
>  		 */
> -		maxlen = cur_iov->iov_len - iov_base;
> -		if (maxlen > bytes)
> -			maxlen = bytes;
> -		fault_in_pages_readable(buf, maxlen);
> +		fault_in_pages_readable_from_user_iovec(cur_iov, iov_base, bytes);
>  
>  		page = __grab_cache_page(mapping,index,&cached_page,&lru_pvec);
>  		if (!page) {
> 



More information about the NFSv4 mailing list