nfs client hanging

J. Bruce Fields bfields at fieldses.org
Tue Jul 15 13:50:22 EDT 2008


On Tue, Jul 15, 2008 at 05:15:29PM +0200, Guillaume Rousse wrote:
> After upgrading the kernel to 2.6.26-rc8, I now have this kind of trace  
> in the logs:
>
> Jul 15 16:23:28 localhost kernel: INFO: task mv:16429 blocked for more than 120 seconds.
> Jul 15 16:23:28 localhost kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Jul 15 16:23:28 localhost kernel: mv            D ffffffff804e7dc0     0 16429  26998
> Jul 15 16:23:28 localhost kernel:  ffff8103f8827638 0000000000000082 ffff8103f8827600 ffff8103f88275fc
> Jul 15 16:23:28 localhost kernel:  0000000300000000 ffff81043985d8c0 ffff81043dde0000 ffff81043985dc10
> Jul 15 16:23:28 localhost kernel:  0000000400000286 0000000103a0985e ffff81043985dc10 000000000000060c
> Jul 15 16:23:28 localhost kernel: Call Trace:
> Jul 15 16:23:28 localhost kernel:  [<ffffffffa02782dc>] :nfs:nfs_idmap_id+0x1dc/0x2a0

I think nfs_idmap_id just refuses to return until it gets a response to
the idmap upcall, so it may hold the two mutexes there indefinitely.  I
don't know if that's fixable or if it has to be that way.
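
To make the concern concrete, here is a rough userspace sketch in Python
of the pattern I suspect. It is not the kernel code: the UpcallChannel
class, its lock, and its reply queue are hypothetical stand-ins for
nfs_idmap_id, the mutexes it holds, and the reply path from rpc.idmapd.
It only illustrates how waiting without a timeout while holding a lock
keeps every later caller out once the daemon is gone, and how a bounded
wait would at least release the lock again.

```python
import queue
import threading


class UpcallChannel:
    """Toy model of a request that waits on a userspace helper daemon."""

    def __init__(self):
        self.lock = threading.Lock()   # stands in for the mutexes held during the upcall
        self.replies = queue.Queue()   # stands in for the reply path from the daemon

    def lookup_blocking(self):
        """Hold the lock and wait with no timeout."""
        with self.lock:
            return self.replies.get()  # never returns if the daemon is gone

    def lookup_with_timeout(self, timeout_s):
        """Hold the lock, but give up after timeout_s seconds."""
        with self.lock:
            try:
                return self.replies.get(timeout=timeout_s)
            except queue.Empty:
                return None            # caller can fail the request instead of hanging

    def lock_is_free(self):
        """Report whether another caller could take the lock right now."""
        if self.lock.acquire(blocking=False):
            self.lock.release()
            return True
        return False


if __name__ == "__main__":
    # Nothing ever feeds `replies`, which models a crashed daemon.
    bounded = UpcallChannel()
    print(bounded.lookup_with_timeout(0.1), bounded.lock_is_free())

    stuck = UpcallChannel()
    waiter = threading.Thread(target=stuck.lookup_blocking, daemon=True)
    waiter.start()
    waiter.join(0.2)                   # give the waiter time to take the lock
    print(waiter.is_alive(), stuck.lock_is_free())
```

In the bounded case the lookup gives up and the lock is free afterwards;
in the unbounded case the waiter stays alive and the lock stays taken.
Whether the kernel side can safely time out like this is exactly the
part I'm unsure about.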

In any case:

> rpc.idmapd crashed; its last message was logged at 16:11:28, 12 minutes
> before the trace. That seems to support the hypothesis that the initial
> failure comes from there, but I don't have much information about why it
> crashed, despite its verbosity level being set to 3.

Could you get a backtrace, a core dump, or something similar?  What are
the final messages before the crash?

--b.

