nfs client hanging

Guillaume Rousse Guillaume.Rousse at inria.fr
Tue Jul 15 11:15:29 EDT 2008


J. Bruce Fields wrote:
> And figuring out why idmapd crashed would be a really good thing to do
> in any case....  So that's the question to work on, I think.

After upgrading the kernel to 2.6.26-rc8, I now have this kind of trace 
in the logs:

Jul 15 16:23:28 localhost kernel: INFO: task mv:16429 blocked for more 
than 120 seconds.
Jul 15 16:23:28 localhost kernel: "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 15 16:23:28 localhost kernel: mv            D ffffffff804e7dc0     0 
16429  26998
Jul 15 16:23:28 localhost kernel:  ffff8103f8827638 0000000000000082 
ffff8103f8827600 ffff8103f88275fc
Jul 15 16:23:28 localhost kernel:  0000000300000000 ffff81043985d8c0 
ffff81043dde0000 ffff81043985dc10
Jul 15 16:23:28 localhost kernel:  0000000400000286 0000000103a0985e 
ffff81043985dc10 000000000000060c
Jul 15 16:23:28 localhost kernel: Call Trace:
Jul 15 16:23:28 localhost kernel:  [<ffffffffa02782dc>] 
:nfs:nfs_idmap_id+0x1dc/0x2a0
Jul 15 16:23:28 localhost kernel:  [<ffffffff80235630>] ? 
default_wake_function+0x0/0x10
Jul 15 16:23:28 localhost kernel:  [<ffffffffa02783f3>] 
:nfs:nfs_map_name_to_uid+0x23/0x30
Jul 15 16:23:28 localhost kernel:  [<ffffffffa026fa83>] 
:nfs:decode_getfattr+0x493/0x11d0
Jul 15 16:23:28 localhost kernel:  [<ffffffffa0270c67>] 
:nfs:nfs4_xdr_dec_access+0xa7/0x110
Jul 15 16:23:28 localhost kernel:  [<ffffffffa020b749>] 
:sunrpc:rpcauth_unwrap_resp+0x89/0xc0
Jul 15 16:23:28 localhost kernel:  [<ffffffffa0270bc0>] ? 
:nfs:nfs4_xdr_dec_access+0x0/0x110
Jul 15 16:23:28 localhost kernel:  [<ffffffffa02032c5>] 
:sunrpc:call_decode+0x1d5/0x870
Jul 15 16:23:28 localhost kernel:  [<ffffffff80257190>] ? 
wake_bit_function+0x0/0x40
Jul 15 16:23:28 localhost kernel:  [<ffffffffa0270bc0>] ? 
:nfs:nfs4_xdr_dec_access+0x0/0x110
Jul 15 16:23:28 localhost kernel:  [<ffffffffa020a8ea>] 
:sunrpc:__rpc_execute+0xaa/0x290
Jul 15 16:23:28 localhost kernel:  [<ffffffffa020ab66>] 
:sunrpc:rpc_execute+0x96/0xb0
Jul 15 16:23:28 localhost kernel:  [<ffffffffa0203a37>] 
:sunrpc:rpc_run_task+0x37/0x80
Jul 15 16:23:28 localhost kernel:  [<ffffffffa0203b7d>] 
:sunrpc:rpc_call_sync+0x3d/0x60
Jul 15 16:23:28 localhost kernel:  [<ffffffffa0269310>] 
:nfs:_nfs4_proc_access+0x100/0x1a0
Jul 15 16:23:28 localhost kernel:  [<ffffffff802406d0>] ? 
try_acquire_console_sem+0x10/0x40
Jul 15 16:23:28 localhost kernel:  [<ffffffffa02693eb>] 
:nfs:nfs4_proc_access+0x3b/0x70
Jul 15 16:23:28 localhost kernel:  [<ffffffffa0251b1d>] 
:nfs:nfs_do_access+0xcd/0x390
Jul 15 16:23:29 localhost kernel:  [<ffffffffa020c9c0>] ? 
:sunrpc:generic_lookup_cred+0x10/0x20
Jul 15 16:23:29 localhost kernel:  [<ffffffffa0251f26>] 
:nfs:nfs_permission+0x146/0x1c0
Jul 15 16:23:29 localhost kernel:  [<ffffffff802c8e4a>] 
permission+0xaa/0x150
Jul 15 16:23:29 localhost kernel:  [<ffffffff802c8f24>] 
vfs_permission+0x14/0x20
Jul 15 16:23:29 localhost kernel:  [<ffffffff802cb307>] 
__link_path_walk+0x97/0x1090
Jul 15 16:23:29 localhost kernel:  [<ffffffff802dbeaa>] ? 
mntput_no_expire+0x2a/0x150
Jul 15 16:23:29 localhost kernel:  [<ffffffff802cc366>] path_walk+0x66/0xd0
Jul 15 16:23:29 localhost kernel:  [<ffffffff802cc672>] 
do_path_lookup+0xa2/0x290
Jul 15 16:23:29 localhost kernel:  [<ffffffff802ccbb7>] 
__path_lookup_intent_open+0x67/0xd0
Jul 15 16:23:29 localhost kernel:  [<ffffffff802ccc2c>] 
path_lookup_open+0xc/0x10
Jul 15 16:23:29 localhost kernel:  [<ffffffff802cdae8>] 
do_filp_open+0xb8/0xa40
Jul 15 16:23:29 localhost kernel:  [<ffffffff802cc672>] ? 
do_path_lookup+0xa2/0x290
Jul 15 16:23:29 localhost kernel:  [<ffffffff802cb021>] ? putname+0x31/0x50
Jul 15 16:23:29 localhost kernel:  [<ffffffff802bf41a>] ? 
sys_faccessat+0xca/0x1d0
Jul 15 16:23:29 localhost kernel:  [<ffffffff802be4ec>] ? 
get_unused_fd_flags+0x8c/0x140
Jul 15 16:23:29 localhost kernel:  [<ffffffff802be616>] 
do_sys_open+0x76/0x100
Jul 15 16:23:29 localhost kernel:  [<ffffffff802be6cb>] sys_open+0x1b/0x20
Jul 15 16:23:29 localhost kernel:  [<ffffffff8020c55a>] 
system_call_after_swapgs+0x8a/0x8f

rpc.idmapd crashed; its last message was logged at 16:11:28, 12 minutes 
before the trace. This seems to confirm the hypothesis that the initial 
failures come from there, but the logs give little information about why 
it crashed, even with its verbosity level set to 3.
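
For anyone wanting to correlate the two timestamps automatically, here is a minimal Python sketch. It assumes each syslog entry is on a single line (the trace above is wrapped by the mail client) and that the year is 2008, since syslog timestamps omit it. The names hung_tasks and parse_stamp are hypothetical helpers, not part of any NFS tool.

```python
import re
from datetime import datetime

# Matches the kernel's hung-task report as it appears in syslog.
HUNG_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than "
    r"(?P<secs>\d+) seconds"
)

def parse_stamp(stamp, year=2008):
    """Parse a syslog timestamp; the year is an assumption supplied by the caller."""
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")

def hung_tasks(lines, year=2008):
    """Yield (time, command, pid) for each hung-task report found in lines."""
    for line in lines:
        m = HUNG_RE.match(line)
        if m:
            yield parse_stamp(m["stamp"], year), m["comm"], int(m["pid"])

# The report from the trace above, joined back onto one line.
log = [
    "Jul 15 16:23:28 localhost kernel: INFO: task mv:16429 blocked "
    "for more than 120 seconds.",
]

# Time of the last rpc.idmapd message, as stated above.
last_idmapd = parse_stamp("Jul 15 16:11:28")

for when, comm, pid in hung_tasks(log):
    gap = when - last_idmapd
    print(f"{comm}[{pid}] reported hung {gap} after the last idmapd message")
```

Running this over a full log would list every hung task together with its delay relative to the last idmapd message, which may help confirm whether all hangs start after the daemon died.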
-- 
Guillaume Rousse
Moyens Informatiques - INRIA Futurs
Tel: 01 69 35 69 62

