nfs client hanging
Guillaume Rousse
Guillaume.Rousse at inria.fr
Fri Jun 20 08:03:54 EDT 2008
Trond Myklebust wrote:
> On Tue, 2008-06-17 at 14:14 +0200, Guillaume Rousse wrote:
>> Jun 17 12:05:46 chatelet kernel: RPC: __rpc_wake_up_task done
>> Jun 17 12:05:46 chatelet kernel: RPC: 2605 sync task resuming
>> Jun 17 12:05:46 chatelet kernel: RPC: 2605 return -512, status -512
>> Jun 17 12:05:46 chatelet kernel: RPC: 2605 release task
>> Jun 17 12:05:46 chatelet kernel: RPC: 2605 releasing UNIX cred e4ce8600
>> Jun 17 12:05:46 chatelet kernel: RPC: rpc_release_client(df59ad00)
>> Jun 17 12:05:46 chatelet kernel: nfs_revalidate_inode: (0:16/100)
>> getattr failed, error=-512
>> Jun 17 12:05:46 chatelet kernel: NFS: revalidating (0:16/100)
>> Jun 17 12:05:46 chatelet kernel: RPC: new task initialized,
>> procpid 722
>> Jun 17 12:05:46 chatelet kernel: RPC: allocated task f151fe00
>> Jun 17 12:05:46 chatelet kernel: RPC: 0 looking up UNIX cred
>> Jun 17 12:05:46 chatelet kernel: RPC: 2606 __rpc_execute flags=0x80
>> Jun 17 12:05:46 chatelet kernel: RPC: 2606 call_start nfs4 proc 1 (sync)
>> Jun 17 12:05:46 chatelet kernel: RPC: 2606 call_reserve (status 0)
>> Jun 17 12:05:46 chatelet kernel: RPC: waiting for request slot
>> Jun 17 12:05:46 chatelet kernel: RPC: 2605 freeing task
>> Jun 17 12:05:46 chatelet kernel: RPC: 2606 sleep_on(queue
>> "xprt_backlog" time 7869098)
>> Jun 17 12:05:46 chatelet kernel: RPC: 2606 added to queue f7d28964
>> "xprt_backlog"
>> Jun 17 12:05:46 chatelet kernel: RPC: 2606 sync task going to sleep
>>
>> I'm not an expert, but it seems that 'nfs_revalidate_inode: (0:16/100)
>> getattr failed, error=-512', followed by 'added to queue f7d28964
>> "xprt_backlog"', basically implies that something goes wrong and the
>> request is then put in a queue to be retried later, which would explain
>> the hang.
>
> No. It is quite correct: look at the id of the tasks (the first number),
> which clearly shows that these are two different RPC calls.
>
> However the fact that everything is being queued on the xprt_backlog
> means that there is a heavy congestion, and that all the RPC slots are
> full. It supports the suspicion that the server is failing to respond to
> the client, and so the client requests are all backed up.
>
> Does 'netstat -t' show the client as still being connected to the
> server?
I just had the problem again, and the client is disconnected from the
server.
And I keep getting those "RPC: failed to contact local rpcbind server
(errno 5)" error messages in the logs. I'm switching to regular portmap
instead of rpcbind to see if it helps.
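
For what it's worth, the task-id reading suggested in the quoted reply can be checked mechanically rather than by eye. Below is a minimal Python sketch that groups RPC debug lines by the task id that follows "RPC:" and reports which tasks were parked on xprt_backlog. It assumes the lines look like the ones in the log excerpt above, with wrapped lines re-joined; the regex, the function names and the SAMPLE list are illustrative only and not part of any NFS tool.

```python
import re
from collections import defaultdict

# Matches RPC debug lines that carry a task id, e.g.
# "... kernel: RPC: 2605 release task". The pattern is an assumption
# based on the log excerpt quoted above, not on kernel documentation.
TASK_LINE = re.compile(r"kernel: RPC:\s+(\d+)\s+(.*)$")


def group_by_task(lines):
    """Return {task_id: [messages]} for RPC lines that start with a task id."""
    tasks = defaultdict(list)
    for line in lines:
        match = TASK_LINE.search(line)
        if match:
            tasks[int(match.group(1))].append(match.group(2).strip())
    return dict(tasks)


def backlogged_tasks(tasks):
    """Return the sorted task ids that have a message mentioning xprt_backlog."""
    return sorted(
        task_id
        for task_id, messages in tasks.items()
        if any("xprt_backlog" in message for message in messages)
    )


# A few lines copied from the excerpt above (wrapped lines re-joined).
SAMPLE = [
    "Jun 17 12:05:46 chatelet kernel: RPC: 2605 return -512, status -512",
    "Jun 17 12:05:46 chatelet kernel: RPC: 2605 release task",
    "Jun 17 12:05:46 chatelet kernel: RPC: 2606 call_reserve (status 0)",
    'Jun 17 12:05:46 chatelet kernel: RPC: 2606 added to queue f7d28964 "xprt_backlog"',
    "Jun 17 12:05:46 chatelet kernel: RPC: 2606 sync task going to sleep",
]

if __name__ == "__main__":
    tasks = group_by_task(SAMPLE)
    for task_id in sorted(tasks):
        print(task_id, tasks[task_id])
    print("queued on xprt_backlog:", backlogged_tasks(tasks))
```

On the sample lines, the getattr failure belongs to task 2605 and the backlog entry to task 2606, which matches the explanation that these are two different RPC calls.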
--
Guillaume Rousse
Moyens Informatiques - INRIA Futurs
Tel: 01 69 35 69 62