[nfsv4] NFSv4 performance issues on a lagged network
Bryce Harrington
bryce at osdl.org
Fri Sep 8 21:29:25 EDT 2006
On Wed, Aug 30, 2006 at 05:01:30PM -0400, Trond Myklebust wrote:
> On Tue, 2006-08-29 at 17:07 -0400, Trond Myklebust wrote:
> > See http://crucible.osdl.org/runs/1621/sysinfo/nfs05.console
> >
> > It looks like the networking layer is passing back an ENETUNREACH error
> > that we weren't expecting, and consequently weren't handling.
> >
> > Hmm... Not much you can do in those circumstances except delay and then
> > retry. I suppose the same goes for EHOSTUNREACH (which we didn't
> > receive, but could conceivably happen too).
> >
> > I'll look into drafting a patch.
>
> OK... Please could you see if the attached patch has an effect on those
> errors.
>
> Cheers,
> Trond
Hi Trond,
Thanks for the patch. With it applied, iozone goes into a loop and
never finishes. Here are three runs with the patch under identical
conditions:
http://crucible.osdl.org/runs/1791/test_output/iozone.sys.log
http://crucible.osdl.org/runs/1793/test_output/iozone.sys.log
http://crucible.osdl.org/runs/1794/test_output/iozone.sys.log
All three runs get stuck at almost exactly the same spot in the test.
On the console, the output is:
[http://crucible.osdl.org/runs/1793/sysinfo/nfs05.console]
...
nfs05 login: nfsd: last server has exited
nfsd: unexporting all filesystems
NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
NFSD: starting 90-second grace period
*** Run 1793: Running 'src/current/iozone -+q 1 -i 0 -i 1 -g 128M -Race -U /mnt/192.168.10.4 -f /mnt/192.168.10.4/iozone.tmp' ***
nfs: server 192.168.10.4 not responding, still trying
[-- MARK -- Tue Sep 5 23:00:00 2006]
[-- MARK -- Wed Sep 6 00:00:00 2006]
...
Ideas on what to try next?
Bryce
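
For reference, the "delay and then retry" handling of ENETUNREACH and
EHOSTUNREACH described in the quoted message can be sketched in user
space as below. This is only an illustration of the idea, not the
attached patch and not the kernel RPC code. The helper name
call_with_retry, its parameters, and the attempt limit are all
assumptions made for the example; the quoted message does not say
whether retries should be bounded.

```python
import errno
import time

# Errors treated as transient, following the quoted discussion.
TRANSIENT_ERRNOS = {errno.ENETUNREACH, errno.EHOSTUNREACH}


def call_with_retry(op, max_attempts=5, delay=1.0, sleep=time.sleep):
    """Call op(); on ENETUNREACH/EHOSTUNREACH wait, then retry.

    Hypothetical helper for illustration only. The attempt limit is an
    assumption so that the caller eventually sees the error.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return op()
        except OSError as exc:
            # Other errors, and the last failed attempt, go to the caller.
            if exc.errno not in TRANSIENT_ERRNOS or attempt == max_attempts:
                raise
            sleep(delay)


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky():
        # Simulated operation: unreachable twice, then succeeds.
        state["calls"] += 1
        if state["calls"] < 3:
            raise OSError(errno.ENETUNREACH, "Network is unreachable")
        return "ok"

    print(call_with_retry(flaky, delay=0.01))
```

The sleep function is passed in as a parameter so the delay can be
stubbed out when exercising the retry logic without a real network.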