state recovery failed on NFSv4 server with error 2

Thomas Garner thomas536 at gmail.com
Mon Mar 17 01:00:48 EDT 2008


I have a Nexenta file server (1.0RC2, b80) serving my home
directory over NFSv4 to several Linux machines running Debian.  On my
main machine (currently the stock Debian kernel 2.6.18-6-k7, though I
saw the same problem under 2.6.23), after anywhere from 3 minutes to
several days, the load average starts creeping up as the following
message fills syslog at 3k per second:

Error: state recovery failed on NFSv4 server 192.168.0.10 with error 2

Searching the web has shed little light on this issue.  While it is
repeatable (mostly with a combination of Firefox and jEdit), I have yet
to find a surefire way to trigger it or to make it stop.  After
tweaking the NFS mount options, I can now force-unmount the NFS mount
with some luck, but that is only the tiniest band-aid, as the mount
normally falls back quickly into the failed recovery state (though it
does tend to stave off a hard reboot a little longer).  It's getting
exceptionally frustrating to have to kill X and every process with an
open file on the mount, or worse, to have to do an emergency
sync/remount ro/reboot.

Looking forward to debugging and resolving,
Thomas

