problem during configuring High Availability NFS Server
Sadananda Tripathy
tripathy at noida.atrenta.com
Tue Mar 13 04:33:51 EDT 2007
Hi,
I have run into a problem while configuring a High Availability (DRBD +
Heartbeat) NFS server.
*Problem in brief:*
When the active NFS server goes down, the other server becomes active
(it is an active/passive cluster). New requests for the exported area
work fine, but the clients that were connected to the previous server
get the error "Stale NFS file handle".
When I then take down the currently active server and make the previous
server active again, the clients that were getting the "Stale NFS file
handle" error start working fine, but the clients that were connected
to the other server start getting the "Stale NFS file handle" error.
*My setup is as below:*
OS: Red Hat Enterprise Linux 4 (Update 1) [kernel 2.6.9-11.ELsmp]
DRBD version: drbd-0.7.23
Heartbeat: heartbeat-2.0.8
NFS: nfs-utils-1.0.6-46
DRBD volume: /dev/drbd0 mounted on /home
Heartbeat configuration file:
# cat /usr/local/etc/ha.d/haresources
drbd2 drbddisk::r0 Filesystem::/dev/drbd0::/home::xfs 192.168.2.209 nfs nfslock
DRBD with Heartbeat is working fine (i.e. when one node goes down, the
other node becomes active, mounts the DRBD volume on /home, and sets
the service IP address 192.168.2.209 on eth0:0).
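A minimal sketch of how the failover state described above could be
checked on whichever node is currently active. The helper name
is_mounted_on is hypothetical, and its optional third argument (a mount
table file, defaulting to /proc/mounts) is only an assumption added so
the check can be tried against sample data.

```shell
#!/bin/sh
# Hypothetical helper: succeed if device $1 is mounted on mount point $2.
# $3 is an optional mount table file in /proc/mounts format.
is_mounted_on() {
    awk -v dev="$1" -v mp="$2" \
        '$1 == dev && $2 == mp { found = 1 } END { exit !found }' \
        "${3:-/proc/mounts}"
}

# Device and mount point are the ones from the setup above.
if is_mounted_on /dev/drbd0 /home; then
    echo "/dev/drbd0 is mounted on /home"
else
    echo "/dev/drbd0 is NOT mounted on /home"
fi
```

On the active node this would typically be combined with
"ip addr show eth0" to confirm that 192.168.2.209 is configured, and
with "cat /proc/drbd" to confirm the DRBD role.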
*Problem description in detail:*
I have moved the nfs directory from /var/lib to /home
(# cd /var/lib; mv nfs /home/)
and made a symbolic link to it at /var/lib
(# cd /var/lib; ln -s /home/nfs nfs).
As /home is on /dev/drbd0 (i.e. it is kept in sync on both nodes),
/var/lib/nfs is the same on both nodes.
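The layout described above could be verified on each node with a sketch
like the following. link_points_to is a hypothetical helper name; it
only compares the link target string reported by readlink and does not
inspect the contents of the directory.

```shell
#!/bin/sh
# Hypothetical helper: succeed if $1 is a symbolic link whose target is $2.
link_points_to() {
    [ -L "$1" ] && [ "$(readlink "$1")" = "$2" ]
}

# Run on drbd1 and on drbd2; the paths are the ones used in this setup.
if link_points_to /var/lib/nfs /home/nfs; then
    echo "/var/lib/nfs -> /home/nfs"
else
    echo "/var/lib/nfs is not a symlink to /home/nfs"
fi
```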
I have two nodes, named drbd1 and drbd2. Both have the same
configuration as above.
On drbd1:
[root@drbd1 ~]# df -lh
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             487M  158M  304M  35% /
none                  501M     0  501M   0% /dev/shm
/dev/sda5             2.0G   36M  1.8G   2% /tmp
/dev/sda3             8.7G  3.8G  4.5G  46% /usr
/dev/sda6             2.0G   92M  1.8G   5% /var
/dev/drbd0             60G  3.7M   60G   1% /home
[root@drbd1 ~]# cd /var/lib
[root@drbd1 lib]# ls -l nfs
lrwxrwxrwx 1 root root 9 Mar 9 14:23 nfs -> /home/nfs
[root@drbd1 lib]# ls -l nfs/
total 12
-rw-r--r-- 1 root root 139 Mar 13 12:46 etab
-rw-r--r-- 1 root root 98 Mar 13 04:03 rmtab
drwxr-xr-x 7 root root 0 Mar 13 12:46 rpc_pipefs
-rw-r--r-- 1 root root 0 Mar 9 20:01 stat
drwx------ 4 rpcuser rpcuser 40 Nov 30 2004 statd
-rw------- 1 root root 0 Nov 30 2004 state
-rw-r--r-- 1 root root 0 Nov 30 2004 xtab
From a client, I mount the /home area of the cluster (while drbd1 is
active):
# mount 192.168.2.209:/home /mnt
# ls -l /mnt
total 4
drwxr-xr-x 2 root root 47 Mar 7 18:14 ldap
drwxr-xr-x 4 root root 92 Mar 13 12:46 nfs
drwxrwxrwx 3 root root 25 Mar 9 20:40 test
Now I take down the active cluster server (drbd1), and the other server
(drbd2) becomes active.
[root@drbd2 src]# df -lh
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             587M  207M  351M  38% /
none                  501M     0  501M   0% /dev/shm
/dev/sda5             2.0G   36M  1.8G   2% /tmp
/dev/sda3             8.7G  5.9G  2.4G  72% /usr
/dev/sda6             2.0G  202M  1.7G  11% /var
/dev/drbd0             60G  3.7M   60G   1% /home
[root@drbd2 src]# cd /var/lib
[root@drbd2 lib]# ls -l nfs
lrwxrwxrwx 1 root root 9 Mar 9 13:42 nfs -> /home/nfs
[root@drbd2 lib]# ls -l nfs/
total 12
-rw-r--r-- 1 root root 139 Mar 13 13:04 etab
-rw-r--r-- 1 root root 98 Mar 13 04:03 rmtab
drwxr-xr-x 7 root root 0 Mar 8 19:12 rpc_pipefs
-rw-r--r-- 1 root root 0 Mar 9 20:01 stat
drwx------ 4 rpcuser rpcuser 40 Nov 30 2004 statd
-rw------- 1 root root 0 Nov 30 2004 state
-rw-r--r-- 1 root root 0 Nov 30 2004 xtab
But when I try to access the mounted area from the client, I get the
following error:
# ls -l /mnt
ls: /mnt: Stale NFS file handle
Please note that if I mount the area again (i.e. # mount
192.168.2.209:/home /mnt), it works fine.
Alternatively, if I take down the drbd2 server, drbd1 becomes active
again and the client can continue using the old mount as well:
# ls -l /mnt
total 4
drwxr-xr-x 2 root root 47 Mar 7 18:14 ldap
drwxr-xr-x 4 root root 92 Mar 13 12:46 nfs
drwxrwxrwx 3 root root 25 Mar 9 20:40 test
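The client-side test above could be repeated during a failover with a
sketch like the following, so that the moment the mount goes stale (or
recovers) is visible. probe_mount is a hypothetical helper name, and
matching on the English error text is an assumption (hence LC_ALL=C);
it accepts both "Stale NFS file handle" and the shorter "Stale file
handle" wording.

```shell
#!/bin/sh
# Hypothetical helper: classify access to directory $1 as ok, stale or error.
probe_mount() {
    # Capture only stderr of ls; its exit status decides the branch.
    if err=$(LC_ALL=C ls "$1" 2>&1 >/dev/null); then
        echo ok
    else
        case "$err" in
            *[Ss]tale*"file handle"*) echo stale ;;
            *)                        echo error ;;
        esac
    fi
}

# Print one timestamped status line for the client mount point.
echo "$(date '+%H:%M:%S') /mnt $(probe_mount /mnt)"
```

Running the last line once per second in a loop while one server is
taken down gives a simple timeline of the client's view of the mount.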
If anyone has any clue about the above problem, please help me.
Thanks and Regards,
Sadananda