[NFS-Ganesha-Support] Re: NFSv3 active-passive cluster with FSAL CephFS and keepalived - Stale file handle after restart and also failover

Thursday, 4 February 2021

On Tue, 2021-02-02 at 18:16 +0000, rainer.stumbaum(a)gmail.com wrote:
...
 Hi Jeff,

 so the minimal config would be like this:
 - CephFS
 - two NFS-Ganesha nodes (node a: 10.20.56.240, node b: 10.20.56.241) with keepalived installed and a VIP of 10.20.56.2
 - One client able to mount NFS shares in that 10.20.56.0/24 network

 Test 1:
 Mount an NFS share:
 mount -o ro,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,acregmin=600,acregmax=600,acdirmin=600,acdirmax=600,hard,nocto,nolock,noacl,proto=tcp,port=2049,timeo=100,retrans=360,sec=sys 10.20.56.2:/vol/test /tmp/nfs-test1

 Failover from a to b by rebooting. That should work.

 Test 2:
 Create a snapshot on the Ceph node:
 cd /vol/test ; mkdir .snap/12345
 Mount the snapshot:
 mount -o ro,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,acregmin=600,acregmax=600,acdirmin=600,acdirmax=600,hard,nocto,nolock,noacl,proto=tcp,port=2049,timeo=100,retrans=360,sec=sys 10.20.56.2:/vol/test/.snap/12345 /tmp/nfs-test2
 ls /tmp/nfs-test2
 ...should work
 Failover from a to b by rebooting
 ls /tmp/nfs-test2 should now fail with ESTALE.

Thanks. I was able to come up with a somewhat simpler reproducer using NFSv4:

- mount the export
- create a snapshot: mkdir /mnt/ganesha/cephfs/.snap/1
- tail -f /mnt/ganesha/cephfs/.snap/1/testfile
- kill ganesha and restart it

...at that point, the tail command gets back ESTALE when it tries to
recover the stateid. I have a PR that adds a new interface to libcephfs:

    https://github.com/ceph/ceph/pull/39294

...once that's merged I'll submit the corresponding patch for ganesha
(it's pretty small):

https://github.com/jtlayton/nfs-ganesha/commit/6f297bc33062de841c7c43e6be...

Feel free to test these out if you're able to hand-build ceph and
ganesha, otherwise they should trickle out to distros eventually.
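
When testing a build, it may help to check for the error programmatically
after each restart or failover instead of reading the output of ls. The
following is only a minimal Python sketch: the helper name check_stale is
hypothetical, and it assumes that listing the directory actually reaches the
server (with long attribute-cache timeouts such as acdirmin=600, a cached
result could hide the problem for a while).

```python
import errno
import os


def check_stale(path):
    """Return True if listing `path` fails with ESTALE, False if it succeeds.

    Any other OSError (missing path, permission denied, ...) is re-raised
    so that unrelated failures are not mistaken for a stale handle.
    """
    try:
        os.listdir(path)
    except OSError as exc:
        if exc.errno == errno.ESTALE:
            return True
        raise
    return False


if __name__ == "__main__":
    import sys

    # Usage: python3 check_stale.py /tmp/nfs-test1 /tmp/nfs-test2
    for mount_point in sys.argv[1:]:
        state = "ESTALE" if check_stale(mount_point) else "ok"
        print(f"{mount_point}: {state}")
```

Run it against the mount points once before and once after the failover; a
directory that flips from "ok" to "ESTALE" matches the behaviour described
above.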

Cheers,
-- 
Jeff Layton <jlayton(a)poochiereds.net>
