Hi Kaleb,
Setting the grace period does not help in the active/active setup. We tried
setting Grace_Period = 20; but it has no effect on the problem.
Setup:
3-node Ceph cluster:
Node1: ceph(mon/mds/rgw/osd)
Node2: ceph(mon/mds/rgw/osd)+NFS(Active)
Node3: ceph(mon only)+NFS(Active)
Issue:
I/O operations do not resume after an NFS node is powered off.
If Node2 (see above) is powered off, I/O remains stuck until the node is powered on again.
If Node3 is powered off, it takes around 5 minutes for I/O to resume.
---
Yes, I see the entries in ganesha.log for entering and leaving NFS grace:
14/04/2021 17:50:17 : epoch 6076ddcc : cephnode2 : ganesha.nfsd-1769[main] nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 20
14/04/2021 17:50:23 : epoch 6076ddcc : cephnode2 : ganesha.nfsd-1769[main] rados_cluster_grace_enforcing :CLIENT ID :EVENT :rados_cluster_grace_enforcing: ret=0
14/04/2021 17:50:23 : epoch 6076ddcc : cephnode2 : ganesha.nfsd-1769[main] nfs_start :NFS STARTUP :EVENT :-------------------------------------------------
14/04/2021 17:50:34 : epoch 6076ddcc : cephnode2 : ganesha.nfsd-1769[reaper] nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE
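To put a number on the grace window in the log above, here is a small Python sketch. It is only an illustration and not part of Ganesha: the function names are made up, and it assumes each log line starts with a 19-character timestamp in DD/MM/YYYY HH:MM:SS format, as in the excerpt.

```python
from datetime import datetime

# The "IN GRACE" and "NOT IN GRACE" entries, copied from the ganesha.log excerpt above.
LOG_LINES = [
    "14/04/2021 17:50:17 : epoch 6076ddcc : cephnode2 : ganesha.nfsd-1769[main] "
    "nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 20",
    "14/04/2021 17:50:34 : epoch 6076ddcc : cephnode2 : ganesha.nfsd-1769[reaper] "
    "nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE",
]


def parse_timestamp(line):
    """Parse the leading timestamp (assumed format: DD/MM/YYYY HH:MM:SS)."""
    return datetime.strptime(line[:19], "%d/%m/%Y %H:%M:%S")


def grace_seconds(lines):
    """Return the seconds between entering and leaving grace (hypothetical helper)."""
    start = end = None
    for line in lines:
        # Check "NOT IN GRACE" first, because it also contains "IN GRACE".
        if "NOT IN GRACE" in line:
            end = parse_timestamp(line)
        elif "IN GRACE" in line:
            start = parse_timestamp(line)
    if start is None or end is None:
        raise ValueError("log does not contain both grace start and grace end")
    return (end - start).total_seconds()


print(grace_seconds(LOG_LINES))  # 17.0
```

So on this start the server left grace after about 17 seconds, which is within the configured 20 and far shorter than the roughly 5 minutes it takes for I/O to resume.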
Currently we are using the pcs cluster only to fail over the IP resource. In the
active/passive case we set: pcs property set stonith-enabled=false
Is there a standard configuration for this?