I've been experimenting with an HA NFS configuration using Pacemaker and nfs-ganesha.
After a failover event, clients take about five minutes to recover, and that delay
appears to be independent of the Lease_Lifetime and Grace_Period settings. Client
recovery also doesn't line up with the "NFS Server Now NOT IN GRACE" message in
ganesha.log. Is this normal behavior?
My pacemaker configuration looks like:
Full List of Resources:
  * Resource Group: nfs:
    * nfsd (systemd:nfs-ganesha): Started nfs2.storage
    * nfs_vip (ocf::heartbeat:IPaddr2): Started nfs2.storage
And the ganesha configuration looks like:
NFS_CORE_PARAM
{
    Enable_NLM = false;
    Enable_RQUOTA = false;
    Protocols = 4;
}
NFSv4
{
    RecoveryBackend = rados_ng;
    Minor_Versions = 1,2;
    # From https://www.suse.com/support/kb/doc/?id=000019374
    Lease_Lifetime = 10;
    Grace_Period = 20;
}
MDCACHE {
    # Size the dirent cache down as small as possible.
    Dir_Chunk = 0;
}
EXPORT
{
    Export_ID = 100;
    Protocols = 4;
    Transports = TCP;
    Path = /;
    Pseudo = /data;
    Access_Type = RW;
    Attr_Expiration_Time = 0;
    Squash = none;
    FSAL {
        Name = CEPH;
        Filesystem = "tank";
        User_Id = "nfs";
    }
}
RADOS_KV
{
    UserId = "nfsmeta";
    pool = "cephfs.tank.meta";
    namespace = "ganesha";
}
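In case it helps anyone reproduce the timing, the client-side stall can be measured with a small probe along these lines. This is only a rough sketch, not the exact method behind the five-minute figure above: the function names `timed_write` and `watch` are made up for illustration, and it assumes a hard mount, so a write simply blocks until the server is reachable again instead of returning an error.

```python
import os
import time


def timed_write(path):
    """Write and fsync a small file; return how long the call took in seconds."""
    start = time.monotonic()
    with open(path, "w") as f:
        f.write(str(time.time()))
        f.flush()
        os.fsync(f.fileno())  # force the data out to the server, not just the page cache
    return time.monotonic() - start


def watch(path, interval=1.0, threshold=5.0, count=None):
    """Probe `path` repeatedly and collect every write slower than `threshold`.

    Returns a list of (wall-clock time when the write finished, elapsed seconds).
    With count=None the loop runs until interrupted.
    """
    stalls = []
    n = 0
    while count is None or n < count:
        elapsed = timed_write(path)
        if elapsed >= threshold:
            stamp = time.strftime("%H:%M:%S")
            print(f"{stamp} write blocked for {elapsed:.1f}s")
            stalls.append((stamp, elapsed))
        time.sleep(interval)
        n += 1
    return stalls
```

Running something like `watch("/mnt/data/.probe")` on a client (the path is a placeholder for a file on the NFS mount) while triggering a failover prints the time at which I/O resumed and how long it was blocked, which can then be compared against the grace messages in ganesha.log.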