[NFS-Ganesha-Support] HA failover in less than five minutes?

Friday, 16 July 2021

I've been experimenting with an HA NFS configuration using pacemaker and nfs-ganesha.
I've noticed that after a failover event, it takes about five minutes for clients to
recover, and that seems to be independent of the settings of Lease_Lifetime and
Grace_Period. Client recovery also doesn't seem to correspond to the ":NFS Server
Now NOT IN GRACE" message in ganesha.log. Is this normal behavior?
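
For anyone who wants to reproduce the "about five minutes" figure, here is a minimal
sketch of how the client-side stall could be timed. It is not part of my setup: the
mount point /mnt/data, the probe file name, and the helper names are hypothetical, and
it assumes a hard mount, where a write plus fsync blocks until the server answers again.

```python
import os
import time


def longest_gap(timestamps):
    """Return the largest interval between consecutive successful probes."""
    return max((b - a for a, b in zip(timestamps, timestamps[1:])), default=0.0)


def probe_mount(directory, count, interval=1.0):
    """Write and fsync a small file `count` times; record when each write completes.

    The fsync is there so each probe has to reach the server instead of being
    satisfied from the client's cache.
    """
    completed = []
    probe_file = os.path.join(directory, ".failover_probe")  # hypothetical name
    for _ in range(count):
        try:
            with open(probe_file, "w") as handle:
                handle.write(str(time.time()))
                handle.flush()
                os.fsync(handle.fileno())
            completed.append(time.monotonic())
        except OSError:
            # A soft mount may return an error instead of blocking; skip it.
            pass
        time.sleep(interval)
    return completed


if __name__ == "__main__":
    # Assumed mount point; trigger the failover while this is running.
    stamps = probe_mount("/mnt/data", count=600, interval=1.0)
    print(f"longest stall: {longest_gap(stamps):.1f} s")
```

The longest gap between two completed probes approximates the recovery time seen by
one client, which can then be compared against the grace-period messages in ganesha.log.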

My pacemaker configuration looks like:

    Full List of Resources:
      * Resource Group: nfs:
        * nfsd      (systemd:nfs-ganesha):   Started nfs2.storage
        * nfs_vip   (ocf::heartbeat:IPaddr2):        Started nfs2.storage

And the ganesha configuration looks like:

    NFS_CORE_PARAM
    {
            Enable_NLM = false;
            Enable_RQUOTA = false;
            Protocols = 4;
    }

    NFSv4
    {
            RecoveryBackend = rados_ng;
            Minor_Versions =  1,2;

            # From https://www.suse.com/support/kb/doc/?id=000019374
            Lease_Lifetime = 10;
            Grace_Period = 20;
    }

    MDCACHE {
            # Size the dirent cache down as small as possible.
            Dir_Chunk = 0;
    }

    EXPORT
    {
            Export_ID=100;
            Protocols = 4;
            Transports = TCP;
            Path = /;
            Pseudo = /data;
            Access_Type = RW;
            Attr_Expiration_Time = 0;
            Squash = none;

            FSAL {
                    Name = CEPH;
                    Filesystem = "tank";
                    User_Id = "nfs";
            }
    }

    RADOS_KV
    {
            UserId = "nfsmeta";
            pool = "cephfs.tank.meta";
            namespace = "ganesha";
    }
