Hi Daniel,
I agree with your thought, but the implementation of FSAL_CEPH seems to differ from what
you describe. Here is its design:
https://www.mankier.com/8/ganesha-rados-cluster-design
I also checked the code. When one of the servers enters a grace period, it calls
rados_cluster_read_clids() to read the cluster-wide variables C and R. Only if R == 0 does
the whole cluster enter a grace period. If so, the cluster enters a grace period on reboot,
not on crash. There also seems to be no interface for pacemaker/corosync to get this
information and put the cluster into a grace period.
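
To check that I am reading the C/R logic correctly, here is a rough Python model of it. This
is only a sketch of my understanding of the linked design document, not the actual Ganesha
code: the names GraceDB, start_grace and finish_recovery are made up for illustration, and
the update rule (R = C, then C = C + 1 when R == 0) is my assumption about how a new grace
period begins.

```python
from dataclasses import dataclass, field


@dataclass
class GraceDB:
    """Toy stand-in for the shared cluster state (hypothetical, not Ganesha's struct)."""
    cur: int = 1                              # C: current epoch
    rec: int = 0                              # R: recovery epoch, 0 = no grace period
    need: set = field(default_factory=set)    # nodes that still need recovery


def start_grace(db: GraceDB, node: str) -> bool:
    """Node asks for a grace period; returns True if a new cluster-wide one began."""
    started = False
    if db.rec == 0:
        # No grace period in force: begin one (assumed rule: R = C, C = C + 1).
        db.rec = db.cur
        db.cur += 1
        started = True
    # Either way, this node now has clients to recover.
    db.need.add(node)
    return started


def finish_recovery(db: GraceDB, node: str) -> None:
    """Node finished recovery; the last one to finish lifts the grace period."""
    db.need.discard(node)
    if not db.need:
        db.rec = 0


db = GraceDB()
print(start_grace(db, "nodeA"), db.cur, db.rec)   # first node starts the grace period
print(start_grace(db, "nodeB"), db.cur, db.rec)   # second node joins the existing one
```

In this model the only trigger is a server calling start_grace() as it comes up, which is
why I do not see where an external cluster manager could hook in.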
Thanks,
Marvin