Hello,
In our environment (Ceph cluster version 15.2.7) we are trying to use NFS in HA mode and are facing the issues described below:
"Active/Passive HA NFS Cluster"
When using an Active/Passive HA configuration for the NFS server with Corosync/Pacemaker:
1. The configuration is done and we are able to perform fail-over, but when the active node is tested with a power-off, the following is observed:
   1.1: I/O operations get stuck until the node is powered back on, even though the handover from the active node to the standby node happens immediately after the power-off. All in-flight requests remain stuck.
   1.2: From another client, checking the "heartbeat" of the mount point (see the quick check sketched below this list) is also stuck for the same duration.
   1.3: From a new client, creating a new mount to the same subvolume works fine.
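For reference, the "heartbeat" check in 1.2 is simply a timed metadata operation against the NFS mount from a second client; a minimal sketch (mount path as in the client mount command at the end of this mail, the 5-second timeout is an arbitrary choice):

# Probe the NFS mount point from another client; if stat does not return
# within 5 seconds (SIGKILL after a further 1s), treat the mount as stuck.
if timeout -k 1 5 stat /mnt/nfsconf > /dev/null 2>&1; then
    echo "mount responsive"
else
    echo "mount stuck"
fi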
Issues/Concern:
I/O operations should resume right after the failover happens, but we are not able to achieve this. Can anyone please point us to any known configuration/solution/workaround at the NFS-Ganesha level that would get us to a healthy NFS HA mode?
Just a note:
Mount points using Ceph's native CephFS driver work fine in the same shutdown/power-off scenarios.
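For comparison, the native mounts in those tests were plain kernel CephFS mounts along these lines (the monitor address, mount point and secret file below are only examples, not our actual values):

# Kernel CephFS mount of the same subvolume used for the comparison test;
# 10.0.4.11 is a placeholder MON address and the secretfile path is an example.
sudo mount -t ceph 10.0.4.11:6789:/volumes/hns/conf/bb21b7c7-c663-40e9-ad11-a61441e6f77f /mnt/cephconf \
    -o name=admin,secretfile=/etc/ceph/admin.secret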
Ceph version: 15.2.7
NFS-Ganesha: 3.3
Ganesha Conf:
- NFS Node 1:
[ansible@cephnode2 ~]$ cat /etc/ganesha/ganesha.conf
# Please do not change this file directly since it is managed by Ansible and will be overwritten
NFS_Core_Param
{
    Enable_NLM = false;
    Enable_RQUOTA = false;
    Protocols = 3,4;
}
EXPORT_DEFAULTS {
    Attr_Expiration_Time = 0;
}
CACHEINODE {
    Dir_Chunk = 0;
    NParts = 1;
    Cache_Size = 1;
}
RADOS_URLS {
    ceph_conf = '/etc/ceph/ceph.conf';
    userid = "admin";
    watch_url = "rados://nfs_ganesha/ganesha-export/conf-cephnode2";
}
NFSv4 {
    RecoveryBackend = 'rados_cluster';
    Lease_Lifetime = 10;
    Grace_Period = 20;
}
RADOS_KV {
    ceph_conf = '/etc/ceph/ceph.conf';
    userid = "admin";
    pool = "nfs_ganesha";
    namespace = "ganesha-grace";
    nodeid = "cephnode2";
}
%url rados://nfs_ganesha/ganesha-export/conf-cephnode2
LOG {
    Facility {
        name = FILE;
        destination = "/var/log/ganesha/ganesha.log";
        enable = active;
    }
}
EXPORT
{
    Export_id = 20235;
    Path = "/volumes/hns/conf/bb21b7c7-c663-40e9-ad11-a61441e6f77f";
    Pseudo = /conf;
    Access_Type = RW;
    Protocols = 3,4;
    Transports = TCP;
    SecType = sys,krb5,krb5i,krb5p;
    Squash = No_Root_Squash;
    Attr_Expiration_Time = 0;
    FSAL {
        Name = CEPH;
        User_Id = "admin";
    }
}
EXPORT
{
    Export_id = 20236;
    Path = "/volumes/hns/opr/138304ca-a70d-4962-9754-b572bce196b6";
    Pseudo = /opr;
    Access_Type = RW;
    Protocols = 3,4;
    Transports = TCP;
    SecType = sys,krb5,krb5i,krb5p;
    Squash = No_Root_Squash;
    Attr_Expiration_Time = 0;
    FSAL {
        Name = CEPH;
        User_Id = "admin";
    }
}
- NFS Node 2:
[ansible@cephnode3 ~]$ cat /etc/ganesha/ganesha.conf
# Please do not change this file directly since it is managed by Ansible and will be overwritten
NFS_Core_Param
{
    Enable_NLM = false;
    Enable_RQUOTA = false;
    Protocols = 3,4;
}
EXPORT_DEFAULTS {
    Attr_Expiration_Time = 0;
}
CACHEINODE {
    Dir_Chunk = 0;
    NParts = 1;
    Cache_Size = 1;
}
RADOS_URLS {
    ceph_conf = '/etc/ceph/ceph.conf';
    userid = "admin";
    watch_url = "rados://nfs_ganesha/ganesha-export/conf-cephnode3";
}
NFSv4 {
    RecoveryBackend = 'rados_cluster';
    Lease_Lifetime = 10;
    Grace_Period = 20;
}
RADOS_KV {
    ceph_conf = '/etc/ceph/ceph.conf';
    userid = "admin";
    pool = "nfs_ganesha";
    namespace = "ganesha-grace";
    nodeid = "cephnode3";
}
%url rados://nfs_ganesha/ganesha-export/conf-cephnode3
LOG {
    Facility {
        name = FILE;
        destination = "/var/log/ganesha/ganesha.log";
        enable = active;
    }
}
EXPORT
{
    Export_id = 20235;
    Path = "/volumes/hns/conf/bb21b7c7-c663-40e9-ad11-a61441e6f77f";
    Pseudo = /conf;
    Access_Type = RW;
    Protocols = 3,4;
    Transports = TCP;
    SecType = sys,krb5,krb5i,krb5p;
    Squash = No_Root_Squash;
    Attr_Expiration_Time = 0;
    FSAL {
        Name = CEPH;
        User_Id = "admin";
    }
}
EXPORT
{
    Export_id = 20236;
    Path = "/volumes/hns/opr/138304ca-a70d-4962-9754-b572bce196b6";
    Pseudo = /opr;
    Access_Type = RW;
    Protocols = 3,4;
    Transports = TCP;
    SecType = sys,krb5,krb5i,krb5p;
    Squash = No_Root_Squash;
    Attr_Expiration_Time = 0;
    FSAL {
        Name = CEPH;
        User_Id = "admin";
    }
}
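For reference, the shared grace database configured in the RADOS_KV blocks above can be inspected during the failover tests with the ganesha-rados-grace tool (from the nfs-ganesha-rados-grace package); a minimal sketch using the pool/namespace from the configs, run from a node with the admin keyring:

# Dump the rados_cluster grace DB: shows the current/recovery epochs and the
# per-node NEED/ENFORCING flags for cephnode2 and cephnode3.
ganesha-rados-grace --userid admin \
    --cephconf /etc/ceph/ceph.conf \
    --pool nfs_ganesha \
    --ns ganesha-grace \
    dump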
## Mount command at the client side:
sudo mount -t nfs -o nfsvers=4.1,proto=tcp 10.0.4.14:/conf /mnt/nfsconf
where 10.0.4.14 is the floating IP.
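For completeness, the floating IP itself is managed by Pacemaker; the resource is roughly equivalent to the following sketch (the resource name, netmask and monitor interval are illustrative, not our exact values):

# Illustrative Pacemaker/pcs definition of the floating IP resource;
# "nfs_vip", cidr_netmask and the monitor interval are example values.
pcs resource create nfs_vip ocf:heartbeat:IPaddr2 \
    ip=10.0.4.14 cidr_netmask=24 \
    op monitor interval=10s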