Hello
We have a shared CephFS volume exported with nfs-ganesha on two nodes.
[root@c51a ~]$ ceph fs status
esx - 2 clients
===
RANK  STATE   MDS              ACTIVITY    DNS  INOS
0     active  esx.c51b.cyxkod  Reqs: 0 /s  33   32
POOL             TYPE      USED  AVAIL
cephfs.esx.meta  metadata  530M  23.4T
cephfs.esx.data  data      148G  23.4T
STANDBY MDS
esx.c51a.hfqjyo
ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable)
[root@c51a ~]$ ceph nfs cluster info
{
    "esx": [
        {
            "hostname": "c51a",
            "ip": [
                "172.16.8.160"
            ],
            "port": 2049
        },
        {
            "hostname": "c51b",
            "ip": [
                "172.16.8.161"
            ],
            "port": 2049
        }
    ]
}
[root@c51a ~]$ ceph nfs export ls esx --detailed
[
    {
        "export_id": 1,
        "path": "/",
        "cluster_id": "esx",
        "pseudo": "/ceph",
        "access_type": "RW",
        "squash": "no_root_squash",
        "security_label": true,
        "protocols": [
            4
        ],
        "transports": [
            "TCP"
        ],
        "fsal": {
            "name": "CEPH",
            "user_id": "esx1",
            "fs_name": "esx",
            "sec_label_xattr": ""
        },
        "clients": []
    }
]
On both nodes, ganesha listens on port 2049 on all IP addresses:
[root@c51a ~]$ netstat -tulnp | grep 2049
tcp6 0 0 :::2049 :::* LISTEN 1511372/ganesha.nfs
udp6 0 0 :::2049 :::* 1511372/ganesha.nfs
We have a service IP balanced between the two nodes with keepalived.
If we mount the NFS export on a Linux client and move the IP from one node to the other, the
client keeps listing the contents of the mount without problems. There is a short delay in ls
when the IP moves, but the client ends up reconnecting.
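For anyone who wants to reproduce the Linux side of this test, here is a rough sketch. The helper name watch_listing, the mount point /mnt/ceph and the vers=4.1 option are assumptions for illustration. The service IP and the /ceph pseudo path are the ones that appear in the ESX log and the export above.

```shell
# Sketch: time each listing of a directory so that the stall during an IP
# failover shows up as a larger elapsed value.
# Usage: watch_listing DIR [COUNT] [INTERVAL_SECONDS]
watch_listing() {
    dir=$1
    count=${2:-10}
    interval=${3:-1}
    i=0
    while [ "$i" -lt "$count" ]; do
        start=$(date +%s)
        if ls "$dir" >/dev/null 2>&1; then
            status=ok
        else
            status=failed
        fi
        end=$(date +%s)
        echo "$(date +%T) ls $status after $((end - start))s"
        i=$((i + 1))
        # Pause between probes, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Assumed invocation on the Linux client (needs root and a reachable server):
#   mount -t nfs -o vers=4.1 172.16.1.222:/ceph /mnt/ceph
#   watch_listing /mnt/ceph 60 1
```

Moving the service IP with keepalived while the loop runs should show one or two listings with a longer elapsed time, after which the times return to normal.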
On ESX, however, adding a new datastore of type NFS works correctly, but when the IP moves the
host loses the datastore and is no longer able to reconnect, giving the following error:
2020-12-09T13:35:05.472Z cpu34:2099634)WARNING: NFS41: NFS41FSAPDNotify:6100: Lost connection to the server 172.16.1.222 mount point nfs_c51, mounted as 39a1079b-e140bb96-0000-000000000000 ("/ceph")
2020-12-09T13:35:05.474Z cpu34:2099632)WARNING: NFS41: NFS41ProcessExidResult:2460: Cluster Mismatch due to different server Major or scope. Probable server bug. Remount data store to access
We do not know whether this is a problem in the Ganesha configuration or in the ESX NFS client
implementation, since a Linux NFS client is able to reconnect to whichever host holds the
service IP.
Thank you for your help.
Regards,
Roberto