The NFS-Ganesha project is happy to announce a new stable version of
Ganesha and nTiRPC: Version 3.4.
This is the latest version in our stable release branch. It contains
over 80 bug fixes and improvements.
We have a shared CephFS volume exported with nfs-ganesha on two nodes.
[root@c51a ~]$ ceph fs status
esx - 2 clients
RANK  STATE           MDS            ACTIVITY      DNS   INOS
 0    active   esx.c51b.cyxkod   Reqs:    0 /s      33     32
      POOL           TYPE      USED   AVAIL
cephfs.esx.meta    metadata    530M   23.4T
cephfs.esx.data      data      148G   23.4T
ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable)
[root@c51a ~]$ ceph nfs cluster info
[root@c51a ~]$ ceph nfs export ls esx --detailed
On both nodes, port 2049 listens on all IPs:
[root@c51a ~]$ netstat -tulnp | grep 2049
tcp6 0 0 :::2049 :::* LISTEN 1511372/ganesha.nfs
udp6 0 0 :::2049 :::* 1511372/ganesha.nfs
We have a service IP balanced between the two nodes with keepalived.
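For reference, the keepalived side is just a single VRRP instance moving the service IP between the two nodes. A minimal sketch of what such an instance looks like on the MASTER node (the interface name, router id, priorities and the /24 prefix are illustrative; 172.16.1.222 from the log below is assumed to be the service IP):

vrrp_instance ganesha_vip {
    state MASTER              # BACKUP on the other node
    interface eth0            # illustrative NIC name
    virtual_router_id 51      # illustrative
    priority 150              # lower value on the other node
    advert_int 1
    virtual_ipaddress {
        172.16.1.222/24       # assumed service IP (the address the clients mount)
    }
}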
If we mount that NFS export on a Linux client and switch the IP from one node to the other, the client keeps listing the contents of the mount without problems; there is a small delay in the ls when the IP moves, but it ends up reconnecting.
However, on ESX, adding a new NFS datastore works correctly, but when the IP switches the host loses the datastore and is no longer able to reconnect, giving the following error:
2020-12-09T13:35:05.472Z cpu34:2099634)WARNING: NFS41: NFS41FSAPDNotify:6100: Lost connection to the server 172.16.1.222 mount point nfs_c51, mounted as 39a1079b-e140bb96-0000-000000000000 ("/ceph")
2020-12-09T13:35:05.474Z cpu34:2099632)WARNING: NFS41: NFS41ProcessExidResult:2460: Cluster Mismatch due to different server Major or scope. Probable server bug. Remount data store to access
We do not know whether this is a problem in the Ganesha configuration or in the ESX NFS client implementation, since a Linux NFS client (i.e. anything other than ESX) is able to reconnect to whichever node holds the service IP.
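The "Cluster Mismatch due to different server Major or scope" message suggests ESX compares the server_owner/server_scope values returned in the NFSv4.1 EXCHANGE_ID reply and sees different values from the two nodes after the IP moves. One way to compare what each node actually returns (just a diagnostic sketch; the interface name and client address are placeholders) is to capture the mount traffic against each node and inspect the EXCHANGE_ID reply in Wireshark:

# capture NFS traffic while mounting from node A, then repeat against node B
tcpdump -i eth0 -s 0 -w node_a_nfs41.pcap host <client-ip> and port 2049
# open the capture in Wireshark and compare the EXCHANGE_ID reply
# (the eir_server_owner and eir_server_scope fields) from each node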
Thank you for your help.
We are using nfs-ganesha with Gluster on a two-node Ubuntu cluster. There are around 3 to 10 NFS clients connecting to nfs-ganesha. When any of the NFS clients starts reading a lot of files on the mounted storage, memory utilization increases very quickly and the memory is not released. It keeps increasing until the OOM killer kills the process or the Linux server reboots.
I have tried a lot of options under MD-Cache but it has not helped. Any suggestions or help to fix this issue would be hugely appreciated. Please let me know if additional information is needed.
Note: the issue at https://github.com/nfs-ganesha/nfs-ganesha/issues/116 is the closest I could find to mine, but I am not sure whether its fix is in the GA released version.
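For context, by MD-Cache options I mean the MDCACHE block in ganesha.conf. A minimal sketch of the kind of settings I have been experimenting with (the values here are illustrative, not what is currently deployed):

MDCACHE {
    # cap the number of cached metadata entries (the default is 100000)
    Entries_HWMark = 50000;
    # run the LRU reaper more often than the 90 second default
    LRU_Run_Interval = 30;
}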
The memory usage on ganesha is as below:

Files read   Memory usage
150K         923508 kB
200K         1418032 kB
250K         2161752 kB
300K         2926580 kB
400K         4457200 kB
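Between the 150K and 400K samples that works out to roughly 14 kB of additional memory per file read ((4457200 - 923508) kB over 250K files), which is why it looks as if cached entries are never released. For completeness, a simple way to collect numbers like these (assuming the metric is the resident set size of the standard ganesha.nfsd process; adjust if you measure differently):

# sample the ganesha.nfsd resident set size (kB) once a minute
while true; do
    ps -o rss= -p "$(pidof ganesha.nfsd)"
    sleep 60
done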
Version info:
NFS-ganesha version - 3.3-ubuntu1~bionic6
Gluster version - 8.2-ubuntu1~bionic1
Ganesha config file:

NFS_CORE_PARAM {
    Protocols = 3, 4;
    mount_path_pseudo = true;
    MNT_Port = 38465;
    NLM_Port = 38467;
    Reaper_Work_Per_Lane = 5000;
}

MDCACHE {
    FD_HWMark_Percent = 60;
}

EXPORT {
    # Export Id (mandatory, each EXPORT must have a unique Export_Id)
    Export_Id = 77;

    Transports = TCP, UDP;

    # Exported path (mandatory)
    Path = "/storage";

    # Pseudo Path (required for NFS v4)
    Pseudo = "/storage";

    # Required for access (default is None)
    # Could use CLIENT blocks instead
    Access_Type = RW;

    # Allow root access
    Squash = No_Root_Squash;

    # Security flavor supported
    SecType = "sys";

    FSAL {
        # Exporting FSAL
        Name = "GLUSTER";
        Hostname = localhost;
        Volume = "storage";
        Up_poll_usec = 10;  # Upcall poll interval in microseconds
        Transport = tcp;    # tcp or rdma
    }
}