We are running nfs-ganesha as part of the IBM Spectrum Scale protocol setup, currently with 2 servers and around 50 clients.
After a couple of months of smooth operation, we have hit a critical issue twice in the last three days: none of the clients
can mount the filesystem.
When the issue occurs, this is what rpcinfo shows on the server:
[root@server ~]# rpcinfo -s
   program version(s) netid(s)                         service     owner
    100000  2,3,4     local,udp,tcp,udp6,tcp6          portmapper  superuser
    100024  1         tcp6,udp6,tcp,udp                status      29
    100003  4,3       tcp6,tcp,udp6,udp                nfs         superuser
    100005  3,1       tcp6,tcp,udp6,udp                mountd      superuser
    100021  4         tcp6,tcp,udp6,udp                nlockmgr    superuser
    100011  2,1       tcp6,tcp,udp6,udp                rquotad     superuser
[root@host ~]# rpcinfo -T tcp localhost 100003 3
rpcinfo: RPC: Timed out
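To pin down when the server stops answering, the same probe could be run periodically and its state changes timestamped, so they can be lined up with ganesha.log. The following is only a minimal sketch: the names `probe` and `watch`, the 15-second deadline, and the 30-second interval are illustrative choices, and it assumes that rpcinfo exits with a non-zero status when the call times out.

```python
import subprocess
import time
from datetime import datetime

# Same probe as in the session above: NFS program 100003, version 3, over TCP.
DEFAULT_PROBE = ["rpcinfo", "-T", "tcp", "localhost", "100003", "3"]


def probe(cmd=DEFAULT_PROBE, timeout=15.0):
    """Run the probe once; return True if it exits with status 0 in time."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (subprocess.TimeoutExpired, OSError):
        # Hung past our own deadline, or the binary could not be started.
        return False
    return result.returncode == 0


def watch(interval=30.0, cmd=DEFAULT_PROBE):
    """Print a timestamped line whenever the probe result changes."""
    last = None
    while True:
        ok = probe(cmd)
        if ok != last:
            state = "responding" if ok else "NOT responding"
            print(f"{datetime.now():%Y-%m-%d %H:%M:%S} NFSv3/tcp {state}", flush=True)
            last = ok
        time.sleep(interval)
```

Calling `watch()` on each protocol node and redirecting its output to a file would give a rough timeline of when NFSv3 over TCP stopped answering on that node.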
The log level is set to EVENT, and when the issue starts, ganesha.log fills up with lines like the following:
2018-10-04 16:40:45 : epoch 0002003d : server : gpfs.ganesha.nfsd-40452[State_Async] nlm_send_async :NLM :MAJ :Cannot create NLM async tcp connection to client ::ffff:126.96.36.199
2018-10-04 16:40:45 : epoch 0002003d : server: gpfs.ganesha.nfsd-40452[State_Async] nlm4_send_grant_msg :NLM :MAJ :GRANTED_MSG RPC call failed with return code -1. Removing the blocking lock
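Since the log fills with these messages, a per-client tally may help show whether the failures point at a few clients or at all of them. This is a rough sketch rather than a ganesha tool: the function name `summarize_nlm_failures` is made up for illustration, and the pattern is derived only from the two sample lines above, so it may need adjusting for other message variants.

```python
import re
from collections import Counter

# Matches the "Cannot create NLM async tcp connection" lines shown above.
# Group 1: timestamp, group 2: client address as logged.
NLM_FAIL = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) .*"
    r"Cannot create NLM async tcp connection to client (\S+)"
)


def summarize_nlm_failures(lines):
    """Return (per-client failure counts, first timestamp, last timestamp)."""
    counts = Counter()
    first = last = None
    for line in lines:
        m = NLM_FAIL.search(line)
        if not m:
            continue  # not an NLM connection failure line
        stamp, client = m.groups()
        counts[client] += 1
        if first is None:
            first = stamp
        last = stamp
    return counts, first, last
```

It can be fed an open file object for ganesha.log (the path depends on the installation), and `counts.most_common(10)` then lists the clients with the most failed NLM callbacks.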
The nfs-ganesha version is 2.5.3, but it is the IBM build, so I am not sure how it differs from upstream.
Does anyone on the mailing list have an idea of which direction to take to investigate this further?
I’m running into an interesting problem, and since I haven’t seen anything similar on the mailing list or in the issues list, I thought I would ask whether there’s something I’m missing. Here’s my setup:
1. CentOS 7 server running GlusterFS 4.1.4 and NFS-Ganesha 2.6.3
2. Windows 10 Pro system with the NFS client installed. It mounts the external NFS share (v3), and I have changed the UID and GID in the registry to match the UID for the share.
With this setup, I’m able to copy most files from the Windows system to the share without a problem. However, if the file on Windows has the ‘Read-Only’ attribute set, the copy fails with a permission denied error, and the Wireshark trace shows an NFS3ERR_ACCES error. Is there something I can set in the NFS-Ganesha config file to resolve this issue?
Group Lead / Principal Software Engineer
The MITRE Corporation
Phone: (781)271-2517 Email: srjones(a)mitre.org