[NFS-Ganesha-Support] unavailability of NFSv3

Sunday, 7 October 2018

Hi all,

We are running nfs-ganesha as part of the IBM Spectrum Scale protocol setup, currently
with 2 servers and around 50 clients.
After a couple of months of smooth operation, we have started to hit a critical issue
(already twice in three days): none of the clients is able to mount the filesystem.

When the issue happens this is what we see via rpcinfo on the server:

[root@server ~]# rpcinfo -s
   program version(s) netid(s)                         service     owner
    100000  2,3,4     local,udp,tcp,udp6,tcp6          portmapper  superuser
    100024  1         tcp6,udp6,tcp,udp                status      29
    100003  4,3       tcp6,tcp,udp6,udp                nfs         superuser
    100005  3,1       tcp6,tcp,udp6,udp                mountd      superuser
    100021  4         tcp6,tcp,udp6,udp                nlockmgr    superuser
    100011  2,1       tcp6,tcp,udp6,udp                rquotad     superuser
[root@host ~]# rpcinfo -T tcp localhost 100003 3
rpcinfo: RPC: Timed out
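
To tell a refused or hanging TCP connection apart from a connection that is accepted but never answered, a NULL call can also be sent by hand. The following is only a minimal sketch: it assumes NFS is listening on the standard port 2049 (rpcinfo would ask the portmapper instead), and the names build_null_call and probe are made up for this example.

```python
import socket
import struct

def build_null_call(xid, prog, vers):
    """Build an ONC RPC NULL call (procedure 0) with AUTH_NONE, framed for TCP."""
    # Call header: xid, msg_type CALL (0), RPC version 2, program, version,
    # procedure 0, then an empty AUTH_NONE credential and verifier
    # (flavor 0, length 0 each).
    body = struct.pack(">10I", xid, 0, 2, prog, vers, 0, 0, 0, 0, 0)
    # TCP record marking: high bit marks the last fragment,
    # the low 31 bits carry the fragment length.
    return struct.pack(">I", 0x80000000 | len(body)) + body

def probe(host, port=2049, prog=100003, vers=3, timeout=5.0):
    """Send a NULL call and report roughly what happened.

    Port 2049 is an assumption; 100003 version 3 is NFSv3 as in the
    rpcinfo output above.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(build_null_call(1, prog, vers))
            reply = sock.recv(4096)
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"
    # A complete reply carries at least the record mark, xid and msg_type.
    return "replied" if len(reply) >= 12 else "short reply"
```

Calling probe("localhost") on the affected server should return "timeout" if the daemon accepts the connection but no longer services requests, and an "error: ..." string if the connection itself fails.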

The log level is set to EVENT, and when the issue starts ganesha.log fills up with lines
like the following:

2018-10-04 16:40:45 : epoch 0002003d : server : gpfs.ganesha.nfsd-40452[State_Async]
nlm_send_async :NLM :MAJ :Cannot create NLM async tcp connection to client
::ffff:129.129.117.65
2018-10-04 16:40:45 : epoch 0002003d : server: gpfs.ganesha.nfsd-40452[State_Async]
nlm4_send_grant_msg :NLM :MAJ :GRANTED_MSG RPC call failed with return code -1. Removing
the blocking lock
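
To see whether these failures come from one client or from many, the log can be tallied per client address. This is a rough sketch that assumes the message text is exactly as in the excerpt above; count_nlm_failures is a name made up for this example.

```python
import re
from collections import Counter

# Message text copied from the excerpt above; \s+ also tolerates a line
# wrap between "client" and the address.
NLM_FAIL = re.compile(
    r"Cannot create NLM async tcp connection to client\s+(\S+)"
)

def count_nlm_failures(log_text):
    """Return a Counter mapping client address to failed NLM callbacks."""
    return Counter(NLM_FAIL.findall(log_text))
```

Reading ganesha.log into a string and passing it to count_nlm_failures(), then calling most_common() on the result, lists the clients with the most failed callbacks first.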

The nfs-ganesha version is 2.5.3, but it is the build shipped by IBM, so I am not sure
how it differs from upstream.

I was wondering if someone on the mailing list had an idea of what direction to take to
investigate this further.

Thanks,
Ivano
