Peter Schwenke has uploaded this change for review.
NLM: Fix NFS rpc.statd-related hang
Fixes Issue #680
This implements Frank Filz's suggestion in comment
https://github.com/nfs-ganesha/nfs-ganesha/issues/680#issuecomment-2664319185
When the SM_MON and SM_UNMON RPC calls were made to statd
and the threads were exhausted, nfs-ganesha was hanging
on the ssc_mutex lock.
ssc_monitored has been changed from a boolean to an enum
{UNMONITORED, ATTEMPTING, MONITORED}
nsm_monitor_noretry()/nsm_unmonitor_noretry() first check
ssc_monitored to see if a monitor/unmonitor is being
attempted. If so, we sleep for 1 second and retry up to 3
times.
Then we check if ssc_monitored is already in the state we
want, i.e. UNMONITORED/MONITORED. If so, we bail.
Otherwise, we set ssc_monitored to ATTEMPTING and try
the RPC call.
On RPC success, we flip ssc_monitored to UNMONITORED/MONITORED.
On error, we set it back to its previous value, i.e.
MONITORED/UNMONITORED.
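
The sequence described above can be sketched roughly as follows. This is
an illustrative sketch, not the code in the patch: only ssc_mutex,
ssc_monitored and the three enum values come from this message. The
struct, the sketch_* names, the RPC callback, the choice to drop the
mutex while sleeping and while the RPC is in flight, and returning
failure once the retries are used up are all assumptions.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <unistd.h>

/* Tri-state replacement for the old boolean. */
enum ssc_monitored_state { UNMONITORED, ATTEMPTING, MONITORED };

/* Hypothetical stand-in for the NSM client record; only the two
 * field names are taken from the commit message. */
struct sketch_nsm_client {
    pthread_mutex_t ssc_mutex;
    enum ssc_monitored_state ssc_monitored;
};

#define SKETCH_MAX_WAITS 3 /* "retry up to 3 times" */

/* Hypothetical hook standing in for the SM_MON/SM_UNMON RPC call. */
typedef bool (*sketch_rpc_fn)(void);

bool sketch_rpc_ok(void) { return true; }    /* stub: RPC succeeds */
bool sketch_rpc_fail(void) { return false; } /* stub: RPC fails */

/* Move the client toward `want` (MONITORED or UNMONITORED). */
bool sketch_set_monitored(struct sketch_nsm_client *c,
                          enum ssc_monitored_state want,
                          sketch_rpc_fn rpc)
{
    enum ssc_monitored_state prev;
    bool ok;
    int waits = 0;

    assert(want != ATTEMPTING);

    pthread_mutex_lock(&c->ssc_mutex);

    /* Another thread is mid-RPC: back off without holding the lock. */
    while (c->ssc_monitored == ATTEMPTING) {
        pthread_mutex_unlock(&c->ssc_mutex);
        if (waits++ == SKETCH_MAX_WAITS)
            return false; /* assumption: give up after 3 waits */
        sleep(1);
        pthread_mutex_lock(&c->ssc_mutex);
    }

    /* Already in the wanted state: nothing to do. */
    if (c->ssc_monitored == want) {
        pthread_mutex_unlock(&c->ssc_mutex);
        return true;
    }

    /* Claim the transition, then release the lock for the RPC
     * (assumption) so other threads do not block on ssc_mutex. */
    prev = c->ssc_monitored;
    c->ssc_monitored = ATTEMPTING;
    pthread_mutex_unlock(&c->ssc_mutex);

    ok = rpc();

    /* Success: commit the new state. Failure: restore the old one. */
    pthread_mutex_lock(&c->ssc_mutex);
    c->ssc_monitored = ok ? want : prev;
    pthread_mutex_unlock(&c->ssc_mutex);

    return ok;
}
```

In this sketch a monitor request would be
sketch_set_monitored(&client, MONITORED, rpc) and an unmonitor request
would pass UNMONITORED; a failed RPC leaves ssc_monitored at the value
it had before the attempt.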
Change-Id: I0f0814842d806132e6e4441fe0956236e5d914c5
Signed-off-by: Peter Schwenke <pschwenke@ddn.com>
---
M src/Protocols/NLM/nsm.c
M src/SAL/nlm_owner.c
M src/include/sal_data.h
3 files changed, 56 insertions(+), 19 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/44/1214144/1
To view, visit change 1214144.