Nishant Puri has uploaded this change for review.

SAL: recover gracefully from zombie lock-owner so_related_owner mismatch

When a kernel NFSv4 client crashes or its lease expires while holding a
byte-range lock, the lock-owner ID it generated via the IDA allocator
(ida_simple_get) may be recycled for a completely new process. When that
new process opens the same file and requests a lock, the server calls
create_nfs4_owner() and finds an existing STATE_LOCK_OWNER_NFSV4 entry in
ht_nfs4_owner whose opaque owner bytes match -- but whose so_related_owner
points to the old (now-stale) open owner rather than the new one.

Before this change, the code unconditionally logged a CRIT message and
returned NULL, which propagated to the client as NFS4ERR_RESOURCE.
Because the client retried indefinitely, the file appeared permanently
"zombie locked" from the client's perspective.

The root cause on the client side is IDA-based lock-owner ID recycling
(fixed in upstream Linux kernels by switching to atomic64_inc_return, but
still present in RHEL 8 / kernel 4.18-based deployments).
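
To illustrate the recycling described above, here is a minimal userspace
sketch. It is not kernel code: recycling_id_get(), recycling_id_put()
and monotonic_id_get() are hypothetical stand-ins, assuming an allocator
that hands out the lowest free ID and a plain monotonic counter.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_IDS 64

/* Hypothetical stand-in for an IDA-style allocator: it returns the
 * lowest free ID, so a released ID goes to the next caller. */
static bool id_in_use[MAX_IDS];

static int recycling_id_get(void)
{
	for (int i = 0; i < MAX_IDS; i++) {
		if (!id_in_use[i]) {
			id_in_use[i] = true;
			return i;
		}
	}
	return -1; /* pool exhausted */
}

static void recycling_id_put(int id)
{
	if (id >= 0 && id < MAX_IDS)
		id_in_use[id] = false;
}

/* Hypothetical stand-in for a monotonic counter: a value is never
 * handed out twice, so a stale server-side owner cannot match a new
 * process. Not thread-safe; the real counter would be atomic. */
static uint64_t next_owner_id;

static uint64_t monotonic_id_get(void)
{
	return ++next_owner_id;
}
```

With the first allocator, a get / put / get sequence yields the same ID
twice, which is how a new process can present owner bytes that match a
stale server-side entry. The counter never repeats a value.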

The fix distinguishes two cases at the point of mismatch:

1. so_lock_list is non-empty (zombie still holds active POSIX locks):
   This is a genuine conflict. Keep the existing CRIT log and return
   NULL so the client receives NFS4ERR_RESOURCE and can retry cleanly.

2. so_lock_list is empty (zombie has no active POSIX locks):
   The lock owner is a harmless zombie whose POSIX-lock state was
   already released (e.g. after LOCKU without a subsequent CLOSE or
   FREE_STATEID). Re-associate so_related_owner to the new open owner
   instead of failing, allowing the incoming LOCK request to succeed.
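
The two cases above can be sketched as a small decision function. This
is not the code in src/SAL/nfs4_owner.c: struct sketch_owner, its fields
and classify_mismatch() are hypothetical simplifications, with a lock
count standing in for the emptiness check on so_lock_list.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a SAL owner. */
struct sketch_owner {
	size_t active_locks;                /* 0 models an empty so_lock_list */
	struct sketch_owner *related_owner; /* models so_related_owner */
};

enum mismatch_action {
	MISMATCH_NONE,        /* related owner already matches */
	MISMATCH_FAIL,        /* case 1: zombie still holds locks */
	MISMATCH_REASSOCIATE  /* case 2: harmless zombie, re-point it */
};

/* Decide what to do when an existing lock owner is found for an
 * incoming LOCK request made on behalf of new_open_owner. */
static enum mismatch_action
classify_mismatch(const struct sketch_owner *lock_owner,
		  const struct sketch_owner *new_open_owner)
{
	if (lock_owner->related_owner == new_open_owner)
		return MISMATCH_NONE;

	if (lock_owner->active_locks != 0)
		return MISMATCH_FAIL;

	return MISMATCH_REASSOCIATE;
}
```

Only the MISMATCH_FAIL outcome corresponds to the CRIT log and NULL
return; the other two let the request proceed.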

Reference counts are maintained correctly: the stale open-owner reference
is decremented and the new one is incremented, both under so_mutex.
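
A minimal sketch of that reference swap follows. The struct names and
sketch_reassociate() are hypothetical, and the sketch assumes plain
integer counts guarded by a single pthread mutex standing in for
so_mutex; the actual patch may use different primitives.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the open and lock owners. */
struct sketch_open_owner {
	int refcount;
};

struct sketch_lock_owner {
	pthread_mutex_t mutex;             /* models so_mutex */
	struct sketch_open_owner *related; /* models so_related_owner */
};

/* Re-point the lock owner at new_oo. The new reference is taken
 * before the stale one is dropped, so the pointer never refers to an
 * owner without a reference held on its behalf. */
static void sketch_reassociate(struct sketch_lock_owner *lo,
			       struct sketch_open_owner *new_oo)
{
	pthread_mutex_lock(&lo->mutex);

	struct sketch_open_owner *old_oo = lo->related;

	if (old_oo != new_oo) {
		new_oo->refcount++;
		lo->related = new_oo;
		if (old_oo != NULL)
			old_oo->refcount--;
	}

	pthread_mutex_unlock(&lo->mutex);
}
```

Calling sketch_reassociate() again with the same new owner is a no-op,
so a retried LOCK request does not leak a reference in this sketch.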

Change-Id: I9b76a2582b9066b196b01c75a2777a6dd335ee9a
Signed-off-by: Nishant Puri <npuri@redhat.com>
---
M src/SAL/nfs4_owner.c
1 file changed, 33 insertions(+), 7 deletions(-)

git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/16/1238516/1

Gerrit-MessageType: newchange
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I9b76a2582b9066b196b01c75a2777a6dd335ee9a
Gerrit-Change-Number: 1238516
Gerrit-PatchSet: 1
Gerrit-Owner: Nishant Puri <npuri@redhat.com>