Nishant Puri has uploaded this change for review. (
https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/1238516?usp=email )
Change subject: SAL: recover gracefully from zombie lock-owner so_related_owner mismatch
......................................................................
SAL: recover gracefully from zombie lock-owner so_related_owner mismatch

When a kernel NFSv4 client crashes or its lease expires while holding a
byte-range lock, the lock-owner ID it generated via the IDA allocator
(ida_simple_get) may be recycled for a completely new process. When that
new process opens the same file and requests a lock, the server calls
create_nfs4_owner() and finds an existing STATE_LOCK_OWNER_NFSV4 entry in
ht_nfs4_owner whose opaque owner bytes match -- but whose so_related_owner
points to the old (now-stale) open owner rather than the new one.

Before this change, the code unconditionally logged a CRIT message and
returned NULL, which propagated to the client as NFS4ERR_RESOURCE.
Because the client retried indefinitely, the file appeared permanently
"zombie locked" from the client's perspective.

The root cause on the client side is IDA-based lock-owner ID recycling
(fixed in upstream Linux kernels by switching to atomic64_inc_return, but
still present in RHEL 8 / kernel 4.18 based deployments).
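
The recycling behaviour described above can be illustrated with a toy
model. This is only a sketch, not kernel code: toy_ida, toy_ida_get,
toy_ida_put, toy_counter and toy_counter_get are hypothetical names, the
lowest-free-ID policy is an assumption about how an IDA-style allocator
behaves, and the counter is a single-threaded stand-in for an atomic
64-bit increment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_IDS 64

/* Hypothetical stand-in for an IDA-style allocator: hands out the
 * lowest free ID, so a released ID is reused by the next caller. */
struct toy_ida {
	bool used[TOY_MAX_IDS];
};

/* Returns the lowest unused ID, or -1 if the pool is exhausted. */
static int toy_ida_get(struct toy_ida *ida)
{
	for (int i = 0; i < TOY_MAX_IDS; i++) {
		if (!ida->used[i]) {
			ida->used[i] = true;
			return i;
		}
	}
	return -1;
}

/* Releases an ID so that a later toy_ida_get() may return it again. */
static void toy_ida_put(struct toy_ida *ida, int id)
{
	assert(id >= 0 && id < TOY_MAX_IDS);
	ida->used[id] = false;
}

/* Hypothetical stand-in for a monotonic 64-bit counter: a value that
 * has been handed out is never handed out again. */
struct toy_counter {
	uint64_t next;
};

static uint64_t toy_counter_get(struct toy_counter *c)
{
	return ++c->next;
}
```

In this model, an ID released by a dead process is immediately handed to
the next process that asks, which is how a server can see familiar owner
bytes arrive with an unfamiliar open owner. The counter variant never
repeats a value, so the same collision cannot occur.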

The fix distinguishes two cases at the point of mismatch:

  1. so_lock_list is non-empty (zombie still holds active POSIX locks):
     This is a genuine conflict. Keep the existing CRIT log and return
     NULL so the client receives NFS4ERR_RESOURCE and can retry cleanly.

  2. so_lock_list is empty (zombie has no active POSIX locks):
     The lock owner is a harmless zombie whose POSIX-lock state was
     already released (e.g. after LOCKU without a subsequent CLOSE or
     FREE_STATEID). Re-associate so_related_owner to the new open owner
     instead of failing, allowing the incoming LOCK request to succeed.

Reference counts are maintained correctly: the stale open-owner reference
is decremented and the new one is incremented, both under so_mutex.
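
The two cases and the reference swap can be sketched as follows. This is
a simplified model under stated assumptions, not the code in
src/SAL/nfs4_owner.c: struct toy_owner, toy_owner_init and
toy_resolve_mismatch are hypothetical, an integer active_locks stands in
for a non-empty so_lock_list, and a C11 atomic counter stands in for
Ganesha's own reference counting. Only the field names so_mutex and
so_related_owner are taken from the description above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, heavily simplified owner record. */
struct toy_owner {
	pthread_mutex_t so_mutex;
	atomic_int refcount;                /* hypothetical reference count */
	int active_locks;                   /* > 0 models a non-empty so_lock_list */
	struct toy_owner *so_related_owner; /* open owner behind a lock owner */
};

static void toy_owner_init(struct toy_owner *o, int refcount)
{
	pthread_mutex_init(&o->so_mutex, NULL);
	atomic_init(&o->refcount, refcount);
	o->active_locks = 0;
	o->so_related_owner = NULL;
}

/* Returns lock_owner if it can be used with new_open_owner, or NULL if
 * the stale association is a genuine conflict (case 1). */
static struct toy_owner *toy_resolve_mismatch(struct toy_owner *lock_owner,
					      struct toy_owner *new_open_owner)
{
	struct toy_owner *result = lock_owner;

	assert(lock_owner != NULL && new_open_owner != NULL);

	pthread_mutex_lock(&lock_owner->so_mutex);
	if (lock_owner->so_related_owner != new_open_owner) {
		if (lock_owner->active_locks > 0) {
			/* Case 1: zombie still holds locks, refuse. */
			result = NULL;
		} else {
			/* Case 2: harmless zombie, re-associate. Take the
			 * new reference before dropping the stale one. */
			struct toy_owner *stale = lock_owner->so_related_owner;

			atomic_fetch_add(&new_open_owner->refcount, 1);
			lock_owner->so_related_owner = new_open_owner;
			if (stale != NULL)
				atomic_fetch_sub(&stale->refcount, 1);
		}
	}
	pthread_mutex_unlock(&lock_owner->so_mutex);

	return result;
}
```

In this sketch the check, the pointer swap and both count adjustments
happen while so_mutex is held, so a concurrent caller cannot observe
so_related_owner pointing at an open owner that the lock owner no longer
holds a reference on.
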
Change-Id: I9b76a2582b9066b196b01c75a2777a6dd335ee9a
Signed-off-by: Nishant Puri <npuri(a)redhat.com>
---
M src/SAL/nfs4_owner.c
1 file changed, 33 insertions(+), 7 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/16/1238516/1
--
Gerrit-MessageType: newchange
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I9b76a2582b9066b196b01c75a2777a6dd335ee9a
Gerrit-Change-Number: 1238516
Gerrit-PatchSet: 1
Gerrit-Owner: Nishant Puri <npuri(a)redhat.com>