Hmm, ok, I thought I considered all the cases where reachability was an issue...
I think actually the correct fix is to re-arrange things so we don't drop the sentinel
ref until after cleaning the entry out of the export list... The sentinel ref should
remain until the object is truly unreachable.
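
To make the ordering concrete, here is a minimal single-threaded sketch of the idea: unlink the entry from the export list first, and only then drop the sentinel ref. The names toy_entry, toy_put, toy_unlink_from_export and toy_retire are hypothetical and are not Ganesha functions; the real code would also need the appropriate locking, which this toy model leaves out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an mdcache entry; not the real Ganesha struct. */
struct toy_entry {
	int refcnt;          /* includes the sentinel ref while the entry is live */
	bool on_export_list; /* still findable by an unexport walk */
	bool freed;          /* set when the final teardown has run */
};

/* Drop one reference; the last put tears the entry down. */
static void toy_put(struct toy_entry *e)
{
	assert(e->refcnt > 0);
	if (--e->refcnt == 0) {
		/* Invariant we want: nothing can still find the entry here. */
		assert(!e->on_export_list);
		e->freed = true;
	}
}

/* Stand-in for cleaning the entry out of the export's entry list. */
static void toy_unlink_from_export(struct toy_entry *e)
{
	e->on_export_list = false;
}

/* Proposed ordering: make the entry unreachable, then drop the sentinel. */
static void toy_retire(struct toy_entry *e)
{
	toy_unlink_from_export(e);
	toy_put(e); /* this put releases the sentinel ref */
}
```

With this ordering, any entry that is still on the export list holds at least the sentinel ref, so a walker that finds it there cannot be looking at an entry whose refcount has already hit zero.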
Frank
-----Original Message-----
From: Ashish Sangwan [mailto:ashishsangwan2@gmail.com]
Sent: Wednesday, April 22, 2020 11:25 AM
To: devel(a)lists.nfs-ganesha.org; Daniel Gryniewicz <dang(a)redhat.com>;
ffilzlnx(a)mindspring.com
Subject: [NFS-Ganesha-Devel] mdcache unexport race with Rmdir/Unlink
It seems that mdcache unexport can race with a remove RPC.
The remove operation drops the sentinel ref, so when the last call to
mdcache_put() happens during execution of the remove RPC, the corresponding
mdcache_entry is freed. If an export removal comes in while the remove op
is still executing, then in mdcache_unexport(), when we call
mdcache_lru_ref(entry, LRU_REQ_INITIAL), we can end up taking a ref on an
entry whose refcnt has already reached 0 and which is currently going through
mdcache_lru_clean() in the context of the remove. The entry is still present
in the mdcache_fsal_export's entry_list because mdcache_lru_clean() has not
yet called mdc_clean_entry(). When we later call mdcache_put() for this entry
in mdcache_unexport(), the entry has already been freed, and we end up
executing mdcache_lru_clean() on a freed entry.
What we need to do inside mdcache_unexport() is take QLOCK() and check the
refcount. If it's 0, continue the for loop with the next entry; if it's not 0,
then increment the refcount, drop the QLOCK, and continue with the current
normal execution path.
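
Roughly, the check could look like the sketch below. This is a toy model, not a patch: toy_entry, toy_qlock, toy_try_ref and toy_unexport_walk are hypothetical names, a plain pthread mutex stands in for QLOCK(), and the sketch assumes that the final decrement to 0 is serialized by the same lock, which would need to be verified against the real LRU code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an mdcache entry; not the real Ganesha struct. */
struct toy_entry {
	int32_t refcnt;         /* 0 means teardown has already started */
	struct toy_entry *next; /* link in the per-export entry list */
};

/* Stand-in for the LRU queue lock that QLOCK()/QUNLOCK() would take. */
static pthread_mutex_t toy_qlock = PTHREAD_MUTEX_INITIALIZER;

/* Take a ref only if the entry is still live; return false otherwise. */
static bool toy_try_ref(struct toy_entry *entry)
{
	bool got_ref = false;

	pthread_mutex_lock(&toy_qlock);
	if (entry->refcnt > 0) {
		entry->refcnt++;
		got_ref = true;
	}
	pthread_mutex_unlock(&toy_qlock);
	return got_ref;
}

/* Walk the export's entries, skipping any that are already being cleaned.
 * Returns the number of entries that were actually processed. */
static int toy_unexport_walk(struct toy_entry *head)
{
	int processed = 0;

	for (struct toy_entry *e = head; e != NULL; e = e->next) {
		if (!toy_try_ref(e))
			continue; /* refcnt was 0: leave it to the remove path */

		processed++; /* the normal unexport work would go here */

		/* Drop the ref taken above (stand-in for mdcache_put()). */
		pthread_mutex_lock(&toy_qlock);
		assert(e->refcnt > 0);
		e->refcnt--;
		pthread_mutex_unlock(&toy_qlock);
	}
	return processed;
}
```

In the real loop the put can free the entry, so the iteration would have to fetch the next list element safely before dropping the ref; the toy walk ignores that because its entries are never freed.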
Does this make sense?
_______________________________________________
Devel mailing list -- devel(a)lists.nfs-ganesha.org To unsubscribe send an email to
devel-leave(a)lists.nfs-ganesha.org