Change in ...nfs-ganesha[next]: Allow EXPORT pseudo path to be changed during export update
by Frank Filz (GerritHub)
Frank Filz has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490334 )
Change subject: Allow EXPORT pseudo path to be changed during export update
......................................................................
Allow EXPORT pseudo path to be changed during export update
This also fully allows adding or removing NFSv4 support from an export
since we can now handle the PseudoFS swizzling that occurs.
Note that an explicit PseudoFS export may be removed or added, though
you can not change it from export_id 0 because we currently don't allow
changing the export_id.
Note that this patch doesn't handle DBUS add or remove export though
that is an option to improve. I may add them to this patch (it wouldn't
be that hard) but I want to get this reviewed as is right now.
There are implications to a client of changing the PseudoFS. I have
tested moving an export in the PseudoFS with a client mounted. The
client will be able to continue accessing the export, though it may
see an ESTALE error if it navigates out of the export. The current
working directory will go bad and the pwd command will fail indicating
a disconnected mount. I have also seen referencing .. from the root of
the export wrapping around back to the root (I believe this is how
disconnected mounts are set up).
FSAL_PSEUDO lookups and create handles (PUTFH or any use of an NFSv3
handle where the inode isn't cached) which fail during an export update
are instead turned into ERR_FSAL_DELAY which turns into NFS4ERR_DELAY or
NFS3ERR_JUKEBOX to force the client to retry under the completed update.
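The retry behaviour described above can be sketched roughly as follows. This is only an illustration, not the code in this patch: the sketch_* names, the status enum and the update_in_progress flag are hypothetical stand-ins. Only the numeric protocol codes (10008 for both NFS4ERR_DELAY and NFS3ERR_JUKEBOX, 70 for the stale-handle errors) come from the NFS RFCs.

```c
#include <assert.h>

/* Hypothetical stand-in for an FSAL status; NOT Ganesha's definitions. */
enum sketch_fsal_status {
	SKETCH_FSAL_OK,
	SKETCH_FSAL_STALE,
	SKETCH_FSAL_DELAY
};

/* Wire values as defined in RFC 1813 (NFSv3) and RFC 7530 (NFSv4). */
enum {
	SKETCH_NFS_OK = 0,
	SKETCH_NFSERR_STALE = 70,
	SKETCH_NFS3ERR_JUKEBOX = 10008,
	SKETCH_NFS4ERR_DELAY = 10008
};

/* A failure seen while an export update is running becomes "try again". */
static enum sketch_fsal_status
sketch_soften_failure(enum sketch_fsal_status st, int update_in_progress)
{
	if (st != SKETCH_FSAL_OK && update_in_progress)
		return SKETCH_FSAL_DELAY;
	return st;
}

/* Map the (hypothetical) FSAL status to an NFSv4 status code. */
static int sketch_to_nfs4(enum sketch_fsal_status st)
{
	switch (st) {
	case SKETCH_FSAL_OK:
		return SKETCH_NFS_OK;
	case SKETCH_FSAL_STALE:
		return SKETCH_NFSERR_STALE;
	case SKETCH_FSAL_DELAY:
		return SKETCH_NFS4ERR_DELAY;
	default:
		assert(0 && "unknown status");
		return SKETCH_NFSERR_STALE;
	}
}

/* Map the (hypothetical) FSAL status to an NFSv3 status code. */
static int sketch_to_nfs3(enum sketch_fsal_status st)
{
	switch (st) {
	case SKETCH_FSAL_OK:
		return SKETCH_NFS_OK;
	case SKETCH_FSAL_STALE:
		return SKETCH_NFSERR_STALE;
	case SKETCH_FSAL_DELAY:
		return SKETCH_NFS3ERR_JUKEBOX;
	default:
		assert(0 && "unknown status");
		return SKETCH_NFSERR_STALE;
	}
}
```

The point of the design is that the client never sees a hard error for a handle that is only temporarily unresolvable; it retries and lands on the completed update.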
Change-Id: I507dc17a651936936de82303ff1291677ce136be
Signed-off-by: Frank S. Filz <ffilzlnx(a)mindspring.com>
---
M src/FSAL/FSAL_PSEUDO/handle.c
M src/MainNFSD/libganesha_nfsd.ver
M src/Protocols/NFS/nfs4_pseudo.c
M src/include/export_mgr.h
M src/include/nfs_proto_functions.h
M src/support/export_mgr.c
M src/support/exports.c
7 files changed, 560 insertions(+), 203 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/34/490334/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490334
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I507dc17a651936936de82303ff1291677ce136be
Gerrit-Change-Number: 490334
Gerrit-PatchSet: 1
Gerrit-Owner: Frank Filz <ffilzlnx(a)mindspring.com>
Gerrit-MessageType: newchange
10 months
Monitoring in Ganesha?
by Bjorn Leffler
Apart from the counters that you can access through dbus, is there any
other monitoring built into Ganesha?
I'm thinking of adding it with this higher level plan:
- Exporting metrics from Ganesha to Prometheus.
- Aggregate data in Prometheus.
- Display monitoring consoles and graphs with Grafana.
- Package up Prometheus, Grafana and the preconfigured rules/dashboards as
a Docker image.
- This makes it straightforward to deploy monitoring alongside a Ganesha
binary.
My rough coding plan is to:
- Add a USE_MONITORING directive to the CMakeLists.txt files.
- Add a build dependency to the Prometheus C client
<https://github.com/digitalocean/prometheus-client-c>.
- Create a src/monitoring directory for the new source files and templates.
- Increment counters and timers throughout the code.
- Use histograms to compute latency percentiles, heatmaps, etc.
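To make the histogram item above concrete, here is a minimal sketch of a fixed-bucket latency histogram. It deliberately does not use the Prometheus C client API; every name (sketch_histogram, sketch_observe, sketch_quantile_bound) and the bucket boundaries are assumptions made up for illustration. It is also not thread safe, so real counters in a multi-threaded server would need atomics or per-thread aggregation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NBUCKETS 6

/* Bucket upper bounds in microseconds; the last bucket is open-ended. */
static const uint64_t sketch_bounds_us[SKETCH_NBUCKETS - 1] = {
	100, 500, 1000, 5000, 10000
};

struct sketch_histogram {
	uint64_t buckets[SKETCH_NBUCKETS];
	uint64_t count;
	uint64_t sum_us;
};

/* Record one latency sample in the first bucket whose bound covers it. */
static void sketch_observe(struct sketch_histogram *h, uint64_t latency_us)
{
	size_t i = 0;

	while (i < SKETCH_NBUCKETS - 1 && latency_us > sketch_bounds_us[i])
		i++;
	h->buckets[i]++;
	h->count++;
	h->sum_us += latency_us;
}

/*
 * Return the upper bound of the bucket holding quantile q (0 < q <= 1),
 * UINT64_MAX if it falls in the open-ended bucket, or 0 if no samples.
 */
static uint64_t sketch_quantile_bound(const struct sketch_histogram *h,
				      double q)
{
	uint64_t seen = 0;
	uint64_t target;
	size_t i;

	assert(q > 0.0 && q <= 1.0);
	if (h->count == 0)
		return 0;

	/* Rank of the sample we are looking for, rounded up. */
	target = (uint64_t)(q * (double)h->count);
	if ((double)target < q * (double)h->count)
		target++;
	if (target == 0)
		target = 1;

	for (i = 0; i < SKETCH_NBUCKETS; i++) {
		seen += h->buckets[i];
		if (seen >= target)
			return i < SKETCH_NBUCKETS - 1 ? sketch_bounds_us[i]
						       : UINT64_MAX;
	}
	return UINT64_MAX;
}
```

A percentile computed this way is only as precise as the bucket boundaries, which is the usual trade-off of histogram-based latency metrics.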
Is this a good idea? Any objections or suggestions?
Thanks,
Bjorn
3 years
Change in ...nfs-ganesha[next]: v4_recovery: off-by-one error, v4_recov_dir is truncated
by Kaleb KEITHLEY (GerritHub)
Kaleb KEITHLEY has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/515560 )
Change subject: v4_recovery: off-by-one error, v4_recov_dir is truncated
......................................................................
v4_recovery: off-by-one error, v4_recov_dir is truncated
noticed log entries with invalid path name, e.g.
... fs_rm_revoked_handles :CLIENT ID :EVENT :opendir
/var/lib/nfs/ganesha/v4reco/::ffff:192.168.122.102...
and
... fs_rm_clid_impl :CLIENT ID :EVENT :Failed to remove
client recovery dir (/var/lib/nfs/ganesha/v4reco/::ffff:192...
Inspection of running process shows that v4_recov_dir_len is 27,
but "/var/lib/nfs/ganesha/v4recov" is 28 chars in length. As a
result, at lines 375-376 of recovery_fs.c (in 3.x) one char too
few are copied from the initial parent_path (i.e. v4_recov_dir)
when called from fs_rm_clid().
And at line 892 it kludges around the mistake by copying
v4_recov_dir_len + 1 chars.
Also change to unsigned int for v4_recov_dir_len and v4_old_dir_len
which can never be negative. (Could have used size_t I suppose.)
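The effect of the short length can be reproduced in isolation. The helper below is a hypothetical illustration (sketch_copy_prefix is not a function in recovery_fs.c); it only shows that copying 27 bytes of the 28-character path yields the truncated "v4reco" seen in the logs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the first len bytes of parent into dst and NUL-terminate it. */
static void sketch_copy_prefix(char *dst, size_t dstsz,
			       const char *parent, size_t len)
{
	assert(len < dstsz);
	assert(len <= strlen(parent));
	memcpy(dst, parent, len);
	dst[len] = '\0';
}
```

Called with len 27 on "/var/lib/nfs/ganesha/v4recov" this produces "/var/lib/nfs/ganesha/v4reco"; called with strlen() of the path it produces the full directory name.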
Change-Id: If25ef835ea459b3e8c796e38fc940890f27b58a7
Signed-off-by: Kaleb S. KEITHLEY <kkeithle(a)redhat.com>
---
M src/SAL/recovery/recovery_fs.c
1 file changed, 5 insertions(+), 5 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/60/515560/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/515560
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: If25ef835ea459b3e8c796e38fc940890f27b58a7
Gerrit-Change-Number: 515560
Gerrit-PatchSet: 1
Gerrit-Owner: Kaleb KEITHLEY <kaleb(a)redhat.com>
Gerrit-MessageType: newchange
3 years, 7 months
Change in ...nfs-ganesha[next]: selinux: additional policy for RHEL8.4
by Kaleb KEITHLEY (GerritHub)
Kaleb KEITHLEY has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/515361 )
Change subject: selinux: additional policy for RHEL8.4
......................................................................
selinux: additional policy for RHEL8.4
various fusefs_t, new in RHEL8.4
Change-Id: Ib41f89174550756aa8edb2a923b7ca5312505e8d
Signed-off-by: Kaleb S. KEITHLEY <kkeithle(a)redhat.com>
---
M src/selinux/ganesha.te
1 file changed, 9 insertions(+), 0 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/61/515361/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/515361
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: Ib41f89174550756aa8edb2a923b7ca5312505e8d
Gerrit-Change-Number: 515361
Gerrit-PatchSet: 1
Gerrit-Owner: Kaleb KEITHLEY <kaleb(a)redhat.com>
Gerrit-MessageType: newchange
3 years, 7 months
Announce Push of V4-dev.58
by Frank Filz
Branch next
Tag:V4-dev.58
Merge Highlights
* Fix stall when readdir races with I/O and a new mdcache entry created
Signed-off-by: Frank S. Filz <ffilzlnx(a)mindspring.com>
Contents:
576e3ba Frank S. Filz V4-dev.58
aadb9cf Frank S. Filz Fix stall when readdir races with I/O and a new
mdcache entry created
3 years, 8 months
Re: Failed to grab state owner mutex in _state_del_locked
by Sriram Patil
Hi,
With full debug logs for RW LOCK, I could see that the lock gets destroyed as part of free_client_id.
2021-04-22T13:09:39Z : epoch 6080e5a2 : w1hs3r0302.vsanstfsad.local : ganesha.nfsd-180[::ffff:172.30.100.232] [svc_440] 349 :free_client_id :RW LOCK :Destroy mutex 0x7fa390179148 (&clientid->cid_owner.so_mutex) at /build/mts/release/bora-17422501/cayman_nfs-ganesha/nfs-ganesha/src/src/SAL/nfs4_clientid.c:349
……
……
2021-04-22T13:19:44Z : epoch 6080e5a2 : w1hs3r0302.vsanstfsad.local : ganesha.nfsd-180[none] [dbus_heartbeat] 397 :_state_del_locked :RW LOCK :Error 22, acquiring mutex 0x7fa390179148 (&owner->so_mutex) at /build/mts/release/bora-17422501/cayman_nfs-ganesha/nfs-ganesha/src/src/SAL/nfs4_state.c:397
I think the culprit is the “free_client_id” function. In this function, “cid_owner.so_mutex” is destroyed without checking the refcount:
    PTHREAD_MUTEX_destroy(&clientid->cid_mutex);
    PTHREAD_MUTEX_destroy(&clientid->cid_owner.so_mutex);
    if (clientid->cid_minorversion == 0)
        PTHREAD_MUTEX_destroy(&clientid->cid_cb.v40.cb_chan.mtx);
    put_gsh_client(clientid->gsh_client);
I see that there have been many changes in export CRUD workflows so it would be difficult to simply backport some of these changes to 2.8.4.
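For illustration only, this is the kind of pattern I would expect instead: the mutex is torn down only when the last reference goes away. This is a sketch under my own assumptions, not a proposed patch. The struct and function names (sketch_owner, sketch_owner_get, sketch_owner_put) are hypothetical, and it assumes that nobody can take a new reference once the count has reached zero.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct sketch_owner {
	pthread_mutex_t so_mutex;
	int refcount;		/* protected by so_mutex */
	bool destroyed;		/* set once the mutex has been torn down */
};

static void sketch_owner_init(struct sketch_owner *o)
{
	int rc = pthread_mutex_init(&o->so_mutex, NULL);

	assert(rc == 0);
	(void)rc;
	o->refcount = 1;	/* the creator holds the first reference */
	o->destroyed = false;
}

static void sketch_owner_get(struct sketch_owner *o)
{
	pthread_mutex_lock(&o->so_mutex);
	o->refcount++;
	pthread_mutex_unlock(&o->so_mutex);
}

/*
 * Drop one reference. The mutex is destroyed only by whoever drops
 * the last reference. Returns true if this call destroyed it.
 */
static bool sketch_owner_put(struct sketch_owner *o)
{
	bool last;

	pthread_mutex_lock(&o->so_mutex);
	last = (--o->refcount == 0);
	pthread_mutex_unlock(&o->so_mutex);

	if (last) {
		pthread_mutex_destroy(&o->so_mutex);
		o->destroyed = true;
	}
	return last;
}
```

With this shape, a thread that still holds a reference (for example the one running _state_del_locked) can always lock so_mutex safely, because destruction cannot happen underneath it.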
Thanks,
Sriram
From: Sriram Patil <sriramp(a)vmware.com>
Date: Tuesday, April 20, 2021 at 12:53 PM
To: Matthew DeVore <matvore(a)google.com>
Cc: Solomon Boulos <boulos(a)google.com>, Frank Filz <ffilzlnx(a)mindspring.com>, dang(a)redhat.com <dang(a)redhat.com>, devel(a)lists.nfs-ganesha.org <devel(a)lists.nfs-ganesha.org>
Subject: [NFS-Ganesha-Devel] Re: Failed to grab state owner mutex in _state_del_locked
Thanks Matthew. I checked the fixes. Unfortunately, they are not related to the state owner mutex. I am re-running the tests on our side with full debug logging to see if I can debug the issue further.
Thanks,
Sriram
From: Matthew DeVore <matvore(a)google.com>
Date: Tuesday, April 20, 2021 at 12:39 PM
To: Sriram Patil <sriramp(a)vmware.com>
Cc: Solomon Boulos <boulos(a)google.com>, Frank Filz <ffilzlnx(a)mindspring.com>, dang(a)redhat.com <dang(a)redhat.com>, devel(a)lists.nfs-ganesha.org <devel(a)lists.nfs-ganesha.org>
Subject: Re: [NFS-Ganesha-Devel] Re: Failed to grab state owner mutex in _state_del_locked
On Tue, Apr 20, 2021 at 11:42 Sriram Patil <sriramp(a)vmware.com> wrote:
Hi Solomon,
Thanks for CC’ing Matthew. I saw a couple of review requests from him but they were abandoned. So, not sure which ones are valid patches.
You can see the ones that were accepted with `git log --author=matvore@`
My last pthread-related fix was 28e457 - I guess you need Ganesha v4 for that - `git tag --contains 28e457` shows for me:
V4-dev.41
V4-dev.42
V4-dev.43
V4-dev.44
Thanks,
Sriram
From: Solomon Boulos <boulos(a)google.com>
Date: Tuesday, April 20, 2021 at 11:39 AM
To: Frank Filz <ffilzlnx(a)mindspring.com>
Cc: Matthew DeVore <matvore(a)google.com>, Sriram Patil <sriramp(a)vmware.com>, dang(a)redhat.com <dang(a)redhat.com>, devel(a)lists.nfs-ganesha.org <devel(a)lists.nfs-ganesha.org>
Subject: Re: [NFS-Ganesha-Devel] Re: Failed to grab state owner mutex in _state_del_locked
Weren’t there a bunch of fixes from devore (cc’ed) where the union of mutex and something else (condition variable?) weren’t used consistently? That was the first thing that popped in my mind for EINVAL.
On Tue, Apr 20, 2021 at 11:33 Frank Filz <ffilzlnx(a)mindspring.com> wrote:
There are some possibly related fixes in V3.4 that aren’t in V2.8.4 but they don’t look like they might be relevant.
I do know that delegations have not been well tested lately, so if you’re using delegations, there might well be a lock problem.
Frank
From: Sriram Patil [mailto:sriramp@vmware.com]
Sent: Tuesday, April 20, 2021 10:33 AM
To: devel(a)lists.nfs-ganesha.org
Cc: Frank Filz <ffilzlnx(a)mindspring.com>; dang(a)redhat.com
Subject: [NFS-Ganesha-Devel] Failed to grab state owner mutex in _state_del_locked
Hi,
Recently we have been observing a ganesha abort because it receives EINVAL when trying to lock the state owner lock (owner->so_mutex).
2021-03-28T10:09:12Z : epoch 605fa7ae : w1hs3i1902.vsanstfsad.local : ganesha.nfsd-90[none] [dbus_heartbeat] 397 :_state_del_locked :RW LOCK :Error 22, acquiring mutex 0x7f2fc4007958 (&owner->so_mutex) at /build/mts/release/bora-17422501/cayman_nfs-ganesha/nfs-ganesha/src/src/SAL/nfs4_state.c:397
I modified some macros and printed RW LOCK activities whenever mutex name is “&owner->so_mutex”. In this, I observed that the lock is never destroyed. So, this EINVAL error is confusing. The EINVAL is observed when removing the export. The previous log for the lock is in DELEG RETURN.
Could the mutex have been destroyed in the free_client_id function (nfs4_clientid.c)? In that case, it won't use that string to destroy it, but instead use `&clientid->cid_owner.so_mutex`
FWIW I found it effective to log all the destroys/creates (-N NIV_FULL_DEBUG) and then grep the log for the hex address (0x7f2fc4007958 in this case) to know all activity on it.
2021-04-20T04:07:12Z : epoch 607e2d18 : w1hs3r0313.vsanstfsad.local : ganesha.nfsd-90[::ffff:172.30.72.54] [svc_165] 775 :process_one_op :NFS4 :Request 3: opcode 8 is OP_DELEGRETURN
2021-04-20T04:07:12Z : epoch 607e2d18 : w1hs3r0313.vsanstfsad.local : ganesha.nfsd-90[::ffff:172.30.72.54] [svc_165] 76 :nfs4_op_delegreturn :NFS4 LOCK :Entering NFS v4 DELEGRETURN handler -----------------------------------------------------
2021-04-20T04:07:12Z : epoch 607e2d18 : w1hs3r0313.vsanstfsad.local : ganesha.nfsd-90[none] [svc_181] 1377 :free_nfs_request :DISP :SVC_DECODE on 0x7f18d800be70 fd 90 (::ffff:172.30.72.54:720) xid=2141597608 returned XPRT_IDLE
2021-04-20T04:07:12Z : epoch 607e2d18 : w1hs3r0313.vsanstfsad.local : ganesha.nfsd-90[::ffff:172.30.72.54] [svc_165] 129 :nfs4_op_delegreturn :NFS4 LOCK :Successful exit
2021-04-20T04:07:12Z : epoch 607e2d18 : w1hs3r0313.vsanstfsad.local : ganesha.nfsd-90[::ffff:172.30.72.54] [svc_165] 397 :_state_del_locked :RW LOCK :Acquired mutex 0x7f18b4017058 (&owner->so_mutex) at /build/mts/release/sb-45847366/cayman_nfs-ganesha/nfs-ganesha/src/src/SAL/nfs4_state.c:397
……
…..
2021-04-20T04:57:00Z : epoch 607e2d18 : w1hs3r0313.vsanstfsad.local : ganesha.nfsd-90[none] [dbus_heartbeat] 397 :_state_del_locked :RW LOCK :Error 22, acquiring mutex 0x7f18b4017058 (&owner->so_mutex) at /build/mts/release/sb-45847366/cayman_nfs-ganesha/nfs-ganesha/src/src/SAL/nfs4_state.c:397
I am not very familiar with the NFSv4 state owner code. But does this look like some known issue?
Note: We are using ganesha 2.8.4
Thanks,
Sriram
_______________________________________________
Devel mailing list -- devel(a)lists.nfs-ganesha.org
To unsubscribe send an email to devel-leave(a)lists.nfs-ganesha.org
3 years, 8 months
Re: Failed to grab state owner mutex in _state_del_locked
by Matthew DeVore
On Tue, Apr 20, 2021 at 11:42 Sriram Patil <sriramp(a)vmware.com> wrote:
> Hi Solomon,
>
>
>
> Thanks for CC’ing Matthew. I saw a couple of review requests from him but
> they were abandoned. So, not sure which ones are valid patches.
>
You can see the ones that were accepted with `git log --author=matvore@`
My last pthread-related fix was 28e457 - I guess you need Ganesha v4 for
that - `git tag --contains 28e457` shows for me:
V4-dev.41
V4-dev.42
V4-dev.43
V4-dev.44
>
>
> Thanks,
>
> Sriram
>
>
>
> *From: *Solomon Boulos <boulos(a)google.com>
> *Date: *Tuesday, April 20, 2021 at 11:39 AM
> *To: *Frank Filz <ffilzlnx(a)mindspring.com>
> *Cc: *Matthew DeVore <matvore(a)google.com>, Sriram Patil <
> sriramp(a)vmware.com>, dang(a)redhat.com <dang(a)redhat.com>,
> devel(a)lists.nfs-ganesha.org <devel(a)lists.nfs-ganesha.org>
> *Subject: *Re: [NFS-Ganesha-Devel] Re: Failed to grab state owner mutex
> in _state_del_locked
>
> Weren’t there a bunch of fixes from devore (cc’ed) where the union of
> mutex and something else (condition variable?) weren’t used consistently?
> That was the first thing that popped in my mind for EINVAL.
>
>
>
> On Tue, Apr 20, 2021 at 11:33 Frank Filz <ffilzlnx(a)mindspring.com> wrote:
>
> There are some possibly related fixes in V3.4 that aren’t in V2.8.4 but
> they don’t look like they might be relevant.
>
>
>
> I do know that delegations have not been well tested lately, so if you’re
> using delegations, there might well be a lock problem.
>
>
>
> Frank
>
>
>
> *From:* Sriram Patil [mailto:sriramp@vmware.com]
> *Sent:* Tuesday, April 20, 2021 10:33 AM
> *To:* devel(a)lists.nfs-ganesha.org
> *Cc:* Frank Filz <ffilzlnx(a)mindspring.com>; dang(a)redhat.com
> *Subject:* [NFS-Ganesha-Devel] Failed to grab state owner mutex in
> _state_del_locked
>
>
>
> Hi,
>
>
>
> Recently we have been observing a ganesha abort because it receives EINVAL
> when trying to lock the state owner lock (owner->so_mutex).
>
>
>
> 2021-03-28T10:09:12Z : epoch 605fa7ae : w1hs3i1902.vsanstfsad.local :
> ganesha.nfsd-90[none] [dbus_heartbeat] 397 :_state_del_locked :RW LOCK
> :Error 22, acquiring mutex 0x7f2fc4007958 (&owner->so_mutex) at
> /build/mts/release/bora-17422501/cayman_nfs-ganesha/nfs-ganesha/src/src/SAL/nfs4_state.c:397
>
>
>
> I modified some macros and printed RW LOCK activities whenever mutex name
> is “&owner->so_mutex”. In this, I observed that the lock is never
> destroyed. So, this EINVAL error is confusing. The EINVAL is observed when
> removing the export. The previous log for the lock is in DELEG RETURN.
>
Could the mutex have been destroyed in the free_client_id function
(nfs4_clientid.c)? In that case, it won't use that string to destroy it,
but instead use `&clientid->cid_owner.so_mutex`
FWIW I found it effective to log all the destroys/creates (-N
NIV_FULL_DEBUG) and then grep the log for the hex address (0x7f2fc4007958
in this case) to know all activity on it.
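The same filtering can be done programmatically if grep is not convenient. The helper below is a hypothetical sketch (sketch_classify and the enum are made-up names); the message fragments it matches ("Destroy mutex", "Acquired mutex", "acquiring mutex") are the ones that appear in the RW LOCK log lines quoted in this thread, and other lock messages are simply reported as "other".

```c
#include <assert.h>
#include <string.h>

enum sketch_lock_event {
	SKETCH_EV_NONE,		/* line does not mention the address */
	SKETCH_EV_DESTROY,
	SKETCH_EV_ACQUIRED,
	SKETCH_EV_ERROR,
	SKETCH_EV_OTHER		/* mentions the address, unknown message */
};

/* Classify one log line with respect to a mutex address such as "0x7f2fc4007958". */
static enum sketch_lock_event sketch_classify(const char *line,
					      const char *addr)
{
	assert(line != NULL && addr != NULL);

	if (strstr(line, addr) == NULL)
		return SKETCH_EV_NONE;
	if (strstr(line, ":Destroy mutex ") != NULL)
		return SKETCH_EV_DESTROY;
	if (strstr(line, ":Acquired mutex ") != NULL)
		return SKETCH_EV_ACQUIRED;
	if (strstr(line, ", acquiring mutex ") != NULL)
		return SKETCH_EV_ERROR;
	return SKETCH_EV_OTHER;
}
```

Matching on the address rather than on the mutex name is what matters here, since the same mutex can be logged under different names depending on the call site.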
>
>
> 2021-04-20T04:07:12Z : epoch 607e2d18 : w1hs3r0313.vsanstfsad.local :
> ganesha.nfsd-90[::ffff:172.30.72.54] [svc_165] 775 :process_one_op :NFS4
> :Request 3: opcode 8 is OP_DELEGRETURN
>
>
>
> 2021-04-20T04:07:12Z : epoch 607e2d18 : w1hs3r0313.vsanstfsad.local :
> ganesha.nfsd-90[::ffff:172.30.72.54] [svc_165] 76 :nfs4_op_delegreturn
> :NFS4 LOCK :Entering NFS v4 DELEGRETURN handler
> -----------------------------------------------------
>
>
>
> 2021-04-20T04:07:12Z : epoch 607e2d18 : w1hs3r0313.vsanstfsad.local :
> ganesha.nfsd-90[none] [svc_181] 1377 :free_nfs_request :DISP :SVC_DECODE on
> 0x7f18d800be70 fd 90 (::ffff:172.30.72.54:720)
> xid=2141597608 returned XPRT_IDLE
>
>
> 2021-04-20T04:07:12Z : epoch 607e2d18 : w1hs3r0313.vsanstfsad.local :
> ganesha.nfsd-90[::ffff:172.30.72.54] [svc_165] 129 :nfs4_op_delegreturn
> :NFS4 LOCK :Successful exit
>
> 2021-04-20T04:07:12Z : epoch 607e2d18 : w1hs3r0313.vsanstfsad.local :
> ganesha.nfsd-90[::ffff:172.30.72.54] [svc_165] 397 :_state_del_locked :RW
> LOCK :Acquired mutex 0x7f18b4017058 (&owner->so_mutex) at
> /build/mts/release/sb-45847366/cayman_nfs-ganesha/nfs-ganesha/src/src/SAL/nfs4_state.c:397
>
> ……
>
> …..
>
> 2021-04-20T04:57:00Z : epoch 607e2d18 : w1hs3r0313.vsanstfsad.local :
> ganesha.nfsd-90[none] [dbus_heartbeat] 397 :_state_del_locked :RW LOCK
> :Error 22, acquiring mutex 0x7f18b4017058 (&owner->so_mutex) at
> /build/mts/release/sb-45847366/cayman_nfs-ganesha/nfs-ganesha/src/src/SAL/nfs4_state.c:397
>
>
>
> I am not very familiar with the NFSv4 state owner code. But does this look
> like some known issue?
>
>
>
> Note: We are using ganesha 2.8.4
>
>
>
> Thanks,
>
> Sriram
3 years, 8 months
Re: Failed to grab state owner mutex in _state_del_locked
by Frank Filz
There are some possibly related fixes in V3.4 that aren't in V2.8.4 but they
don't look like they might be relevant.
I do know that delegations have not been well tested lately, so if you're
using delegations, there might well be a lock problem.
Frank
From: Sriram Patil [mailto:sriramp@vmware.com]
Sent: Tuesday, April 20, 2021 10:33 AM
To: devel(a)lists.nfs-ganesha.org
Cc: Frank Filz <ffilzlnx(a)mindspring.com>; dang(a)redhat.com
Subject: [NFS-Ganesha-Devel] Failed to grab state owner mutex in
_state_del_locked
Hi,
Recently we have been observing a ganesha abort because it receives EINVAL
when trying to lock the state owner lock (owner->so_mutex).
2021-03-28T10:09:12Z : epoch 605fa7ae : w1hs3i1902.vsanstfsad.local :
ganesha.nfsd-90[none] [dbus_heartbeat] 397 :_state_del_locked :RW LOCK
:Error 22, acquiring mutex 0x7f2fc4007958 (&owner->so_mutex) at
/build/mts/release/bora-17422501/cayman_nfs-ganesha/nfs-ganesha/src/src/SAL/nfs4_state.c:397
I modified some macros and printed RW LOCK activities whenever mutex name is
"&owner->so_mutex". In this, I observed that the lock is never destroyed.
So, this EINVAL error is confusing. The EINVAL is observed when removing the
export. The previous log for the lock is in DELEG RETURN.
2021-04-20T04:07:12Z : epoch 607e2d18 : w1hs3r0313.vsanstfsad.local :
ganesha.nfsd-90[::ffff:172.30.72.54] [svc_165] 775 :process_one_op :NFS4
:Request 3: opcode 8 is OP_DELEGRETURN
2021-04-20T04:07:12Z : epoch 607e2d18 : w1hs3r0313.vsanstfsad.local :
ganesha.nfsd-90[::ffff:172.30.72.54] [svc_165] 76 :nfs4_op_delegreturn :NFS4
LOCK :Entering NFS v4 DELEGRETURN handler
-----------------------------------------------------
2021-04-20T04:07:12Z : epoch 607e2d18 : w1hs3r0313.vsanstfsad.local :
ganesha.nfsd-90[none] [svc_181] 1377 :free_nfs_request :DISP :SVC_DECODE on
0x7f18d800be70 fd 90 (::ffff:172.30.72.54:720) xid=2141597608 returned
XPRT_IDLE
2021-04-20T04:07:12Z : epoch 607e2d18 : w1hs3r0313.vsanstfsad.local :
ganesha.nfsd-90[::ffff:172.30.72.54] [svc_165] 129 :nfs4_op_delegreturn
:NFS4 LOCK :Successful exit
2021-04-20T04:07:12Z : epoch 607e2d18 : w1hs3r0313.vsanstfsad.local :
ganesha.nfsd-90[::ffff:172.30.72.54] [svc_165] 397 :_state_del_locked :RW
LOCK :Acquired mutex 0x7f18b4017058 (&owner->so_mutex) at
/build/mts/release/sb-45847366/cayman_nfs-ganesha/nfs-ganesha/src/src/SAL/nfs4_state.c:397
..
...
2021-04-20T04:57:00Z : epoch 607e2d18 : w1hs3r0313.vsanstfsad.local :
ganesha.nfsd-90[none] [dbus_heartbeat] 397 :_state_del_locked :RW LOCK
:Error 22, acquiring mutex 0x7f18b4017058 (&owner->so_mutex) at
/build/mts/release/sb-45847366/cayman_nfs-ganesha/nfs-ganesha/src/src/SAL/nfs4_state.c:397
I am not very familiar with the NFSv4 state owner code. But does this look
like some known issue?
Note: We are using ganesha 2.8.4
Thanks,
Sriram
3 years, 8 months
Change in ...nfs-ganesha[next]: Fix stall when readdir races with I/O and a new mdcache entry created
by Frank Filz (GerritHub)
Frank Filz has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/514964 )
Change subject: Fix stall when readdir races with I/O and a new mdcache entry created
......................................................................
Fix stall when readdir races with I/O and a new mdcache entry created
If readdir causes a new mdcache entry to be created and that races
with I/O on another instance of the cache entry, merge_share is
required to resolve the share reservation. Taking the obj_lock for
write then blocks until the I/O, which holds the lock for read,
completes.
In this case, there's actually no need to do anything in merge_share
because all the counters in the readdir's copy of the obj handle are
0. Since that copy of the obj handle MUST NOT be accessible to any other
thread, it can be accessed without holding the lock.
So we move acquiring and releasing the lock into merge_share and
check that the dupe_share has non-zero counters before taking the
lock.
Change-Id: I372713360a5c795f77b1365c263c66302ec2490a
Signed-off-by: Frank S. Filz <ffilzlnx(a)mindspring.com>
---
M src/FSAL/FSAL_CEPH/handle.c
M src/FSAL/FSAL_GLUSTER/handle.c
M src/FSAL/FSAL_GPFS/file.c
M src/FSAL/FSAL_KVSFS/kvsfs_handle.c
M src/FSAL/FSAL_LIZARDFS/handle.c
M src/FSAL/FSAL_MEM/mem_handle.c
M src/FSAL/FSAL_RGW/handle.c
M src/FSAL/FSAL_VFS/file.c
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c
M src/FSAL/commonlib.c
M src/include/FSAL/fsal_commonlib.h
11 files changed, 53 insertions(+), 53 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/64/514964/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/514964
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I372713360a5c795f77b1365c263c66302ec2490a
Gerrit-Change-Number: 514964
Gerrit-PatchSet: 1
Gerrit-Owner: Frank Filz <ffilzlnx(a)mindspring.com>
Gerrit-MessageType: newchange
3 years, 8 months