Failed to grab state owner mutex in _state_del_locked
by Sriram Patil
Hi,
Recently we have been observing a ganesha abort because it receives EINVAL when trying to lock the state owner lock (owner->so_mutex).
2021-03-28T10:09:12Z : epoch 605fa7ae : w1hs3i1902.vsanstfsad.local : ganesha.nfsd-90[none] [dbus_heartbeat] 397 :_state_del_locked :RW LOCK :Error 22, acquiring mutex 0x7f2fc4007958 (&owner->so_mutex) at /build/mts/release/bora-17422501/cayman_nfs-ganesha/nfs-ganesha/src/src/SAL/nfs4_state.c:397
I modified some macros to print RW LOCK activity whenever the mutex name is “&owner->so_mutex”. The resulting trace shows that the lock is never destroyed, so the EINVAL is confusing. The EINVAL is observed when removing the export. The last logged activity on this lock before the error is in DELEGRETURN.
2021-04-20T04:07:12Z : epoch 607e2d18 : w1hs3r0313.vsanstfsad.local : ganesha.nfsd-90[::ffff:172.30.72.54] [svc_165] 775 :process_one_op :NFS4 :Request 3: opcode 8 is OP_DELEGRETURN
2021-04-20T04:07:12Z : epoch 607e2d18 : w1hs3r0313.vsanstfsad.local : ganesha.nfsd-90[::ffff:172.30.72.54] [svc_165] 76 :nfs4_op_delegreturn :NFS4 LOCK :Entering NFS v4 DELEGRETURN handler -----------------------------------------------------
2021-04-20T04:07:12Z : epoch 607e2d18 : w1hs3r0313.vsanstfsad.local : ganesha.nfsd-90[none] [svc_181] 1377 :free_nfs_request :DISP :SVC_DECODE on 0x7f18d800be70 fd 90 (::ffff:172.30.72.54:720) xid=2141597608 returned XPRT_IDLE
2021-04-20T04:07:12Z : epoch 607e2d18 : w1hs3r0313.vsanstfsad.local : ganesha.nfsd-90[::ffff:172.30.72.54] [svc_165] 129 :nfs4_op_delegreturn :NFS4 LOCK :Successful exit
2021-04-20T04:07:12Z : epoch 607e2d18 : w1hs3r0313.vsanstfsad.local : ganesha.nfsd-90[::ffff:172.30.72.54] [svc_165] 397 :_state_del_locked :RW LOCK :Acquired mutex 0x7f18b4017058 (&owner->so_mutex) at /build/mts/release/sb-45847366/cayman_nfs-ganesha/nfs-ganesha/src/src/SAL/nfs4_state.c:397
……
…..
2021-04-20T04:57:00Z : epoch 607e2d18 : w1hs3r0313.vsanstfsad.local : ganesha.nfsd-90[none] [dbus_heartbeat] 397 :_state_del_locked :RW LOCK :Error 22, acquiring mutex 0x7f18b4017058 (&owner->so_mutex) at /build/mts/release/sb-45847366/cayman_nfs-ganesha/nfs-ganesha/src/src/SAL/nfs4_state.c:397
I am not very familiar with the NFSv4 state owner code, but does this look like a known issue?
Note: We are using ganesha 2.8.4
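For readers who want to reproduce the kind of tracing described above (logging lock activity only for the mutex whose name is "&owner->so_mutex"), here is a minimal sketch. It is not Ganesha's actual macro set: TRACED_MUTEX_LOCK, traced_lock, WATCHED_NAME, and struct toy_owner are hypothetical names used only for illustration. It relies on the fact that pthread_mutex_lock() returns the error number directly rather than setting errno. Locking a destroyed mutex is undefined behavior under POSIX, so the demo provokes a well-defined error instead: relocking an error-checking mutex, which returns EDEADLK.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical: the stringified mutex expression we want to trace. */
#define WATCHED_NAME "&owner->so_mutex"

/*
 * Lock a mutex and log the outcome when its stringified name matches
 * WATCHED_NAME. pthread_mutex_lock() returns the error number itself;
 * it does not set errno.
 */
static int traced_lock(pthread_mutex_t *m, const char *name,
		       const char *file, int line)
{
	int rc = pthread_mutex_lock(m);

	if (strcmp(name, WATCHED_NAME) == 0) {
		if (rc == 0)
			fprintf(stderr, "Acquired mutex %p (%s) at %s:%d\n",
				(void *)m, name, file, line);
		else
			fprintf(stderr,
				"Error %d, acquiring mutex %p (%s) at %s:%d\n",
				rc, (void *)m, name, file, line);
	}
	return rc;
}

/* #m turns the argument expression into the string that is matched above. */
#define TRACED_MUTEX_LOCK(m) traced_lock((m), #m, __FILE__, __LINE__)

/* Hypothetical stand-in for a state owner; only the mutex matters here. */
struct toy_owner {
	pthread_mutex_t so_mutex;
};

/*
 * Demo: lock an error-checking mutex twice from the same thread.
 * The second attempt fails with EDEADLK, which exercises the error branch.
 */
int demo_relock_errorcheck(void)
{
	struct toy_owner storage;
	struct toy_owner *owner = &storage;
	pthread_mutexattr_t attr;
	int rc;

	pthread_mutexattr_init(&attr);
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	pthread_mutex_init(&owner->so_mutex, &attr);
	pthread_mutexattr_destroy(&attr);

	rc = TRACED_MUTEX_LOCK(&owner->so_mutex);	/* first lock succeeds */
	if (rc == 0) {
		rc = TRACED_MUTEX_LOCK(&owner->so_mutex); /* relock fails */
		pthread_mutex_unlock(&owner->so_mutex);
	}
	pthread_mutex_destroy(&owner->so_mutex);
	return rc;
}
```

A wrapper like this only sees operations that go through it. If the memory holding the mutex were freed or overwritten by some other path, no destroy would be logged, yet a later lock attempt could still fail; that is only one possibility and is not confirmed by the logs above.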
Thanks,
Sriram
3 years, 8 months
Announce Push of V4-dev.57
by Frank Filz
Branch next
Tag:V4-dev.57
Merge Highlights
* Cap NFS v4.0 max response room
Signed-off-by: Frank S. Filz <ffilzlnx(a)mindspring.com>
Contents:
2f7effe Frank S. Filz V4-dev.57
3c7ac66 Olivier Garaud Cap NFS v4.0 max response room
3 years, 8 months
Hung with oracle linux client
by gaurav gangalwar
Hi,
We are facing an I/O hang with Oracle Linux 7.9 clients.
On debugging further, we found that we are sending two zero-window
advertisements: the fd is not closed but we destroyed the xprt, so we
no longer poll on it and the recv queue is exhausted.
We tried with a mix of CentOS 8 and Oracle Linux 7.9 clients. On a
zero window, the CentOS clients reset the connection and start a new
one to recover, so I/O hangs for some time but then recovers. The
Oracle Linux clients do not reset the connection and remain hung
forever.
To reproduce it easily, I ran I/O with the following patch, which
simulates a connection being destroyed during I/O. I just removed
these lines from svc_rqst_clean_func:
- if ((acc->ts.tv_sec - REC_XPRT(xprt)->recv.ts.tv_sec) < acc->timeout)
- return (false);
-
On checking the code further, we found a path where we can rearm with
refs taken, but no task will ever be executed from epoll because the
xprt is in a destroyed state.
This is the code path that could cause the issue:
In svc_ioq_write, svc_rqst_evchan_write rearms with refs taken on
EWOULDBLOCK.
In svc_rqst_epoll_event, svc_xprt_lookup finds the xprt in a destroyed
state (it was destroyed in some other path, possibly due to an error
or an idle cleanup happening at the same time).
So svc_rqst_xprt_task_send never gets a chance to execute and release
the refs taken for the responses.
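The code path described above can be illustrated with a toy model. This is only a sketch under stated assumptions: struct toy_xprt and the toy_* functions are hypothetical and are not the ntirpc types or functions, and the "release on destroyed" branch shows one possible shape of a fix, not necessarily what the pull request linked in this message does.

```c
#include <stdbool.h>

/* Hypothetical, simplified transport; not the ntirpc SVCXPRT. */
struct toy_xprt {
	int refs;	/* outstanding references */
	int pending;	/* queued responses waiting for a writable socket */
	bool destroyed;	/* set once the transport has been torn down */
};

/* Models the rearm on EWOULDBLOCK: a ref is taken for the queued response. */
static void toy_rearm_for_write(struct toy_xprt *x)
{
	x->refs++;
	x->pending++;
}

/* Drop the refs that were taken for queued responses. */
static void toy_release_pending(struct toy_xprt *x)
{
	x->refs -= x->pending;
	x->pending = 0;
}

/*
 * Models the epoll event path. On a live transport the send task runs and
 * releases the refs. On a destroyed transport the send task is skipped, so
 * the refs are released only if release_on_destroyed is true.
 */
static void toy_epoll_event(struct toy_xprt *x, bool release_on_destroyed)
{
	if (x->destroyed) {
		if (release_on_destroyed)
			toy_release_pending(x);
		return;
	}
	toy_release_pending(x);
}

/* Run the scenario and report how many refs are left at the end. */
int toy_leaked_refs(bool release_on_destroyed)
{
	struct toy_xprt x = { .refs = 1, .pending = 0, .destroyed = false };

	toy_rearm_for_write(&x);	/* write would block: rearm with a ref */
	x.destroyed = true;		/* another path destroys the transport */
	x.refs--;			/* ...and drops the base reference */
	toy_epoll_event(&x, release_on_destroyed);
	return x.refs;
}
```

In this model, toy_leaked_refs(false) leaves one reference behind, which stands in for a transport that is never fully released and an fd that stays open, while toy_leaked_refs(true) ends with zero references.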
I have a patch for it, and things work fine with the patch applied:
https://github.com/nfs-ganesha/ntirpc/pull/227
Has anyone faced this issue with the Oracle Linux client, and is there
any workaround we can apply on the client side?
Regards,
Gaurav
3 years, 8 months
Announce Push of V4-dev.56
by Frank Filz
Branch next
Tag:V4-dev.56
Merge Highlights
* Rework fsal_filesystem handling for better config reload and handle
validation
* Add support for btrfs subvols
* Some config documentation updates.
* Add configurable RecoveryRoot, RecoveryDir, RecoveryOldDir
* Remove most references to cache_inode
* Cap READDIRPLUS maxcount memory to export MaxRead
* NFSv3: increase the maximum number of entries that can be returned by
READDIR
* log config errors: wait for systemd
* dbus/export: fix ShowExports not work
* Replace ${CMAKE_SOURCE_DIR}.
Signed-off-by: Frank S. Filz <ffilzlnx(a)mindspring.com>
Contents:
18413fe Frank S. Filz V4-dev.56
3fb9b1b Gao Mingfei Replace ${CMAKE_SOURCE_DIR}.
086f45c Vicente Cheng dbus/export: fix ShowExports not work
6a1eba8 Olivier Garaud log config errors: wait for systemd
02d9877 Olivier Garaud NFSv3: increase the maximum number of entries that
can be returned by READDIR
ca1478d Olivier Garaud Cap READDIRPLUS maxcount memory to export MaxRead
ba3dbde Frank S. Filz Remove most references to cache_inode
83cdf63 Frank S. Filz Add configurable RecoveryRoot, RecoveryDir,
RecoveryOldDir
95b61d3 Frank S. Filz Some config updates.
3985087 Frank S. Filz Add support for btrfs subvols
9fb03f1 Frank S. Filz Rework fsal_filesystem handling for better config
reload and handle validation
3 years, 8 months
Issue with compiling nfs-ganesha with Kerberos enabled on SLES 15
by Chakra Divi
Hi All,
I'm trying to compile nfs-ganesha v3-stable on a SUSE Linux Enterprise Server 15 SP2 machine. I can compile Ganesha with USE_GSS=OFF, but with it enabled the build says krb5-config is not found.
I have installed all the available krb5 development libraries as well, but the cmake script still reports KRB5_C_CONFIG-NOTFOUND.
I found that the cmake script checks for the krb5-config binary, which is available on Ubuntu and CentOS but not on SLES.
Do I need to make any changes to the cmake scripts to fix this krb5-config dependency on a SLES 15 SP2 machine? Or is there another library that I still need to install on the SLES machine to enable GSS with Ganesha?
Thanks for the help
Regards,
Chakra
3 years, 8 months
Change in ...nfs-ganesha[next]: Remove most references to cache_inode
by Frank Filz (GerritHub)
Frank Filz has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/514191 )
Change subject: Remove most references to cache_inode
......................................................................
Remove most references to cache_inode
Most of what remains is dbus, config, and logging which are all exposed
to sysadmin.
Change-Id: I2efee5b7f491da25bf029291f29e2e9540f1307a
Signed-off-by: Frank S. Filz <ffilzlnx(a)mindspring.com>
---
M src/FSAL/FSAL_CEPH/ds.c
M src/FSAL/FSAL_GLUSTER/ds.c
M src/FSAL/FSAL_GLUSTER/handle.c
M src/FSAL/FSAL_GPFS/file.c
M src/FSAL/FSAL_GPFS/fsal_ds.c
M src/FSAL/FSAL_GPFS/handle.c
M src/FSAL/FSAL_KVSFS/kvsfs_ds.c
M src/FSAL/FSAL_KVSFS/kvsfs_file.c
M src/FSAL/FSAL_KVSFS/kvsfs_handle.c
M src/FSAL/FSAL_MEM/mem_handle.c
M src/FSAL/FSAL_PSEUDO/handle.c
M src/FSAL/FSAL_VFS/handle.c
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_ext.h
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_hash.c
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_hash.h
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_int.h
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.h
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_main.c
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_read_conf.c
M src/FSAL/Stackable_FSALs/FSAL_NULL/file.c
M src/FSAL/Stackable_FSALs/FSAL_NULL/handle.c
M src/FSAL/fsal_helper.c
M src/FSAL_UP/fsal_up_top.c
M src/Protocols/9P/9p_getattr.c
M src/Protocols/NFS/nfs4_op_commit.c
M src/Protocols/NFS/nfs4_op_lookupp.c
M src/Protocols/NFS/nfs4_op_open.c
M src/Protocols/NFS/nfs4_op_putfh.c
M src/Protocols/NFS/nfs4_op_read.c
M src/Protocols/NFS/nfs4_op_write.c
M src/Protocols/NLM/nlm_Cancel.c
M src/Protocols/NLM/nlm_Granted_Res.c
M src/Protocols/NLM/nlm_Unlock.c
M src/SAL/nfs4_state.c
M src/SAL/state_lock.c
M src/include/fsal_api.h
M src/include/fsal_types.h
M src/include/nlm_util.h
M src/include/sal_data.h
M src/support/exports.c
42 files changed, 79 insertions(+), 104 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/91/514191/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/514191
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I2efee5b7f491da25bf029291f29e2e9540f1307a
Gerrit-Change-Number: 514191
Gerrit-PatchSet: 1
Gerrit-Owner: Frank Filz <ffilzlnx(a)mindspring.com>
Gerrit-MessageType: newchange
3 years, 8 months
Change in ...nfs-ganesha[next]: Add configurable RecoveryRoot, RecoveryDir, RecoveryOldDir
by Frank Filz (GerritHub)
Frank Filz has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/514184 )
Change subject: Add configurable RecoveryRoot, RecoveryDir, RecoveryOldDir
......................................................................
Add configurable RecoveryRoot, RecoveryDir, RecoveryOldDir
Change-Id: I2b47a694ed9d6b6fe1914efe692f4796f4502a77
Signed-off-by: Frank S. Filz <ffilzlnx(a)mindspring.com>
---
M src/SAL/recovery/recovery_fs.c
M src/SAL/recovery/recovery_fs.h
M src/SAL/recovery/recovery_fs_ng.c
M src/config_samples/config.txt
M src/doc/man/ganesha-core-config.rst
M src/include/config-h.in.cmake
M src/include/gsh_config.h
M src/support/nfs_read_conf.c
8 files changed, 73 insertions(+), 36 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/84/514184/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/514184
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I2b47a694ed9d6b6fe1914efe692f4796f4502a77
Gerrit-Change-Number: 514184
Gerrit-PatchSet: 1
Gerrit-Owner: Frank Filz <ffilzlnx(a)mindspring.com>
Gerrit-MessageType: newchange
3 years, 8 months
Change in ...nfs-ganesha[next]: NFSv3: increase the maximum number of entries that can be returned by...
by Olivier Garaud (GerritHub)
Olivier Garaud has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/514149 )
Change subject: NFSv3: increase the maximum number of entries that can be returned by READDIR
......................................................................
NFSv3: increase the maximum number of entries that can be returned by READDIR
It was limited to 120 entries; the response size is now capped to XDR_BYTES_MAXLEN_IO.
Change-Id: I91b0ae4e25f0f98996beec5154f9cda41d1fe98b
Signed-off-by: Olivier Garaud <olivier.garaud(a)scality.com>
---
M src/Protocols/NFS/nfs3_readdir.c
1 file changed, 5 insertions(+), 1 deletion(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/49/514149/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/514149
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I91b0ae4e25f0f98996beec5154f9cda41d1fe98b
Gerrit-Change-Number: 514149
Gerrit-PatchSet: 1
Gerrit-Owner: Olivier Garaud <olivier.garaud(a)scality.com>
Gerrit-MessageType: newchange
3 years, 8 months
Change in ...nfs-ganesha[next]: Cap NFS v4.0 max response room
by Olivier Garaud (GerritHub)
Olivier Garaud has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/514148 )
Change subject: Cap NFS v4.0 max response room
......................................................................
Cap NFS v4.0 max response room
Maximum response size is now capped to export MaxRead
or by default to XDR_BYTES_MAXLEN_IO to prevent CVE-2018-17159
Change-Id: I19a5e07617a92347e65de3c88d4dd225b3ef2f0e
Signed-off-by: Olivier Garaud <olivier.garaud(a)scality.com>
---
M src/Protocols/NFS/nfs_proto_tools.c
1 file changed, 23 insertions(+), 4 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/48/514148/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/514148
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I19a5e07617a92347e65de3c88d4dd225b3ef2f0e
Gerrit-Change-Number: 514148
Gerrit-PatchSet: 1
Gerrit-Owner: Olivier Garaud <olivier.garaud(a)scality.com>
Gerrit-MessageType: newchange
3 years, 8 months
Change in ...nfs-ganesha[next]: Cap READDIRPLUS maxcount memory to export MaxRead
by Olivier Garaud (GerritHub)
Olivier Garaud has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/514147 )
Change subject: Cap READDIRPLUS maxcount memory to export MaxRead
......................................................................
Cap READDIRPLUS maxcount memory to export MaxRead
to prevent CVE-2018-17159
Change-Id: Ie3265d7644e627836a073199bf873a2a123a5327
Signed-off-by: Olivier Garaud <olivier.garaud(a)scality.com>
---
M src/Protocols/NFS/nfs3_readdirplus.c
1 file changed, 5 insertions(+), 1 deletion(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/47/514147/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/514147
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: Ie3265d7644e627836a073199bf873a2a123a5327
Gerrit-Change-Number: 514147
Gerrit-PatchSet: 1
Gerrit-Owner: Olivier Garaud <olivier.garaud(a)scality.com>
Gerrit-MessageType: newchange
3 years, 8 months
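The READDIRPLUS change above caps the client-supplied maxcount to the export's MaxRead; the commit message gives preventing CVE-2018-17159 as the goal. The following is a minimal sketch of that kind of clamp, not the code from the change itself: struct toy_export, its max_read field, and clamp_maxcount are hypothetical names chosen for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for per-export settings; not Ganesha's export type. */
struct toy_export {
	uint64_t max_read;	/* conceptually, the export's MaxRead */
};

/*
 * Clamp a client-supplied maxcount so that a reply buffer is never sized
 * beyond what the export allows, whatever value the client sends.
 */
uint32_t clamp_maxcount(uint32_t client_maxcount, const struct toy_export *exp)
{
	/* The cast below is safe: max_read is smaller than a 32-bit value. */
	if ((uint64_t)client_maxcount > exp->max_read)
		return (uint32_t)exp->max_read;
	return client_maxcount;
}
```

With max_read set to 1048576, a request for 4096 bytes passes through unchanged, while a request for UINT32_MAX bytes is reduced to 1048576.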