April 2023 - Devel - Nfs Ganesha List Archives

Client id expiration - multiple clients using same hostname and deadlock issues.

by Deepak Arumugam Sankara Subramanian

Hi,

We recently ran into some issues in the client-id expiration code paths. We have had multiple cases where a CREATE_SESSION tries to expire a clientid while that clientid is being used by other RPCs such as OPEN or EXCHANGE_ID. You typically don't expect a client to send an OPEN or EXCHANGE_ID RPC while the server is processing a CREATE_SESSION RPC unless the client is misbehaving. We found that in our labs/test beds multiple clients had the same hostname set, and Ganesha was mapping them to the same client record. Although the clients were misbehaving, we feel that the server should handle this or fail gracefully. In these cases the server usually either hits an unexpected assert or segfault and crashes, or runs into a deadlock and hangs forever.

Q1: We need some recommendations on what the right thing to do is when multiple clients use the same owner_id. The RFC says this about co_ownerid:

    The string should be unique so that multiple clients do not present the same string. The consequences of two clients presenting the same string range from one client getting an error to one client having its leased state abruptly and unexpectedly cancelled.

We wanted to know whether one of these responses is better than the other.

Here are some more details on the individual issues seen on setups where clients share a hostname.

Deadlock: Two client ids, say c1 and c2, from two different clients were associated with the same co_ownerid and the same record cr1.

1. Client 1 -> thread 1 was doing an open. It was inside the get_state_owner function, trying to get an owner for the open state. It acquired ht_owner->partitions[15].lock, created a new open owner, and was trying to take the mutex on cr1 (nfs4_owner->so_clientrec->cid_mutex) so that it could add the owner to the client record cr1.
2. Client 2 -> thread 2 was doing a create session. It was inside the nfs_clientid_expire function, walking the open owner list inside c1 and trying to 'delete' each owner (the second while(true) loop). It was holding nfs4_owner->so_clientrec->cid_mutex and trying to get ht_owner->partitions[15].lock.

Q2: If we were to fix this deadlock, what is the recommended order for acquiring the locks? Should we acquire client_rec->cid_mutex first and then the ht_owner table lock, or vice versa? This looks like it could be a common deadlock pattern that we might have encountered before, since we have two structures referencing each other, each with its own lock.

I can provide more details on the other failures (asserts, segfaults, etc.). Let me know if that is needed.

Thanks, Deepak
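The deadlock described above is a classic lock-order inversion: one path takes the partition lock and then the client mutex, the other takes them in the opposite order. The sketch below is a minimal, self-contained illustration of the usual remedy, which is to make every path take both locks in one agreed order. All names here are hypothetical stand-ins rather than Ganesha code, and the order chosen (client mutex first) is arbitrary for the demonstration; it is not a recommendation for which order Ganesha should adopt.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two locks named in the report. */
static pthread_mutex_t client_mutex = PTHREAD_MUTEX_INITIALIZER;    /* plays the role of cid_mutex */
static pthread_mutex_t partition_lock = PTHREAD_MUTEX_INITIALIZER;  /* plays the role of a hash partition lock */

static int owners_linked; /* state that needs both locks held */

/* The single agreed order: client mutex first, then partition lock. */
static void lock_both(void)
{
	pthread_mutex_lock(&client_mutex);
	pthread_mutex_lock(&partition_lock);
}

/* Release in the reverse order of acquisition. */
static void unlock_both(void)
{
	pthread_mutex_unlock(&partition_lock);
	pthread_mutex_unlock(&client_mutex);
}

/* OPEN-like path: links a new owner into the client record. */
static void *open_path(void *arg)
{
	int n = *(int *)arg;

	for (int i = 0; i < n; i++) {
		lock_both();
		owners_linked++;
		unlock_both();
	}
	return NULL;
}

/* Expire-like path: unlinks an owner from the client record. */
static void *expire_path(void *arg)
{
	int n = *(int *)arg;

	for (int i = 0; i < n; i++) {
		lock_both();
		owners_linked--;
		unlock_both();
	}
	return NULL;
}

/* Runs both paths concurrently and returns the final counter value.
 * Because both paths use lock_both(), neither can hold one lock while
 * waiting for the other in the opposite order. */
int run_lock_order_demo(int iterations)
{
	pthread_t t1, t2;
	int rc;

	owners_linked = 0;
	rc = pthread_create(&t1, NULL, open_path, &iterations);
	assert(rc == 0);
	rc = pthread_create(&t2, NULL, expire_path, &iterations);
	assert(rc == 0);
	(void)rc;
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return owners_linked;
}
```

Calling run_lock_order_demo() with any iteration count should return 0 and never hang. In real code the harder part is usually that one of the locks is already held when the need for the other is discovered; common ways out are to drop and re-acquire in the agreed order, or to use pthread_mutex_trylock() and back off on failure.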

2 years, 8 months


UDP mount in nfs-ganesha

by Sagar Singh

Hi,

We are trying to get UDP mounts working with Ganesha and need some feedback. For NLM lock recovery we use the node-local IP address of the Ganesha server node. This works perfectly fine with a TCP mount. With a UDP mount, however, we have seen that rq_xprt->xp_local.nb.buf points to a valid local IP, while rq_xprt->xp_local.ss has the address ADDR_ANY:

(gdb) p *((struct sockaddr_in*)&reqdata->svc->rq_xprt->xp_local.ss)
$2 = {sin_family = 2, sin_port = 264, sin_addr = {s_addr = 0}, sin_zero = "\000\000\000\000\000\000\000"}
(gdb) p *((struct sockaddr_in*)reqdata->svc->rq_xprt->xp_local.nb.buf)
$3 = {sin_family = 3, sin_port = 0, sin_addr = {s_addr = 2237796106}, sin_zero = "\n\017b\205\000\000\000"}

This causes problems in lock recovery and other functionality. We have made some changes to the initialization code for the local IP:

https://github.com/nfs-ganesha/ntirpc/pull/273/files

These seem to resolve the issue of the local address being ADDR_ANY, and lock recovery works fine. We are not sure about the following:

1. Will this fix work with all kinds of clients?
2. Will this break any existing functionality?

Any feedback will be greatly appreciated.

Thanks, Sagar Singh
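The raw integers in the gdb output are easier to read once decoded. gdb prints sin_port and s_addr as host integers even though the fields hold network byte order, so on a little-endian host the bytes appear reversed. The sketch below decodes the two values from the post under the assumption that the session ran on a little-endian machine; that assumption is consistent with sin_port = 264 decoding to 2049, the standard NFS port. The helper names are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Swap the two bytes of a 16-bit value, which is what ntohs() does
 * on a little-endian host. */
static uint16_t swap16(uint16_t v)
{
	return (uint16_t)((v >> 8) | (v << 8));
}

/* Format an s_addr value, given as the integer gdb prints on a
 * little-endian host, as a dotted quad. The lowest byte of the
 * integer is the first octet of the address. */
static void format_s_addr_le(uint32_t s_addr, char *out, size_t outlen)
{
	snprintf(out, outlen, "%u.%u.%u.%u",
		 (unsigned)(s_addr & 0xff),
		 (unsigned)((s_addr >> 8) & 0xff),
		 (unsigned)((s_addr >> 16) & 0xff),
		 (unsigned)((s_addr >> 24) & 0xff));
}

/* Decode the two values quoted in the post and check the results. */
void check_values_from_post(void)
{
	char addr[16];

	format_s_addr_le(2237796106u, addr, sizeof(addr));
	assert(strcmp(addr, "10.15.98.133") == 0); /* nb.buf address */
	assert(swap16(264) == 2049);               /* xp_local.ss port */
}
```

So, under that assumption, xp_local.ss carries port 2049 with a wildcard address (s_addr = 0), while the s_addr field read from nb.buf decodes to 10.15.98.133.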

2 years, 8 months


[S] Change in ...nfs-ganesha[next]: cmake: minium version for ntirpc

by Kaleb KEITHLEY (GerritHub)

Kaleb KEITHLEY has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/553137 )

Change subject: cmake: minium version for ntirpc
......................................................................

cmake: minium version for ntirpc

4.0 -> 5.0

Change-Id: If51ab58838e372f866f9a12a135fec00bb4c76f1
Signed-off-by: Kaleb S. KEITHLEY <kkeithle(a)redhat.com>
---
M src/CMakeLists.txt
1 file changed, 13 insertions(+), 1 deletion(-)

git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/37/553137/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/553137
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings

Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: If51ab58838e372f866f9a12a135fec00bb4c76f1
Gerrit-Change-Number: 553137
Gerrit-PatchSet: 1
Gerrit-Owner: Kaleb KEITHLEY <kaleb(a)redhat.com>
Gerrit-MessageType: newchange

2 years, 8 months


[S] Change in ...nfs-ganesha[next]: fix: broken v4_recov_dir_len when using fs_ng backend

by Shuoran Liu (GerritHub)

Shuoran Liu has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/553133 )

Change subject: fix: broken v4_recov_dir_len when using fs_ng backend
......................................................................

fix: broken v4_recov_dir_len when using fs_ng backend

The fs_ng backend reuses v4_recov_dir and v4_recov_dir_len from the fs backend. However, v4_recov_dir_len is not set properly, so the client recovery dir won't be deleted correctly in fs_rm_clid.

Signed-off-by: Shuoran Liu <shuoran.liu(a)smartx.com>
Change-Id: I1de7796d769850e619ca74e8721742756f29c77b
---
M src/SAL/recovery/recovery_fs.h
M src/SAL/recovery/recovery_fs_ng.c
2 files changed, 17 insertions(+), 0 deletions(-)

git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/33/553133/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/553133
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings

Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I1de7796d769850e619ca74e8721742756f29c77b
Gerrit-Change-Number: 553133
Gerrit-PatchSet: 1
Gerrit-Owner: Shuoran Liu <shuoran.liu(a)smartx.com>
Gerrit-MessageType: newchange
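The bug class described in this change is a cached string length that drifts out of sync with the string it describes. The sketch below is a generic illustration of one way to avoid that, by only ever setting the path and its length together. It is not taken from the patch; the struct, the function names, and the example paths are all hypothetical, and how fs_rm_clid actually uses v4_recov_dir_len is not shown in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical cached recovery directory: path plus its length. */
struct recov_path {
	char dir[256];
	size_t dir_len;
};

/* Set the path and its cached length in one place so they cannot
 * disagree. Returns 0 on success, -1 if the path does not fit. */
static int recov_path_set(struct recov_path *rp, const char *root,
			  const char *sub)
{
	int n = snprintf(rp->dir, sizeof(rp->dir), "%s/%s", root, sub);

	if (n < 0 || (size_t)n >= sizeof(rp->dir))
		return -1;
	rp->dir_len = (size_t)n;
	return 0;
}

/* Return the part of `full` below the recovery directory, or NULL if
 * `full` is not inside it. A stale dir_len (for example 0) would make
 * this prefix check pass for unrelated paths or return the wrong
 * suffix, which is the kind of failure a wrong length causes. */
static const char *recov_path_relative(const struct recov_path *rp,
				       const char *full)
{
	if (strncmp(full, rp->dir, rp->dir_len) != 0 ||
	    full[rp->dir_len] != '/')
		return NULL;
	return full + rp->dir_len + 1;
}
```

A caller would invoke recov_path_set() once during backend initialization and then rely on recov_path_relative() when stripping the directory prefix from a client path before removal.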

2 years, 8 months


[M] Change in ...nfs-ganesha[next]: rgw: rgw_xattr fixes

by Matt Benjamin (GerritHub)

Matt Benjamin has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/553107 )

Change subject: rgw: rgw_xattr fixes
......................................................................

rgw: rgw_xattr fixes

* fix rgw_getxattrs preamble in getxattrs
* rgw: remove xattrs don't work comment
* fix rgw_setxattrs preamble in setxattrs
* fix rgw_rmxattr preamble in removexattrs
* restructure listxattrs to enumerate returned names
* rgw: conditionally set xattr_support in fs_info
  Possibly a regression in other fsals, this is being set in FSAL_CEPH, but not FSAL_GPFS.
* rgw: don't pass address of pointer xa_value as cb_arg
  That causes xa_value to point to junk.

Change-Id: I35607443754015dbfbf4d1b02c5fa6777c3ee4a9
Signed-off-by: Matt Benjamin <mbenjamin(a)redhat.com>
---
M src/FSAL/FSAL_RGW/handle.c
M src/FSAL/FSAL_RGW/main.c
M src/cmake/modules/FindRGW.cmake
3 files changed, 79 insertions(+), 48 deletions(-)

git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/07/553107/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/553107
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings

Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I35607443754015dbfbf4d1b02c5fa6777c3ee4a9
Gerrit-Change-Number: 553107
Gerrit-PatchSet: 1
Gerrit-Owner: Matt Benjamin <mbenjamin(a)redhat.com>
Gerrit-MessageType: newchange
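The last item in that list, passing the address of a pointer as a callback argument, is a common mistake with void * callback parameters because the compiler accepts either form silently. The sketch below is a generic illustration of the pattern, not the FSAL_RGW code: the struct, the callback, and the fetch function are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical output buffer that a callback fills in. */
struct xattr_value {
	char buf[32];
	size_t len;
};

/* Hypothetical callback: `arg` must be a struct xattr_value *. */
static int store_value_cb(const char *val, size_t len, void *arg)
{
	struct xattr_value *out = arg;

	if (len > sizeof(out->buf))
		return -1;
	memcpy(out->buf, val, len);
	out->len = len;
	return 0;
}

/* Hypothetical library call that hands a value to the callback. */
static int fetch_value(int (*cb)(const char *, size_t, void *),
		       void *cb_arg)
{
	static const char data[] = "blue";

	return cb(data, sizeof(data) - 1, cb_arg);
}

/* Caller that already holds a pointer to the output struct. */
int fetch_into(struct xattr_value *xa_value)
{
	/* Correct: pass the pointer itself. Passing &xa_value would give
	 * the callback a struct xattr_value **, so it would overwrite the
	 * pointer variable and nearby stack memory instead of filling
	 * the struct, leaving xa_value pointing at junk. */
	return fetch_value(store_value_cb, xa_value);
}
```

Since void * converts implicitly from any object pointer type in C, neither form produces a diagnostic; the only defense is to check at each call site that the argument type matches what the callback casts it to.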

2 years, 8 months


[S] Change in ...nfs-ganesha[next]: rgw: require rgw_file version 1.2.1 (rgw_readdir2 returns int)

by Matt Benjamin (GerritHub)

Matt Benjamin has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/553106 )

Change subject: rgw: require rgw_file version 1.2.1 (rgw_readdir2 returns int)
......................................................................

rgw: require rgw_file version 1.2.1 (rgw_readdir2 returns int)

Required for build against Reef+.

Change-Id: I5637517edd4ac141e3068b79f9ee76c85f56bdcf
Signed-off-by: Matt Benjamin <mbenjamin(a)redhat.com>
---
M src/CMakeLists.txt
M src/FSAL/FSAL_RGW/handle.c
M src/FSAL/FSAL_RGW/internal.h
3 files changed, 17 insertions(+), 5 deletions(-)

git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/06/553106/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/553106
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings

Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I5637517edd4ac141e3068b79f9ee76c85f56bdcf
Gerrit-Change-Number: 553106
Gerrit-PatchSet: 1
Gerrit-Owner: Matt Benjamin <mbenjamin(a)redhat.com>
Gerrit-MessageType: newchange

2 years, 8 months


[S] Change in ...nfs-ganesha[next]: Fix for fail to unmount fs after removing all exports of that fs

by Name of user not set (GerritHub)

yogendra858(a)yahoo.com has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/553096 )

Change subject: Fix for fail to unmount fs after removing all exports of that fs
......................................................................

Fix for fail to unmount fs after removing all exports of that fs

Change-Id: I850ad9fda5de8f39da19b552921400f558fc697e
Signed-off-by: Yogendra Charya <Yogendra.Charya(a)ibm.com>
---
M src/FSAL/FSAL_GPFS/export.c
1 file changed, 13 insertions(+), 2 deletions(-)

git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/96/553096/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/553096
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings

Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I850ad9fda5de8f39da19b552921400f558fc697e
Gerrit-Change-Number: 553096
Gerrit-PatchSet: 1
Gerrit-Owner: yogendra858(a)yahoo.com
Gerrit-MessageType: newchange

2 years, 8 months


V5.0 released

by Frank Filz

See release notes: https://github.com/nfs-ganesha/nfs-ganesha/wiki/ReleaseNotes_5

2 years, 8 months


Announce Push of V5.0

by Frank Filz

Branch next
Tag:V5.0

NOTE: A new ntirpc pullup is included. Please update your submodule.

Merge Highlights

* ntirpc 5.0
* Several fixes for compile/build errors introduced by V5-dev.5
* Fix the NFSv3 -> v4 handle mapping length check.
* ganesha-top: add tool named ganesha-top
* Do not provide delegations when another client has a write open file handle
* Lease_Lifetime cannot be 0.
* Add a bit of debug for config
* EXPORT: Add CLIENT blocks to EXPORT_DEFAULTS
* FSAL: All FSALs must implement alloc_state and status2
* CONFIG: Make most parameters and blocks unique, clarify the rest
* Add find_unused_blocks to check for unused config blocks
* Add FSAL_LIST config block to list the FSAL specific config blocks
* Add command line option for config errors to be fatal on startup
* FSAL_VFS: On Linux make setattr >0x7ffffffffffffff return EFBIG
* SAL: wrap init and destroy all_locks_mutex in #ifdef DEBUG_SAL
* Eradicate most refereces to cache inode.
* FSAL cap retries for stat while resolving POSIX filesystems
* FSAL_CEPH: enable POSIX ACL
* Fix race with granting blocked locks
* Replace some LogCrit followed by exit with LogFatal
* FSAL: posix2fsal_status(EBUSY); is too noisy

Signed-off-by: Frank S. Filz <ffilzlnx(a)mindspring.com>

Contents:

13419f57f Frank S. Filz V5.0
6bbbd60ea Daniel Gryniewicz Pullup to ntirpc 5.0
15a75d92d Daniel Gryniewicz Fix gtests for change in async API
315d2c0f4 Daniel Gryniewicz Can't have 'export' in a header; it breaks C++
3874f3a2d Daniel Gryniewicz MDCACHE - fix lttng tracepoint
6423e227f Virtually Nick Fix the NFSv3 -> v4 handle mapping length check.
42ace9547 Vicente Cheng ganesha-top: add tool named ganesha-top
94cc885ad Deepak Arumugam Sankara Subramanian Do not provide delegations when another client has a write open file handle
6e455bbbe Frank S. Filz Lease_Lifetime cannot be 0.
017fd2ab0 Frank S. Filz Add a bit of debug for config
c0b84ddd3 Frank S. Filz EXPORT: Add CLIENT blocks to EXPORT_DEFAULTS
d2e46d371 Frank S. Filz FSAL: All FSALs must implement alloc_state and status2
991b1f5ea Frank S. Filz CONFIG: Make most parameters and blocks unique, clarify the rest
b35d8cfee Frank S. Filz Add find_unused_blocks to check for unused config blocks
16119e22c Frank S. Filz Add FSAL_LIST config block to list the FSAL specific config blocks
2223ebf1a Frank S. Filz Add command line option for config errors to be fatal on startup
eb5eebb61 Frank S. Filz FSAL_VFS: On Linux make setattr >0x7ffffffffffffff return EFBIG
117f07ade Frank S. Filz SAL: wrap init and destroy all_locks_mutex in #ifdef DEBUG_SAL
98771cef8 Frank S. Filz Eradicate most refereces to cache inode.
6e18be923 Frank S. Filz FSAL cap retries for stat while resolving POSIX filesystems
439c34706 Frank S. Filz FSAL_CEPH: enable POSIX ACL
92fe1d05c Frank S. Filz Fix race with granting blocked locks
b52582d9c Frank S. Filz Replace some LogCrit followed by exit with LogFatal
d323655d6 Frank S. Filz FSAL: posix2fsal_status(EBUSY); is too noisy

2 years, 8 months


[M] Change in ...nfs-ganesha[next]: Donot provide delegations when another client has a write open file h...

by Name of user not set (GerritHub)

deepakarumugam.s(a)nutanix.com has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/552963 )

Change subject: Donot provide delegations when another client has a write open file handle
......................................................................

Donot provide delegations when another client has a write open file handle

Bug description:
Today we provide read delegations to clients even when another client has a write open file handle. This causes read delegated clients to hold stale data (but think that the data is valid) when the write client tries to write to the file. Experiments show that when the write client writes to the file we respond to the write client saying write is done and then do a delegation recall much later; this is problematic.

How this bug was found:
This bug was found by running pynfs delegation tests against ganesha server and comparing it against linux kernel nfs server.

Fix:
This patch fixes that issue by introducing another counter to keep track of the number of write opens. We don't provide delegations if another client has opened this file in write mode.

Change-Id: Icc203c9c144097efaed2b4fa928abe6d4ad6099e
Signed-off-by: Deepak Arumugam Sankara Subramanian <deepakarumugam.s(a)nutanix.com>
---
M src/Protocols/NFS/nfs4_op_close.c
M src/Protocols/NFS/nfs4_op_open.c
M src/SAL/state_deleg.c
M src/include/sal_data.h
4 files changed, 63 insertions(+), 2 deletions(-)

git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/63/552963/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/552963
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings

Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: Icc203c9c144097efaed2b4fa928abe6d4ad6099e
Gerrit-Change-Number: 552963
Gerrit-PatchSet: 1
Gerrit-Owner: deepakarumugam.s(a)nutanix.com
Gerrit-MessageType: newchange
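The fix described above, a per-file counter of write opens that gates read delegations, can be sketched in a few lines. The code below is only an illustration of the idea under simplifying assumptions: the struct and function names are hypothetical and are not those used in sal_data.h or state_deleg.c, locking is omitted, and it does not distinguish the requesting client's own write opens from those of other clients.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-file bookkeeping for delegation decisions. */
struct file_deleg_stats {
	uint32_t write_opens; /* number of currently active write opens */
};

/* Record an OPEN; only write opens affect the counter. */
static void note_open(struct file_deleg_stats *s, bool for_write)
{
	if (for_write)
		s->write_opens++;
}

/* Record a CLOSE of an open previously passed to note_open(). */
static void note_close(struct file_deleg_stats *s, bool was_write)
{
	if (was_write) {
		assert(s->write_opens > 0);
		s->write_opens--;
	}
}

/* A read delegation is only safe to hand out while nobody has the
 * file open for writing. */
static bool may_grant_read_deleg(const struct file_deleg_stats *s)
{
	return s->write_opens == 0;
}
```

The counter has to be updated on both the open and the close paths, which is consistent with the patch touching both nfs4_op_open.c and nfs4_op_close.c.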

2 years, 8 months

