Change in ...nfs-ganesha[next]: Allow EXPORT pseudo path to be changed during export update
by Frank Filz (GerritHub)
Frank Filz has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490334 )
Change subject: Allow EXPORT pseudo path to be changed during export update
......................................................................
Allow EXPORT pseudo path to be changed during export update
This also fully allows adding or removing NFSv4 support from an export
since we can now handle the PseudoFS swizzling that occurs.
Note that an explicit PseudoFS export may be removed or added, though
you cannot change it from export_id 0 because we currently don't allow
changing the export_id.
Note that this patch doesn't handle DBUS add or remove export, though
that could be added as an improvement. I may add them to this patch (it
wouldn't be that hard), but I want to get this reviewed as is right now.
There are implications to a client of changing the PseudoFS. I have
tested moving an export in the PseudoFS with a client mounted. The
client will be able to continue accessing the export, though it may
see an ESTALE error if it navigates out of the export. The current
working directory will go bad and the pwd command will fail indicating
a disconnected mount. I have also seen referencing .. from the root of
the export wrapping around back to the root (I believe this is how
disconnected mounts are set up).
FSAL_PSEUDO lookups and create handles (PUTFH or any use of an NFSv3
handle where the inode isn't cached) which fail during an export update
are instead turned into ERR_FSAL_DELAY which turns into NFS4ERR_DELAY or
NFS3ERR_JUKEBOX to force the client to retry under the completed update.
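
The retry behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the patch's code: the `sketch_` and `SK_` names, the `update_in_progress` flag, and both helper functions are hypothetical. Only ERR_FSAL_DELAY, NFS4ERR_DELAY and NFS3ERR_JUKEBOX come from the commit message; the numeric wire values are the ones defined by the NFSv3 and NFSv4 protocol specifications.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for FSAL status codes (not Ganesha's real enum). */
enum sketch_fsal_err {
	SK_FSAL_NO_ERROR,
	SK_FSAL_NOENT,
	SK_FSAL_STALE,
	SK_FSAL_DELAY		/* plays the role of ERR_FSAL_DELAY */
};

enum sketch_nfs_ver { SK_NFSV3 = 3, SK_NFSV4 = 4 };

/* Wire values from the protocol specs; both happen to be 10008. */
#define SK_NFS3ERR_JUKEBOX 10008
#define SK_NFS4ERR_DELAY   10008

/* A PseudoFS lookup or create_handle that fails while an export update
 * is in flight is reported as "delay" rather than as the real failure,
 * so the client retries once the update has completed. */
static enum sketch_fsal_err pseudo_result(enum sketch_fsal_err err,
					  bool update_in_progress)
{
	if (err != SK_FSAL_NO_ERROR && update_in_progress)
		return SK_FSAL_DELAY;
	return err;
}

/* Translate the FSAL-level status into the per-protocol wire status. */
static int to_wire_status(enum sketch_fsal_err err, enum sketch_nfs_ver ver)
{
	assert(ver == SK_NFSV3 || ver == SK_NFSV4);

	switch (err) {
	case SK_FSAL_NO_ERROR:
		return 0;
	case SK_FSAL_NOENT:
		return 2;	/* NFS3ERR_NOENT / NFS4ERR_NOENT */
	case SK_FSAL_STALE:
		return 70;	/* NFS3ERR_STALE / NFS4ERR_STALE */
	case SK_FSAL_DELAY:
		return ver == SK_NFSV4 ? SK_NFS4ERR_DELAY : SK_NFS3ERR_JUKEBOX;
	}
	return -1;
}
```

With this shape, a failure outside an update window still reaches the client unchanged; only failures that race with the update are converted into a retry.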
Change-Id: I507dc17a651936936de82303ff1291677ce136be
Signed-off-by: Frank S. Filz <ffilzlnx(a)mindspring.com>
---
M src/FSAL/FSAL_PSEUDO/handle.c
M src/MainNFSD/libganesha_nfsd.ver
M src/Protocols/NFS/nfs4_pseudo.c
M src/include/export_mgr.h
M src/include/nfs_proto_functions.h
M src/support/export_mgr.c
M src/support/exports.c
7 files changed, 560 insertions(+), 203 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/34/490334/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490334
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I507dc17a651936936de82303ff1291677ce136be
Gerrit-Change-Number: 490334
Gerrit-PatchSet: 1
Gerrit-Owner: Frank Filz <ffilzlnx(a)mindspring.com>
Gerrit-MessageType: newchange
10 months, 1 week
Help needed on FD
by Alok Sinha
The FD created in the code path below does not have a correct link in
/proc/self/fd. With vfs_readlink_by_handle, I can get a relative path but
cannot get the absolute path of the FD.
I need a suggestion for one of the following to work around a production bug:
- How can I get the full path of the FD?
- How can I get the parent vfs_fsal_obj_handle?
- Is there any way to bypass this flow by changing a config file?
#0 vfs_create_handle (exp_hdl=0x55801aca22c0, hdl_desc=0x7f93889b1600,
handle=0x7f93889b13f8, attrs_out=0x7f93889b1430)
at /home/alok/pub/splfs-cache-2.8.3/src/FSAL/FSAL_VFS/handle.c:2020
#1 0x00007f94a1f0e13b in mdcache_locate_host (fh_desc=0x7f93889b1600,
export=0x55801aca1f20, entry=0x7f93889b1578, attrs_out=0x0)
at
/home/alok/pub/splfs-cache-2.8.3/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:1109
#2 0x00007f94a1f02455 in mdcache_create_handle (exp_hdl=0x55801aca1f20,
fh_desc=0x7f93889b1600, handle=0x7f93889b15d8, attrs_out=0x0)
at
/home/alok/pub/splfs-cache-2.8.3/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:1613
#3 0x00007f94a1ec5ead in nfs4_mds_putfh (data=0x7f93201119e0)
at
/home/alok/pub/splfs-cache-2.8.3/src/Protocols/NFS/nfs4_op_putfh.c:211
#4 0x00007f94a1ec60d3 in nfs4_op_putfh (op=0x7f93201186d0,
data=0x7f93201119e0, resp=0x7f932010ab20)
at
/home/alok/pub/splfs-cache-2.8.3/src/Protocols/NFS/nfs4_op_putfh.c:281
#5 0x00007f94a1ea96ae in process_one_op (data=0x7f93201119e0,
status=0x7f93889b1b48)
at
/home/alok/pub/splfs-cache-2.8.3/src/Protocols/NFS/nfs4_Compound.c:920
#6 0x00007f94a1eaa8c5 in nfs4_Compound (arg=0x7f9320138d78,
req=0x7f93201385a0, res=0x7f9320106840)
at
/home/alok/pub/splfs-cache-2.8.3/src/Protocols/NFS/nfs4_Compound.c:1329
#7 0x00007f94a1de8567 in nfs_rpc_process_request (reqdata=0x7f93201385a0)
at
/home/alok/pub/splfs-cache-2.8.3/src/MainNFSD/nfs_worker_thread.c:1484
#8 0x00007f94a1de8854 in nfs_rpc_valid_NFS (req=0x7f93201385a0)
at
/home/alok/pub/splfs-cache-2.8.3/src/MainNFSD/nfs_worker_thread.c:1591
#9 0x00007f94a13b6faf in svc_vc_decode (req=0x7f93201385a0)
at /home/alok/pub/splfs-cache-2.8.3/src/libntirpc/src/svc_vc.c:829
#10 0x00007f94a13b3225 in svc_request (xprt=0x7f9320000ce0,
xdrs=0x7f9320066260)
at /home/alok/pub/splfs-cache-2.8.3/src/libntirpc/src/svc_rqst.c:793
#11 0x00007f94a13b6ec0 in svc_vc_recv (xprt=0x7f9320000ce0)
at /home/alok/pub/splfs-cache-2.8.3/src/libntirpc/src/svc_vc.c:802
#12 0x00007f94a13b31a5 in svc_rqst_xprt_task (wpe=0x7f9320000f00)
at /home/alok/pub/splfs-cache-2.8.3/src/libntirpc/src/svc_rqst.c:774
#13 0x00007f94a13bcab0 in work_pool_thread (arg=0x7f9218003340)
at /home/alok/pub/splfs-cache-2.8.3/src/libntirpc/src/work_pool.c:184
#14 0x00007f94a1badea5 in start_thread () from /lib64/libpthread.so.0
#15 0x00007f94a16ce9fd in clone () from /lib64/libc.so.6
--
Alok Sinha
www.spillbox.io
https://youtu.be/U-YupjLQ9bU
11 months
Announce Push of V5.3.2
by Frank Filz
Branch next
Tag:V5.3.2
Merge Highlights
* [GPFS] Change handle_size to 48 if GPFS returned handle with size 40
* Add LogEventLimited to trace in fsal_common_is_referral
* Handle granted upcalls when they are not in the blocked list
* Add an option to print op id as part of a log entry
Signed-off-by: Frank S. Filz <ffilzlnx(a)mindspring.com>
Contents:
850c6fbb9 Frank S. Filz V5.3.2
a99efb78d Shahar Hochma Add an option to print op id as part of a log entry
137110141 Malahal Naineni Handle granted upcalls when they are not in the
blocked list
986bddc8e Madhu Thorat Add LogEventLimited to trace in
fsal_common_is_referral
d70dc1a42 Madhu Thorat [GPFS] Change handle_size to 48 if GPFS returned
handle with size 40
1 year, 5 months
[M] Change in ...nfs-ganesha[next]: MDCACHE: We need to reap chunks and implement a low water mark
by Frank Filz (GerritHub)
Frank Filz has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/556283?usp=email )
Change subject: MDCACHE: We need to reap chunks and implement a low water mark
......................................................................
MDCACHE: We need to reap chunks and implement a low water mark
We shouldn't cache directories forever. We should reap directory
chunks at some interval and rate.
Add a low water mark that we will reap down to.
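
As a rough sketch of the idea (not the patch itself), a periodic pass could reap at a bounded rate until usage drops to the low water mark. The struct, its field names, and `reap_pass` are hypothetical; the real change lives in mdcache_lru.c and may be structured quite differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical chunk-cache counters; names are not taken from the patch. */
struct chunk_cache {
	size_t chunks_used;	/* directory chunks currently cached */
	size_t chunks_lowat;	/* low water mark to reap down to */
	size_t reap_per_run;	/* rate limit: max chunks freed per pass */
};

/* One periodic reaper pass. Frees chunks from the LRU end until either
 * the low water mark is reached or the per-pass budget is spent.
 * Returns the number of chunks reaped in this pass. */
static size_t reap_pass(struct chunk_cache *c)
{
	size_t reaped = 0;

	assert(c != NULL);

	while (c->chunks_used > c->chunks_lowat && reaped < c->reap_per_run) {
		c->chunks_used--;	/* stands in for freeing the LRU chunk */
		reaped++;
	}
	return reaped;
}
```

Calling `reap_pass` on each timer tick gives the "interval and rate" behaviour: the cache drains gradually and stops once it reaches the low water mark.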
Change-Id: I5683e9f8be41dc7d92a43a0c6e35ac98a3c0c4f2
Signed-off-by: Frank S. Filz <ffilzlnx(a)mindspring.com>
---
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_ext.h
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.h
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_read_conf.c
M src/config_samples/config.txt
M src/doc/man/ganesha-cache-config.rst
6 files changed, 116 insertions(+), 11 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/83/556283/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/556283?usp=email
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I5683e9f8be41dc7d92a43a0c6e35ac98a3c0c4f2
Gerrit-Change-Number: 556283
Gerrit-PatchSet: 1
Gerrit-Owner: Frank Filz <ffilzlnx(a)mindspring.com>
Gerrit-MessageType: newchange
1 year, 5 months
[S] Change in ...nfs-ganesha[next]: MDCACHE: Fix run interval for chunk lru
by Frank Filz (GerritHub)
Frank Filz has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/556253?usp=email )
Change subject: MDCACHE: Fix run interval for chunk lru
......................................................................
MDCACHE: Fix run interval for chunk lru
If chunks_used > chunks_hiwat, wait_ratio could be negative which
would break if time_t is unsigned. Also improve comment for when
new_thread_wait winds up at 0.
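
A minimal sketch of the kind of clamp described here. The commit message does not give the formula, so the linear ratio below is an assumption, as are the function name `chunk_lru_wait` and the `run_interval` parameter; only wait_ratio, chunks_used, chunks_hiwat and new_thread_wait are names from the message.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Compute how long the chunk LRU thread should sleep before its next run.
 * Assumed formula: the wait shrinks linearly as usage approaches the high
 * water mark. */
static time_t chunk_lru_wait(uint64_t chunks_used, uint64_t chunks_hiwat,
			     time_t run_interval)
{
	double wait_ratio;

	assert(chunks_hiwat > 0);

	wait_ratio = 1.0 - (double)chunks_used / (double)chunks_hiwat;

	/* Above the high water mark the ratio would go negative; clamp it
	 * so the result stays valid even if time_t is unsigned. */
	if (wait_ratio < 0.0)
		wait_ratio = 0.0;

	/* The result (new_thread_wait in the commit message) may be 0,
	 * which here means "run again immediately". */
	return (time_t)((double)run_interval * wait_ratio);
}
```

Clamping before the conversion to `time_t` is the important part: converting a negative double to an unsigned type is not well defined, and on a signed `time_t` it would still produce a nonsensical sleep.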
Change-Id: I4a4a8a74daeda95b53c6202f095fd79b72be82cf
Signed-off-by: Frank S. Filz <ffilzlnx(a)mindspring.com>
---
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c
1 file changed, 12 insertions(+), 4 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/53/556253/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/556253?usp=email
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I4a4a8a74daeda95b53c6202f095fd79b72be82cf
Gerrit-Change-Number: 556253
Gerrit-PatchSet: 1
Gerrit-Owner: Frank Filz <ffilzlnx(a)mindspring.com>
Gerrit-MessageType: newchange
1 year, 6 months
[M] Change in ...nfs-ganesha[next]: MDCACHE: make directory chunks references to mdcache entries long term
by Frank Filz (GerritHub)
Frank Filz has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/556252?usp=email )
Change subject: MDCACHE: make directory chunks references to mdcache entries long term
......................................................................
MDCACHE: make directory chunks references to mdcache entries long term
Also, rename mdcache_dir_entry_t "entry" to "mde_entry" for ease of
auditing the use of the mde_entry.
Change-Id: Iacc688bb8421e73619e5c7d767f6bde1b4fde884
Signed-off-by: Frank S. Filz <ffilzlnx(a)mindspring.com>
---
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_avl.c
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_export.c
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_int.h
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.h
7 files changed, 51 insertions(+), 30 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/52/556252/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/556252?usp=email
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: Iacc688bb8421e73619e5c7d767f6bde1b4fde884
Gerrit-Change-Number: 556252
Gerrit-PatchSet: 1
Gerrit-Owner: Frank Filz <ffilzlnx(a)mindspring.com>
Gerrit-MessageType: newchange
1 year, 6 months
[S] Change in ...nfs-ganesha[next]: IBM-only: Validate client_record before using it.
by Name of user not set (GerritHub)
yogendra858(a)yahoo.com has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/556189?usp=email )
Change subject: IBM-only: Validate client_record before using it.
......................................................................
IBM-only: Validate client_record before using it.
Signed-off-by: Malahal Naineni <malahal(a)us.ibm.com>
(cherry picked from commit ccad258b2e501934f5bd977239bc38fd0942c1df)
Change-Id: Ia3d668b33f6f9a7f2d09c805007c8f3a376b323b
Signed-off-by: Yogendra Charya <Yogendra.Charya(a)ibm.com>
---
M src/Protocols/NFS/nfs4_op_setclientid_confirm.c
1 file changed, 18 insertions(+), 3 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/89/556189/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/556189?usp=email
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: Ia3d668b33f6f9a7f2d09c805007c8f3a376b323b
Gerrit-Change-Number: 556189
Gerrit-PatchSet: 1
Gerrit-Owner: yogendra858(a)yahoo.com
Gerrit-CC: Malahal <malahal(a)gmail.com>
Gerrit-MessageType: newchange
1 year, 6 months
[S] Change in ...nfs-ganesha[next]: Handle granted upcalls when they are not in the blocked list
by Name of user not set (GerritHub)
Attention is currently required from: Malahal.
Hello Malahal,
I'd like you to do a code review.
Please visit
https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/556129?usp=email
to review the following change.
Change subject: Handle granted upcalls when they are not in the blocked list
......................................................................
Handle granted upcalls when they are not in the blocked list
When we CANCEL a blocked lock request, we also send an UNLOCK request.
This way the upcall processing code can ignore locks that are not in the
blocked lock list.
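
The interaction can be sketched as below, assuming a simple singly linked blocked list keyed by an integer id. All names (`blocked_lock`, `cancel_blocked`, `handle_granted_upcall`, the `unlocks_sent` counter) are hypothetical and are not taken from state_lock.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical blocked-lock list entry. */
struct blocked_lock {
	int id;
	struct blocked_lock *next;
};

/* Unlink the entry with the given id; returns true if it was present. */
static bool blocked_remove(struct blocked_lock **head, int id)
{
	assert(head != NULL);

	for (struct blocked_lock **pp = head; *pp != NULL; pp = &(*pp)->next) {
		if ((*pp)->id == id) {
			*pp = (*pp)->next;
			return true;
		}
	}
	return false;
}

/* CANCEL: drop the blocked request and also send an UNLOCK, so that a
 * grant racing with the cancel does not leave a lock held in the backend. */
static bool cancel_blocked(struct blocked_lock **head, int id,
			   int *unlocks_sent)
{
	bool was_blocked = blocked_remove(head, id);

	assert(unlocks_sent != NULL);
	(*unlocks_sent)++;	/* stands in for the UNLOCK sent to the FSAL */
	return was_blocked;
}

/* Granted upcall: returns true if the grant matched a blocked request,
 * false if it was ignored because the lock is no longer in the list. */
static bool handle_granted_upcall(struct blocked_lock **head, int id)
{
	return blocked_remove(head, id);
}
```

Because the CANCEL path has already released the lock, the upcall handler can safely treat "not in the blocked list" as nothing to do.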
Change-Id: Id02ba5b6ebcf0dcc40599a8a187846b69a039080
Signed-off-by: Malahal Naineni <malahal(a)us.ibm.com>
---
M src/SAL/state_lock.c
1 file changed, 38 insertions(+), 4 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/29/556129/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/556129?usp=email
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: Id02ba5b6ebcf0dcc40599a8a187846b69a039080
Gerrit-Change-Number: 556129
Gerrit-PatchSet: 1
Gerrit-Owner: skmprabhu252(a)gmail.com
Gerrit-Reviewer: Malahal <malahal(a)gmail.com>
Gerrit-Attention: Malahal <malahal(a)gmail.com>
Gerrit-MessageType: newchange
1 year, 6 months
Re: MDCACHE LRU reaper thread issues
by Frank Filz
What Ganesha version are you using?
Could you open a github issue for this?
Thanks
Frank
From: Deepak Arumugam Sankara Subramanian [mailto:deepakarumugam.s@nutanix.com]
Sent: Saturday, June 24, 2023 1:25 PM
To: devel(a)lists.nfs-ganesha.org; Frank Filz <ffilzlnx(a)mindspring.com>
Subject: [NFS-Ganesha-Devel] MDCACHE LRU reaper thread issues
ISSUE
We recently hit some issues where the number of ganesha mdcache entries grew to 6-9 Million when the mdcache hiwater mark was at 450,000.
ROOT CAUSE
Upon further debugging we found that readdir was not playing well with garbage collection. The major issue comes from the fact that we have a temporary pointer from the dirent to the inode mdcache entry.
Today when readdir fills up a chunk it puts a temporary reference from dirent to the inode cache entry. This happens inside mdc_readdir_chunk_object
new_dir_entry->entry = new_entry;
Then we clear it out inside mdcache_readdir_chunked,
	if (has_write && dirent->entry) {
		/* If we get here, we have the write lock, have an
		 * entry, and took a ref on it above. The dirent also
		 * has a ref on the entry. Drop that ref now. This can
		 * only be done under the write lock. If we don't have
		 * the write lock, then this was not the readdir that
		 * took the ref, and another readdir will drop the ref,
		 * or it will be dropped when the dirent is cleaned up.
		 */
		mdcache_lru_unref(dirent->entry, LRU_FLAG_NONE);
		dirent->entry = NULL;
	}
But this means each readdir can hold a refcount on 128 entries in the inode cache at a time.
This leads to a scenario where the following happens
1. We have multiple readdirs going in parallel, each reading different directories with 1 million entries. Each readdir ends up holding a refcount of 2 on 128 entries, some of which end up being the entries at the LRU edge. Because of this, the reaper thread is not able to find an entry to reap.
2. There is also another bug where we are not hitting the clearing-out code path in mdcache_readdir_chunked. We don't have an RCA for this bug, but we often see systems where all the LRU edges of the inode cache queues are filled with entries with a refcount of 2, even when there is no load on the system. All of those entries are being referenced by their dirents, which is the cause of the refcount.
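
To illustrate why pinned entries at the LRU edge starve the reaper, here is a small model. It assumes the reaper inspects only a bounded number of entries from the LRU end and can reclaim only an entry whose sole reference is the cache's own; the bounded scan, the array layout, and all names are assumptions made for the sketch, not Ganesha's actual data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed: a refcount of 1 means only the cache itself holds the entry. */
#define SKETCH_SENTINEL_REF 1

struct sketch_entry {
	int refcnt;
};

/* Scan at most max_scan entries starting at the LRU end (index 0).
 * Returns the index of the first reclaimable entry, or -1 if every
 * inspected entry is pinned by an extra reference. */
static int find_reapable(const struct sketch_entry *lru, size_t n,
			 size_t max_scan)
{
	assert(lru != NULL || n == 0);

	for (size_t i = 0; i < n && i < max_scan; i++) {
		if (lru[i].refcnt == SKETCH_SENTINEL_REF)
			return (int)i;
	}
	return -1;
}
```

If the first few entries of every queue carry a refcount of 2 because a dirent still points at them, a scan like this returns -1 on every pass and the cache keeps growing past its high water mark.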
POSSIBLE RESOLUTIONS/QUESTIONS
1. Can we remove this temporary pointer from dirent to the normal mdcache entry? This optimization is only helpful in the cache miss code path. I think this optimization accomplishes two things,
(a) We don't have to do a lookup from dirent to inode entry in the readdir cache miss code path. But since we are already in the readdir cache miss code path, I don't think the performance penalty will even be visible.
(b) It prevents the inode cache entries from being flushed out when a readdir is in progress. I wonder if we should just use the LRU way to accomplish this - which brings me to the second question
2. Can we always move an entry to the MRU of L2 even during scans? Today the behavior is a bit inconsistent: we insert new entries created during readdir at the MRU of L2, but we don't adjust old entries to the MRU. Not sure if this is intended behavior or a bug.
Inside mdc_readdir_chunk_object we call mdcache_new_entry with LRU_FLAG_NONE
status = mdcache_new_entry(export, sub_handle, attrs_in, false, NULL,
false, &new_entry, NULL, LRU_FLAG_NONE);
If it's an inode cache hit, it follows this path:
	status = mdcache_find_keyed_reason(&key, entry, flags);
	...
	if (likely(*entry)) {
		fsal_status_t status;
		/* Initial Ref on entry */
		mdcache_lru_ref(*entry, flags);
Since flags is not LRU_REQ_INITIAL, this entry doesn't get adjusted to the MRU of L2 or the LRU of L1.
But if it's an inode cache miss, it follows this path:
	...
	mdcache_lru_insert(nentry, flags);
If flags is LRU_FLAG_NONE, mdcache_lru_insert inserts the entry into the MRU of L2.
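
The asymmetry between the hit and miss paths can be modelled with a toy queue. This is only a sketch of the behaviour described above: the array-backed queue, `q_insert_mru`, `q_hit`, and the `adjust` flag are hypothetical and do not mirror mdcache_lru.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define Q_CAP 8

/* Hypothetical L2 queue: slot 0 is the LRU end, slot len-1 the MRU end. */
struct sketch_queue {
	int ids[Q_CAP];
	size_t len;
};

/* Cache miss path: a new entry always lands at the MRU end. */
static void q_insert_mru(struct sketch_queue *q, int id)
{
	assert(q != NULL && q->len < Q_CAP);
	q->ids[q->len++] = id;
}

/* Cache hit path: the entry is found, but it moves to the MRU end only
 * when the caller asks for an adjustment. Returns true on a hit. */
static bool q_hit(struct sketch_queue *q, int id, bool adjust)
{
	assert(q != NULL);

	for (size_t i = 0; i < q->len; i++) {
		if (q->ids[i] != id)
			continue;
		if (adjust) {
			/* Close the gap, then re-append at the MRU end. */
			memmove(&q->ids[i], &q->ids[i + 1],
				(q->len - i - 1) * sizeof(q->ids[0]));
			q->ids[q->len - 1] = id;
		}
		return true;
	}
	return false;
}
```

With `adjust` false, an old entry touched by readdir stays near the LRU end while brand-new entries from the same scan sit at the MRU end, which is the inconsistency the question is about.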
We would really appreciate your comments on the questions/resolutions,
Thanks,
Deepak
1 year, 6 months