Change in ...nfs-ganesha[next]: MDCACHE - Reload chunk need reset chunk->next_ck
by sepia-liu (GerritHub)
sepia-liu has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490795 )
Change subject: MDCACHE - Reload chunk need reset chunk->next_ck
......................................................................
MDCACHE - Reload chunk need reset chunk->next_ck
The chunk is reloaded because an entry lookup failed.
Resetting chunk->next_ck on reload avoids looking up an
existing dirent and avoids a chunk collision.
Change-Id: I95310eeece701529a380e3eadcb6d22ecb937460
Signed-off-by: sepia-liu <liuwei_coder(a)163.com>
---
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c
1 file changed, 12 insertions(+), 1 deletion(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/95/490795/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490795
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I95310eeece701529a380e3eadcb6d22ecb937460
Gerrit-Change-Number: 490795
Gerrit-PatchSet: 1
Gerrit-Owner: sepia-liu <liuwei_coder(a)163.com>
Gerrit-MessageType: newchange
4 years, 8 months
Change in ...nfs-ganesha[next]: MDCACHE - Fix readdir reply to duplicate cookie to client
by sepia-liu (GerritHub)
sepia-liu has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490794 )
Change subject: MDCACHE - Fix readdir reply to duplicate cookie to client
......................................................................
MDCACHE - Fix readdir reply to duplicate cookie to client
When the lookup of the last dirent's entry by ckey fails,
the chunk containing the dirent is freed, next_ck is set to
chunk->reload_ck, and the chunk is repopulated, skipping the
dirents already used and the look_ck dirent. In this case
next_ck does not change and look_ck=0, which causes a duplicate
cookie to be returned to the client.
Change-Id: Id71f53afbf2506728551dfab6fbffa581c1284a0
Signed-off-by: sepia-liu <liuwei_coder(a)163.com>
---
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c
1 file changed, 5 insertions(+), 5 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/94/490794/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490794
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: Id71f53afbf2506728551dfab6fbffa581c1284a0
Gerrit-Change-Number: 490794
Gerrit-PatchSet: 1
Gerrit-Owner: sepia-liu <liuwei_coder(a)163.com>
Gerrit-MessageType: newchange
4 years, 8 months
Re: NFSv3 mounts hang from a client forever and new mounts to the same share hangs
by QR
refer to https://lore.kernel.org/linux-nfs/20181212135157.4489-1-dwysocha@redhat.c...
--------------------------------
----- Original Message -----
From: des(a)vmware.com
To: devel(a)lists.nfs-ganesha.org
Subject: [NFS-Ganesha-Devel] NFSv3 mounts hang from a client forever and new mounts to the same share hangs
Date: 2020-04-22 14:36
4 years, 8 months
Change in ...nfs-ganesha[next]: rados_urls: when built with rados_urls, don't error if lib not installed
by Kaleb KEITHLEY (GerritHub)
Kaleb KEITHLEY has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490737 )
Change subject: rados_urls: when built with rados_urls, don't error if lib not installed
......................................................................
rados_urls: when built with rados_urls, don't error if lib not installed
The difference between rados-recov and rados-urls is that rados-recov
is explicitly configured in /etc/ganesha/ganesha.conf. If configured,
only then does ganesha try to load the rados-recov shlib. Then, if the
lib is not installed, it issues an error and exits.
But if built with rados-urls, an attempt is always made to load the
shlib, which might not be installed, if only because rados-urls aren't
actually being used. Prior to this both rados-recov and rados-urls
had similar logic and both exited with an error if the shlib couldn't
be loaded.
Now, with this change, just issue a warning that the rados-urls shlib
couldn't be loaded and continue.
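The warn-and-continue behaviour described above can be sketched roughly as follows. This is only an illustration built on the standard dlopen() call, not the code in conf_url.c; the helper name load_optional_lib and the message text are assumptions.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper, not the code in conf_url.c: try to load an optional
 * shared library. A missing library is reported as a warning and the caller
 * keeps running with the feature disabled (NULL handle). */
void *load_optional_lib(const char *path)
{
	void *handle = dlopen(path, RTLD_NOW);

	if (handle == NULL)
		fprintf(stderr,
			"WARN: could not load %s (%s); continuing without it\n",
			path, dlerror());

	return handle;
}
```

A caller would treat a NULL return as "feature unavailable" rather than as a fatal error, which is the difference from the rados-recov path described in the message.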
Change-Id: I1f22a3237c0573985cfbb840b59317227a69cabb
Signed-off-by: Kaleb S. KEITHLEY <kkeithle(a)redhat.com>
---
M src/config_parsing/conf_url.c
1 file changed, 2 insertions(+), 2 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/37/490737/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490737
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I1f22a3237c0573985cfbb840b59317227a69cabb
Gerrit-Change-Number: 490737
Gerrit-PatchSet: 1
Gerrit-Owner: Kaleb KEITHLEY <kaleb(a)redhat.com>
Gerrit-MessageType: newchange
4 years, 8 months
NFSv3 mounts hang from a client forever and new mounts to the same share hangs
by des@vmware.com
We are using NFS Ganesha and mounting NFSv3 with auth_sys on RHEL 7.6 Linux clients.
After running some system tests, some Linux clients enter a state where the existing mount point for one share (let's say testShare1) becomes inaccessible, and trying to mount the same share again hangs forever. The client never recovers from this state. Strangely, it can still mount other shares successfully, and testShare1 remains accessible from other clients.
Packet captures on the client show no packets on the wire, and the Ganesha logs contain no hints either.
This looks like a client issue; the clients in this setup run RHEL 7.6. In the client's /var/log/messages, I see these errors continuously:
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: xs_tcp_send_request(524460) = -32
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 56546 marshaling UNIX cred ffff889b0829c900
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 57696 call_status (status -32)
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 56546 using AUTH_UNIX cred ffff889b0829c900 to wrap rpc data
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 56546 xprt_transmit(524460)
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: xs_tcp_send_request(524460) = -32
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 57696 call_bind (status 0)
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 56546 call_status (status -32)
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 57696 call_connect xprt ffff889b3c5a2800 is connected
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 57696 call_transmit (status 0)
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 57696 xprt_prepare_transmit
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 57696 rpc_xdr_encode (status 0)
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 57696 marshaling UNIX cred ffff889b0829c900
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 57696 using AUTH_UNIX cred ffff889b0829c900 to wrap rpc data
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 56546 call_bind (status 0)
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 56546 call_connect xprt ffff889b3c5a6000 is connected
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 56546 call_transmit (status 0)
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 56546 xprt_prepare_transmit
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 56546 rpc_xdr_encode (status 0)
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 56546 marshaling UNIX cred ffff889b0829c900
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 57696 xprt_transmit(524460)
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: xs_tcp_send_request(524460) = -32
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 56546 using AUTH_UNIX cred ffff889b0829c900 to wrap rpc data
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 57696 call_status (status -32)
#define EPIPE 32 /* Broken pipe */
xs_tcp_send_request - write an RPC request to a TCP socket
https://github.com/torvalds/linux/blob/master/net/sunrpc/xprtsock.c#L1027
The Linux source linked above shows that the client is unable to send the RPC request on the socket: the socket send fails with EPIPE.
I believe the NFS packets cannot be sent out on this RPC transport. Since all access to a particular share is associated with the same transport, every request keeps failing with EPIPE.
In https://bugzilla.redhat.com/show_bug.cgi?id=692315#c15 Jeff Layton discussed the same issue, but that seems to have been fixed back in RHEL 6.2.
Jeff, can you please help us understand whether the fix for the above bug is in RHEL 7.6 and, if so, why we still see this issue?
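For reference, the -32 in the log lines is -EPIPE. Below is a minimal user-space sketch of how a write reports EPIPE once the other end is gone. It uses a pipe rather than a TCP socket, and the helper name write_after_reader_closed is illustrative only; it is not taken from the kernel or from Ganesha.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical helper: write to a pipe whose read end is already closed.
 * Returns the errno of the failed write, 0 if the write unexpectedly
 * succeeded, or -1 if the pipe could not be created. */
int write_after_reader_closed(void)
{
	int fds[2];
	int err = 0;

	/* Without this the process would be killed by SIGPIPE instead of
	 * seeing EPIPE as an error code. */
	signal(SIGPIPE, SIG_IGN);

	if (pipe(fds) != 0)
		return -1;

	close(fds[0]);		/* the "peer" goes away */

	if (write(fds[1], "x", 1) < 0)
		err = errno;	/* EPIPE, defined as 32 on Linux */

	close(fds[1]);
	return err;
}
```

The kernel RPC client sees the same error code from its TCP send path, which is consistent with a transport whose connection is no longer usable for writing.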
4 years, 8 months
Change in ...nfs-ganesha[next]: Split state_lock into st_lock and jct_lock
by Frank Filz (GerritHub)
Frank Filz has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490519 )
Change subject: Split state_lock into st_lock and jct_lock
......................................................................
Split state_lock into st_lock and jct_lock
The state_hdl state_lock is used for two different things.
When used for state (locks, shares, delegations) protection, it
is only used exclusively (i.e. write) so could be just a mutex.
When used for junction protection, it is used as a rwlock.
Also, it's handy for the locks to have a different order with
respect to the export lock. By breaking up and reversing the
order for jct_lock we eliminate a place where we drop the export
lock, take and drop the jct_lock and then retake the export lock.
And separating them makes the code a bit easier to read.
Also caught one place where the state_lock was not taken with
STATELOCK_lock.
Change-Id: I1ceb25d34c1ebae183fc142b2298a3468e6a1b02
Signed-off-by: Frank S. Filz <ffilzlnx(a)mindspring.com>
---
M src/FSAL/FSAL_CEPH/handle.c
M src/FSAL/FSAL_GPFS/file.c
M src/FSAL/FSAL_MEM/mem_handle.c
M src/FSAL/FSAL_RGW/handle.c
M src/FSAL/FSAL_VFS/file.c
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.h
M src/FSAL/fsal_helper.c
M src/FSAL_UP/fsal_up_top.c
M src/Protocols/NFS/nfs4_op_close.c
M src/Protocols/NFS/nfs4_op_getattr.c
M src/Protocols/NFS/nfs4_op_layoutget.c
M src/Protocols/NFS/nfs4_op_layoutreturn.c
M src/Protocols/NFS/nfs4_op_lock.c
M src/Protocols/NFS/nfs4_op_lookup.c
M src/Protocols/NFS/nfs4_op_open.c
M src/Protocols/NFS/nfs4_op_readdir.c
M src/Protocols/NFS/nfs4_op_secinfo.c
M src/Protocols/NFS/nfs4_pseudo.c
M src/SAL/nfs4_state.c
M src/SAL/state_deleg.c
M src/SAL/state_layout.c
M src/SAL/state_lock.c
M src/SAL/state_misc.c
M src/include/fsal_api.h
M src/include/sal_data.h
M src/include/sal_functions.h
M src/support/exports.c
28 files changed, 123 insertions(+), 125 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/19/490519/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490519
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I1ceb25d34c1ebae183fc142b2298a3468e6a1b02
Gerrit-Change-Number: 490519
Gerrit-PatchSet: 1
Gerrit-Owner: Frank Filz <ffilzlnx(a)mindspring.com>
Gerrit-MessageType: newchange
4 years, 8 months
Change in ...nfs-ganesha[next]: selinux: additional policy for ganesha_var_log_t
by Kaleb KEITHLEY (GerritHub)
Kaleb KEITHLEY has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490517 )
Change subject: selinux: additional policy for ganesha_var_log_t
......................................................................
selinux: additional policy for ganesha_var_log_t
Seems to be needed as a side effect of upgrading to the latest
selinux-policy-targeted-2.14.4-50 on fedora-31. Unclear.
Change-Id: Ib79532e691b1cdf75373e9ad4e95340ce33c861d
Signed-off-by: Kaleb S. KEITHLEY <kkeithle(a)redhat.com>
---
M src/selinux/ganesha.te
1 file changed, 1 insertion(+), 0 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/17/490517/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490517
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: Ib79532e691b1cdf75373e9ad4e95340ce33c861d
Gerrit-Change-Number: 490517
Gerrit-PatchSet: 1
Gerrit-Owner: Kaleb KEITHLEY <kaleb(a)redhat.com>
Gerrit-MessageType: newchange
4 years, 8 months
Bugfix for a tagged version
by Yoni K
Hi, I want to create a small bugfix for V2.8.3, but I'm not sure of the correct way to do it since there is only a "next" branch on Gerrit. Could I get a short explanation of how to do it? (I also read the DevPolicy document but might have missed something.)
4 years, 8 months
Hitting a crash in mdcache_lru_cleanup_push.
by Pradeep Thomas
Hello Daniel/Frank,
While debugging a crash from 2.7.1 Ganesha, I see a potential race between the two paths below:
Thread 1 (waiting for the qlock to insert to LRU)
nfs4_mds_putfh -> mdcache_create_handle -> mdcache_locate_host -> mdcache_new_entry -> mdcache_lru_insert -> lru_insert_entry
Thread 2 (unlink the same object) - since the object is already in mdcache at this point, I believe other threads will get it.
fsal_remove -> mdcache_unlink -> _mdc_unreachable -> _mdcache_kill_entry -> mdcache_lru_cleanup_push
The second thread will find the lru in a state like this:
$5 = {q = {next = 0x0, prev = 0x0}, qid = LRU_ENTRY_NONE, refcnt = 2, flags = 0, lane = 12, cf = 0}
So the code below will end up crashing:
    if (!(lru->qid == LRU_ENTRY_CLEANUP)) {
        struct lru_q *q;
        /* out with the old queue */
        q = lru_queue_of(entry); /* <-- q will be NULL because qid == LRU_ENTRY_NONE */
Should Thread 2 simply skip the entry if q is NULL and let Thread 1's operation free the entry later?
Also, please let me know if there are any recent fixes in this area.
Thanks,
Pradeep
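The guard asked about above could look roughly like the sketch below. It uses simplified stand-in types rather than the real mdcache structures, ignores the qlock entirely, and is not a claim about how Ganesha does or should handle this; all names ending in _sketch are hypothetical.

```c
#include <stddef.h>

/* Stand-ins for the LRU queue ids; not the real mdcache enum. */
enum lru_qid_sketch { Q_NONE, Q_L1, Q_CLEANUP };

struct lru_q_sketch { int size; };
struct lru_sketch { enum lru_qid_sketch qid; };

struct lru_q_sketch l1_queue_sketch;
struct lru_q_sketch cleanup_queue_sketch;

/* Like the situation described above: an entry that has not been inserted
 * into any queue yet (Q_NONE) has no queue, so NULL is returned. */
struct lru_q_sketch *queue_of_sketch(struct lru_sketch *lru)
{
	switch (lru->qid) {
	case Q_L1:
		return &l1_queue_sketch;
	case Q_CLEANUP:
		return &cleanup_queue_sketch;
	default:
		return NULL;
	}
}

/* Returns 1 if the entry was moved to the cleanup queue, 0 if left alone. */
int cleanup_push_sketch(struct lru_sketch *lru)
{
	struct lru_q_sketch *q;

	if (lru->qid == Q_CLEANUP)
		return 0;

	q = queue_of_sketch(lru);
	if (q == NULL)
		return 0;	/* not queued yet: leave it to the inserter */

	q->size--;		/* out with the old queue */
	lru->qid = Q_CLEANUP;
	cleanup_queue_sketch.size++;
	return 1;
}
```

Whether skipping the entry is actually safe depends on the inserting thread noticing that the entry was killed in the meantime, which this sketch does not model.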
4 years, 8 months
Change in ...nfs-ganesha[next]: In mdcache_new_entry do mdcache_lru_insert before cih_set_latched
by Frank Filz (GerritHub)
Frank Filz has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490335 )
Change subject: In mdcache_new_entry do mdcache_lru_insert before cih_set_latched
......................................................................
In mdcache_new_entry do mdcache_lru_insert before cih_set_latched
We need to do mdcache_lru_insert while holding the latch. No reason
it can't be done before cih_set_latched.
Change-Id: I3432b33a54c899ceaac5c208e85a020c851a177c
Signed-off-by: Frank S. Filz <ffilzlnx(a)mindspring.com>
---
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c
1 file changed, 3 insertions(+), 2 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/35/490335/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490335
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I3432b33a54c899ceaac5c208e85a020c851a177c
Gerrit-Change-Number: 490335
Gerrit-PatchSet: 1
Gerrit-Owner: Frank Filz <ffilzlnx(a)mindspring.com>
Gerrit-MessageType: newchange
4 years, 8 months