Announce Push of V4-dev.14
by Frank Filz
Branch next
Tag: V4-dev.14
Merge Highlights
* In mdcache_new_entry do mdcache_lru_insert before cih_set_latched
* Handle incorrect RPCSEC_GSS_DATA messages
* selinux: additional policy for ganesha_var_log_t, dbus, and more
* rados_urls: when built with rados_urls, don't error if lib not installed
* mdcache: cih_latch_entry always returns true
* FSAL: eliminate some bogus comments
Signed-off-by: Frank S. Filz <ffilzlnx(a)mindspring.com>
Contents:
b3b0042 Frank S. Filz V4-dev.14
493ac5c Jeff Layton FSAL: eliminate some bogus comments
b655fa6 Jeff Layton mdcache: cih_latch_entry always returns true
5be95da Kaleb S. KEITHLEY rados_urls: when built with rados_urls, don't error if lib not installed
199628a Kaleb S. KEITHLEY selinux: additional policy for ganesha_var_log_t, dbus, and more
13600b7 Trishali Nayar Handle incorrect RPCSEC_GSS_DATA messages
6a2fb1f Frank S. Filz In mdcache_new_entry do mdcache_lru_insert before cih_set_latched
Change in ...nfs-ganesha[next]: FSAL: eliminate some bogus comments
by Jeff Layton (GerritHub)
Jeff Layton has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490939 )
Change subject: FSAL: eliminate some bogus comments
......................................................................
FSAL: eliminate some bogus comments
The release function has a void return.
Change-Id: I33a1ccddf766662e83cccfdd9f38d4f83ee7c6b2
Signed-off-by: Jeff Layton <jlayton(a)redhat.com>
---
M src/FSAL/FSAL_CEPH/export.c
M src/FSAL/FSAL_RGW/export.c
2 files changed, 0 insertions(+), 6 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/39/490939/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490939
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I33a1ccddf766662e83cccfdd9f38d4f83ee7c6b2
Gerrit-Change-Number: 490939
Gerrit-PatchSet: 1
Gerrit-Owner: Jeff Layton <jlayton(a)redhat.com>
Gerrit-MessageType: newchange
mdcache unexport race with Rmdir/Unlink
by Ashish Sangwan
It seems that mdcache unexport can race with a remove RPC.
The remove operation drops the sentinel ref, so when the last call
to mdcache_put() happens during execution of the remove RPC, the
corresponding mdcache_entry is freed. If an export remove comes in
while the remove op is still executing, then in mdcache_unexport(),
when we call mdcache_lru_ref(entry, LRU_REQ_INITIAL), we can end up
taking a ref on an entry whose refcnt has already reached 0 and
which is currently going through mdcache_lru_clean() in the context
of the remove. The entry is still present in the mdcache_fsal_export's
entry_list because mdcache_lru_clean() has not yet called
mdc_clean_entry(). When we later do mdcache_put() for this entry in
mdcache_unexport(), the entry has already been freed and we end up
executing mdcache_lru_clean() on a freed entry.
What we need to do inside mdcache_unexport() is take QLOCK() and
check the refcount: if it is 0, continue the for loop to the next
entry; if it is not 0, increment the refcount, drop the QLOCK, and
continue with the normal execution path, as sketched below.
Does it make sense?
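A rough sketch of that check (QLOCK/QUNLOCK, LRU[], lru.refcnt and the
atomic helpers follow the existing mdcache_lru code, but treat this as
pseudocode rather than a tested patch):

static bool unexport_try_ref(mdcache_entry_t *entry)
{
	struct lru_q_lane *qlane = &LRU[entry->lru.lane];
	bool got_ref;

	QLOCK(qlane);
	if (atomic_fetch_int32_t(&entry->lru.refcnt) == 0) {
		/* The final unref from the remove path already ran; the
		 * entry is being torn down in mdcache_lru_clean(), so skip
		 * it and move on to the next entry in entry_list. */
		got_ref = false;
	} else {
		/* Entry is still live: take our reference while the lane
		 * lock pins its state, then continue with the normal
		 * unexport processing for this entry. */
		atomic_inc_int32_t(&entry->lru.refcnt);
		got_ref = true;
	}
	QUNLOCK(qlane);

	return got_ref;
}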
Re: [NFS-Ganesha-Devel] Re: ganesha hang more than five hours
by QR
Hi Dang, I generated a core dump for this.
It seems something is wrong with "entry->content_lock". Is there a known issue for this? Thanks in advance.
Ganesha server info
 ganesha version: V2.7.6
 FSAL : In house
nfs client info
 nfs version : nfs v3
 client info : CentOS 7.4
=======================================================================
(gdb) thread 204
[Switching to thread 204 (Thread 0x7fa7932f2700 (LWP 348))]
#0  0x00007faa5ba2cf4d in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00007faa5ba2cf4d in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007faa5ba28d02 in _L_lock_791 () from /lib64/libpthread.so.0
#2  0x00007faa5ba28c08 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000528657 in _mdcache_lru_unref_chunk (chunk=0x7fa7e03e31d0, func=0x59abd0 <__func__.20247> "mdcache_clean_dirent_chunks", line=579) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:2066
#4  0x000000000053894c in mdcache_clean_dirent_chunks (entry=0x7fa740035400) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:578
#5  0x0000000000538a30 in mdcache_dirent_invalidate_all (entry=0x7fa740035400) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:603
#6  0x00000000005376fc in mdc_clean_entry (entry=0x7fa740035400) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:302
#7  0x0000000000523559 in mdcache_lru_clean (entry=0x7fa740035400) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:592
#8  0x00000000005278b1 in mdcache_lru_get (sub_handle=0x7fa7383cdf40) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:1821
#9  0x0000000000536e91 in _mdcache_alloc_handle (export=0x1b4eed0, sub_handle=0x7fa7383cdf40, fs=0x0, reason=MDC_REASON_DEFAULT, func=0x59ac10 <__func__.20274> "mdcache_new_entry", line=691) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:174
#10 0x0000000000538c50 in mdcache_new_entry (export=0x1b4eed0, sub_handle=0x7fa7383cdf40, attrs_in=0x7fa7932f00a0, attrs_out=0x7fa7932f0760, new_directory=false, entry=0x7fa7932f0018, state=0x0, reason=MDC_REASON_DEFAULT) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:690
#11 0x000000000052d156 in mdcache_alloc_and_check_handle (export=0x1b4eed0, sub_handle=0x7fa7383cdf40, new_obj=0x7fa7932f01b0, new_directory=false, attrs_in=0x7fa7932f00a0, attrs_out=0x7fa7932f0760, tag=0x5999a4 "lookup ", parent=0x7fa878288cb0, name=0x7fa738229270 "cer_7_0.5", invalidate=0x7fa7932f009f, state=0x0) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:100
#12 0x000000000053aea1 in mdc_lookup_uncached (mdc_parent=0x7fa878288cb0, name=0x7fa738229270 "cer_7_0.5", new_entry=0x7fa7932f02c8, attrs_out=0x7fa7932f0760) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:1391
#13 0x000000000053ab27 in mdc_lookup (mdc_parent=0x7fa878288cb0, name=0x7fa738229270 "cer_7_0.5", uncached=true, new_entry=0x7fa7932f02c8, attrs_out=0x7fa7932f0760) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:1325
#14 0x000000000052d5bb in mdcache_lookup (parent=0x7fa878288ce8, name=0x7fa738229270 "cer_7_0.5", handle=0x7fa7932f0878, attrs_out=0x7fa7932f0760) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:181
#15 0x0000000000431a60 in fsal_lookup (parent=0x7fa878288ce8, name=0x7fa738229270 "cer_7_0.5", obj=0x7fa7932f0878, attrs_out=0x7fa7932f0760) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/fsal_helper.c:683
#16 0x000000000048e83e in nfs3_lookup (arg=0x7fa7381bdc88, req=0x7fa7381bd580, res=0x7fa73839dab0) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/Protocols/NFS/nfs3_lookup.c:104
#17 0x000000000045703b in nfs_rpc_process_request (reqdata=0x7fa7381bd580) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1329
#18 0x000000000045781f in nfs_rpc_process_request_slowio (reqdata=0x7fa7381bd580, slowio_check_cb=0x553b4a <nfs3_timeout_proc>) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1542
#19 0x0000000000457955 in nfs_rpc_valid_NFS (req=0x7fa7381bd580) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1586
#20 0x00007faa5c901435 in svc_vc_decode (req=0x7fa7381bd580) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/svc_vc.c:829
#21 0x000000000044a497 in nfs_rpc_decode_request (xprt=0x7fa8840027d0, xdrs=0x7fa7384befa0) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/MainNFSD/nfs_rpc_dispatcher_thread.c:1345
#22 0x00007faa5c901346 in svc_vc_recv (xprt=0x7fa8840027d0) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/svc_vc.c:802
#23 0x00007faa5c8fda92 in svc_rqst_xprt_task (wpe=0x7fa8840029e8) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/svc_rqst.c:769
#24 0x00007faa5c8fdeea in svc_rqst_epoll_events (sr_rec=0x1b62c60, n_events=1) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/svc_rqst.c:941
#25 0x00007faa5c8fe17f in svc_rqst_epoll_loop (sr_rec=0x1b62c60) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/svc_rqst.c:1014
#26 0x00007faa5c8fe232 in svc_rqst_run_task (wpe=0x1b62c60) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/svc_rqst.c:1050
#27 0x00007faa5c906dbb in work_pool_thread (arg=0x7fa7540008c0) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/work_pool.c:181
#28 0x00007faa5ba26dc5 in start_thread () from /lib64/libpthread.so.0
#29 0x00007faa5b33321d in clone () from /lib64/libc.so.6
(gdb) frame 3
#3  0x0000000000528657 in _mdcache_lru_unref_chunk (chunk=0x7fa7e03e31d0, func=0x59abd0 <__func__.20247> "mdcache_clean_dirent_chunks", line=579) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:2066
2066    in /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c
(gdb) p qlane->mtx
$42 = {__data = {__lock = 2, __count = 0, __owner = 225, __nusers = 1, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\002\000\000\000\000\000\000\000\341\000\000\000\001", '\000' <repeats 26 times>, __align = 2}
=======================================================================
(gdb) info threads
  84   Thread 0x7fa9a68e8700 (LWP 228)  0x00007faa5ba2a03e in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
  83   Thread 0x7fa9a69e9700 (LWP 227)  0x00007faa5ba29e24 in pthread_rwlock_rdlock () from /lib64/libpthread.so.0
  82   Thread 0x7fa9a6aea700 (LWP 225)  0x00007faa5ba2cf4d in __lll_lock_wait () from /lib64/libpthread.so.0
  81   Thread 0x7fa9a6beb700 (LWP 226)  0x00007faa5ba2a03e in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
  80   Thread 0x7fa9a6cec700 (LWP 224)  0x00007faa5ba2a03e in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
  79   Thread 0x7fa9a6ded700 (LWP 223)  0x00007faa5ba29e24 in pthread_rwlock_rdlock () from /lib64/libpthread.so.0
  78   Thread 0x7fa9a6eee700 (LWP 222)  0x00007faa5ba2a03e in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
  77   Thread 0x7fa9a6fef700 (LWP 221)  0x00007faa5ba29e24 in pthread_rwlock_rdlock () from /lib64/libpthread.so.0
  76   Thread 0x7fa9a70f0700 (LWP 220)  0x00007faa5ba2a03e in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
  75   Thread 0x7fa9a71f1700 (LWP 219)  0x00007faa5ba2a03e in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
  74   Thread 0x7fa9a72f2700 (LWP 218)  0x00007faa5ba29e24 in pthread_rwlock_rdlock () from /lib64/libpthread.so.0
  73   Thread 0x7fa9a73f3700 (LWP 217)  0x00007faa5ba2a03e in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
=======================================================================
(gdb) thread 82
[Switching to thread 82 (Thread 0x7fa9a6aea700 (LWP 225))]
#0  0x00007faa5ba2cf4d in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00007faa5ba2cf4d in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007faa5ba2a307 in _L_lock_14 () from /lib64/libpthread.so.0
#2  0x00007faa5ba2a2b3 in pthread_rwlock_trywrlock () from /lib64/libpthread.so.0
#3  0x0000000000524547 in lru_reap_chunk_impl (qid=LRU_ENTRY_L2, parent=0x7fa7f8122810) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:809
#4  0x0000000000524a01 in mdcache_get_chunk (parent=0x7fa7f8122810, prev_chunk=0x0, whence=0) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:886
#5  0x000000000053f0b6 in mdcache_populate_dir_chunk (directory=0x7fa7f8122810, whence=0, dirent=0x7fa9a6ae7f80, prev_chunk=0x0, eod_met=0x7fa9a6ae7f7f) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:2554
#6  0x0000000000540aff in mdcache_readdir_chunked (directory=0x7fa7f8122810, whence=0, dir_state=0x7fa9a6ae8130, cb=0x43215e <populate_dirent>, attrmask=122830, eod_met=0x7fa9a6ae884b) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:2991
#7  0x000000000052ee71 in mdcache_readdir (dir_hdl=0x7fa7f8122848, whence=0x7fa9a6ae8110, dir_state=0x7fa9a6ae8130, cb=0x43215e <populate_dirent>, attrmask=122830, eod_met=0x7fa9a6ae884b) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:559
#8  0x0000000000432a85 in fsal_readdir (directory=0x7fa7f8122848, cookie=0, nbfound=0x7fa9a6ae884c, eod_met=0x7fa9a6ae884b, attrmask=122830, cb=0x491b0f <nfs3_readdirplus_callback>, opaque=0x7fa9a6ae8800) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/fsal_helper.c:1164
#9  0x0000000000491968 in nfs3_readdirplus (arg=0x7fa91470b9f8, req=0x7fa91470b2f0, res=0x7fa91407b8f0) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/Protocols/NFS/nfs3_readdirplus.c:310
#10 0x000000000045703b in nfs_rpc_process_request (reqdata=0x7fa91470b2f0) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1329
#11 0x000000000045781f in nfs_rpc_process_request_slowio (reqdata=0x7fa91470b2f0, slowio_check_cb=0x553b4a <nfs3_timeout_proc>) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1542
#12 0x0000000000457955 in nfs_rpc_valid_NFS (req=0x7fa91470b2f0) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1586
#13 0x00007faa5c901435 in svc_vc_decode (req=0x7fa91470b2f0) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/svc_vc.c:829
#14 0x000000000044a497 in nfs_rpc_decode_request (xprt=0x7fa8840027d0, xdrs=0x7fa9142a1370) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/MainNFSD/nfs_rpc_dispatcher_thread.c:1345
#15 0x00007faa5c901346 in svc_vc_recv (xprt=0x7fa8840027d0) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/svc_vc.c:802
#16 0x00007faa5c8fda92 in svc_rqst_xprt_task (wpe=0x7fa8840029e8) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/svc_rqst.c:769
#17 0x00007faa5c8fdeea in svc_rqst_epoll_events (sr_rec=0x1b62c60, n_events=1) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/svc_rqst.c:941
#18 0x00007faa5c8fe17f in svc_rqst_epoll_loop (sr_rec=0x1b62c60) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/svc_rqst.c:1014
#19 0x00007faa5c8fe232 in svc_rqst_run_task (wpe=0x1b62c60) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/svc_rqst.c:1050
#20 0x00007faa5c906dbb in work_pool_thread (arg=0x7fa9380008c0) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/work_pool.c:181
#21 0x00007faa5ba26dc5 in start_thread () from /lib64/libpthread.so.0
#22 0x00007faa5b33321d in clone () from /lib64/libc.so.6
(gdb) frame 3
#3  0x0000000000524547 in lru_reap_chunk_impl (qid=LRU_ENTRY_L2, parent=0x7fa7f8122810) at /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:809
809     in /export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c
(gdb) p entry->content_lock
$43 = {__data = {__lock = 2, __nr_readers = 32681, __readers_wakeup = 0, __writer_wakeup = 0, __nr_readers_queued = 0, __nr_writers_queued = 0, __writer = -598294472, __shared = 32680, __pad1 = 140366914447067, __pad2 = 0, __flags = 0}, __size = "\002\000\000\000\251\177", '\000' <repeats 18 times>, "\070\300V\334\250\177\000\000\333\346\022\270\251\177", '\000' <repeats 17 times>, __align = 140363826200578}
=======================================================================
--------------------------------
----- Original Message -----
From: "QR" <zhbingyin(a)sina.com>
To: "Daniel Gryniewicz" <dgryniew(a)redhat.com>, "ganesha-devel" <devel(a)lists.nfs-ganesha.org>
Subject: [NFS-Ganesha-Devel] Re: ganesha hang more than five hours
Date: 2020-02-26 18:21
Not yet, because core dumps were not enabled in the Docker container.
Will try to create a full backtrace for this, thanks.
--------------------------------
----- Original Message -----
From: Daniel Gryniewicz <dgryniew(a)redhat.com>
To: devel(a)lists.nfs-ganesha.org
Subject: [NFS-Ganesha-Devel] Re: ganesha hang more than five hours
Date: 2020-02-25 21:20
No, definitely not a known issue. Do you have a full backtrace of one
(or several, if they're different) hung threads?
Daniel
On 2/24/20 9:28 PM, QR wrote:
> Hi Dang,
>
> Ganesha hangs more than five hours. It seems that 198 svc threads hang
> on nfs3_readdirplus.
> Is there a known issue about this? Thanks in advance.
>
> Ganesha server info
> ganesha version: V2.7.6
> FSAL : In house
> nfs client info
> nfs version : nfs v3
> client info : CentOS 7.4
>
> _______________________________________________
> Devel mailing list -- devel(a)lists.nfs-ganesha.org
> To unsubscribe send an email to devel-leave(a)lists.nfs-ganesha.org
Change in ...nfs-ganesha[next]: FSAL: add ERR_FSAL_BUSY
by Jeff Layton (GerritHub)
Jeff Layton has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490855 )
Change subject: FSAL: add ERR_FSAL_BUSY
......................................................................
FSAL: add ERR_FSAL_BUSY
Change-Id: I611eb9f54640f01723cc4cf7220f22e765e3a685
Signed-off-by: Jeff Layton <jlayton(a)redhat.com>
---
M src/FSAL/commonlib.c
M src/Protocols/NFS/nfs_proto_tools.c
M src/SAL/state_misc.c
M src/include/fsal_types.h
M src/support/nfs_convert.c
5 files changed, 7 insertions(+), 0 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/55/490855/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490855
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I611eb9f54640f01723cc4cf7220f22e765e3a685
Gerrit-Change-Number: 490855
Gerrit-PatchSet: 1
Gerrit-Owner: Jeff Layton <jlayton(a)redhat.com>
Gerrit-MessageType: newchange
Change in ...nfs-ganesha[next]: FSAL_CEPH: set ino_release_cb on mount
by Jeff Layton (GerritHub)
Jeff Layton has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490848 )
Change subject: FSAL_CEPH: set ino_release_cb on mount
......................................................................
FSAL_CEPH: set ino_release_cb on mount
Newer versions of libcephfs allow the application to register
callbacks that will be called when the client has been requested
to shrink the number of caps it holds.
When libcephfs calls into the application to request releasing an
inode, try to satisfy that request by calling the shrink upcall.
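For context, the registration on mount would look roughly like the
sketch below. ceph_ll_register_callbacks() and the ino_release_cb field
come from newer libcephfs, but the exact struct layout, callback
signature and the shrink_upcall_for_inode() helper shown here are
assumptions for illustration, not code quoted from this patch.

#include <cephfs/libcephfs.h>

/* Sketch only: exact libcephfs signatures may differ by version. */
static void ceph_fsal_ino_release_cb(void *handle, vinodeno_t ino)
{
	/* 'handle' is whatever we registered below (the FSAL export).
	 * Build the key for 'ino' and invoke the shrink upcall so mdcache
	 * can try to drop the cached entry for that inode. */
	shrink_upcall_for_inode(handle, ino);	/* hypothetical helper */
}

static void register_release_callback(struct ceph_mount_info *cmount,
				      void *export_handle)
{
	struct ceph_client_callback_args args = {
		.handle = export_handle,
		.ino_release_cb = ceph_fsal_ino_release_cb,
	};

	ceph_ll_register_callbacks(cmount, &args);
}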
Change-Id: Ic89f5f3aa1e1474c1aa3d4e59e9f69248266ab18
Signed-off-by: Jeff Layton <jlayton(a)redhat.com>
---
M src/FSAL/FSAL_CEPH/main.c
M src/cmake/modules/FindCEPHFS.cmake
M src/include/config-h.in.cmake
3 files changed, 45 insertions(+), 0 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/48/490848/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490848
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: Ic89f5f3aa1e1474c1aa3d4e59e9f69248266ab18
Gerrit-Change-Number: 490848
Gerrit-PatchSet: 1
Gerrit-Owner: Jeff Layton <jlayton(a)redhat.com>
Gerrit-MessageType: newchange
Change in ...nfs-ganesha[next]: FSAL_UP: add new "shrink" upcall
by Jeff Layton (GerritHub)
Jeff Layton has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490847 )
Change subject: FSAL_UP: add new "shrink" upcall
......................................................................
FSAL_UP: add new "shrink" upcall
Ceph MDSes can request that the client shrink its caps cache in
response to memory pressure, and libcephfs now has a way to set
callbacks that get called when it wants to drop caps for an inode.
Add a new shrink upcall that FSAL_CEPH can call for this.
The upcall will find the entry and test its refcount. If the refcount is
2, then we know that there are no other references than the one we
currently hold and the sentinel reference. That should mean that it's
safe to drop it from the hash and try to kill it off.
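The refcount test could look roughly like the sketch below; the helper
and field names approximate the mdcache code and this is not the actual
patch. It also shows why the related change in this series adds
ERR_FSAL_BUSY: a busy entry simply cannot be released yet.

static fsal_status_t mdcache_try_release(mdcache_entry_t *entry)
{
	/* One reference is held by this caller and one by the sentinel,
	 * so a refcount of exactly 2 means nobody else is using the
	 * entry right now. */
	if (atomic_fetch_int32_t(&entry->lru.refcnt) != 2)
		return fsalstat(ERR_FSAL_BUSY, 0);	/* still in use */

	/* Safe to drop the entry from the hash (helper name approximate)
	 * and let the final unref clean it up and free it. */
	cih_remove_checked(entry);
	return fsalstat(ERR_FSAL_NO_ERROR, 0);
}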
Change-Id: Ib195569c88563235014fe1b7e6c37ea1333ceeea
Signed-off-by: Jeff Layton <jlayton(a)redhat.com>
---
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_up.c
M src/FSAL_UP/fsal_up_top.c
M src/include/fsal_up.h
3 files changed, 78 insertions(+), 1 deletion(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/47/490847/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490847
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: Ib195569c88563235014fe1b7e6c37ea1333ceeea
Gerrit-Change-Number: 490847
Gerrit-PatchSet: 1
Gerrit-Owner: Jeff Layton <jlayton(a)redhat.com>
Gerrit-MessageType: newchange
Change in ...nfs-ganesha[next]: mdcache: cih_latch_entry always returns true
by Jeff Layton (GerritHub)
Jeff Layton has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490826 )
Change subject: mdcache: cih_latch_entry always returns true
......................................................................
mdcache: cih_latch_entry always returns true
Get rid of some dead branches in the callers, and reduce indentation.
Change-Id: I9f2b17f97e2c1da69abc09b1abf477402efe8223
Signed-off-by: Jeff Layton <jlayton(a)redhat.com>
---
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_hash.h
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c
2 files changed, 63 insertions(+), 74 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/26/490826/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490826
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I9f2b17f97e2c1da69abc09b1abf477402efe8223
Gerrit-Change-Number: 490826
Gerrit-PatchSet: 1
Gerrit-Owner: Jeff Layton <jlayton(a)redhat.com>
Gerrit-MessageType: newchange
Re: [NFS-Ganesha-Devel] Re: NFSv3 mounts hang from a client forever and new mounts to the same share hangs
by QR
RHEL 7.7 is OK; I didn't try higher kernel versions. But I think you are right.
--------------------------------
----- Original Message -----
From: Deepthi Shivaramu <des(a)vmware.com>
To: "zhbingyin(a)sina.com" <zhbingyin(a)sina.com>, ganesha-devel <devel(a)lists.nfs-ganesha.org>
Subject: [NFS-Ganesha-Devel] Re: NFSv3 mounts hang from a client forever and new mounts to the same share hangs
Date: 2020-04-23 18:03
Thanks a lot for the pointer. Yes, this makes complete sense.
In our system we fail over an active NFS server by sending RST to the clients connected to it, and in the tests I mentioned we had a failover before we started seeing this issue.
So this issue is fixed in RHEL 7.7 and higher, if my understanding of the kernel versions is right?
From: QR <zhbingyin(a)sina.com>
Reply to: "zhbingyin(a)sina.com" <zhbingyin(a)sina.com>
Date: Thursday, 23 April 2020 at 3:20 AM
To: Deepthi Shivaramu <des(a)vmware.com>, ganesha-devel <devel(a)lists.nfs-ganesha.org>
Subject: Re: [NFS-Ganesha-Devel] NFSv3 mounts hang from a client forever and new mounts to the same share hangs
refer to https://lore.kernel.org/linux-nfs/20181212135157.4489-1-dwysocha@redhat.c...
--------------------------------
----- Original Message -----
From: des(a)vmware.com
To: devel(a)lists.nfs-ganesha.org
Subject: [NFS-Ganesha-Devel] NFSv3 mounts hang from a client forever and new mounts to the same share hangs
Date: 2020-04-22 14:36
We are using NFS-Ganesha and mounting NFSv3 with auth_sys on RHEL 7.6 Linux clients.
I am seeing a weird issue: after running some system tests, some Linux clients enter a state where the existing mount point for one share (let's say testShare1) becomes inaccessible, and trying to mount the same share again hangs forever. The client
does not get out of this situation at all. The strange thing is that it is able to mount other shares successfully; on top of that, testShare1 is accessible fine from other clients.
Packet captures on the client show no packets on the wire, and the ganesha logs do not contain any hint either.
This looks like a client issue, and we have RHEL 7.6 Linux clients in this setup. In the client's /var/log/messages I see this error continuously:
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: xs_tcp_send_request(524460) = -32
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 56546 marshaling UNIX cred ffff889b0829c900
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 57696 call_status (status -32)
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 56546 using AUTH_UNIX cred ffff889b0829c900 to wrap rpc data
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 56546 xprt_transmit(524460)
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: xs_tcp_send_request(524460) = -32
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 57696 call_bind (status 0)
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 56546 call_status (status -32)
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 57696 call_connect xprt ffff889b3c5a2800 is connected
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 57696 call_transmit (status 0)
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 57696 xprt_prepare_transmit
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 57696 rpc_xdr_encode (status 0)
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 57696 marshaling UNIX cred ffff889b0829c900
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 57696 using AUTH_UNIX cred ffff889b0829c900 to wrap rpc data
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 56546 call_bind (status 0)
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 56546 call_connect xprt ffff889b3c5a6000 is connected
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 56546 call_transmit (status 0)
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 56546 xprt_prepare_transmit
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 56546 rpc_xdr_encode (status 0)
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 56546 marshaling UNIX cred ffff889b0829c900
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 57696 xprt_transmit(524460)
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: xs_tcp_send_request(524460) = -32
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 56546 using AUTH_UNIX cred ffff889b0829c900 to wrap rpc data
Apr 19 03:47:00 w1h34v25-c0006 kernel: RPC: 57696 call_status (status -32)
#define EPIPE 32 /* Broken pipe */
xs_tcp_send_request - write an RPC request to a TCP socket
https://github.com/torvalds/linux/blob/master/net/sunrpc/xprtsock.c#L1027
The Linux source code pointed to above shows the client is not able to send the RPC request on the socket; the socket send is failing with EPIPE.
I believe the NFS packets are failing to be sent out on this RPC transport; since all access for a particular share gets associated with the same transport, all of it keeps failing with EPIPE.
I see the thread at https://bugzilla.redhat.com/show_bug.cgi?id=692315#c15 where Jeff Layton was discussing the same issue, but that seems to have been fixed in RHEL 6.2 itself.
Jeff, can you please help us understand whether the fix for the above bug is in RHEL 7.6, and if so, why we still see this issue?
Change in ...nfs-ganesha[next]: Handle incorrect RPCSEC_GSS_DATA messages
by Trishali Nayar (GerritHub)
Trishali Nayar has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490802 )
Change subject: Handle incorrect RPCSEC_GSS_DATA messages
......................................................................
Handle incorrect RPCSEC_GSS_DATA messages
If authentication is RPCSEC_GSS, we can get messages with an
incorrect sequence number. libntirpc does not send an error in
this case, so as the caller the ganesha code should return a valid
error to the client. Otherwise it can cause krb5i and krb5p mounts
to hang.
Change-Id: I445f3e20142e48f45918270ffabc66659fb96e76
Signed-off-by: Trishali Nayar <ntrishal(a)in.ibm.com>
---
M src/MainNFSD/nfs_worker_thread.c
1 file changed, 9 insertions(+), 0 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/02/490802/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490802
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I445f3e20142e48f45918270ffabc66659fb96e76
Gerrit-Change-Number: 490802
Gerrit-PatchSet: 1
Gerrit-Owner: Trishali Nayar <ntrishal(a)in.ibm.com>
Gerrit-MessageType: newchange