Nothing directly related to the content lock. However, several
use-after-free races have been fixed, and it's not impossible that trying
to interact with a freed lock will cause a hang. The commits in
question are:
11e0e375e40658267cbf449afacaa53a136f7097
eb98f5b855147f44e79fe08dcec8d5057b05ea30
There have been quite a few fixes to readdir since 2.7.6 (and
specifically to whence-is-name; I believe your FSAL is a whence-is-name
FSAL?) so you might want to look into updating to a newer version.
2.8.4 should come out next week, and 3.3 soon after that.
Daniel
On 4/23/20 11:17 AM, QR wrote:
Hi Dang, I generated a core dump for this.
It seems something is wrong with "entry->content_lock".
Is this a known issue? Thanks in advance.
Ganesha server info
ganesha version: V2.7.6
FSAL : In house
nfs client info
nfs version : nfs v3
client info : CentOS 7.4
=======================================================================================================================================================================
(gdb) thread 204
[Switching to thread 204 (Thread 0x7fa7932f2700 (LWP 348))]
#0 0x00007faa5ba2cf4d in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0 0x00007faa5ba2cf4d in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007faa5ba28d02 in _L_lock_791 () from /lib64/libpthread.so.0
#2 0x00007faa5ba28c08 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x0000000000528657 in _mdcache_lru_unref_chunk
(chunk=0x7fa7e03e31d0, func=0x59abd0 <__func__.20247>
"mdcache_clean_dirent_chunks", line=579)
at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:2066
#4 0x000000000053894c in mdcache_clean_dirent_chunks (entry=0x7fa740035400)
at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:578
#5 0x0000000000538a30 in mdcache_dirent_invalidate_all
(entry=0x7fa740035400)
at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:603
#6 0x00000000005376fc in mdc_clean_entry (entry=0x7fa740035400) at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:302
#7 0x0000000000523559 in mdcache_lru_clean (entry=0x7fa740035400) at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:592
#8 0x00000000005278b1 in mdcache_lru_get (sub_handle=0x7fa7383cdf40) at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:1821
#9 0x0000000000536e91 in _mdcache_alloc_handle (export=0x1b4eed0,
sub_handle=0x7fa7383cdf40, fs=0x0, reason=MDC_REASON_DEFAULT,
func=0x59ac10 <__func__.20274> "mdcache_new_entry", line=691)
at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:174
#10 0x0000000000538c50 in mdcache_new_entry (export=0x1b4eed0,
sub_handle=0x7fa7383cdf40, attrs_in=0x7fa7932f00a0,
attrs_out=0x7fa7932f0760, new_directory=false, entry=0x7fa7932f0018,
state=0x0, reason=MDC_REASON_DEFAULT) at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:690
#11 0x000000000052d156 in mdcache_alloc_and_check_handle
(export=0x1b4eed0, sub_handle=0x7fa7383cdf40, new_obj=0x7fa7932f01b0,
new_directory=false, attrs_in=0x7fa7932f00a0,
attrs_out=0x7fa7932f0760, tag=0x5999a4 "lookup ",
parent=0x7fa878288cb0, name=0x7fa738229270 "cer_7_0.5",
invalidate=0x7fa7932f009f, state=0x0)
at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:100
#12 0x000000000053aea1 in mdc_lookup_uncached
(mdc_parent=0x7fa878288cb0, name=0x7fa738229270 "cer_7_0.5",
new_entry=0x7fa7932f02c8, attrs_out=0x7fa7932f0760)
at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:1391
#13 0x000000000053ab27 in mdc_lookup (mdc_parent=0x7fa878288cb0,
name=0x7fa738229270 "cer_7_0.5", uncached=true,
new_entry=0x7fa7932f02c8, attrs_out=0x7fa7932f0760)
at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:1325
#14 0x000000000052d5bb in mdcache_lookup (parent=0x7fa878288ce8,
name=0x7fa738229270 "cer_7_0.5", handle=0x7fa7932f0878,
attrs_out=0x7fa7932f0760)
at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:181
#15 0x0000000000431a60 in fsal_lookup (parent=0x7fa878288ce8,
name=0x7fa738229270 "cer_7_0.5", obj=0x7fa7932f0878,
attrs_out=0x7fa7932f0760)
at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/fsal_helper.c:683
#16 0x000000000048e83e in nfs3_lookup (arg=0x7fa7381bdc88,
req=0x7fa7381bd580, res=0x7fa73839dab0)
at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/Protocols/NFS/nfs3_lookup.c:104
#17 0x000000000045703b in nfs_rpc_process_request
(reqdata=0x7fa7381bd580) at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1329
#18 0x000000000045781f in nfs_rpc_process_request_slowio
(reqdata=0x7fa7381bd580, slowio_check_cb=0x553b4a <nfs3_timeout_proc>)
at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1542
#19 0x0000000000457955 in nfs_rpc_valid_NFS (req=0x7fa7381bd580) at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1586
#20 0x00007faa5c901435 in svc_vc_decode (req=0x7fa7381bd580) at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/svc_vc.c:829
#21 0x000000000044a497 in nfs_rpc_decode_request (xprt=0x7fa8840027d0,
xdrs=0x7fa7384befa0)
at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/MainNFSD/nfs_rpc_dispatcher_thread.c:1345
#22 0x00007faa5c901346 in svc_vc_recv (xprt=0x7fa8840027d0) at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/svc_vc.c:802
#23 0x00007faa5c8fda92 in svc_rqst_xprt_task (wpe=0x7fa8840029e8) at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/svc_rqst.c:769
#24 0x00007faa5c8fdeea in svc_rqst_epoll_events (sr_rec=0x1b62c60,
n_events=1) at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/svc_rqst.c:941
#25 0x00007faa5c8fe17f in svc_rqst_epoll_loop (sr_rec=0x1b62c60) at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/svc_rqst.c:1014
#26 0x00007faa5c8fe232 in svc_rqst_run_task (wpe=0x1b62c60) at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/svc_rqst.c:1050
#27 0x00007faa5c906dbb in work_pool_thread (arg=0x7fa7540008c0) at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/work_pool.c:181
#28 0x00007faa5ba26dc5 in start_thread () from /lib64/libpthread.so.0
#29 0x00007faa5b33321d in clone () from /lib64/libc.so.6
(gdb) frame 3
#3 0x0000000000528657 in _mdcache_lru_unref_chunk
(chunk=0x7fa7e03e31d0, func=0x59abd0 <__func__.20247>
"mdcache_clean_dirent_chunks", line=579)
at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:2066
2066 in
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c
(gdb) p qlane->mtx
$42 = {__data = {__lock = 2, __count = 0, *__owner = 225*, __nusers = 1,
__kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next =
0x0}},
__size = "\002\000\000\000\000\000\000\000\341\000\000\000\001",
'\000' <repeats 26 times>, __align = 2}
=======================================================================================================================================================================
(gdb) info threads
84 Thread 0x7fa9a68e8700 (LWP 228) 0x00007faa5ba2a03e in
pthread_rwlock_wrlock () from /lib64/libpthread.so.0
83 Thread 0x7fa9a69e9700 (LWP 227) 0x00007faa5ba29e24 in
pthread_rwlock_rdlock () from /lib64/libpthread.so.0
*82 Thread 0x7fa9a6aea700 (LWP 225)* 0x00007faa5ba2cf4d in
__lll_lock_wait () from /lib64/libpthread.so.0
81 Thread 0x7fa9a6beb700 (LWP 226) 0x00007faa5ba2a03e in
pthread_rwlock_wrlock () from /lib64/libpthread.so.0
80 Thread 0x7fa9a6cec700 (LWP 224) 0x00007faa5ba2a03e in
pthread_rwlock_wrlock () from /lib64/libpthread.so.0
79 Thread 0x7fa9a6ded700 (LWP 223) 0x00007faa5ba29e24 in
pthread_rwlock_rdlock () from /lib64/libpthread.so.0
78 Thread 0x7fa9a6eee700 (LWP 222) 0x00007faa5ba2a03e in
pthread_rwlock_wrlock () from /lib64/libpthread.so.0
77 Thread 0x7fa9a6fef700 (LWP 221) 0x00007faa5ba29e24 in
pthread_rwlock_rdlock () from /lib64/libpthread.so.0
76 Thread 0x7fa9a70f0700 (LWP 220) 0x00007faa5ba2a03e in
pthread_rwlock_wrlock () from /lib64/libpthread.so.0
75 Thread 0x7fa9a71f1700 (LWP 219) 0x00007faa5ba2a03e in
pthread_rwlock_wrlock () from /lib64/libpthread.so.0
74 Thread 0x7fa9a72f2700 (LWP 218) 0x00007faa5ba29e24 in
pthread_rwlock_rdlock () from /lib64/libpthread.so.0
73 Thread 0x7fa9a73f3700 (LWP 217) 0x00007faa5ba2a03e in
pthread_rwlock_wrlock () from /lib64/libpthread.so.0
=======================================================================================================================================================================
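The cross-reference above (the __owner of qlane->mtx against the LWP column of "info threads") can also be done mechanically. Below is a minimal Python sketch, assuming the gdb output has been saved as text. The helper names mutex_owner and thread_for_lwp are hypothetical, and the sample strings are excerpts of the session above with the emphasis asterisks removed. It relies on glibc storing the owner's kernel thread ID in __owner, which gdb reports as the LWP.

```python
import re

# Excerpts copied from the gdb session above (emphasis asterisks removed).
MUTEX_PRINT = "$42 = {__data = {__lock = 2, __count = 0, __owner = 225, __nusers = 1,"
INFO_THREADS = """\
84 Thread 0x7fa9a68e8700 (LWP 228) 0x00007faa5ba2a03e in pthread_rwlock_wrlock ()
83 Thread 0x7fa9a69e9700 (LWP 227) 0x00007faa5ba29e24 in pthread_rwlock_rdlock ()
82 Thread 0x7fa9a6aea700 (LWP 225) 0x00007faa5ba2cf4d in __lll_lock_wait ()
81 Thread 0x7fa9a6beb700 (LWP 226) 0x00007faa5ba2a03e in pthread_rwlock_wrlock ()
"""


def mutex_owner(text):
    """Return the __owner field (kernel thread ID) from a gdb mutex print."""
    m = re.search(r"__owner\s*=\s*(\d+)", text)
    return int(m.group(1)) if m else None


def thread_for_lwp(info_threads, lwp):
    """Return the gdb thread number whose LWP matches, or None if absent."""
    pattern = r"\s*\*?\s*(\d+)\s+Thread\s+0x[0-9a-f]+\s+\(LWP\s+(\d+)\)"
    for line in info_threads.splitlines():
        m = re.match(pattern, line)
        if m and int(m.group(2)) == lwp:
            return int(m.group(1))
    return None


owner = mutex_owner(MUTEX_PRINT)
holder = thread_for_lwp(INFO_THREADS, owner)
print(f"qlane->mtx is held by LWP {owner}, gdb thread {holder}")
```

For the excerpt above this points at gdb thread 82, the thread inspected next.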
(gdb) *thread 82*
[Switching to thread 82 (Thread 0x7fa9a6aea700 (LWP 225))]
#0 0x00007faa5ba2cf4d in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0 0x00007faa5ba2cf4d in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007faa5ba2a307 in _L_lock_14 () from /lib64/libpthread.so.0
#2 0x00007faa5ba2a2b3 in pthread_rwlock_trywrlock () from
/lib64/libpthread.so.0
#3 0x0000000000524547 in *lru_reap_chunk_impl* (qid=LRU_ENTRY_L2,
parent=0x7fa7f8122810)
at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:809
#4 0x0000000000524a01 in mdcache_get_chunk (parent=0x7fa7f8122810,
prev_chunk=0x0, whence=0)
at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:886
#5 0x000000000053f0b6 in mdcache_populate_dir_chunk
(directory=0x7fa7f8122810, whence=0, dirent=0x7fa9a6ae7f80,
prev_chunk=0x0, eod_met=0x7fa9a6ae7f7f)
at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:2554
#6 0x0000000000540aff in mdcache_readdir_chunked
(directory=0x7fa7f8122810, whence=0, dir_state=0x7fa9a6ae8130,
cb=0x43215e <populate_dirent>, attrmask=122830, eod_met=0x7fa9a6ae884b)
at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:2991
#7 0x000000000052ee71 in mdcache_readdir (dir_hdl=0x7fa7f8122848,
whence=0x7fa9a6ae8110, dir_state=0x7fa9a6ae8130, cb=0x43215e
<populate_dirent>, attrmask=122830, eod_met=0x7fa9a6ae884b)
at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:559
#8 0x0000000000432a85 in fsal_readdir (directory=0x7fa7f8122848,
cookie=0, nbfound=0x7fa9a6ae884c, eod_met=0x7fa9a6ae884b,
attrmask=122830, cb=0x491b0f <nfs3_readdirplus_callback>,
opaque=0x7fa9a6ae8800) at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/fsal_helper.c:1164
#9 0x0000000000491968 in nfs3_readdirplus (arg=0x7fa91470b9f8,
req=0x7fa91470b2f0, res=0x7fa91407b8f0)
at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/Protocols/NFS/nfs3_readdirplus.c:310
#10 0x000000000045703b in nfs_rpc_process_request
(reqdata=0x7fa91470b2f0) at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1329
#11 0x000000000045781f in nfs_rpc_process_request_slowio
(reqdata=0x7fa91470b2f0, slowio_check_cb=0x553b4a <nfs3_timeout_proc>)
at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1542
#12 0x0000000000457955 in nfs_rpc_valid_NFS (req=0x7fa91470b2f0) at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1586
#13 0x00007faa5c901435 in svc_vc_decode (req=0x7fa91470b2f0) at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/svc_vc.c:829
#14 0x000000000044a497 in nfs_rpc_decode_request (xprt=0x7fa8840027d0,
xdrs=0x7fa9142a1370)
at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/MainNFSD/nfs_rpc_dispatcher_thread.c:1345
#15 0x00007faa5c901346 in svc_vc_recv (xprt=0x7fa8840027d0) at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/svc_vc.c:802
#16 0x00007faa5c8fda92 in svc_rqst_xprt_task (wpe=0x7fa8840029e8) at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/svc_rqst.c:769
#17 0x00007faa5c8fdeea in svc_rqst_epoll_events (sr_rec=0x1b62c60,
n_events=1) at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/svc_rqst.c:941
#18 0x00007faa5c8fe17f in svc_rqst_epoll_loop (sr_rec=0x1b62c60) at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/svc_rqst.c:1014
#19 0x00007faa5c8fe232 in svc_rqst_run_task (wpe=0x1b62c60) at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/svc_rqst.c:1050
#20 0x00007faa5c906dbb in work_pool_thread (arg=0x7fa9380008c0) at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/libntirpc/src/work_pool.c:181
#21 0x00007faa5ba26dc5 in start_thread () from /lib64/libpthread.so.0
#22 0x00007faa5b33321d in clone () from /lib64/libc.so.6
(gdb) frame 3
#3 0x0000000000524547 in lru_reap_chunk_impl (qid=LRU_ENTRY_L2,
parent=0x7fa7f8122810)
at
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:809
809 in
/export/jcloud-zbs/src/jd.com/zfs/FSAL_SkyFS/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c
(gdb) p entry->content_lock
$43 = {__data = {__lock = 2, __nr_readers = *32681*, __readers_wakeup =
0, __writer_wakeup = 0, __nr_readers_queued = 0, __nr_writers_queued =
0, __writer = -598294472, __shared = 32680,
__pad1 = 140366914447067, __pad2 = 0, __flags = 0},
__size = "\002\000\000\000\251\177", '\000' <repeats 18
times>,
"\070\300V\334\250\177\000\000\333\346\022\270\251\177", '\000'
<repeats
17 times>, __align = 140363826200578}
=======================================================================================================================================================================
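One way to see why the content_lock values look wrong is to reinterpret the printed fields as raw memory. The sketch below is only an illustration, assuming the x86_64 field layout shown in the gdb output above (adjacent 32-bit __writer and __shared, followed by a 64-bit __pad1). The variable names are hypothetical.

```python
import struct

# Field values copied from the `p entry->content_lock` output above.
nr_readers = 32681
writer = -598294472      # gdb prints this field as a signed 32-bit int
shared = 32680
pad1 = 140366914447067

# __writer and __shared are adjacent 32-bit ints; read them back as a
# single little-endian 64-bit word, the way a pointer would be stored.
packed = struct.pack("<ii", writer, shared)
(as_pointer,) = struct.unpack("<Q", packed)

print(hex(nr_readers))   # 0x7fa9
print(hex(as_pointer))   # 0x7fa8dc56c038
print(hex(pad1))         # 0x7fa9b812e6db
```

All three values fall in the same 0x7fa... range as the heap and stack addresses in the backtraces. That would be consistent with the lock's memory holding unrelated data, for example after the entry was freed and reused, although it does not prove it.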
--------------------------------
----- Original Message -----
From: "QR" <zhbingyin(a)sina.com>
To: "Daniel Gryniewicz" <dgryniew(a)redhat.com>, "ganesha-devel"
<devel(a)lists.nfs-ganesha.org>
Subject: [NFS-Ganesha-Devel] Re: Re: ganesha hang more than five hours
Date: 2020-02-26 18:21
Not yet, because core dumps were not enabled in the Docker container.
I will try to capture a full backtrace for this, thanks.
--------------------------------
----- Original Message -----
From: Daniel Gryniewicz <dgryniew(a)redhat.com>
To: devel(a)lists.nfs-ganesha.org
Subject: [NFS-Ganesha-Devel] Re: ganesha hang more than five hours
Date: 2020-02-25 21:20
No, definitely not a known issue. Do you have a full backtrace of one
(or several, if they're different) hung threads?
Daniel
On 2/24/20 9:28 PM, QR wrote:
> Hi Dang,
>
> Ganesha has been hung for more than five hours. It seems that 198 svc
> threads are hung in nfs3_readdirplus.
> Is this a known issue? Thanks in advance.
>
> Ganesha server info
> ganesha version: V2.7.6
> FSAL : In house
> nfs client info
> nfs version : nfs v3
> client info : CentOS 7.4
>
> _______________________________________________
> Devel mailing list -- devel(a)lists.nfs-ganesha.org
> To unsubscribe send an email to devel-leave(a)lists.nfs-ganesha.org
_______________________________________________
Devel mailing list -- devel(a)lists.nfs-ganesha.org
To unsubscribe send an email to devel-leave(a)lists.nfs-ganesha.org