Thanks for the quick fix. This is looking good in testing so far. A memory-growth
issue is also becoming visible; I'll start a separate thread about that.
-----Original Message-----
From: Daniel Gryniewicz [mailto:dang@redhat.com]
Sent: Tuesday, January 22, 2019 2:13 PM
To: devel(a)lists.nfs-ganesha.org
Subject: [NFS-Ganesha-Devel] Re: V2.7.1 Crash when cleaning mdcache entries
https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/441566
Daniel
On 1/22/19 12:50 PM, Daniel Gryniewicz wrote:
> I don't think either of those is relevant here. Those involve chunks
> being freed, and are not related to entries. The problem here is that
> the final unref cleanup somehow recursed back into the unref and
> tried to clean up the entry again. This caused it to attempt to take a
> lock it already held, which triggered the assert.
>
> I think there may be a tiny race window here. I'll look into it.
>
> Daniel
>
>
> On 1/22/19 12:16 PM, Kropelin, Adam wrote:
>> (gdb) bt
>>
>> #0 0x00007f1520efe277 in raise () from /lib64/libc.so.6
>>
>> #1 0x00007f1520eff968 in abort () from /lib64/libc.so.6
>>
>> #2 0x0000000000536388 in cih_remove_checked (entry=0x22a9d1a0) at
>> /src/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_hash.h:394
>>
>> #3 0x0000000000537248 in mdc_clean_entry (entry=0x22a9d1a0) at
>> /src/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:264
>>
>> #4 0x000000000052480d in mdcache_lru_clean (entry=0x22a9d1a0) at
>> /src/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:591
>>
>> #5 0x000000000052963c in _mdcache_lru_unref (entry=0x22a9d1a0,
>> flags=0, func=0x596b90 <__func__.23067> "cih_remove_checked",
>> line=406)
>>
>> at /src/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:1989
>>
>> #6 0x000000000053642d in cih_remove_checked (entry=0x22a9d1a0) at
>> /src/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_hash.h:406
>>
>> #7 0x0000000000537248 in mdc_clean_entry (entry=0x22a9d1a0) at
>> /src/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:264
>>
>> #8 0x000000000052480d in mdcache_lru_clean (entry=0x22a9d1a0) at
>> /src/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:591
>>
>> #9 0x000000000052963c in _mdcache_lru_unref (entry=0x22a9d1a0,
>> flags=0, func=0x5941df <__func__.22972> "mdcache_put", line=196)
>>
>> at /src/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:1989
>>
>> #10 0x000000000052cb54 in mdcache_put (entry=0x22a9d1a0) at
>> /src/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.h:196
>>
>> #11 0x0000000000530bca in mdcache_put_ref (obj_hdl=0x22a9d1d8) at
>> /src/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:1280
>>
>> #12 0x00000000004327e5 in populate_dirent (name=0x43f30278
>> "random.267", obj=0x22a9d1d8, attrs=0x22a9d268,
>> dir_state=0x7f15176e9af0, cookie=3786442)
>>
>> at /src/src/FSAL/fsal_helper.c:1064
>>
>> #13 0x0000000000541a25 in mdcache_readdir_chunked
>> (directory=0xcf7c7c0, whence=3786313, dir_state=0x7f15176e9af0,
>> cb=0x43217c <populate_dirent>,
>>
>> attrmask=0, eod_met=0x7f15176e9feb) at
>> /src/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:3146
>>
>> #14 0x000000000052e8c3 in mdcache_readdir (dir_hdl=0xcf7c7f8,
>> whence=0x7f15176e9ad0, dir_state=0x7f15176e9af0, cb=0x43217c
>> <populate_dirent>, attrmask=0,
>>
>> eod_met=0x7f15176e9feb) at
>> /src/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:559
>>
>> #15 0x0000000000432a76 in fsal_readdir (directory=0xcf7c7f8,
>> cookie=3786313, nbfound=0x7f15176e9fec, eod_met=0x7f15176e9feb,
>> attrmask=0,
>>
>> cb=0x4912a2 <nfs3_readdir_callback>, opaque=0x7f15176e9fa0) at
>> /src/src/FSAL/fsal_helper.c:1158
>>
>> #16 0x000000000049108a in nfs3_readdir (arg=0x996f788, req=0x996f080,
>> res=0x3f3c6a30) at /src/src/Protocols/NFS/nfs3_readdir.c:289
>>
>> #17 0x00000000004574d1 in nfs_rpc_process_request (reqdata=0x996f080)
>> at /src/src/MainNFSD/nfs_worker_thread.c:1329
>>
>> #18 0x0000000000457c90 in nfs_rpc_valid_NFS (req=0x996f080) at
>> /src/src/MainNFSD/nfs_worker_thread.c:1549
>>
>> #19 0x00007f15226b1e75 in svc_vc_decode (req=0x996f080) at
>> /src/src/libntirpc/src/svc_vc.c:825
>>
>> #20 0x000000000044a688 in nfs_rpc_decode_request (xprt=0x4f4e77f0,
>> xdrs=0xd57ef60) at /src/src/MainNFSD/nfs_rpc_dispatcher_thread.c:1341
>>
>> #21 0x00007f15226b1d86 in svc_vc_recv (xprt=0x4f4e77f0) at
>> /src/src/libntirpc/src/svc_vc.c:798
>>
>> #22 0x00007f15226ae4d3 in svc_rqst_xprt_task (wpe=0x4f4e7a08) at
>> /src/src/libntirpc/src/svc_rqst.c:767
>>
>> #23 0x00007f15226ae94d in svc_rqst_epoll_events (sr_rec=0x101b440,
>> n_events=1) at /src/src/libntirpc/src/svc_rqst.c:939
>>
>> #24 0x00007f15226aebe2 in svc_rqst_epoll_loop (sr_rec=0x101b440) at
>> /src/src/libntirpc/src/svc_rqst.c:1012
>>
>> #25 0x00007f15226aec95 in svc_rqst_run_task (wpe=0x101b440) at
>> /src/src/libntirpc/src/svc_rqst.c:1048
>>
>> (gdb) print *entry
>>
>> $2 = {attr_lock = {__data = {__lock = 0, __nr_readers = 0,
>> __readers_wakeup = 0, __writer_wakeup = 0, __nr_readers_queued = 0,
>> __nr_writers_queued = 0,
>>
>> __writer = 0, __shared = 0, __pad1 = 0, __pad2 = 0, __flags =
>> 0}, __size = '\000' <repeats 55 times>, __align = 0}, obj_handle =
>> {handles = {
>>
>> next = 0x12a5ac78, prev = 0x12de5418}, fs = 0x0, fsal =
>> 0x7d3340 <MDCACHE>, obj_ops = 0x7d3490 <MDCACHE+336>, obj_lock =
>> {__data = {__lock = 0,
>>
>> __nr_readers = 0, __readers_wakeup = 0, __writer_wakeup = 0,
>> __nr_readers_queued = 0, __nr_writers_queued = 0, __writer = 0,
>> __shared = 0,
>>
>> __pad1 = 0, __pad2 = 0, __flags = 0}, __size = '\000'
>> <repeats 55 times>, __align = 0}, type = REGULAR_FILE, fsid = {major
>> = 0, minor = 0},
>>
>> fileid = 2454576, state_hdl = 0x22a9d428}, sub_handle = 0x0,
>> attrs = {request_mask = 1433550, valid_mask = 1433550, supported =
>> 1433582,
>>
>> type = REGULAR_FILE, filesize = 1024, fsid = {major = 0, minor =
>> 0}, acl = 0x0, fileid = 2454576, mode = 438, numlinks = 1, owner =
>> 65534,
>>
>> group = 65534, rawdev = {major = 0, minor = 0}, atime = {tv_sec
>> = 1547875446, tv_nsec = 112000000}, creation = {tv_sec = 0, tv_nsec =
>> 0}, ctime = {
>>
>> tv_sec = 1547875446, tv_nsec = 112000000}, mtime = {tv_sec =
>> 1547875446, tv_nsec = 112000000}, chgtime = {tv_sec = 1547875446,
>> tv_nsec = 112000000},
>>
>> spaceused = 1024, change = 1547875446112, generation = 0,
>> expire_time_attr = 60, fs_locations = 0x0}, fh_hk = {node_k = {left =
>> 0x0, right = 0x0,
>>
>> parent = 267421682}, key = {hk = 10676371249170983334, fsal =
>> 0x7f15186d45e0 <FOO>, kv = {addr = 0x4761f300, len = 10}}, inavl =
>> false},
>>
>> mde_flags = 1, attr_time = 1547998073, acl_time = 0,
>> fs_locations_time = 0, lru = {q = {next = 0x0, prev = 0x0}, qid =
>> LRU_ENTRY_L1, refcnt = 0,
>>
>> flags = 0, lane = 10, cf = 0}, export_list = {next = 0x22a9d3d8,
>> prev = 0x22a9d3d8}, first_export_id = -1, content_lock = {__data =
>> {__lock = 0,
>>
>> __nr_readers = 0, __readers_wakeup = 0, __writer_wakeup = 0,
>> __nr_readers_queued = 0, __nr_writers_queued = 0, __writer = 0,
>> __shared = 0,
>>
>> __pad1 = 0, __pad2 = 0, __flags = 0}, __size = '\000' <repeats
>> 55 times>, __align = 0}, fsobj = {hdl = {state_lock = {__data =
>> {__lock = 0,
>> {__lock = 0,
>>
>> __nr_readers = 0, __readers_wakeup = 0, __writer_wakeup = 0,
>> __nr_readers_queued = 0, __nr_writers_queued = 0, __writer = 0,
>> __shared = 0,
>>
>> __pad1 = 0, __pad2 = 0, __flags = 0}, __size = '\000'
>> <repeats 55 times>, __align = 0}, no_cleanup = false, {file = {obj =
>> 0x22a9d1d8,
>>
>> list_of_states = {next = 0x22a9d470, prev = 0x22a9d470},
>> layoutrecall_list = {next = 0x22a9d480, prev = 0x22a9d480}, lock_list
>> = {
>>
>> next = 0x22a9d490, prev = 0x22a9d490}, nlm_share_list =
>> {next = 0x22a9d4a0, prev = 0x22a9d4a0}, write_delegated = false,
>> fdeleg_stats = {
>>
>> fds_curr_delegations = 0, fds_deleg_type = OPEN_DELEGATE_NONE,
>> fds_delegation_count = 0, fds_recall_count = 0, fds_avg_hold = 0,
>>
>> fds_last_delegation = 0, fds_last_recall = 0, fds_num_opens = 0,
>> fds_first_open = 0}, anon_ops = 0}, dir = {junction_export =
>> 0x22a9d1d8,
>>
>> export_roots = {next = 0x22a9d470, prev = 0x22a9d470},
>> exp_root_refcount = 581555328}}}, fsdir = {chunks = {next = 0x0, prev
>> = 0x0}, detached = {
>>
>> next = 0x0, prev = 0x0}, spin = 0, detached_count = 0, dhdl
>> = {state_lock = {__data = {__lock = 0, __nr_readers = 0,
>> __readers_wakeup = 0,
>>
>> __writer_wakeup = 0, __nr_readers_queued = 0, __nr_writers_queued =
>> 0, __writer = 581554648, __shared = 0, __pad1 = 581555312,
>>
>> __pad2 = 581555312, __flags = 581555328},
>>
>> __size = '\000' <repeats 24 times>,
>>
>> "\330ѩ\"\000\000\000\000p\324\251\"\000\000\000\000p\324\251\"\000\00
>> 0\000\000\200\324\251\"\000\000\000",
>>
>>
>> __align = 0}, no_cleanup = 128, {file = {obj = 0x22a9d490,
>> list_of_states = {next = 0x22a9d490, prev = 0x22a9d4a0},
>> layoutrecall_list = {
>>
>> next = 0x22a9d4a0, prev = 0x0}, lock_list = {next =
>> 0x0, prev = 0x0}, nlm_share_list = {next = 0x0, prev = 0x0},
>> write_delegated = false,
>>
>> fdeleg_stats = {fds_curr_delegations = 0, fds_deleg_type =
>> OPEN_DELEGATE_NONE, fds_delegation_count = 0, fds_recall_count = 0,
>>
>> fds_avg_hold = 0, fds_last_delegation = 0, fds_last_recall = 0,
>> fds_num_opens = 0, fds_first_open = 0}, anon_ops = 0}, dir = {
>>
>> junction_export = 0x22a9d490, export_roots = {next = 0x22a9d490, prev
>> = 0x22a9d4a0}, exp_root_refcount = 581555360}}}, parent = {addr =
>> 0x0,
>>
>> len = 0}, first_ck = 0, avl = {t = {root = 0x0, cmp_fn =
>> 0x0, height = 0, first = 0x0, last = 0x0, size = 0}, ck = {root =
>> 0x0, cmp_fn = 0x0,
>>
>> height = 0, first = 0x0, last = 0x0, size = 0}, sorted =
>> {root = 0x0, cmp_fn = 0x0, height = 0, first = 0x0, last = 0x0, size
>> = 0},
>>
>> collisions = 0}}}}
>>
>
_______________________________________________
Devel mailing list -- devel(a)lists.nfs-ganesha.org To unsubscribe send an email
to devel-leave(a)lists.nfs-ganesha.org