Daniel,
I saw the following core again with V2.7.4 and the following patches applied including the last MDCACHE one.
Took about 5 days of hard testing to hit the core. There still seems an edge case that does an extra chunk unref.
You mentioned last time that the chunk cannot be freed by another thread (in between the content lock release and re-acquire) because we have a ref count. But could the code
path that removes the “sentinel” ref result in this chunk being re-used by the LRU since it still has the ref count from this path. ( Code that removes the sentinel reference: mdcache_populate_dir_chunk() the section with the comment “Put sentinal ref”.
) Or maybe there is a completely different edge case that does an extra unref.
Minor: The commit heading for 2.7.5 says 2.7.4
https://github.com/nfs-ganesha/nfs-ganesha/commit/6407d3d133c9346ae586cecf1ccac9c6bb3d36f8
Thanks,
Vandana
Patches applied:
https://github.com/nfs-ganesha/nfs-ganesha/commit/c7e7d24877085dbb0424542d9d8339c722771716
https://github.com/nfs-ganesha/nfs-ganesha/commit/136df4f262c3f9bc29df456eac50921321826da3
https://github.com/nfs-ganesha/nfs-ganesha/commit/c98ad1e238f5db9db5ab8db8cea1c5cf1f123d76
https://github.com/nfs-ganesha/nfs-ganesha/commit/f0d5b8d4f6dcce4597459c3ebaf36d1e96e4645b
https://github.com/nfs-ganesha/nfs-ganesha/commit/e11cd5b9f8d6ffbb46061e7cd2b5a3e247bf09af
https://github.com/nfs-ganesha/nfs-ganesha/commit/5439f63b8a1e00f6ff56a4def58f960f03676c7b
https://github.com/nfs-ganesha/nfs-ganesha/commit/e306c01685b74970ada2259f2bf393cdc1233af6
Here are the code snippets around the lines of code in the core, after mdcache_find_keyed_reason gets an error.
if (!has_write) {
/* We will have to re-find this dirent after we
* re-acquire the lock.
*/
look_ck = dirent->ck;
PTHREAD_RWLOCK_unlock(&directory->content_lock);
PTHREAD_RWLOCK_wrlock(&directory->content_lock);
has_write = true;
/* Dropping the content_lock may have
* invalidated some or all of the dirents and/or
* chunks in this directory. We need to start
* over from this point. look_ck is now correct
* if the dirent is still cached, and we haven't
* changed next_ck, so it's still correct for
* reloading the chunk.
*/
first_pass = true;
>>>>>>>>>> Line 3133 mdcache_lru_unref_chunk(chunk);
chunk = NULL;
/* Now we need to look for this dirent again.
* We haven't updated next_ck for this dirent
* yet, so it is the right whence to use for a
* repopulation readdir if the chunk is
* discarded.
*/
goto again;
(gdb) print lru_state
$1 = {entries_hiwat = 500000, entries_used = 518407, chunks_hiwat = 100000, chunks_used = 99952,
fds_system_imposed = 400000, fds_hard_limit = 396000, fds_hiwat = 360000, fds_lowat = 200000,
futility = 0, per_lane_work = 50, biggest_window = 160000, prev_fd_count = 244632,
prev_time = 1562643400, fd_state = 2}
(gdb) bt
#0
0x00007fdf573e5c40 in pthread_mutex_lock () from /lib64/libpthread.so.0
#1
0x000000000052af4a in _mdcache_lru_unref_chunk (chunk=0x56035bc0,
func=0x598a00 <__func__.23678> "mdcache_readdir_chunked", line=3133)
at /src/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:2058
#2
0x0000000000542d84 in mdcache_readdir_chunked (directory=0x1a244a20, whence=35907657,
dir_state=0x7fdf4ba7baf0, cb=0x4323ed <populate_dirent>, attrmask=0, eod_met=0x7fdf4ba7bfeb)
at /src/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:3133
#3
0x000000000053054c in mdcache_readdir (dir_hdl=0x1a244a58, whence=0x7fdf4ba7bad0,
dir_state=0x7fdf4ba7baf0, cb=0x4323ed <populate_dirent>, attrmask=0, eod_met=0x7fdf4ba7bfeb)
at /src/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:559
#4
0x0000000000432d14 in fsal_readdir (directory=0x1a244a58, cookie=35907657,
nbfound=0x7fdf4ba7bfec, eod_met=0x7fdf4ba7bfeb, attrmask=0, cb=0x491d35 <nfs3_readdir_callback>,
opaque=0x7fdf4ba7bfa0) at /src/src/FSAL/fsal_helper.c:1164
#5
0x0000000000491b1d in nfs3_readdir (arg=0x3a08e098, req=0x3a08d990, res=0x20466d70)
at /src/src/Protocols/NFS/nfs3_readdir.c:289
#6
0x0000000000457c16 in nfs_rpc_process_request (reqdata=0x3a08d990)
at /src/src/MainNFSD/nfs_worker_thread.c:1328
#7
0x00000000004583d5 in nfs_rpc_valid_NFS (req=0x3a08d990)
at /src/src/MainNFSD/nfs_worker_thread.c:1548
#8
0x00007fdf583c0034 in svc_vc_decode (req=0x3a08d990) at /src/src/libntirpc/src/svc_vc.c:829
#9
0x000000000044adc5 in nfs_rpc_decode_request (xprt=0x3dd4adf0, xdrs=0x3cfd0940)
at /src/src/MainNFSD/nfs_rpc_dispatcher_thread.c:1345
#10 0x00007fdf583bff45 in svc_vc_recv (xprt=0x3dd4adf0) at /src/src/libntirpc/src/svc_vc.c:802
---Type <return> to continue, or q <return> to quit---q
Quit
(gdb) select-frame 2
(gdb) info locals
status = {major = ERR_FSAL_NOENT, minor = 0}
cb_result = DIR_CONTINUE
entry = 0x0
attrs = {request_mask = 0, valid_mask = 1433550, supported = 1433582, type = REGULAR_FILE,
filesize = 1059147, fsid = {major = 0, minor = 0}, acl = 0x0, fileid = 35907648, mode = 493,
numlinks = 1, owner = 4294967294, group = 4294967294, rawdev = {major = 0, minor = 0}, atime = {
tv_sec = 1562643413407000, tv_nsec = 0}, creation = {tv_sec = 0, tv_nsec = 0}, ctime = {
tv_sec = 1562643413407000, tv_nsec = 0}, mtime = {tv_sec = 1562624796, tv_nsec = 478000000},
chgtime = {tv_sec = 1562643413407000, tv_nsec = 0}, spaceused = 1059147,
change = 1562643413407000000, generation = 0, expire_time_attr = 60, fs_locations = 0x0,
sec_label = {slai_lfs = {lfs_lfs = 0, lfs_pi = 0}, slai_data = {slai_data_len = 0,
slai_data_val = 0x0}}}
dirent = 0x3a584590
has_write = true
set_first_ck = false
next_ck = 35907651
look_ck = 35907664
chunk = 0x56035bc0
first_pass = true
eod = false
reload_chunk = false
__func__ = "mdcache_readdir_chunked"
__PRETTY_FUNCTION__ = "mdcache_readdir_chunked"
(gdb)
status = {major = ERR_FSAL_NOENT, minor = 0}
cb_result = DIR_CONTINUE
entry = 0x0
attrs = {request_mask = 0, valid_mask = 1433550, supported = 1433582, type = REGULAR_FILE,
filesize = 1059147, fsid = {major = 0, minor = 0}, acl = 0x0, fileid = 35907648, mode = 493,
numlinks = 1, owner = 4294967294, group = 4294967294, rawdev = {major = 0, minor = 0}, atime = {
tv_sec = 1562643413407000, tv_nsec = 0}, creation = {tv_sec = 0, tv_nsec = 0}, ctime = {
tv_sec = 1562643413407000, tv_nsec = 0}, mtime = {tv_sec = 1562624796, tv_nsec = 478000000},
chgtime = {tv_sec = 1562643413407000, tv_nsec = 0}, spaceused = 1059147,
change = 1562643413407000000, generation = 0, expire_time_attr = 60, fs_locations = 0x0,
sec_label = {slai_lfs = {lfs_lfs = 0, lfs_pi = 0}, slai_data = {slai_data_len = 0,
slai_data_val = 0x0}}}
dirent = 0x3a584590
has_write = true
set_first_ck = false
next_ck = 35907651
look_ck = 35907664
chunk = 0x56035bc0
first_pass = true
eod = false
reload_chunk = false
__func__ = "mdcache_readdir_chunked"
__PRETTY_FUNCTION__ = "mdcache_readdir_chunked"
(gdb) print *dirent
$2 = {chunk_list = {next = 0x7fdf585dd380 <xdr_ioq_ops>, prev = 0x34247680}, chunk = 0x0,
node_name = {left = 0x0, right = 0x0, parent = 457566472}, node_ck = {left = 0x3b255758,
right = 0x1b45e8a0, parent = 457566368}, node_sorted = {left = 0x1b45e908, right = 0x1b45e908,
parent = 25769803776}, ck = 1, eod = false, namehash = 0, ckey = {hk = 0, fsal = 0x0, kv = {
addr = 0x0, len = 0}}, flags = 0, entry = 0x3dd4b088,
name = 0x10000000000 <Address 0x10000000000 out of bounds>, name_buffer = 0x3a584640 ""}
(gdb) print *chunk
$3 = {chunks = {next = 0x24b1c303, prev = 0xa386010002000000}, dirents = {next = 0x300000003000000,
prev = 0x1c00000001000000}, parent = 0x7000000aaaaaaaa, chunk_lru = {q = {
next = 0x6e776f6e6b6e75, prev = 0xfefffffffeffffff}, qid = LRU_ENTRY_NONE, refcnt = 0,
flags = 0, lane = 268435456, cf = 16777283}, reload_ck = 1099511627778,
next_ck = 7236828793636126720, num_entries = 858680687}