should have been run under 4.1 and thus if we really are both caching
something in slot cache and drc somehow, they would have tripped also.
From the backtrace line numbers, it is clear that nfs_dupreq_rele()
has DUPREQ_NOCACHE. So DRC is NOT caching but only freeing.
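A minimal sketch of that release behaviour, not the actual Ganesha code. All `demo_*` names are hypothetical, `nocache` stands in for the DUPREQ_NOCACHE marker, and the cacheability rule is assumed to be simply "minor version 0 only":

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified request record; not the Ganesha struct. */
struct demo_req {
	uint32_t minorversion;
	bool nocache;	/* plays the role of DUPREQ_NOCACHE */
	void *result;	/* heap-allocated reply payload */
};

/* Assumed rule: only NFSv4.0 requests go through the DRC. */
bool demo_v4_cacheable(uint32_t minorversion)
{
	return minorversion == 0;
}

/* Mark the request as uncached when the DRC must not keep it. */
void demo_dupreq_start(struct demo_req *req)
{
	assert(req != NULL);
	req->nocache = !demo_v4_cacheable(req->minorversion);
}

/* Release path: an uncached request is freed here and nowhere else.
 * Returns true if this call freed the result. */
bool demo_dupreq_rele(struct demo_req *req)
{
	assert(req != NULL);
	if (!req->nocache)
		return false;	/* the cache keeps the result alive */

	free(req->result);
	req->result = NULL;
	return true;
}
```

In this model a 4.1 request is always freed on release, so a second owner of the same result (such as a slot cache entry) would have to hold its own copy or take the result out of this path.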
Regards, Malahal.
On Fri, Oct 5, 2018 at 1:54 AM Frank Filz <ffilzlnx(a)mindspring.com> wrote:
nfs_dupreq_v4_cacheable is supposed to exclude all 4.1 requests.
I wonder if something changed. There are other requests that free things
in their Free functions that should have been run under 4.1 and thus if we
really are both caching something in slot cache and drc somehow, they would
have tripped also.
Here’s all the places that gsh_free something in res:
Protocols/NFS/nfs4_Compound.c nfs4_Compound 962
gsh_free(res->res_compound4.resarray.resarray_val);
Protocols/NFS/nfs4_Compound.c nfs4_Compound_Free 1128
gsh_free(res->res_compound4.resarray.resarray_val);
Protocols/NFS/nfs4_Compound.c nfs4_Compound_Free 1131
gsh_free(res->res_compound4.tag.utf8string_val);
Protocols/NFS/nfs4_op_exchange_id.c nfs4_op_exchange_id_Free 441
gsh_free(resok->eir_server_scope.eir_server_scope_val);
Protocols/NFS/nfs4_op_exchange_id.c nfs4_op_exchange_id_Free 442
gsh_free(resok->eir_server_owner.so_major_id.so_major_id_val);
Protocols/NFS/nfs4_op_exchange_id.c nfs4_op_exchange_id_Free 443
gsh_free(resok->eir_server_impl_id.eir_server_impl_id_val);
Protocols/NFS/nfs4_op_getdeviceinfo.c nfs4_op_getdeviceinfo_Free 205
gsh_free(resok->gdir_device_addr.da_addr_body.da_addr_body_val);
Protocols/NFS/nfs4_op_getdevicelist.c nfs4_op_getdevicelist_Free 196
gsh_free(resok->gdlr_deviceid_list.gdlr_deviceid_list_val);
Protocols/NFS/nfs4_op_getfh.c nfs4_op_getfh_Free 138
gsh_free(resp->GETFH4res_u.resok4.object.nfs_fh4_val);
Protocols/NFS/nfs4_op_read.c nfs4_op_read_Free 615
gsh_free(resp->READ4res_u.resok4.data.data_val);
Protocols/NFS/nfs4_op_readlink.c nfs4_op_readlink_Free 125
gsh_free(resp->READLINK4res_u.resok4.link.utf8string_val);
Protocols/NFS/nfs4_op_secinfo.c nfs4_op_secinfo_Free 337
gsh_free(resp->SECINFO4res_u.resok4.SECINFO4resok_val);
Protocols/NFS/nfs4_op_secinfo_no_name.c nfs4_op_secinfo_no_name_Free 209
gsh_free(resp->SECINFO4res_u.resok4.SECINFO4resok_val);
Protocols/NFS/nfs4_op_setclientid.c nfs4_op_setclientid_Free 384
gsh_free(resp->SETCLIENTID4res_u.client_using.r_addr);
Protocols/NFS/nfs4_op_test_stateid.c nfs4_op_test_stateid_Free 121
gsh_free(res->tsr_status_codes.tsr_status_codes_val);
Protocols/NFS/nfs4_op_xattr.c nfs4_op_getxattr_Free 142
gsh_free(res_GETXATTR4->GETXATTR4res_u.resok4.gr_value.utf8string_val);
Protocols/NFS/nfs4_op_xattr.c nfs4_op_listxattr_Free 316
gsh_free(res_LISTXATTR4->LISTXATTR4res_u.resok4.lr_names.entries);
So I’m struggling to see what is unique about test_stateid other than that
it didn’t check the return code, which can only be NFS4_OK or
NFS4ERR_INVAL, and only invalid if TEST_STATEID was issued with
minorversion = 0.
Frank
*From:* Malahal Naineni [mailto:malahal@gmail.com]
*Sent:* Thursday, October 4, 2018 12:49 PM
*To:* ffilzlnx(a)mindspring.com
*Cc:* patrice.lucas(a)cea.fr; devel(a)lists.nfs-ganesha.org
*Subject:* Re: [NFS-Ganesha-Devel] Re: double-free bug
nfs4_Compound_Free() looks at *res_cached* to free the stuff or not. The
code in nfs4_op_sequence() sets it to False and then
calls nfs4_Compound_Free(). I don't see any lock that prevents Thread10
(nfs_dupreq_rele path) running at the same time. I am new to this code, so
I might be wrong in my analysis though!
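To make the concern concrete, here is a simplified sketch of a free routine gated on a `res_cached` style flag. The struct and the `demo_*` functions are hypothetical stand-ins, not the real nfs4_Compound_Free() or nfs4_op_sequence() code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified compound result; not the Ganesha struct. */
struct demo_compound_res {
	bool res_cached;	/* true: the slot cache owns the payload */
	void *payload;
};

/* Returns true if the payload was freed by this call. */
bool demo_compound_free(struct demo_compound_res *res)
{
	assert(res != NULL);
	if (res->res_cached)
		return false;	/* owned by the cache, leave it alone */

	/* Nothing here is atomic: a second thread could pass the
	 * check above before this free() completes. */
	free(res->payload);
	res->payload = NULL;
	return true;
}

/* Mirrors the sequence described above: clear the flag, then free. */
bool demo_evict_cached(struct demo_compound_res *res)
{
	assert(res != NULL);
	res->res_cached = false;
	return demo_compound_free(res);
}
```

Single-threaded this is safe, but if another thread reaches demo_compound_free() on the same result after the flag is cleared, both can call free() on the same pointer.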
One option is to bypass DRC for NFS4.1 and above.
Regards, Malahal.
On Fri, Oct 5, 2018 at 12:01 AM Frank Filz <ffilzlnx(a)mindspring.com>
wrote:
Yea, pretty much all 4.1 requests are not cacheable (4.1 uses the slot cache
and so has no need of the dupreq cache).
With my patch, test_stateid isn’t doing anything different than any of the
other 4.1 ops that actually free memory in their Free routines.
Maybe they all can lead to double free? In which case somewhere along the
line we are doing something wrong with the dup req cache for 4.1...
Frank
*From:* Malahal Naineni [mailto:malahal@gmail.com]
*Sent:* Thursday, October 4, 2018 11:16 AM
*To:* ffilzlnx(a)mindspring.com
*Cc:* patrice.lucas(a)cea.fr; devel(a)lists.nfs-ganesha.org
*Subject:* [NFS-Ganesha-Devel] Re: double-free bug
Thread10 thinks that op_test_stateid is not cacheable, so it actually frees
the response and other goodies allocated. But thread7 finds it in the slot
cache and tries to free it, leading to a double free. The code path has to be
for minor version 1 or 2 (not zero) based on line numbers. I don't know
much about 4.1 slot cache.
Regards, Malahal.
On Thu, Oct 4, 2018 at 8:52 PM Frank Filz <ffilzlnx(a)mindspring.com> wrote:
The only thing I can think of is that a TEST_STATEID was issued with minor
version = 0 which is the only way it can fail.
I’m going to submit a fix that checks for return status before freeing.
A couple Free routines NULL out the values they free, but almost all check
for NFS4_OK. There are a couple others that also don’t check. I’ll fix
those too.
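A rough sketch of the kind of Free routine meant here, combining both habits (check the status, then NULL out what was freed). The types and names are simplified, hypothetical stand-ins for the real NFSv4 result structures:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for NFS4_OK and NFS4ERR_INVAL. */
enum demo_status { DEMO_OK = 0, DEMO_ERR_INVAL = 22 };

/* Hypothetical, simplified TEST_STATEID result. */
struct demo_test_stateid_res {
	enum demo_status status;
	uint32_t codes_len;
	uint32_t *codes_val;	/* only allocated when status == DEMO_OK */
};

/* Free the payload only for a successful result, then clear the
 * pointer so that a repeated call on the same struct is a no-op. */
void demo_test_stateid_free(struct demo_test_stateid_res *res)
{
	assert(res != NULL);
	if (res->status != DEMO_OK)
		return;	/* nothing was allocated on the error path */

	free(res->codes_val);
	res->codes_val = NULL;
	res->codes_len = 0;
}
```

Note that clearing the pointer only protects against a second free through the same struct. It does not help if two threads race, or if two structs hold copies of the same pointer.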
Frank
*From:* patrice.lucas(a)cea.fr [mailto:patrice.lucas@cea.fr]
*Sent:* Thursday, October 4, 2018 6:43 AM
*To:* devel(a)lists.nfs-ganesha.org
*Subject:* [NFS-Ganesha-Devel] double-free bug
Hello everyone,
Frequent memory crashes have been occurring for a few weeks in the
nfs-ganesha CEA FSAL-PROXY continuous integration test. I finally made time
to look at these problems today by running the nfs-ganesha server under
Address Sanitizer.
I got the following stack with a double-free error. Could anyone explain
this error? Someone who understands the dup-req cache well? Or someone who
has already worked with the code of the nfs4_op_test_stateid operation?
The nfs4_op_test_stateid operation was introduced this summer by gerrit patch 418826
<https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/418826> from
fatih-acar <https://review.gerrithub.io/q/owner:fatih%2540gandi.net>,
07/22/2018.
The dup-req cache stack seems to be involved in this error.
Regards,
Patrice
==7037==ERROR: AddressSanitizer: attempting double-free on 0x60200001ced0
in thread T7:
#0 0x480c09 in __interceptor_free (/usr/bin/ganesha.nfsd+0x480c09)
#1 0x897125 in gsh_free /opt/nfs-ganesha/src/include/abstract_mem.h:299
#2 0x896f88 in nfs4_op_test_stateid_Free
/opt/nfs-ganesha/src/Protocols/NFS/nfs4_op_test_stateid.c:121
#3 0x703702 in nfs4_Compound_FreeOne
/opt/nfs-ganesha/src/Protocols/NFS/nfs4_Compound.c:1081
#4 0x7042c4 in nfs4_Compound_Free
/opt/nfs-ganesha/src/Protocols/NFS/nfs4_Compound.c:1119
#5 0x865c4a in nfs4_op_sequence
/opt/nfs-ganesha/src/Protocols/NFS/nfs4_op_sequence.c:185
#6 0x6fd80f in nfs4_Compound
/opt/nfs-ganesha/src/Protocols/NFS/nfs4_Compound.c:903
#7 0x67167c in nfs_rpc_process_request
/opt/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1329
#8 0x663040 in nfs_rpc_valid_NFS
/opt/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1539
#9 0x7ffff7bb94a1 in svc_vc_decode
/opt/nfs-ganesha/src/libntirpc/src/svc_vc.c:824
#10 0x6542ce in nfs_rpc_decode_request
/opt/nfs-ganesha/src/MainNFSD/nfs_rpc_dispatcher_thread.c:1341
#11 0x7ffff7bb934c in svc_vc_recv
/opt/nfs-ganesha/src/libntirpc/src/svc_vc.c:797
#12 0x7ffff7bb47be in svc_rqst_xprt_task
/opt/nfs-ganesha/src/libntirpc/src/svc_rqst.c:767
#13 0x7ffff7bb51af in svc_rqst_epoll_events
/opt/nfs-ganesha/src/libntirpc/src/svc_rqst.c:939
#14 0x7ffff7bb4e94 in svc_rqst_epoll_loop
/opt/nfs-ganesha/src/libntirpc/src/svc_rqst.c:1012:8
#15 0x7ffff7bb38bf in svc_rqst_run_task
/opt/nfs-ganesha/src/libntirpc/src/svc_rqst.c:1048:14
#16 0x7ffff7bc077c in work_pool_thread
/opt/nfs-ganesha/src/libntirpc/src/work_pool.c:181
#17 0x7ffff6367e24 in start_thread (/lib64/libpthread.so.0+0x7e24)
#18 0x7ffff575c34c in __clone (/lib64/libc.so.6+0xf834c)
0x60200001ced0 is located 0 bytes inside of 4-byte region
[0x60200001ced0,0x60200001ced4)
freed by thread T10 here:
#0 0x480c09 in __interceptor_free (/usr/bin/ganesha.nfsd+0x480c09)
#1 0x897125 in gsh_free /opt/nfs-ganesha/src/include/abstract_mem.h:299
#2 0x896f88 in nfs4_op_test_stateid_Free
/opt/nfs-ganesha/src/Protocols/NFS/nfs4_op_test_stateid.c:121
#3 0x703702 in nfs4_Compound_FreeOne
/opt/nfs-ganesha/src/Protocols/NFS/nfs4_Compound.c:1081
#4 0x7042c4 in nfs4_Compound_Free
/opt/nfs-ganesha/src/Protocols/NFS/nfs4_Compound.c:1119
#5 0xcec2a4 in nfs_dupreq_rele
/opt/nfs-ganesha/src/RPCAL/nfs_dupreq.c:1315
#6 0x673196 in nfs_rpc_process_request
/opt/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1442
#7 0x663040 in nfs_rpc_valid_NFS
/opt/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1539
#8 0x7ffff7bb94a1 in svc_vc_decode
/opt/nfs-ganesha/src/libntirpc/src/svc_vc.c:824
#9 0x6542ce in nfs_rpc_decode_request
/opt/nfs-ganesha/src/MainNFSD/nfs_rpc_dispatcher_thread.c:1341
#10 0x7ffff7bb934c in svc_vc_recv
/opt/nfs-ganesha/src/libntirpc/src/svc_vc.c:797
#11 0x7ffff7bb47be in svc_rqst_xprt_task
/opt/nfs-ganesha/src/libntirpc/src/svc_rqst.c:767
#12 0x7ffff7bb51af in svc_rqst_epoll_events
/opt/nfs-ganesha/src/libntirpc/src/svc_rqst.c:939
#13 0x7ffff7bb4e94 in svc_rqst_epoll_loop
/opt/nfs-ganesha/src/libntirpc/src/svc_rqst.c:1012:8
#14 0x7ffff7bb38bf in svc_rqst_run_task
/opt/nfs-ganesha/src/libntirpc/src/svc_rqst.c:1048:14
#15 0x7ffff7bc077c in work_pool_thread
/opt/nfs-ganesha/src/libntirpc/src/work_pool.c:181
#16 0x7ffff6367e24 in start_thread (/lib64/libpthread.so.0+0x7e24)
previously allocated by thread T10 here:
#0 0x480e59 in calloc (/usr/bin/ganesha.nfsd+0x480e59)
#1 0x89689a in gsh_calloc__
/opt/nfs-ganesha/src/include/abstract_mem.h:145
#2 0x895c4e in nfs4_op_test_stateid
/opt/nfs-ganesha/src/Protocols/NFS/nfs4_op_test_stateid.c:88:3
#3 0x6fd80f in nfs4_Compound
/opt/nfs-ganesha/src/Protocols/NFS/nfs4_Compound.c:903
#4 0x67167c in nfs_rpc_process_request
/opt/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1329
#5 0x663040 in nfs_rpc_valid_NFS
/opt/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1539
#6 0x7ffff7bb94a1 in svc_vc_decode
/opt/nfs-ganesha/src/libntirpc/src/svc_vc.c:824
#7 0x6542ce in nfs_rpc_decode_request
/opt/nfs-ganesha/src/MainNFSD/nfs_rpc_dispatcher_thread.c:1341
#8 0x7ffff7bb934c in svc_vc_recv
/opt/nfs-ganesha/src/libntirpc/src/svc_vc.c:797
#9 0x7ffff7bb47be in svc_rqst_xprt_task
/opt/nfs-ganesha/src/libntirpc/src/svc_rqst.c:767
#10 0x7ffff7bb51af in svc_rqst_epoll_events
/opt/nfs-ganesha/src/libntirpc/src/svc_rqst.c:939
#11 0x7ffff7bb4e94 in svc_rqst_epoll_loop
/opt/nfs-ganesha/src/libntirpc/src/svc_rqst.c:1012:8
#12 0x7ffff7bb38bf in svc_rqst_run_task
/opt/nfs-ganesha/src/libntirpc/src/svc_rqst.c:1048:14
#13 0x7ffff7bc077c in work_pool_thread
/opt/nfs-ganesha/src/libntirpc/src/work_pool.c:181
#14 0x7ffff6367e24 in start_thread (/lib64/libpthread.so.0+0x7e24)
_______________________________________________
Devel mailing list -- devel(a)lists.nfs-ganesha.org
To unsubscribe send an email to devel-leave(a)lists.nfs-ganesha.org