I haven't seen this one before, and from a quick look the code seems
strange. The only assert in that bit of code checks that the parent is a
directory, but the parent directory is not something Ganesha passed in;
it was looked up internally in libcephfs. This is beyond my expertise at
this point. Maybe some Ceph logs would help?
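
To make the point above concrete, here is a toy model of the pattern, not
the libcephfs code. The class ToyClient and its fields are made up for
illustration; only the method names loosely mirror frames #4 and #5 of the
backtrace. The caller supplies just an inode number, the parent is
resolved inside the client, and the assert fires if that internally found
parent is not a directory.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Inode:
    ino: int
    is_dir: bool
    parent_ino: Optional[int] = None  # parent as recorded by the toy client


class ToyClient:
    """Hypothetical stand-in for a filesystem client library."""

    def __init__(self) -> None:
        self.inodes: Dict[int, Inode] = {}

    def add(self, inode: Inode) -> None:
        self.inodes[inode.ino] = inode

    def _lookup_name(self, inode: Inode, parent: Inode) -> None:
        # The invariant under discussion: the parent must be a directory.
        assert parent.is_dir, (
            f"parent {parent.ino} of inode {inode.ino} is not a directory"
        )

    def lookup_inode(self, ino: int) -> Inode:
        # The caller (the NFS server, in this thread) passes only `ino`.
        inode = self.inodes[ino]
        if inode.parent_ino is not None:
            # The parent is resolved here, inside the client, so a bad
            # parent reflects the client's own state, not caller input.
            parent = self.inodes[inode.parent_ino]
            self._lookup_name(inode, parent)
        return inode
```

In this model, a lookup of an inode whose recorded parent is a regular
file raises AssertionError even though the caller passed a valid inode
number, which is why the state inside the client is the interesting part.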
Daniel
On 7/15/19 10:54 AM, David C wrote:
Hi all,
I'm running nfs-ganesha 2.7.3 with the CEPH FSAL to export CephFS
(Luminous). It ran well for a few days and then crashed. I have a core
dump; could someone help me debug it, please?
(gdb) bt
#0 0x00007f04dcab6207 in raise () from /lib64/libc.so.6
#1 0x00007f04dcab78f8 in abort () from /lib64/libc.so.6
#2 0x00007f04d2a9d6c5 in ceph::__ceph_assert_fail(char const*, char const*, int, char const*) () from /usr/lib64/ceph/libceph-common.so.0
#3 0x00007f04d2a9d844 in ceph::__ceph_assert_fail(ceph::assert_data const&) () from /usr/lib64/ceph/libceph-common.so.0
#4 0x00007f04cc807f04 in Client::_lookup_name(Inode*, Inode*, UserPerm const&) () from /lib64/libcephfs.so.2
#5 0x00007f04cc81c41f in Client::ll_lookup_inode(inodeno_t, UserPerm const&, Inode**) () from /lib64/libcephfs.so.2
#6 0x00007f04ccadbf0e in create_handle (export_pub=0x1baff10, desc=<optimized out>, pub_handle=0x7f0470fd4718, attrs_out=0x7f0470fd4740) at /usr/src/debug/nfs-ganesha-2.7.3/FSAL/FSAL_CEPH/export.c:256
#7 0x0000000000523895 in mdcache_locate_host (fh_desc=0x7f0470fd4920, export=export@entry=0x1bafbf0, entry=entry@entry=0x7f0470fd48b8, attrs_out=attrs_out@entry=0x0) at /usr/src/debug/nfs-ganesha-2.7.3/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:1011
#8 0x000000000051d278 in mdcache_create_handle (exp_hdl=0x1bafbf0, fh_desc=<optimized out>, handle=0x7f0470fd4900, attrs_out=0x0) at /usr/src/debug/nfs-ganesha-2.7.3/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:1578
#9 0x000000000046d404 in nfs4_mds_putfh (data=data@entry=0x7f0470fd4ea0) at /usr/src/debug/nfs-ganesha-2.7.3/Protocols/NFS/nfs4_op_putfh.c:211
#10 0x000000000046d8e8 in nfs4_op_putfh (op=0x7f03effaf1d0, data=0x7f0470fd4ea0, resp=0x7f03ec1de1f0) at /usr/src/debug/nfs-ganesha-2.7.3/Protocols/NFS/nfs4_op_putfh.c:281
#11 0x000000000045d120 in nfs4_Compound (arg=<optimized out>, req=<optimized out>, res=0x7f03ec1de9d0) at /usr/src/debug/nfs-ganesha-2.7.3/Protocols/NFS/nfs4_Compound.c:942
#12 0x00000000004512cd in nfs_rpc_process_request (reqdata=0x7f03ee5ed4b0) at /usr/src/debug/nfs-ganesha-2.7.3/MainNFSD/nfs_worker_thread.c:1328
#13 0x0000000000450766 in nfs_rpc_decode_request (xprt=0x7f02180c2320, xdrs=0x7f03ec568ab0) at /usr/src/debug/nfs-ganesha-2.7.3/MainNFSD/nfs_rpc_dispatcher_thread.c:1345
#14 0x00007f04df45d07d in svc_rqst_xprt_task (wpe=0x7f02180c2538) at /usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/svc_rqst.c:769
#15 0x00007f04df45d59a in svc_rqst_epoll_events (n_events=<optimized out>, sr_rec=0x4bb53e0) at /usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/svc_rqst.c:941
#16 svc_rqst_epoll_loop (sr_rec=<optimized out>) at /usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/svc_rqst.c:1014
#17 svc_rqst_run_task (wpe=0x4bb53e0) at /usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/svc_rqst.c:1050
#18 0x00007f04df465123 in work_pool_thread (arg=0x7f044c0008c0) at /usr/src/debug/nfs-ganesha-2.7.3/libntirpc/src/work_pool.c:181
#19 0x00007f04dda05dd5 in start_thread () from /lib64/libpthread.so.0
#20 0x00007f04dcb7dead in clone () from /lib64/libc.so.6
Package versions:
nfs-ganesha-2.7.3-0.1.el7.x86_64
nfs-ganesha-ceph-2.7.3-0.1.el7.x86_64
libcephfs2-14.2.1-0.el7.x86_64
librados2-14.2.1-0.el7.x86_64
I notice that my Ceph log shows a bunch of slow requests around the time
it went down. I'm not sure whether they are a symptom of Ganesha crashing
or a contributing factor.
Thanks,
David