Ganesha Version: 2.5.1

Platform: Linux x86_64

 

I am seeing the following crash when re-exporting a Ganesha export (remove_export followed by add_export).

The crash happens during the add_export operation, while some applications are still using the export. We also implemented a mechanism to stall I/O while this operation is in progress, but it did not help.

 

Program terminated with signal 11, Segmentation fault.

#0  0x0000000000530d27 in mdcache_lru_cleanup_push (entry=0x7f3078007f50)

    at /usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:935

935                     LRU_DQ_SAFE(lru, q);

Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-12.el7.x86_64 dbus-libs-1.10.24-7.el7.x86_64 elfutils-libelf-0.160-1.el7.x86_64 elfutils-libs-0.160-1.el7.x86_64 gssproxy-0.3.0-10.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-19.el7.x86_64 libacl-2.2.51-14.el7.x86_64 libattr-2.4.46-13.el7.x86_64 libblkid-2.23.2-21.el7.x86_64 libcap-2.22-8.el7.x86_64 libcom_err-1.42.9-12.el7_5.x86_64 libgcrypt-1.5.3-12.el7.x86_64 libgpg-error-1.12-3.el7.x86_64 libnfsidmap-0.25-11.el7.x86_64 libselinux-2.5-12.el7.x86_64 libuuid-2.23.2-21.el7.x86_64 lz4-1.7.5-2.el7.x86_64 pcre-8.32-17.el7.x86_64 systemd-libs-219-57.el7.x86_64 xz-libs-5.1.2-9alpha.el7.x86_64 zlib-1.2.7-13.el7.x86_64

(gdb) bt

#0  0x0000000000530d27 in mdcache_lru_cleanup_push (entry=0x7f3078007f50)

    at /usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:935

#1  0x000000000054a0fc in _mdcache_kill_entry (entry=0x7f3078007f50,

    file=0x5a0970 "/root/rpmbuild/BUILD/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c", line=218,

    function=0x5a2120 <__func__.23400> "mdcache_alloc_handle")

    at /usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:3310

#2  0x00000000005419de in mdcache_alloc_handle (export=0x7f30bc05d070, sub_handle=0x7f3078007c20, fs=0x0)

    at /usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:218

#3  0x00000000005430ea in mdcache_new_entry (export=0x7f30bc05d070, sub_handle=0x7f3078007c20, attrs_in=0x7f31287c6670, attrs_out=0x0, new_directory=false,

    entry=0x7f31287c67e8, state=0x0) at /usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:590

#4  0x0000000000544093 in mdcache_locate_host (fh_desc=0x7f31287c6c60, export=0x7f30bc05d070, entry=0x7f31287c67e8, attrs_out=0x0)

    at /usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:998

#5  0x000000000053d1b3 in mdcache_create_handle (exp_hdl=0x7f30bc05d070, fh_desc=0x7f31287c6c60, handle=0x7f31287c6c58, attrs_out=0x0)

    at /usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:1902

#6  0x000000000047729d in nfs4_mds_putfh (data=0x7f31287c6d60) at /usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/Protocols/NFS/nfs4_op_putfh.c:211

#7  0x0000000000477486 in nfs4_op_putfh (op=0x7f30a00424c0, data=0x7f31287c6d60, resp=0x7f3078000d30)

    at /usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/Protocols/NFS/nfs4_op_putfh.c:281

#8  0x000000000045f670 in nfs4_Compound (arg=0x7f30a00010f0, req=0x7f30a00008e8, res=0x7f3078001fa0)

    at /usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/Protocols/NFS/nfs4_Compound.c:743

#9  0x000000000044c20d in nfs_rpc_execute (reqdata=0x7f30a00008c0) at /usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/MainNFSD/nfs_worker_thread.c:1289

#10 0x000000000044ca17 in worker_run (ctx=0x1d88bf0) at /usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/MainNFSD/nfs_worker_thread.c:1561

#11 0x0000000000508a7a in fridgethr_start_routine (arg=0x1d88bf0) at /usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/support/fridgethr.c:550

#12 0x00007f31797f2df5 in start_thread (arg=0x7f31287c8700) at pthread_create.c:308

#13 0x00007f3178eb31ad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

 

(gdb) p q

$1 = (struct lru_q *) 0x0

(gdb) f 3

#3  0x00000000005430ea in mdcache_new_entry (export=0x7f30bc05d070, sub_handle=0x7f3078007c20, attrs_in=0x7f31287c6670, attrs_out=0x0, new_directory=false,

    entry=0x7f31287c67e8, state=0x0) at /usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:590

590             nentry = mdcache_alloc_handle(export, sub_handle, sub_handle->fs);

(gdb) p export->flags

$2 = 1 '\001'

(gdb) p entry->lru

$6 = {q = {next = 0x0, prev = 0x0}, qid = LRU_ENTRY_NONE, refcnt = 1, flags = 0, lane = 2, cf = 0}

 

mdc_check_mapping() in the following snippet returns an error because the MDC_UNEXPORT flag is set in export->flags.

After the export has been completely removed and the ganesha_mgr remove_export command has returned successfully, why is the flag still set?

The comment in the code also says “The current export is in process to be unexported”:

 

        /* Map the export before we put this entry into the LRU, but after it's
         * well enough set up to be able to be unrefed by unexport should there
         * be a race.
         */
        status = mdc_check_mapping(result);

        if (unlikely(FSAL_IS_ERROR(status))) {
                /* The current export is in process to be unexported, don't
                 * create new mdcache entries.
                 */
                LogDebug(COMPONENT_CACHE_INODE,
                         "Trying to allocate a new entry %p for export id %"
                         PRIi16" that is in the process of being unexported",
                         result, op_ctx->ctx_export->export_id);
                mdcache_put(result);
                mdcache_kill_entry(result);
                return NULL;
        }

 

The crash occurs because lru_queue_of() returns q = NULL when entry->lru.qid is LRU_ENTRY_NONE (as in the gdb output above), and LRU_DQ_SAFE then dereferences q.
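
To make the failure path concrete, here is a minimal, self-contained C model of it. This is not the Ganesha source: the types are simplified stand-ins whose names mirror the gdb output, and every model_* function is hypothetical. It only shows that a queue lookup on an entry that was never queued yields NULL, and what a defensive check before the dequeue could look like. Whether such a guard is the right fix upstream is an open question.

```c
#include <stddef.h>
#include <stdbool.h>

/* Simplified stand-ins for the mdcache LRU types. The names follow the gdb
 * output, but the layout is an assumption made for illustration only. */
enum lru_q_id {
	LRU_ENTRY_NONE = 0,	/* entry has not been placed on any queue */
	LRU_ENTRY_L1,
	LRU_ENTRY_L2,
	LRU_ENTRY_CLEANUP
};

struct lru_q {
	size_t size;		/* number of entries on this queue */
};

struct lru_lane {
	struct lru_q L1;
	struct lru_q L2;
	struct lru_q cleanup;
};

struct lru_entry {
	enum lru_q_id qid;	/* which queue the entry is on */
	unsigned int lane;	/* which lane the entry belongs to */
};

#define N_LANES 4
static struct lru_lane lanes[N_LANES];

/* Hypothetical model of the queue lookup: returns NULL when the entry
 * was never queued (qid == LRU_ENTRY_NONE). */
static struct lru_q *model_queue_of(const struct lru_entry *e)
{
	struct lru_lane *lane = &lanes[e->lane % N_LANES];

	switch (e->qid) {
	case LRU_ENTRY_L1:
		return &lane->L1;
	case LRU_ENTRY_L2:
		return &lane->L2;
	case LRU_ENTRY_CLEANUP:
		return &lane->cleanup;
	default:
		return NULL;
	}
}

/* Hypothetical helper: place an entry on the given queue of its lane. */
static void model_insert(struct lru_entry *e, enum lru_q_id qid)
{
	struct lru_q *q;

	e->qid = qid;
	q = model_queue_of(e);
	if (q != NULL)
		q->size++;
}

/* Hypothetical guarded cleanup push. Returns true if the entry was moved
 * to the cleanup queue, false if there was nothing to move. Without the
 * NULL check, q->size-- would dereference a NULL pointer for an entry
 * that is killed before it ever reaches the LRU. */
static bool model_cleanup_push(struct lru_entry *e)
{
	struct lru_q *q = model_queue_of(e);
	struct lru_lane *lane = &lanes[e->lane % N_LANES];

	if (q == NULL || e->qid == LRU_ENTRY_CLEANUP)
		return false;

	q->size--;
	lane->cleanup.size++;
	e->qid = LRU_ENTRY_CLEANUP;
	return true;
}
```

In this model, an entry initialised as { LRU_ENTRY_NONE, 2 } (matching qid and lane in the gdb output) makes model_queue_of() return NULL, and model_cleanup_push() returns false instead of crashing.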

 

Is there any deferred work during unexport in mdcache FSAL?

If we add a delay between remove_export and add_export, I do not see the problem, but that does not seem to be an elegant solution.

 

Please help me understand whether mdcache has any limitations with respect to back-to-back unexport and export operations.

 

Thanks,

Sandeep

 

 

 

 
