---------- Forwarded message ---------
From: Sachin Punadikar <punadikar.sachin@gmail.com>
Date: Thu, Dec 6, 2018 at 7:52 PM
Subject: Ganesha crash in lock_avail
To: nfs-ganesha-devel <nfs-ganesha-devel@lists.sourceforge.net>


Hello,
A customer reported the crash below:
(gdb) where
#0  0x00007fa70c161fcb in raise () from /lib64/libpthread.so.0
#1  0x0000000000454884 in crash_handler (signo=11, info=0x7fa5a1ff9f30, ctx=0x7fa5a1ff9e00)
    at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.01-0.1.1-Source/MainNFSD/nfs_init.c:225
#2  <signal handler called>
#3  0x0000000000000000 in ?? ()
#4  0x0000000000435084 in lock_avail (vec=0x18f07c8, file=0x7fa420157fd8, owner=0x7fa4f8189fc0,
    lock_param=0x7fa420157ff0)
    at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.01-0.1.1-Source/FSAL_UP/fsal_up_top.c:179
#5  0x00000000005386eb in mdc_up_lock_avail (vec=0x18f07c8, file=0x7fa420157fd8, owner=0x7fa4f8189fc0,
    lock_param=0x7fa420157ff0)
    at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.01-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_up.c:380
#6  0x0000000000439c72 in queue_lock_avail (ctx=0x7fa40c039c40)
    at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.01-0.1.1-Source/FSAL_UP/fsal_up_async.c:247
#7  0x000000000050a32c in fridgethr_start_routine (arg=0x7fa40c039c40)
    at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.01-0.1.1-Source/support/fridgethr.c:550
#8  0x00007fa70c15adc5 in start_thread () from /lib64/libpthread.so.0
#9  0x00007fa70b81a1cd in clone () from /lib64/libc.so.6

Analysis of the core showed that op_ctx was not set up properly.
(gdb) frame 4
#4  0x0000000000435084 in lock_avail (vec=0x18f07c8, file=0x7fa420157fd8, owner=0x7fa4f8189fc0,
    lock_param=0x7fa420157ff0)
    at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.01-0.1.1-Source/FSAL_UP/fsal_up_top.c:179
179        obj->obj_ops.put_ref(obj);
(gdb) p *obj
$2 = {handles = {next = 0x0, prev = 0x0}, fs = 0x193e240, fsal = 0x0, obj_ops = {get_ref = 0x0,
    put_ref = 0x0, release = 0x0, merge = 0x0, lookup = 0x0, readdir = 0x0,
    compute_readdir_cookie = 0x0, dirent_cmp = 0x0, create = 0x0, mkdir = 0x0, mknode = 0x0,
    symlink = 0x0, readlink = 0x0, test_access = 0x0, getattrs = 0x0, setattrs = 0x0, link = 0x0,
    fs_locations = 0x0, rename = 0x0, unlink = 0x0, open = 0x0, reopen = 0x0, status = 0x0,
    read = 0x0, read_plus = 0x0, write = 0x0, write_plus = 0x0, seek = 0x0, io_advise = 0x0,
    commit = 0x0, lock_op = 0x0, share_op = 0x0, close = 0x0, list_ext_attrs = 0x0,
    getextattr_id_by_name = 0x0, getextattr_value_by_name = 0x0, getextattr_value_by_id = 0x0,
    setextattr_value = 0x0, setextattr_value_by_id = 0x0, remove_extattr_by_id = 0x0,
    remove_extattr_by_name = 0x0, handle_is = 0x0, handle_to_wire = 0x0, handle_to_key = 0x0,
    handle_cmp = 0x0, layoutget = 0x0, layoutreturn = 0x0, layoutcommit = 0x0, getxattrs = 0x0,
    setxattrs = 0x0, removexattrs = 0x0, listxattrs = 0x0, open2 = 0x0, check_verifier = 0x0,
    status2 = 0x0, reopen2 = 0x0, read2 = 0x0, write2 = 0x0, seek2 = 0x0, io_advise2 = 0x0,
    commit2 = 0x0, lock_op2 = 0x0, setattr2 = 0x0, close2 = 0x0}, obj_lock = {__data = {
      __lock = 0, __nr_readers = 0, __readers_wakeup = 0, __writer_wakeup = 0,
      __nr_readers_queued = 0, __nr_writers_queued = 0, __writer = 0, __shared = 0, __pad1 = 0,
      __pad2 = 0, __flags = 0}, __size = '\000' <repeats 55 times>, __align = 0},
  type = REGULAR_FILE, fsid = {major = 11073324921844891658, minor = 1}, fileid = 229392385,
  state_hdl = 0x7fa51006aea0}
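
In the dump above, put_ref is 0x0, which matches frame #3 sitting at address 0x0: the call at fsal_up_top.c:179 jumps through a NULL function pointer. The following is a minimal sketch of that failure mode and of a defensive check, not ganesha code. All names in it (obj_sketch, obj_ops_sketch, sketch_put_ref, safe_put_ref) are hypothetical stand-ins, and the real structures are much larger.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, heavily simplified stand-in for an ops table. */
struct obj_ops_sketch {
	void (*put_ref)(void *obj);
};

/* Hypothetical stand-in for an object handle carrying its ops table. */
struct obj_sketch {
	struct obj_ops_sketch obj_ops;
	int refcount;
};

/* A normal put_ref implementation: drop one reference. */
static void sketch_put_ref(void *obj)
{
	((struct obj_sketch *)obj)->refcount--;
}

/*
 * Calling obj->obj_ops.put_ref(obj) while put_ref is NULL transfers
 * control to address 0 and raises SIGSEGV. This wrapper refuses to
 * make the call in that case.
 * Returns 0 if put_ref was invoked, -1 if the handle or pointer was NULL.
 */
static int safe_put_ref(struct obj_sketch *obj)
{
	if (obj == NULL || obj->obj_ops.put_ref == NULL) {
		fprintf(stderr, "put_ref is NULL; ops table not initialised\n");
		return -1;
	}
	obj->obj_ops.put_ref(obj);
	return 0;
}
```

A check like this only turns the crash into an error return. It does not explain why the ops table was never filled in, which is the part the op_ctx analysis addresses.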

(gdb) frame 5
#5  0x00000000005386eb in mdc_up_lock_avail (vec=0x18f07c8, file=0x7fa420157fd8,
    owner=0x7fa4f8189fc0, lock_param=0x7fa420157ff0)
    at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.01-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_up.c:380
380        rc = myself->super_up_ops.lock_avail(vec, file, owner,
(gdb) p op_ctx
$3 = (struct req_op_context *) 0x7fa5a1ffa430
(gdb) p *op_ctx
$4 = {creds = 0x0, original_creds = {caller_uid = 0, caller_gid = 0, caller_glen = 0,
    caller_garray = 0x0}, caller_gdata = 0x0, caller_garray_copy = 0x0, managed_garray_copy = 0x0,
  cred_flags = 0, caller_addr = 0x0, clientid = 0x0, nfs_vers = 0, nfs_minorvers = 0,
  req_type = 0, client = 0x0, ctx_export = 0x18efc78, fsal_export = 0x18f0680, export_perms = 0x0,
  start_time = 0, queue_wait = 0, fsal_private = 0x0, fsal_module = 0x0, fsal_pnfs_ds = 0x0}
(gdb)

The output above shows that op_ctx is not set up properly: "fsal_module" is NULL.
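
To illustrate the general idea of giving a worker thread a fully populated per-thread context before it runs an upcall, here is a minimal sketch. It is not the posted patch and does not use ganesha's real types: ctx_sketch, export_sketch, module_sketch, op_ctx_sketch, run_with_ctx and check_ctx are all hypothetical names, and the assumption that the module can be derived from the export is made only for this example.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the module, export and request context. */
struct module_sketch {
	const char *name;
};

struct export_sketch {
	struct module_sketch *fsal;	/* module that owns this export */
};

struct ctx_sketch {
	struct export_sketch *fsal_export;
	struct module_sketch *fsal_module;
};

/* Per-thread context pointer, analogous in spirit to op_ctx. */
static _Thread_local struct ctx_sketch *op_ctx_sketch;

typedef int (*upcall_fn)(void *arg);

/*
 * Populate a context on the stack, publish it for the duration of the
 * call, then restore whatever was there before.
 * Returns -1 without calling fn if the export or its module is missing.
 */
static int run_with_ctx(struct export_sketch *exp, upcall_fn fn, void *arg)
{
	struct ctx_sketch ctx = { 0 };
	struct ctx_sketch *saved = op_ctx_sketch;
	int rc;

	if (exp == NULL || exp->fsal == NULL)
		return -1;

	ctx.fsal_export = exp;
	ctx.fsal_module = exp->fsal;	/* the field that was NULL in the core */
	op_ctx_sketch = &ctx;
	rc = fn(arg);
	op_ctx_sketch = saved;
	return rc;
}

/* Example callback: succeeds only if the context is fully populated. */
static int check_ctx(void *arg)
{
	(void)arg;
	if (op_ctx_sketch == NULL || op_ctx_sketch->fsal_module == NULL)
		return -1;
	return 0;
}
```

Saving and restoring the previous pointer keeps the helper safe to nest, and the early return means a half-initialised export never reaches the callback.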

I have posted a patch to fix this issue:
https://review.gerrithub.io/#/c/436356/
--
with regards,
Sachin Punadikar

