Fwd: Ganesha crash in lock_avail
by Sachin Punadikar
---------- Forwarded message ---------
From: Sachin Punadikar <punadikar.sachin(a)gmail.com>
Date: Thu, Dec 6, 2018 at 7:52 PM
Subject: Ganesha crash in lock_avail
To: nfs-ganesha-devel <nfs-ganesha-devel(a)lists.sourceforge.net>
Hello,
A customer reported the crash below:
(gdb) where
#0 0x00007fa70c161fcb in raise () from /lib64/libpthread.so.0
#1 0x0000000000454884 in crash_handler (signo=11, info=0x7fa5a1ff9f30,
ctx=0x7fa5a1ff9e00)
at
/usr/src/debug/nfs-ganesha-2.5.3-ibm015.01-0.1.1-Source/MainNFSD/nfs_init.c:225
#2 <signal handler called>
#3 0x0000000000000000 in ?? ()
#4 0x0000000000435084 in lock_avail (vec=0x18f07c8, file=0x7fa420157fd8,
owner=0x7fa4f8189fc0,
lock_param=0x7fa420157ff0)
at
/usr/src/debug/nfs-ganesha-2.5.3-ibm015.01-0.1.1-Source/FSAL_UP/fsal_up_top.c:179
#5 0x00000000005386eb in mdc_up_lock_avail (vec=0x18f07c8,
file=0x7fa420157fd8, owner=0x7fa4f8189fc0,
lock_param=0x7fa420157ff0)
at
/usr/src/debug/nfs-ganesha-2.5.3-ibm015.01-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_up.c:380
#6 0x0000000000439c72 in queue_lock_avail (ctx=0x7fa40c039c40)
at
/usr/src/debug/nfs-ganesha-2.5.3-ibm015.01-0.1.1-Source/FSAL_UP/fsal_up_async.c:247
#7 0x000000000050a32c in fridgethr_start_routine (arg=0x7fa40c039c40)
at
/usr/src/debug/nfs-ganesha-2.5.3-ibm015.01-0.1.1-Source/support/fridgethr.c:550
#8 0x00007fa70c15adc5 in start_thread () from /lib64/libpthread.so.0
#9 0x00007fa70b81a1cd in clone () from /lib64/libc.so.6
Inspection showed that op_ctx was not set up properly.
(gdb) frame 4
#4 0x0000000000435084 in lock_avail (vec=0x18f07c8, file=0x7fa420157fd8,
owner=0x7fa4f8189fc0,
lock_param=0x7fa420157ff0)
at
/usr/src/debug/nfs-ganesha-2.5.3-ibm015.01-0.1.1-Source/FSAL_UP/fsal_up_top.c:179
179 obj->obj_ops.put_ref(obj);
(gdb) p *obj
$2 = {handles = {next = 0x0, prev = 0x0}, fs = 0x193e240, fsal = 0x0,
obj_ops = {get_ref = 0x0,
put_ref = 0x0, release = 0x0, merge = 0x0, lookup = 0x0, readdir = 0x0,
compute_readdir_cookie = 0x0, dirent_cmp = 0x0, create = 0x0, mkdir =
0x0, mknode = 0x0,
symlink = 0x0, readlink = 0x0, test_access = 0x0, getattrs = 0x0,
setattrs = 0x0, link = 0x0,
fs_locations = 0x0, rename = 0x0, unlink = 0x0, open = 0x0, reopen =
0x0, status = 0x0,
read = 0x0, read_plus = 0x0, write = 0x0, write_plus = 0x0, seek = 0x0,
io_advise = 0x0,
commit = 0x0, lock_op = 0x0, share_op = 0x0, close = 0x0,
list_ext_attrs = 0x0,
getextattr_id_by_name = 0x0, getextattr_value_by_name = 0x0,
getextattr_value_by_id = 0x0,
setextattr_value = 0x0, setextattr_value_by_id = 0x0,
remove_extattr_by_id = 0x0,
remove_extattr_by_name = 0x0, handle_is = 0x0, handle_to_wire = 0x0,
handle_to_key = 0x0,
handle_cmp = 0x0, layoutget = 0x0, layoutreturn = 0x0, layoutcommit =
0x0, getxattrs = 0x0,
setxattrs = 0x0, removexattrs = 0x0, listxattrs = 0x0, open2 = 0x0,
check_verifier = 0x0,
status2 = 0x0, reopen2 = 0x0, read2 = 0x0, write2 = 0x0, seek2 = 0x0,
io_advise2 = 0x0,
commit2 = 0x0, lock_op2 = 0x0, setattr2 = 0x0, close2 = 0x0}, obj_lock
= {__data = {
__lock = 0, __nr_readers = 0, __readers_wakeup = 0, __writer_wakeup =
0,
__nr_readers_queued = 0, __nr_writers_queued = 0, __writer = 0,
__shared = 0, __pad1 = 0,
__pad2 = 0, __flags = 0}, __size = '\000' <repeats 55 times>, __align
= 0},
type = REGULAR_FILE, fsid = {major = 11073324921844891658, minor = 1},
fileid = 229392385,
state_hdl = 0x7fa51006aea0}
(gdb) frame 5
#5 0x00000000005386eb in mdc_up_lock_avail (vec=0x18f07c8,
file=0x7fa420157fd8,
owner=0x7fa4f8189fc0, lock_param=0x7fa420157ff0)
at
/usr/src/debug/nfs-ganesha-2.5.3-ibm015.01-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_up.c:380
380 rc = myself->super_up_ops.lock_avail(vec, file, owner,
(gdb) p op_ctx
$3 = (struct req_op_context *) 0x7fa5a1ffa430
(gdb) p *op_ctx
$4 = {creds = 0x0, original_creds = {caller_uid = 0, caller_gid = 0,
caller_glen = 0,
caller_garray = 0x0}, caller_gdata = 0x0, caller_garray_copy = 0x0,
managed_garray_copy = 0x0,
cred_flags = 0, caller_addr = 0x0, clientid = 0x0, nfs_vers = 0,
nfs_minorvers = 0,
req_type = 0, client = 0x0, ctx_export = 0x18efc78, fsal_export =
0x18f0680, export_perms = 0x0,
start_time = 0, queue_wait = 0, fsal_private = 0x0, fsal_module = 0x0,
fsal_pnfs_ds = 0x0}
(gdb)
The output above shows that op_ctx is not set properly: "fsal_module" is
NULL.
I have posted a patch to fix this issue:
https://review.gerrithub.io/#/c/436356/
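The failure mode described above, an upcall thread running with an incompletely populated per-thread context, can be illustrated with a minimal sketch. Every name here (op_ctx_sketch, upcall_body, upcall_with_root_ctx, and the struct types) is a hypothetical stand-in; this is not Ganesha's code and not the posted patch, only the general pattern of installing a populated context before the call and restoring the previous one afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the server's real types. */
struct fsal_module_sketch {
    const char *name;
};

struct op_context_sketch {
    struct fsal_module_sketch *fsal_module;
};

/* Per-thread operation context; NULL until a caller installs one. */
static _Thread_local struct op_context_sketch *op_ctx_sketch;

/*
 * Stand-in for an upcall body that relies on the context.
 * Returns 0 on success, -1 when the context is unusable. A real
 * server might instead dereference a NULL pointer at this point.
 */
static int upcall_body(void)
{
    if (op_ctx_sketch == NULL || op_ctx_sketch->fsal_module == NULL)
        return -1;
    return 0;
}

/* Wrapper: install a populated context, run the body, restore. */
static int upcall_with_root_ctx(struct fsal_module_sketch *module)
{
    struct op_context_sketch root = { .fsal_module = module };
    struct op_context_sketch *saved = op_ctx_sketch;
    int rc;

    op_ctx_sketch = &root;
    rc = upcall_body();
    op_ctx_sketch = saved;  /* do not leak a pointer to the stack */
    return rc;
}
```

Calling upcall_body() directly on a fresh thread reports the unusable context, while upcall_with_root_ctx() succeeds and leaves the thread's context pointer as it found it.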
--
with regards,
Sachin Punadikar
6 years
Change in ffilz/nfs-ganesha[next]: Set op_ctx for lock_avail and lock_grant
by Sachin Punadikar (GerritHub)
Sachin Punadikar has uploaded this change for review. ( https://review.gerrithub.io/436356 )
Change subject: Set op_ctx for lock_avail and lock_grant
......................................................................
Set op_ctx for lock_avail and lock_grant
Ganesha crashes in the functions lock_avail and lock_grant because a
proper op_ctx is not available. The parent functions
mdc_up_lock_grant and mdc_up_lock_avail do not set op_ctx
properly, which leads to the crash. Invoke "init_root_op_context"
to set it properly.
Change-Id: Iaffa2df9f0ba8c9b6d535dddafe54c6ccc701afe
Signed-off-by: Sachin Punadikar <psachin(a)in.ibm.com>
---
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_up.c
1 file changed, 11 insertions(+), 12 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/56/436356/1
--
To view, visit https://review.gerrithub.io/436356
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iaffa2df9f0ba8c9b6d535dddafe54c6ccc701afe
Gerrit-Change-Number: 436356
Gerrit-PatchSet: 1
Gerrit-Owner: Sachin Punadikar <psachin(a)in.ibm.com>
6 years
Depleting fds security issue
by William Allen Simpson
This was reported on ntirpc at GitHub. After DanG's fix for
transport reference counting, Gaurav Gangalwar wrote that s/he
can deplete the available fds with:
for i in {1..1000}; do echo $i; mount -t nfs -o vers=3,tcp :/export /mnt; umount /mnt; done
There is a configurable parameter limiting the number of
concurrent fds. They are cleaned up after a fairly long
inactivity timeout. So an adversary (or runaway client) can
quickly deplete the fds, preventing valid clients from mounting.
The most obvious patch would be for umount to add the same
SVC_DESTROY line that DanG (and previously I) had added for the
TCP well-known ports in dispatch.
But that would allow runaway clients to flood the server.
Preventing that seems to be the (undocumented) purpose of the
configurable parameter.
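The trade-off above can be made concrete with a small simulation. This is a sketch under stated assumptions, not ntirpc's implementation: MAX_CONN, IDLE_TIMEOUT, and all function names are hypothetical, and time is modelled as an integer number of seconds. It contrasts freeing a slot only after the idle timeout with freeing it as soon as the client disconnects.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CONN 8        /* hypothetical cap on concurrent connections */
#define IDLE_TIMEOUT 300  /* hypothetical idle timeout, in seconds */

struct conn_slot {
    bool in_use;
    long last_active;     /* time of last activity, in seconds */
};

static struct conn_slot slots[MAX_CONN];

/* Free every slot that has been idle for at least IDLE_TIMEOUT. */
static void reap_idle(long now)
{
    for (int i = 0; i < MAX_CONN; i++)
        if (slots[i].in_use && now - slots[i].last_active >= IDLE_TIMEOUT)
            slots[i].in_use = false;
}

/* Accept a connection; returns a slot index, or -1 at the cap. */
static int conn_accept(long now)
{
    reap_idle(now);
    for (int i = 0; i < MAX_CONN; i++) {
        if (!slots[i].in_use) {
            slots[i].in_use = true;
            slots[i].last_active = now;
            return i;
        }
    }
    return -1;
}

/* Client disconnects; the slot is freed only if destroy_on_close. */
static void conn_close(int idx, bool destroy_on_close)
{
    if (destroy_on_close)
        slots[idx].in_use = false;
}

/* Run connect/disconnect cycles one second apart; count refusals. */
static int run_cycles(int cycles, bool destroy_on_close)
{
    int refused = 0;

    for (int i = 0; i < MAX_CONN; i++)
        slots[i].in_use = false;

    for (int t = 0; t < cycles; t++) {
        int idx = conn_accept(t);

        if (idx < 0)
            refused++;
        else
            conn_close(idx, destroy_on_close);
    }
    return refused;
}
```

With timeout-only cleanup, a rapid loop fills the table after MAX_CONN cycles and every later attempt is refused until the timeout expires. With destroy-on-close, nothing is ever refused, which also means the cap no longer throttles a runaway client.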
The parameter and cleanup code are relatively old; they were added to
ntirpc circa 2013. Was that a bad idea?
Which security issue is more important?
6 years
Community Conference Calls in 2019
by Frank Filz
Our kids will be starting at a new school in January with an 8:30 AM drop
off instead of a 9:00 AM drop off, which means I need to end the call by
8:00 AM Pacific Time.
One option, in recognition that lately we rarely go beyond 30 minutes on the
call anyway, is to cut the call to 30 minutes instead of 60 minutes.
Another option is to start 30 minutes earlier (7:00 AM Pacific Time). This
option conflicts with another meeting several of us are in, but that meeting
might be able to be rescheduled.
My thought is to try the 30-minute meeting, and if that just isn't enough, we
can look at changing the start time (or picking a different day).
Frank
6 years, 1 month
test message
by Frank Filz
Test message, maybe this one will get held..
Frank
6 years, 1 month
making liburcu and lttng coexist in a LGPL'ed program
by Jeff Layton
The nfs-ganesha project has used lttng for quite some time to handle
tracing. Recently though, we decided to start building liburcu in as a
mandatory component, with an eye toward using it in certain areas.
Before this change, the code linked in liburcu-bp directly, but now we
just use liburcu. Unfortunately, when we enable tracepoints in the build
now, we get errors like this at link time:
--------------------8<----------------
[ 96%] Linking C executable ganesha.nfsd
/usr/bin/ld: libMainServices.a(nfs_worker_thread.c.o): undefined reference to symbol 'rcu_gp_bp'
/usr/bin/ld: //usr/lib64/liburcu-bp.so.6: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
make[2]: *** [MainNFSD/CMakeFiles/ganesha.nfsd.dir/build.make:308: MainNFSD/ganesha.nfsd] Error 1
make[1]: *** [CMakeFiles/Makefile2:2740: MainNFSD/CMakeFiles/ganesha.nfsd.dir/all] Error 2
make: *** [Makefile:152: all] Error 2
--------------------8<----------------
nfs-ganesha defines _LGPL_SOURCE, and that makes lttng use the
redefinitions in tracepoint-rcu.h. If I disable _LGPL_SOURCE, it all
builds as expected.
I found a similar bug here:
https://bugs.lttng.org/issues/1156
Any thoughts on the right fix for this? We'd like to eat our cake and
have it too, so that we can have _LGPL_SOURCE defined, lttng enabled,
and the urcu flavor be determined at runtime.
Many thanks,
--
Jeff Layton <jlayton(a)redhat.com>
6 years, 1 month
Ganesha crash in dec_nlm_state_ref
by Suhrud Patankar
Hello All,
Ganesha: V2.5.0.4
Windows client: Windows 10 Pro
I am testing NFSv3 access from a Windows client. Sometimes Ganesha crashes in
dec_nlm_state_ref.
My understanding of the issue is:
The state is maintained in multiple lists, but a single ref is taken for all entries.
The state mutex is not held while adding it to or removing it from the different lists.
This can cause an incorrect ref count on the state object.
Attached patch appears to fix this. However I still need to test more.
Any comments on the issue and patch will be highly appreciated.
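For readers following the analysis, one common way to avoid this class of bug is to take one reference per list membership and to change membership and count together under the object's mutex. The sketch below illustrates that pattern only; it is not the attached patch, and state_sketch, state_list_add, and state_list_del are hypothetical names rather than Ganesha functions.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define N_LISTS 2  /* hypothetical number of lists holding the state */

struct state_sketch {
    pthread_mutex_t mtx;
    int refcount;
    bool on_list[N_LISTS];
};

static void state_init(struct state_sketch *s)
{
    pthread_mutex_init(&s->mtx, NULL);
    s->refcount = 1;  /* the creator's reference */
    for (int i = 0; i < N_LISTS; i++)
        s->on_list[i] = false;
}

/* Link into list `which`, taking one reference for that membership. */
static void state_list_add(struct state_sketch *s, int which)
{
    pthread_mutex_lock(&s->mtx);
    if (!s->on_list[which]) {
        s->on_list[which] = true;
        s->refcount++;
    }
    pthread_mutex_unlock(&s->mtx);
}

/*
 * Unlink from list `which`, dropping that membership's reference.
 * A repeated removal is a no-op. Returns the remaining count.
 */
static int state_list_del(struct state_sketch *s, int which)
{
    int remaining;

    pthread_mutex_lock(&s->mtx);
    if (s->on_list[which]) {
        s->on_list[which] = false;
        assert(s->refcount > 0);
        s->refcount--;
    }
    remaining = s->refcount;
    pthread_mutex_unlock(&s->mtx);
    return remaining;
}
```

Because membership and count change in the same critical section, two threads racing to remove the state from the same list cannot both drop a reference.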
Thanks & Regards,
Suhrud
6 years, 1 month