Thanks for your response, Soumya.
The client was an Ubuntu 16.04.2 VM.
We do not hit this consistently; it shows up randomly in some failure scenarios,
and only with NFSv4.0.
The scenarios were:
1. The SetClientId_Confirm op fails with a GSS error, and when the client retries
SetClientId_Confirm, Ganesha tries to delete the backchannel and hits this.
2. The second occurred randomly on the nfs_client_id_expire path.
One thing I wanted to clarify: is the commit below the complete fix for that panic,
or was there more to the fix?
>
https://github.com/nfs-ganesha/ntirpc/pull/155/commits/ca74cde10ef02a322b...
> svc_xprt_lookup - Add extra ref on create
>
> An xprt has a ref for the hash table (that's released by SVC_DESTROY());
> but when it's first created, only 1 ref was taken, so there wasn't a ref
> for the caller.
>
> Add an extra ref for the caller when the xprt is first created.
>
> Signed-off-by: Daniel Gryniewicz <dang(a)redhat.com>
> Branch: next (#155); tags: v3.2 … v1.8.0
> dang committed on 19 Oct 2018
> commit ca74cde10ef02a322b8944a6c8639b1318fa34dc
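To make sure I am reading the commit message correctly, here is a minimal sketch of the reference accounting it describes. It is a toy model, not ntirpc code: toy_xprt, toy_lookup_create, toy_release and run_scenario are hypothetical names, and the second toy_release call only stands in for what SVC_DESTROY() does to the hash-table ref.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for SVCXPRT; NOT the real ntirpc structure. */
struct toy_xprt {
    int refcnt;       /* plays the role of xp_refcnt */
    int times_freed;  /* how often the count reached zero */
};

/* Drop one reference; record a "free" when the count reaches zero. */
static void toy_release(struct toy_xprt *x)
{
    assert(x != NULL);
    if (--x->refcnt == 0)
        x->times_freed++;
}

/* Hypothetical lookup-or-create. The hash table always keeps one ref.
 * With extra_ref_for_caller set, the caller gets its own ref as well,
 * which is the behaviour the quoted commit message describes. */
static void toy_lookup_create(struct toy_xprt *x, bool extra_ref_for_caller)
{
    assert(x != NULL);
    x->refcnt = 1;          /* reference owned by the hash table */
    x->times_freed = 0;
    if (extra_ref_for_caller)
        x->refcnt++;        /* reference owned by the caller */
}

/* The caller drops its ref, then the hash-table ref is dropped.
 * Returns the final count and reports how often zero was reached. */
static int run_scenario(bool extra_ref_for_caller, int *times_freed)
{
    struct toy_xprt x;

    toy_lookup_create(&x, extra_ref_for_caller);
    toy_release(&x);        /* caller is done with the xprt */
    toy_release(&x);        /* stands in for SVC_DESTROY() */
    *times_freed = x.times_freed;
    return x.refcnt;
}
```

In this model, run_scenario(true, ...) ends with a count of 0 after reaching zero exactly once, while run_scenario(false, ...) ends at -1 because the second release lands on an object that was already released. That -1 is the same value as xp_refcnt in the cores, but the sketch only illustrates the accounting; it does not show that my crashes have the same cause.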
Regards,
Deepthi
On 15/05/20, 1:41 PM, "Soumya Koduri" <skoduri(a)redhat.com> wrote:
Hi Deepthi,
On 5/15/20 8:16 AM, Deepthi Shivaramu wrote:
> Soumya,
> I see there was a discussion on GitHub about the exact same segfault, and you were debugging this issue:
>
https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub....
>
> Multiple fixes were discussed there, but ultimately I see this fix was checked in:
>
https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub....
>
> But the strange part is that I already have that fix in my source and am still hitting this same segfault.
> Also, one correction to my previous mail: we are actually using libntirpc 1.7.0 with Ganesha 2.7.2.
>
> @Soumya, do you know of any other fix related to this problem?
Yes. This issue was fixed a while back and we haven't encountered it
again. Dan may have some insights on it.
Is this hit consistently? Which client is used?
Thanks,
Soumya
>
> Regards,
> Deepthi
>
> On 14/05/20, 5:09 PM, "Deepthi Shivaramu" <des(a)vmware.com> wrote:
>
> I see this segfault is in nfs_rpc_destroy_chan() and is not specific to setclientid_confirm.
> We are not seeing it with NFSv4.1, but we see it frequently in NFSv4.0 tests.
>
> I saw one more core today with bt:
>
> (gdb) bt
> #0 0x00007ff7b1dde71a in svc_release_it (xprt=0x7ff780001740, flags=0, tag=0x7ff7b1e05fd0 "clnt_vc_destroy", line=462)
>     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/libntirpc/ntirpc/rpc/svc.h:433
> #1 0x00007ff7b1ddf4fb in clnt_vc_destroy (clnt=0x7ff780001620)
>     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/libntirpc/src/clnt_vc.c:462
> #2 0x000000000043b4e1 in clnt_release_it (clnt=0x7ff780001620, flags=0, tag=0x55e550 <__func__.21824> "_nfs_rpc_destroy_chan", line=628)
>     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/libntirpc/ntirpc/rpc/clnt.h:319
> #3 0x000000000043b577 in clnt_destroy_it (clnt=0x7ff780001620, tag=0x55e550 <__func__.21824> "_nfs_rpc_destroy_chan", line=628)
>     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/libntirpc/ntirpc/rpc/clnt.h:341
> #4 0x000000000043eb97 in _nfs_rpc_destroy_chan (chan=0x7ff7940023a8)
>     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/MainNFSD/nfs_rpc_callback.c:628
> #5 0x000000000043f800 in nfs_rpc_destroy_chan (chan=0x7ff7940023a8)
>     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/MainNFSD/nfs_rpc_callback.c:864
> #6 0x00000000004bde35 in nfs_client_id_expire (clientid=0x7ff794002300, make_stale=false)
>     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/SAL/nfs4_clientid.c:1099
> #7 0x00000000004442bf in reap_hash_table (ht_reap=0xf35f40)
>     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/MainNFSD/nfs_reaper_thread.c:109
> #8 0x0000000000444a62 in reaper_run (ctx=0xf66ca0)
>     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/MainNFSD/nfs_reaper_thread.c:232
> #9 0x00000000004fdc38 in fridgethr_start_routine (arg=0xf66ca0)
>     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/support/fridgethr.c:550
> #10 0x00007ff7b09aa3d4 in start_thread (arg=0x7ff791ffb700) at pthread_create.c:334
> #11 0x00007ff7b02c9ebd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
> (gdb) f 0
> #0 0x00007ff7b1dde71a in svc_release_it (xprt=0x7ff780001740, flags=0, tag=0x7ff7b1e05fd0 "clnt_vc_destroy", line=462)
>     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/libntirpc/ntirpc/rpc/svc.h:433
> 433 in /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/libntirpc/ntirpc/rpc/svc.h
> (gdb) p xprt
> $1 = (SVCXPRT *) 0x7ff780001740
> (gdb) p *$
> $2 = {xp_ops = 0x0, xp_dispatch = {process_cb = 0x0, rendezvous_cb = 0x0}, xp_parent = 0x7ff770004730,
>   xp_tp = 0x6d00000001 <error: Cannot access memory at address 0x6d00000001>,
>   xp_netid = 0x7ff79c00a160 "", xp_p1 = 0x7ff770004750, xp_p2 = 0x0, xp_p3 = 0x0, xp_u1 = 0x3, xp_u2 = 0x0,
>   xp_local = {nb = {maxlen = 0, len = 0, buf = 0x7ff7940018a0}, ss = {ss_family = 0, __ss_align = 0, __ss_padding = '\000' <repeats 111 times>}},
>   xp_remote = {nb = {maxlen = 4280583506, len = 0, buf = 0x0}, ss = {ss_family = 34467, __ss_align = 1,
>       __ss_padding = "_:P\346ju\200\223\001\000\000\000\001\000\000\000`,\000\200\367\177\000\000\341\376\266^", '\000' <repeats 12 times>, "\061\000\000\000\000\000\000\000\000\061\000\200\367\177\000\000\220]\000\200\367\177\000\000c3-edbe-2fea12000\000\000\000\000\000\000\000\064\001", '\000' <repeats 21 times>}},
>   xp_lock = {__data = {__lock = -1946148624, __count = 32759, __owner = 0, __nusers = 37, __kind = -1946148624, __spins = 32759, __list = {__prev = 0x7ff77c001530, __next = 0x0}},
>     __size = "\360 \000\214\367\177\000\000\000\000\000\000%\000\000\000\360 \000\214\367\177\000\000\060\025\000|\367\177\000\000\000\000\000\000\000\000\000",
>     __align = 140701182468336}, xp_fd = 0, xp_ifindex = 0, xp_si_type = 3, xp_type = 0, xp_refcnt = -1, xp_flags = 64}
> (gdb) p xprt->xp_ops
> $3 = (struct xp_ops *) 0x0
> (gdb)
>
>
> Regards,
> Deepthi
>
> On 14/05/20, 12:17 PM, "Deepthi Shivaramu" <des(a)vmware.com> wrote:
>
> Daniel,
> I am seeing this segfault with libntirpc 1.8.0 and Ganesha 2.8.2 in the setclientid_confirm code path.
> Can you please check and let me know whether you have seen this issue before, and whether a fix is already available in the latest versions?
>
>
> (gdb) bt
> #0 0x0000000000000000 in ?? ()
> #1 0x00007fd66badf72e in svc_release_it (xprt=0x7fd658002e90, flags=0, tag=0x7fd66bb06fd0 "clnt_vc_destroy", line=462)
>     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/libntirpc/ntirpc/rpc/svc.h:433
> #2 0x00007fd66bae04fb in clnt_vc_destroy (clnt=0x7fd658002ba0)
>     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/libntirpc/src/clnt_vc.c:462
> #3 0x000000000043b4e1 in clnt_release_it (clnt=0x7fd658002ba0, flags=0, tag=0x55e550 <__func__.21824> "_nfs_rpc_destroy_chan", line=628)
>     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/libntirpc/ntirpc/rpc/clnt.h:319
> #4 0x000000000043b577 in clnt_destroy_it (clnt=0x7fd658002ba0, tag=0x55e550 <__func__.21824> "_nfs_rpc_destroy_chan", line=628)
>     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/libntirpc/ntirpc/rpc/clnt.h:341
> #5 0x000000000043eb97 in _nfs_rpc_destroy_chan (chan=0x7fd64c002648)
>     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/MainNFSD/nfs_rpc_callback.c:628
> #6 0x000000000043f800 in nfs_rpc_destroy_chan (chan=0x7fd64c002648)
>     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/MainNFSD/nfs_rpc_callback.c:864
> #7 0x000000000048011c in nfs4_op_setclientid_confirm (op=0x7fd62c001d90, data=0x7fd6607dff70, resp=0x7fd62c002070)
>     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/Protocols/NFS/nfs4_op_setclientid_confirm.c:382
> #8 0x000000000045b4b1 in nfs4_Compound (arg=0x7fd62c0011a8, req=0x7fd62c000aa0, res=0x7fd62c001f60)
>     at
> ....
> .......
> #20 0x00007fd669fcaebd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
> (gdb) f 1
> #1 0x00007fd66badf72e in svc_release_it (xprt=0x7fd658002e90, flags=0, tag=0x7fd66bb06fd0 "clnt_vc_destroy", line=462)
>     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/libntirpc/ntirpc/rpc/svc.h:433
> 433 /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/libntirpc/ntirpc/rpc/svc.h: No such file or directory.
> (gdb) p clnt
> No symbol "clnt" in current context.
> (gdb) p xprt
> $10 = (SVCXPRT *) 0x7fd658002e90
> (gdb) p *$
> $11 = {xp_ops = 0x7fd658000e20, xp_dispatch = {process_cb = 0x7fd658000078, rendezvous_cb = 0x7fd658000078}, xp_parent = 0x0, xp_tp = 0x0,
>   xp_netid = 0x0, xp_p1 = 0x0, xp_p2 = 0x0, xp_p3 = 0x0, xp_u1 = 0x0, xp_u2 = 0x0,
>   xp_local = {nb = {maxlen = 483619223, len = 1, buf = 0x2}, ss = {ss_family = 0, __ss_align = 0,
>       __ss_padding = "\313)\260k\326\177\000\000\020\320\236b\326\177\000\000\006\000\000\000\034\000\000\000\004\004\005\377\377\377\377\377\000\000\000\000\020\373\364\310\333c\335\363\245\332\362b\324.M\332", '\000' <repeats 59 times>}},
>   xp_remote = {nb = {maxlen = 0, len = 0, buf = 0x0}, ss = {ss_family = 0, __ss_align = 0, __ss_padding = '\000' <repeats 111 times>}},
>   xp_lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
>     __size = '\000' <repeats 39 times>, __align = 0}, xp_fd = 0,
>   xp_ifindex = 0, xp_si_type = 0, xp_type = 0, xp_refcnt = -1, xp_flags = 64}
> (gdb) f 6
> #6 0x000000000043f800 in nfs_rpc_destroy_chan (chan=0x7fd64c002648)
>     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/MainNFSD/nfs_rpc_callback.c:864
> 864 /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/MainNFSD/nfs_rpc_callback.c: No such file or directory.
> (gdb) p chan
> $12 = (rpc_call_channel_t *) 0x7fd64c002648
> (gdb) p *$
> $13 = {type = RPC_CHAN_V40, mtx = {__data = {__lock = 1, __count = 0, __owner = 163, __nusers = 1, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
>     __size = "\001\000\000\000\000\000\000\000\243\000\000\000\001", '\000' <repeats 26 times>, __align = 1}, states = 0,
>   source = {clientid = 0x7fd64c0025a0, session = 0x7fd64c0025a0}, last_called = 0, clnt = 0x7fd658002ba0,
>   auth = 0x0, gss_sec = {mech = 0x0, qop = 0, svc = RPCSEC_GSS_SVC_INTEGRITY, cred = 0x0, req_flags = 0}}
> (gdb) p chan->client
> There is no member named client.
> (gdb) p chan->clnt
> $14 = (CLIENT *) 0x7fd658002ba0
> (gdb) p *$
> $15 = {cl_ops = 0x7fd66bd192e0, cl_netid = 0x0, cl_tp = 0x0, cl_u1 = 0x0, cl_u2 = 0x0,
>   cl_lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 3, __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
>     __size = '\000' <repeats 16 times>, "\003", '\000' <repeats 22 times>, __align = 0},
>   cl_error = {ru = {RE_errno = 0, RE_why = AUTH_OK, RE_vers = {low = 0, high = 0}, RE_lb = {s1 = 0, s2 = 0}},
>     re_status = RPC_SUCCESS}, cl_refcnt = 0, cl_flags = 96}
> (gdb)
>
> On 06/05/20, 10:00 PM, "Daniel Gryniewicz" <dang(a)redhat.com> wrote:
>
> I'm happy to announce the latest stable versions of NTIRPC and Ganesha
> in the 2.8 series. These are NTIRPC 1.8.1 and Ganesha 2.8.4. There are
> more than 40 bug fixes in these releases.
>
> Daniel
> _______________________________________________
> Devel mailing list -- devel(a)lists.nfs-ganesha.org
> To unsubscribe send an email to devel-leave(a)lists.nfs-ganesha.org