Neither of those changes should cause leaks, as far as I'm aware.
However, 1.7.4 has several fixes for leaks in it, so leaks may just be
present in 1.7.2.
Daniel
On 6/4/20 2:33 PM, Deepthi Shivaramu wrote:
Daniel,
I took the below 2 changes to libntirpc on top of our existing libntirpc 1.7.0 and
ganesha 2.7.2, as per the suggestion in this thread.
But now we are seeing memory leaks in our system testing environment when a large number of
shares are being unmounted.
Ganesha seems to be consuming more than 5G of memory as per top output.
We have yet to analyse the leak, but do we know if these changes in clnt_vc.c in libntirpc
can cause leaks in the connection close path?
commit 2d13724606d6391c2cc485d2dbd0555cc6c1bcae
VC - RELEASE after DESTROY
Many error paths call DESTROY, which will unlink and drop the ref. This
means that the final RELEASE will free, causing the DESTROY to
use-after-free. Instead, make sure we DESTROY first.
Signed-off-by: Daniel Gryniewicz <dang(a)redhat.com>
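The ordering described in this commit message can be illustrated with a minimal, single-threaded sketch. It is not the libntirpc code: `struct obj`, `obj_create`, `obj_destroy`, `obj_release` and `teardown_demo` are hypothetical names standing in for the transport and its DESTROY/RELEASE operations, and the real library's locking and flags are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a transport object; not the real SVCXPRT. */
struct obj {
    int refcnt;   /* number of outstanding references */
    bool linked;  /* true while the lookup table holds its reference */
};

static int freed_count; /* counts frees so the demo can check behaviour */

static struct obj *obj_create(void)
{
    struct obj *o = calloc(1, sizeof(*o));

    if (o == NULL)
        return NULL;
    o->refcnt = 2;    /* one ref for the table, one for the caller */
    o->linked = true;
    return o;
}

/* RELEASE: drop one reference; free when the last one goes away. */
static void obj_release(struct obj *o)
{
    assert(o->refcnt > 0);
    if (--o->refcnt == 0) {
        freed_count++;
        free(o);
    }
}

/* DESTROY: unlink from the table and drop the table's reference. */
static void obj_destroy(struct obj *o)
{
    if (!o->linked)
        return;       /* already unlinked, nothing to drop */
    o->linked = false;
    obj_release(o);
}

/* Error-path teardown: DESTROY while still holding our own ref, then RELEASE. */
static int teardown_demo(void)
{
    struct obj *o = obj_create();

    if (o == NULL)
        return -1;
    obj_destroy(o);   /* safe: the caller's ref keeps o alive */
    obj_release(o);   /* last ref dropped, o is freed here */
    return freed_count;
}
```

If the two calls in `teardown_demo` were swapped after an earlier DESTROY on some error path, the RELEASE would free the object and the following DESTROY would touch freed memory, which is the use-after-free the commit message describes.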
commit c1b95f7519cb3ecbeccdeb69f9d5f534c58383d0
Don't attempt to destroy XPRT if CLNT create was unsuccessful
Currently in clnt_vc_destroy() we call SVC_DESTROY for a XPRT,
but if CLNT (client handle) creation failed then the related
'cx->cx_rec' won't be valid and this will lead to a crash.
Fixed this by calling SVC_DESTROY only when 'cx->cx_rec' is valid.
Signed-off-by: Madhu Thorat <madhu.punjabi(a)in.ibm.com>
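The guard described in this commit message might look roughly like the sketch below. Only the field name `cx_rec` comes from the commit message; `struct rec`, `struct client_ctx` and `client_ctx_destroy` are hypothetical, and the assignment to `destroyed` merely stands in for the real SVC_DESTROY call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record type; 'destroyed' stands in for real teardown work. */
struct rec {
    int destroyed;
};

/* Hypothetical client context; only the cx_rec name is from the commit. */
struct client_ctx {
    struct rec *cx_rec; /* NULL when client handle creation failed */
};

/* Tear down the record only if creation got far enough to set it.
 * Returns 1 if a record was destroyed, 0 if there was nothing to do. */
static int client_ctx_destroy(struct client_ctx *cx)
{
    if (cx->cx_rec == NULL)
        return 0;              /* creation failed: skip the destroy */
    cx->cx_rec->destroyed = 1; /* placeholder for the real destroy call */
    cx->cx_rec = NULL;         /* make a second call harmless */
    return 1;
}
```

A guard like this only skips work that was never set up, so on its own it would not be expected to leave anything allocated.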
Regards,
Deepthi
On 17/05/20, 5:53 PM, "Deepthi Shivaramu" <des(a)vmware.com> wrote:
Thanks Daniel, I will try this.
Regards,
Deepthi
On 15/05/20, 6:55 PM, "Daniel Gryniewicz" <dang(a)redhat.com> wrote:
Try this one:
https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub....
Daniel
On 5/15/20 6:55 AM, Deepthi Shivaramu wrote:
> Thanks for your response Soumya.
>
> The client used was an Ubuntu 16.04.2 VM.
> This is not seen consistently, but we are hitting it randomly in some
> failure scenarios for NFSv4.0 alone.
>
> The scenarios were:
> 1. The SetClientId_Confirm op fails with some GSS error, and when the client retries
> the SetClientId_Confirm op, it tries deleting the backchannel and hits this.
> 2. The second one was hit randomly on the nfs_client_id_expire path.
>
> One thing I wanted to clarify: was this the whole fix for that panic, or
> was there more to the fix?
>
>
> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub....
>> svc_xprt_lookup - Add extra ref on create
>>
>> An xprt has a ref for the hash table (that's released by SVC_DESTROY());
>> but when it's first created, only 1 ref was taken, so there wasn't a ref
>> for the caller.
>>
>> Add an extra ref for the caller when the xprt is first created.
>>
>> Signed-off-by: Daniel Gryniewicz <dang(a)redhat.com>
>> next (#155) v3.2 … v1.8.0
>> dang committed on 19 Oct 2018
>> commit ca74cde10ef02a322b8944a6c8639b1318fa34dc
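The reference accounting this commit message describes can be sketched as follows. This is a simplified, single-threaded illustration and not the real svc_xprt_lookup: `struct xp`, `xp_lookup`, `xp_release`, `xp_destroy` and the fixed-size `table` are all hypothetical, and the real code's hashing and locking are left out.

```c
#include <assert.h>
#include <stdlib.h>

#define TABLE_SIZE 8

/* Hypothetical transport entry, indexed directly by fd for simplicity. */
struct xp {
    int fd;
    int refcnt;
};

static struct xp *table[TABLE_SIZE];

/* Look up (or create) the entry for fd. The caller always gets its own
 * reference; on creation the table's reference is taken as well. */
static struct xp *xp_lookup(int fd)
{
    struct xp *x;

    if (fd < 0 || fd >= TABLE_SIZE)
        return NULL;
    x = table[fd];
    if (x == NULL) {
        x = calloc(1, sizeof(*x));
        if (x == NULL)
            return NULL;
        x->fd = fd;
        x->refcnt = 1; /* reference owned by the table */
        table[fd] = x;
    }
    x->refcnt++;       /* reference owned by the caller */
    return x;
}

/* Drop one reference; free on the last one. */
static void xp_release(struct xp *x)
{
    assert(x->refcnt > 0);
    if (--x->refcnt == 0)
        free(x);
}

/* Unlink from the table and drop the table's reference. */
static void xp_destroy(struct xp *x)
{
    if (table[x->fd] == x) {
        table[x->fd] = NULL;
        xp_release(x);
    }
}
```

With only one reference taken at creation, the destroy would free the entry while the caller still held a pointer to it; a count that has gone one release too far would be consistent with the `xp_refcnt = -1` seen in the gdb output in this thread.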
>
> Regards,
> Deepthi
>
> On 15/05/20, 1:41 PM, "Soumya Koduri" <skoduri(a)redhat.com> wrote:
>
> Hi Deepthi,
>
>
> On 5/15/20 8:16 AM, Deepthi Shivaramu wrote:
> > Soumya,
> > I see there was a discussion on GitHub about the exact same segfault, and you were debugging this issue:
> > https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub....
> >
> > There were multiple fixes discussed in there, but ultimately I see this fix was checked in:
> > https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub....
> >
> > But the strange part is that I already have that fix in my source and am still hitting this same segfault.
> > Also, one correction to my previous mail: we are actually using libntirpc 1.7.0 with ganesha 2.7.2.
> >
> > @Soumya, do you know of any other fix related to this problem?
>
> Yes. This issue was fixed a while back and we hadn't encountered it
> again. Probably Dan may have some insights on it.
>
> Is this consistently hit? What is the client used?
>
> Thanks,
> Soumya
>
> >
> > Regards,
> > Deepthi
> >
> > On 14/05/20, 5:09 PM, "Deepthi Shivaramu" <des(a)vmware.com> wrote:
> >
> > I see this segfault is in nfs_rpc_destroy_chan() and not specific to setclientid_confirm.
> > We are not seeing it with NFSv4.1 but seeing it frequently with NFSv4.0 tests.
> >
> > I saw one more core today with bt:
> >
> > (gdb) bt
> > #0  0x00007ff7b1dde71a in svc_release_it (xprt=0x7ff780001740, flags=0, tag=0x7ff7b1e05fd0 "clnt_vc_destroy", line=462)
> >     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/libntirpc/ntirpc/rpc/svc.h:433
> > #1  0x00007ff7b1ddf4fb in clnt_vc_destroy (clnt=0x7ff780001620)
> >     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/libntirpc/src/clnt_vc.c:462
> > #2  0x000000000043b4e1 in clnt_release_it (clnt=0x7ff780001620, flags=0, tag=0x55e550 <__func__.21824> "_nfs_rpc_destroy_chan", line=628)
> >     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/libntirpc/ntirpc/rpc/clnt.h:319
> > #3  0x000000000043b577 in clnt_destroy_it (clnt=0x7ff780001620, tag=0x55e550 <__func__.21824> "_nfs_rpc_destroy_chan", line=628)
> >     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/libntirpc/ntirpc/rpc/clnt.h:341
> > #4  0x000000000043eb97 in _nfs_rpc_destroy_chan (chan=0x7ff7940023a8)
> >     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/MainNFSD/nfs_rpc_callback.c:628
> > #5  0x000000000043f800 in nfs_rpc_destroy_chan (chan=0x7ff7940023a8)
> >     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/MainNFSD/nfs_rpc_callback.c:864
> > #6  0x00000000004bde35 in nfs_client_id_expire (clientid=0x7ff794002300, make_stale=false)
> >     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/SAL/nfs4_clientid.c:1099
> > #7  0x00000000004442bf in reap_hash_table (ht_reap=0xf35f40)
> >     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/MainNFSD/nfs_reaper_thread.c:109
> > #8  0x0000000000444a62 in reaper_run (ctx=0xf66ca0)
> >     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/MainNFSD/nfs_reaper_thread.c:232
> > #9  0x00000000004fdc38 in fridgethr_start_routine (arg=0xf66ca0)
> >     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/support/fridgethr.c:550
> > #10 0x00007ff7b09aa3d4 in start_thread (arg=0x7ff791ffb700) at pthread_create.c:334
> > #11 0x00007ff7b02c9ebd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
> > (gdb) f 0
> > #0  0x00007ff7b1dde71a in svc_release_it (xprt=0x7ff780001740, flags=0, tag=0x7ff7b1e05fd0 "clnt_vc_destroy", line=462)
> >     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/libntirpc/ntirpc/rpc/svc.h:433
> > 433 in /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/libntirpc/ntirpc/rpc/svc.h
> > (gdb) p xprt
> > $1 = (SVCXPRT *) 0x7ff780001740
> > (gdb) p *$
> > $2 = {xp_ops = 0x0, xp_dispatch = {process_cb = 0x0,
rendezvous_cb = 0x0}, xp_parent = 0x7ff770004730, xp_tp = 0x6d00000001 <error: Cannot
access memory at address 0x6d00000001>,
> > xp_netid = 0x7ff79c00a160 "", xp_p1 =
0x7ff770004750, xp_p2 = 0x0, xp_p3 = 0x0, xp_u1 = 0x3, xp_u2 = 0x0, xp_local = {nb =
{maxlen = 0, len = 0, buf = 0x7ff7940018a0}, ss = {
> > ss_family = 0, __ss_align = 0, __ss_padding =
'\000' <repeats 111 times>}}, xp_remote = {nb = {maxlen = 4280583506, len =
0, buf = 0x0}, ss = {ss_family = 34467,
> > __ss_align = 1,
> > __ss_padding =
"_:P\346ju\200\223\001\000\000\000\001\000\000\000`,\000\200\367\177\000\000\341\376\266^",
'\000' <repeats 12 times>,
"\061\000\000\000\000\000\000\000\000\061\000\200\367\177\000\000\220]\000\200\367\177\000\000c3-edbe-2fea12000\000\000\000\000\000\000\000\064\001",
'\000' <repeats 21 times>}}, xp_lock = {__data = {__lock = -1946148624,
> > __count = 32759, __owner = 0, __nusers = 37, __kind =
-1946148624, __spins = 32759, __list = {__prev = 0x7ff77c001530, __next = 0x0}},
> > __size = "\360
\000\214\367\177\000\000\000\000\000\000%\000\000\000\360
\000\214\367\177\000\000\060\025\000|\367\177\000\000\000\000\000\000\000\000\000",
> > __align = 140701182468336}, xp_fd = 0, xp_ifindex = 0,
xp_si_type = 3, xp_type = 0, xp_refcnt = -1, xp_flags = 64}
> > (gdb) p xprt->xp_ops
> > $3 = (struct xp_ops *) 0x0
> > (gdb)
> >
> >
> > Regards,
> > Deepthi
> >
> > On 14/05/20, 12:17 PM, "Deepthi Shivaramu" <des(a)vmware.com> wrote:
> >
> > Daniel,
> > I am seeing this segfault with libntirpc 1.8.0 and ganesha 2.8.2 in the setclientid_confirm code path.
> > Can you please check and let me know if you have seen this issue before, and whether the fix is already available in the latest versions?
> >
> >
> > (gdb) bt
> > #0  0x0000000000000000 in ?? ()
> > #1  0x00007fd66badf72e in svc_release_it (xprt=0x7fd658002e90, flags=0, tag=0x7fd66bb06fd0 "clnt_vc_destroy", line=462)
> >     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/libntirpc/ntirpc/rpc/svc.h:433
> > #2  0x00007fd66bae04fb in clnt_vc_destroy (clnt=0x7fd658002ba0)
> >     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/libntirpc/src/clnt_vc.c:462
> > #3  0x000000000043b4e1 in clnt_release_it (clnt=0x7fd658002ba0, flags=0, tag=0x55e550 <__func__.21824> "_nfs_rpc_destroy_chan", line=628)
> >     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/libntirpc/ntirpc/rpc/clnt.h:319
> > #4  0x000000000043b577 in clnt_destroy_it (clnt=0x7fd658002ba0, tag=0x55e550 <__func__.21824> "_nfs_rpc_destroy_chan", line=628)
> >     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/libntirpc/ntirpc/rpc/clnt.h:341
> > #5  0x000000000043eb97 in _nfs_rpc_destroy_chan (chan=0x7fd64c002648)
> >     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/MainNFSD/nfs_rpc_callback.c:628
> > #6  0x000000000043f800 in nfs_rpc_destroy_chan (chan=0x7fd64c002648)
> >     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/MainNFSD/nfs_rpc_callback.c:864
> > #7  0x000000000048011c in nfs4_op_setclientid_confirm (op=0x7fd62c001d90, data=0x7fd6607dff70, resp=0x7fd62c002070)
> >     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/Protocols/NFS/nfs4_op_setclientid_confirm.c:382
> > #8  0x000000000045b4b1 in nfs4_Compound (arg=0x7fd62c0011a8, req=0x7fd62c000aa0, res=0x7fd62c001f60)
> >     at
> > ....
> > .......
> > #20 0x00007fd669fcaebd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
> > (gdb) f 1
> > #1  0x00007fd66badf72e in svc_release_it (xprt=0x7fd658002e90, flags=0, tag=0x7fd66bb06fd0 "clnt_vc_destroy", line=462)
> >     at /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/libntirpc/ntirpc/rpc/svc.h:433
> > 433 /build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/libntirpc/ntirpc/rpc/svc.h: No such file or directory.
> > (gdb) p clnt
> > No symbol "clnt" in current context.
> > (gdb) p xprt
> > $10 = (SVCXPRT *) 0x7fd658002e90
> > (gdb) p *$
> > $11 = {xp_ops = 0x7fd658000e20, xp_dispatch =
{process_cb = 0x7fd658000078,
> > rendezvous_cb = 0x7fd658000078}, xp_parent = 0x0, xp_tp
= 0x0,
> > xp_netid = 0x0, xp_p1 = 0x0, xp_p2 = 0x0, xp_p3 = 0x0,
xp_u1 = 0x0,
> > xp_u2 = 0x0, xp_local = {nb = {maxlen = 483619223, len =
1, buf = 0x2},
> > ss = {ss_family = 0, __ss_align = 0,
> > __ss_padding =
> >
"\313)\260k\326\177\000\000\020\320\236b\326\177\000\000\006\000\000\000\034\000\000\000\004\004\005\377\377\377\377\377\000\000\000\000\020\373\364\310\333c\335\363\245\332\362b\324.M\332",
> > '\000' <repeats 59 times>}}, xp_remote =
{nb = {maxlen = 0, len = 0, buf =
> > 0x0}, ss = {ss_family = 0,
> > __ss_align = 0, __ss_padding = '\000'
<repeats 111 times>}}, xp_lock =
> > {
> > __data = {__lock = 0, __count = 0, __owner = 0, __nusers
= 0, __kind = 0,
> > __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
> > __size = '\000' <repeats 39 times>,
__align = 0}, xp_fd = 0,
> > xp_ifindex = 0, xp_si_type = 0, xp_type = 0, xp_refcnt =
-1, xp_flags = 64}
> > (gdb) f 6
> > #6 0x000000000043f800 in nfs_rpc_destroy_chan
(chan=0x7fd64c002648)
> > at
> >
/build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/MainNFSD/nfs_rpc_callback.c:864
> > 864
/build/mts/release/bora-16138726/cayman_nfs-ganesha/nfs-ganesha/src/src/MainNFSD/nfs_rpc_callback.c:
> > No such file or directory.
> > (gdb) p chan
> > $12 = (rpc_call_channel_t *) 0x7fd64c002648
> > (gdb) p *$
> > $13 = {type = RPC_CHAN_V40, mtx = {__data = {__lock = 1,
__count = 0,
> > __owner = 163, __nusers = 1, __kind = 0, __spins = 0,
__list = {
> > __prev = 0x0, __next = 0x0}},
> > __size =
"\001\000\000\000\000\000\000\000\243\000\000\000\001",
> > '\000' <repeats 26 times>, __align = 1},
states = 0, source = {clientid =
> > 0x7fd64c0025a0,
> > session = 0x7fd64c0025a0}, last_called = 0, clnt =
0x7fd658002ba0,
> > auth = 0x0, gss_sec = {mech = 0x0, qop = 0, svc =
RPCSEC_GSS_SVC_INTEGRITY,
> > cred = 0x0, req_flags = 0}}
> > (gdb) p chan->client
> > There is no member named client.
> > (gdb) p chan->clnt
> > $14 = (CLIENT *) 0x7fd658002ba0
> > (gdb) p *$
> > $15 = {cl_ops = 0x7fd66bd192e0, cl_netid = 0x0, cl_tp =
0x0, cl_u1 = 0x0,
> > cl_u2 = 0x0, cl_lock = {__data = {__lock = 0, __count =
0, __owner = 0,
> > __nusers = 0, __kind = 3, __spins = 0, __list = {__prev
= 0x0,
> > __next = 0x0}},
> > __size = '\000' <repeats 16 times>,
"\003", '\000'
> > <repeats 22 times>,
> > __align = 0}, cl_error = {ru = {RE_errno = 0, RE_why =
AUTH_OK, RE_vers = {
> > low = 0, high = 0}, RE_lb = {s1 = 0, s2 = 0}},
> > re_status = RPC_SUCCESS}, cl_refcnt = 0, cl_flags = 96}
> > (gdb)
> >
> > On 06/05/20, 10:00 PM, "Daniel Gryniewicz" <dang(a)redhat.com> wrote:
> >
> > I'm happy to announce the latest stable versions of NTIRPC and Ganesha
> > in the 2.8 series. These are NTIRPC 1.8.1 and Ganesha 2.8.4. There are
> > more than 40 bug fixes in these releases.
> >
> > Daniel
> > _______________________________________________
> > Devel mailing list -- devel(a)lists.nfs-ganesha.org
> > To unsubscribe send an email to
devel-leave(a)lists.nfs-ganesha.org