Yeah, PRs on github.
Daniel
On 1/30/20 2:14 PM, Sriram Patil wrote:
Daniel, thank you for the quick response. I will fix this and verify
in our environment. Will post a review request soon.
Do we use pull requests for ntirpc fixes?
Thanks,
Sriram
On 1/30/20, 11:08 AM, "Daniel Gryniewicz" <dang(a)redhat.com> wrote:
Definitely not a known issue, and it certainly sounds plausible.
Daniel
On 1/30/20 1:45 PM, Sriram Patil wrote:
> Hi,
>
> While setting up some Kerberos shares with ganesha I came across an issue where
the client receives RPCSEC_GSS_CREDPROBLEM. The source of this error is is form following
code in function _svcauth_gss which is used when authenticating RPCSEC_GSS request,
>
> 442 /* Context lookup. */
> 443 if ((gc->gc_proc == RPCSEC_GSS_DATA)
> 444 || (gc->gc_proc == RPCSEC_GSS_DESTROY)) {
> 445
> 446 /* Per RFC 2203 5.3.3.3, if a valid security context
> 447 * cannot be found to authorize a request, the
> 448 * implementation returns RPCSEC_GSS_CREDPROBLEM.
> 449 * N.B., we are explicitly allowed to discard
contexts
> 450 * for any reason (e.g., to save space). */
> 451 gd = authgss_ctx_hash_get(gc);
> 452 if (!gd) {
> 453 rc = RPCSEC_GSS_CREDPROBLEM;
> 454 goto cred_free;
> 455 }
> 456 if (gc->gc_svc != gd->sec.svc)
> 457 gd->sec.svc = gc->gc_svc;
> 458 }
>
> When the GSS proc is GSS_DATA or GSS_DESTROY, ntirpc expects the context to be
in the cache. If not it returns RPCSEC_GSS_CREDPROBLEM.
>
> For the above condition to be true, this context should have been inserted in
the cache as part of GSS_INIT request. In the same function, the svc_rpc_gss_data object
is inserted in the cache but there is a difference in the dispatch code. When the request
is RPCSEC_GSS_INIT, ntirpc sends the reply back to the client before caching the context.
>
> 533 *no_dispatch = true;
> 534
> 535 req->rq_msg.RPCM_ack.ar_results.where = &gr;
> 536 req->rq_msg.RPCM_ack.ar_results.proc =
> 537 (xdrproc_t)
xdr_rpc_gss_init_res;
> 538 call_stat = svc_sendreply(req);
<-- Sending reply to client here
> 539
> 540 /* XXX */
> 541 gss_release_buffer(&min_stat, &gr.gr_token);
> 542 gss_release_buffer(&min_stat,
&gd->checksum);
> 543 mem_free(gr.gr_ctx.value, 0);
> 544
> 545 if (call_stat >= XPRT_DIED) {
> 546 rc = AUTH_FAILED;
> 547 goto gd_free;
> 548 }
> 549
> 550 if (gr.gr_major == GSS_S_COMPLETE) {
> ……
> ……
> 583 (void)authgss_ctx_hash_set(gd);
<-- Inserting the svc_rpc_gss_data object into the cache
> 584 }
>
> In my setup, I observed that, the client can send another request, let's
say EXCHANGE_ID with RPCSEC_GSS_DATA and end up returning the CREDPROBLEM error if that
work thread gets scheduled first and tries to authenticate the request before the context
is inserted in the cache. I believe we should be sending the reply after inserting the
context in the cache.
>
> Please find some logs below which confirm the race. Svc_10 is the
RPCSEC_GSS_INIT request and svc_4 is RPCSEC_GSS_DATA request. We can observe that
rbtree_x_cached_lookup happens before rbtree_x_cached_insert for hash key f8003f00,
>
> 2020-01-30T16:52:42Z : ganesha.nfsd-2248[none] [svc_10] 0 :rpc :TIRPC
:xdr_reply_encode:109 SUCCESS
> 2020-01-30T16:52:42Z : ganesha.nfsd-2248[none] [svc_10] 0 :rpc :TIRPC
:xdr_rpc_gss_encode() success (0x7f89fc002360:16)
> 2020-01-30T16:52:42Z : ganesha.nfsd-2248[none] [svc_10] 0 :rpc :TIRPC
:xdr_rpc_gss_encode() success (0x7f89fc008bb0:155)
> 2020-01-30T16:52:42Z : ganesha.nfsd-2248[none] [svc_10] 0 :rpc :TIRPC
:xdr_rpc_gss_init_res() encode success (ctx 0x7f89fc002360:16, maj 0, min 0, win 32, token
0x7f89fc008bb0:155)
> 2020-01-30T16:52:42Z : ganesha.nfsd-2248[none] [svc_10] 0 :rpc :TIRPC
:svc_ref_it() 0x7f8a04000a90 fd 32 xp_refcnt 5 af 10 port 35536 @svc_ioq_write_now:265
> 2020-01-30T16:52:42Z : ganesha.nfsd-2248[none] [svc_10] 0 :rpc :TIRPC
:svc_release_it() 0x7f8a04000a90 fd 32 xp_refcnt 4 af 10 port 35536 @svc_ioq_write:233
> 2020-01-30T16:52:42Z : ganesha.nfsd-2248[none] [svc_10] 0 :rpc :TIRPC
:xdr_ioq_destroy() xioq 0x7f89fc002800
> ....
> ....
> 2020-01-30T16:52:42Z : ganesha.nfsd-2248[none] [svc_4] 0 :rpc :TIRPC
:xdr_rpc_gss_cred() decode success (v 1, proc 0, seq 1, svc 2, ctx 0x7f8a0c001780:16)
> 2020-01-30T16:52:42Z : ganesha.nfsd-2248[none] [svc_4] 0 :rpc :TIRPC
:rbtree_x_cached_lookup: t 0x7f8a0c0021c0 nk 0x7f8a20153e80 nv (nil)( hk f8003f00
slot/offset 56)
> 2020-01-30T16:52:42Z : ganesha.nfsd-2248[none] [svc_4] 0 :rpc :TIRPC
:xdr_rpc_gss_cred() decode success (v 1, proc 0, seq 1, svc 2, ctx 0x7f8a0c001780:16)
> 2020-01-30T16:52:42Z : ganesha.nfsd-2248[none] [svc_4] 728
:nfs_rpc_process_request :DISP :Could not authenticate request... rejecting with
AUTH_STAT=RPCSEC_GSS_CREDPROBLEM
> ....
> ....
> 2020-01-30T16:52:42Z : ganesha.nfsd-2248[none] [svc_10] 0 :rpc :TIRPC
:rbtree_x_cached_insert: cix 9 ct 0x7f8a0c0021c0 t 0x7f8a0c0021c0 inserting 0x7f89fc001890
(cache hk f8003f00 slot/offset 56) flags 3
> 2020-01-30T16:52:42Z : ganesha.nfsd-2248[none] [svc_10] 0 :rpc :TIRPC
:xdr_rpc_gss_cred() decode success (v 1, proc 1, seq 0, svc 1, ctx (nil):0)
> 2020-01-30T16:52:42Z : ganesha.nfsd-2248[none] [svc_10] 740
:nfs_rpc_process_request :DISP :RPCSEC_GSS no_dispatch=1 gc->gc_proc=(1)
RPCSEC_GSS_INIT
>
>
> Is this a known issue or being worked on already?
>
> Thanks,
> Sriram
>