Interesting discussion. If you’re able to join us in IRC, we hang out on Freenode #ganesha

 

It might make sense to break up nfs_Init_svc so that whatever the PROXY FSALs need can be set up before the exports are loaded, while still preventing the server from accepting connections until the exports have been processed.
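
A minimal sketch of that split, modelling only the ordering constraint. None of these names are real Ganesha symbols: struct server_state, init_rpc_runtime, load_exports and start_listening are hypothetical stand-ins for "the part of nfs_Init_svc that clients depend on", "export processing" and "opening the listeners".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of startup state; not a real Ganesha structure. */
struct server_state {
	bool rpc_ready;      /* RPC runtime is set up, clients can be created */
	bool exports_loaded; /* export configuration has been processed */
	bool listening;      /* server accepts incoming connections */
};

/* Phase 1 (assumed): the setup that outgoing client code depends on. */
static void init_rpc_runtime(struct server_state *s)
{
	assert(s != NULL);
	s->rpc_ready = true;
}

/* Export loading: a PROXY export would create its client here, so this
 * refuses to run before phase 1 has completed. */
static bool load_exports(struct server_state *s)
{
	assert(s != NULL);
	if (!s->rpc_ready)
		return false;
	s->exports_loaded = true;
	return true;
}

/* Phase 2 (assumed): only open up to connections once exports exist. */
static bool start_listening(struct server_state *s)
{
	assert(s != NULL);
	if (!s->rpc_ready || !s->exports_loaded)
		return false;
	s->listening = true;
	return true;
}
```

The intended call order is init_rpc_runtime, then load_exports, then start_listening; the two guarded functions return false if called out of order.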

 

Frank

 

From: Jo Seaton [mailto:jo@petagene.com]
Sent: Friday, September 18, 2020 9:56 AM
To: Solomon Boulos <boulos@google.com>
Cc: Matt Benjamin <mbenjami@redhat.com>; Ganesha-devel <devel@lists.nfs-ganesha.org>
Subject: [NFS-Ganesha-Devel] Re: Seeking assistance with Ganesha PROXY FSAL / libntirpc issues

 

Aha! Yes, I've been reading the V4 code so I totally missed this, but this is exactly the same error! It went away as soon as I managed to delay the clnt stuff till after nfs_Init_svc.

 

The actual method I use is pretty horrible - I make the FSAL export init fail, and then later (inside nfs_rpc_cb_init_ccache, yuck!) trigger re-parsing the relevant bit of config file, so reloading the export after all the init I need has happened. Obviously at the very least I'd like to contain this nastiness to the FSAL. Any prettier ideas very welcome because just urgh.
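
One possible shape for containing this inside the FSAL, as a sketch only: instead of failing export init and re-parsing the config, the FSAL records the export as pending and connects it when told that RPC setup is complete. Everything here is hypothetical (struct deferred_exports, proxy_export_create, proxy_rpc_ready, connect_export), and it assumes Ganesha could deliver some "RPC is ready" notification to the FSAL, which is not confirmed anywhere in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 16

/* Hypothetical per-export record kept by the FSAL. */
struct pending_export {
	int export_id;
	bool connected;
};

struct deferred_exports {
	struct pending_export items[MAX_PENDING];
	size_t count;
	bool rpc_ready; /* set once the RPC runtime is usable */
};

/* Stand-in for the real work (e.g. creating the client handle). */
static bool connect_export(struct pending_export *e)
{
	e->connected = true;
	return true;
}

/* Called when the export is created. If RPC is not ready yet, the
 * export is recorded and left unconnected rather than failed. */
static bool proxy_export_create(struct deferred_exports *d, int export_id)
{
	assert(d != NULL);
	if (d->count == MAX_PENDING)
		return false;
	struct pending_export *e = &d->items[d->count++];
	e->export_id = export_id;
	e->connected = false;
	if (d->rpc_ready)
		return connect_export(e);
	return true; /* deferred until proxy_rpc_ready() */
}

/* Called once after RPC setup; connects everything still pending and
 * returns how many exports were connected by this call. */
static size_t proxy_rpc_ready(struct deferred_exports *d)
{
	size_t connected = 0;

	assert(d != NULL);
	d->rpc_ready = true;
	for (size_t i = 0; i < d->count; i++) {
		if (!d->items[i].connected && connect_export(&d->items[i]))
			connected++;
	}
	return connected;
}
```

Exports created before the notification stay unconnected until proxy_rpc_ready runs; exports created afterwards connect immediately.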

 

Anyway, I'm happy to share code + try stuff out for you, but that will have to wait a week!

 

Thanks for getting back to me about this, it's nice to not be alone in this code :)

 

Jo

 

On Fri, Sep 18, 2020 at 5:29 PM Solomon Boulos <boulos@google.com> wrote:

Absolutely. I didn't want to write it :).

 

 

> I could not figure it out after multiple days of sadness. I had hoped to just use rpc_call!


> AFAICT, there is no way to use clnt_call or rpc_call from within a Ganesha FSAL (and luckily, I looked at the V4 proxy and said "Oh, they just rolled their own, too").

> I *think* it's because of something the main NFSD does in setting up the listening code (via svc_create or similar). I tried wading through the libntirpc code to understand if there's also something particularly different about that.

 

I'm surprised your rpc->clnt = clnt_vc_ncreatef works around it. Can you try the same for NLM / PORTMAP / whatever? I can't find the simple example from when I went back and forth with folks on that code, but it was basically "No, I can't even issue a MNT_NULL with clnt_call / rpc_call". My comment on proxyv3_call describes the error, though:

 

/*
 * NOTE(boulos): proxyv3_call is basically rpc_call redone by hand, because
 * ganesha's NFSD hijacks the RPC setup to the point where we can't issue our
 * own NFS-related rpcs as a simple client via clnt_ncreate (internally,
 * svc_exprt_lookup explodes saying "fd %d max_connections 0 exceeded").
 */

 

Are you saying you did something else, in the FSAL loading or similar, to wait until later so that clnt_ncreate will succeed (i.e. so that max_connections can be >0)? Either way, I intentionally left the proxy_v3 code easy to rip out and structured just like clnt_call / rpc_call, so that it can all be replaced if we can fix this :).

 

 

On Fri, Sep 18, 2020 at 9:14 AM Jo Seaton <jo@petagene.com> wrote:

Hi both, thanks for the responses,

 

Solomon - I had a hell of a time initially with the client functions, which turned out to be because some libntirpc initialisation hadn't been done yet. I have an unpleasant hack to delay PROXY exports until that's happened, which solves the issue for me - maybe you had the same?

 

Concerning use of client functions, here's my initialisation (rpc_sock has already been opened):

 

rpc->clnt = clnt_vc_ncreatef(rpc->rpc_sock, &raddr,
                             proxyv4_exp->info.srv_prognum, (rpcvers_t) 4,
                             proxyv4_exp->info.srv_sendsize,
                             proxyv4_exp->info.srv_recvsize, 0);
rpc->auth = authgss_ncreate_default(rpc->clnt, "nfs@nfs-storage.blah.net", &rpcsec_gss_data);

 

and here's the RPC calls:

   struct clnt_req *cc;
   enum clnt_stat stat;

   cc = gsh_malloc(sizeof(*cc));
   clnt_req_fill(cc, clnt, au, CB_COMPOUND,
                 (xdrproc_t) xdr_COMPOUND4args, (caddr_t)args,
                 (xdrproc_t) xdr_COMPOUND4res, (caddr_t)res);

   stat = clnt_req_setup(cc, tout);
   if (stat == RPC_SUCCESS) {
           cc->cc_refreshes = 1;
           stat = CLNT_CALL_WAIT(cc);
   }
   clnt_req_release(cc);

 

This appears to roughly work - I have some issues but I'm not done yet.

 

My auth_gss.c hack is just this:

(in authgss_verify:)

maj_stat =
  gss_verify_mic(&min_stat, gd->ctx, &signbuf, &checksum, &qop_state);

if (maj_stat != GSS_S_COMPLETE || qop_state != gd->sec.qop) {
  gss_log_status("gss_verify_mic", maj_stat, min_stat);
  if (maj_stat == GSS_S_CONTEXT_EXPIRED) {
      gd->established = false;
      authgss_destroy_context(auth);
  }
  /* return (false); */ // here

 

(in authgss_refresh:)

maj_stat =
  gss_verify_mic(&min_stat, gd->ctx, &bufin, &bufout, &qop_state);
maj_stat = GSS_S_COMPLETE; // and here

 

I'm happy to post my entire code when I get back, but I'm away for the next week (sorry!).

 

Solomon - does this mean you'd be happy with an approach like the above if it works? I'm not opposed to using the existing PROXY approach, but writing an equivalent of the libntirpc GSS code just for PROXY sounds like a good way to make bugs.

 

Many thanks,

Jo

 

On Thu, Sep 17, 2020 at 8:11 PM Matt Benjamin <mbenjami@redhat.com> wrote:

Hi,

It's really not the case (at least, unless we recently broke
something) that libntirpc cannot act as a client.  There might be some
specific issue with GSS, though.

Matt

On Thu, Sep 17, 2020 at 1:47 PM Solomon Boulos via Devel
<devel@lists.nfs-ganesha.org> wrote:
>
> Yeah, sadly libntirpc (and ntirpc implementations generally) actually aren't able to establish connections as clients. I haven't looked at the GSS auth code, but will do so. I don't think there's any reason it would "require" CLIENT (it's just more bytes that need to be on the wire).
>
> But my experience with the CLIENT functions from the V3 proxy was that I couldn't even get it to establish connections. (see FSAL_PROXY_V3/rpc.c). Would you mind pasting your in-progress code? (if you got it to work, that would be *fantastic*)
>
> On Thu, Sep 17, 2020 at 10:21 AM Jo Seaton <jo@petagene.com> wrote:
>>
>> Hi all,
>>
>> I've been working on re-adding GSS/Kerberos authentication support to the PROXY (V4) FSAL, with a mind to eventually also adding Kerberos delegation support for completeness. I've been having some issues I was hoping to get some feedback on.
>>
>> The first issue I've been having is to do with the GSS code in libntirpc.
>> I'm calling authgss_ncreate_default with a valid CLIENT * and service name, and hopefully a reasonable struct rpc_gss_sec. This fails, because the call to gss_verify_mic inside authgss_refresh fails. This appears to be because authgss_verify fills a relevant buffer (gd->gc_wire_verf) with <empty>, which originally comes from cc_verf in the clnt code (via the start of authgss_validate). Specifically I'm looking at clnt_req's cc_verf which gets used for AUTH_VALIDATE in clnt_generic.c, and always seems to have the same "_null_auth" value - which seems surprising to me! If anyone can give me some insight into what exactly cc_verf is supposed to contain that might help me fix it. Working around it by ignoring the result of gss_verify_mic does seem to work OK.
>>
>> The second issue is to do with the structure of the PROXY FSAL.
>> It appears that it largely handles requests "manually", calling the relevant xdr_* functions, and reading/writing to sockets itself. The GSS auth code, on the other hand, seems to require use of CLIENT *, which in my understanding means handing responsibility for the socket to that CLIENT *. These two approaches appear incompatible to me. I've made some reasonable progress rewriting PROXY with clnt_req_* functions, similar to nfs_rpc_callback.c, but if anyone has any feedback on a) why the original approach was taken (ffilz suggests those functions didn't use to be threadsafe?) and b) the most sensible thing to do now, it would be very much appreciated.
>>
>> Anyway any feedback is very welcome, I'm very new to both Ganesha and GSS/Kerberos.
>>
>> Many thanks,
>>
>> Jo
>>
>> _______________________________________________
>> Devel mailing list -- devel@lists.nfs-ganesha.org
>> To unsubscribe send an email to devel-leave@lists.nfs-ganesha.org



--

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309