Cool! I enjoy the way you hacked through to make progress :). Let’s find a
nice way to stand up the ntirpc bits before the FSALs initialize, so that
it doesn’t have to be like this.
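To make the ordering concrete, here is a minimal sketch of the startup sequence discussed in this thread: bring up the RPC runtime first, load exports second, and open the listening sockets last. It is a toy model only. Every name in it (toy_server, toy_rpc_runtime_init, toy_load_export, toy_open_listeners) is hypothetical and does not exist in Ganesha or ntirpc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical startup phases; not Ganesha's real state machine. */
enum toy_phase {
	TOY_PHASE_NONE,		/* nothing initialised yet */
	TOY_PHASE_RPC_READY,	/* RPC runtime up, no listeners open */
	TOY_PHASE_LISTENING	/* server accepts connections */
};

struct toy_server {
	enum toy_phase phase;
	size_t exports_loaded;
};

/* Step 1: bring up the RPC runtime only; no sockets are opened here. */
static bool toy_rpc_runtime_init(struct toy_server *srv)
{
	assert(srv != NULL);
	if (srv->phase != TOY_PHASE_NONE)
		return false;
	srv->phase = TOY_PHASE_RPC_READY;
	return true;
}

/*
 * Step 2: load one export. A PROXY-style export could create its RPC
 * client here, which is why the runtime must already be up.
 */
static bool toy_load_export(struct toy_server *srv)
{
	assert(srv != NULL);
	if (srv->phase != TOY_PHASE_RPC_READY)
		return false;	/* too early, or already listening */
	srv->exports_loaded++;
	return true;
}

/* Step 3: only after the exports are processed, accept connections. */
static bool toy_open_listeners(struct toy_server *srv)
{
	assert(srv != NULL);
	if (srv->phase != TOY_PHASE_RPC_READY)
		return false;
	srv->phase = TOY_PHASE_LISTENING;
	return true;
}
```

In this model, calling toy_load_export before toy_rpc_runtime_init fails, which mirrors the problem being worked around; calling the three steps in order succeeds without the server ever listening before its exports exist.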
On Wed, Sep 30, 2020 at 07:59 Jo Seaton <jo(a)petagene.com> wrote:
Hi All,
Apologies for the delay, I am back.
Please find (very experimental!) code here:
https://github.com/jseaton/nfs-ganesha/commit/5ad302b815acc4b99bd111753c9...
https://github.com/jseaton/ntirpc/commit/c7a3c5b87435ddca4a62220e13b535bd...
Note: copies and such will hang sporadically for unclear reasons. I
suspect I need to remove (or add...) some locking, but I might be wrong.
The code also assumes a krb5-supporting remote.
@Solomon - sorry, I can't confirm for other protocols, but feel free to
have a go!
@Frank - I'd be interested in that if we can work out what the issue is.
I'm only fairly sure nfs_Init_svc is the problem; I don't understand the
internals of that yet.
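For anyone following along, this is a toy model of the failure as it is described further down the thread (the "fd %d max_connections 0 exceeded" message). It assumes, without having confirmed it in the ntirpc source, that the connection limit is only set during service initialisation, so a client created earlier sees a limit of 0. All names here (toy_svc_params, toy_svc_init, toy_xprt_lookup, toy_clnt_create) are hypothetical stand-ins, not ntirpc functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the RPC library's global service parameters. */
struct toy_svc_params {
	int max_connections;	/* stays 0 until toy_svc_init() runs */
};

static struct toy_svc_params toy_params;	/* zero-initialised */

/* Models the one-time setup that this thread attributes to nfs_Init_svc. */
static void toy_svc_init(int max_connections)
{
	assert(max_connections > 0);
	toy_params.max_connections = max_connections;
}

/* Models a transport lookup that refuses any fd beyond the limit. */
static bool toy_xprt_lookup(int fd)
{
	if (fd < 0 || fd >= toy_params.max_connections) {
		fprintf(stderr, "fd %d max_connections %d exceeded\n",
			fd, toy_params.max_connections);
		return false;
	}
	return true;
}

/* Models client creation: it needs a transport slot for its socket. */
static bool toy_clnt_create(int fd)
{
	return toy_xprt_lookup(fd);
}
```

With this model, toy_clnt_create(7) fails while the limit is still 0 and succeeds once toy_svc_init(1024) has run, which is the behaviour I see when the clnt setup is delayed until after nfs_Init_svc.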
I'm on #ganesha (as fiiiii) UK hours if anyone wants to chat about this.
Many thanks,
Jo
On Fri, Sep 18, 2020 at 6:10 PM Frank Filz <ffilzlnx(a)mindspring.com>
wrote:
> Interesting discussion. If you’re able to join us in IRC, we hang out on
> Freenode #ganesha
>
>
>
> It might make sense to break up nfs_Init_svc so that whatever the PROXY
> FSALs need can be set up before loading exports, while still preventing
> the server from opening up to connections before the exports are processed.
>
>
>
> Frank
>
>
>
> *From:* Jo Seaton [mailto:jo@petagene.com]
> *Sent:* Friday, September 18, 2020 9:56 AM
> *To:* Solomon Boulos <boulos(a)google.com>
> *Cc:* Matt Benjamin <mbenjami(a)redhat.com>; Ganesha-devel <
> devel(a)lists.nfs-ganesha.org>
> *Subject:* [NFS-Ganesha-Devel] Re: Seeking assistance with Ganesha PROXY
> FSAL / libntirpc issues
>
>
>
> Aha! Yes, I've been reading the V4 code so I totally missed this, but
> this is exactly the same error! It went away as soon as I managed to delay
> the clnt stuff till after nfs_Init_svc.
>
>
>
> The actual method I use is pretty horrible - I make the FSAL export init
> fail, and then later (inside nfs_rpc_cb_init_ccache, yuck!) trigger
> re-parsing the relevant bit of config file, so reloading the export after
> all the init I need has happened. Obviously at the very least I'd like to
> contain this nastiness to the FSAL. Any prettier ideas very welcome because
> just urgh.
>
>
>
> Anyway, I'm happy to share code + try stuff out for you, but that will
> have to wait a week!
>
>
>
> Thanks for getting back to me about this, it's nice to not be alone in
> this code :)
>
>
>
> Jo
>
>
>
> On Fri, Sep 18, 2020 at 5:29 PM Solomon Boulos <boulos(a)google.com> wrote:
>
> Absolutely. I didn't want to write it :).
>
>
>
> From the review on
> https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/487439
>
>
>
> > I could not figure it out after multiple days of sadness. I had hoped
> > to just use rpc_call!
> >
> > AFAICT, there is no way to use clnt_call or rpc_call from within a
> > Ganesha FSAL (and luckily, I looked at the V4 proxy and said "Oh, they
> > just rolled their own, too").
> >
> > I *think* it's because of something the main NFSD does in setting up
> > the listening code (via svc_create or similar). I tried wading through
> > the libntirpc code to understand if there's also something particularly
> > different about that.
>
>
>
> I'm surprised your rpc->clnt = clnt_vc_ncreatef works around it. Can you
> try the same for NLM / PORTMAP / whatever? I can't find the simple example
> from when I went back and forth with folks on that code, but it was
> basically "No, I can't even issue a MNT_NULL with clnt_call / rpc_call".
> My comment on proxyv3_call describes the error, though:
>
>
>
> /*
>  * NOTE(boulos): proxyv3_call is basically rpc_call redone by hand, because
>  * ganesha's NFSD hijacks the RPC setup to the point where we can't issue our
>  * own NFS-related rpcs as a simple client via clnt_ncreate (internally,
>  * svc_exprt_lookup explodes saying "fd %d max_connections 0 exceeded").
>  */
>
>
>
> Are you saying you did something else, in the FSAL loading or similar, to
> wait until later so that clnt_ncreate will succeed (so that
> max_connections can be >0)? Either way, I intentionally left the proxy_v3
> code ready to be ripped out and made to look just like clnt_call /
> rpc_call, so that it can all be replaced if we can fix this :).
>
>
>
>
>
> On Fri, Sep 18, 2020 at 9:14 AM Jo Seaton <jo(a)petagene.com> wrote:
>
> Hi both, thanks for the responses,
>
>
>
> Solomon - I had a hell of a time initially with the client functions that
> turned out to be because some libntirpc initialisation hadn't been done
> yet. I have an unpleasant hack to delay PROXY exports until that's happened
> that solves the issue for me - maybe you had the same?
>
>
>
> Concerning use of client functions, here's my initialisation (rpc_sock
> has already been opened):
>
>
>
> rpc->clnt = clnt_vc_ncreatef(rpc->rpc_sock, &raddr,
>     proxyv4_exp->info.srv_prognum, (rpcvers_t) 4,
>     proxyv4_exp->info.srv_sendsize, proxyv4_exp->info.srv_recvsize, 0);
> rpc->auth = authgss_ncreate_default(rpc->clnt,
>     "nfs@nfs-storage.blah.net", &rpcsec_gss_data);
>
>
>
> and here are the RPC calls:
>
> struct clnt_req *cc;
> enum clnt_stat stat;
>
> cc = gsh_malloc(sizeof(*cc));
> clnt_req_fill(cc, clnt, au, CB_COMPOUND,
>     (xdrproc_t) xdr_COMPOUND4args, (caddr_t)args,
>     (xdrproc_t) xdr_COMPOUND4res, (caddr_t)res);
>
> stat = clnt_req_setup(cc, tout);
> if (stat == RPC_SUCCESS) {
>     cc->cc_refreshes = 1;
>     stat = CLNT_CALL_WAIT(cc);
> }
> clnt_req_release(cc);
>
>
>
> This appears to roughly work - I have some issues but I'm not done yet.
>
>
>
> My auth_gss.c hack is just this:
>
> (in authgss_verify:)
>
> maj_stat = gss_verify_mic(&min_stat, gd->ctx, &signbuf, &checksum,
>     &qop_state);
>
> if (maj_stat != GSS_S_COMPLETE || qop_state != gd->sec.qop) {
>     gss_log_status("gss_verify_mic", maj_stat, min_stat);
>     if (maj_stat == GSS_S_CONTEXT_EXPIRED) {
>         gd->established = false;
>         authgss_destroy_context(auth);
>     }
>     /* return (false); */ // here
> }
>
> (in authgss_refresh:)
>
> maj_stat = gss_verify_mic(&min_stat, gd->ctx, &bufin, &bufout,
>     &qop_state);
> maj_stat = GSS_S_COMPLETE; // and here
>
>
>
> I'm happy to post my entire code when I get back, but I'm away for the
> next week (sorry!).
>
>
>
> Solomon - does this mean you'd be happy with an approach like the above
> if it works? I'm not opposed to using the existing PROXY approach, but
> writing an equivalent of the libntirpc GSS code just for PROXY sounds like
> a good way to make bugs.
>
>
>
> Many thanks,
>
> Jo
>
>
>
> On Thu, Sep 17, 2020 at 8:11 PM Matt Benjamin <mbenjami(a)redhat.com>
> wrote:
>
> Hi,
>
> It's really not the case (at least, unless we recently broke
> something) that libntirpc cannot act as a client. There might be some
> specific issue with GSS, though.
>
> Matt
>
> On Thu, Sep 17, 2020 at 1:47 PM Solomon Boulos via Devel
> <devel(a)lists.nfs-ganesha.org> wrote:
> >
> > Yeah, sadly libntirpc (and ntirpc implementations generally) actually
> > aren't able to establish connections as clients. I haven't looked at the
> > GSS auth code, but will do so. I don't think there's any reason it would
> > "require" CLIENT (it's just more bytes that need to be on the wire).
> >
> > But my experience with the CLIENT functions from the V3 proxy was that
> > I couldn't even get it to establish connections (see
> > FSAL_PROXY_V3/rpc.c). Would you mind pasting your in-progress code? (If
> > you got it to work, that would be *fantastic*.)
> >
> > On Thu, Sep 17, 2020 at 10:21 AM Jo Seaton <jo(a)petagene.com> wrote:
> >>
> >> Hi all,
> >>
> >> I've been working on re-adding GSS/Kerberos authentication support to
> >> the PROXY (V4) FSAL, with a view to eventually also adding Kerberos
> >> delegation support for completeness. I've been having some issues I was
> >> hoping to get some feedback on.
> >>
> >> The first issue I've been having is to do with the GSS code in
> >> libntirpc.
> >> I'm calling authgss_ncreate_default with a valid CLIENT * and service
> >> name, and hopefully a reasonable struct rpc_gss_sec. This fails, because
> >> the call to gss_verify_mic inside authgss_refresh fails. This appears to
> >> be because authgss_verify fills a relevant buffer (gd->gc_wire_verf) with
> >> <empty>, which originally comes from cc_verf in the clnt code (via the
> >> start of authgss_validate). Specifically I'm looking at clnt_req's
> >> cc_verf, which gets used for AUTH_VALIDATE in clnt_generic.c and always
> >> seems to have the same "_null_auth" value - which seems surprising to me!
> >> If anyone can give me some insight into what exactly cc_verf is supposed
> >> to contain, that might help me fix it. Working around it by ignoring the
> >> result of gss_verify_mic does seem to work OK.
> >>
> >> The second issue is to do with the structure of the PROXY FSAL.
> >> It appears that it largely handles requests "manually", calling the
> >> relevant xdr_* functions, and reading/writing to sockets itself. The GSS
> >> auth code, on the other hand, seems to require use of CLIENT *, which in
> >> my understanding means handing responsibility for the socket to that
> >> CLIENT *. These two approaches appear incompatible to me. I've made some
> >> reasonable progress rewriting PROXY with clnt_req_* functions, similar to
> >> nfs_rpc_callback.c, but if anyone has any feedback on a) why the original
> >> approach was taken (ffilz suggests those functions didn't used to be
> >> threadsafe?) or b) the most sensible thing to do now, it would be very
> >> much appreciated.
> >>
> >> Anyway, any feedback is very welcome; I'm very new to both Ganesha and
> >> GSS/Kerberos.
> >>
> >> Many thanks,
> >>
> >> Jo
> >>
> >> _______________________________________________
> >> Devel mailing list -- devel(a)lists.nfs-ganesha.org
> >> To unsubscribe send an email to devel-leave(a)lists.nfs-ganesha.org
> >
>
>
>
> --
>
> Matt Benjamin
> Red Hat, Inc.
> 315 West Huron Street, Suite 140A
> Ann Arbor, Michigan 48103
>
> http://www.redhat.com/en/technologies/storage
>
> tel. 734-821-5101
> fax. 734-769-8938
> cel. 734-216-5309