I had meant to add lots of comments to the FSAL API header about how all this handle-to-wire / wire-to-host / handle-to-key stuff works… (particularly the memory lifetimes and ownership) because I struggled with it when implementing the v3 proxy.

The giant comment in fsal_api.h is not bad, but the real action is in nfs_filehandle_mgmt.c (there are other callers of wire_to_host, including the mdcache just passing it through to the underlying FSAL, and an NFSv4 digest variant, but HandleMap_GetFH is being called on the "get me the v3 handle" path):

 https://github.com/nfs-ganesha/nfs-ganesha/blob/2ce0989828a1d3fc1f5dc6164ae1620243ed3320/src/support/nfs_filehandle_mgmt.c#L88

which shows that the buffer passed into wire_to_host is NFS3_FHSIZE long (which makes sense, the function is trying to compute an fh3 as output). The comment there incorrectly says "no larger than NFS4_FHSIZE" but it should say NFS3_FHSIZE. (How'd you get NFS4_FHSIZE+2?)

We should still feel bad about that FIXME in FhandleToCache and the reuse of the buffer as input and output (the size of the buffer doesn't have much to do with the actual fs len).







On Sun, Jul 4, 2021 at 07:30 Nick Couchman <nick.e.couchman@gmail.com> wrote:
On Sun, Jul 4, 2021 at 10:04 AM Nick Couchman <nick.e.couchman@gmail.com> wrote:
On Sat, Jul 3, 2021 at 10:08 PM Nick Couchman <nick.e.couchman@gmail.com> wrote:
On Tue, Jun 29, 2021 at 11:52 PM Solomon Boulos <boulos@google.com> wrote:
Alright, looking through some git blame -w, I think this broke and then was restored in 2912fad98e. Jim had a diff in 2014 (823625c5da) to make the code compile, but crucially the buffer code no longer had a “max len” just a length.

I think this broke the “clever” reuse of the input buffdesc as the argument to GetFH (which has been there since 2012 from 878f45493e).

The input when mapped is a 16-byte thingy (sizeof nfs23_map_handle_t), and GetFH needs to expand the 16-byte mapped handle into an FH4. I assume the code when written had a FH4_SIZE as it’s max length, rather than the fh_desc buffer appearing to be trimmed to 16-bytes (that is, I think there was a length separate from max length at the time or something).


Digging into this a little more, I think the following commit actually originally broke things:

3016c4cee45635afa9c25b21c2f5b014d80d1b6d

In that commit, the "netbuf" data type used as an argument for HandleMap_GetFH is replaced with the "gsh_buffdesc" data type. As far as I can tell, "gsh_buffdesc" never had a "maxlen" member, so the following code in that function would not have compiled after that:

==
...
if(rc == HASHTABLE_SUCCESS)
    {
      handle_pool_entry_t *h = (handle_pool_entry_t *)buffval.pdata;
      if(h->fh_len < p_out_fsal_handle->maxlen)
        {
          p_out_fsal_handle->len = h->fh_len;
          memcpy(p_out_fsal_handle->buf, h->fh_data,  h->fh_len);
          memcpy(p_out_fsal_handle->addr, h->fh_data,  h->fh_len);
          rc = HANDLEMAP_SUCCESS;
        }
...
==

After that change, there are various iterations of renaming the "p_out_fsal_handle" parameter to other things, up until the commit that you mentioned that fixed the compile issues (823625c5da2373606dd4f75dce46a92be143507c), which resulted in the code having "p_out_fsal_handle" renamed to the value actually being passed in, and the "maxlen" member replaced with "len".
 

And, just another note on this - in the commit prior to 3016c4cee4, I believe the netbuf that gets passed around is allocated in this block of code:

==
fsal_status_t
pxy_create_handle(struct fsal_export *exp_hdl,
                  struct fsal_handle_desc *hdl_desc,
                  struct fsal_obj_handle **handle)
{
        fsal_status_t st;
        nfs_fh4 fh4;
        fsal_attrib_list_t attr;
        struct pxy_obj_handle *ph;
#ifdef _HANDLE_MAPPING
        char fh_data[NFS4_FHSIZE+2];
#endif

        if(!exp_hdl || !hdl_desc || !handle || (hdl_desc->len > NFS4_FHSIZE))
                ReturnCode(ERR_FSAL_INVAL, 0);

        fh4.nfs_fh4_val = hdl_desc->start;
        fh4.nfs_fh4_len = hdl_desc->len;

#ifdef _HANDLE_MAPPING
        if(hdl_desc->len == sizeof(nfs23_map_handle_t)) {
                nfs23_map_handle_t *h23 = (nfs23_map_handle_t*)hdl_desc->start;
                if(h23->type == PXY_HANDLE_MAPPED) {
                        struct netbuf fh = {
                                .maxlen = sizeof(fh_data),
                                .buf = fh_data
                        };
                        struct pxy_handle_blob *blob;
==

From this I gather that the size of maxlen for that netbuf is the size of the data going into the buffer; however, that member is allocated as NFS4_FHSIZE + 2.

So, maybe a constant, something like:

#define PROXYV4_HANDLE_MAXLEN NFS4_FHSIZE+2

The following patch seems to fix things:

==
diff -Naur nfs-ganesha-4-dev.66.a/src/FSAL/FSAL_PROXY_V4/handle_mapping/handle_mapping.c nfs-ganesha-4-dev.66.b/src/FSAL/FSAL_PROXY_V4/handle_mapping/handle_mapping.c
--- nfs-ganesha-4-dev.66.a/src/FSAL/FSAL_PROXY_V4/handle_mapping/handle_mapping.c 2021-07-02 18:12:40.000000000 -0400
+++ nfs-ganesha-4-dev.66.b/src/FSAL/FSAL_PROXY_V4/handle_mapping/handle_mapping.c 2021-07-04 08:25:04.398912776 -0400
@@ -344,7 +344,7 @@
  if (rc == HASHTABLE_SUCCESS) {
  handle_pool_entry_t *h = (handle_pool_entry_t *) buffval.addr;
 
- if (h->fh_len < fsal_handle->len) {
+ if (h->fh_len < PROXYV4_HANDLE_MAXLEN) {
  fsal_handle->len = h->fh_len;
  memcpy(fsal_handle->addr, h->fh_data, h->fh_len);
  rc = HANDLEMAP_SUCCESS;
diff -Naur nfs-ganesha-4-dev.66.a/src/FSAL/FSAL_PROXY_V4/handle_mapping/handle_mapping.h nfs-ganesha-4-dev.66.b/src/FSAL/FSAL_PROXY_V4/handle_mapping/handle_mapping.h
--- nfs-ganesha-4-dev.66.a/src/FSAL/FSAL_PROXY_V4/handle_mapping/handle_mapping.h 2021-07-02 18:12:40.000000000 -0400
+++ nfs-ganesha-4-dev.66.b/src/FSAL/FSAL_PROXY_V4/handle_mapping/handle_mapping.h 2021-07-04 08:24:44.496077148 -0400
@@ -81,6 +81,9 @@
 #define HANDLEMAP_HASHTABLE_ERROR 7
 #define HANDLEMAP_EXISTS         8
 
+/* Maximum file handle length */
+#define PROXYV4_HANDLE_MAXLEN NFS4_FHSIZE+2
+
 int HandleMap_Init(const handle_map_param_t *p_param);
 
 int HandleMap_GetFH(const nfs23_map_handle_t *, struct gsh_buffdesc *);
==
 
No extensive testing done, yet, but I'm happy to submit a pull request, if this makes sense?

Or maybe there's a better way?

Or maybe there's still a better way??
 
-Nick