On Sat, Jun 26, 2021 at 10:14 PM Nick Couchman <nick.e.couchman@gmail.com> wrote:
On Sat, Jun 26, 2021 at 2:04 PM Nick Couchman <nick.e.couchman@gmail.com> wrote:
On Sat, Jun 26, 2021 at 12:34 PM Todd Pfaff <pfaff@rhpcs.mcmaster.ca> wrote:
Nick,

I've been using the Proxy FSAL for the past two years and have had much
better results since moving to the V4.dev branch about a year ago,
particularly after Solomon Boulos' improvements to the Proxy FSAL.  I
don't claim to be using it exactly as you've described but I believe it
may be worth your while to try the V4.dev branch.


Todd,
Thanks for the hints - I'll give it a shot - either see if packages are available or just compile it myself.


No joy after switching to the V4.dev branch - the behavior seems identical. Something in the NFSv3 -> NFSv4 proxy path clearly doesn't work: mounting the proxied NFS share as NFSv4 from a client works fine, but mounting it as NFSv3 results in the "Remote I/O Error" behavior.

Here are the things I've tried so far:
* Both versions 3.5 (Current Stable) and V4 dev (latest release).
* Various combinations of Privileged Port enabled/disabled on both the Export and the Proxy FSAL.
* mount_path_pseudo both enabled and disabled to see if the path translation was causing issues for the V3 client.
* Both enabling and disabling Handle Mapping.

I'm open to any other suggestions that folks have for me, and am happy to provide any further debug output.

Okay, I actually made a little progress in figuring out what's going on here, with the help of tcpdump. There are two issues:

* It looks like the Ganesha NFSv3 server may not support the READDIRPLUS procedure. The tcpdump output includes the following:
22:24:49.212153 IP 10.1.2.3.netrcs > 10.1.2.4.nfs: Flags [P.], seq 648:832, ack 441, win 337, options [nop,nop,TS val 861164029 ecr 2104677449], length 184: NFS request xid 3221395829 180 readdirplus fh Unknown/430003863737072F240000010000000000000001000000000000000380E6C8CC 2048 bytes @ 0
22:24:49.261230 IP 10.1.2.4.nfs > 10.1.2.3.netrcs: Flags [P.], seq 441:561, ack 832, win 428, options [nop,nop,TS val 2104677499 ecr 861164029], length 120: NFS reply xid 3221395829 reply ok 116 readdirplus ERROR: Unspecified error on server

I can get around this issue by adding the "nordirplus" NFS mount option on the client. I then get part of the directory listing, but with lots of errors.

* After getting past the readdirplus issue, I still get errors, and this appears to be due to file handle issues:

22:48:54.686280 IP 10.1.2.4.nfs > 10.1.2.3.phonebook: Flags [P.], seq 4565:4601, ack 5925, win 520, options [nop,nop,TS val 2106122935 ecr 862609458], length 36: NFS reply xid 1689362215 reply ok 32 lookup ERROR: Illegal NFS file handle
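
For anyone who wants to pull error replies like these out of a longer capture, here is a minimal Python sketch. It assumes tcpdump's one-line text output in the format shown above; the names REPLY_ERROR, nfs_errors and summarize are hypothetical helpers of mine, not part of Ganesha or tcpdump, and the regular expression has only been checked against the lines quoted in this message.

```python
import re
from collections import Counter

# Matches the NFS part of a tcpdump text line for a reply that carries an
# error, e.g. "... NFS reply xid 1689362215 reply ok 32 lookup ERROR: ...".
# Assumption: the line layout is the same as in the output quoted above.
REPLY_ERROR = re.compile(
    r"NFS reply xid (?P<xid>\d+) reply \w+ \d+ (?P<proc>\w+) ERROR: (?P<msg>.+)$"
)


def nfs_errors(lines):
    """Yield (xid, procedure, message) for each NFS reply with an error."""
    for line in lines:
        m = REPLY_ERROR.search(line.strip())
        if m:
            yield int(m["xid"]), m["proc"], m["msg"]


def summarize(lines):
    """Count error replies per (procedure, message) pair."""
    return Counter((proc, msg) for _, proc, msg in nfs_errors(lines))


# The three lines quoted in this message: one request and two error replies.
sample = [
    "22:24:49.212153 IP 10.1.2.3.netrcs > 10.1.2.4.nfs: Flags [P.], seq 648:832, ack 441, win 337, options [nop,nop,TS val 861164029 ecr 2104677449], length 184: NFS request xid 3221395829 180 readdirplus fh Unknown/430003863737072F240000010000000000000001000000000000000380E6C8CC 2048 bytes @ 0",
    "22:24:49.261230 IP 10.1.2.4.nfs > 10.1.2.3.netrcs: Flags [P.], seq 441:561, ack 832, win 428, options [nop,nop,TS val 2104677499 ecr 861164029], length 120: NFS reply xid 3221395829 reply ok 116 readdirplus ERROR: Unspecified error on server",
    "22:48:54.686280 IP 10.1.2.4.nfs > 10.1.2.3.phonebook: Flags [P.], seq 4565:4601, ack 5925, win 520, options [nop,nop,TS val 2106122935 ecr 862609458], length 36: NFS reply xid 1689362215 reply ok 32 lookup ERROR: Illegal NFS file handle",
]

if __name__ == "__main__":
    # Print one summary line per distinct (procedure, message) pair.
    for (proc, msg), count in summarize(sample).items():
        print(f"{count:4d}  {proc}: {msg}")
```

To run it against a real capture instead of the sample, pass any iterable of text lines (for example sys.stdin fed from tcpdump's text output) to summarize().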

I figured out that the file handle errors occur because none of the Ganesha builds I've been using were actually compiled with file handle mapping enabled. This includes the version in the CentOS Ganesha SIG (3.5, with the older "Proxy" FSAL) and the version I was building from source via the RPM spec file, which does not enable file handle mapping support in Proxy V4.

* All that said, when I finally do get a V4 Dev build with Handle Mapping enabled, it just dumps core:

Jun 26 23:54:41 nfs-proxy.example.com systemd-coredump[350511]: Process 350496 (ganesha.nfsd) of user 0 dumped core.
                                                                   
                                                                    Stack trace of thread 350496:
                                                                    #0  0x00007f1f7caa6704 digest_alloc (libfsalproxy_v4.so)
                                                                    #1  0x00007f1f7caa70a2 handle_mapping_hash_add (libfsalproxy_v4.so)
                                                                    #2  0x00007f1f7caa75b2 HandleMap_SetFH (libfsalproxy_v4.so)
                                                                    #3  0x00007f1f7caa51df proxyv4_alloc_handle (libfsalproxy_v4.so)
                                                                    #4  0x00007f1f7caa0883 proxyv4_make_object (libfsalproxy_v4.so)
                                                                    #5  0x00007f1f7caa0f01 proxyv4_lookup_impl (libfsalproxy_v4.so)
                                                                    #6  0x00007f1f7caa5465 proxyv4_lookup_path (libfsalproxy_v4.so)
                                                                    #7  0x00007f1f8968334d mdcache_lookup_path (libganesha_nfsd.so.4)
                                                                    #8  0x00007f1f895fe1d7 init_export_root (libganesha_nfsd.so.4)
                                                                    #9  0x00007f1f895fd8e6 init_export_cb (libganesha_nfsd.so.4)
                                                                    #10 0x00007f1f896164c3 foreach_gsh_export (libganesha_nfsd.so.4)
                                                                    #11 0x00007f1f895fd93c exports_pkginit (libganesha_nfsd.so.4)
                                                                    #12 0x00007f1f89582c62 nfs_Init (libganesha_nfsd.so.4)
                                                                    #13 0x00007f1f89583b2d nfs_start (libganesha_nfsd.so.4)
                                                                    #14 0x00000000004029b9 main (ganesha.nfsd)
                                                                    #15 0x00007f1f86e4a493 __libc_start_main (libc.so.6)
                                                                    #16 0x00000000004018fe _start (ganesha.nfsd)

-Nick