On 8/12/19 10:43 PM, Daniel Gryniewicz wrote:
> It's okay, we all have meetings.
> The "compute_readdir_cookie failed" isn't an error; it just indicates
> that your FSAL (Gluster in this case) doesn't support cookie
> computation. Only FSAL_RGW does that right now, so it's not an issue.
> It just means that new dirents become detached, rather than being
> inserted into the correct chunk.
> A clean run with a clean log would be really useful, if you can get it.
> Also, if you could do a packet dump on the run as well, that would also
> be useful, but not strictly necessary.
+1, yes. A tcpdump of both NFS and Gluster traffic (taken on the node
where the nfs-ganesha server is running) may help. Also, can you please
check for any errors in the Gluster brick logs and ganesha-gfapi.log as well?
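
For reference, a minimal sketch of one way to take such a capture on the
ganesha node. The port numbers are assumptions, not something read from
this setup: 2049 for NFS, 111 for rpcbind, 24007 for glusterd, and
49152-49251 as a typical brick port range (the real brick ports are listed
by `gluster volume status`). The function names and the output path are
made up for illustration.

```shell
#!/bin/sh
# Build a pcap filter covering NFS and Gluster traffic.
# $1 (optional): brick port range; the default is an assumed typical range.
capture_filter() {
    printf 'port 2049 or port 111 or port 24007 or portrange %s' "${1:-49152-49251}"
}

# Start the capture (needs root).
# $1 (optional): output file, $2 (optional): brick port range.
start_capture() {
    # -i any: all interfaces; -s 0: full packets; -w: write a raw pcap file.
    tcpdump -i any -s 0 -w "${1:-/tmp/ganesha-gluster.pcap}" "$(capture_filter "$2")"
}
```

Run `start_capture` as root, reproduce the failing `su - erikj` from the
client, then stop the capture with Ctrl-C.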
One difference between nfs-ganesha and Gluster-NFS is that Gluster-NFS
does all the operations as the root user, whereas
nfs-ganesha/fsal-gluster switches to the NFS client user's creds before
sending requests to Gluster.
Also, I noticed in the logs that there are some calls to fetch ACLs even
though you seem to have disabled ACLs using the global option. If you
don't need ACLs, could you disable them in the EXPORT {} block instead
and give that a try?
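
As a rough sketch, the per-export setting would look something like the
block below. The export id, paths, hostname, and volume name are
placeholders, not values from your configuration; only the Disable_ACL
line is the point here.

```
EXPORT {
    Export_Id = 1;              # placeholder
    Path = "/myvol";            # placeholder
    Pseudo = "/myvol";          # placeholder
    Access_Type = RW;
    Disable_ACL = true;         # turn off ACL support for this export

    FSAL {
        Name = GLUSTER;
        Hostname = "localhost"; # placeholder
        Volume = "myvol";       # placeholder
    }
}
```

ganesha needs to be restarted (or the export reloaded) for the change to
take effect.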
Thanks,
Soumya
>
> Daniel
>
> On 8/12/19 1:05 PM, Erik Jacobson wrote:
>> I could not find 'erikj' in my 250M log file (I could send it if that
>> would help, it should compress down well).
>>
>> I'm fully game to stop ganesha, clear the logs, and start ganesha for a
>> run that would hopefully have a smaller log file but capture everything.
>> I'm game to do anything that helps! I can also do a test without overlay
>> if that helps.
>>
>> Going along your idea, I did see some snips above where the failure
>> happened, like this:
>>
>> (Sorry for being tardy; does not represent lack of interest, was caught
>> in meetings).
>>
>> So happy for the help, thank you -
>>
>>
>> [root@leader1 ganesha]# grep home ganesha.log | grep -v oscar
>> 12/08/2019 08:52:54 : epoch 5d516e01 : leader1 :
>> ganesha.nfsd-29407[svc_22] nfs3_lookup :NFS3 :DEBUG :REQUEST
>> PROCESSING: Calling nfs_Lookup handle: File Handle V3: Len=40
>> 4300000a20c81d9f835a974ac8a16529d95910db736659947ae4a3413992521809ab9045b2000000
>> name: home
>> 12/08/2019 08:52:54 : epoch 5d516e01 : leader1 :
>> ganesha.nfsd-29407[svc_22] mdc_lookup :NFS READDIR :F_DBG :NFS
>> READDIR: FULLDEBUG: Lookup home
>> 12/08/2019 08:52:54 : epoch 5d516e01 : leader1 :
>> ganesha.nfsd-29407[svc_22] mdc_try_get_cached :NFS READDIR :F_DBG :NFS
>> READDIR: FULLDEBUG: Look in cache home, trust content yes
>> 12/08/2019 08:52:54 : epoch 5d516e01 : leader1 :
>> ganesha.nfsd-29407[svc_22] mdcache_avl_lookup :NFS READDIR :F_DBG :NFS
>> READDIR: FULLDEBUG: Lookup home
>> 12/08/2019 08:52:54 : epoch 5d516e01 : leader1 :
>> ganesha.nfsd-29407[svc_22] mdcache_avl_lookup :NFS READDIR :F_DBG :NFS
>> READDIR: FULLDEBUG: entry not found home
>> 12/08/2019 08:52:54 : epoch 5d516e01 : leader1 :
>> ganesha.nfsd-29407[svc_22] mdc_try_get_cached :NFS READDIR :F_DBG :NFS
>> READDIR: FULLDEBUG: mdcache_avl_lookup home failed trust negative no
>> 12/08/2019 08:52:54 : epoch 5d516e01 : leader1 :
>> ganesha.nfsd-29407[svc_22] mdc_lookup :NFS READDIR :F_DBG :NFS
>> READDIR: FULLDEBUG: Try again home
>> 12/08/2019 08:52:54 : epoch 5d516e01 : leader1 :
>> ganesha.nfsd-29407[svc_22] mdc_try_get_cached :NFS READDIR :F_DBG :NFS
>> READDIR: FULLDEBUG: Look in cache home, trust content yes
>> 12/08/2019 08:52:54 : epoch 5d516e01 : leader1 :
>> ganesha.nfsd-29407[svc_22] mdcache_avl_lookup :NFS READDIR :F_DBG :NFS
>> READDIR: FULLDEBUG: Lookup home
>> 12/08/2019 08:52:54 : epoch 5d516e01 : leader1 :
>> ganesha.nfsd-29407[svc_22] mdcache_avl_lookup :NFS READDIR :F_DBG :NFS
>> READDIR: FULLDEBUG: entry not found home
>> 12/08/2019 08:52:54 : epoch 5d516e01 : leader1 :
>> ganesha.nfsd-29407[svc_22] mdc_try_get_cached :NFS READDIR :F_DBG :NFS
>> READDIR: FULLDEBUG: mdcache_avl_lookup home failed trust negative no
>> 12/08/2019 08:52:54 : epoch 5d516e01 : leader1 :
>> ganesha.nfsd-29407[svc_22] mdc_lookup :NFS READDIR :DEBUG :NFS
>> READDIR: DEBUG: Cache Miss detected for home
>> 12/08/2019 08:52:54 : epoch 5d516e01 : leader1 :
>> ganesha.nfsd-29407[svc_22] mdcache_alloc_and_check_handle :INODE
>> :F_DBG :lookup Created entry 0x7fd220037460 FSAL GLUSTER for home
>> 12/08/2019 08:52:54 : epoch 5d516e01 : leader1 :
>> ganesha.nfsd-29407[svc_22] mdcache_dirent_add :INODE :F_DBG :Add dir
>> entry home
>> 12/08/2019 08:52:54 : epoch 5d516e01 : leader1 :
>> ganesha.nfsd-29407[svc_22] mdcache_avl_insert :NFS READDIR :F_DBG :NFS
>> READDIR: FULLDEBUG: Insert dir entry 0x7fd220033f90 home
>> 12/08/2019 08:52:54 : epoch 5d516e01 : leader1 :
>> ganesha.nfsd-29407[svc_22] mdcache_avl_insert :NFS READDIR :F_DBG :NFS
>> READDIR: FULLDEBUG: Inserted dirent home with ckey hk=63dd25586f8d2412
>> fsal=0x7fd3a722f5e0
>> key=0xc81d9f835a974ac8a16529d95910db738e97d75725094c3db72d3497b5ac887f
>> 12/08/2019 08:52:54 : epoch 5d516e01 : leader1 :
>> ganesha.nfsd-29407[svc_22] place_new_dirent :NFS READDIR :F_DBG :NFS
>> READDIR: FULLDEBUG: Could not add home to chunk for directory
>> 0x7fd2340056d0, compute_readdir_cookie failed
>>
>>
>> On Mon, Aug 12, 2019 at 11:39:28AM -0400, Daniel Gryniewicz wrote:
>>> On 8/12/19 10:00 AM, Erik Jacobson wrote:
>>>>> Okay, could you get a FULL_DEBUG log of the issue happening and
>>>>> post it?
>>>>
>>>> I am happy to get you snips from the various points if that's
>>>> interesting - like when the NFS mount happens, when the overlay
>>>> happens.
>>>>
>>>> What I pasted below is the output in FULL_DEBUG for all components when
>>>> I issued the failing su command. First the output of the command, then
>>>> the log snip that came out at the time the command below failed. I ran
>>>> this from our "miniroot" environment, chroot'd in to the target root.
>>>> That takes out the complexity of systemd and a full bootup. 'erikj' is
>>>> an account I created in the image so it exists in the nfs underdir.
>>>> Would you like output from the same command working when 'overlay' is
>>>> not in the picture? I'm so grateful for your help. I'll try to learn
>>>> about this output but it's overwhelming to me at first glance. :)
>>>>
>>>>
>>>> su: failsh-4.2# su - erikj
>>>> Last login: Mon Aug 12 08:53:07 CDT 2019
>>>> su: warning: cannot change directory to /home/erikj: Operation not
>>>> supported
>>>> su: failed to execute /bin/bash: Operation not supported
>>>>
>>>
>>> So, everything in that log snippet succeeded, except a lookup on /root,
>>> which returned ENOENT. In particular, there's no attempt to access /home
>>> or /home/erikj in that snippet. Can you look for the first occurrence of
>>> erikj in the log, and send a snippet around that? You're looking for a
>>> line like this, but with "erikj" instead of "mail":
>>>
>>> 12/08/2019 08:54:45 : epoch 5d516e01 : leader1 :
>>> ganesha.nfsd-29407[svc_14] nfs3_lookup :NFS3 :DEBUG :REQUEST
>>> PROCESSING: Calling nfs_Lookup handle: File Handle V3: Len=40
>>> 4300000a20c81d9f835a974ac8a16529d95910db73b6306d8b535a493c89ec7727316bc227000000
>>> name: mail
>>>
>>> Daniel
>>>
>>
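
For anyone following along, here is a small sketch of how the "first
occurrence plus a snippet around it" search requested above could be
scripted. It assumes each nfs3_lookup entry sits on a single line ending
in "name: <NAME>" in the actual ganesha.log (the wrapping in this thread
comes from the mail client), and that grep supports -m. The function name
first_lookup is made up for illustration.

```shell
#!/bin/sh
# Usage: first_lookup LOGFILE NAME [CONTEXT]
# Print CONTEXT lines (default 50) either side of the first nfs_Lookup
# request for NAME in LOGFILE.
first_lookup() {
    log=$1 name=$2 ctx=${3:-50}
    # -m1 stops at the first match; -n prefixes it with its line number.
    hit=$(grep -n -m1 "nfs_Lookup handle:.* name: ${name}\$" "$log" | cut -d: -f1)
    if [ -z "$hit" ]; then
        echo "no nfs_Lookup for '$name' in $log" >&2
        return 1
    fi
    start=$((hit - ctx))
    [ "$start" -lt 1 ] && start=1
    # Print the window around the hit; sed stops quietly at end of file.
    sed -n "${start},$((hit + ctx))p" "$log"
}
```

For example, `first_lookup ganesha.log erikj 100 > erikj-snippet.log`
would produce a snippet small enough to post to the list.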