nfs-ganesha in a cluster with IP failover
by alxg@weka.io
Hi,
We're setting up ganesha using FSAL_VFS and NFSv3 only. We have a cluster of servers with floating IPs that are failed over between them when a host dies.
The question we have is about writes which are UNSTABLE - that is, the client flushes from its cache with UNSTABLE writes and then sends a COMMIT. If a ganesha host crashes after receiving the write-outs but before the COMMIT, is it assured that when the floating IP fails over to another host, the client will resend the unstable writes?
I guess this has several parts to it:
* Client side - assuming a Linux kernel NFS client using the pagecache: does it mark pages clean right after successfully sending them as UNSTABLE writes, or only after it also successfully COMMITs them? Do you happen to know?
* Server side - does nfs-ganesha have a data cache of its own? Or does an unstable write simply land in the server's pagecache and stay there until the COMMIT does a sync? In that case, do we need to set `NFS_Commit = true`?
What are the guidelines for such a setup to ensure data is neither lost nor corrupted?
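For what it's worth, here is my current mental model of the client side as a toy sketch (NOT the kernel's actual code, just a simulation of why the COMMIT write verifier matters): the client may mark pages clean after an UNSTABLE write, but it must keep them re-sendable until a COMMIT returns the same verifier; if the verifier changes (e.g. a failover to a node with a different boot verifier), it must rewrite the data.

```python
# Toy model (NOT kernel code) of NFSv3 unstable-write tracking.
# Pages may be marked "clean" after an UNSTABLE write, but the client
# keeps them re-sendable until COMMIT returns a matching write verifier.

class ToyClient:
    def __init__(self):
        self.uncommitted = {}   # offset -> (data, verifier seen at write time)

    def write_unstable(self, server, offset, data):
        verf = server.write_unstable(offset, data)
        # Page can be marked clean, but remember it until COMMIT verifies.
        self.uncommitted[offset] = (data, verf)

    def commit(self, server):
        verf = server.commit()
        resent = 0
        for offset, (data, sent_verf) in list(self.uncommitted.items()):
            if verf != sent_verf:
                # Verifier changed: server lost volatile state (crash or
                # failover), so the client must resend, this time stably.
                server.write_stable(offset, data)
                resent += 1
            del self.uncommitted[offset]
        return resent

class ToyServer:
    def __init__(self, boot_verifier):
        self.verf = boot_verifier
        self.volatile = {}      # unstable data, lost on "crash"
        self.disk = {}

    def write_unstable(self, offset, data):
        self.volatile[offset] = data
        return self.verf

    def write_stable(self, offset, data):
        self.disk[offset] = data

    def commit(self):
        self.disk.update(self.volatile)
        self.volatile.clear()
        return self.verf

    def crash_and_failover(self, new_verifier):
        # Failover node: new boot verifier, volatile data gone.
        self.volatile.clear()
        self.verf = new_verifier

client, server = ToyClient(), ToyServer(boot_verifier=1)
client.write_unstable(server, 0, b"hello")
server.crash_and_failover(new_verifier=2)
resent = client.commit(server)
print(resent, server.disk)   # -> 1 {0: b'hello'}
```

Whether a floating-IP failover actually presents a changed verifier to the client depends on how each ganesha node generates its write verifier - that is the part I'd like to confirm for our setup.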
Thank you,
Alex
4 years, 8 months
Unable to delete last ACE entry of an ACL on FSAL_VFS with ENABLE_VFS_DEBUG_ACL as ON
by subhash.arya@seagate.com
Hello all,
We are trying to use the ACL functionality with FSAL_VFS and are seeing the following issue while deleting ACE entries.
1. Get/List file ACL
-bash-4.2$ nfs4_getfacl abc
# file: abc
A::dev1:rxtTncy
A::sub:rxtTncy
2. Delete specified ACE entry from the ACL and check
-bash-4.2$ nfs4_setfacl -x A::dev1:RXT abc
-bash-4.2$ nfs4_getfacl abc
# file: abc
A::sub:rxtTncy
3. Delete the last ACE
bash-4.2$ nfs4_setfacl -x A::sub:RXT abc
4. Check whether the last ACE was deleted - it was not.
-bash-4.2$ nfs4_getfacl abc
# file: abc
A::sub:rxtTncy
Despite deleting the last ACE, nfs4_getfacl still shows it. Could anyone help and let us know if we are missing something here?
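In case it's useful, here is how I plan to check the server side - this assumes ENABLE_VFS_DEBUG_ACL stores the ACL in an extended attribute on the backing file (an assumption on my part), and the export path is illustrative:

```
# On the ganesha server, dump all xattrs on the backing file before and
# after the nfs4_setfacl -x, to see whether the last ACE survives on disk
# or is only being served from a cached copy:
getfattr -d -m '.*' -e hex /export/path/abc
```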
Thanks,
Subhash
4 years, 8 months
Re: consistently reproducible problem with kvm guest booting root filesystem via nfs-ganesha 2.8 proxy fsal
by Todd Pfaff
Daniel,
Grasping at straws, I've tried both UDP and TCP mounts, and NFS v3 and v4,
and some other tweaks as I peruse the various ganesha*config man pages and
wonder to myself. You know, banging on everything, hoping I get lucky,
before I roll up my sleeves and do it the right but difficult way with
packet captures. :)
I am using TCP for NFS by default everywhere and have been for years.
The change to UDP was just for that one test and when it didn't change
anything for the better I have since reverted to TCP.
I'll repeat the test with TCP and NFS v3 again and see if it generates the
same logged error.
Todd
On Thu, 12 Mar 2020, Daniel Gryniewicz wrote:
> 11 is EAGAIN, which means the socket would have blocked. This is
> strange, as 2.8 does blocking writes, so we should never get EAGAIN from
> a write. That said, why are you using UDP mounts? TCP mounts are
> generally considered to be better in every way (and are the default), so
> you might want to try that as a workaround.
>
>
> Daniel
>
> On 3/11/20 11:17 PM, Todd Pfaff wrote:
>> Daniel,
>>
>> Thanks for the guidance about logging. I haven't done packet captures
>> yet but this is what I'm seeing logged around the time of the error
>> condition:
>>
>> 11/03/2020 21:51:34 : epoch 5e6994c3 : nfs-server-hostname :
>> ganesha.nfsd-168948[svc_18] rpc :TIRPC :F_DBG :svc_dg_reply:
>> 0x7f35640008c0 fd 13 sendmsg failed (will set dead)
>>
>> 11/03/2020 21:51:34 : epoch 5e6994c3 : nfs-server-hostname :
>> ganesha.nfsd-168948[svc_18] complete_request :DISP :DEBUG :NFS
>> DISPATCHER: FAILURE: Error while calling svc_sendreply on a new
>> request.
>> rpcxid=3984287200 socket=13 function:nfs3_read
>> client:::ffff:10.1.2.3 program:100003 nfs version:3 proc:6 errno: 11
>>
>>
>> and from the client's perspective nothing can be done with the nfs
>> mount until the nfs-ganesha server is restarted.
>>
>> Do you have any suggestions for a recommended tcpdump command line to
>> capture all useful traffic?
>>
>> Thanks,
>> Todd
>>
>
>
4 years, 9 months
Re: consistently reproducible problem with kvm guest booting root filesystem via nfs-ganesha 2.8 proxy fsal
by Todd Pfaff
Daniel,
Thanks for the guidance about logging. I haven't done packet captures yet
but this is what I'm seeing logged around the time of the error condition:
11/03/2020 21:51:34 : epoch 5e6994c3 : nfs-server-hostname :
ganesha.nfsd-168948[svc_18] rpc :TIRPC :F_DBG :svc_dg_reply:
0x7f35640008c0 fd 13 sendmsg failed (will set dead)
11/03/2020 21:51:34 : epoch 5e6994c3 : nfs-server-hostname :
ganesha.nfsd-168948[svc_18] complete_request :DISP :DEBUG :NFS
DISPATCHER: FAILURE: Error while calling svc_sendreply on a new request.
rpcxid=3984287200 socket=13 function:nfs3_read
client:::ffff:10.1.2.3 program:100003 nfs version:3 proc:6 errno: 11
and from the client's perspective nothing can be done with the nfs mount
until the nfs-ganesha server is restarted.
Do you have any suggestions for a recommended tcpdump command line to
capture all useful traffic?
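For concreteness, here is what I was planning to run - please correct me if there's a better filter (capturing on `any` and including MNT port 20048 are assumptions about my v3 setup):

```
# Capture full packets (-s 0) so the RPC payloads can be decoded
# later in wireshark; filter to the NFS and MNT ports.
tcpdump -i any -s 0 -w ganesha.pcap 'port 2049 or port 20048'
```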
Thanks,
Todd
4 years, 9 months
Re: consistently reproducible problem with kvm guest booting root filesystem via nfs-ganesha 2.8 proxy fsal
by Todd Pfaff
On Tue, 10 Mar 2020, Daniel Gryniewicz wrote:
> Maybe turn on logging on Ganesha and look for anything interesting?
I've tried that by changing "-N NIV_CRIT" to "-N NIV_DEBUG" in
/etc/sysconfig/ganesha - is there a better way? - but this did not reveal
anything interesting to my eyes. I'm quite willing to accept that my eyes
may have missed something.
Something that I neglected to mention previously is that sometimes
(maybe always, not sure yet) when the error condition occurs in the kvm
guest, the kvm host's nfs-ganesha proxy mount has gone into a faulty state
and requires an unmount and remount to recover.
> Also, packet dumps would be useful.
I'll consider this. I may first need to set up a proper test
environment, unless I can figure out (or someone can tell me :) how best
to focus on the useful packets without being overwhelmed by extraneous data.
> I haven't personally stored KVM
> images on NFS, let alone via Proxy, so I don't have any personal
> experience in this.
All of our many kvm images are on NFS and have been for years so I do have
plenty of personal experience with this, but having them on nfs via
ganesha is a new experiment for us. We do have many terabytes of data
filesystems served via nfs-ganesha proxy fsal though, with generally good
results.
>
> What's the OS/version of the original NFS server that Ganesha's proxying?
Also CentOS 7. Everything in our mix is CO6 or CO7.
Thanks,
Todd
>
> Daniel
>
> On 3/9/20 5:46 PM, Todd Pfaff wrote:
>> I have a reproducible problem when trying to boot a kvm guest whose
>> root filesystem is on an nfs path that is accessed from the kvm virt
>> host via nfs-ganesha-2.8 and the proxy fsal.
>>
>> All hosts involved: kvm host, kvm guest, and nfs-ganesha server, run
>> CentOS 7.
>>
>> The guest begins to boot but consistently fails just after the switch
>> root with ext4 errors like this:
>>
>> [ OK ] Started Plymouth switch root service.
>> Starting Switch Root...
>> [ 15.154967] systemd-journald[145]: Received SIGTERM from PID 1
>> (systemd).
>> [ 15.168447] EXT4-fs error (device vda1): __ext4_get_inode_loc:4247:
>> inode #131073: block 524320: comm systemd: unable to read itable block
>> [ 15.269495] EXT4-fs warning (device vda1):
>> __ext4_read_dirblock:676: error reading directory block (ino 524296,
>> block 0)
>> [ 15.278651] systemd[1]: Failed to execute /sbin/init, giving up:
>> Input/output error
>>
>>
>> I'm using nfs-ganesha from the CentOS 7 Storage SIG.
>>
>> I've tested with both the latest stable 2.8 version from repo
>> [centos-nfs-ganesha28]:
>>
>> nfs-ganesha-2.8.3-3.el7.x86_64
>>
>> and the latest test version from [centos-nfs-ganesha28-test]:
>>
>> nfs-ganesha-2.8.3-4.el7.x86_64
>>
>> with similar results. I've also tried with nfs v3 and nfs v4.1 mounts
>> from the kvm host.
>>
>>
>> The kvm guest disk is defined as:
>>
>> <disk type='file' device='disk'>
>> <driver name='qemu' type='qcow2' cache='none'/>
>> <source file='/mnt/ganesha/kvm/guest.qcow2'/>
>> <target dev='vdc' bus='virtio'/>
>> <boot order='1'/>
>> <address type='pci' domain='0x0000' bus='0x00' slot='0x08'
>> function='0x0'/>
>> </disk>
>>
>>
>> The kvm guest works fine if the kvm host accesses this qcow2 image via
>> a direct nfs mount instead of via nfs-ganesha proxy.
>>
>> Thoughts?
>>
>> Thanks,
>> Todd
>> _______________________________________________
>> Support mailing list -- support(a)lists.nfs-ganesha.org
>> To unsubscribe send an email to support-leave(a)lists.nfs-ganesha.org
4 years, 9 months
consistently reproducible problem with kvm guest booting root filesystem via nfs-ganesha 2.8 proxy fsal
by Todd Pfaff
I have a reproducible problem when trying to boot a kvm guest whose root
filesystem is on an nfs path that is accessed from the kvm virt host via
nfs-ganesha-2.8 and the proxy fsal.
All hosts involved: kvm host, kvm guest, and nfs-ganesha server, run
CentOS 7.
The guest begins to boot but consistently fails just after the switch root
with ext4 errors like this:
[ OK ] Started Plymouth switch root service.
Starting Switch Root...
[ 15.154967] systemd-journald[145]: Received SIGTERM from PID 1
(systemd).
[ 15.168447] EXT4-fs error (device vda1): __ext4_get_inode_loc:4247:
inode #131073: block 524320: comm systemd: unable to read itable block
[ 15.269495] EXT4-fs warning (device vda1): __ext4_read_dirblock:676:
error reading directory block (ino 524296, block 0)
[ 15.278651] systemd[1]: Failed to execute /sbin/init, giving up:
Input/output error
I'm using nfs-ganesha from the CentOS 7 Storage SIG.
I've tested with both the latest stable 2.8 version from repo
[centos-nfs-ganesha28]:
nfs-ganesha-2.8.3-3.el7.x86_64
and the latest test version from [centos-nfs-ganesha28-test]:
nfs-ganesha-2.8.3-4.el7.x86_64
with similar results. I've also tried with nfs v3 and nfs v4.1 mounts
from the kvm host.
The kvm guest disk is defined as:
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2' cache='none'/>
<source file='/mnt/ganesha/kvm/guest.qcow2'/>
<target dev='vdc' bus='virtio'/>
<boot order='1'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x08'
function='0x0'/>
</disk>
The kvm guest works fine if the kvm host accesses this qcow2 image via a
direct nfs mount instead of via nfs-ganesha proxy.
Thoughts?
Thanks,
Todd
4 years, 9 months
ganesha_stats client_all_ops cannot find IP address
by Wyllys Ingersoll
I'm running ganesha 3.2 and I've enabled "all" stats using ganesha_stats, but whenever I try to display the stats it always fails, saying that the IP address is not found. Is this a known issue, or am I using it incorrectly? I've made sure to mount from a v4 client and generate activity on the client.
```
# ganesha_stats client_all_ops ::ffff:10.15.15.2
GANESHA RESPONSE STATUS: Client IP address not found
```
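For reference, my next step is to list the client addresses Ganesha itself reports and paste one back verbatim - my understanding (worth verifying against your build's ganesha_stats help output) is that a list_clients subcommand exists:

```
# List the clients Ganesha is tracking, then reuse the printed
# address string exactly as shown:
ganesha_stats list_clients
ganesha_stats client_all_ops <address-as-printed>
```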
4 years, 9 months
Re: Mounting NFS Ganesha shares on Win 10 and Mac OS
by Frank Filz
Tom,
I was trying to find the instructions I had used a while ago for the Windows NFS client, but from what I'm finding in searching today, you may need to set up some additional registry keys in:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\ClientForNFS\CurrentVersion\Default
Add a new DWORD (32-bit) value AnonymousUid with the uid you want the client to use,
and a new DWORD (32-bit) value AnonymousGid with the gid you want the client to use.
Otherwise Windows uses -2,-2.
This is taken from this web page:
https://graspingtech.com/mount-nfs-share-windows-10/
I think the reason I didn't need to do that is I did the Windows stuff in an exported directory that had 0777 permissions.
If you ARE using AD, you can set the uid and gid in AD.
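For reference, those keys as a .reg file sketch - the dword value 0x3e8 (decimal 1000) is just a placeholder uid/gid, not a recommendation, and a restart of the NFS client (or a reboot) is typically needed afterwards:

```
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\ClientForNFS\CurrentVersion\Default]
; 0x3e8 = 1000; substitute the uid/gid you actually want
"AnonymousUid"=dword:000003e8
"AnonymousGid"=dword:000003e8
```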
Frank
> -----Original Message-----
> From: TomK [mailto:tomkcpr@mdevsys.com]
> Sent: Wednesday, March 4, 2020 3:43 PM
> To: support(a)lists.nfs-ganesha.org; Frank Filz <ffilzlnx(a)mindspring.com>; Kaleb
> S. KEITHLEY <kkeithle(a)redhat.com>
> Subject: Mounting NFS Ganesha shares on Win 10 and Mac OS
>
> Hey All,
>
> Trying to find the best way to mount NFS Ganesha presented NFS paths on both
> Win 10 and Mac OS. I'm partially successful.
>
> Win 10:
>
> This laptop is not on AD. But I'm trying to mount the NFS share and access it via
> the AD user tom(a)mds.xyz. I can mount it this way:
>
> C:\Users\tom>mount -o nolock \\192.168.0.125\n M:
> M: is now successfully connected to \\192.168.0.125\n
>
> The command completed successfully.
>
> C:\Users\tom>
>
>
> However when I try to access /n/mds.xyz/tom, the home folder of an AD user, I of
> course get Permission denied. How do I specify an AD user from a client not on
> AD?
>
>
> Mac OS X (~2013) macOS High Sierra, 10.13.6:
>
> When trying to mount using:
>
> mount -t nfs -o resvport,rw 192.168.0.125:/n nfs
>
> I get a:
>
> mount_nfs can't mount /n from 192.168.0.125 onto /private/tmp/nfs:
> Permission denied
>
> I tested this to see where it's coming from and saw that it's server generated. I
> turned off the nfs ganesha server and got a different error. Checking the logs on
> nfs-ganesha, I can see the client did reach the nfs-ganesha server correctly. Just
> got permission denied.
>
>
> [ Ideal Situation ]
>
> How do I mount the NFS share, ideally using my AD user, from this Win 10 laptop
> and Mac OS X laptop, neither of which is on AD? I have a couple dozen
> machines and all work correctly with the home folders on this NFS share. NFS
> home folders between these kerberized machines work well using the same NFS
> Ganesha.
>
> --
> Thx,
> TK.
>
>
>
>
> [root@nfs03 ganesha]# cat /etc/ganesha/ganesha.conf
> ###################################################
> #
> # EXPORT
> #
> # To function, all that is required is an EXPORT
> #
> # Define the absolute minimal export
> #
> ###################################################
>
>
> # logging directives--be careful
> LOG {
> # Default_Log_Level is unknown token??
> # Default_Log_Level = NIV_FULL_DEBUG;
> Components {
> # ALL = FULL_DEBUG;
> MEMLEAKS = FATAL;
> FSAL = DEBUG;
> NFSPROTO = FATAL;
> NFS_V4 = FULL_DEBUG;
> EXPORT = DEBUG;
> FILEHANDLE = FATAL;
> DISPATCH = DEBUG;
> CACHE_INODE = FULL_DEBUG;
> CACHE_INODE_LRU = FATAL;
> HASHTABLE = FATAL;
> HASHTABLE_CACHE = FATAL;
> DUPREQ = FATAL;
> INIT = DEBUG;
> MAIN = FATAL;
> IDMAPPER = FULL_DEBUG;
> NFS_READDIR = FULL_DEBUG;
> NFS_V4_LOCK = FULL_DEBUG;
> CONFIG = FULL_DEBUG;
> CLIENTID = FULL_DEBUG;
> SESSIONS = FATAL;
> PNFS = FATAL;
> RW_LOCK = FATAL;
> NLM = FATAL;
> RPC = FULL_DEBUG;
> NFS_CB = FATAL;
> THREAD = FATAL;
> NFS_V4_ACL = FULL_DEBUG;
> STATE = FULL_DEBUG;
> # 9P = FATAL;
> # 9P_DISPATCH = FATAL;
> FSAL_UP = FATAL;
> DBUS = FATAL;
> }
>
> Facility {
> name = FILE;
> destination = "/var/log/ganesha/ganesha-rgw.log";
> enable = active;
> }
> }
>
> NFSv4 {
> Lease_Lifetime = 20 ;
> IdmapConf = "/etc/idmapd.conf" ;
> DomainName = "nix.mds.xyz" ;
> }
>
>
> NFS_KRB5 {
> PrincipalName = "nfs/nfs03.nix.mds.xyz(a)NIX.MDS.XYZ" ;
> KeytabPath = /etc/krb5.keytab ;
> Active_krb5 = YES ;
> }
>
> NFS_Core_Param {
> Bind_addr = 192.168.0.125;
> NFS_Port = 2049;
> MNT_Port = 20048;
> NLM_Port = 38468;
> Rquota_Port = 4501;
> }
>
> %include "/etc/ganesha/export.conf"
> # %include "/etc/ganesha/export-home.conf"
> [root@nfs03 ganesha]# cat /etc/ganesha/export.conf
> EXPORT {
> Export_Id = 1 ; # Export ID
> unique to each export
> Path = "/n"; # Path of the
> volume to be exported. Eg: "/test_volume"
>
> FSAL {
> name = GLUSTER;
> hostname = "nfs03.nix.mds.xyz"; # IP of one
> of the nodes in the trusted pool
> volume = "gv01"; # Volume
> name. Eg: "test_volume"
> }
>
> Access_type = RW; # Access permissions
> Squash = No_root_squash; # To
> enable/disable root squashing
> Disable_ACL = FALSE; # To
> enable/disable ACL
> Pseudo = "/n"; # NFSv4 pseudo
> path for this export. Eg: "/test_volume_pseudo"
> Protocols = "3", "4"; # "3", "4" NFS
> protocols supported
> Transports = "UDP", "TCP" ; # "UDP", "TCP"
> Transport protocols supported
> SecType = "sys","krb5","krb5i","krb5p"; #
> "sys","krb5","krb5i","krb5p"; # Security flavors supported
> }
> [root@nfs03 ganesha]#
>
>
>
> Attached logs of a Mac OS X session attempt to mount the NFS share. (above)
>
4 years, 9 months