April 2021 - Support - Nfs Ganesha List Archives

ESXI 6.7 client creating Thick Eager zeroed vmdk files using ceph fsal

by Robert Toole

Hi, I have a 3 node Ceph octopus 15.2.7 cluster running on fully up to date Centos 7 with nfs-ganesha 3.5. After following the Ceph install guide https://docs.ceph.com/en/octopus/cephadm/install/#deploying-nfs-ganesha I am able to create a NFS 4.1 Datastore in vmware using the ip address of all three nodes. Everything appears to work OK.. The issue however is that for some reason esxi is creating thick provisioned eager zeroed disks instead of thin provisioned disks on this datastore, whether I am migrating, cloning, or creating new vms. Even running vmkfstools -i disk.vmdk -d thin thin_disk.vmdk still results in a thick eager zeroed vmdk file. This should not be possible on an NFS datastore, because vmware requires a VAAI NAS plugin to accomplish thick provisioning over NFS before it can thick provision disks. Linux clients to the same datastore can create thin qcow2 images, and when looking at the images created by esxi from the linux hosts you can see that the vmdks are indeed thick: ls -lsh total 81G 512 -rw-r--r--. 1 root root 230 Mar 25 15:17 test_vm-2221e939.hlog 40G -rw-------. 1 root root 40G Mar 25 15:17 test_vm-flat.vmdk 40G -rw-------. 1 root root 40G Mar 25 15:56 test_vm_thin-flat.vmdk 512 -rw-------. 1 root root 501 Mar 25 15:57 test_vm_thin.vmdk 512 -rw-------. 1 root root 473 Mar 25 15:17 test_vm.vmdk 0 -rw-r--r--. 1 root root 0 Jan 6 1970 test_vm.vmsd 2.0K -rwxr-xr-x. 1 root root 2.0K Mar 25 15:17 test_vm.vmx but the qcow2 files from the linux hosts are thin as one would expect: qemu-img create -f qcow2 big_disk_2.img 500G ls -lsh total 401K 200K -rw-r--r--. 1 root root 200K Mar 25 15:47 big_disk_2.img 200K -rw-r--r--. 1 root root 200K Mar 25 15:44 big_disk.img 512 drwxr-xr-x. 2 root root 81G Mar 25 15:57 test_vm These ls -lsh results are the same from esx, linux nfs clients and from cephfs kernel client. What is happening here? Are there undocumented VAAI features in nfs-ganesha with the cephfs fsal ? If so, how do I turn them off ? I want thin provisioned disks. ceph nfs export ls dev-nfs-cluster --detailed [ { "export_id": 1, "path": "/Development-Datastore", "cluster_id": "dev-nfs-cluster", "pseudo": "/Development-Datastore", "access_type": "RW", "squash": "no_root_squash", "security_label": true, "protocols": [ 4 ], "transports": [ "TCP" ], "fsal": { "name": "CEPH", "user_id": "dev-nfs-cluster1", "fs_name": "dev_cephfs_vol", "sec_label_xattr": "" }, "clients": [] } ] rpm -qa | grep ganesha nfs-ganesha-ceph-3.5-1.el7.x86_64 nfs-ganesha-rados-grace-3.5-1.el7.x86_64 nfs-ganesha-rados-urls-3.5-1.el7.x86_64 nfs-ganesha-3.5-1.el7.x86_64 centos-release-nfs-ganesha30-1.0-2.el7.centos.noarch rpm -qa | grep ceph python3-cephfs-15.2.7-0.el7.x86_64 nfs-ganesha-ceph-3.5-1.el7.x86_64 python3-ceph-argparse-15.2.7-0.el7.x86_64 python3-ceph-common-15.2.7-0.el7.x86_64 cephadm-15.2.7-0.el7.x86_64 libcephfs2-15.2.7-0.el7.x86_64 ceph-common-15.2.7-0.el7.x86_64 ceph -v ceph version 15.2.7 (<ceph_uuid>) octopus (stable) The ceph cluster is healthy using bluestore on raw 3.84TB sata 7200 rpm disks. -- Robert Toole rtoole(a)tooleweb.ca 403 368 5680

11 months, 1 week

4
5
0 / 0

[NFS GANESHA] Recommended Values for eviction timeout

by lokendrarathour＠gmail.com

Hi, We have setup NFS Ganesha as Active/Active and have mounted the cephfs using the VIP from the VM. On top of it we are testing IP failover/and GANESHA SWITCH-OVER. To manage this fail over in minimum time(for I/O to resume) we have found three variable in the ceph system as : 1.session_timeout (min allowed value is 30) - default -60 2.session_autoclose(min allowed value is 30) - default -300 3.mds_cap_revoke_eviction_timeout mds_cap_revoke_eviction_timeout is disabled by default and work at priority if enabled or configured as something above 0. we have tested by setting up mds_cap_revoke_eviction_timeout as 1 and have achieved the I/O resume duration as 7-9 seconds, but we are unsure of 1 seconds and its impact on production. because 1 seconds or few seconds can also be in class of network latency or maybe something other than NFS Server fails. Query: we need to know certain recommended values for the same variables as marked above. Note: we are not using this any container kind of setup to bring the down NFS server immediately, we are just switching from one node to another as we detect failure on one of the node. -Best Regards, Lokendra

4 years, 7 months

3
2
0 / 0

NFS GANESHA Code Compilation on CENTOS 8

by lokendrarathour＠gmail.com

Hi Team, we are facing issue in compiling NFS GANESHA CODE on CENTOS 8: Wiki link for reference: https://github.com/nfs-ganesha/nfs-ganesha/wiki/Compiling we are facing packages installation issues as mentioned below: 1.libgssglue-devel 2.nfs-utils-lib-devel 3. doxygen We understood that these packages are in support with Centos 7 and are not available in Centos 8. So do we have any document/wiki to get the same NFS Ganesha compiled on CENTOS-8. Best Regards, Lokendra

4 years, 8 months

2
3
1 / 0

NFS Ganesha Active/Passive HA Failover Issue

by lokendrarathour＠gmail.com

Hello, In our environment of Ceph Cluster(version 15.2.7) we are trying to use NFS HA Mode.Facing certain issues in the same as below: "Active/Passive HA NFS Cluster" When we are using Active/Passive HA Config for NFS Server using Corosync/Pacemekar: 1. configuration is done and we are able to perform fail-over, but when an active node is tested with power-off two scenarios are observed: 1.1 : I/O operations gets stuck until the node is powered on although the handover from active to other standby node happens immediately once the node is powered-off. All the existing requests are stuck. 1.2 : from other client if we try to check for the heartbeat of mount-point, it is also stuck for the same duration. 1.3 from the new client creating a new mount to the same subvolume works fine. Issues/Concern: I/O operations should resume just after the Failover happens.We are not able to achive this state, Can anyone please help in supporting any known configuration/solution/work-around that can be done done at NFS-Ganesha level to achieve healthy NFS HA Mode. Just a Note: mount points using Ceph's native FS driver works fine in the same shutdown/poweroff scenarios. Ceph version: 15.2.7 NFS Ganesha : 3.3 Ganesha Conf: - NFS Node 1: [ansible@cephnode2 ~]$ cat /etc/ganesha/ganesha.conf # Please do not change this file directly since it is managed by Ansible and will be overwritten NFS_Core_Param { Enable_NLM = false; Enable_RQUOTA = false; Protocols = 3,4; } EXPORT_DEFAULTS { Attr_Expiration_Time = 0; } CACHEINODE { Dir_Chunk = 0; NParts = 1; Cache_Size = 1; } RADOS_URLS { ceph_conf = '/etc/ceph/ceph.conf'; userid = "admin"; watch_url = "rados://nfs_ganesha/ganesha-export/conf-cephnode2"; } NFSv4 { RecoveryBackend = 'rados_cluster'; Lease_Lifetime = 10; Grace_Period = 20; } RADOS_KV { ceph_conf = '/etc/ceph/ceph.conf'; userid = "admin"; pool = "nfs_ganesha"; namespace = "ganesha-grace"; nodeid = "cephnode2"; } %url rados://nfs_ganesha/ganesha-export/conf-cephnode2 LOG { Facility { name = FILE; destination = "/var/log/ganesha/ganesha.log"; enable = active; } } EXPORT { Export_id=20235; Path = "/volumes/hns/conf/bb21b7c7-c663-40e9-ad11-a61441e6f77f"; Pseudo = /conf; Access_Type = RW; Protocols = 3,4; Transports = TCP; SecType = sys,krb5,krb5i,krb5p; Squash = No_Root_Squash; Attr_Expiration_Time = 0; FSAL { Name = CEPH; User_Id = "admin"; } } EXPORT { Export_id=20236; Path = "/volumes/hns/opr/138304ca-a70d-4962-9754-b572bce196b6"; Pseudo = /opr; Access_Type = RW; Protocols = 3,4; Transports = TCP; SecType = sys,krb5,krb5i,krb5p; Squash = No_Root_Squash; Attr_Expiration_Time = 0; FSAL { Name = CEPH; User_Id = "admin"; } } # NFS Node 2: [ansible@cephnode3 ~]$ cat /etc/ganesha/ganesha.conf # Please do not change this file directly since it is managed by Ansible and will be overwritten NFS_Core_Param { Enable_NLM = false; Enable_RQUOTA = false; Protocols = 3,4; } EXPORT_DEFAULTS { Attr_Expiration_Time = 0; } CACHEINODE { Dir_Chunk = 0; NParts = 1; Cache_Size = 1; } RADOS_URLS { ceph_conf = '/etc/ceph/ceph.conf'; userid = "admin"; watch_url = "rados://nfs_ganesha/ganesha-export/conf-cephnode3"; } NFSv4 { RecoveryBackend = 'rados_cluster'; Lease_Lifetime = 10; Grace_Period = 20; } RADOS_KV { ceph_conf = '/etc/ceph/ceph.conf'; userid = "admin"; pool = "nfs_ganesha"; namespace = "ganesha-grace"; nodeid = "cephnode3"; } %url rados://nfs_ganesha/ganesha-export/conf-cephnode3 LOG { Facility { name = FILE; destination = "/var/log/ganesha/ganesha.log"; enable = active; } } EXPORT { Export_id=20235; Path = "/volumes/hns/conf/bb21b7c7-c663-40e9-ad11-a61441e6f77f"; Pseudo = /conf; Access_Type = RW; Protocols = 3,4; Transports = TCP; SecType = sys,krb5,krb5i,krb5p; Squash = No_Root_Squash; Attr_Expiration_Time = 0; FSAL { Name = CEPH; User_Id = "admin"; } } EXPORT { Export_id=20236; Path = "/volumes/hns/opr/138304ca-a70d-4962-9754-b572bce196b6"; Pseudo = /opr; Access_Type = RW; Protocols = 3,4; Transports = TCP; SecType = sys,krb5,krb5i,krb5p; Squash = No_Root_Squash; Attr_Expiration_Time = 0; FSAL { Name = CEPH; User_Id = "admin"; } } ## Mount Commands at client side: sudo mount -t nfs -o nfsvers=4.1,proto=tcp 10.0.4.14:/conf /mnt/nfsconf where 10.0.4.14 is the floating IP.

4 years, 8 months

3
12
0 / 0

2025

2024

2023

2022

2021

2020

2019

2018

Support April 2021