ESXi 6.7 client creating thick eager zeroed vmdk files using Ceph FSAL
by Robert Toole
Hi,
I have a 3 node Ceph Octopus 15.2.7 cluster running on fully up-to-date
CentOS 7 with nfs-ganesha 3.5.
After following the Ceph install guide
https://docs.ceph.com/en/octopus/cephadm/install/#deploying-nfs-ganesha
I am able to create an NFS 4.1 datastore in VMware using the IP addresses
of all three nodes. Everything appears to work OK.
The issue, however, is that for some reason ESXi is creating thick
provisioned, eager zeroed disks instead of thin provisioned disks on this
datastore, whether I am migrating, cloning, or creating new VMs. Even
running vmkfstools -i disk.vmdk -d thin thin_disk.vmdk still results in
a thick eager zeroed vmdk file.
This should not be possible on an NFS datastore: VMware requires a
VAAI NAS plugin before it can thick provision disks over NFS.
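For reference, here is how one can check on the ESXi host whether any
VAAI-NAS plugin is installed and what acceleration status the datastore
reports (the nfs41 namespace is assumed here because this is an NFS 4.1
datastore; exact columns vary by build):
# any vendor VAAI-NAS plugin would show up as a VIB
esxcli software vib list | grep -i vaai
# NFS 4.1 datastores report a hardware acceleration status
esxcli storage nfs41 list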
Linux clients mounting the same datastore can create thin qcow2 images,
and looking at the images created by ESXi from the Linux hosts you can
see that the vmdks are indeed thick:
ls -lsh
total 81G
512 -rw-r--r--. 1 root root 230 Mar 25 15:17 test_vm-2221e939.hlog
40G -rw-------. 1 root root 40G Mar 25 15:17 test_vm-flat.vmdk
40G -rw-------. 1 root root 40G Mar 25 15:56 test_vm_thin-flat.vmdk
512 -rw-------. 1 root root 501 Mar 25 15:57 test_vm_thin.vmdk
512 -rw-------. 1 root root 473 Mar 25 15:17 test_vm.vmdk
0 -rw-r--r--. 1 root root 0 Jan 6 1970 test_vm.vmsd
2.0K -rwxr-xr-x. 1 root root 2.0K Mar 25 15:17 test_vm.vmx
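To rule out ls being misleading here, the allocation can be cross-checked
by comparing apparent size with the blocks actually written:
du -h --apparent-size test_vm-flat.vmdk
du -h test_vm-flat.vmdk
stat -c '%s bytes apparent, %b blocks allocated' test_vm-flat.vmdk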
but the qcow2 files from the Linux hosts are thin, as one would expect:
qemu-img create -f qcow2 big_disk_2.img 500G
ls -lsh
total 401K
200K -rw-r--r--. 1 root root 200K Mar 25 15:47 big_disk_2.img
200K -rw-r--r--. 1 root root 200K Mar 25 15:44 big_disk.img
512 drwxr-xr-x. 2 root root 81G Mar 25 15:57 test_vm
These ls -lsh results are the same from ESXi, Linux NFS clients, and the
CephFS kernel client.
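For the qcow2 images, qemu-img itself reports the allocation directly
(virtual size vs. space actually used on disk):
qemu-img info big_disk_2.img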
What is happening here? Are there undocumented VAAI features in
nfs-ganesha with the CephFS FSAL? If so, how do I turn them off? I want
thin provisioned disks.
ceph nfs export ls dev-nfs-cluster --detailed
[
  {
    "export_id": 1,
    "path": "/Development-Datastore",
    "cluster_id": "dev-nfs-cluster",
    "pseudo": "/Development-Datastore",
    "access_type": "RW",
    "squash": "no_root_squash",
    "security_label": true,
    "protocols": [
      4
    ],
    "transports": [
      "TCP"
    ],
    "fsal": {
      "name": "CEPH",
      "user_id": "dev-nfs-cluster1",
      "fs_name": "dev_cephfs_vol",
      "sec_label_xattr": ""
    },
    "clients": []
  }
]
rpm -qa | grep ganesha
nfs-ganesha-ceph-3.5-1.el7.x86_64
nfs-ganesha-rados-grace-3.5-1.el7.x86_64
nfs-ganesha-rados-urls-3.5-1.el7.x86_64
nfs-ganesha-3.5-1.el7.x86_64
centos-release-nfs-ganesha30-1.0-2.el7.centos.noarch
rpm -qa | grep ceph
python3-cephfs-15.2.7-0.el7.x86_64
nfs-ganesha-ceph-3.5-1.el7.x86_64
python3-ceph-argparse-15.2.7-0.el7.x86_64
python3-ceph-common-15.2.7-0.el7.x86_64
cephadm-15.2.7-0.el7.x86_64
libcephfs2-15.2.7-0.el7.x86_64
ceph-common-15.2.7-0.el7.x86_64
ceph -v
ceph version 15.2.7 (<ceph_uuid>) octopus (stable)
The Ceph cluster is healthy, using BlueStore on raw 3.84 TB SATA 7200 RPM
disks.
--
Robert Toole
rtoole(a)tooleweb.ca
403 368 5680
Re: Proper selinux labelling for exported directory?
by Lars Kellogg-Stedman
On Thu, Jul 22, 2021 at 09:46:52AM -0400, Kaleb Keithley wrote:
> Have you installed the nfs-ganesha-selinux rpm?
I have...but I wouldn't expect that to allow ganesha to export
arbitrary directories (and it doesn't).
I realize I didn't mention it in the earlier message, but this question
is in the context of the VFS FSAL, if that matters.
--
Lars Kellogg-Stedman <lars(a)redhat.com> | larsks @ {irc,twitter,github}
http://blog.oddbit.com/ | N1LKS
Proper selinux labelling for exported directory?
by Lars Kellogg-Stedman
Hello again,
With SELinux in enforcing mode, what's the correct way to export a
directory with nfs-ganesha? There are a few pages out there that suggest
the following sequence (spelled out concretely just after the list):
1. setenforce 0
2. start nfs-ganesha
3. run audit2allow to generate a new selinux module
4. install the module
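That is, something along these lines (the module name is arbitrary):
grep ganesha /var/log/audit/audit.log | audit2allow -M ganesha_local
semodule -i ganesha_local.pp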
But that seems overly broad, because you end up with rules that would
allow ganesha to export anything, like:
allow ganesha_t unlabeled_t:dir getattr;
Is there an existing label I should apply to the exported directory so that
ganesha can export it without errors?
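Put differently: is there a directory type the shipped policy already
allows, i.e. something that would show up in
sesearch --allow -s ganesha_t -c dir -p getattr,read
(sesearch comes from setools-console; newer setools spells the flag -A)?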
--
Lars Kellogg-Stedman <lars(a)redhat.com> | larsks @ {irc,twitter,github}
http://blog.oddbit.com/ | N1LKS
Possible to run NFSv4 server as non-root user?
by Tom McLaughlin
Hi,
I'm wondering whether it's possible to run Ganesha as a normal user, i.e. to start an NFSv4 server on a port the user is allowed to bind and serve that user's files. I've made some progress, but I'm currently hitting the error message "ganesha.nfsd-1003865[main] __Register_program :DISP :MAJ :Cannot register RQUOTA V1 on UDP". I thought NFSv4 didn't need UDP at all, and I've tried to configure it to run only on TCP, but no luck so far.
I'm also slightly confused about the rpcbind dependency. Is it possible to disable or avoid it for a simple use case like this?
Here's my config so far:
NFS_KRB5 {
    Active_krb5 = false;
}

NFS_CORE_PARAM {
    Protocols = 4;
    NFS_Port = 7777;
    Rquota_Port = 7778;
}

EXPORT {
    Export_Id = 2;
    Path = /tmp/exported;
    Pseudo = /tmp/exported;
    Access_Type = RW;
    Squash = No_Root_Squash;
    Transports = "TCP";
    Protocols = 4;
    # SecType = none;
    SecType = "sys";

    FSAL {
        Name = VFS;
    }
}
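Would something like the following be enough to keep ganesha away from
rpcbind entirely? I'm assuming here that Enable_NLM and Enable_RQUOTA do
what their names suggest:
NFS_CORE_PARAM {
    Protocols = 4;
    NFS_Port = 7777;
    # assumption: with NLM and RQUOTA disabled, a v4-only server has
    # nothing left that needs to register with rpcbind
    Enable_NLM = false;
    Enable_RQUOTA = false;
}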
HA failover in less than five minutes?
by lars@redhat.com
I've been experimenting with an HA NFS configuration using Pacemaker and nfs-ganesha. I've noticed that after a failover event it takes about five minutes for clients to recover, and that seems to be independent of the Lease_Lifetime and Grace_Period settings. Client recovery also doesn't seem to correspond to the ":NFS Server Now NOT IN GRACE" message in ganesha.log. Is this normal behavior?
My pacemaker configuration looks like:
Full List of Resources:
  * Resource Group: nfs:
    * nfsd (systemd:nfs-ganesha): Started nfs2.storage
    * nfs_vip (ocf::heartbeat:IPaddr2): Started nfs2.storage
And the ganesha configuration looks like:
NFS_CORE_PARAM
{
    Enable_NLM = false;
    Enable_RQUOTA = false;
    Protocols = 4;
}

NFSv4
{
    RecoveryBackend = rados_ng;
    Minor_Versions = 1,2;
    # From https://www.suse.com/support/kb/doc/?id=000019374
    Lease_Lifetime = 10;
    Grace_Period = 20;
}

MDCACHE {
    # Size the dirent cache down as small as possible.
    Dir_Chunk = 0;
}

EXPORT
{
    Export_ID = 100;
    Protocols = 4;
    Transports = TCP;
    Path = /;
    Pseudo = /data;
    Access_Type = RW;
    Attr_Expiration_Time = 0;
    Squash = none;

    FSAL {
        Name = CEPH;
        Filesystem = "tank";
        User_Id = "nfs";
    }
}

RADOS_KV
{
    UserId = "nfsmeta";
    pool = "cephfs.tank.meta";
    namespace = "ganesha";
}
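For what it's worth, the recovery objects that rados_ng keeps can be seen
directly in the pool/namespace from the RADOS_KV block above (same cephx
user); the per-client records live in their omap:
rados -p cephfs.tank.meta -N ganesha --id nfsmeta ls
rados -p cephfs.tank.meta -N ganesha --id nfsmeta listomapkeys <object-from-ls>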
Proxy FSAL - NFS4 -> 3
by Nick Couchman
I'm using Ganesha NFS with the Proxy FSAL. It seems to work fine as long as
I'm going between NFS clients and servers of the same version (NFSv4.1
client -> NFSv4.1 server). However, I have a situation where I'm trying to
proxy an NFSv4.1 service (Azure File Share NFS) for clients that do not
support NFSv4. When I do this, the "mount" command on the NFSv3 clients
works fine, but as soon as I try to do any data operations, I get I/O
errors:
# mount -t nfs -o vers=3 10.11.12.13:/azure-nfs-file /mnt/azure
# ls -l /mnt/azure
ls: reading directory '/mnt/azure': Remote I/O error
total 0
I've posted my (sanitized) configuration below. My questions are:
1) Is it possible to use Ganesha to proxy this way, allowing clients to
access an NFS server of a different version, or is this not supported?
2) Am I missing something within my Ganesha config to enable this?
I can also provide the debug logging, if that's helpful.
Thanks - Nick
==ganesha.conf==
NFS_CORE_PARAM {
    mount_path_pseudo = true;
    Protocols = 3,4;
    MNT_Port = 20048;
}

LOG {
    Default_Log_Level = INFO;
    Components {
        FSAL = FULL_DEBUG;
    }
}

EXPORT_DEFAULTS {
    Access_Type = RW;
}

## Azure NFS Share
EXPORT
{
    Export_id = 902;
    Path = "/stgaccount/azure-file";
    Pseudo = "/azure-file";
    Access_Type = RW;
    Squash = no_root_squash;
    Sectype = sys;

    FSAL {
        Name = proxy;
        Srv_Addr = 10.1.2.3;
        Enable_Handle_Mapping = TRUE;
        HandleMap_DB_Dir = "/var/ganesha/handledb/902";
        HandleMap_Tmp_Dir = "/run/ganesha/tmp/902";
        HandleMap_DB_Count = 8;
    }
}
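If it helps narrow things down, I can also share the output of the usual
v3-side checks against the proxy (10.11.12.13 as in the mount example
above, with MNT_Port 20048 from the config):
rpcinfo -p 10.11.12.13
showmount -e 10.11.12.13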