ESXi 6.7 client creating thick eager-zeroed vmdk files using Ceph FSAL
by Robert Toole
Hi,
I have a 3-node Ceph Octopus 15.2.7 cluster running on fully up-to-date
CentOS 7 with nfs-ganesha 3.5.
After following the Ceph install guide
https://docs.ceph.com/en/octopus/cephadm/install/#deploying-nfs-ganesha
I am able to create an NFS 4.1 datastore in VMware using the IP addresses
of all three nodes. Everything appears to work OK.
The issue, however, is that ESXi is creating thick-provisioned
eager-zeroed disks instead of thin-provisioned disks on this datastore,
whether I am migrating, cloning, or creating new VMs. Even running
vmkfstools -i disk.vmdk -d thin thin_disk.vmdk still results in a thick
eager-zeroed vmdk file.
This should not be possible on an NFS datastore: VMware requires a VAAI
NAS plugin before it can thick provision disks over NFS.
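For comparison, the host's own view of acceleration support can be
checked from the ESXi shell; these are generic commands, nothing
specific to my setup is assumed:

esxcli storage nfs41 list
# for NFS v3 mounts the equivalent is: esxcli storage nfs list
# the "Hardware Acceleration" column should read "Not Supported" unless
# a vendor NAS VAAI plugin is installed for that filer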
Linux clients mounting the same datastore can create thin qcow2 images,
and when looking from the Linux hosts at the images created by ESXi you
can see that the vmdks are indeed thick:
ls -lsh
total 81G
512 -rw-r--r--. 1 root root 230 Mar 25 15:17 test_vm-2221e939.hlog
40G -rw-------. 1 root root 40G Mar 25 15:17 test_vm-flat.vmdk
40G -rw-------. 1 root root 40G Mar 25 15:56 test_vm_thin-flat.vmdk
512 -rw-------. 1 root root 501 Mar 25 15:57 test_vm_thin.vmdk
512 -rw-------. 1 root root 473 Mar 25 15:17 test_vm.vmdk
0 -rw-r--r--. 1 root root 0 Jan 6 1970 test_vm.vmsd
2.0K -rwxr-xr-x. 1 root root 2.0K Mar 25 15:17 test_vm.vmx
but the qcow2 files from the Linux hosts are thin, as one would expect:
qemu-img create -f qcow2 big_disk_2.img 500G
ls -lsh
total 401K
200K -rw-r--r--. 1 root root 200K Mar 25 15:47 big_disk_2.img
200K -rw-r--r--. 1 root root 200K Mar 25 15:44 big_disk.img
512 drwxr-xr-x. 2 root root 81G Mar 25 15:57 test_vm
These ls -lsh results are the same from ESXi, from Linux NFS clients, and
from the CephFS kernel client.
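For anyone wanting to double-check that on their own setup, apparent size
versus actually allocated blocks can be compared from a Linux client
(paths as in the listings above):

du -h --apparent-size test_vm/test_vm-flat.vmdk
du -h test_vm/test_vm-flat.vmdk
stat -c '%s bytes, %b blocks of %B bytes' test_vm/test_vm-flat.vmdk
# a sparse (thin) file reports far fewer allocated blocks than its
# apparent size; an eager-zeroed file allocates the whole thing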
What is happening here? Are there undocumented VAAI features in
nfs-ganesha with the CephFS FSAL? If so, how do I turn them off? I want
thin-provisioned disks.
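The only way I know of to rule out a stray plugin on the ESXi side is to
look for a vendor VAAI-NAS VIB on the host (the grep pattern is just a
guess at typical plugin names):

esxcli software vib list | grep -i vaai
# NAS VAAI support is delivered as a vendor VIB; with no such VIB
# installed, the host should not be able to offload thick provisioning
# over NFS at all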
ceph nfs export ls dev-nfs-cluster --detailed
[
  {
    "export_id": 1,
    "path": "/Development-Datastore",
    "cluster_id": "dev-nfs-cluster",
    "pseudo": "/Development-Datastore",
    "access_type": "RW",
    "squash": "no_root_squash",
    "security_label": true,
    "protocols": [
      4
    ],
    "transports": [
      "TCP"
    ],
    "fsal": {
      "name": "CEPH",
      "user_id": "dev-nfs-cluster1",
      "fs_name": "dev_cephfs_vol",
      "sec_label_xattr": ""
    },
    "clients": []
  }
]
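The only other place I can think of where a knob could hide is the
ganesha config that cephadm generates in RADOS. Something like the
following should dump it; the pool name below is only a guess at the
cephadm default, and the per-export object name (export-1) is assumed to
match the export_id above, so adjust both to whatever the cluster
actually uses:

rados -p nfs-ganesha -N dev-nfs-cluster ls
# the per-export block for export_id 1 should be an object named export-1
rados -p nfs-ganesha -N dev-nfs-cluster get export-1 -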
rpm -qa | grep ganesha
nfs-ganesha-ceph-3.5-1.el7.x86_64
nfs-ganesha-rados-grace-3.5-1.el7.x86_64
nfs-ganesha-rados-urls-3.5-1.el7.x86_64
nfs-ganesha-3.5-1.el7.x86_64
centos-release-nfs-ganesha30-1.0-2.el7.centos.noarch
rpm -qa | grep ceph
python3-cephfs-15.2.7-0.el7.x86_64
nfs-ganesha-ceph-3.5-1.el7.x86_64
python3-ceph-argparse-15.2.7-0.el7.x86_64
python3-ceph-common-15.2.7-0.el7.x86_64
cephadm-15.2.7-0.el7.x86_64
libcephfs2-15.2.7-0.el7.x86_64
ceph-common-15.2.7-0.el7.x86_64
ceph -v
ceph version 15.2.7 (<ceph_uuid>) octopus (stable)
The Ceph cluster is healthy, using BlueStore on raw 3.84 TB SATA 7200 rpm
disks.
--
Robert Toole
rtoole(a)tooleweb.ca
403 368 5680
Announce Push of V5.6
by Frank Filz
Branch next
Tag: V5.6
Merge Highlights
* CEPH: Disable nonblocking-io - the cephfs filesystem just isn't ready for
it yet.
* CEPH: Add capability to share ceph client between exports
* MDCACHE: try_release should set up op_context
* Add documentation for quoted strings in config.
* CMAKE: Fix use of USE_CB_SIMULATOR and USE_DBUS
* PSEUDOFS: Add some debug
* GLUSTER: in glusterfs_copy_my_fd insert_fd_lru belongs in the is_dup path
* mdcache_lru_pkginit now properly initializes Cache_FDs from mdcache_param.
* Fixed a couple of issues raised by UBSAN.
* Fixed a couple of issues with the ganesha_stats json output.
* nfs_proto_tools: Change the dynamicinfo field to be concrete
* CRASH:Memory access violation in posix_acl_2_fsal_acl
* main: periodically poke malloc()/free() to release memory
* getquota to support 64bit quota using bsize for scaling
Signed-off-by: Frank S. Filz <ffilzlnx(a)mindspring.com>
Contents:
65e868552 Frank S. Filz V5.6
bd07142ad Michael Diederich getquota to support 64bit quota using bsize for
scaling
d59e65c0a Kaleb S. KEITHLEY main: periodically poke malloc()/free() to
release memory
fae6499dc Arnab Tah CRASH:Memory access violation in posix_acl_2_fsal_acl
d5f4e5e96 Assaf Yaari nfs_proto_tools: Change the dynamicinfo field to be
concrete
bcd1ee81d Trupti Shete Incorrect values for ganesha_stats json option
aa91f96a6 Trupti Shete Failed to get output of option ganesha_stats json
f975b38e6 David Rieber UBSAN reports misaligned access to sockaddr_t.
6c8d8269d David Rieber Fix bit shifting issue reported by UBSAN.
3e1346c49 David Rieber mdcache_lru_pkginit now properly initializes
Cache_FDs from mdcache_param.
dcf0387ee Yevhenii Huzii GLUSTER: in glusterfs_copy_my_fd insert_fd_lru
belongs in the is_dup path
110c7094d Frank S. Filz CEPH: Add capability to share ceph client between
exports
c96246cb3 Frank S. Filz MDCACHE: try_release should set up op_context
bb43dbb2a Frank S. Filz PSEUDOFS: Add some debug
7134bb27a Frank S. Filz Add documentation for quoted strings in config.
4cbf7d603 Frank S. Filz CEPH: Disable ceph nonblocking-io for now - it
doesn't work well
2f39cf184 Frank S. Filz CMAKE: Fix use of USE_CB_SIMULATOR and USE_DBUS