ESXi 6.7 client creating thick eager zeroed vmdk files using ceph fsal
by Robert Toole
Hi,
I have a 3-node Ceph Octopus 15.2.7 cluster running on fully up-to-date
CentOS 7 with nfs-ganesha 3.5.
After following the Ceph install guide
https://docs.ceph.com/en/octopus/cephadm/install/#deploying-nfs-ganesha
I am able to create an NFS 4.1 datastore in VMware using the IP addresses
of all three nodes. Everything appears to work OK.
The issue, however, is that for some reason ESXi is creating thick
provisioned, eager zeroed disks instead of thin provisioned disks on this
datastore, whether I am migrating, cloning, or creating new VMs. Even
running vmkfstools -i disk.vmdk -d thin thin_disk.vmdk still results in
a thick eager zeroed vmdk file.
This should not be possible on an NFS datastore: VMware requires a
VAAI NAS plugin before it can thick provision disks over NFS.
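As a sanity check (this is just how I would verify it, using the standard
esxcli commands on the host), one can confirm that no VAAI NAS plugin is
installed and see what the host reports for hardware acceleration on the
datastore:

# list installed VIBs and look for any vendor NAS VAAI plugin
esxcli software vib list | grep -i vaai

# show NFS 4.1 datastores, including their Hardware Acceleration status
esxcli storage nfs41 list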
Linux clients of the same datastore can create thin qcow2 images, and
looking from the Linux hosts at the images created by ESXi, you can
see that the vmdks are indeed thick:
ls -lsh
total 81G
512 -rw-r--r--. 1 root root 230 Mar 25 15:17 test_vm-2221e939.hlog
40G -rw-------. 1 root root 40G Mar 25 15:17 test_vm-flat.vmdk
40G -rw-------. 1 root root 40G Mar 25 15:56 test_vm_thin-flat.vmdk
512 -rw-------. 1 root root 501 Mar 25 15:57 test_vm_thin.vmdk
512 -rw-------. 1 root root 473 Mar 25 15:17 test_vm.vmdk
0 -rw-r--r--. 1 root root 0 Jan 6 1970 test_vm.vmsd
2.0K -rwxr-xr-x. 1 root root 2.0K Mar 25 15:17 test_vm.vmx
but the qcow2 files from the Linux hosts are thin, as one would expect:
qemu-img create -f qcow2 big_disk_2.img 500G
ls -lsh
total 401K
200K -rw-r--r--. 1 root root 200K Mar 25 15:47 big_disk_2.img
200K -rw-r--r--. 1 root root 200K Mar 25 15:44 big_disk.img
512 drwxr-xr-x. 2 root root 81G Mar 25 15:57 test_vm
These ls -lsh results are the same from ESXi, from Linux NFS clients,
and from the CephFS kernel client.
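As a cross-check on whether the 40G is really allocated in RADOS rather
than just reported that way by the clients, one could compare pool usage
before and after creating such a disk (the pool name below is a
placeholder for the CephFS data pool):

# cluster-wide and per-pool usage
ceph df detail

# object count and usage for the CephFS data pool
rados -p <cephfs_data_pool> df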
What is happening here? Are there undocumented VAAI features in
nfs-ganesha with the CephFS FSAL? If so, how do I turn them off? I
want thin provisioned disks.
ceph nfs export ls dev-nfs-cluster --detailed
[
  {
    "export_id": 1,
    "path": "/Development-Datastore",
    "cluster_id": "dev-nfs-cluster",
    "pseudo": "/Development-Datastore",
    "access_type": "RW",
    "squash": "no_root_squash",
    "security_label": true,
    "protocols": [
      4
    ],
    "transports": [
      "TCP"
    ],
    "fsal": {
      "name": "CEPH",
      "user_id": "dev-nfs-cluster1",
      "fs_name": "dev_cephfs_vol",
      "sec_label_xattr": ""
    },
    "clients": []
  }
]
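For reference, the Linux clients above mount this same export through its
pseudo path, along the lines of the following (the IP and mount point are
placeholders):

mount -t nfs -o nfsvers=4.1 <node_ip>:/Development-Datastore /mnt/dev-nfs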
rpm -qa | grep ganesha
nfs-ganesha-ceph-3.5-1.el7.x86_64
nfs-ganesha-rados-grace-3.5-1.el7.x86_64
nfs-ganesha-rados-urls-3.5-1.el7.x86_64
nfs-ganesha-3.5-1.el7.x86_64
centos-release-nfs-ganesha30-1.0-2.el7.centos.noarch
rpm -qa | grep ceph
python3-cephfs-15.2.7-0.el7.x86_64
nfs-ganesha-ceph-3.5-1.el7.x86_64
python3-ceph-argparse-15.2.7-0.el7.x86_64
python3-ceph-common-15.2.7-0.el7.x86_64
cephadm-15.2.7-0.el7.x86_64
libcephfs2-15.2.7-0.el7.x86_64
ceph-common-15.2.7-0.el7.x86_64
ceph -v
ceph version 15.2.7 (<ceph_uuid>) octopus (stable)
The Ceph cluster is healthy, using BlueStore on raw 3.84 TB 7200 RPM
SATA disks.
--
Robert Toole
rtoole(a)tooleweb.ca
403 368 5680
Announce Push of V5.3.2
by Frank Filz
Branch next
Tag: V5.3.2
Merge Highlights
* [GPFS] Change handle_size to 48 if GPFS returned handle with size 40
* Add LogEventLimited to trace in fsal_common_is_referral
* Handle granted upcalls when they are not in the blocked list
* Add an option to print op id as part of a log entry
Signed-off-by: Frank S. Filz <ffilzlnx(a)mindspring.com>
Contents:
850c6fbb9 Frank S. Filz V5.3.2
a99efb78d Shahar Hochma Add an option to print op id as part of a log entry
137110141 Malahal Naineni Handle granted upcalls when they are not in the blocked list
986bddc8e Madhu Thorat Add LogEventLimited to trace in fsal_common_is_referral
d70dc1a42 Madhu Thorat [GPFS] Change handle_size to 48 if GPFS returned handle with size 40
statistics when several ganesha servers run on the same node
by francoise.boudier@atos.net
Hello,
I am running into an issue when collecting statistics with `ganesha_stats` while several ganesha servers are running on the same node.
Each of them uses its own NFS port, path, export_id and pseudo.
`ganesha_stats` returns the statistics of the first started ganesha server but fails to collect the statistics of the second one with `Export id not found`.
The rpm in use is nfs-ganesha-utils-4.0-1.el8.x86_64.
Is this a known problem, and is it fixed in a newer version?
Thanks for your help
See the example below:
- two ganesha servers are running on a single node.
# ps -ef | grep ganesha
root 2956841 1 0 09:04 ? 00:00:01 ganesha.nfsd -L /tmp/sbbtestgkhLzj8/ganesha/logfile -f /tmp/sbbtestgkhLzj8/ganesha/ganesha.conf -N NIV_EVENT -p /tmp/sbbtestgkhLzj8/ganesha/pidfile
root 2957068 1 0 09:04 ? 00:00:00 ganesha.nfsd -L /tmp/sbbtestZoXOzPM/ganesha/logfile -f /tmp/sbbtestZoXOzPM/ganesha/ganesha.conf -N NIV_EVENT -p /tmp/sbbtestZoXOzPM/ganesha/pidfile
# df | grep mypath
localhost:/mypath.3714 104806400 35679232 69127168 35% /tmp/sbbtestgkhLzj8/mount
localhost:/mypath.9019 104806400 35679232 69127168 35% /tmp/sbbtestZoXOzPM/mount
- Write using each of them
$ dd if=/dev/zero of=//tmp/sbbtestgkhLzj8/mount/fichier bs=1M count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB, 10 MiB) copied, 0.0257701 s, 407 MB/s
$ dd if=/dev/zero of=//tmp/sbbtestZoXOzPM/mount/fichier bs=1M count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB, 10 MiB) copied, 0.0258967 s, 405 MB/s
- Collect the stats for each of them
# ganesha_stats iov42 3714
EXPORT 3714:
requested transferred total errors latency
READv42: 0 0 0 0 0 0
WRITEv42: 10485760 10485760 10 0 6141586 0
# ganesha_stats iov42 9019
EXPORT 9019: Export id not found
# ganesha_stats export
Export Stats
Stats collected since: Thu Jun 22 09:04:20 2023 814664066 nsecs
Duration: 4897.5012726784 seconds
Export id: 3714
Path: /tmp/sbbtestgkhLzj8/path
NFSv3 stats available: 0
NFSv4.0 stats available: 0
NFSv4.1 stats available: 0
NFSv4.2 stats available: 1
MNT stats available: 0
NLMv4 stats available: 0
RQUOTA stats available: 0
9P stats available: 0
Export id: 0
Path: /
NFSv3 stats available: 0
NFSv4.0 stats available: 0
NFSv4.1 stats available: 0
NFSv4.2 stats available: 1
MNT stats available: 0
NLMv4 stats available: 0
RQUOTA stats available: 0
9P stats available: 0
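For what it's worth, ganesha_stats collects these numbers over D-Bus from
the org.ganesha.nfsd service, so my assumption (not verified) is that only
the first ganesha instance can own that name on the system bus, and the
second instance's exports are therefore never visible to the tool. The
exports known to the bus-owning instance can be listed directly with
dbus-send to compare:

# list the exports of the ganesha instance that owns org.ganesha.nfsd
dbus-send --system --print-reply \
    --dest=org.ganesha.nfsd /org/ganesha/nfsd/ExportMgr \
    org.ganesha.nfsd.exportmgr.ShowExports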
Announce Push of V5.3.1
by Frank Filz
Branch next
Tag: V5.3.1
Merge Highlights
* uid2grp - fix placement of PTHREAD_MUTEX_destroy
* GLUSTER: fixup new fsal_fd handling
* CHECKPATCH - we still prefer 80 column lines
* Fix up scripts/runcp.sh - macro no longer used
* CHECKPATCH - fix up warnings and errors that have crept in
* Several fixes to multilock test tool plus new test cases
* FSAL_MDCACHE/mdcache_handle: Fixed rename atomicity
* init_fds_limit(): Log EVENT to print current set FD limit
* nfs_read_conf: Set the PNFS_MDS & PNFS_DS default to false like written in the docs and comments
* Fix possible race when populating pseudo fs
* Fix crash @ free_client_record()
Signed-off-by: Frank S. Filz <ffilzlnx(a)mindspring.com>
Contents:
c3dd75362 Frank S. Filz V5.3.1
232521210 Prabhu Murugesan Fix crash @ free_client_record()
c67f9a5e8 Shahar Hochma Fix possible race when populating pseudo fs
d440c2b41 Assaf Yaari nfs_read_conf: Set the PNFS_MDS & PNFS_DS default to false like written in the docs and comments
b3c9bd840 Madhu Thorat init_fds_limit(): Log EVENT to print current set FD limit
eb15ca16c Lior Suliman multilock: Fixed a bug in the sleep command
117b325ae Lior Suliman FSAL_MDCACHE/mdcache_handle: Fixed rename atomicity
5879883d6 Lior Suliman multilock: Added scenarios to sample_tests
6729c56e3 Lior Suliman multilock: Added SO_REUSEADDR to the multilock console server
2292e0c75 Lior Suliman multilock: Allow the usage of the full UINT64 range for locks
99d09f50b Frank S. Filz CHECKPATCH - fix up warnings and errors that have crept in
4253855e6 Frank S. Filz Fix up scripts/runcp.sh - macro no longer used
9ab836d97 Frank S. Filz CHECKPATCH - we still prefer 80 column lines
7e445649b Frank S. Filz GLUSTER: fixup new fsal_fd handling
b5f8eaebe Frank S. Filz uid2grp - fix placement of PTHREAD_MUTEX_destroy
Announce Push of V5.3
by Frank Filz
Branch next
Tag: V5.3
Merge Highlights
* Add PSEUDOFS config block
* nfs4_op_open: Allow open with CLAIM_PREVIOUS when FSAL supports grace period
* GPFS: fix 40 byte sized old file handles
* Fixed GPFS create export issue during claim_posix_filesystem
* build: Fix Python detection
* build: Fix developer warning for LSB module
* scripts: Add podman container creation with Ganesha dependencies
Signed-off-by: Frank S. Filz <ffilzlnx(a)mindspring.com>
Contents:
f4085b698 Frank S. Filz V5.3
a35124d75 Martin Schwenke scripts: Add podman container creation with Ganesha dependencies
f40ada3f9 Martin Schwenke build: Fix developer warning for LSB module
f765f795f Martin Schwenke build: Fix Python detection
e21025367 Yogendra Charya Fixed GPFS create export issue during claim_posix_filesystem
4c7f92346 Malahal Naineni GPFS: fix 40 byte sized old file handles
448c04b4b Shahar Hochma nfs4_op_open: Allow open with CLAIM_PREVIOUS when FSAL supports grace period
0542ff429 Frank S. Filz Add PSEUDOFS config block