Monitoring in Ganesha?
by Bjorn Leffler
Apart from the counters that you can access through dbus, is there any
other monitoring built into Ganesha?
I'm thinking of adding it, with this high-level plan:
- Export metrics from Ganesha to Prometheus.
- Aggregate data in Prometheus.
- Display monitoring consoles and graphs with Grafana.
- Package up Prometheus, Grafana and the preconfigured rules/dashboards as
a Docker image.
- This makes it straightforward to deploy monitoring alongside a Ganesha
binary.
My rough coding plan is to:
- Add a USE_MONITORING directive to the CMakeLists.txt files.
- Add a build dependency on the Prometheus C client
<https://github.com/digitalocean/prometheus-client-c>.
- Create a src/monitoring directory for the new source files and templates.
- Increment counters and timers throughout the code.
- Use histograms to compute latency percentiles, heatmaps, etc.
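As a rough illustration of the histogram item above, the sketch below keeps per-bucket counts of request latencies and derives a percentile bound from them. It is plain C and deliberately does not use the Prometheus C client API. The bucket bounds and the names `latency_hist`, `hist_observe` and `hist_quantile_bound` are made up for this example, and a real version inside Ganesha would presumably need atomic or per-thread counters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 6

/* Hypothetical bucket upper bounds in microseconds.
 * The last bucket has no bound and catches everything larger. */
static const uint64_t bucket_le_us[NBUCKETS - 1] = {
	100, 500, 1000, 5000, 10000
};

struct latency_hist {
	uint64_t count[NBUCKETS];	/* observations per bucket */
	uint64_t total;			/* number of observations */
	uint64_t sum_us;		/* sum of all observed latencies */
};

/* Record one latency sample (not thread safe in this sketch). */
static void hist_observe(struct latency_hist *h, uint64_t us)
{
	size_t i = 0;

	while (i < NBUCKETS - 1 && us > bucket_le_us[i])
		i++;
	h->count[i]++;
	h->total++;
	h->sum_us += us;
}

/* Return the upper bound of the bucket that contains the q-quantile,
 * or UINT64_MAX if it falls into the unbounded last bucket. */
static uint64_t hist_quantile_bound(const struct latency_hist *h, double q)
{
	uint64_t rank, seen = 0;

	assert(q > 0.0 && q <= 1.0);
	if (h->total == 0)
		return 0;

	/* rank = ceil(q * total) */
	rank = (uint64_t)(q * (double)h->total);
	if ((double)rank < q * (double)h->total)
		rank++;

	for (size_t i = 0; i < NBUCKETS; i++) {
		seen += h->count[i];
		if (seen >= rank)
			return i < NBUCKETS - 1 ? bucket_le_us[i] : UINT64_MAX;
	}
	return UINT64_MAX;
}
```

Calling `hist_observe()` once per completed request and `hist_quantile_bound(&h, 0.99)` at export time gives a coarse p99. The resolution is limited by the bucket bounds, which is the same trade-off Prometheus histograms make.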
Is this a good idea? Any objections or suggestions?
Thanks,
Bjorn
1 year, 10 months
Change in ...nfs-ganesha[next]: MDCACHE - Add MDCACHE {} config block
by Daniel Gryniewicz (GerritHub)
Daniel Gryniewicz has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/454929 )
Change subject: MDCACHE - Add MDCACHE {} config block
......................................................................
MDCACHE - Add MDCACHE {} config block
Add a config block named MDCACHE that is a copy of CACHEINODE. Both can
be configured, but MDCACHE will override CACHEINODE. This allows us to
deprecate CACHEINODE.
Change-Id: I49012723132ae6105b904a60d1a96bb2bf78d51b
Signed-off-by: Daniel Gryniewicz <dang(a)fprintf.net>
---
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_read_conf.c
M src/config_samples/ceph.conf
M src/config_samples/config.txt
M src/config_samples/ganesha.conf.example
M src/doc/man/ganesha-cache-config.rst
M src/doc/man/ganesha-config.rst
6 files changed, 31 insertions(+), 7 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/29/454929/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/454929
To unsubscribe, or for help writing mail filters, visit https://review.gerrithub.io/settings
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: I49012723132ae6105b904a60d1a96bb2bf78d51b
Gerrit-Change-Number: 454929
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Gryniewicz <dang(a)redhat.com>
Gerrit-MessageType: newchange
2 years, 10 months
Issues seen with krb5i and krb5p mounts
by Trishali Nayar
Hi all,
With a few clients, e.g. Ubuntu 18.04.3 (4.15.0-55-generic) and RH7.8
(3.10.0-1127.el7.x86_64), we have observed that a simple command like
'dd' either hangs or returns EIO. This happens only on krb5i and krb5p
mounts, mostly for file sizes of 100 MB and larger, although sometimes
even a 30 MB file sees failures.
An RH7.6 client (3.10.0-957.el7.x86_64) does not seem to hit this
issue, so it might be specific to more recent kernels?
We addressed the issue with check-in
https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490802. The idea was to
let clients know that Ganesha denied the request, rather than just
dropping it.
This fix did seem to help, and the hangs/errors stopped completely, but for
larger file sizes, e.g. 1000 MB, we started seeing "Permission Denied"
errors. This is different from the EIO errors seen earlier. The reason
could be that we are now sending an "AUTH DENIED" error, which clients
translate into this new error.
We added more logging to Ganesha and observed that these particular
clients seem to send a lot of requests together. When we process them, the
sequence numbers are considerably out of order, and we drop the requests
outside the sequence window, as per RFC 2203 Section 7.2.1. Our sequence
window is 32.
Testing these clients with kNFS does not hit the issue. The kNFS sequence
window seems to be larger, at 128.
So we tried increasing the sequence window to 128 for Ganesha as well. That
does not seem to fix the issue.
We also have the additional 'seqmask' check below, and many of the requests
fell into that category as well and were dropped.
"libntirpc/src/svc_auth_gss.c":
    } else if (offset >= gd->win || (gd->seqmask & (1 << offset))) {
        *no_dispatch = true;
        goto gd_free;
    }
We also observed that these clients now sent many requests above the 128
window, which we would again drop.
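For reference, the sliding-window bookkeeping discussed above can be sketched as follows. This is only an illustration of the general technique, not the libntirpc code: the names `seq_window` and `seq_accept` are hypothetical, and sequence number wraparound is ignored. A mask kept in a single integer can only track as many sequence numbers as that integer has bits, so the sketch uses a byte array to cover a 128-entry window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SEQ_WIN 128u	/* window size, matching the value tried above */

struct seq_window {
	uint32_t seq_last;		/* highest sequence number seen so far */
	uint8_t seen[SEQ_WIN / 8];	/* one bit per slot, indexed by seq % SEQ_WIN */
};

static bool bit_test(const struct seq_window *w, uint32_t seq)
{
	uint32_t i = seq % SEQ_WIN;

	return (w->seen[i / 8] >> (i % 8)) & 1u;
}

static void bit_set(struct seq_window *w, uint32_t seq)
{
	uint32_t i = seq % SEQ_WIN;

	w->seen[i / 8] |= (uint8_t)(1u << (i % 8));
}

static void bit_clear(struct seq_window *w, uint32_t seq)
{
	uint32_t i = seq % SEQ_WIN;

	w->seen[i / 8] &= (uint8_t)~(1u << (i % 8));
}

/* Return true if the request should be processed, false if it should
 * be dropped (below the window, or already seen within the window). */
static bool seq_accept(struct seq_window *w, uint32_t seq)
{
	uint32_t offset;

	assert(w != NULL);

	if (seq > w->seq_last) {
		/* Above the window: slide the window up to seq. */
		if (seq - w->seq_last >= SEQ_WIN) {
			memset(w->seen, 0, sizeof(w->seen));
		} else {
			/* Slots of skipped numbers held older entries
			 * that have now left the window. */
			for (uint32_t s = w->seq_last + 1; s < seq; s++)
				bit_clear(w, s);
		}
		w->seq_last = seq;
		bit_set(w, seq);
		return true;
	}

	offset = w->seq_last - seq;
	if (offset >= SEQ_WIN || bit_test(w, seq))
		return false;	/* too old, or a replay */

	bit_set(w, seq);
	return true;
}
```

With this layout, a burst of requests that arrives out of order is accepted as long as each one is within `SEQ_WIN` of the highest number seen; anything older is dropped regardless of whether it was ever processed.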
We are wondering what the proper way to fix this is, and whether anyone
has an idea of what these clients are doing differently.
Thanks and regards,
Trishali.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Trishali Nayar
IBM Systems
ETZ, Pune.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2 years, 11 months
lseek gets bad offset from nfs client with ganesha/gluster which supports SEEK
by Kinglong Mee
The latest ganesha/gluster supports seek according to,
https://tools.ietf.org/html/draft-ietf-nfsv4-minorversion2-41#section-15.11
From the given sa_offset, find the next data_content4 of type sa_what
in the file. If the server can not find a corresponding sa_what,
then the status will still be NFS4_OK, but sr_eof would be TRUE. If
the server can find the sa_what, then the sr_offset is the start of
that content. If the sa_offset is beyond the end of the file, then
SEEK MUST return NFS4ERR_NXIO.
For a file with the following filemap,
Part 1: HOLE 0x0000000000000000 ---> 0x0000000000600000
Part 2: DATA 0x0000000000600000 ---> 0x0000000000700000
Part 3: HOLE 0x0000000000700000 ---> 0x0000000001000000
SEEK(0x700000, SEEK_DATA) gets result (sr_eof:1, sr_offset:0x70000) from ganesha/gluster;
SEEK(0x700000, SEEK_HOLE) gets result (sr_eof:0, sr_offset:0x70000) from ganesha/gluster.
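To make the quoted SEEK rules concrete, here is a small sketch that applies them to an in-memory extent map like the one above. It is not ganesha or gluster code: the types and the function `seek_in_map` are hypothetical, `ST_OK`/`ST_NXIO` stand in for NFS4_OK/NFS4ERR_NXIO, the map is assumed to be sorted and to cover the whole file, and an offset equal to the file size is treated as beyond the end of the file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum content { CONTENT_DATA, CONTENT_HOLE };	/* stands in for data_content4 */
enum seek_status { ST_OK, ST_NXIO };		/* NFS4_OK / NFS4ERR_NXIO */

struct extent {
	uint64_t start;
	uint64_t end;		/* exclusive */
	enum content type;
};

struct seek_result {
	enum seek_status status;
	bool sr_eof;
	uint64_t sr_offset;	/* only meaningful when sr_eof is false */
};

/* Apply the SEEK rules quoted above to a sorted, gap-free extent map. */
static struct seek_result
seek_in_map(const struct extent *map, size_t n, uint64_t sa_offset,
	    enum content sa_what)
{
	struct seek_result res = { ST_OK, false, 0 };

	assert(map != NULL && n > 0);

	/* Assumption: sa_offset >= file size counts as beyond EOF. */
	if (sa_offset >= map[n - 1].end) {
		res.status = ST_NXIO;
		return res;
	}

	for (size_t i = 0; i < n; i++) {
		if (map[i].type != sa_what || map[i].end <= sa_offset)
			continue;
		/* First matching extent at or after sa_offset. */
		res.sr_offset = map[i].start > sa_offset ?
				map[i].start : sa_offset;
		return res;
	}

	/* No matching content found: NFS4_OK with sr_eof set. */
	res.sr_eof = true;
	return res;
}
```

For the filemap above, this returns sr_eof set for SEEK(0x700000, SEEK_DATA) and sr_offset 0x700000 for SEEK(0x700000, SEEK_HOLE). How a client should turn the sr_eof case into an lseek() result is exactly the open question in this thread.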
If an application depends on the lseek result to search for data, it may enter an infinite loop.
while (1) {
        next_pos = lseek(fd, cur_pos, seek_type);
        if (next_pos == -1)
                return; /* lseek failed, e.g. with ENXIO */
        /* alternate between searching for data and for holes */
        seek_type = (seek_type == SEEK_DATA) ? SEEK_HOLE : SEEK_DATA;
        cur_pos = next_pos;
}
The lseek syscall always gets 0x70000 from the nfs client in both cases,
but if the underlying filesystem is ext4/f2fs, or the nfs server is knfsd,
lseek(0x700000, SEEK_DATA) gets ENXIO.
I would like to know:
should I fix ganesha/gluster to return ENXIO for the first case, as knfsd does?
or should I fix the nfs client to return ENXIO for the first case?
thanks,
Kinglong Mee
3 years
Building on macOS
by matvore@comcast.net
Hi,
I have been trying to build NFS-Ganesha on macOS and am having a bit of trouble: there are quite a lot of compiler errors. I also noticed that someone has at least tried in the past, as I've seen some `#ifdef __APPLE__` lines here and there. Is macOS compatibility on anyone's radar? Did it ever build in the past?
Thank you,
Matt
3 years
Re: [NFS-Ganesha-Support] Re: questions about stability of nfs-ganesha versions and PROXY FSAL
by Frank Filz
> Back in April I wrote:
>
> On Thu, 9 Apr 2020, Pfaff, Todd wrote:
>
> > I'm having stability problems with nfs-ganesha 2.8 and PROXY FSAL on
> > CentOS 7.
>
> and this has been ongoing for me since then. Today I finally stumbled onto
> something that may have solved at least some of the problems that I've been
> having with nfs-ganesha. I also posted the following as a follow up to the nfs-
> ganesha github issue: frequent failures with PROXY FSAL (#580).
>
> ...
>
> I've discovered something that may be relevant to this issue.
>
> - My nfs-ganesha server is serving several EXPORTs.
>
> - Two of these EXPORTs refer to identically named PATHs on two different
> back-end servers.
>
> - I've now changed these so that the back-end server paths are different.
>
> - I've tested this multiple times with multiple VMs and it appears that
> this change may have eliminated all traces of the problems that I've
> been describing here.
>
>
> Does this make sense? Is it known that nfs-ganesha can not have multiple
> exports with the same value for the Path?
I'm not that familiar with FSAL_PROXY, and I'm not sure how it handles multiple exports. That is actually an area that needs more testing and more resolution.
Do you use the following setting?
NFS_CORE_PARAM { mount_path_pseudo = true; }
You may be running into issues if path is the same with NFS v3 clients.
If that isn't the issue, we do need to make the proxy support more robust because certainly we should allow multiple back end servers exporting the same path.
Frank
3 years
Announce Push of V4-dev.31
by Frank Filz
Branch next
Tag:V4-dev.31
Merge Highlights
* Fix for coverity issue:360547 (Resource leak)
* RPM: fix the mismatched dependency of nfs-ganesha-utils
* [GPFS] handle ACLs that exceed 4096 byte buffer
* Adding xattr support for FSAL_GLUSTER
Signed-off-by: Frank S. Filz <ffilzlnx(a)mindspring.com>
Contents:
d60cd4d Frank S. Filz V4-dev.31
141e10e Arjun Adding xattr support for FSAL_GLUSTER
a359f19 Malahal Naineni [GPFS] handle ACLs that exceed 4096 byte buffer
08288eb sepia-liu RPM: fix the mismatched dependency of nfs-ganesha-utils
c1bce85 Yogendra Charya Fix for coverity issue:360547 (Resource leak)
3 years
Change in ...nfs-ganesha[next]: [GPFS] handle ACLs that exceed 4096 byte buffer
by Malahal (GerritHub)
Malahal has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/501193 )
Change subject: [GPFS] handle ACLs that exceed 4096 byte buffer
......................................................................
[GPFS] handle ACLs that exceed 4096 byte buffer
commit f8a75a9eb started checking for the maximum number of ACEs. It
didn't check for buffer too small error case where the number of ACEs is
undefined leading to SERVERFAULT errors.
Changed the code to return early in the case of ENOSPC/buffer too small
from GPFS file system.
Change-Id: If72f77c97839cfd3abc9d4018f0ab229fc233c99
Signed-off-by: Malahal Naineni <malahal(a)us.ibm.com>
---
M src/FSAL/FSAL_GPFS/fsal_internal.c
1 file changed, 1 insertion(+), 1 deletion(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/93/501193/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/501193
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: If72f77c97839cfd3abc9d4018f0ab229fc233c99
Gerrit-Change-Number: 501193
Gerrit-PatchSet: 1
Gerrit-Owner: Malahal <malahal(a)gmail.com>
Gerrit-MessageType: newchange
3 years, 1 month
Change in ...nfs-ganesha[next]: FSAL_CEPH: fix cannot find entry by ino_release_cb
by sepia-liu (GerritHub)
sepia-liu has uploaded this change for review. ( https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/501180 )
Change subject: FSAL_CEPH: fix cannot find entry by ino_release_cb
......................................................................
FSAL_CEPH: fix cannot find entry by ino_release_cb
There is no export_id in the key, so the correct hash value
cannot be calculated and the entry cannot be found.
Change-Id: Iead1952457ab169b15260aab0c4329dc2644d6df
Signed-off-by: sepia-liu <liuwei_coder(a)163.com>
---
M src/FSAL/FSAL_CEPH/main.c
1 file changed, 5 insertions(+), 4 deletions(-)
git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/80/501180/1
--
To view, visit https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/501180
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Change-Id: Iead1952457ab169b15260aab0c4329dc2644d6df
Gerrit-Change-Number: 501180
Gerrit-PatchSet: 1
Gerrit-Owner: sepia-liu <liuwei_coder(a)163.com>
Gerrit-MessageType: newchange
3 years, 1 month