Sounds plausible. There may be nits with specifics, but it seems likely
that it's okay. It'll be nice to have monitoring.
Daniel
On 10/12/21 1:33 AM, Bjorn Leffler wrote:
It's been a while, but I'm finally ready to contribute my monitoring
changes to the main branch. Below are some screenshots to give you an
idea of what the main Ganesha dashboard looks like. It's straightforward
to add metrics for other FSALs. Initially, I tried to write all the code
in C, but that didn't work well, as the DigitalOcean Prometheus C client
<https://github.com/digitalocean/prometheus-client-c> had a serious
performance issue
<https://github.com/digitalocean/prometheus-client-c/issues/59>. The
higher the Ganesha load, the more overall performance decreased. So I
switched to the recommended C++ client
<https://github.com/jupp0r/prometheus-cpp> instead, which has worked
much better in my high-performance tests. I've also written a wrapper
library around that C++ client, so that a single function call from
Ganesha automatically generates:
* Request rates.
* Network throughput rates.
* Latency percentiles.
* Request size percentiles.
* Response size percentiles.
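
As a rough illustration of the single-call idea in the list above, here is a
minimal C++ sketch that uses only the standard library. It is not the wrapper
described in this message and it does not use the prometheus-cpp API:
`Histogram`, `RequestMetrics`, `ObserveRequest` and the bucket bounds are all
hypothetical names and values chosen for the example. It also assumes that
rates are not computed in-process; the sketch only keeps monotonic counters,
from which Prometheus can derive rates at query time.

```cpp
// Illustrative sketch only: a stdlib-only stand-in, not the prometheus-cpp
// API and not the wrapper discussed in this thread. All names are hypothetical.
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Fixed-bucket histogram. counts_[i] holds the observations that fall in
// (bounds_[i-1], bounds_[i]]; the final slot holds everything above the
// last bound.
class Histogram {
public:
    explicit Histogram(std::vector<double> bounds)
        : bounds_(std::move(bounds)), counts_(bounds_.size() + 1, 0) {
        assert(!bounds_.empty());
        assert(std::is_sorted(bounds_.begin(), bounds_.end()));
    }

    void Observe(double value) {
        // Find the first bucket whose upper bound is >= value.
        auto it = std::lower_bound(bounds_.begin(), bounds_.end(), value);
        ++counts_[static_cast<std::size_t>(it - bounds_.begin())];
        ++total_;
    }

    // Upper bound of the bucket that contains the q-quantile (0 < q <= 1).
    // Falls back to the last finite bound if the quantile lands in the
    // overflow bucket, and returns 0 when nothing has been observed.
    double QuantileUpperBound(double q) const {
        if (total_ == 0) return 0.0;
        const double rank = q * static_cast<double>(total_);
        std::uint64_t seen = 0;
        for (std::size_t i = 0; i < bounds_.size(); ++i) {
            seen += counts_[i];
            if (static_cast<double>(seen) >= rank) return bounds_[i];
        }
        return bounds_.back();
    }

    std::uint64_t total() const { return total_; }

private:
    std::vector<double> bounds_;
    std::vector<std::uint64_t> counts_;
    std::uint64_t total_ = 0;
};

// One call per completed request updates every series at once.
struct RequestMetrics {
    std::uint64_t requests = 0;        // monotonic request counter
    std::uint64_t bytes_received = 0;  // monotonic inbound byte counter
    std::uint64_t bytes_sent = 0;      // monotonic outbound byte counter
    Histogram latency_ms{std::vector<double>{1.0, 5.0, 10.0, 50.0, 100.0, 500.0}};
    Histogram request_bytes{std::vector<double>{512.0, 4096.0, 65536.0, 1048576.0}};
    Histogram response_bytes{std::vector<double>{512.0, 4096.0, 65536.0, 1048576.0}};

    void ObserveRequest(double latency, std::uint64_t req_size,
                        std::uint64_t resp_size) {
        ++requests;
        bytes_received += req_size;
        bytes_sent += resp_size;
        latency_ms.Observe(latency);
        request_bytes.Observe(static_cast<double>(req_size));
        response_bytes.Observe(static_cast<double>(resp_size));
    }
};
```

A caller would invoke `ObserveRequest(latency, request_size, response_size)`
once when a request completes. The percentile estimate is only as precise as
the bucket bounds, which is the usual trade-off with bucketed histograms.
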
What I'd like to do next is to:
1. Release the C++ wrapper as a standalone piece of software, under the
   Apache 2 license. This is so that it can be integrated into other
   applications.
2. Add these modifications to the main branch:
   1. A header file into src/include.
   2. C and C++ files into the new directory src/monitoring.
   3. A few function calls into C files in the src/MainNFSD directory.
   4. Monitoring configuration files into src/config_samples.
   5. Changes to the CMakeLists.txt files, leaving the new monitoring
      disabled by default.
Does that sound like a good plan? Any comments or suggestions?
Thanks,
Bjorn
On Thu, Jul 16, 2020 at 2:57 AM Malahal Naineni <malahal(a)gmail.com> wrote:
Including rpcinfo checks for various services would be good to have
as well.
On Tue, Jul 14, 2020 at 5:27 PM Daniel Gryniewicz <dang(a)redhat.com> wrote:
This seems like a fine idea to me. All the counters I'm aware of are
available via DBUS.
Daniel
On 7/14/20 2:12 AM, Bjorn Leffler via Devel wrote:
> Apart from the counters that you can access through dbus, is there any
> other monitoring built into Ganesha?
>
> I'm thinking of adding it with this high-level plan:
> - Export metrics from Ganesha to Prometheus.
> - Aggregate data in Prometheus.
> - Display monitoring consoles and graphs with Grafana.
> - Package up Prometheus, Grafana and the preconfigured rules/dashboards
> as a Docker image.
> - This makes it straightforward to deploy monitoring alongside a Ganesha
> binary.
>
> My rough plan for the code is to:
> - Add a USE_MONITORING directive to the CMakeLists.txt files.
> - Add a build dependency on the Prometheus C client
> <https://github.com/digitalocean/prometheus-client-c>.
> - Create a src/monitoring directory for the new source files and
> templates.
> - Increment counters and timers throughout the code.
> - Use histograms to compute latency percentiles, heatmaps, etc.
>
> Is this a good idea? Any objections or suggestions?
>
> Thanks,
> Bjorn
>
>
> _______________________________________________
> Devel mailing list -- devel(a)lists.nfs-ganesha.org
> To unsubscribe send an email to devel-leave(a)lists.nfs-ganesha.org
>