It's been a while, but I'm finally ready to contribute my monitoring
changes to the main branch. Below are some screenshots to give you an idea
of what the main Ganesha dashboard looks like. It's straightforward to add
metrics for other FSALs.

Initially, I tried to write all the code in C, but that didn't work well,
as the DigitalOcean Prometheus C client
<https://github.com/digitalocean/prometheus-client-c> had a serious
performance issue
<https://github.com/digitalocean/prometheus-client-c/issues/59>. The higher
the Ganesha load, the more overall performance decreased. So I switched to
the recommended C++ client <https://github.com/jupp0r/prometheus-cpp>
instead, which has worked much better in my high-performance tests.

I've also written a wrapper library around that C++ client, so that a
single function call from Ganesha automatically generates:
- Request rates.
- Network throughput rates.
- Latency percentiles.
- Request size percentiles.
- Response size percentiles.
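
As a rough illustration of the single-call idea above, here is a minimal,
standard-library-only C++ sketch. It is not the actual wrapper and does not
use prometheus-cpp; the names Histogram, RequestMetrics and Observe, and the
bucket boundaries, are all hypothetical. It assumes rates are derived at
query time from monotonically increasing counters, and that percentiles are
estimated from fixed histogram buckets.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical fixed-bucket histogram (not the prometheus-cpp API).
// Not thread-safe; a real server would need atomics or locking.
class Histogram {
 public:
  // upper_bounds must be sorted ascending; one overflow bucket is added.
  explicit Histogram(std::vector<double> upper_bounds)
      : bounds_(std::move(upper_bounds)), counts_(bounds_.size() + 1, 0) {
    assert(std::is_sorted(bounds_.begin(), bounds_.end()));
  }

  void Observe(double value) {
    // First bucket whose upper bound is >= value, else the overflow bucket.
    auto it = std::lower_bound(bounds_.begin(), bounds_.end(), value);
    ++counts_[static_cast<std::size_t>(it - bounds_.begin())];
    ++total_;
  }

  // Upper bound of the bucket that contains quantile q (0 < q <= 1).
  // Falls back to the largest finite bound for the overflow bucket.
  double Quantile(double q) const {
    if (total_ == 0 || bounds_.empty()) return 0.0;
    const double target = q * static_cast<double>(total_);
    std::uint64_t seen = 0;
    for (std::size_t i = 0; i < bounds_.size(); ++i) {
      seen += counts_[i];
      if (static_cast<double>(seen) >= target) return bounds_[i];
    }
    return bounds_.back();
  }

 private:
  std::vector<double> bounds_;
  std::vector<std::uint64_t> counts_;
  std::uint64_t total_ = 0;
};

// Hypothetical per-operation metrics, all updated by one Observe() call.
struct RequestMetrics {
  std::uint64_t requests = 0;        // request rate = change over time
  std::uint64_t bytes_received = 0;  // inbound throughput
  std::uint64_t bytes_sent = 0;      // outbound throughput
  Histogram latency_ms{std::vector<double>{1, 5, 10, 50, 100, 500}};
  Histogram request_bytes{std::vector<double>{512, 4096, 65536, 1048576}};
  Histogram response_bytes{std::vector<double>{512, 4096, 65536, 1048576}};

  void Observe(double latency, std::uint64_t req_size,
               std::uint64_t resp_size) {
    ++requests;
    bytes_received += req_size;
    bytes_sent += resp_size;
    latency_ms.Observe(latency);
    request_bytes.Observe(static_cast<double>(req_size));
    response_bytes.Observe(static_cast<double>(resp_size));
  }
};
```

A caller would invoke something like metrics.Observe(latency, request_size,
response_size) once per completed request. The percentile resolution is
limited by the bucket boundaries chosen.
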
What I'd like to do next is to:
1. Release the C++ wrapper as a standalone piece of software, under the
   Apache 2 license. This is so that it can be integrated into other
   applications.
2. Add these modifications to the main branch:
   1. A header file in src/include.
   2. C and C++ files in the new directory src/monitoring.
   3. A few function calls in C files in the src/MainNFSD directory.
   4. Monitoring configuration files in src/config_samples.
   5. Changes to the CMakeLists.txt files, leaving the new monitoring
      disabled by default.
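
For readers wondering how C files can call into C++ monitoring code, one
common pattern is a header with C linkage plus a C++ implementation. The
sketch below is an assumption about how that could look, not the actual
patch. The function names monitoring_observe_request and
monitoring_request_count are hypothetical, and plain atomics stand in for
the real metrics library.

```cpp
#include <stdint.h>

#include <atomic>

// --- What a shared header could declare (hypothetical names). ---
// The extern "C" guard lets both C and C++ translation units include it.
#ifdef __cplusplus
extern "C" {
#endif

void monitoring_observe_request(uint64_t latency_ns, uint64_t request_bytes,
                                uint64_t response_bytes);
uint64_t monitoring_request_count(void);

#ifdef __cplusplus
}
#endif

// --- What a C++ source file could implement. ---
namespace {
// Stand-ins for real metric objects; atomics keep the updates thread-safe.
std::atomic<uint64_t> g_requests{0};
std::atomic<uint64_t> g_latency_ns_total{0};
std::atomic<uint64_t> g_bytes_received{0};
std::atomic<uint64_t> g_bytes_sent{0};
}  // namespace

extern "C" void monitoring_observe_request(uint64_t latency_ns,
                                           uint64_t request_bytes,
                                           uint64_t response_bytes) {
  g_requests.fetch_add(1, std::memory_order_relaxed);
  g_latency_ns_total.fetch_add(latency_ns, std::memory_order_relaxed);
  g_bytes_received.fetch_add(request_bytes, std::memory_order_relaxed);
  g_bytes_sent.fetch_add(response_bytes, std::memory_order_relaxed);
}

extern "C" uint64_t monitoring_request_count(void) {
  return g_requests.load(std::memory_order_relaxed);
}
```

With this shape, a C caller only needs the header and a single call per
request, while the C++ details stay inside the monitoring directory.
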
Does that sound like a good plan? Any comments or suggestions?
Thanks,
Bjorn
On Thu, Jul 16, 2020 at 2:57 AM Malahal Naineni <malahal(a)gmail.com> wrote:
Including rpcinfo checks for various services would be good to have as
well.
On Tue, Jul 14, 2020 at 5:27 PM Daniel Gryniewicz <dang(a)redhat.com> wrote:
> This seems like a fine idea to me. All the counters I'm aware of are
> available via DBUS.
>
> Daniel
>
> On 7/14/20 2:12 AM, Bjorn Leffler via Devel wrote:
> > Apart from the counters that you can access through dbus, is there any
> > other monitoring built into Ganesha?
> >
> > I'm thinking of adding it with this high-level plan:
> > - Export metrics from Ganesha to Prometheus.
> > - Aggregate data in Prometheus.
> > - Display monitoring consoles and graphs with Grafana.
> > - Package up Prometheus, Grafana and the preconfigured rules/dashboards
> > as a Docker image.
> > - This makes it straightforward to deploy monitoring alongside a Ganesha
> > binary.
> >
> > My rough coding plan is to:
> > - Add a USE_MONITORING directive to the CMakeLists.txt files.
> > - Add a build dependency on the Prometheus C client
> > <https://github.com/digitalocean/prometheus-client-c>.
> > - Create a src/monitoring directory for the new source files and
> > templates.
> > - Increment counters and timers throughout the code.
> > - Use histograms to compute latency percentiles, heatmaps, etc.
> >
> > Is this a good idea? Any objections or suggestions?
> >
> > Thanks,
> > Bjorn
> >
> >
> > _______________________________________________
> > Devel mailing list -- devel(a)lists.nfs-ganesha.org
> > To unsubscribe send an email to devel-leave(a)lists.nfs-ganesha.org