Just sent the main change to Ganesha. Please let me know what you think.

I'm a bit confused as to how the USE_THING and _USE_THING definitions work.
The code depends on the "metrics" library, whose source I'm in the process of releasing. I've just completed our internal code review process.
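
My working assumption on the USE_THING question, which may well be wrong: USE_THING is the CMake-level option, and _USE_THING is the preprocessor symbol that the build makes visible to the C/C++ sources. Below is a minimal sketch of the guard I'd expect to write. The names _USE_MONITORING and monitoring_enabled() are hypothetical (neither is taken from the tree), and the symbol is defined by hand only so the snippet compiles on its own:

```cpp
// Assumption: the build system would normally define this symbol when the
// corresponding CMake option is switched on. It is hard-coded here purely
// so that the sketch is self-contained.
#define _USE_MONITORING 1

#ifdef _USE_MONITORING
// Hypothetical helper: compiled in when monitoring is enabled.
constexpr bool monitoring_enabled() { return true; }
#else
// Fallback when the symbol is absent, so callers need no #ifdef of their own.
constexpr bool monitoring_enabled() { return false; }
#endif
```

If that reading is right, call sites would only ever test the underscore-prefixed symbol, never the CMake option name directly.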

Thanks,
Bjorn

On Wed, Oct 13, 2021 at 8:21 AM Bjorn Leffler <leffler@google.com> wrote:
Great, I'll start sending the changes.

On Tue, Oct 12, 2021 at 10:51 PM Daniel Gryniewicz <dang@redhat.com> wrote:
Sounds plausible.  There may be nits with specifics, but it seems likely
that it's okay.  It'll be nice to have monitoring.

Daniel

On 10/12/21 1:33 AM, Bjorn Leffler wrote:
> It's been a while, but I'm finally ready to contribute my monitoring
> changes to the main branch. Below are some screenshots to give you an
> idea of what the main Ganesha dashboard looks like. It's straightforward
> to add metrics for other FSALs. Initially, I tried to write all the code
> in C, but that didn't work well, as the DigitalOcean Prometheus C
> client <https://github.com/digitalocean/prometheus-client-c> had a
> serious performance issue
> <https://github.com/digitalocean/prometheus-client-c/issues/59>. The
> higher the Ganesha load, the more overall performance decreased. So I
> switched to using the recommended C++ client
> <https://github.com/jupp0r/prometheus-cpp> instead, which has worked
> much better in my high-performance tests. I've also written a wrapper
> library around that C++ client, so that a single function call from
> Ganesha automatically generates:
>
>   * Request rates.
>   * Network throughput rates.
>   * Latency percentiles.
>   * Request size percentiles.
>   * Response size percentiles.
>
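
To make the single-call idea above concrete, here is a minimal standalone sketch. It deliberately does not use the prometheus-cpp API. The names Histogram, OpMetrics, Observe and QuantileUpperBound, and all bucket boundaries, are hypothetical and chosen only for illustration. Rates are not computed here: the sketch keeps monotonically increasing counters and assumes the monitoring system derives rates from them.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical histogram with ascending bucket upper bounds, plus an
// implicit overflow bucket for values above the last bound.
class Histogram {
 public:
  explicit Histogram(std::vector<double> bounds)
      : bounds_(std::move(bounds)), counts_(bounds_.size() + 1, 0) {
    assert(!bounds_.empty());
  }

  void Observe(double value) {
    // Index of the first bound >= value; bounds_.size() means overflow.
    const auto it = std::lower_bound(bounds_.begin(), bounds_.end(), value);
    ++counts_[static_cast<std::size_t>(it - bounds_.begin())];
    ++total_;
  }

  // Coarse percentile estimate: the upper bound of the bucket holding the
  // q-th quantile, or the last bound if it lands in the overflow bucket.
  double QuantileUpperBound(double q) const {
    assert(q > 0.0 && q <= 1.0 && total_ > 0);
    const double target = q * static_cast<double>(total_);
    std::uint64_t seen = 0;
    for (std::size_t i = 0; i < bounds_.size(); ++i) {
      seen += counts_[i];
      if (static_cast<double>(seen) >= target) return bounds_[i];
    }
    return bounds_.back();
  }

 private:
  std::vector<double> bounds_;
  std::vector<std::uint64_t> counts_;
  std::uint64_t total_ = 0;
};

// Hypothetical per-operation bundle: one Observe() call updates the request
// counter, both byte counters, and all three histograms.
struct OpMetrics {
  std::uint64_t requests = 0;
  std::uint64_t bytes_in = 0;
  std::uint64_t bytes_out = 0;
  Histogram latency_us{std::vector<double>{100.0, 1000.0, 10000.0, 100000.0}};
  Histogram request_bytes{std::vector<double>{512.0, 4096.0, 65536.0, 1048576.0}};
  Histogram response_bytes{std::vector<double>{512.0, 4096.0, 65536.0, 1048576.0}};

  void Observe(double latency_us_value, std::uint64_t req_bytes,
               std::uint64_t resp_bytes) {
    ++requests;
    bytes_in += req_bytes;
    bytes_out += resp_bytes;
    latency_us.Observe(latency_us_value);
    request_bytes.Observe(static_cast<double>(req_bytes));
    response_bytes.Observe(static_cast<double>(resp_bytes));
  }
};
```

A caller would hold one OpMetrics per operation type and call Observe() once per completed request. The sketch is not thread-safe; real server code would need atomics or per-thread aggregation.
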
> What I'd like to do next is to:
>
>  1. Release the C++ wrapper as a standalone piece of software, under the
>     Apache 2 license. This is so that it can be integrated into other
>     applications.
>  2. Add these modifications to the main branch:
>      1. A header file into src/include.
>      2. C and C++ files into the new directory src/monitoring.
>      3. A few function calls into C files in the src/MainNFSD directory.
>      4. Monitoring configuration files into src/config_samples.
>      5. Changes to the CMakeLists.txt files, leaving the new monitoring
>         disabled by default.
>
> Does that sound like a good plan? Any comments or suggestions?
>
> Thanks,
> Bjorn
>
>
> On Thu, Jul 16, 2020 at 2:57 AM Malahal Naineni <malahal@gmail.com> wrote:
>
>     Including rpcinfo checks for various services would be good to have
>     as well.
>
>     On Tue, Jul 14, 2020 at 5:27 PM Daniel Gryniewicz <dang@redhat.com> wrote:
>
>         This seems like a fine idea to me.  All the counters I'm aware of are
>         available via DBUS.
>
>         Daniel
>
>         On 7/14/20 2:12 AM, Bjorn Leffler via Devel wrote:
>          > Apart from the counters that you can access through dbus, is there any
>          > other monitoring built into Ganesha?
>          >
>          > I'm thinking of adding it with this high-level plan:
>          > - Export metrics from Ganesha to Prometheus.
>          > - Aggregate data in Prometheus.
>          > - Display monitoring consoles and graphs with Grafana.
>          > - Package up Prometheus, Grafana and the preconfigured rules/dashboards
>          >   as a Docker image.
>          > - This makes it straightforward to deploy monitoring alongside a Ganesha
>          >   binary.
>          >
>          > My rough coding plan is to:
>          > - Add a USE_MONITORING directive to the CMakeLists.txt files.
>          > - Add a build dependency on the Prometheus C client
>          >   <https://github.com/digitalocean/prometheus-client-c>.
>          > - Create a src/monitoring directory for the new source files and templates.
>          > - Increment counters and timers throughout the code.
>          > - Use histograms to compute latency percentiles, heatmaps, etc.
>          >
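
As an illustration of the "increment counters and timers" step above, here is a minimal sketch of a scoped timer. It is written in C++ rather than against the Prometheus C client named in that plan, and the names OpCounters, ScopedTimer and handle_request are hypothetical, not taken from Ganesha:

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical counters for one operation type.
struct OpCounters {
  std::uint64_t calls = 0;
  std::uint64_t total_ns = 0;
};

// Records one call and its elapsed time when the enclosing scope ends,
// so early returns and error paths are counted as well.
class ScopedTimer {
 public:
  explicit ScopedTimer(OpCounters& counters)
      : counters_(counters), start_(std::chrono::steady_clock::now()) {}
  ScopedTimer(const ScopedTimer&) = delete;
  ScopedTimer& operator=(const ScopedTimer&) = delete;

  ~ScopedTimer() {
    const auto elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(
        std::chrono::steady_clock::now() - start_);
    // steady_clock is monotonic, so the elapsed time is never negative.
    assert(elapsed.count() >= 0);
    counters_.calls += 1;
    counters_.total_ns += static_cast<std::uint64_t>(elapsed.count());
  }

 private:
  OpCounters& counters_;
  std::chrono::steady_clock::time_point start_;
};

// Hypothetical request handler: a single line at the top instruments it.
inline int handle_request(OpCounters& counters) {
  ScopedTimer timer(counters);
  return 0;  // the real work would happen here
}
```

The sketch is not thread-safe; concurrent handlers would need atomic counters or per-thread aggregation.
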
>          > Is this a good idea? Any objections or suggestions?
>          >
>          > Thanks,
>          > Bjorn
>          >
>          >
>          > _______________________________________________
>          > Devel mailing list -- devel@lists.nfs-ganesha.org
>          > To unsubscribe send an email to devel-leave@lists.nfs-ganesha.org
>          >