[naemon-dev] Ideas about future features

Matthias Eble psychotrahe at gmail.com
Sat Dec 28 02:46:50 CET 2013


>>1) have a feature to monitor per metric rather than per check_command.

> I've been thinking about plugins and plugin architecture a bit.
>
> The nagiosplugins project is talking about a new threshold format -
> https://www.nagios-plugins.org/doc/new-threshold-syntax.html - to achieve
> the same thing you want to solve in-core.

It's not the same thing. The problem described in 1) cannot be solved
in the plugins.

> I think the nagiosplugins approach - basically, update all plugins to
> support a much more complex (though easier to understand) threshold format,
> because the old one was too complicated - is wrong.

I am a member of the plugin team, though I have been almost entirely
inactive for some years now.
I am also not sure about the new format. Converting all the plugins is a
lot of work, but that kind of thing is needed from time to time.
What you get is a more consistent command line and more consistent
monitoring.
Another benefit is that enhancements can be made without breaking
compatibility, for example when check_something gains two more metrics
in version X.

What I was talking about is a new *output* format, that would be parseable
by the core, which would create sub-services at run time.

>
> But I'm also not sure how far into the core I'd want to put it. What if
> we, instead of changing either the core or the plugins, write a plugin wrapper
> that takes a threshold as described by nagiosplugins and a plugin command
> line? It would simply parse the perfdata from the plugin, the threshold from
> the CLI, throw away the plugin exit code, and send a new, "improved" exit
> code and stdout to naemon?
>
> I feel this plugin wrapper approach would take the least amount of work to
> implement. Which problems would it leave unsolved?

I can't see how this would solve the problem in 1): acknowledge one
problem and you acknowledge the entire service. If a plugin checks three
metrics, acknowledging a problem with one metric also disables
notifications for the other two.
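
To illustrate the limitation, here is a minimal sketch of such a wrapper.
It is not existing naemon or nagios-plugins code: the names wrap() and
parse_perfdata() are made up, labels are assumed to contain no spaces, and
the thresholds are plain upper bounds rather than the proposed new
threshold syntax. However the thresholds are evaluated, the wrapper still
has to collapse everything into one exit code for one service:

```python
import re

# Plugin exit codes as used by Nagios-style plugins.
OK, WARNING, CRITICAL = 0, 1, 2
NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def parse_perfdata(plugin_output):
    """Return {label: value} from the perfdata after the first '|'.

    Simplification: labels must not contain spaces, and only the value
    field is read (unit and ;warn;crit;min;max are ignored).
    """
    _, _, perf = plugin_output.partition("|")
    metrics = {}
    for item in perf.split():
        label, _, fields = item.partition("=")
        match = re.match(r"-?\d+(?:\.\d+)?", fields)
        if label and match:
            metrics[label.strip("'")] = float(match.group(0))
    return metrics


def wrap(plugin_output, thresholds):
    """Re-evaluate a plugin result against {label: (warn, crit)}.

    The plugin's own exit code is ignored. Each metric gets a state,
    but only the worst one survives as the single exit code.
    """
    states = {}
    for label, value in parse_perfdata(plugin_output).items():
        if label not in thresholds:
            continue
        warn, crit = thresholds[label]
        if value >= crit:
            states[label] = CRITICAL
        elif value >= warn:
            states[label] = WARNING
        else:
            states[label] = OK
    exit_code = max(states.values(), default=OK)
    summary = ", ".join(f"{label}={NAMES[s]}" for label, s in states.items())
    return exit_code, f"{NAMES[exit_code]} - {summary}"


output = "OK - load average: 0.50, 7.20, 11.00|load1=0.5 load5=7.2 load15=11.0"
limits = {"load1": (5, 10), "load5": (5, 10), "load15": (5, 10)}
code, text = wrap(output, limits)
print(code, text)
```

The core still sees a single service in state CRITICAL. Acknowledging it
because of load15 also silences the WARNING on load5.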

IMO it should be possible to have each metric handled as its own
"service" with separate logging, actions and so on, but without
maintaining each one manually in the config file.

The new threshold syntax would be a step in that direction: it specifies
what is to be monitored. The output format would specify how the results
are passed to the core, and the core would have to know how to
demultiplex the result into the "per-metric services".
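
A rough sketch of that demultiplexing step, under stated assumptions: the
SubResult type, the demultiplex() function and the "service/metric" naming
scheme are hypothetical and not part of naemon. For brevity it reuses the
existing perfdata layout (label=value;warn;crit) as the output format and
treats warn and crit as plain upper bounds, ignoring range syntax:

```python
import re
from dataclasses import dataclass

OK, WARNING, CRITICAL = 0, 1, 2


@dataclass
class SubResult:
    service: str   # hypothetical name of the per-metric sub-service
    state: int     # state of this metric alone
    value: float


def _number(text):
    """Leading number of a perfdata field, or None if there is none."""
    match = re.match(r"-?\d+(?:\.\d+)?", text)
    return float(match.group(0)) if match else None


def demultiplex(service, plugin_output):
    """Split one check result into one result per perfdata metric."""
    _, _, perf = plugin_output.partition("|")
    results = []
    for item in perf.split():
        label, _, fields = item.partition("=")
        # Pad so that missing warn/crit fields become empty strings.
        parts = fields.split(";") + ["", ""]
        value, warn, crit = (_number(p) for p in parts[:3])
        if value is None:
            continue
        if crit is not None and value >= crit:
            state = CRITICAL
        elif warn is not None and value >= warn:
            state = WARNING
        else:
            state = OK
        name = f"{service}/{label.strip(chr(39))}"  # chr(39) is a single quote
        results.append(SubResult(name, state, value))
    return results


line = "OK - load|load1=0.5;5;10 load5=7.2;5;10 load15=11.0;5;10"
for sub in demultiplex("Load", line):
    print(sub.service, sub.state, sub.value)
```

Each SubResult could then be logged, notified and acknowledged on its own,
so acknowledging Load/load15 would leave Load/load5 untouched.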


>> What do you think? What's the focus of the dev-team?
>
>
> So far, it seems the focus is mostly on cleanup. There's just *so* *much*
> ancient *crap* lying around.

Got that

> As far as I'm going to go in terms of longer-term vision and
> the-way-to-go-iness, I'd like to modularize the crap out of the core.
> It would be neat to lift out a bunch of nagios functionality into a
> bundle of preinstalled modules.
>...
> That would require
> some extra module functionality - modules would have to be able to add
> configuration statements to the config (global and per-object) for
> configuring flapping thresholds, and modules would have to be able to couple
> state (is_flapping, last 20 check results) with the object and have it
> persist between restarts. Now, what if this was the easiest, most concise,
> and easiest-to-find-out-how way to do it?
> ...
> I think a module should be able to do all these things

That'd be great. I just wonder whether more flexibility would also require
some per-module bits in the UI.


> - and if it could do
> that, and if flapping was a module, I would not ever again have to worry
> about flapping in the remaining core, nor would I wonder where all special
> cases for flapping are handled - heck, I could even see if the flapping
> feature has tests and how extensive they are, just from looking at
> github.com/naemon/flapping !

Hmm. I don't think that necessarily gets better by being more modular. All
of the above could be achieved with a better-structured codebase, couldn't it?

> tl;dr: naemon should allow contributors to write modules that are much more
> powerful than today's broker modules, to make it possible and easy to write
> a module to add seemingly built-in functionality, like metrics and
> exceptions - then, we could start to write such modules, go crazy, and see
> what comes out!

One of the questions I ask myself is whether naemon will be a monitoring
solution, a pluggable monitoring platform, or both.
"Write modules, go crazy and see what comes out" would be a cool thing, but
I think users should still have a reference "complete" solution that they
can start with.
Would this be the task of a "Naemon Distro" like OMD?

