Nagios external commands (2)

Tobias Mucke tobias.mucke at googlemail.com
Wed Sep 26 08:17:31 CEST 2007


Hi Andreas, hi list,

thanks for this good thread and your time.

> > Working with the External Commands documentation at nagios.org I got
> > the impression that it is written manually and stored in some kind of
> > content management system.
>
> True. This must be so, or the docs might not match the reality in the
> code.

Hopefully, the docs do not match the reality in the code! There are
some unlovely mistakes in them. Until today I thought these were just
typos, because code and docs are maintained separately.
My approach would be to maintain the XML file with two ideas behind it:

- it is an independent, machine-readable format describing the
supported commands
- it can be used to generate more readable documentation
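
To make the two ideas concrete, here is a minimal sketch. It uses Python
rather than XSLT only to keep the example short. The element and
attribute names (commands, command, summary, arg) are invented for
illustration and are not an existing Nagios schema; the two command names
and their arguments are taken from the External Commands documentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical description format: element and attribute names are
# assumptions made for this sketch, not an existing Nagios schema.
DESCRIPTION = """\
<commands>
  <command name="DISABLE_HOST_NOTIFICATIONS">
    <summary>Disables notifications for a host.</summary>
    <arg name="host_name" type="string"/>
  </command>
  <command name="ENABLE_SVC_NOTIFICATIONS">
    <summary>Enables notifications for a service.</summary>
    <arg name="host_name" type="string"/>
    <arg name="service_description" type="string"/>
  </command>
</commands>
"""


def render_docs(xml_text):
    """Return a plain-text command reference generated from the description."""
    root = ET.fromstring(xml_text)
    lines = []
    for cmd in root.findall("command"):
        args = [a.get("name") for a in cmd.findall("arg")]
        # Nagios external commands separate their fields with semicolons.
        syntax = ";".join([cmd.get("name")] + ["<%s>" % a for a in args])
        lines.append(syntax)
        lines.append("    " + cmd.findtext("summary", default="").strip())
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_docs(DESCRIPTION))
```

The same file could be fed to an XSLT stylesheet to produce HTML; the
point is only that the readable documentation is derived from the
description instead of being written by hand.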

> > With an XML file describing the Nagios
> > External Command Interface you could generate the documentation e.g.
> > by XSLT.
> > Why should you do that? Because it would be possible to send patches
> > for the documentation in a program understandable diff format.
> >
>
> But it already is. The program creating such program understandable diffs
> is called (surprise, surprise) "diff". The program that understands those
> diffs is called "patch". ;-)

Surprise: I know these programs. ;-) And I would use them to send
patches for the documentation. But I am not convinced that this will
work well. How do you send patches for files generated by a content
management system?
Besides, this would only correct the documentation, not the API. I
am convinced that documentation has to be generated from the program
code, so that the documentation always matches the API and every change
to the API is reflected immediately in the documentation. Other programs
are using the API. How can they handle changes without changing their
program code? With a format-independent description of the API. These
ideas result in: API -> description -> documentation.

>
> Seriously though, like I said above, the docs need to be maintained along
> with the code, or the doc-to-code relation might get broken for a certain
> revision of either.

Agreed, this is the right approach. But there is no description of the
API yet, so I decided to start with it. In the future the description
should be generated automatically from the API. That's why I think
it is important to Nagios and this mailing list.

> > Actually we are going to work on a concept for an even more
> > distributed environment. Right now all distributed monitoring systems
> > are located at one datacenter location. But we want to roll out Nagios
> > to other datacenters at remote locations too. The idea is to give the
> > remote datacenter as much autonomy as possible, e.g. keep
> > notifications local at each location.
>
> Sounds like a good idea. I'd use a NEB to propagate the commands if I
> were you, or a separate program altogether.

What about NEBs: are they incorporated into the Nagios code base
after discussion and code quality review, like kernel modules are?

> > Nevertheless we want just one central console to control the Nagios
> > infrastructure, for example to enable / disable notifications. So you
> > have to control Nagios by an interface you can send commands to from a
> > central system. These commands should be checked if they are correct
> > before they are passed into the pipe.
> >
>
> Why? They must still be checked for correctness inside Nagios. Adding
> another point of failure might mean you get bitrot in the xml-doc, so
> your commands (which nagios would understand perfectly) don't go through.
> Otoh, if you update the xml-doc prematurely, you would get "OK" from the
> XML validation thingie and then Nagios wouldn't grok it anyways.

For several reasons. First of all: all input should be checked
before it is passed to a program, so just writing a program which can
write anything you like to the pipe would be bad.
Second: A tool should support its users and give some explanation
of its usage, in this case some information about the API.
Third: Quality control and versioning. How is versioning of the API
handled today? Just by changing the documentation? What about
commands which are under development and should not be used in a
production environment? How do you test the API after changes or
before a new Nagios release?
Fourth: you need a point of control, e.g. access control. The
Nagios web interface works with a simple authentication scheme:
contacts are allowed to do everything. This point of control no longer
exists if you are at the command line.
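
A minimal sketch of the first and third points, assuming a description
like the one I have in mind. The XML layout, the "status" attribute, the
type names and SOME_NEW_COMMAND are all hypothetical; only the line
format "[timestamp] COMMAND;arg;arg" and the two stable command names
come from the External Commands documentation. Rejecting semicolons in
every field is a simplification of this sketch.

```python
import time
import xml.etree.ElementTree as ET

# Hypothetical description; the "status" attribute, the type names and
# SOME_NEW_COMMAND are assumptions made for this sketch.
DESCRIPTION = """\
<commands>
  <command name="DISABLE_NOTIFICATIONS" status="stable"/>
  <command name="DISABLE_HOST_NOTIFICATIONS" status="stable">
    <arg name="host_name" type="string"/>
  </command>
  <command name="SOME_NEW_COMMAND" status="experimental">
    <arg name="count" type="int"/>
  </command>
</commands>
"""


def load_commands(xml_text):
    """Map each command name to (status, [(arg_name, arg_type), ...])."""
    commands = {}
    for cmd in ET.fromstring(xml_text).findall("command"):
        args = [(a.get("name"), a.get("type")) for a in cmd.findall("arg")]
        commands[cmd.get("name")] = (cmd.get("status"), args)
    return commands


def build_command(commands, name, values, now=None):
    """Validate a command and return the line to write to the command pipe."""
    if name not in commands:
        raise ValueError("unknown command: %s" % name)
    status, args = commands[name]
    # Commands still under development are refused outright.
    if status != "stable":
        raise ValueError("%s is marked %s" % (name, status))
    if len(values) != len(args):
        raise ValueError("%s expects %d argument(s)" % (name, len(args)))
    for (arg_name, arg_type), value in zip(args, values):
        text = str(value)
        # A semicolon or newline inside a field would corrupt the line.
        if ";" in text or "\n" in text:
            raise ValueError("illegal character in %s" % arg_name)
        if arg_type == "int" and not text.lstrip("-").isdigit():
            raise ValueError("%s must be an integer" % arg_name)
    timestamp = int(time.time() if now is None else now)
    fields = [name] + [str(v) for v in values]
    return "[%d] %s\n" % (timestamp, ";".join(fields))


if __name__ == "__main__":
    cmds = load_commands(DESCRIPTION)
    print(build_command(cmds, "DISABLE_HOST_NOTIFICATIONS", ["web01"]), end="")
```

Only a line that passes these checks would be written to the pipe.
Nagios still does its own checking afterwards; the description just lets
the central console reject obviously wrong input earlier and explain why.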

> > I thought the pipe will vanish into thin air in Nagios 3.0. So this is not a
> > bottleneck anymore.
> >
> The FIFO stays. It'll no longer be used for child processes to pass service
> results to the grand-parent, but it'll still be used for external commands.
>

Thanks for pointing this out.

> However, assuming your implementation doesn't touch the nagios core, it won't
> be a problem since you can then convert the XML-command into a format that
> nagios groks, after having stripped the XML overhead.

I don't understand why you think that XML is overhead. XML is just
used to describe the API. Which format would you prefer? No, it won't
touch the core at first. But I would like to see some rethinking
along the lines of this approach: API -> description -> documentation.
This would create an opportunity to improve the quality of the API and
the documentation, and could play an important part in future
developments.

Tobias
