Nagios 2.0 Event Broker and DB Support

Ethan Galstad nagios at nagios.org
Fri Aug 1 07:02:03 CEST 2003


Sorry for the crosspost, but the nagios-devel list is usually pretty 
quiet when I request comments about new features I'm implementing.  
This one is bigger than most, so I wanted to reach more people.  This 
is a bit long, so bear with me...

I am almost done coding 2.0.  Two big things remain: the 
event broker and DB support (which is currently broken).

My original intent was to develop the event broker as a separate 
application, tying it to Nagios with a unix domain socket.  Nagios 
would send the event broker information about everything that was 
going on (service checks, downtime, flapping, log entries, etc.).  
The event broker would be able to load user-developed modules (object 
files) at runtime and pass various types of Nagios data to them for 
processing.  This is all fine and good.  I have a working prototype 
of the event broker that does just this and seems to work okay.  I 
got to thinking that it was rather stupid to develop a separate 
application for this when I could simply have Nagios load user-
developed modules itself.  Doing this would give the modules the 
benefit of having access to internal Nagios structures and functions 
(which is good and bad - see below).

Here's an overview of how it would work:

- Nagios would load user-specified modules (object files) at startup 
using the dlopen() function.

- Nagios would call the module's initialization function (the name of 
which would be standardized).

- The module's init function would register for various types of 
Nagios event data (service checks, host checks, log entries, event 
handlers, etc.) using callback functions.

- When Nagios encounters an event for which a module has registered a 
callback function, Nagios would call that module's function and pass 
it data relevant to the event.  The module is then free to do 
whatever it wants to that event data.  An example might be to log 
service checks, performance data and log entries to MySQL, etc.

- Before shutting down, Nagios would call the module's de-init function.  
This allows the module to clean up any resources it may be using.
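
To make the flow above a bit more concrete, here is a rough in-process 
sketch in C.  Every name in it (register_callback, broker_event, 
mymod_init, mymod_deinit, the event_type enum) is made up for 
illustration and is not an existing Nagios API.  A real implementation 
would locate the module's init function in the object file with 
dlopen() and dlsym() rather than calling it directly as the demo does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical event types; illustrative names only, not a Nagios API. */
enum event_type {
    EVENT_SERVICE_CHECK,
    EVENT_HOST_CHECK,
    EVENT_LOG_ENTRY,
    EVENT_TYPE_COUNT
};

/* Signature every module callback shares in this sketch. */
typedef int (*event_callback)(enum event_type type, const void *data);

#define MAX_CALLBACKS 8

static event_callback callbacks[EVENT_TYPE_COUNT][MAX_CALLBACKS];
static size_t callback_count[EVENT_TYPE_COUNT];

static int valid_type(enum event_type type)
{
    int t = (int)type;
    return t >= 0 && t < (int)EVENT_TYPE_COUNT;
}

/* Called from a module's init function to subscribe to one event type. */
int register_callback(enum event_type type, event_callback cb)
{
    if (!valid_type(type) || cb == NULL)
        return -1;
    if (callback_count[type] >= MAX_CALLBACKS)
        return -1;
    callbacks[type][callback_count[type]++] = cb;
    return 0;
}

/* Called by the core when an event occurs.  Returns how many
 * callbacks were invoked. */
int broker_event(enum event_type type, const void *data)
{
    int called = 0;
    size_t i;

    if (!valid_type(type))
        return 0;
    for (i = 0; i < callback_count[type]; i++) {
        callbacks[type][i](type, data);
        called++;
    }
    return called;
}

/* ---- what a user module might look like ---- */

static int mymod_checks_seen;

static int mymod_handle_servicecheck(enum event_type type, const void *data)
{
    (void)type;
    /* A real module might write this to MySQL instead of stdout. */
    printf("mymod: service check result: %s\n", (const char *)data);
    mymod_checks_seen++;
    return 0;
}

/* Standardized init entry point (the name is an assumption). */
int mymod_init(void)
{
    return register_callback(EVENT_SERVICE_CHECK, mymod_handle_servicecheck);
}

/* Standardized de-init entry point; nothing to release in this sketch. */
int mymod_deinit(void)
{
    return 0;
}

/* Walks through load, event dispatch, and shutdown. */
void broker_demo(void)
{
    assert(mymod_init() == 0);
    assert(broker_event(EVENT_SERVICE_CHECK, "web01;HTTP;OK") == 1);
    assert(broker_event(EVENT_LOG_ENTRY, "nobody registered") == 0);
    assert(mymod_deinit() == 0);
}
```

The event data is passed as an opaque pointer here only to keep the 
sketch short; what the callbacks should actually receive is exactly the 
open question discussed below.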


Seems good in theory.  Heck, might even work okay.  However, there's 
a big problem I have with it.  If I implement things this way, the 
user-developed modules would have access to internal Nagios data 
structures and functions.  This is not necessarily bad, as ill-
behaved modules would not be adopted by too many people. :-)  
However, modules that compile and work fine under Nagios 2.0 might 
segfault under future versions if the internal data structures 
change.  Here's an example of what I mean:

A user module registers for Nagios service check data using its 
mymod_handle_servicecheck() function, which has a prototype of:

	int mymod_handle_servicecheck(service *);

The service struct is an internal Nagios structure definition which 
changes between Nagios versions.  If the user module is compiled for 
use with Nagios 2.0 and its definition of the service struct, it 
will have problems if it is not recompiled for future versions of 
Nagios.
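
A small illustration of why this breaks, using two invented layouts 
(the real service struct has many more fields, and the "later" layout 
is purely hypothetical).  Adding one field ahead of an existing one 
moves that field's offset, so a module compiled against the old 
definition would read the wrong memory:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical layout a module was compiled against. */
struct service_v2_0 {
    char *host_name;
    char *description;
    int   current_state;
};

/* Imagined later release that inserts a field before current_state. */
struct service_later {
    char *host_name;
    char *description;
    char *display_name;   /* new field, invented for this example */
    int   current_state;
};

/* Returns 1 if current_state sits at a different offset in the two
 * layouts, i.e. an old module would look in the wrong place. */
int state_offset_moved(void)
{
    return offsetof(struct service_v2_0, current_state)
        != offsetof(struct service_later, current_state);
}

/* Prints both offsets so the mismatch is visible. */
void print_state_offsets(void)
{
    printf("current_state offset, 2.0 layout:   %zu\n",
           offsetof(struct service_v2_0, current_state));
    printf("current_state offset, later layout: %zu\n",
           offsetof(struct service_later, current_state));
}

void layout_demo(void)
{
    print_state_offsets();
    assert(state_offset_moved());
}
```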

Off the top of my head, I could overcome this by requiring that the 
user modules indicate (by calling a function) what version of Nagios 
they are compiled for.  If they report anything but the current 
version (or do not report at all), unload them so they can do no 
harm.
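
Roughly, the check could look like the sketch below.  The names 
(CORE_API_VERSION, struct module, module_report_version, 
check_or_unload) are assumptions for illustration only, and the 
"unload" step just clears a flag where real code would call dlclose() 
on the module's handle.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Version string the running core expects (hypothetical constant). */
#define CORE_API_VERSION "2.0"

struct module {
    const char *name;
    const char *reported_version;  /* NULL until the module reports */
    int loaded;
};

/* The function a module would call from its init routine. */
void module_report_version(struct module *m, const char *version)
{
    m->reported_version = version;
}

/* A module passes only if it reported exactly the core's version. */
int module_version_ok(const struct module *m)
{
    return m->reported_version != NULL
        && strcmp(m->reported_version, CORE_API_VERSION) == 0;
}

/* Returns 1 if the module is kept, 0 if it was unloaded. */
int check_or_unload(struct module *m)
{
    if (!module_version_ok(m)) {
        m->loaded = 0;  /* real code would dlclose() the handle here */
        return 0;
    }
    return 1;
}

void version_demo(void)
{
    struct module good   = { "good",   NULL, 1 };
    struct module stale  = { "stale",  NULL, 1 };
    struct module silent = { "silent", NULL, 1 };

    module_report_version(&good, "2.0");
    module_report_version(&stale, "1.0");
    /* "silent" never reports a version at all */

    assert(check_or_unload(&good) == 1 && good.loaded == 1);
    assert(check_or_unload(&stale) == 0 && stale.loaded == 0);
    assert(check_or_unload(&silent) == 0 && silent.loaded == 0);
}
```

An exact string match is the strictest possible policy; it forces a 
recompile for every release, even ones where the structs did not 
change.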

I'm afraid I'm in a bit over my head on how to handle this one.  Some of 
you developers out there must have experience with this type of 
thing.  If so, how did you handle it?  What would you recommend?  
Comments, suggestions, flames?  Is there a better way to accomplish 
this?  Speak up now.

What does this have to do with DB support, you ask?  Well, if I 
implement the event broker as I have proposed, I will be yanking 
native DB support out of Nagios completely.  You can then write a 
module to log to a DB if you want. :-)

PS: I had originally planned on exposing almost all of Nagios' data 
and events to the broker, but I may have to scale that down if I plan 
on getting 2.0 out this century.  Perhaps just support for:

	- Service and host checks
	- Event handlers
	- Log data

This would allow the development of modules to log check information, 
performance data, and log file data to a DB (or whatever).


Ethan Galstad,
Nagios Developer
---
Email: nagios at nagios.org
Website: http://www.nagios.org






