[Nagios-users] Nagios 2.0 Event Broker and DB Support

Mooney, Ryan ryan.mooney at pnl.gov
Fri Aug 1 19:22:27 CEST 2003


You could make the service struct versioned, like:
  typedef struct {
    uint8_t version;   /* first member; the core sets it to MYVERSION */
    /* ... the rest of the fields ... */
  } service;

Then check what the module returns like:
  if(*(uint8_t *)svc != MYVERSION) {   /* svc is the service * handed back */
     moan_and_complain();
     kick_module();
  }

(Yeah yeah a union would probably be clearer, and there are 50 ways to do 
this, and yes they are all dandy :)
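
For what it's worth, here is a slightly fuller sketch of the same version
check.  The struct layout, the current_state field, the value of MYVERSION and
the service_version_ok() name are all made up for illustration; the only real
point is that the version tag sits first in the struct and gets compared
before anything else in it is trusted.

```c
#include <stddef.h>
#include <stdint.h>

#define MYVERSION 2  /* hypothetical layout version of the core */

/* Hypothetical stand-in for the real struct; the version tag comes first. */
typedef struct {
    uint8_t version;        /* layout version the struct was built with */
    int     current_state;  /* placeholder for the real fields */
} service;

/* Returns 1 if the struct carries the layout version the core expects,
 * 0 if the pointer is NULL or the version does not match. */
static int service_version_ok(const service *svc)
{
    return svc != NULL && svc->version == MYVERSION;
}
```

A caller would run service_version_ok() on whatever the module hands back and
only then touch the other fields; on a 0 it would do the moaning and kicking.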

It's a little hacky, but not too bad.

The biggest concern I have with this model is that a poorly behaved module is 
much more likely to corrupt the core process by doing something evil 
(unintentionally I'm sure) than in your other model. 

I'm sort of working on a similar problem and decided to create an API that
the "clients" link against to handle the nasty parts for them (so each "client"
would be an independent entity/process unless someone wanted to create a
"super server client" that loaded modules).  Each "client" would create a
shared mem queue (or a sysv queue - haven't fully decided which is less evil
for this app), hand a pointer (literally or not) to it off to the "server",
and register the events they want dumped into it.  In theory this would allow
me to have multiple pre-forked backend processes handling a single queue if
I'm doing something with the data that blocks a lot, without stacking up a lot
of processes or having to fork for each event (which can be really bad if you
have a lot of events).

This is (in essence) very similar to your socket model, with the main difference
being that I don't care what the clients do (queue's full? Tough, message dropped!
Although to be polite we might increment a counter they can query).  They are
completely decoupled from any of the server code (except their interaction via
the well-defined API).  This puts a little more burden on the module writers, but
cleans up the server's interactions (I don't have to worry about timing them out,
or other bad things they might do).
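
To make the "queue's full? Tough" policy concrete, here is a minimal sketch of
a fixed-size queue whose push never blocks and just counts what it drops.  All
the names and sizes (event_queue, queue_push, QUEUE_SLOTS, MSG_BYTES) are
invented for the example, and it is a plain in-process ring buffer with no
locking; a real shared mem or sysv version would need synchronization between
the server and the clients.

```c
#include <stddef.h>
#include <string.h>

#define QUEUE_SLOTS 4   /* hypothetical capacity */
#define MSG_BYTES   64  /* hypothetical fixed message size */

typedef struct {
    char          slots[QUEUE_SLOTS][MSG_BYTES];
    size_t        head;     /* index of the oldest queued message */
    size_t        count;    /* number of messages currently queued */
    unsigned long dropped;  /* messages discarded because the queue was full */
} event_queue;

static void queue_init(event_queue *q)
{
    memset(q, 0, sizeof *q);
}

/* Server side: never blocks.  Returns 1 if queued, 0 if dropped. */
static int queue_push(event_queue *q, const char *msg)
{
    if (q->count == QUEUE_SLOTS) {
        q->dropped++;  /* the polite part: clients can query this counter */
        return 0;
    }
    size_t tail = (q->head + q->count) % QUEUE_SLOTS;
    strncpy(q->slots[tail], msg, MSG_BYTES - 1);
    q->slots[tail][MSG_BYTES - 1] = '\0';  /* long messages are truncated */
    q->count++;
    return 1;
}

/* Client side: copies the oldest message into out (MSG_BYTES bytes).
 * Returns 1 on success, 0 if the queue is empty. */
static int queue_pop(event_queue *q, char *out)
{
    if (q->count == 0)
        return 0;
    memcpy(out, q->slots[q->head], MSG_BYTES);
    q->head = (q->head + 1) % QUEUE_SLOTS;
    q->count--;
    return 1;
}
```

The server only ever calls queue_push() and moves on, so a slow or wedged
client costs it nothing beyond a bumped counter.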

> -----Original Message-----
> From: Ethan Galstad [mailto:nagios at nagios.org]
> Sent: Thursday, July 31, 2003 10:02 PM
> To: nagios-users at lists.sourceforge.net;
> nagios-devel at lists.sourceforge.net
> Subject: [Nagios-users] Nagios 2.0 Event Broker and DB Support
> 
> 
> Sorry for the crosspost, but the nagios-devel list is usually pretty 
> quiet when I request comments about new features I'm implementing.  
> This one is bigger than most, so I wanted to reach more people.  This 
> is a bit long, so bear with me...
> 
> I am almost complete with coding for 2.0.  Two big things remain: the 
> event broker and DB support (which is currently broken).
> 
> My original intent was to develop the event broker as a separate 
> application, tying it to Nagios with a unix domain socket.  Nagios 
> would send the event broker information about everything that was 
> going on (service checks, downtime, flapping, log entries, etc.).  
> The event broker would be able to load user-developed modules (object 
> files) at runtime and pass various types of Nagios data to them for 
> processing.  This is all fine and good.  I have a working prototype 
> of the event broker that does just this and seems to work okay.  I 
> got to thinking that it was rather stupid to develop a separate 
> application for this when I could simply have Nagios load user-
> developed modules itself.  Doing this would give the modules the 
> benefit of having access to internal Nagios structures and functions 
> (which is good and bad - see below).
> 
> Here's an overview of how it would work:
> 
> - Nagios would load user-specified modules (object files) at startup 
> using the dlopen() function.
> 
> - Nagios would call the module's initialization function (the name of 
> which would be standardized).
> 
> - The module's init function would register for various types of 
> Nagios event data (service checks, host checks, log entries, event 
> handlers, etc.) using callback functions.
> 
> - When Nagios encounters an event for which a module has registered a 
> callback function, Nagios would call that module's function and pass 
> it data relevant to the event.  The module is then free to do 
> whatever it wants to that event data.  An example might be to log 
> service checks, performance data and log entries to MySQL, etc.
> 
> - Before shutting down, Nagios calls the module's de-init function.  
> This allows the module to clean up any resources it may be using.
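
The register-then-dispatch part of the list above might look roughly like the
sketch below.  Every name in it except mymod_handle_servicecheck is invented
for illustration (the event constants, broker_register, broker_dispatch,
mymod_init), the callback signature is simplified to a generic data pointer
instead of a service *, and the dlopen() loading and de-init steps are left
out.

```c
#include <stddef.h>

/* Hypothetical event types; these are not real Nagios identifiers. */
enum { EVT_SERVICE_CHECK, EVT_HOST_CHECK, EVT_LOG_ENTRY, EVT_TYPE_COUNT };

typedef int (*event_callback)(int type, const void *data);

#define MAX_CALLBACKS 8
static event_callback callbacks[EVT_TYPE_COUNT][MAX_CALLBACKS];

/* Called from a module's init function.
 * Returns 0 on success, -1 on bad arguments or a full table. */
static int broker_register(int type, event_callback cb)
{
    if (type < 0 || type >= EVT_TYPE_COUNT || cb == NULL)
        return -1;
    for (size_t i = 0; i < MAX_CALLBACKS; i++) {
        if (callbacks[type][i] == NULL) {
            callbacks[type][i] = cb;
            return 0;
        }
    }
    return -1;
}

/* Called by the core when an event occurs.
 * Returns the number of callbacks that were invoked. */
static int broker_dispatch(int type, const void *data)
{
    int called = 0;
    if (type < 0 || type >= EVT_TYPE_COUNT)
        return 0;
    for (size_t i = 0; i < MAX_CALLBACKS; i++) {
        if (callbacks[type][i] != NULL) {
            callbacks[type][i](type, data);
            called++;
        }
    }
    return called;
}

/* Example module: counts the service check events it is handed. */
static int mymod_seen = 0;

static int mymod_handle_servicecheck(int type, const void *data)
{
    (void)type;
    (void)data;
    mymod_seen++;
    return 0;
}

/* The standardized init entry point the core would call after loading. */
static int mymod_init(void)
{
    return broker_register(EVT_SERVICE_CHECK, mymod_handle_servicecheck);
}
```
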
> 
> 
> Seems good in theory.  Heck, might even work okay.  However, there's 
> a big problem I have with it.  If I implement things this way, the 
> user-developed modules would have access to internal Nagios data 
> structures and functions.  This is not necessarily bad, as ill-
> behaved modules would not be adopted by too many people. :-)  
> However, modules that might be compiled and working fine
> for Nagios 2.0 might segfault under future versions if the internal 
> data structures change.  Here's an example of what I mean:
> 
> User module registers for Nagios service check data using its 
> mymod_handle_servicecheck() function, which has a prototype of:
> 
> 	int mymod_handle_servicecheck(service *);
> 
> The service struct is an internal Nagios structure definition which 
> changes between Nagios versions.  If the user module is compiled for 
> use with Nagios 2.0 and its definition of the service struct, it 
> will have problems if it is not recompiled for future versions of 
> Nagios.
> 
> Off the top of my head, I could overcome this by requiring that the 
> user modules indicate (by calling a function) what version of Nagios 
> they are compiled for.  If they report anything but the current 
> version (or do not report at all), unload them so they can do no 
> harm.
> 
> I'm afraid I'm a bit over my head on how to handle this one.  Some of 
> you developers out there must have experience with this type of 
> thing.  If so, how did you handle it?  What would you recommend?  
> Comments, suggestions, flames?  Is there a better way to accomplish 
> this?  Speak up now.
> 
> What does this have to do with DB support, you ask?  Well, if I 
> implement the event broker as I have proposed I will be yanking 
> native DB support out of Nagios completely.  You can then write a 
> module to log to a DB if you want. :-)
> 
> PS: I had originally planned on exposing almost all of Nagios' data 
> and events to the broker, but I may have to scale that down if I plan 
> on getting 2.0 out this century.  Perhaps just support for:
> 
> 	- Service and host checks
> 	- Event handlers
> 	- Log data
> 
> This would allow the development of modules to log check information, 
> performance data, and log file data to a DB (or whatever).
> 
> 
> Ethan Galstad,
> Nagios Developer
> ---
> Email: nagios at nagios.org
> Website: http://www.nagios.org
> 
> 
> 
> -------------------------------------------------------
> This SF.Net email sponsored by: Free pre-built ASP.NET sites including
> Data Reports, E-commerce, Portals, and Forums are available now.
> Download today and enter to win an XBOX or Visual Studio .NET.
> http://aspnet.click-url.com/go/psa00100003ave/direct;at.aspnet
> _072303_01/01
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS 
> when reporting any issue. 
> ::: Messages without supporting info will risk being sent to /dev/null
> 

