Re: Nagios 2.0 Event Broker and DB Support

Creator creator at mindcreations.com
Tue Aug 5 02:08:54 CEST 2003


Hi Ethan,

I've read all the replies posted so far to your request for comments, and
I found them interesting.
Although they help a lot in reaching a final decision, I would like to
add a consideration about the "obsolescence" factor.

My contribution is an idea to keep modules from becoming obsolete as new
versions of Nagios are released.

I think versioning is the only straightforward way to check for module
compatibility... but is it really necessary to discard a module if you,
for example, add new data without changing the meaning of the existing
data?

Take this example:

The service structure of v1.1 is N bytes long with M variables inside.
In version v1.2 you add a variable (M+1) of 4 bytes (N+4) to the structure,
appending it after the last variable in the service structure. An everyday
scenario for a project like Nagios, isn't it? :) Well, going by what the
others are saying, you now have to update the "structure" version, or the
"API" version, or simply the Nagios version, to perform the module
compatibility check.
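
As a rough sketch of this scenario in C (the field names are invented for
illustration and are not the real Nagios ones): because the new variable is
appended, every old member keeps its offset, so code that only knows the v1.1
layout can still read the leading part of a v1.2 structure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical v1.1 layout with M = 2 variables. */
struct service_v11 {
    int  current_state;
    long last_check;
};

/* Hypothetical v1.2 layout: same members, one variable appended. */
struct service_v12 {
    int  current_state;
    long last_check;
    int  created_by;   /* NEW in v1.2 */
};

/* Returns 1 when every v1.1 member sits at the same offset in v1.2. */
int layouts_are_prefix_compatible(void)
{
    return offsetof(struct service_v11, current_state) ==
               offsetof(struct service_v12, current_state)
        && offsetof(struct service_v11, last_check) ==
               offsetof(struct service_v12, last_check)
        && sizeof(struct service_v12) >= sizeof(struct service_v11);
}

/* Reads a service using only the v1.1 definition, as an old module would.
   Copying the leading bytes keeps this sketch free of aliasing concerns. */
long old_module_last_check(const void *svc)
{
    struct service_v11 old;
    memcpy(&old, svc, sizeof old);
    return old.last_check;
}
```

Passing the address of a struct service_v12 to old_module_last_check()
returns the last_check value stored in it. This only holds while new
variables are appended; inserting or reordering members breaks it.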

You have two paths to follow when dealing with an old module:

1) Deny it access to the new Nagios structures
2) Try to accommodate the "older" module so that it feels comfortable using
the new Nagios version.

Point 2 is not always easy or effortless to implement, but it can save the
Nagios community (mainly production sites) a lot of recompiling/upgrading
issues. The impact of an upgrade on modules would be reduced, and system
administrators would get some time to obtain the updated modules (which may
not be available at that moment). The urgency to have the latest version
installed may be very high (bug fixes, security holes, list your own
motive, etc).

Point 1 may be the only answer in case of major structure changes... but
that is another story :(

However, let's get back to the implementation details:

I suggest a compatibility mode API that remaps API calls to old ones and
casts new structures to old ones.


Example
-------

struct service **services; /* Global and already populated */

struct service_v11     /* old 1.1 structure */
{
 ... PREVIOUS VERSION VARS ...
};

struct service     /* new 1.2 structure */
{
 ... PREVIOUS VERSION VARS ...
 long created_by; /* NEW: Pointer to the contact who has added this service
                     to the host */
};

/* API function example to retrieve service number N.
   It returns void * because the type the caller sees depends on the
   version the module was compiled for. */

void *getService(unsigned long number)
{
 if(MODULE_VER == 11)
   /* Old module: cast the new structure to the old one */
   return (struct service_v11 *) services[number];
 else
   /* Here you can put more checks for older versions too, just
      decide the backward compatibility level you want to achieve */
   return services[number];     /* The module is aligned with our version,
                                   return the actual structure */
}

I'm sure this example is too simplistic to cover all cases, but it is only
meant to explain the bare concept.

In addition to this, if the module has to write the structure back to Nagios
memory, I suggest using an "Update" callback function (i.e. putService(); )
rather than letting the user write the updated data directly to Nagios
memory, possibly messing things up.
There you can do sanity checks and, using the above technique, even remap
old structures, with default values for the variables that the older
structures do not handle.
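
A minimal sketch of such an update function, under stated assumptions: the
field names, the table size, the 0..3 state range and the default value for
created_by are all hypothetical, and the module version is passed in
explicitly only to keep the example self-contained.

```c
#include <stddef.h>

#define MAX_SERVICES       4
#define DEFAULT_CREATED_BY 0L   /* assumed default for fields old modules lack */

struct service_v11 { int current_state; long last_check; };
struct service     { int current_state; long last_check; long created_by; };

/* Stand-in for the service data held in Nagios memory. */
struct service service_table[MAX_SERVICES];

/* Update callback: returns 0 on success, -1 if the data is rejected. */
int putService(unsigned long number, const void *data, int module_ver)
{
    struct service updated;

    if (data == NULL || number >= MAX_SERVICES)
        return -1;

    if (module_ver == 11) {
        /* Remap the old structure and fill the new variable with a default. */
        const struct service_v11 *old = data;
        updated.current_state = old->current_state;
        updated.last_check    = old->last_check;
        updated.created_by    = DEFAULT_CREATED_BY;
    } else if (module_ver == 12) {
        updated = *(const struct service *)data;
    } else {
        return -1;              /* unknown module version */
    }

    /* Sanity check before anything reaches the real table. */
    if (updated.current_state < 0 || updated.current_state > 3)
        return -1;

    service_table[number] = updated;
    return 0;
}
```

An old module calling putService(0, &its_v11_struct, 11) gets its two
variables stored and created_by set to the default, while out-of-range data
is refused without touching the table.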

This reminds me of the old problem of copying data to clients versus giving
them direct access.
It is an implementation choice that, as always, has pros and cons. If you
provide an access function to the data, you have to return copied data to
the modules and have them put it back with another access function. This is
slow but more robust. If you provide direct access to the data structures it
will be fast, but a minimal error can make Nagios crash in a very bad way.
Of course the majority of modules will only read data, but who knows... :)
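
The two access styles can be contrasted in a short sketch; the structure,
the table and both function names are made up for the example and are not
part of Nagios.

```c
#include <stddef.h>

#define MAX_SERVICES 4

struct service { int current_state; long last_check; };

/* Stand-in for the service data held in Nagios memory. */
struct service service_table[MAX_SERVICES];

/* Direct access: fast, but the module writes straight into Nagios memory. */
struct service *getServiceDirect(unsigned long number)
{
    return number < MAX_SERVICES ? &service_table[number] : NULL;
}

/* Copy access: slower, but the module only ever touches its own copy. */
int getServiceCopy(unsigned long number, struct service *out)
{
    if (out == NULL || number >= MAX_SERVICES)
        return -1;
    *out = service_table[number];
    return 0;
}
```

A module that changes the copy obtained from getServiceCopy() leaves the
table untouched, while any write through the pointer from getServiceDirect()
lands in the table immediately.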

Anyway, the choice of the "registration" method is up to you (i.e. what
the module has to do to inform Nagios about the version it handles): you
already have many suggestions for that on the list :)

Calling a function whose behaviour has changed is a different story.
To handle this, one idea would be to bind the callback functions to
backward-compatible ones during the module registration process.

Declare an array of pointers to generic callback functions and assign the
current callback functions to them. Then, on module registration, overwrite
each pointer with the function that is correct for the version of the module
being registered.
This way you can even remap just a single function if you have made a minor
change.
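
One possible shape for this, as a sketch only: all names are hypothetical,
and the "changed behaviour" (v1.1 modules never seeing a state above 2) is
invented to have something to remap. The table is kept per module here,
which is an assumption on top of the description above, so that modules of
different versions can be loaded side by side.

```c
#include <stddef.h>

enum { CB_GET_STATE, CB_COUNT };

typedef int (*callback_fn)(int raw_state);

/* Current (v1.2) behaviour: report the raw state as is. */
static int get_state_v12(int raw_state) { return raw_state; }

/* Backward (v1.1) behaviour: states above 2 are reported as 2. */
static int get_state_v11(int raw_state) { return raw_state > 2 ? 2 : raw_state; }

struct module {
    int         version;
    callback_fn api[CB_COUNT];   /* the functions this module will call */
};

/* Table of current functions, used as the starting point for every module. */
static const callback_fn current_api[CB_COUNT] = { get_state_v12 };

void register_module(struct module *mod, int version)
{
    int i;

    if (mod == NULL)
        return;

    mod->version = version;
    for (i = 0; i < CB_COUNT; i++)
        mod->api[i] = current_api[i];

    /* Only the function whose behaviour changed needs to be remapped. */
    if (version == 11)
        mod->api[CB_GET_STATE] = get_state_v11;
}
```

After registration a v1.1 module calling mod.api[CB_GET_STATE](3) goes
through the backward function and receives 2, while a v1.2 module receives 3.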

Furthermore, if you intend to keep backward compatibility even when you
add or delete some arguments of callback functions, you can use the
variable argument list mechanism of C in place of the classic fixed
prototype declaration... but I admit this is a little tricky to design and
to use :)

I know, I'm a fanatic of void *void_func( void *, ... ) functions :]
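
A small sketch of the variable-argument idea, assuming a hypothetical
callback whose first parameter says how many int arguments follow; the
function name and the meaning of the arguments are invented.

```c
#include <stdarg.h>

/* Hypothetical callback: argc tells how many int arguments were passed.
   Older callers pass only the state, newer ones also pass the attempt. */
int handle_check(int argc, ...)
{
    va_list ap;
    int state = 0;     /* default when the caller passes nothing */
    int attempt = 1;   /* default for callers that predate this argument */

    va_start(ap, argc);
    if (argc >= 1)
        state = va_arg(ap, int);
    if (argc >= 2)
        attempt = va_arg(ap, int);
    va_end(ap);

    /* Combine both values so the effect of the defaults is visible. */
    return state * 10 + attempt;
}
```

Here handle_check(1, 2) and handle_check(2, 2, 3) are both valid calls to
the same function. The tricky part is that the compiler cannot check the
count or the types, so caller and callee have to agree on them by convention.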

Hope you find this helpful, or at least inspiring.

Bye,

------------

Stefano Coletta
http://www.mindcreations.com 

> -----Original Message-----
> From: nagios-devel-admin at lists.sourceforge.net 
> [mailto:nagios-devel-admin at lists.sourceforge.net] On behalf of 
> Ethan Galstad
> Sent: Friday, August 1, 2003 7:02
> To: nagios-users at lists.sourceforge.net; 
> nagios-devel at lists.sourceforge.net
> Subject: [Nagios-devel] Nagios 2.0 Event Broker and DB Support
> 
> 
> Sorry for the crosspost, but the nagios-devel list is usually pretty 
> quiet when I request comments about new features I'm implementing.  
> This one is bigger than most, so I wanted to reach more people.  This 
> is a bit long, so bear with me...
> 
> I am almost complete with coding for 2.0.  Two big things remain: the 
> event broker and DB support (which is currently broken).
> 
> My original intent was to develop the event broker as a separate 
> application, tying it to Nagios with a unix domain socket.  Nagios 
> would send the event broker information about everything that was 
> going on (service checks, downtime, flapping, log entries, etc.).  
> The event broker would be able to load user-developed modules (object 
> files) at runtime and pass various types of Nagios data to them for 
> processing.  This is all fine and good.  I have a working prototype 
> of the event broker that does just this and seems to work okay.  I 
> got to thinking that it was rather stupid to develop a separate 
> application for this when I could simply have Nagios load 
> user-developed modules itself.  Doing this would give the modules the 
> benefit of having access to internal Nagios structures and functions 
> (which is good and bad - see below).
> 
> Here's an overview of how it would work:
> 
> - Nagios would load user-specified modules (object files) at startup 
> using the dlopen() function.
> 
> - Nagios would call the module's initialization function (the name of 
> which would be standardized).
> 
> - The module's init function would register for various types of 
> Nagios event data (service checks, host checks, log entries, event 
> handlers, etc.) using callback functions.
> 
> - When Nagios encounters an event for which a module has registered a 
> callback function, Nagios would call that module's function and pass 
> it data relevant to the event.  The module is then free to do 
> whatever it wants to that event data.  An example might be to log 
> service checks, performance data and log entries to MySQL, etc.
> 
> - Before shutting down, Nagios calls the module's de-init function.  
> This allows the module to clean up any resources it may be using.
> 
> 
> Seems good in theory.  Heck, might even work okay.  However, there's 
> a big problem I have with it.  If I implement things this way, the 
> user-developed modules would have access to internal Nagios data 
> structures and functions.  This is not necessarily bad, as 
> ill-behaved modules would not be adopted by too many people. :-)  
> However, modules that might be compiled and working fine 
> for Nagios 2.0 might segfault under future versions if the internal 
> data structures change.  Here's an example of what I mean:
> 
> User module registers for Nagios service check data using its 
> mymod_handle_servicecheck() function, which has a prototype of:
> 
> 	int mymod_handle_servicecheck(service *);
> 
> The service struct is an internal Nagios structure definition which 
> changes between Nagios versions.  If the user module is compiled for 
> use with Nagios 2.0 and its definition of the service struct, it 
> will have problems if it is not recompiled for future versions of 
> Nagios.
> 
> Off the top of my head, I could overcome this by requiring that the 
> user modules indicate (by calling a function) what version of Nagios 
> they are compiled for.  If they report anything but the current 
> version (or do not report at all), unload them so they can do no 
> harm.
> 
> I'm afraid I'm a bit over my head on how to handle this one.  Some of 
> you developers out there must have experience with this type of 
> thing.  If so, how did you handle it?  What would you recommend?  
> Comments, suggestions, flames?  Is there a better way to accomplish 
> this?  Speak up now.
> 
> What does this have to do with DB support, you ask?  Well, if I 
> implement the event broker as I have proposed I will be yanking 
> native DB support out of Nagios completely.  You can then write a 
> module to log to a DB if you want. :-)
> 
> PS: I had originally planned on exposing almost all of Nagios' data 
> and events to the broker, but I may have to scale that down if I plan 
> on getting 2.0 out this century.  Perhaps just support for:
> 
> 	- Service and host checks
> 	- Event handlers
> 	- Log data
> 
> This would allow the development of modules to log check information, 
> performance data, and log file data to a DB (or whatever).
> 
> 
> Ethan Galstad,
> Nagios Developer
> ---
> Email: nagios at nagios.org
> Website: http://www.nagios.org
> 
> 
> 
> -------------------------------------------------------
> This SF.Net email sponsored by: Free pre-built ASP.NET sites 
> including Data Reports, E-commerce, Portals, and Forums are 
> available now. Download today and enter to win an XBOX or 
> Visual Studio .NET. 
> http://aspnet.click-url.com/go/psa00100003ave/direct;at.aspnet_072303_01/01
_______________________________________________
Nagios-devel mailing list
Nagios-devel at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-devel