2.0 CVS patch for Aggressive Stale Check directive (and the option to disable it)

Mueller, Karl KMueller at netsuite.com
Sun May 16 20:17:32 CEST 2004


After running this patch for a while, I've found a few issues that I need to resolve, so please don't add it to CVS yet, Ethan.  I'll send an update once those issues are resolved.

Karl


-----Original Message-----
From: nagios-devel-admin at lists.sourceforge.net
[mailto:nagios-devel-admin at lists.sourceforge.net]On Behalf Of Mueller,
Karl
Sent: Thursday, May 13, 2004 2:53 PM
To: Nagios Developers
Subject: [Nagios-devel] 2.0 CVS patch for Aggressive Stale Check
directive (and the option to disable it)


The patch is located here:

http://www.xney.com/nagios/nagios-2.0-aggressive_stale_checks.patch


We run a distributed Nagios environment with 3 slave servers reporting
to 1 master server.  It works great, except in two scenarios:

1) The slave server has a service go down with dependent services
attached to it.  The slave server (correctly) stops running service
checks for the dependent services.  The master server stops getting
passive check results.  The service on the master eventually goes stale,
forcing an active check.

2) The slave server has a service (or host) taken down administratively
(for example, a reboot).  The slave stops submitting service check
information to the master.  The master executes forced active checks
when the service information goes stale.

In both situations, there are three somewhat serious consequences:

a) Checks are performed when they don't need to be, consuming resources
and clogging up status information.  (This is especially true if you
have a service that many other services depend on, which we do.)

b) Notifications are sent out when they are not needed, because the
service is marked down or has failed dependencies.

c) Handlers are executed on services that shouldn't have them running
(due to a dependency failure or an administrative down situation).


(Note: All these issues potentially affect host checks as well)


My patch creates a new service and host configuration directive,
"aggressive_stale_checks", which is enabled by default.  When it is
turned OFF, Nagios checks a few things before executing a forced
service or host check, just as it does before a normal host or service
check.  If, for example, an execution dependency fails while aggressive
stale checks are turned off, the forced check will not be performed.

This would result in stale information, but that's OK, if it's what is
intended.  The same thing, after all, happens with active checks.
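
The decision described above can be sketched roughly as follows.  This is
only an illustration of the idea, not the code from the patch: the struct,
its field names, and the function name are all made up for the example, and
the "checks_enabled" test is an assumed stand-in for the other things a
normal check looks at (the only test named above is the execution
dependency).

```c
#include <stdbool.h>

/* Hypothetical, simplified view of a service; NOT Nagios's real struct. */
typedef struct {
    bool is_stale;                /* freshness threshold has been exceeded */
    bool aggressive_stale_checks; /* the proposed directive (on by default) */
    bool execution_deps_ok;       /* execution dependencies are satisfied */
    bool checks_enabled;          /* assumed: checks not disabled by an admin */
} service_sketch;

/* Returns true if a forced check should be run for this service. */
bool should_force_stale_check(const service_sketch *svc)
{
    if (!svc->is_stale)
        return false;             /* results are fresh, nothing to force */

    if (svc->aggressive_stale_checks)
        return true;              /* old behaviour: always force the check */

    /* Directive off: apply the same tests a normal check would. */
    return svc->execution_deps_ok && svc->checks_enabled;
}
```

With the directive on, a stale service is always checked, as before.  With
it off, a stale service whose execution dependency has failed is left
stale, which is the behaviour wanted on the master in scenarios 1 and 2.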

To preserve compatibility with the previous way freshness and stale
checks worked, the aggressive_stale_checks option is, by default, on.  

Let me know if there are any questions, comments, or concerns.

The patch applies to my locally modified CVS source, which has the
service dependencies patch/fix I submitted a few weeks ago as well.  I
just tried applying it against the May 13th CVS snapshot and it applied
cleanly, though.

Karl





-------------------------------------------------------
This SF.Net email is sponsored by: SourceForge.net Broadband
Sign-up now for SourceForge Broadband and get the fastest
6.0/768 connection for only $19.95/mo for the first 3 months!
http://ads.osdn.com/?ad_id%62&alloc_ida84&op=ick
_______________________________________________
Nagios-devel mailing list
Nagios-devel at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-devel

