Can passive results cause active check rescheduling?

Jim Avery jim at jimavery.me.uk
Tue Jan 24 18:25:20 CET 2012


On 24 January 2012 15:07, Craig Stewart <Craig.Stewart at corp.xplornet.com> wrote:
> Good day!
>
> Here is a question.  Can the processing of a passive service check
> schedule a pending active service check?
>
> Here's a simple example from a distributed monitoring setup.
>
> Let's say we have a service called check_alive that simply pings a host.
> It's defined on the probe and the central server.  On the probe, it's
> scheduled to check every 60 seconds, and submit the results to the
> central server.  On the central server, it's scheduled to check every
> 120 seconds.  Is there a way to cause the following behaviour to occur?
>
> - Probe checks and submits the results to central.  It reschedules the
> check for 60 seconds from now.
>
> - Central receives and processes the results (assume an okay state here)
> and reschedules its active check, which was due to run shortly, for
> 120 seconds from now, without running the active check.
>
> - 60 seconds later the probe checks and submits its result (still in an
> okay state) to the central server.
>
> - The central server receives and processes the check result, and
> reschedules (postpones) its active service check for 120 seconds from
> now, without actually running it.
>
> - Repeat until a non-okay result comes in.
>
>
>
> This way if the probe goes silent, the central server will pick up
> monitoring nearly seamlessly. I know I can get the results I want using
> the check freshness option.
>
> What is happening now, if I enable active service checks on the central
> server, is that the passive result comes in and is processed, and the
> active service check still runs at exactly its scheduled time.  The
> central server's check time is never postponed.
>
> There are a couple of problems.
>
> 1) My central server also does active service checking for some devices
> that are not associated with a probe.  I can build these to have a
> separate template, but I'm already getting into template complexity.  I
> don't want to have to have two templates, one for local and one for
> remote monitoring of each device type.
>
> 2) I have an individual that doesn't like to see red on the Nagios
> interface.  They are sufficiently far up the food chain that all I can
> really do is say "Yes sir, yes sir, three bags full sir" and hope they
> forget about it.  All the 10000 plus services show up in the Nagios
> interface as disabled and have a red background.  Please assume any
> valid arguments on my part have been made.
>
> None of my reading has suggested that this behaviour is possible, but I
> thought I'd put it out there and ask.


My understanding is that when Nagios schedules an active check, the check
goes into the scheduling queue with a specific run time, and it will run
at that time regardless of what passive results come in in the meantime.
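
To make the difference concrete, here is a toy model in Python. It is not
Nagios code: the function name, its parameters and the one-second tick are
all assumptions made for illustration. It compares the fixed schedule
described above with the postponing behaviour asked about, using the 60
second probe and 120 second central intervals from the question and a probe
that goes silent after 180 seconds.

```python
def active_run_times(passive_times, interval, horizon, postpone_on_passive):
    """Return the seconds at which the central active check would run.

    Hypothetical toy model: time advances in 1-second ticks from 0 to
    `horizon`, and the first active check is due one `interval` after 0.
    """
    passive = set(passive_times)
    next_run = interval
    runs = []
    for now in range(horizon + 1):
        if postpone_on_passive and now in passive:
            # Requested behaviour: a passive result pushes the pending
            # active check out by a full interval without running it.
            next_run = now + interval
        elif now >= next_run:
            # The active check runs and is rescheduled one interval later.
            runs.append(now)
            next_run = now + interval
    return runs


# Probe submits at 60, 120 and 180 seconds, then goes silent.
probe_results = [60, 120, 180]

# Fixed schedule: passive results do not move the active check.
print(active_run_times(probe_results, 120, 420, postpone_on_passive=False))
# -> [120, 240, 360]

# Postponing schedule: central only starts checking once the probe is silent.
print(active_run_times(probe_results, 120, 420, postpone_on_passive=True))
# -> [300, 420]
```

In this model the postponing variant runs no active checks while the probe
is reporting, and takes over 120 seconds after the last passive result.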

I don't understand why it's a problem for you to have the active checks
run at their regular times under the control of the Nagios scheduler,
with passive results also coming in whenever they arrive.  Granted, the
two checks will fairly often run at almost the same time, but does that
cause you a problem?

You're right that you could use freshness checking.  Is there a particular
reason why you chose not to?
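
For comparison, the freshness approach can be sketched the same way. Again
this is a simplified Python model, not Nagios itself: the function name and
the numbers are assumptions, and it tests staleness every second, whereas
Nagios only looks for stale results periodically. The idea it illustrates
is that with check_freshness enabled, a result older than
freshness_threshold causes the service's check_command to be run as a
forced active check.

```python
def forced_check_times(passive_times, freshness_threshold, horizon):
    """Return the seconds at which a freshness-forced check would run.

    Hypothetical toy model: a result is stale once it is more than
    `freshness_threshold` seconds old, and a forced check counts as a
    new result.
    """
    passive = set(passive_times)
    last_result = 0
    forced = []
    for now in range(horizon + 1):
        if now in passive:
            # A passive result from the probe refreshes the service.
            last_result = now
        elif now - last_result > freshness_threshold:
            # Result is stale: central runs the check itself.
            forced.append(now)
            last_result = now
    return forced


# Probe submits at 60, 120 and 180 seconds, then goes silent.
print(forced_check_times([60, 120, 180], 120, 420))
# -> [301]
```

As long as the probe keeps reporting inside the threshold, no check runs on
the central server at all, which is the postponing effect described in the
question.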
