Hierarchical host schedule queuing

Marc Powell marc at ena.com
Fri Mar 11 04:38:20 CET 2005



> -----Original Message-----
> From: nagios-users-admin at lists.sourceforge.net [mailto:nagios-users-
> admin at lists.sourceforge.net] On Behalf Of Shawn Iverson
> Sent: Thursday, March 10, 2005 6:39 PM
> To: nagios-users at lists.sourceforge.net
> Subject: [Nagios-users] Hierarchical host schedule queuing
> 
> Greetings!
> 
> While simulating a network failure to test my nagios setup, I noticed
> that nagios (using version 1.2) does not hierarchically proceed to check
> upstream hosts after it concludes that a host is down hard.
> 

[snip snip]
 
 
> What would perhaps be more efficient in terms of outage discovery
> would happen as follows.  Server A is discovered to be down, but nagios
> withholds sending an alert for the moment.  It halts its normal
> scheduling queue and begins a temporary hierarchical scheduling queue,
> scheduling the hosts between nagios and the suspect server, starting
> with the closest one and ending with the farthest one and taking into
> account redundant links.  It then processes this queue, discovers that
> Router A is the actual problem, sees that no other path exists to Server
> A, sends an alert for Router A, and revises the alert for Server A as
> unreachable.  It then revises its normal queue to exclude the hosts just
> checked hierarchically and proceeds normally.  Anything else found
> behind Router A is henceforth correctly marked as unreachable.  In fact,
> everything can logically be determined to be unreachable behind Router A
> after such a test and can then be updated instantly.

[snip snip]

> If a newer version of nagios already supports this, then great!  If not,
> perhaps I can assist in creating the necessary code to make this extra
> logic possible.  This program is too good not to have such a feature.

It sounds to me like you're describing a feature that has been in
Nagios, and NetSaint before it, for years. If your idea is somehow
different, can you clarify?

http://nagios.sourceforge.net/docs/1_0/networkreachability.html

"Monitoring Remote Hosts 

Checking the status of remote hosts is a bit more complicated than for
local hosts. If Nagios cannot monitor services on a remote host, it
needs to determine whether the remote host is down or whether it is
unreachable. Luckily, the <parent_hosts> option allows Nagios to do
this. 

If a host check command for a remote host returns a non-OK state, Nagios
will "walk" the dependency tree (as shown in the figure above) until it
reaches the top (or until a parent host check results in an OK state).
By doing this, Nagios is able to determine if a service problem is the
result of a down host, a down network link, or just a plain old service
failure.

DOWN vs. UNREACHABLE Notification Types 

I get lots of email from people asking why Nagios is sending
notifications out about hosts that are unreachable. The answer is
because you configured it to do that. If you want to disable UNREACHABLE
notifications for hosts, modify the notification_options argument of
your host definitions to not include the u (unreachable) option. More
information can be found in this FAQ."
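
To make the tree walk in the quoted documentation concrete, here is a
rough sketch in Python. It is not Nagios source code and only
approximates the behaviour described above. The host names, the PARENTS
and ALIVE tables, and the check_host and host_state functions are all
hypothetical stand-ins: PARENTS plays the role of the <parent_hosts>
option, and check_host stands in for a real host check command.

```python
# Illustrative sketch only: not Nagios source code.
# PARENTS stands in for the <parent_hosts> option of each host definition.
# ALIVE stands in for whether each (hypothetical) device is actually working.

PARENTS = {
    "router_a": [],            # attached directly to the monitoring host
    "switch_a": ["router_a"],
    "server_a": ["switch_a"],
}

ALIVE = {"router_a": False, "switch_a": True, "server_a": True}


def check_host(name):
    """Stand-in for a host check: OK only if the host is alive and
    at least one full path of live parents leads to it."""
    if not ALIVE[name]:
        return False
    parents = PARENTS[name]
    return not parents or any(check_host(p) for p in parents)


def host_state(name):
    """Classify a host as UP, DOWN or UNREACHABLE by walking up its parents."""
    if check_host(name):
        return "UP"
    parents = PARENTS[name]
    if not parents:
        # Top of the tree: nothing upstream can be blamed.
        return "DOWN"
    # The walk stops climbing as soon as some parent checks OK.
    if any(host_state(p) == "UP" for p in parents):
        return "DOWN"
    # Every parent is DOWN or UNREACHABLE, so the fault lies upstream.
    return "UNREACHABLE"


if __name__ == "__main__":
    for host in ("router_a", "switch_a", "server_a"):
        print(host, host_state(host))
```

With Router A marked dead in ALIVE, the sketch reports router_a as DOWN
and everything behind it as UNREACHABLE, which is the distinction the
notification_options setting acts on.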

--
Marc

