Notifications or host checks stopped working

Andreas Ericsson ae at op5.se
Tue Oct 18 20:30:15 CEST 2005


Andrew Laden wrote:
> I just recently upgraded to 2.0b4.

From which version?

> Notifications were working ok when I
> first upgraded. 
> 

Not from 1.x then, since the macros have changed between the versions.

> Our company is having a DR test. So we shut down the routers connecting one
> of our sites.
> 
> The GUI looks mostly correct. The two routers are listed in Network outages,
> and it seems that the hosts that are children of those routers are all being
> marked as unreachable instead of down.
> 
> But I am seeing some oddities. It looks like host checks are no longer being
> scheduled at all. I have host escalations in place, and there are no
> notifications going out on the two down routers. Current Notification Number
> isn't increasing. They are in a Down Hard state, but current attempt is stuck
> at a 1/5 count. 
> 

Are they behind the outage, or are they the ones causing the outage?

> 
> So, questions
> Is there a way to tell if host checks are being run?

Yes. Look at the status data age on the host detail view; if it keeps 
growing, the host is not being checked.
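
If you would rather check this from a script than from the GUI, the same 
idea can be sketched in Python. This is only an illustration: the block 
layout ("host { ... }") and the field names host_name and last_check are 
assumptions about what the status file looks like, and the sample text is 
made up, so compare it against your own status file before relying on it.

```python
# Assumed status-file layout; block and field names are hypothetical.
SAMPLE = """\
host {
    host_name=router1
    last_check=1129660000
    }
host {
    host_name=router2
    last_check=1129659000
    }
"""


def host_check_ages(status_text, now):
    """Return {host_name: seconds since last_check} for every host block."""
    ages = {}
    fields = None  # None while we are outside a host block
    for raw in status_text.splitlines():
        line = raw.strip()
        if "".join(line.split()) == "host{":
            fields = {}  # a new host block starts here
        elif line == "}" and fields is not None:
            # End of the block: record the age if both fields were present.
            if "host_name" in fields and "last_check" in fields:
                ages[fields["host_name"]] = now - int(fields["last_check"])
            fields = None
        elif fields is not None and "=" in line:
            key, value = line.split("=", 1)
            fields[key] = value
    return ages


# Fixed "now" so the example is reproducible; use int(time.time()) for real.
for name, age in sorted(host_check_ages(SAMPLE, now=1129660300).items()):
    print(f"{name}: last checked {age} seconds ago")
```

An age that keeps growing across several check intervals would suggest the 
host check is not being run for that host.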

> They aren't in the
> scheduled queue. I set one of the down routers to up using a passive check.
> And it looks like even when the service for it went down, the host check
> never ran. Though when I forced the check, it ran ok.
> 

This is weird. I expect you've double-checked check_period for the host 
definitions?
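
A quick way to double-check is to list the check_period of every host 
definition. The following is a rough Python sketch, not a full parser: it 
reads "define host{ ... }" blocks from a string, ignores template 
inheritance via "use", and the sample hosts and the function name are made 
up for the example.

```python
import re

# Made-up object definitions used only to exercise the function below.
SAMPLE = """\
define host{
    host_name       router1
    check_period    24x7
    }

define host{
    host_name       router2
    ; no check_period here, so it would have to come from a template
    }
"""


def host_check_periods(config_text):
    """Return {host_name: check_period or None} for each 'define host' block.

    Inheritance through 'use' is NOT resolved, so None only means the
    directive is missing from this block itself.
    """
    periods = {}
    for body in re.findall(r"define\s+host\s*\{(.*?)\}", config_text, re.S):
        fields = {}
        for raw in body.splitlines():
            line = raw.split(";", 1)[0].strip()  # drop ';' comments
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)  # directive name, then its value
            if len(parts) == 2:
                fields[parts[0]] = parts[1].strip()
        name = fields.get("host_name") or fields.get("name")
        if name:
            periods[name] = fields.get("check_period")
    return periods


for name, period in sorted(host_check_periods(SAMPLE).items()):
    print(name, period if period else "(not set in this block)")
```

Hosts reported as not set are the ones to look at first, together with 
whatever template they inherit from.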

> I had a host that was in an unreachable state. I ran a service check for
> that host that succeeded. The host went into a down state. But again, no
> further host checks seem to have been run. And no notifications have been
> sent out.
> 
> Any ideas where I can look for problems?
> 

You could try recompiling Nagios with debug output enabled (run 
./configure --help to see which debug options are available) and then 
reproduce the scenario while running Nagios in the foreground. This will 
produce quite a bit of output, so you'll likely want to pipe it through 
tee to keep a copy for later perusal as well.

Please don't post the debug output to the list though. If you need help 
interpreting it, you can put it on a web page somewhere and then submit a 
link. SourceForge is quite busy enough without hauling 5 MB files to 6000 
subscribers.

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

