Service checks not being executed

Darren Gamble Darren.Gamble at sjrb.ca
Tue Sep 17 19:50:15 CEST 2002


Good day,

> I was one of the other users that also experienced this behavior.  I
> still experience it on the 1.0b5 (on RH 7.3).   I now have Nagios set to
> perform host checks for 5 minutes, although I experienced this issue
> equally so when it was set to the default parameters.
> 
> How did you determine that it was hung processing just the host checks
> on the handful of devices?  In other words, what commands did you run to
> capture that information?  Thanks.
> 
> Nolan

Well, I was just tailing nagios.log while all of this was happening to
watch for soft alerts, as I was expecting dozens of services to flare up.  I
noted that no service failures were logged while Nagios was doing its host
check of one of the machines that was no longer reachable.  When Nagios
had finished its host checking (emphasis on host checking, not service
checking) and marked the host as down, it would resume logging failed
services until it ran into a service on another host that was also
"down"; due to scheduling order, these tended to be near each other.
Basically, most of my service checks didn't get run during all of this.  I
should note that only 7 hosts out of 203 were actually unreachable, but the
other hosts' services didn't get checked.
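
For anyone who wants to look for the same pattern without watching the
log scroll by, here is a rough sketch of the idea in Python.  It assumes
log lines of the form "[epoch] SERVICE ALERT: host;service;..." and
"[epoch] HOST ALERT: host;...", which may not match every version.  The
sample lines, the host names (web1, db1, router7) and the 60-second
threshold are made up for illustration.  It only shows stretches where
no service alerts were logged and which host alerts fell inside them; it
cannot prove by itself that checks were not being run.

```python
import re

# Assumed line shape: "[epoch] SERVICE ALERT: ..." or "[epoch] HOST ALERT: ..."
LINE_RE = re.compile(r"^\[(\d+)\] (SERVICE ALERT|HOST ALERT): (.*)$")

# Made-up sample lines for illustration only, not real log output.
SAMPLE_LOG = [
    "[1032285000] SERVICE ALERT: web1;HTTP;CRITICAL;SOFT;1;Connection refused",
    "[1032285010] SERVICE ALERT: router7;PING;CRITICAL;SOFT;1;Plugin timed out",
    "[1032285100] HOST ALERT: router7;DOWN;HARD;1;Plugin timed out",
    "[1032285130] SERVICE ALERT: db1;SSH;CRITICAL;SOFT;1;Connection refused",
]


def parse_alerts(lines):
    """Return (timestamp, kind, host) tuples for host and service alert lines."""
    alerts = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            timestamp, kind, rest = match.groups()
            host = rest.split(";", 1)[0]  # first field is the host name
            alerts.append((int(timestamp), kind, host))
    return alerts


def service_gaps(alerts, min_gap=60):
    """Find stretches of at least min_gap seconds between consecutive
    service alerts, and list the hosts with host alerts in each stretch."""
    service_times = [ts for ts, kind, _ in alerts if kind == "SERVICE ALERT"]
    gaps = []
    for start, end in zip(service_times, service_times[1:]):
        if end - start >= min_gap:
            hosts = sorted({host for ts, kind, host in alerts
                            if kind == "HOST ALERT" and start <= ts <= end})
            gaps.append((start, end, hosts))
    return gaps


if __name__ == "__main__":
    for start, end, hosts in service_gaps(parse_alerts(SAMPLE_LOG)):
        print(f"{end - start}s without service alerts; host alerts: {hosts}")
```

To try it against a real log, pass open("nagios.log") to parse_alerts()
in place of SAMPLE_LOG, adjusting the path to wherever your log lives.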

Looking at the times on the Service Detail page during all of this confirmed
that the service checks weren't being done.  Using the CGIs to reschedule
the event had no effect.

I distinctly remember looking at the log when I first had this problem
months ago, and noting very little activity in it, so I am pretty sure this
is not the same cause (but perhaps they were related).

Sounds like a bug to me, but I'm hoping to get some feedback before I post
to the devel list.

============================
Darren Gamble
Planner, Regional Services
Shaw Cablesystems GP
630 - 3rd Avenue SW
Calgary, Alberta, Canada
T2P 4L4
(403) 781-4948

