Nagios occasionally does not send notifications when a service goes down

Andy Barker andrew.barker at nottingham.ac.uk
Tue Feb 22 10:42:19 CET 2005


My *guess* is that your contacts (in contacts.cfg) are set to receive
notifications only during a certain time period, usually "workhours"
(which is initially 9am-5pm on weekdays). Time periods are defined in
timeperiods.cfg.

Either that, or the host in question uses the workhours time period in
its definition.
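
To make the guess concrete, here is a minimal sketch that checks whether
the alert timestamp from the quoted log would fall inside a 9am-5pm
weekday window. The helper names (in_workhours, parse_alert_time) are
made up for illustration and are not part of Nagios, and the window is
only assumed to match what the contact or host on that installation
actually uses.

```python
from datetime import datetime, time

# Assumed window: 09:00-17:00, Monday to Friday (the initial "workhours").
# The real window is whatever timeperiods.cfg defines on that system.
WORKHOURS_START = time(9, 0)
WORKHOURS_END = time(17, 0)


def in_workhours(ts: datetime) -> bool:
    """Return True if ts is on a weekday between 09:00 and 17:00."""
    is_weekday = ts.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and WORKHOURS_START <= ts.time() < WORKHOURS_END


def parse_alert_time(stamp: str) -> datetime:
    """Parse a timestamp as shown in the alert history (MM-DD-YYYY HH:MM:SS)."""
    return datetime.strptime(stamp, "%m-%d-%Y %H:%M:%S")


if __name__ == "__main__":
    # Timestamp of the HARD host and service alerts in the quoted log.
    alert = parse_alert_time("02-20-2005 18:08:43")
    print(alert.isoformat(), "inside workhours:", in_workhours(alert))
```

The alert in the quoted history was logged on a Sunday evening, so under
this assumed window it falls outside the period, which would be
consistent with no notification being recorded at that time.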

Andy



On Mon, 2005-02-21 at 11:53 -0600, Toby Kraft wrote:
> 
> Hi all, 
> 
> I've been using Nagios 1.2 (and Netsaint before) with some clients for
> a while.  One installation (on Fedora Core 2) has an issue where a
> service will go down, but Nagios does not send any notification. 
> 
> The service check is a simple tcp port check, the host_alive_check is
> *default (ping), the host can be pinged.  This host has one and only
> one service.  It's a pretty vanilla install and everything works fine
> most of the time. 
> 
> This past weekend, a host went down.  No notifications were sent.
>  Monday morning the staff came in, saw the host was down and restarted
> it.  After they restarted the target host, Nagios then sent out a
> bunch of Host Down alerts followed by a Host Up alert.  Notifications
> for this server or host were NOT disabled (nagios.log archives show
> they were enabled on 2/9/05). 
> 
> Okay now you're saying - it's your mail server.  But Nagios did not
> log any notifications at the time of the problem! 
> 
> The Host Alert History shows: 
> Sun Feb 20 00:00:00 CST 2005 to Mon Feb 21 00:00:00 CST 2005  
> 
> [02-20-2005 18:08:43] SERVICE ALERT: ucisvr5.champlabs.com;Sandbox -
> DB;CRITICAL;HARD;1;Connection refused or timed out
> [02-20-2005 18:08:43] HOST ALERT: ucisvr5.champlabs.com;DOWN;
> HARD;3;/bin/ping -n -U -c 1 ucisvr5.champlabs.com
> [02-20-2005 18:08:40] HOST ALERT: ucisvr5.champlabs.com;DOWN;
> SOFT;2;/bin/ping -n -U -c 1 ucisvr5.champlabs.com
> [02-20-2005 18:08:37] HOST ALERT: ucisvr5.champlabs.com;DOWN;
> SOFT;1;/bin/ping -n -U -c 1 ucisvr5.champlabs.com 
> 
> The Host Notification History shows: 
> Sun Feb 20 00:00:00 CST 2005 to Mon Feb 21 00:00:00 CST 2005  
> No notifications have been recorded for this host in this archived log
> file  
> 
> The Service Alert History shows: 
> Sun Feb 20 00:00:00 CST 2005 to Mon Feb 21 00:00:00 CST 2005  
> [02-20-2005 18:08:43] SERVICE ALERT: ucisvr5.champlabs.com;Sandbox -
> DB;CRITICAL;HARD;1;Connection refused or timed out  
> 
> The Service Notification History shows: 
> Sun Feb 20 00:00:00 CST 2005 to Mon Feb 21 00:00:00 CST 2005  
> No notifications have been recorded for this service in this archived
> log file  
> 
> It seems that this occurs after Nagios has been up and running for a
> while.  The system and Nagios have been up for 11 days which doesn't
> seem like a long time. 
> 
> Mainly just fishing for any ideas on what could cause this or how to
> troubleshoot the problem.  It would be nice if Nagios logged some info
> when it processes an event and then decides NOT to send a
> notification, like "Notification for event xxxx suppressed because
> yyyyy" or some such. 
> 
> Thanks for listening.  I'll check into any debug and/or logging
> options. 
> 
> Toby 
> 






_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
::: Messages without supporting info will risk being sent to /dev/null




