Nagios Notifications not being Sent Out

Goutos, Kevin kgoutos at libertymgt.com
Fri Oct 16 16:31:27 CEST 2009


Thanks for the reply Marc.

I do have some good news: I received a notification last night for a
flapping alert.  However, Nagios still is not sending out alerts when a
host or service goes down, recovers to an UP state, etc.

[1255678430] SERVICE FLAPPING ALERT: AUSTIN-LAPTOP;CPU Load;STOPPED; Service appears to have stopped flapping (4.0% change < 5.0% threshold)
[1255678430] SERVICE NOTIFICATION: libertyadmins;AUSTIN-LAPTOP;CPU Load;FLAPPINGSTOP (CRITICAL);notify-service-by-email;No route to host

Looking at this portion of the log, it seems notifications should have
been sent out, but none were:

[1255547104] HOST ALERT: CORP-NSS;DOWN;SOFT;1;CRITICAL - Host Unreachable (IP ADDRESS)
[1255547104] SERVICE ALERT: CORP-NSS;CPU Load;CRITICAL;HARD;1;No route to host
[1255547104] SERVICE ALERT: CORP-NSS;NSClient++ Version;CRITICAL;HARD;1;No route to host
[1255547174] HOST ALERT: CORP-NSS;DOWN;SOFT;2;CRITICAL - Host Unreachable (IP ADDRESS)
[1255547234] SERVICE ALERT: CORP-NSS;Ping Test;CRITICAL;HARD;1;CRITICAL - Host Unreachable (IP ADDRESS)
[1255547244] HOST ALERT: CORP-NSS;DOWN;SOFT;3;CRITICAL - Host Unreachable (IP ADDRESS)
[1255547314] HOST ALERT: CORP-NSS;DOWN;SOFT;4;CRITICAL - Host Unreachable (IP ADDRESS)
[1255547384] HOST ALERT: CORP-NSS;DOWN;SOFT;5;CRITICAL - Host Unreachable (IP ADDRESS)
[1255547454] HOST ALERT: CORP-NSS;DOWN;SOFT;6;CRITICAL - Host Unreachable (IP ADDRESS)
[1255547474] SERVICE ALERT: CORP-NSS;Used Disk Space;CRITICAL;HARD;1;No route to host
[1255547474] SERVICE ALERT: CORP-NSS;System Uptime;CRITICAL;HARD;1;No route to host
[1255547524] HOST ALERT: CORP-NSS;DOWN;SOFT;7;CRITICAL - Host Unreachable (IP ADDRESS)
[1255547584] SERVICE ALERT: CORP-NSS;Memory Usage;CRITICAL;HARD;1;No route to host
[1255547594] HOST ALERT: CORP-NSS;DOWN;SOFT;8;CRITICAL - Host Unreachable (IP ADDRESS)
[1255547664] HOST ALERT: CORP-NSS;DOWN;SOFT;9;CRITICAL - Host Unreachable (IP ADDRESS)
[1255547734] HOST ALERT: CORP-NSS;DOWN;HARD;10;CRITICAL - Host Unreachable (IP ADDRESS)
[1255556204] HOST ALERT: CORP-NSS;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 10.40 ms
[1255556234] SERVICE ALERT: CORP-NSS;Ping Test;OK;HARD;1;PING OK - Packet loss = 0%, RTA = 7.97 ms
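
A log excerpt like the one above can be checked mechanically for HARD
alerts that never produced a notification. The following is only a rough
Python sketch, not a Nagios tool: the function names are hypothetical,
and it assumes the "[epoch] TYPE: field;field;..." layout seen in these
entries, with the host name as the second field of a NOTIFICATION entry.

```python
import re
from datetime import datetime, timezone

# Matches "[epoch] TYPE: rest", the layout of the log entries quoted above.
LINE_RE = re.compile(r"^\[(\d+)\] ([A-Z ]+): (.*)$")

def parse_line(line):
    """Return (utc_datetime, entry_type, fields), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    when = datetime.fromtimestamp(int(m.group(1)), tz=timezone.utc)
    return when, m.group(2), m.group(3).split(";")

def hard_alerts_without_notification(lines):
    """List HARD alerts for hosts that never appear in a NOTIFICATION entry."""
    alerts, notified_hosts = [], set()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        when, kind, fields = parsed
        if kind.endswith("NOTIFICATION"):
            # Assumed field layout: contact;host;... for notification entries.
            notified_hosts.add(fields[1])
        elif kind in ("HOST ALERT", "SERVICE ALERT") and "HARD" in fields:
            alerts.append((when, kind, fields[0]))
    return [a for a in alerts if a[2] not in notified_hosts]

sample = [
    "[1255547734] HOST ALERT: CORP-NSS;DOWN;HARD;10;CRITICAL - Host Unreachable (IP ADDRESS)",
    "[1255556204] HOST ALERT: CORP-NSS;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 10.40 ms",
    "[1255678430] SERVICE NOTIFICATION: libertyadmins;AUSTIN-LAPTOP;CPU Load;"
    "FLAPPINGSTOP (CRITICAL);notify-service-by-email;No route to host",
]
for when, kind, host in hard_alerts_without_notification(sample):
    print(when.isoformat(), kind, host)
```

Matching on host name alone is deliberately coarse; a stricter check
would pair each alert with a notification for the same host, service,
and time window.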

I did remove the w and c options and replaced them with 'd' where you
noted.  


"I am assuming that the members of the libertyadminsgroup all use the  
generic-contact template you provided with minimal modifications. It  
wouldn't hurt to provide the complete object for this and a sample  
problem service from both objects.cache and status.dat when the  
problem occurs."

That is correct; the libertyadminsgroup right now contains just my e-mail
address, and I'm using the generic-contact template.

Here is something I don't understand. What I pasted below is from
objects.cache. Why does it show the service and host notification options
set to only f,s? Below that is what I have in contacts.cfg.

Objects.cache

define contact {
	contact_name	libertyadmins
	alias	Liberty Admins
	service_notification_period	24x7
	host_notification_period	24x7
	service_notification_options	f,s
	host_notification_options	f,s
	service_notification_commands	notify-service-by-email
	host_notification_commands	notify-host-by-email
	email	kgoutos at libertymgt.com
	host_notifications_enabled	1
	service_notifications_enabled	1
	can_submit_commands	1
	retain_status_information	1
	retain_nonstatus_information	1
	}


Contacts.cfg

define contact{
	contact_name                    libertyadmins           ; Short name of user
	use                             generic-contact         ; Inherit default values from generic-contact template (defined above)
	alias                           Liberty Admins          ; Full name of user
	host_notifications_enabled      1
	service_notifications_enabled   1
	host_notification_commands      notify-host-by-email
	service_notification_commands   notify-service-by-email
	host_notification_period        24x7
	service_notification_period     24x7
	host_notification_options       d,u,r,f,s
	service_notification_options    d,u,r,f,s
	email                           kgoutos at libertymgt.com  ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
	}
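
The mismatch between the two contact definitions above can be made
explicit with a small comparison. This is a minimal Python sketch with
hypothetical helper names; it assumes each directive sits on one line and
that anything after a ';' is a comment, as in the snippets above. It only
reports the difference and does not explain why the cache differs.

```python
def parse_define(text):
    """Parse one 'define ... { ... }' block into a dict of directive -> value."""
    directives = {}
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop trailing ';' comments
        if not line or line.startswith("define") or line == "}":
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            directives[parts[0]] = parts[1].strip()
    return directives

def option_set(value):
    """Split a comma-separated option string into a set of letters."""
    return {o.strip() for o in value.split(",") if o.strip()}

def option_diff(configured, cached, keys):
    """Return {key: options present in the config but missing from the cache}."""
    diff = {}
    for key in keys:
        missing = option_set(configured.get(key, "")) - option_set(cached.get(key, ""))
        if missing:
            diff[key] = sorted(missing)
    return diff

contacts_cfg = """
define contact{
    contact_name                  libertyadmins   ; Short name of user
    host_notification_options     d,u,r,f,s
    service_notification_options  d,u,r,f,s
    }
"""
objects_cache = """
define contact {
    contact_name                  libertyadmins
    service_notification_options  f,s
    host_notification_options     f,s
    }
"""
keys = ["host_notification_options", "service_notification_options"]
diff = option_diff(parse_define(contacts_cfg), parse_define(objects_cache), keys)
print(diff)
```

For the values quoted in this message, the options d, u, and r are the
ones present in contacts.cfg but absent from objects.cache for both
directives.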




This is from status.dat

hoststatus {
	host_name=AUSTIN-LAPTOP
	modified_attributes=0
	check_command=check-host-alive
	check_period=24x7
	notification_period=24x7
	check_interval=1.000000
	retry_interval=1.000000
	event_handler=
	has_been_checked=1
	should_be_scheduled=1
	check_execution_time=4.027
	check_latency=0.301
	check_type=0
	current_state=0
	last_hard_state=0
	last_event_id=906
	current_event_id=933
	current_problem_id=0
	last_problem_id=339
	plugin_output=PING OK - Packet loss = 0%, RTA = 20.01 ms
	long_plugin_output=
	performance_data=rta=20.006001ms;3000.000000;5000.000000;0.000000 pl=0%;80;100;0
	last_check=1255702500
	next_check=1255702570
	check_options=0
	current_attempt=1
	max_attempts=10
	state_type=1
	last_state_change=1255701660
	last_hard_state_change=1255701660
	last_time_up=1255702510
	last_time_down=1255701600
	last_time_unreachable=0
	last_notification=0
	next_notification=0
	no_more_notifications=0
	current_notification_number=0
	current_notification_id=89769
	notifications_enabled=1
	problem_has_been_acknowledged=0
	acknowledgement_type=0
	active_checks_enabled=1
	passive_checks_enabled=1
	event_handler_enabled=1
	flap_detection_enabled=1
	failure_prediction_enabled=1
	process_performance_data=1
	obsess_over_host=1
	last_update=1255702560
	is_flapping=0
	percent_state_change=4.54
	scheduled_downtime_depth=0
	}
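
A status.dat block like the one above can be reduced to the handful of
fields that usually bear on whether a notification goes out. The Python
sketch below is illustrative only: the helper names are hypothetical, and
the choice of fields (enabled flag, flapping, downtime, acknowledgement)
is an assumption about what is worth checking, not an exhaustive list of
Nagios notification filters.

```python
def parse_status_block(text):
    """Parse the key=value lines of one status.dat block into a dict."""
    fields = {}
    for raw in text.splitlines():
        line = raw.strip()
        if "=" not in line or line.endswith("{"):
            continue
        # Split on the first '=' only; values such as plugin_output contain '='.
        key, _, value = line.partition("=")
        fields[key] = value
    return fields

def notification_summary(fields):
    """Summarize the fields most relevant to notification behaviour."""
    return {
        "notifications_enabled": fields.get("notifications_enabled") == "1",
        "ever_notified": fields.get("last_notification", "0") != "0",
        "is_flapping": fields.get("is_flapping") == "1",
        "in_downtime": int(fields.get("scheduled_downtime_depth", "0")) > 0,
        "acknowledged": fields.get("problem_has_been_acknowledged") == "1",
    }

block = """
hoststatus {
    host_name=AUSTIN-LAPTOP
    plugin_output=PING OK - Packet loss = 0%, RTA = 20.01 ms
    last_notification=0
    notifications_enabled=1
    problem_has_been_acknowledged=0
    is_flapping=0
    scheduled_downtime_depth=0
    }
"""
summary = notification_summary(parse_status_block(block))
print(summary)
```

For the values quoted above, this reports notifications as enabled, with
no flapping, downtime, or acknowledgement in effect, and no notification
ever recorded for the host.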


This is from nagios.log. I simply unplugged AUSTIN-LAPTOP (my test
machine):

[1255703030] SERVICE ALERT: AUSTIN-LAPTOP;System Uptime;CRITICAL;HARD;1;No route to host
[1255703030] SERVICE ALERT: AUSTIN-LAPTOP;Memory Usage;CRITICAL;HARD;1;No route to host
[1255703030] SERVICE ALERT: AUSTIN-LAPTOP;Ping Test;CRITICAL;HARD;1;CRITICAL - Host Unreachable (192.168.95.232)
[1255703040] SERVICE ALERT: AUSTIN-LAPTOP;NSClient++ Version;CRITICAL;HARD;1;No route to host
[1255703040] SERVICE ALERT: AUSTIN-LAPTOP;Used Disk Space;CRITICAL;HARD;1;No route to host

That's all I see in the log; there is nothing about notifications at all.




- Verify that notifications are enabled program-wide in nagios.cfg.  

I was able to confirm everything was enabled.

- Verify that it hasn't been disabled via the GUI (Program Status)

Also confirmed.

- Verify that notifications for the specific service haven't been  
disabled via the GUI (click on them and look or look for them in  
status.dat)

Confirmed.

- See the 'Query regarding Nagios notification' thread from yesterday  
so we don't have to repeat further.

I reviewed this thread and double-checked everything he was having
trouble with.


Thank you very much for the help! Please let me know if you need any
other information!



-----Original Message-----
From: Marc Powell [mailto:marc at ena.com] 
Sent: Thursday, October 15, 2009 5:36 PM
To: Nagios-users at lists.sourceforge.net users
Subject: Re: [Nagios-users] Nagios Notifications not being Sent Out


On Oct 15, 2009, at 3:40 PM, Goutos, Kevin wrote:

> Hello all,

Hello.

>  That shows a test host I've been using, I don't see anything in  
> there about sending out a notification though..don't know if I  
> should be. ..
>
>
> Oct 15 16:11:50 nagios nagios: SERVICE ALERT: AUSTIN-LAPTOP;CPU Load;OK;HARD;1;CPU Load 2% (5 min average)
> Oct 15 16:11:50 nagios nagios: SERVICE ALERT: AUSTIN-LAPTOP;Memory Usage;OK;HARD;1;Memory usage: total:2440.61 Mb - used: 402.64 Mb (16%) - free: 2037.97 Mb (84%)
> Oct 15 16:11:50 nagios nagios: SERVICE ALERT: AUSTIN-LAPTOP;NSClient++ Version;OK;HARD;1;NSClient++ 0.3.6.818 2009-06-14
> Oct 15 16:11:50 nagios nagios: SERVICE ALERT: AUSTIN-LAPTOP;System Uptime;OK;HARD;1;System Uptime - 1 day(s) 18 hour(s) 57 minute(s)
> Oct 15 16:11:50 nagios nagios: SERVICE ALERT: AUSTIN-LAPTOP;Used Disk Space;OK;HARD;1;c:\ - total: 37.26 Gb - used: 21.69 Gb (58%) - free 15.57 Gb (42%)
> Oct 15 16:12:00 nagios nagios: SERVICE ALERT: AUSTIN-LAPTOP;Ping Test;OK;HARD;1;PING OK - Packet loss = 0%, RTA = 3.62 ms
> Oct 15 16:12:00 nagios nagios: HOST ALERT: AUSTIN-LAPTOP;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 0.55 ms

These are all OK states. Do you have any examples of non-OK states  
when you expect a notification to have been sent? Please also provide  
a few entries prior to the state change and all the way through from  
SOFT to HARD state.

> define contact{
>        name                            generic-contact         ; The name of this contact template
>
>        host_notification_options       w,u,c,r,f,s             ; send notifications for all host states, flapping events, and scheduled downtime events

w and c are not valid for host_notification_options. You want 'd'  
instead.
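
A check like this can be applied across all templates with a few lines
of Python. This is a sketch under stated assumptions: the valid letter
sets are taken from the Nagios 3 object definition documentation as best
recalled and should be verified against the installed version, and the
function name is hypothetical.

```python
# Assumed valid option letters (Nagios 3 object definitions); verify for your version.
VALID_HOST_OPTIONS = set("durfsn")      # down, unreachable, recovery, flapping, downtime, none
VALID_SERVICE_OPTIONS = set("wucrfsn")  # warning, unknown, critical, recovery, flapping, downtime, none

def invalid_options(value, valid):
    """Return the option letters in a comma-separated value that are not in `valid`."""
    letters = [o.strip() for o in value.split(",") if o.strip()]
    return [o for o in letters if o not in valid]

# The host option string from the generic-contact template quoted above.
print(invalid_options("w,u,c,r,f,s", VALID_HOST_OPTIONS))  # ['w', 'c']
# The same directive after replacing w and c with d.
print(invalid_options("d,u,r,f,s", VALID_HOST_OPTIONS))    # []
```

The same function can be pointed at service directives by passing
VALID_SERVICE_OPTIONS instead.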

> define host{
>        name                               generic-host

>        notification_options               w,u,c,r,f,s          ; Only send notifications for specific host states

w and c are not valid for host notification_options. You want 'd'  
instead. Check all your host templates.

>        contact_groups                     libertyadminsgroup   ; Notifications get sent to the admins by default
>        register                           0                    ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
>         }

I am assuming that the members of the libertyadminsgroup all use the  
generic-contact template you provided with minimal modifications. It  
wouldn't hurt to provide the complete object for this and a sample  
problem service from both objects.cache and status.dat when the  
problem occurs.

- Verify that notifications are enabled program-wide in nagios.cfg
- Verify that it hasn't been disabled via the GUI (Program Status)
- Verify that notifications for the specific service haven't been  
disabled via the GUI (click on them and look or look for them in  
status.dat)
- See the 'Query regarding Nagios notification' thread from yesterday  
so we don't have to repeat further.

--
Marc
------------------------------------------------------------------------
------
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and
stay 
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when
reporting any issue. 
::: Messages without supporting info will risk being sent to /dev/null





