Nagios not sending host or service notifications

Tim Philips timp at dts.net.nz
Thu Jun 4 08:48:15 CEST 2009


Hi,

We have hit a weird situation with Nagios and I hope someone in the 
community can assist or provide some details of additional 
troubleshooting steps we can/should take.

The problem is that Nagios doesn't seem to be sending out host or 
service notifications.  I have done the following debugging:

   1. Ensured there are no Nagios configuration errors with nagios -v
      nagios.cfg (no configuration errors)
   2. Stopped and started Nagios to ensure there are no configuration
      errors that -v isn't aware of (no errors, and Nagios restarts
      correctly)
   3. Stopped Nagios, deleted both objects.cache and retention.dat, and
      restarted to ensure there was nothing stale left over from an
      earlier configuration
   4. Ensured that notifications are enabled at the service, host and
      global level (they are)
   5. Reviewed the Nagios configuration (both in the files and via
      "view config" from the Nagios UI)

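The global part of step 4 can also be checked outside the UI. A minimal sketch, assuming the main configuration file uses the standard enable_notifications directive; the function name and the path in the usage comment are hypothetical, and a value changed at runtime through the UI would not show up here:

```shell
# Print the last enable_notifications value (0 or 1) set in a Nagios
# main configuration file. Prints nothing if the directive is absent.
# $1: path to nagios.cfg (supplied by the caller; varies by install)
global_notifications_setting() {
    sed -n 's/^[[:space:]]*enable_notifications[[:space:]]*=[[:space:]]*\([01]\).*$/\1/p' "$1" | tail -n 1
}

# Usage (hypothetical path):
#   global_notifications_setting /usr/local/nagios/etc/nagios.cfg
```
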

When a host (or service) experiences a problem the normal process seems 
to be happening (from what I see in the UI and the logs):

[1244096420] SERVICE ALERT: localhost;root /dev/mapper/VolGroup00-LogVol00;CRITICAL;SOFT;1;DISK CRITICAL - free space: / 17736 MB (4% inode=99%):
[1244096480] SERVICE ALERT: localhost;root /dev/mapper/VolGroup00-LogVol00;CRITICAL;SOFT;2;DISK CRITICAL - free space: / 17739 MB (4% inode=99%):
[1244096540] SERVICE ALERT: localhost;root /dev/mapper/VolGroup00-LogVol00;CRITICAL;SOFT;3;DISK CRITICAL - free space: / 17739 MB (4% inode=99%):
[1244096600] SERVICE ALERT: localhost;root /dev/mapper/VolGroup00-LogVol00;CRITICAL;HARD;4;DISK CRITICAL - free space: / 17739 MB (4% inode=99%):
[1244096600] SERVICE NOTIFICATION: nagiosadmin;localhost;root /dev/mapper/VolGroup00-LogVol00;CRITICAL;notify-service-by-email;DISK CRITICAL - free space: / 17739 MB (4% inode=99%):
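
The bracketed numbers in these log lines are Unix epoch seconds. A small sketch for turning them into readable UTC times, assuming GNU date (the -d @N form is not available in every date implementation); the function name is made up for illustration:

```shell
# Convert a Unix epoch timestamp (seconds) to a UTC date string.
# Assumes GNU date, which accepts "@SECONDS" as a -d argument.
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

# Usage, with the timestamp of the HARD state entry above:
#   epoch_to_utc 1244096600    # prints 2009-06-04 06:23:20
```
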

After enabling debugging (debug_level=32) I see the following in the 
debug log:

[1244096600.182820] [032.0] [pid=7951] ** Service Notification Attempt ** Host: 'localhost', Service: 'root /dev/mapper/VolGroup00-LogVol00', Type: 0, Options: 0, Current State: 2, Last Notification: Thu Jan  1 12:00:00 1970
[1244096600.182945] [032.0] [pid=7951] Notification viability test passed.
[1244096600.182986] [032.1] [pid=7951] Current notification number: 1 (incremented)
[1244096600.183016] [032.1] [pid=7951] Service notification will NOT be escalated.
[1244096600.183042] [032.1] [pid=7951] Adding normal contacts for service to notification list.
[1244096600.198454] [032.0] [pid=7951] 1 contacts were notified.  Next possible notification time: Thu Jun  4 19:23:20 2009
[1244096600.198514] [032.0] [pid=7951] 1 contacts were notified.

I can send external e-mail manually with the mail command (I have tried 
this several times), and I can confirm that the command_line for the 
notification command also works when I run it manually:

# 'notify-service-by-email' command definition
define command{
         command_name    notify-service-by-email
         command_line    /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$" | /usr/bin/mailx -s "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
         }
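
To try the body of this command_line by hand without sending mail, the macros can be replaced with shell arguments. A rough sketch; the function name and the values in the usage comment are illustrative only, and Nagios itself substitutes the $...$ macros before running the command:

```shell
# Render the notification body the same way the command_line's
# printf "%b" does, with the Nagios macros replaced by arguments.
# Args: notification type, service description, host alias, address,
#       service state, date/time, plugin output
render_service_notification() {
    printf '%b' "***** Nagios *****\n\nNotification Type: $1\n\nService: $2\nHost: $3\nAddress: $4\nState: $5\n\nDate/Time: $6\n\nAdditional Info:\n\n$7"
}

# Usage (sample values, printed to the terminal instead of mailed):
#   render_service_notification PROBLEM \
#       'root /dev/mapper/VolGroup00-LogVol00' localhost 127.0.0.1 \
#       CRITICAL "$(date)" 'DISK CRITICAL - free space: / 17739 MB'
```

Piping the output into the same mailx invocation as above, while logged in as the user Nagios runs as, exercises the pipeline under conditions closer to a real notification.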

I have also adjusted the command_line, changing it to something like 
this (for example):

     command_line    echo "testing" | /usr/bin/mail -s "** TEST SUBJECT **" my at email.address

Nagios reports no configuration errors and starts up, but it still 
doesn't seem to fire the command!
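
One way to separate "Nagios never runs the command" from "the command runs but the mail goes nowhere" is to point command_line at something that only writes to a file. A minimal sketch, assuming a log path writable by the user Nagios runs as; the function name and the path in the usage comment are hypothetical and not part of Nagios:

```shell
# Append one line per invocation to a log file: epoch time, numeric
# user ID of the caller, and the arguments that were passed in.
# $1: log file path; remaining args: anything worth recording
record_invocation() {
    log_file=$1
    shift
    printf '%s uid=%s args=%s\n' "$(date +%s)" "$(id -u)" "$*" >> "$log_file"
}

# Usage (hypothetical path):
#   record_invocation /tmp/nagios-notify-test.log PROBLEM localhost
```

Saved in a script and called from a temporary command_line, an entry in the log file after the next HARD state change would show that the command was executed, and under which user ID.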


That said, I'm not seeing anything in the mail server logs to indicate 
that Nagios is even attempting to run notify-service-by-email and send 
e-mail.

I would really appreciate any suggestions anyone can offer.


Cheers

-- 

Tim Philips

