service notification when host is down

Samuel Bancal sam.bancal at gmail.com
Wed Feb 17 16:42:55 CET 2010


Nagios Core 3.2.0
nagios-plugins-1.4.14
Ubuntu server 8.04.3 LTS

Hi,

I'm having trouble configuring notifications for the case where a server
no longer responds to PING (ICMP).
I don't understand why Nagios skips steps when it runs the "ICMP"
service check.
Here is the config:

define host{
  use                    generic-server
  host_name              server1
  alias                  server1
  address                the.ip.the.ip
  hostgroups             prod-servers
  contact_groups         group1
  check_command          check-host-alive
  check_period           24x7
  check_interval         5
  retry_interval         1
  max_check_attempts     4
  notification_period    24x7
  notification_interval  60
  notification_options   d,u,r
}

define service{
  use                     generic-service
  host_name               server1
  service_description     ICMP
  check_command           check_icmp!100.0,20%!500.0,60%
  max_check_attempts      4
  normal_check_interval   5
  retry_check_interval    1
  notification_options    w,u,c,r
  notification_interval   60
  notification_period     24x7
}
[...]
define command{
  command_name    check-host-alive
  command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
}
define command{
  command_name    check_icmp
  command_line    $USER1$/check_icmp -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
}
[...]

Here is an example of the history I get:
[2010-02-16 11:33:13] SERVICE ALERT: server1;ICMP;CRITICAL;SOFT;1;CRITICAL - the.ip.the.ip: rta nan, lost 100%
[2010-02-16 11:33:43] HOST ALERT: server1;DOWN;SOFT;1;(Host Check Timed Out)
[2010-02-16 11:34:13] SERVICE ALERT: server1;ICMP;CRITICAL;HARD;1;CRITICAL - the.ip.the.ip: rta nan, lost 100%
[2010-02-16 11:34:43] HOST ALERT: server1;DOWN;SOFT;2;(Host Check Timed Out)
[2010-02-16 11:35:23] HOST ALERT: server1;DOWN;SOFT;3;(Host Check Timed Out)
[2010-02-16 11:36:33] HOST ALERT: server1;DOWN;HARD;4;(Host Check Timed Out)
[2010-02-16 11:37:43] HOST ALERT: server1;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 0.67 ms
[2010-02-16 11:39:13] SERVICE ALERT: server1;ICMP;OK;HARD;1;OK - the.ip.the.ip: rta 0.943ms, lost 0%

Or later:
[2010-02-16 11:42:03] HOST ALERT: server1;DOWN;SOFT;1;(Host Check Timed Out)
[2010-02-16 11:43:13] HOST ALERT: server1;DOWN;SOFT;2;(Host Check Timed Out)
[2010-02-16 11:44:13] SERVICE ALERT: server1;ICMP;CRITICAL;HARD;1;CRITICAL - the.ip.the.ip: rta nan, lost 100%
[2010-02-16 11:44:43] HOST ALERT: server1;DOWN;SOFT;3;(Host Check Timed Out)
[2010-02-16 11:45:53] HOST ALERT: server1;UP;SOFT;4;PING OK - Packet loss = 0%, RTA = 0.64 ms
[2010-02-16 11:49:13] SERVICE ALERT: server1;ICMP;OK;HARD;1;OK - the.ip.the.ip: rta 0.948ms, lost 0%
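
In case it helps anyone line up the host and service transitions, here is a
minimal Python sketch that splits alert lines like the ones above into
fields. The names parse_alert and Alert are hypothetical, and the field
order (host;state;state type;attempt;output for hosts, with the service
description inserted after the host for services) is only assumed from the
history shown here. Each alert must be on a single line.

```python
import re
from typing import NamedTuple, Optional

# Matches "[timestamp] HOST ALERT: ..." or "[timestamp] SERVICE ALERT: ...".
# The bracketed timestamp is kept as an opaque string.
ALERT_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<kind>HOST|SERVICE) ALERT:\s+(?P<rest>.*)$"
)


class Alert(NamedTuple):
    timestamp: str
    kind: str                # "HOST" or "SERVICE"
    host: str
    service: Optional[str]   # None for host alerts
    state: str               # e.g. DOWN, UP, CRITICAL, OK
    state_type: str          # SOFT or HARD
    attempt: int
    output: str


def parse_alert(line: str) -> Optional[Alert]:
    """Return an Alert for a single-line alert entry, or None if it does not match."""
    m = ALERT_RE.match(line.strip())
    if m is None:
        return None
    # Host alerts carry 5 semicolon-separated fields, service alerts 6.
    n_fields = 5 if m["kind"] == "HOST" else 6
    parts = m["rest"].split(";", n_fields - 1)
    if len(parts) != n_fields:
        return None
    if m["kind"] == "HOST":
        host, state, state_type, attempt, output = parts
        service = None
    else:
        host, service, state, state_type, attempt, output = parts
    if not attempt.isdigit():
        return None
    return Alert(m["ts"], m["kind"], host, service, state, state_type,
                 int(attempt), output)


SAMPLE = """\
[2010-02-16 11:33:13] SERVICE ALERT: server1;ICMP;CRITICAL;SOFT;1;CRITICAL - the.ip.the.ip: rta nan, lost 100%
[2010-02-16 11:33:43] HOST ALERT: server1;DOWN;SOFT;1;(Host Check Timed Out)
[2010-02-16 11:34:13] SERVICE ALERT: server1;ICMP;CRITICAL;HARD;1;CRITICAL - the.ip.the.ip: rta nan, lost 100%
"""

if __name__ == "__main__":
    for raw in SAMPLE.splitlines():
        alert = parse_alert(raw)
        if alert is not None:
            target = alert.service or "(host)"
            print(alert.timestamp, target, alert.state,
                  alert.state_type, alert.attempt)
```

Printing only the timestamp, target, state, state type and attempt makes it
easier to see where the service goes to HARD on attempt 1 while the host is
still in a SOFT state.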

Does anyone have a clue?

Regards,
Samuel Bancal
-- 
Samuel Bancal - CH