service notification when host is down

Samuel Bancal sam.bancal at gmail.com
Thu Feb 18 10:47:52 CET 2010


Thanks for your answer,

I agree that this part is normal behavior.
What does not look normal to me is that, between two consecutive checks, Nagios
jumps from "SOFT 1" straight to "HARD 1" instead of going through "SOFT 1" >
"SOFT 2" > "SOFT 3" and finally "HARD 4".
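
To make the sequence I expected concrete, here is a minimal sketch in Python of
how I understand the retry logic. It is only a simplified model of my
expectation, not Nagios code: the helper name expected_states is hypothetical,
and it assumes each failed check increments the attempt counter until
max_check_attempts is reached.

```python
def expected_states(max_check_attempts, failed_checks):
    """Return the (state_type, attempt) pairs expected for consecutive failures.

    Simplified model (an assumption, not the Nagios implementation): the state
    stays SOFT while the attempt counter is below max_check_attempts and
    becomes HARD once the counter reaches it.
    """
    states = []
    for attempt in range(1, failed_checks + 1):
        # The attempt counter never goes past max_check_attempts.
        current = min(attempt, max_check_attempts)
        state_type = "HARD" if current >= max_check_attempts else "SOFT"
        states.append((state_type, current))
    return states


# With max_check_attempts = 4, as in my service definition:
print(expected_states(max_check_attempts=4, failed_checks=4))
```

With my settings this gives SOFT 1, SOFT 2, SOFT 3, HARD 4, whereas the history
quoted below shows the service going from SOFT 1 directly to HARD 1.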

Regards,
Samuel Bancal

2010/2/17 Morris, Patrick <patrick.morris at hp.com>

> Samuel Bancal wrote:
>
>> Nagios Core 3.2.0
>> nagios-plugins-1.4.14
>> Ubuntu server 8.04.3 LTS
>>
>> Hi,
>>
>> I'm having trouble configuring notifications for the case where a server
>> no longer responds to PING (ICMP).
>> I don't understand why Nagios skips steps when it runs the
>> service check "icmp".
>> Here is the config:
>>
>> define host{
>>  use                    generic-server
>>  host_name              server1
>>  alias                  server1
>>  address                the.ip.the.ip
>>  hostgroups             prod-servers
>>  contact_groups         group1
>>  check_command          check-host-alive
>>  check_period           24x7
>>  check_interval         5
>>  retry_interval         1
>>  max_check_attempts     4
>>  notification_period    24x7
>>  notification_interval  60
>>  notification_options   d,u,r
>> }
>>
>> define service{
>>  use                     generic-service
>>  host_name               server1
>>  service_description     ICMP
>>  check_command           check_icmp!100.0,20%!500.0,60%
>>  max_check_attempts      4
>>  normal_check_interval   5
>>  retry_check_interval    1
>>  notification_options    w,u,c,r
>>  notification_interval   60
>>  notification_period     24x7
>> }
>> [...]
>> define command{
>>  command_name    check-host-alive
>>  command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
>> }
>> define command{
>>  command_name    check_icmp
>>  command_line    $USER1$/check_icmp -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
>> }
>> [...]
>>
>> Here is an example of the history I get:
>> Service Critical[2010-02-16 11:33:13] SERVICE ALERT:
>> server1;ICMP;CRITICAL;SOFT;1;CRITICAL - the.ip.the.ip: rta nan, lost 100%
>> Host Down[2010-02-16 11:33:43] HOST ALERT: server1;DOWN;SOFT;1;(Host Check
>> Timed Out)
>> Service Critical[2010-02-16 11:34:13] SERVICE ALERT:
>> server1;ICMP;CRITICAL;HARD;1;CRITICAL - the.ip.the.ip: rta nan, lost 100%
>> Host Down[2010-02-16 11:34:43] HOST ALERT: server1;DOWN;SOFT;2;(Host Check
>> Timed Out)
>> Host Down[2010-02-16 11:35:23] HOST ALERT: server1;DOWN;SOFT;3;(Host Check
>> Timed Out)
>> Host Down[2010-02-16 11:36:33] HOST ALERT: server1;DOWN;HARD;4;(Host Check
>> Timed Out)
>> Host Up[2010-02-16 11:37:43] HOST ALERT: server1;UP;HARD;1;PING OK -
>> Packet loss = 0%, RTA = 0.67 ms
>> Service Ok[2010-02-16 11:39:13] SERVICE ALERT: server1;ICMP;OK;HARD;1;OK -
>> the.ip.the.ip: rta 0.943ms, lost 0%
>>
>> Or later:
>> Host Down[2010-02-16 11:42:03] HOST ALERT: server1;DOWN;SOFT;1;(Host Check
>> Timed Out)
>> Host Down[2010-02-16 11:43:13] HOST ALERT: server1;DOWN;SOFT;2;(Host Check
>> Timed Out)
>> Service Critical[2010-02-16 11:44:13] SERVICE ALERT:
>> server1;ICMP;CRITICAL;HARD;1;CRITICAL - the.ip.the.ip: rta nan, lost 100%
>> Host Down[2010-02-16 11:44:43] HOST ALERT: server1;DOWN;SOFT;3;(Host Check
>> Timed Out)
>> Host Up[2010-02-16 11:45:53] HOST ALERT: server1;UP;SOFT;4;PING OK -
>> Packet loss = 0%, RTA = 0.64 ms
>> Service Ok[2010-02-16 11:49:13] SERVICE ALERT: server1;ICMP;OK;HARD;1;OK -
>> the.ip.the.ip: rta 0.948ms, lost 0%
>>
>
> If you're asking why Nagios runs a host check when it sees the service fail
> a check, that's normal behavior.
>
> When a service check fails, the first thing Nagios will do is look to see
> if the service failed because the host is down.
>



-- 
Samuel Bancal - CH