parent/child setup not working

Andy Shellam (Mailing Lists) andy.shellam-lists at mailnetwork.co.uk
Sat Jan 6 00:38:11 CET 2007


If I understand it right, your host checks are not scheduled, but your 
service checks are.
So every time a service check runs and Nagios finds the service down, it 
checks the host to see whether the host is down too.  If it is, Nagios 
suppresses notifications for the service and goes into the host's 
notification handling instead.

However I'm not sure if this is the case for escalated service 
notifications.  You have a notification_interval set - try commenting 
it out (or setting it to 0) and see whether the same thing still happens.
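
For example, here is a rough sketch of that change against the service 
entry you posted.  Only the notification_interval line differs; as far as 
I know, a value of 0 tells Nagios to send a single problem notification 
and no reminders.  I haven't tested this against your setup, so treat it 
as a starting point rather than a confirmed fix:

define service{
       use                             generic-service
       hostgroup_name                  webservers
       service_description             Check Simple Webservers
       is_volatile                     0
       check_period                    24x7
       max_check_attempts              5
       normal_check_interval           5
       retry_check_interval            2
       contact_groups                  ops
       notification_interval           0       ; was 120; 0 = no re-notification
       notification_period             24x7
       notification_options            w,u,c,r
       check_command                   check_http
       }

If the later "CRITICAL for ..." alert stops arriving with this in place, 
that would suggest it was a reminder rather than a fresh notification.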

Andy.


David Miller wrote:
> Andy Shellam (Mailing Lists) wrote:
>
> Arghh!  Sorry for the previous, content-free reply.
>
> The service entry is:
>
> define service{
>        use                             generic-service         ; Name of service template to use
>        hostgroup_name                  webservers
>        service_description             Check Simple Webservers
>        is_volatile                     0
>        check_period                    24x7
>        max_check_attempts              5
>        normal_check_interval           5
>        retry_check_interval            2
>        contact_groups                  ops
>        notification_interval           120
>        notification_period             24x7
>        notification_options            w,u,c,r
>        check_command                   check_http
>        }
>      
> But the point is, unless I'm missing something, that the service 
> should not be checked at all if the parent is down.
>
> Thanks!
>
> --- David
>
>> Hi David,
>>
>> I'm not clued up on parent/child relationships between hosts, however 
>> one thing I believe might be happening is that the example of the 
>> alert you've sent for the service - it might be a "reminder" 
>> notification that the service is still down.  (Perhaps as a result of 
>> escalation settings?)
>>
>> I think this because the state line includes a duration, i.e. 
>> "CRITICAL for xxxxx" as opposed to just "CRITICAL."
>>
>> What's the definition for that service?
>>
>> Andy.
>>
>>
>> David Miller wrote:
>>> Hi;
>>>
>>> I'm not sure what I'm doing wrong.
>>>
>>> Running nagios 2.5 on debian-stable.  I have the nagios server in 
>>> one data center monitoring 30ish servers in another data center.
>>>
>>> In the hosts.cfg file I have a gateway (firewall) defined:
>>>
>>> define host {
>>>         use                     generic-host    ; Name of host template to use
>>>         host_name               pix
>>>         alias                   PIX
>>>         address                 x.y.z.2
>>>         check_command           check-host-alive
>>>         max_check_attempts      1
>>>         notification_interval   1
>>>         notification_period     24x7
>>>         notification_options    d,u,r
>>>         }
>>>
>>>
>>> I then use that as a parent to all the hosts I want to monitor in 
>>> the remote data center.  Those have host entries like this;
>>>
>>>
>>> define host {
>>>         use                     generic-host    ; Name of host template to use
>>>         host_name               logweb1
>>>         alias                   Logweb1
>>>         address                 logweb1.foo.com
>>>         parents                 pix
>>>         max_check_attempts      1
>>>         active_checks_enabled   0
>>>         notification_interval   1
>>>         notification_period     24x7
>>>         notification_options    d,r
>>>         }
>>>
>>> As I read the documentation, when Nagios detects that host "pix" is 
>>> down, it won't check or report on host logweb1.
>>>
>>> If the network connection is broken, however, by deleting the 
>>> default route, I get three messages that the pix is down that look 
>>> like this:
>>>
>>> Subject:** PROBLEM alert 1 - PIX host is DOWN **
>>>
>>> ***** Nagios  *****
>>>
>>> Notification Type: PROBLEM
>>> Host: PIX
>>> State: DOWN for 0d 0h 0m 0s
>>> Address: 66.151.232.2
>>> Info:
>>>
>>> CRITICAL - Network unreachable (x.y.z.2)
>>>
>>> Date/Time: Fri Jan 5 16:17:48 EST 2007
>>>
>>> ACK by: Comment:
>>>
>>> And a few minutes later I get a notice on the child server:
>>>
>>> Subject: ** PROBLEM alert 1 - Logweb1/Check Simple Webservers is 
>>> CRITICAL **
>>>
>>> ***** Nagios  *****
>>>
>>> Notification Type: PROBLEM
>>>
>>> Service: Check Simple Webservers
>>> Host: Logweb1
>>> State: CRITICAL for 0d 0h 8m 6s
>>> Address: logweb1.foo.com
>>>
>>> Info:
>>>
>>> Network is unreachable
>>>
>>> Date/Time: Fri Jan 5 16:29:28 EST 2007
>>>
>>> ACK by: Comment:
>>>
>>> What am I doing wrong?
>>>
>>> Thanks in advance,
>>>
>>> --- David
>>>
>>>
>>>
>>> ------------------------------------------------------------------------- 
>>>
>>> Take Surveys. Earn Cash. Influence the Future of IT
>>> Join SourceForge.net's Techsay panel and you'll get the chance to 
>>> share your
>>> opinions on IT & business topics through brief surveys - and earn cash
>>> http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV 
>>>
>>> _______________________________________________
>>> Nagios-users mailing list
>>> Nagios-users at lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/nagios-users
>>> ::: Please include Nagios version, plugin version (-v) and OS when 
>>> reporting any issue. ::: Messages without supporting info will risk 
>>> being sent to /dev/null
>>>
>>>
>>>
>>>
>>>   
>>
>>
>
>
>


-- 
Andy Shellam
NetServe Support Team

the Mail Network
"an alternative in a standardised world"

p: +44 (0) 121 288 0832/0839
m: +44 (0) 7818 000834






