[Fwd: distributed monitoring host checking question]

Sean McAfee smcafee at collaborativefusion.com
Thu Jul 31 18:34:31 CEST 2008


Tom Ammon wrote:
> So, no thoughts on this question?
>
> -------- Original Message --------
> Subject: 	distributed monitoring host checking question
> Date: 	Wed, 30 Jul 2008 01:05:21 -0600
> From: 	Tom Ammon <tom.ammon at utah.edu>
> To: 	nagios-users at lists.sourceforge.net
>
>
>
> Hi,
>
> I am working on setting up a distributed monitoring system with Nagios 
> (actually Groundwork). I have 3 child servers and 1 parent server, using 
> NSCA to send passive check results from the children to the parent server.
>
> My question is about how Nagios (version 2.5) will behave when an on 
> demand host check needs to be run.
>
> So for example:
>
> Host A is configured with check_host_alive ( a simple ping ) as its host 
> check command on the parent server. It is also configured with Service 
> A, say an SNMP check. Active host checks are not disabled on the parent 
> server, but active service checks are.
>
> Host A, obviously, is also configured on the child server. When the 
> child server sends a passive check result up to the parent saying that 
> the SNMP check has failed, will the parent server then run the on-demand 
> host check command to verify that Host A is still up? If not, how do I 
> get that information up to the parent? Are passive host checks my only 
> option?
>
> So I guess the question is this: In a distributed monitoring setup, will 
> a parent server run an on-demand host check for a host that gets a 
> report (via a passive service check sent from a child server) of a 
> service being critical?
>
> Thanks,
>
> Tom

In an all-Nagios 3 setup (I don't believe the relevant check logic has 
changed since 2.x), with the master configured with 
execute_service_checks=1 and execute_host_checks=0, the master does 
indeed run the follow-up host check after receiving a passive service 
result, as this debug log excerpt shows:

[1217521713.140143] [016.1] [pid=16669] Handling check result for service 'NTP Time Server Ping' on host 'time.example.com'...
[1217521713.140183] [016.0] [pid=16669] ** Handling check result for service 'NTP Time Server Ping' on host 'time.example.com'...
[1217521713.140243] [016.1] [pid=16669] HOST: time.example.com, SERVICE: NTP Time Server Ping, CHECK TYPE: Passive, OPTIONS: 0, SCHEDULED: No, RESCHEDULE: No, EXITED OK: Yes, RETURN CODE: 2, OUTPUT: Testing
[1217521713.140538] [016.0] [pid=16669] ** Running async check of host 'time.example.com'...
[1217521713.140565] [001.0] [pid=16669] check_time_against_period()
[1217521713.140608] [016.0] [pid=16669] Checking host 'time.example.com'...
[1217521713.140649] [016.2] [pid=16669] Adjusting check attempt number for host 'time.example.com': current attempt=1/3, state=0, state type=1
[1217521713.141121] [2048.2] [pid=16669]   Uncleaned macro.  Running output (87): '/usr/local/libexec/nagios/check_ping -H time.example.com'
[1217521713.141135] [2048.2] [pid=16669]   Just finished macro.  Running output (87): '/usr/local/libexec/nagios/check_ping -H time.example.com'
[1217521713.141162] [2048.2] [pid=16669]   Not currently in macro.  Running output (126): '/usr/local/libexec/nagios/check_ping -H time.example.com -t 5 -w 3000.0,80% -c 5000.0,100% -p 5'
[1217521713.141179] [2048.1] [pid=16669]   Done.  Final output: '/usr/local/libexec/nagios/check_ping -H time.example.com -t 5 -w 3000.0,80% -c 5000.0,100% -p 5'
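
For anyone who wants to reproduce the test, here is a rough Python sketch 
of one way a passive result like the one in the log (return code 2, 
output "Testing") could be handed to the master through its external 
command file. PROCESS_SERVICE_CHECK_RESULT is the standard external 
command for this; the function names and the command file path are 
assumptions made for illustration, and this is not necessarily how the 
result in the log above was injected.

```python
import time


def passive_service_result(host, service, return_code, output, now=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line.

    Format: [timestamp] PROCESS_SERVICE_CHECK_RESULT;host;service;code;output
    """
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return code must be 0 (OK), 1 (WARNING), "
                         "2 (CRITICAL) or 3 (UNKNOWN)")
    timestamp = int(time.time() if now is None else now)
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        timestamp, host, service, return_code, output)


def submit(line, command_file="/usr/local/nagios/var/rw/nagios.cmd"):
    """Write the command line to the Nagios command file (a named pipe).

    The default path is an assumption; use the command_file value
    from your own nagios.cfg.
    """
    with open(command_file, "w") as fifo:
        fifo.write(line)


if __name__ == "__main__":
    # Only print the line here; call submit(line) on the master to send it.
    line = passive_service_result(
        "time.example.com", "NTP Time Server Ping", 2, "Testing")
    print(line, end="")
```

With the debug log enabled on the master, submitting a CRITICAL result 
this way should let you watch whether the host check follows, as in the 
excerpt above.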

Sean McAfee
System Engineer

Collaborative Fusion, Inc.
 smcafee at collaborativefusion.com
 412-422-3463 x 4025

5849 Forbes Avenue
Pittsburgh, PA 15217
