Question about host checks

Simone Felici s.felici at alpikom.it
Thu Nov 27 09:36:10 CET 2008


Marc Powell wrote:
> On Nov 26, 2008, at 1:54 AM, Simone Felici wrote:
> 
>>
>> Please help, does no one have an idea?
>> Ah, my Nagios version is 3.0.3.
>> Also, why are all my hosts checked roughly every 4 seconds? :(
>> Thanks!
> 
> 
> The information provided so far indicates that hosts will only be  
> checked on demand, as you expect. That means, probably, either the  
> information is incorrect or they're being checked on demand. Since the  
> normal interval for actions in nagios is measured in minutes, I'd lean  
> toward the on-demand side of things.
> 
> Can you post the host definition from status.dat?
> Can you post relevant log entries for the host and any services on  
> that host near the host checks? You may need to increase your logging  
> options in nagios.cfg.
> Is the 4 second number in any way significant to your installation? Is  
> your time_interval less than 60?
> Debug mode is available to you to figure out what's going on. This is  
> almost certainly going to be your best source for resolution.
> 

Good morning.
By time_interval do you mean interval_length?
I've set it to "1", so all check intervals are expressed in seconds, because certain services need a retry interval of 30 seconds.
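
For reference, a minimal sketch of how interval_length scales the intervals in an object definition. It assumes, as the Nagios documentation describes, that check_interval and retry_interval are given in "time units" of interval_length seconds (60 by default). The function name effective_seconds is made up for illustration and is not part of Nagios.

```python
def effective_seconds(interval_units, interval_length=60):
    """Convert a Nagios interval given in time units into seconds."""
    return interval_units * interval_length


# Stock setting: interval_length=60, so check_interval=5 means 5 minutes.
print(effective_seconds(5))      # 300
# interval_length=1 turns every interval into plain seconds.
print(effective_seconds(5, 1))   # 5
print(effective_seconds(1, 1))   # 1
```

If that assumption holds, a host that still carries check_interval 5 is scheduled every 5 seconds rather than every 5 minutes, and a 5-minute host check would need check_interval 300.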

Here is some additional information:

########################################
#          NAGIOS STATUS FILE
#
# THIS FILE IS AUTOMATICALLY GENERATED
# BY NAGIOS.  DO NOT MODIFY THIS FILE!
########################################

<<cut>>

hoststatus {
         host_name=<<MY-WINDOWS-EXAMPLE-SERVER>>
         modified_attributes=3
         check_command=check-host-alive
         check_period=24hx7
         notification_period=24hx7
         check_interval=5.000000
         retry_interval=1.000000
         event_handler=
         has_been_checked=1
         should_be_scheduled=1
         check_execution_time=0.016
         check_latency=12.595
         check_type=0
         current_state=0
         last_hard_state=0
         last_event_id=116739
         current_event_id=116740
         current_problem_id=0
         last_problem_id=51304
         plugin_output=PING OK - Packet loss = 0%, RTA = 0.65 ms
         long_plugin_output=
         performance_data=
         last_check=1227773285
         next_check=1227773291
         check_options=0
         current_attempt=1
         max_attempts=2
         current_event_id=116740
         last_event_id=116739
         state_type=1
         last_state_change=1227714561
         last_hard_state_change=1227714561
         last_time_up=1227773286
         last_time_down=1227714537
         last_time_unreachable=1215613491
         last_notification=0
         next_notification=0
         no_more_notifications=0
         current_notification_number=0
         current_notification_id=46804
         notifications_enabled=1
         problem_has_been_acknowledged=0
         acknowledgement_type=0
         active_checks_enabled=1
         passive_checks_enabled=0
         event_handler_enabled=0
         flap_detection_enabled=0
         failure_prediction_enabled=1
         process_performance_data=0
         obsess_over_host=0
         last_update=1227773300
         is_flapping=0
         percent_state_change=0.00
         scheduled_downtime_depth=0
         }

<<cut>>
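
A quick way to read the scheduling gap out of a block like the one above is sketched here. It is only an illustration: parse_status_block is a hypothetical helper, and the sample contains just four of the fields copied from the hoststatus block.

```python
def parse_status_block(text):
    """Parse the key=value lines of one status.dat block into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" in line and not line.startswith("#"):
            # Split on the first "=" only; values such as plugin_output
            # may themselves contain "=".
            key, _, value = line.partition("=")
            fields[key] = value
    return fields


sample = """hoststatus {
        check_interval=5.000000
        retry_interval=1.000000
        last_check=1227773285
        next_check=1227773291
        }"""

host = parse_status_block(sample)
gap = int(host["next_check"]) - int(host["last_check"])
print(gap)  # 6
```

The gap of 6 seconds between last_check and next_check is in the same range as the check_interval of 5.000000 shown in the block.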

Then I enabled debugging (debug level 24). Below is the result; I've pasted only the parts that mention the example host.
I've written "(..CUT..)" wherever I skipped megabytes of unimportant lines referring to other services/hosts (which of course have the same problem).
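
As a side note on the debug level: the value is a sum of flags, and the sketch below decodes it. The flag table is reproduced from memory of the comments in the Nagios 3 sample nagios.cfg, so treat it as an assumption and check it against your own file; decode_debug_level is a hypothetical helper.

```python
# Flag values believed to match the debug_level comments in the Nagios 3
# sample nagios.cfg (assumption: verify against your own copy).
DEBUG_FLAGS = {
    1: "functions",
    2: "configuration",
    4: "process information",
    8: "scheduled events",
    16: "host/service checks",
    32: "notifications",
    64: "event broker",
    128: "external commands",
    256: "commands",
    512: "scheduled downtime",
    1024: "comments",
    2048: "macros",
}


def decode_debug_level(level):
    """Return the names of all debug categories enabled by a level."""
    return [name for bit, name in DEBUG_FLAGS.items() if level & bit]


print(decode_debug_level(24))
```

Level 24 is 8 + 16, which matches the [008.x] and [016.x] tags on the log lines that follow.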


(..CUT..)
[1227774166.211254] [008.0] [pid=25062] ** Timed Event ** Type: 12, Run Time: Thu Nov 27 09:22:38 2008
[1227774166.211266] [008.0] [pid=25062] ** Host Check Event ==> Host: '<<MY-WINDOWS-EXAMPLE-SERVER>>', Options: 0, Latency: 8.211000 sec
[1227774166.211283] [016.0] [pid=25062] Attempting to run scheduled check of host '<<MY-WINDOWS-EXAMPLE-SERVER>>': check options=0, latency=8.211000
[1227774166.211296] [016.0] [pid=25062] ** Running async check of host '<<MY-WINDOWS-EXAMPLE-SERVER>>'...
[1227774166.211325] [016.0] [pid=25062] Checking host '<<MY-WINDOWS-EXAMPLE-SERVER>>'...
[1227774166.211434] [016.1] [pid=25062] Check result output will be written to '/tmp/checkNmYAPl' (fd=7)
[1227774166.229697] [008.1] [pid=25062] ** Event Check Loop
[1227774166.229755] [008.1] [pid=25062] Next High Priority Event Time: Thu Nov 27 09:22:47 2008
[1227774166.229771] [008.1] [pid=25062] Next Low Priority Event Time:  Thu Nov 27 09:22:38 2008
[1227774166.229781] [008.1] [pid=25062] Current/Max Service Checks: 0/80
[1227774166.229794] [008.1] [pid=25062] Running event...
[1227774166.229808] [008.0] [pid=25062] ** Timed Event ** Type: 12, Run Time: Thu Nov 27 09:22:38 2008
(..CUT..)
[1227774186.288210] [016.1] [pid=6021] Checking host '<<MY-WINDOWS-EXAMPLE-SERVER>>' for flapping...
(..CUT..)
[1227774186.433737] [016.1] [pid=6021] Checking service 'DISK-SPACE' on host '<<MY-WINDOWS-EXAMPLE-SERVER>>' for flapping...
[1227774186.433749] [016.1] [pid=6021] Service is not flapping (0.00% state change).
[1227774186.433894] [016.1] [pid=6021] Checking service 'IMAP' on host '<<MY-WINDOWS-EXAMPLE-SERVER>>' for flapping...
[1227774186.433905] [016.1] [pid=6021] Service is not flapping (0.00% state change).
[1227774186.434036] [016.1] [pid=6021] Checking service 'MEMORY' on host '<<MY-WINDOWS-EXAMPLE-SERVER>>' for flapping...
[1227774186.434047] [016.1] [pid=6021] Service is not flapping (0.00% state change).
[1227774186.434179] [016.1] [pid=6021] Checking service 'PING' on host '<<MY-WINDOWS-EXAMPLE-SERVER>>' for flapping...
[1227774186.434190] [016.1] [pid=6021] Service is not flapping (0.00% state change).
[1227774186.434323] [016.1] [pid=6021] Checking service 'POP3' on host '<<MY-WINDOWS-EXAMPLE-SERVER>>' for flapping...
[1227774186.434334] [016.1] [pid=6021] Service is not flapping (0.00% state change).
[1227774186.434464] [016.1] [pid=6021] Checking service 'SMTP' on host '<<MY-WINDOWS-EXAMPLE-SERVER>>' for flapping...
[1227774186.434475] [016.1] [pid=6021] Service is not flapping (0.00% state change).
(..CUT..)
[1227774190.773319] [016.1] [pid=6021] Handling check result for host '<<MY-WINDOWS-EXAMPLE-SERVER>>'...
[1227774190.773330] [016.1] [pid=6021] ** Handling async check result for host '<<MY-WINDOWS-EXAMPLE-SERVER>>'...
[1227774190.773345] [016.1] [pid=6021] HOST: <<MY-WINDOWS-EXAMPLE-SERVER>>, ATTEMPT=1/2, CHECK TYPE=ACTIVE, STATE TYPE=HARD, OLD STATE=0, NEW STATE=0
[1227774190.773372] [016.1] [pid=6021] Host was UP.
[1227774190.773382] [016.1] [pid=6021] Host is still UP.
[1227774190.773392] [016.1] [pid=6021] Pre-handle_host_state() Host: <<MY-WINDOWS-EXAMPLE-SERVER>>, Attempt=1/2, Type=HARD, Final State=0
[1227774190.773414] [016.1] [pid=6021] Post-handle_host_state() Host: <<MY-WINDOWS-EXAMPLE-SERVER>>, Attempt=1/2, Type=HARD, Final State=0
[1227774190.773430] [016.1] [pid=6021] Checking host '<<MY-WINDOWS-EXAMPLE-SERVER>>' for flapping...
[1227774190.773452] [016.1] [pid=6021] Rescheduling next check of host at Thu Nov 27 09:23:15 2008
[1227774190.773481] [016.0] [pid=6021] Scheduling a non-forced, active check of host '<<MY-WINDOWS-EXAMPLE-SERVER>>' @ Thu Nov 27 09:23:15 2008
[1227774190.773500] [016.1] [pid=6021] ** Async check result for host '<<MY-WINDOWS-EXAMPLE-SERVER>>' handled: new state=0
[1227774190.773524] [016.1] [pid=6021] Deleted check result file '/usr/local/nagios/var/spool/checkresults/ctdW9xA'
(..CUT..)
[1227774197.619431] [008.0] [pid=6021] ** Timed Event ** Type: 12, Run Time: Thu Nov 27 09:23:08 2008
[1227774197.619443] [008.0] [pid=6021] ** Host Check Event ==> Host: '<<MY-WINDOWS-EXAMPLE-SERVER>>', Options: 0, Latency: 9.619000 sec
[1227774197.619460] [016.0] [pid=6021] Attempting to run scheduled check of host '<<MY-WINDOWS-EXAMPLE-SERVER>>': check options=0, latency=9.619000
[1227774197.619473] [016.0] [pid=6021] ** Running async check of host '<<MY-WINDOWS-EXAMPLE-SERVER>>'...
[1227774197.619504] [016.0] [pid=6021] Checking host '<<MY-WINDOWS-EXAMPLE-SERVER>>'...
[1227774197.619601] [016.1] [pid=6021] Check result output will be written to '/tmp/checkICrgKX' (fd=7)
[1227774197.636440] [008.1] [pid=6021] ** Event Check Loop
[1227774197.636524] [008.1] [pid=6021] Next High Priority Event Time: Thu Nov 27 09:23:18 2008
[1227774197.636544] [008.1] [pid=6021] Next Low Priority Event Time:  Thu Nov 27 09:23:08 2008
[1227774197.636558] [008.1] [pid=6021] Current/Max Service Checks: 0/80
[1227774197.636573] [008.1] [pid=6021] Running event...
(..CUT..)
[1227774198.041822] [016.1] [pid=6021] Handling check result for host '<<MY-WINDOWS-EXAMPLE-SERVER>>'...
[1227774198.041833] [016.1] [pid=6021] ** Handling async check result for host '<<MY-WINDOWS-EXAMPLE-SERVER>>'...
[1227774198.041851] [016.1] [pid=6021] HOST: <<MY-WINDOWS-EXAMPLE-SERVER>>, ATTEMPT=1/2, CHECK TYPE=ACTIVE, STATE TYPE=HARD, OLD STATE=0, NEW STATE=0
[1227774198.041863] [016.1] [pid=6021] Host was UP.
[1227774198.041872] [016.1] [pid=6021] Host is still UP.
[1227774198.041881] [016.1] [pid=6021] Pre-handle_host_state() Host: <<MY-WINDOWS-EXAMPLE-SERVER>>, Attempt=1/2, Type=HARD, Final State=0
[1227774198.041893] [016.1] [pid=6021] Post-handle_host_state() Host: <<MY-WINDOWS-EXAMPLE-SERVER>>, Attempt=1/2, Type=HARD, Final State=0
[1227774198.041904] [016.1] [pid=6021] Checking host '<<MY-WINDOWS-EXAMPLE-SERVER>>' for flapping...
[1227774198.041933] [016.1] [pid=6021] Rescheduling next check of host at Thu Nov 27 09:23:23 2008
[1227774198.041955] [016.0] [pid=6021] Scheduling a non-forced, active check of host '<<MY-WINDOWS-EXAMPLE-SERVER>>' @ Thu Nov 27 09:23:23 2008
[1227774198.042001] [016.1] [pid=6021] ** Async check result for host '<<MY-WINDOWS-EXAMPLE-SERVER>>' handled: new state=0
[1227774198.042025] [016.1] [pid=6021] Deleted check result file '/usr/local/nagios/var/spool/checkresults/cb
(..CUT..)
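
To see how far ahead each check is being rescheduled, the "Rescheduling next check" lines can be compared with their own log timestamps. This is a rough sketch: reschedule_delays is a hypothetical helper, the two sample lines are copied from the log above, and the fixed UTC+1 offset assumes the human-readable times are CET, as in the message header.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: the human-readable times in the debug log are CET (UTC+1).
CET = timezone(timedelta(hours=1))
PATTERN = re.compile(
    r"^\[(\d+\.\d+)\].*Rescheduling next check of host at (.+)$"
)


def reschedule_delays(lines):
    """Seconds between each log entry and the next check it schedules."""
    delays = []
    for line in lines:
        match = PATTERN.match(line.strip())
        if not match:
            continue
        logged_at = float(match.group(1))
        next_check = datetime.strptime(match.group(2), "%a %b %d %H:%M:%S %Y")
        delays.append(next_check.replace(tzinfo=CET).timestamp() - logged_at)
    return delays


sample = [
    "[1227774190.773452] [016.1] [pid=6021] Rescheduling next check of host at Thu Nov 27 09:23:15 2008",
    "[1227774198.041933] [016.1] [pid=6021] Rescheduling next check of host at Thu Nov 27 09:23:23 2008",
]
for delay in reschedule_delays(sample):
    print(f"next check in {delay:.1f} s")
```

Both delays come out at roughly 4 to 5 seconds, which is in line with the "every 4 seconds" observation.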

Is this enough?
Can anyone help?

Thanks!

Simon

