freshness check bug?

Bryan Loniewski brylon at jla.rutgers.edu
Wed May 11 19:17:49 CEST 2005


Regardless of which freshness_threshold I pick (as long as it's not too unrealistic),
I just want to confirm whether a bug exists. (By the way, where do you see that the default
freshness threshold is 300 sec?) Anyway, I just increased the threshold to 180
seconds, and the only thing in my nagios.log was:

[1115831032] Finished daemonizing... (New PID=16154)
[1115831272] Warning: The results of service 'PROCS-NAGIOS' on host 'csstest2' are stale
by 60 seconds (threshold=180 seconds).  I'm forcing an immediate check of the service.

So it did not even execute my eventhandler once? I'm getting very inconsistent results!
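
One way to sanity-check the timing in log excerpts like the one above is to convert the bracketed epoch timestamps into gaps between consecutive entries. The following is a minimal Python sketch, not part of Nagios; parse_entries and gaps are hypothetical helper names, and the sample text is copied from the excerpt above.

```python
import re

# Nagios log entries start with "[<epoch seconds>] "; wrapped continuation
# lines (such as "by 60 seconds ...") have no such prefix and are skipped.
LOG_LINE = re.compile(r"^\[(\d+)\]\s+(.*)$")


def parse_entries(log_text):
    """Return (epoch_seconds, message) pairs for lines that begin with [epoch]."""
    entries = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match:
            entries.append((int(match.group(1)), match.group(2)))
    return entries


def gaps(entries):
    """Return the number of seconds between each pair of consecutive entries."""
    return [later[0] - earlier[0] for earlier, later in zip(entries, entries[1:])]


SAMPLE = """\
[1115831032] Finished daemonizing... (New PID=16154)
[1115831272] Warning: The results of service 'PROCS-NAGIOS' on host 'csstest2' are stale
by 60 seconds (threshold=180 seconds).  I'm forcing an immediate check of the service.
"""

if __name__ == "__main__":
    print(gaps(parse_entries(SAMPLE)))
```

For this excerpt the single gap is 240 seconds, which equals the 180-second threshold plus the 60 seconds of staleness reported in the warning.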

NRPE and check_by_ssh are not acceptable methods for distributed monitoring in our
environment.

Thanks for the comments... Justin

_________________________
Bryan Loniewski
Rutgers University
NBCS - Systems Programmer

On Wed, 11 May 2005, admin at jpk236.com wrote:

> Bryan,
>
> A freshness_threshold of 60 seconds might be a little unrealistic. The default
> value for the threshold is 300 seconds (5 minutes).
>
> If you want almost real-time stats, which appears to be what you're going for,
> perhaps you want to try NRPE or check_by_ssh as an alternative method of doing
> distributed monitoring.
>
> - Justin Kulikowski
> 	[ http://www.jpk236.com ]
>
> Bryan Loniewski wrote:
>> While trying to set up failover in a distributed environment, I came across
>> the following problem (bug?) involving freshness checking.
>> 
>> Note: The host that this is set up on is NOT receiving any passive checks
>> while I am testing the freshness checking, so the results are always stale,
>> forcing the freshness check every time.
>> 
>> Note2: Relevant config snippets are under my .sig
>> 
>> Configuring (passive) service freshness checking to execute an eventhandler
>> works correctly for 1 or 2 iterations, BUT no more than that. It seems to
>> stop checking the freshness after at most 3 iterations and stops executing
>> the eventhandler after at most 2 iterations. I've replicated this behavior
>> (too) many times and the results are inconsistent.
>> 
>> Below is the output of my nagios log:
>> 
>> <snip nagios.log>
>> [1115822708] Finished daemonizing... (New PID=15941)
>> [1115822828] Warning: The results of service 'PROCS-NAGIOS' on host 'csstest2' are stale by 60 seconds (threshold=60 seconds).  I'm forcing an immediate check of the service.
>> [1115822838] SERVICE ALERT: csstest2;PROCS-NAGIOS;CRITICAL;SOFT;1;CRITICAL
>> [1115822838] SERVICE EVENT HANDLER: csstest2;PROCS-NAGIOS;CRITICAL;SOFT;1;slave-failover
>> [1115822948] Warning: The results of service 'PROCS-NAGIOS' on host 'csstest2' are stale by 60 seconds (threshold=60 seconds).  I'm forcing an immediate check of the service.
>> 
>> Notice the freshness check ran ONLY 2 times when it should have run 5 (if
>> you look at my config options below) and the eventhandler ran ONLY 1 time,
>> when it should have run 3 times.
>> 
>> Can anyone verify (disprove) this behavior? Am I missing something?
>> 
>> _________________________
>> Bryan Loniewski
>> Rutgers University
>> NBCS - Systems Programmer
>> 
>> <snip nagios.cfg>
>> check_service_freshness=1
>> service_freshness_check_interval=60
>> <snip>
>> 
>> <snip objects.cfg>
>> define service{
>>          name                            generic-service
>>          parallelize_check               1
>>          obsess_over_service             1
>>          check_freshness                 0
>>          freshness_threshold             60
>>          notifications_enabled           1
>>          event_handler_enabled           1
>>          flap_detection_enabled          1
>>          failure_prediction_enabled      1
>>          process_perf_data               1
>>          retain_status_information       1
>>          retain_nonstatus_information    1
>>          is_volatile                     0
>>          max_check_attempts              5
>>          normal_check_interval           2
>>          retry_check_interval            1
>>          check_period                    24x7
>>          contact_groups                  super-admins
>>          notification_interval           3
>>          notification_period             24x7
>>          register                        0
>> }
>> define service{
>>          use                             generic-service
>>          name                            generic-passive-service
>>          active_checks_enabled           0
>>          passive_checks_enabled          1
>>          register                        0
>> }
>> define service{
>>          use                             generic-passive-service
>>          host_name                       csstest2
>>          service_description             PROCS-NAGIOS
>>          check_freshness                 1
>>          freshness_threshold             60
>>          check_command                   check_dummy!2
>>          event_handler                   slave-failover
>> }
>> define command{
>>         command_name    check_dummy
>>         command_line    $USER1$/check_dummy $ARG1$
>> }
>> define command{
>>         command_name    slave-failover
>>         command_line    $USER2$/failover $SERVICESTATE$ $SERVICESTATETYPE$
>> }
>> <snip>
>> 
>> 
>> _______________________________________________
>> Nagios-devel mailing list
>> Nagios-devel at lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/nagios-devel
>

