Check becomes unplanned

Bernd Arnold bernd_a at gmx.de
Sat Sep 13 16:41:42 CEST 2008


Thanks for all the comments. Yesterday I did some further 
testing and debugging, and the results were very interesting.

Forget the ntp story for a moment. The real problem is 
not the time shift, although it looks that way. 
The problem is this if-clause:

/* the service could not be rescheduled properly - set the next check time for next year, but don't actually reschedule it */
if(time_is_valid==FALSE && next_valid_time==preferred_time){

If a service is due to be checked and the current check time is outside the 
timeperiod, the service can become unplanned, i.e. planned in one year.

This can happen when the system time is set backwards, either by ntp
or manually. But it can also happen when the timeperiod is changed, the Nagios 
config is reloaded, and the next check time falls outside the new timeperiod.

Test 1:
Next service check at 08:24
Timeperiod changed to 08:43-18:00 (end time isn't important)
Reload Nagios config
At 08:24 the service is rescheduled to 08:43
This is okay.

Test 2:
Next service check at 08:39
Timeperiod changed to 08:40-18:00
Reload Nagios config
At 08:39 the service is set to next check in one year and should_be_scheduled=FALSE
This isn't okay.

So where is the difference?
The success of the reschedule depends on the time difference: (timeperiod_begin - next_check).

In the first test, the timeperiod begins 19 minutes after the check time. 
This is more than five minutes. In the second test,
only one minute is left until the timeperiod begins.

Let's have a look into checks.c (about line 280) again:

if(current_time>=preferred_time)
  preferred_time=current_time+((svc->check_interval<=0)?300:(svc->check_interval*interval_length));

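The preferred_time is set to the current time plus either the 300-second
fallback or check_interval*interval_length. In these tests that is 5 minutes
in the future: 08:29 (test 1) or 08:44 (test 2).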
Next code line:

get_next_valid_time(preferred_time,&next_valid_time,svc->check_period_ptr);

The next_valid_time is set to the first valid time at or after the preferred time.
Test 1: this is 08:43, the beginning of the timeperiod, since the preferred time is outside the timeperiod.
Test 2: this is 08:44, since the preferred time (now + 5min) is already inside the timeperiod and is accepted.

When the next if-check is true, the next check time is set to +1 year.

if(time_is_valid==FALSE && next_valid_time==preferred_time){

This is only true for test 2, where the get_next_valid_time(...) function 
returns our preferred time unchanged as the next_valid_time.
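
The two tests can be replayed with a small sketch. This is not the Nagios code: 
times are minutes since midnight, the timeperiod is reduced to "valid from 
period_start onward", the check is assumed to fire exactly at its scheduled time, 
and all function names are made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for get_next_valid_time():
 * returns the first valid minute at or after t, for a timeperiod
 * that is simplified to "valid from period_start onward". */
static int sketch_next_valid_time(int t, int period_start)
{
    return (t < period_start) ? period_start : t;
}

/* Replays the decision from the if-clause.
 * now          : minute of day at which the check fires
 * period_start : minute of day at which the timeperiod begins
 * interval_min : reschedule offset in minutes (5 in the tests)
 * Returns true if the check would be pushed one year ahead. */
static bool sketch_becomes_unplanned(int now, int period_start, int interval_min)
{
    bool time_is_valid = (now >= period_start);
    int preferred_time = now + interval_min;
    int next_valid_time = sketch_next_valid_time(preferred_time, period_start);

    /* same shape as: time_is_valid==FALSE && next_valid_time==preferred_time */
    return !time_is_valid && next_valid_time == preferred_time;
}
```

With these simplifications, sketch_becomes_unplanned(8*60+24, 8*60+43, 5) is false 
(test 1, rescheduled to 08:43) and sketch_becomes_unplanned(8*60+39, 8*60+40, 5) 
is true (test 2, pushed one year ahead).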

I've attached a patch file. The if-statement was removed and the next check 
time is always set to the next valid time (which is the preferred time 
whenever that lies inside the timeperiod).

Now both tests work fine, and the rescheduling copes with a) ntp time changes and b) timeperiod changes.

Test 1:
Next check at 11:05
Timeperiod changed beginning 11:07
Reload nagios config
At 11:05 the next check is set to 11:10 (current time + five minutes)

Test 2:
Next check at 11:10
Timeperiod changed beginning 11:37
Reload nagios config
At 11:10 the next check is set to 11:37
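
The results of these two tests are consistent with a rescheduling rule that 
always takes the first valid time at or after the preferred time. A minimal sketch 
of that rule, under the same simplifications as before (minutes since midnight, 
timeperiod valid from period_start onward, hypothetical function names; this is 
an assumption about the behaviour, not the content of the patch file):

```c
/* Hypothetical stand-in for get_next_valid_time():
 * first valid minute at or after t. */
static int patched_next_valid_time(int t, int period_start)
{
    return (t < period_start) ? period_start : t;
}

/* Next check time without the one-year branch:
 * the result of the next-valid-time lookup is used unconditionally. */
static int patched_next_check(int now, int period_start, int interval_min)
{
    int preferred_time = now + interval_min;
    return patched_next_valid_time(preferred_time, period_start);
}
```

Here patched_next_check(11*60+5, 11*60+7, 5) gives 11:10 and 
patched_next_check(11*60+10, 11*60+37, 5) gives 11:37, matching the two tests.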

I don't know whether my patch has negative side effects elsewhere.
Maybe someone could check this? Does anyone know the background of the if-clause?

The patch covers only service checks so far! The same change could be applied 
to host checks (line 2795 in checks.c) if my patch doesn't have a bad impact.

Comments are welcome.

Regards
Bernd

-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch_unplanned-check.diff
Type: application/octet-stream
Size: 1578 bytes
Desc: not available
URL: <https://www.monitoring-lists.org/archive/developers/attachments/20080913/bb69d59d/attachment.obj>