Strange service scheduling

Ricardo Maraschini ricardo.maraschini at opservices.com.br
Thu Apr 2 14:33:56 CEST 2009


Hi,

----- "Hendrik Baecker" <andurin at process-zero.de> wrote:
> Are you sure that the problem hits us when the calculation is
> 'inside' a timeperiod?

If I pass a valid timestamp to get_next_valid_time, it returns that same timestamp.
Take a look at base/checks.c, line 280:

get_next_valid_time(preferred_time,&next_valid_time,svc->check_period_ptr);

Suppose preferred_time is a valid timestamp within svc->check_period_ptr.
What should this call return?

I assume the result is: next_valid_time = preferred_time.
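
A minimal sketch of that assumption, using a toy model rather than the real Nagios code: the type toy_period and the function toy_next_valid_time are hypothetical names, the timeperiod is reduced to a single daily window, and times are minutes since midnight (0 to 1439). It only illustrates the claim that a time already inside the period comes back unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a timeperiod: one daily window [start, end),
 * in minutes since midnight. This is NOT the Nagios timeperiod struct. */
typedef struct {
    int start_min;
    int end_min;
} toy_period;

/* Toy analogue of get_next_valid_time(): returns the earliest minute at or
 * after pref_min that lies inside the window. pref_min is assumed to be a
 * time of day in the range 0..1439. */
int toy_next_valid_time(int pref_min, const toy_period *p)
{
    assert(p != NULL && p->start_min < p->end_min);

    if (pref_min < p->start_min)
        return p->start_min;        /* before the window: wait until it opens */
    if (pref_min < p->end_min)
        return pref_min;            /* already valid: returned unchanged */
    return p->start_min + 24 * 60;  /* after the window: next day's opening */
}
```

With a window of 11:50-23:00, passing 11:53 to this model gives back 11:53, while passing 11:48 gives 11:50.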

A few lines further down we find:

if (time_is_valid==FALSE && next_valid_time==preferred_time) {
     //schedule this check for next year
}

time_is_valid can become FALSE after a timeperiod change.
Let's walk through an example:

Service X uses timeperiod 24x7 and a check_interval of 5 minutes.
Nagios is running and the next check for X is scheduled for 11:48.
Then somebody changes the timeperiod of service X to 11:50-23:00.

When the next check runs, time_is_valid becomes FALSE because 11:48 is outside the new timeperiod.
preferred_time is calculated as 11:53, and we call get_next_valid_time with preferred_time, which returns 11:53 unchanged because 11:53 is inside the new timeperiod.

Both parts of the condition below are now true, so the check is pushed out to next year:
if (time_is_valid==FALSE && next_valid_time==preferred_time) {
     //schedule this check for next year
}

I propose using

get_next_valid_time(current_time,&next_valid_time,svc->check_period_ptr);

instead of

get_next_valid_time(preferred_time,&next_valid_time,svc->check_period_ptr);
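
To see why the starting point matters, here is a rough sketch of the 11:48 example under a simplified model. The helpers next_valid_min, is_inside and hits_far_future_branch are hypothetical names, not Nagios functions; the timeperiod is a single daily window and times are minutes since midnight. Only the condition itself is taken from the snippet quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: earliest minute at or after from_min that lies inside
 * the daily window [start_min, end_min). from_min is a time of day, 0..1439. */
int next_valid_min(int from_min, int start_min, int end_min)
{
    assert(start_min < end_min);
    if (from_min < start_min)
        return start_min;           /* window not open yet */
    if (from_min < end_min)
        return from_min;            /* already inside: unchanged */
    return start_min + 24 * 60;     /* window closed: next day's opening */
}

/* Hypothetical helper: is minute t inside the window? */
bool is_inside(int t, int start_min, int end_min)
{
    return t >= start_min && t < end_min;
}

/* Mirrors the quoted condition:
 * time_is_valid==FALSE && next_valid_time==preferred_time */
bool hits_far_future_branch(bool time_is_valid, int next_valid, int preferred)
{
    return !time_is_valid && next_valid == preferred;
}
```

In this model, with current time 11:48, preferred time 11:53 and a window of 11:50-23:00, starting the search from the preferred time yields 11:53 and the condition is true. Starting from the current time yields 11:50, which differs from 11:53, so the condition is false and the check is not pushed out.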

I will prepare this patch against the development release and send it to the list again.

-rm
