Reply: Re: Reply: Re: Check becomes unplanned

Sascha.Runschke at gfkl.com Sascha.Runschke at gfkl.com
Fri Sep 12 09:56:32 CEST 2008


nagios-devel-bounces at lists.sourceforge.net wrote on 11.09.2008 20:53:18:

> So although this is a bug, I wouldn't expect Nagios to work very well
> with time shifts and there are many other applications/daemons out there
> that could fail as well. You shouldn't have to constantly re-adjust the
> clock - if you do then there might be a bigger problem...

Of course it's a problem rooted somewhere deeper in the environment,
yet I don't think programs should suffer from it.

> Have you tried running an ntp daemon? It should try keeping the clock
> in sync without causing time shifts (as Andreas explained, it slows down
> or speeds up the clock instead). What OS is it on? Under a VM? Custom
> compiled kernel? There might be something broken or unsupported in your
> setup that causes this. If you don't fix it you shouldn't expect your
> system to work very well anyway.

Running ntpd won't help at all. The time shifts happen because the
system clock is running too fast. It doesn't really matter whether you
adjust it back in little ticks or in one big bang once a day.
You can easily reproduce this by running an older 2.6 kernel
inside a VM with the clocksource=pit kernel parameter. The system clock
will be running 1 jitter too fast, resulting in a 1 sec backward shift
roughly every 5 minutes - either from ntpd or from the vmware-tools
timesync.
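
To put a number on that drift, here is a rough back-of-the-envelope
sketch. The 1 second per 5 minutes figure is the one from this mail; the
500 ppm limit is an assumption about the reference ntpd's maximum slew
rate and may differ for other time sync tools. The function names are
made up for illustration.

```python
# Figures taken from the post: the clock gains about 1 second every 5 minutes.
DRIFT_SECONDS = 1.0
INTERVAL_SECONDS = 5 * 60

# Assumption: 500 ppm, the maximum slew rate of the reference ntpd.
NTPD_MAX_SLEW_PPM = 500.0


def drift_ppm(drift_seconds, interval_seconds):
    """Clock error expressed in parts per million."""
    return drift_seconds * 1_000_000 / interval_seconds


def drift_per_day(drift_seconds, interval_seconds):
    """Seconds gained over 24 hours at a constant drift rate."""
    return drift_seconds * 86_400 / interval_seconds


ppm = drift_ppm(DRIFT_SECONDS, INTERVAL_SECONDS)
per_day = drift_per_day(DRIFT_SECONDS, INTERVAL_SECONDS)
slewable = ppm <= NTPD_MAX_SLEW_PPM

print(f"drift: {ppm:.0f} ppm, {per_day:.0f} s/day, slewable: {slewable}")
```

Under these assumptions the drift is several times what gradual slewing
can absorb, so the correction would arrive as a backward step no matter
which tool applies it.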

The only workaround I can think of is running

    /etc/init.d/nagios stop
    ntpdate
    /etc/init.d/nagios start

once a day - which is quite ugly though.

> One thing that might help is setting this to 1 in your nagios.cfg. This
> will make Nagios periodically check for services that aren't scheduled:
> 
> check_for_orphaned_services=1

Well, the problem is that the service IS scheduled - just 1 year ahead.

> Besides, to fix this I would make the sanity check reschedule the check
> for the end of the check period (possibly only if it's not scheduled
> already), as it wouldn't have to check for this condition for every
> service on every time shift.

Why for the end? You'd still lose nearly all checks then. Nagios runs the
check once and then sleeps, because the check_period is no longer valid.
Then the time shifts again - rinse and repeat. You'll end up with one
check each day for such services.
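
A toy model of that argument, as a rough sketch only. This is not Nagios
code: the check period (08:00-18:00), the 5 minute check interval, the
minute-granularity clock and all function names are assumptions made up
for illustration. It compares rescheduling to the end of the check
period on every time shift against a hypothetical alternative that keeps
the earliest still-valid time.

```python
DAY = 24 * 60            # minutes per day
PERIOD_START = 8 * 60    # hypothetical check_period: 08:00 ...
PERIOD_END = 18 * 60     # ... to 18:00, inclusive
CHECK_INTERVAL = 5       # hypothetical check interval in minutes
SHIFT_INTERVAL = 5       # a clock correction roughly every 5 minutes
SHIFT_OFFSET = 2         # corrections land between two checks


def next_valid(t):
    """Return the earliest minute >= t that lies inside the check period."""
    day_start = t - t % DAY
    minute = t % DAY
    if minute < PERIOD_START:
        return day_start + PERIOD_START
    if minute <= PERIOD_END:
        return t
    return day_start + DAY + PERIOD_START


def end_of_period(t):
    """Return the end of the current or next check period at or after t."""
    day_start = t - t % DAY
    if t % DAY <= PERIOD_END:
        return day_start + PERIOD_END
    return day_start + DAY + PERIOD_END


def checks_per_day(policy, days=3):
    """Count executed checks per day under the given rescheduling policy."""
    next_check = next_valid(0)
    runs = [0] * days
    for now in range(days * DAY):
        if now % SHIFT_INTERVAL == SHIFT_OFFSET:
            # Time shift detected: the sanity check reschedules the service.
            if policy == "end_of_period":
                next_check = end_of_period(now)
            else:  # "next_valid": keep the earliest still-valid time
                next_check = next_valid(max(now, next_check))
        if now == next_check:
            runs[now // DAY] += 1
            next_check = next_valid(now + CHECK_INTERVAL)
    return runs


print("end_of_period:", checks_per_day("end_of_period"))
print("next_valid:   ", checks_per_day("next_valid"))
```

In this model the end-of-period policy leaves a single check per day,
because every shift pushes the pending check back to the end of the
period before it can run, while the alternative keeps the regular
5 minute cadence inside the period.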

Regards
        Sascha

-- 
Sascha Runschke
IT Infrastructure

GFKL Financial Services AG
Limbecker Platz 1
45127 Essen

Phone  : +49 (201) 102-1879
Mobile : +49 (173) 5419665
Fax    : +49 (201) 102-1102105
