Problems with service scheduling

Marcus Fleige mfleige at rhenus.de
Tue Aug 29 10:42:06 CEST 2006


Hi list,

I recently ran into problems with the service scheduling in my 
Nagios installation and I think I need some help. I'm running Nagios 2.5 
with roughly 300 hosts and 2800 services on a 4-CPU Xeon machine with 
2 GB RAM. Services are actively monitored.

The situation: I changed the configuration last week to better fit our 
needs. That included a lot of renaming of services, contacts, contact 
groups and escalations. After I finished, I restarted the Nagios daemon 
yesterday morning at about 9am. Result: the process doesn't start 
monitoring. I looked at the scheduling queue, and it showed that 
monitoring would start at 5pm.

Over the day, I tried to analyse the problem. I reviewed the config, 
although Nagios verifies it as good, and found nothing. Restarting the 
process has no effect; the scheduling queue doesn't change.

I tried the old config (praise svn!), and the process starts as 
usual, generating a new scheduling queue and beginning the monitoring.

As the only file influencing the scheduling queue is the main config 
file, I copied it again from the old to the new config, although I had 
not changed it. That had no effect, which at least suggests that the 
error lives in the host/service/escalation area of my config.

When I restarted the Nagios daemon at 4:50pm and waited until 5pm, it 
began to monitor as expected. That worked until this morning, when 
at around 9am (again!) the scheduling queue showed the 5pm start time 
again.

I recompiled Nagios with DEBUG1-3 to get some more information. After 
validating the config, it shows the following:

[...]
	Completed service verification checks
         Completed host verification checks
         Completed hostgroup verification checks
         Completed servicegroup verification checks
         Completed contact verification checks
         Completed contact group verification checks
         Completed service escalation checks
         Completed service dependency checks
         Completed host escalation checks
         Completed host dependency checks
         Completed command checks
         Completed command checks
         Completed extended host info checks
         Completed extended service info checks
         Completed circular path checks
         Completed circular host and service dependency checks
         Completed global event handler command checks
         Completed obsessive compulsive processor command checks
$0: Cannot enter daemon mode with DEBUG option(s) enabled.  We'll run as a foreground process instead...
COMMAND FILE THREAD: 1077427120
         Preferred Time: 1156839911 --> Tue Aug 29 10:25:11 2006
         Next Valid Time: 1156863600 --> Tue Aug 29 17:00:00 2006
         Preferred Time: 1156839911 --> Tue Aug 29 10:25:11 2006
         Next Valid Time: 1156863600 --> Tue Aug 29 17:00:00 2006
[...]
	Host 'AP001' should not be scheduled
	Host 'AP002' should not be scheduled
	Host 'AP003' should not be scheduled
	Host 'AP004' should not be scheduled
	Host 'AP005' should not be scheduled
[...]
Total scheduled services: 2837
Service Interleave factor: 1
Total service interleave blocks: 2837
Service inter-check delay: 1.0
         Current Interleave Block: 0
                 Service 'Network: Ping' on host 'AP001'
                 CIB: 0, IBI: 1, TIB: 2837, SIF: 1
                 Mult factor: 2837
                 Preferred Check Time: 1156842748 --> Tue Aug 29 11:12:28 2006
         Preferred Time: 1156842748 --> Tue Aug 29 11:12:28 2006
         Next Valid Time: 1156863600 --> Tue Aug 29 17:00:00 2006
         Actual Check Time: 1156863600 --> Tue Aug 29 17:00:00 2006
[...]
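
One way to cross-check the timestamps in the debug output above is to convert the epoch values and compute the gaps between them. The following is a minimal Python sketch and not part of Nagios; the fixed UTC+2 offset (CEST) and the helper names `to_local` and `delay_until_valid` are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the log was written in CEST (UTC+2), matching the mail date.
CEST = timezone(timedelta(hours=2), "CEST")

# Epoch values copied from the debug output.
HOST_PREFERRED = 1156839911     # first "Preferred Time" line
SERVICE_PREFERRED = 1156842748  # "Preferred Check Time" of 'Network: Ping'
NEXT_VALID = 1156863600         # "Next Valid Time"


def to_local(epoch):
    """Convert an epoch value to a timezone-aware datetime in CEST."""
    return datetime.fromtimestamp(epoch, CEST)


def delay_until_valid(preferred, next_valid):
    """Return how long a check is postponed past its preferred time."""
    return to_local(next_valid) - to_local(preferred)


if __name__ == "__main__":
    print(to_local(HOST_PREFERRED).isoformat())              # 2006-08-29T10:25:11+02:00
    print(to_local(NEXT_VALID).isoformat())                  # 2006-08-29T17:00:00+02:00
    print(delay_until_valid(SERVICE_PREFERRED, NEXT_VALID))  # 5:47:32
    print(SERVICE_PREFERRED - HOST_PREFERRED)                # 2837
```

The gap between the two preferred times is 2837 seconds, which matches the logged mult factor (2837) multiplied by the inter-check delay (1.0). The preferred times therefore look plausible on their own; it is only the "Next Valid Time" that moves every check to 17:00.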

As you can see, I also changed the service interleaving from smart to 
dumb with an interleave factor of 1 to circumvent the scheduling logic. 
In vain, I guess... :-(

Now, for my questions:
Has anyone seen this behaviour before?
Where does the "Next Valid Time" in the debug output come from?
How is it generated?
Is there any tool besides the daemon itself to validate the config files?
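
For illustration only: output like the above would be consistent with a scheduler that moves each preferred check time forward to the start of the next allowed daily window. The Python sketch below is not Nagios code and makes no claim about how Nagios computes the value; the function name `next_valid_time` and the window 17:00 to 09:00 are hypothetical, chosen only because they match the times reported in this message.

```python
from datetime import datetime, time, timedelta, timezone

# Assumption: local time is CEST (UTC+2), matching the mail date.
CEST = timezone(timedelta(hours=2), "CEST")


def next_valid_time(preferred, window_start, window_end):
    """Return `preferred` if it lies inside the daily window
    [window_start, window_end), otherwise the next window start."""
    t = preferred.time()
    if window_start <= window_end:
        inside = window_start <= t < window_end
    else:
        # The window wraps around midnight, e.g. 17:00 to 09:00.
        inside = t >= window_start or t < window_end
    if inside:
        return preferred
    candidate = datetime.combine(preferred.date(), window_start,
                                 tzinfo=preferred.tzinfo)
    if candidate <= preferred:
        # Today's window start has already passed; use tomorrow's.
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    # Preferred check time from the debug output (11:12:28 CEST).
    preferred = datetime.fromtimestamp(1156842748, CEST)
    # Hypothetical window: checks allowed only from 17:00 to 09:00.
    valid = next_valid_time(preferred, time(17, 0), time(9, 0))
    print(valid.isoformat())       # 2006-08-29T17:00:00+02:00
    print(int(valid.timestamp()))  # 1156863600
```

With that hypothetical window, a check preferred at 11:12:28 is pushed to 17:00:00, and a daemon restarted after 09:00 would not check anything until 17:00, which is the pattern described above.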

Thanks for reading all the way down here, and please excuse any 
language errors.



Regards,

Marcus Fleige

--
EOF

_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users