Nagios slow restarting/reloading possible NDOUtils issue?

Marc Powell marc at ena.com
Tue Oct 9 16:47:27 CEST 2007



> -----Original Message-----
> From: nagios-users-bounces at lists.sourceforge.net
> [mailto:nagios-users-bounces at lists.sourceforge.net] On Behalf Of
> mark.potter at academy.com
> Sent: Tuesday, October 09, 2007 9:12 AM
> To: nagios-users at lists.sourceforge.net
> Subject: [Nagios-users] Nagios slow restarting/reloading possible
> NDOUtils issue?
 
> When I reload nagios (/etc/init.d/nagios reload) it takes upwards of a
> full minute to bring everything back up for viewing. In the meantime a
> message is displayed concerning the status of the nagios daemon:
> 
> Error: Could not read host and service status information!
> 
> This goes away in between 90 seconds and two minutes and everything
> begins to function normally and the system goes on that way until the
> next time I have to reload the configuration.

Nagios appears to be doing other work, such as reading config files and
parsing associations, before it writes the cache file read by the CGIs.
How large is the file pointed to by object_cache_file when everything is
working? Does it exist during the non-working window? Do you have
retention enabled? If so, disabling it may help.
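
If it helps answer the first two questions, here is a rough sketch of a
script you could run once while everything is working and again during
the non-working window. It is my own illustration, not something shipped
with Nagios. The helper names are made up, the default config path is an
assumption about your install, and I have included status_file alongside
object_cache_file on the assumption that it is also worth a look, since
the CGI error is about status information.

```python
import os
import sys


def read_nagios_paths(cfg_path, keys=("object_cache_file", "status_file")):
    """Return {directive: path} for the named directives in a nagios.cfg-style file."""
    found = {}
    with open(cfg_path) as cfg:
        for line in cfg:
            line = line.strip()
            # Skip blanks, comments, and anything that is not name=value.
            if not line or line.startswith("#") or "=" not in line:
                continue
            name, value = line.split("=", 1)
            if name.strip() in keys:
                found[name.strip()] = value.strip()
    return found


def describe(path):
    """Return (exists, size_in_bytes); size is 0 when the file is missing."""
    try:
        return True, os.stat(path).st_size
    except OSError:
        return False, 0


if __name__ == "__main__":
    # Assumed default location; pass your real nagios.cfg as the first argument.
    cfg = sys.argv[1] if len(sys.argv) > 1 else "/usr/local/nagios/etc/nagios.cfg"
    for name, path in read_nagios_paths(cfg).items():
        exists, size = describe(path)
        print(f"{name}: {path} exists={exists} size={size}")
```

Comparing the two runs should show whether the file is missing, empty,
or simply still being written while the CGIs complain.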

> 2. Always verify configuration options using the -v command-line
> option before starting or restarting Nagios!
> 
> Done and nagios is actually running

What does the -v check output?

> 3. Make sure you've compiled the main program and the CGIs to use the
> same status data storage options (i.e. text file or database). If the
> main program is storing status data in a text file and the CGIs are
> trying to read status data from a database, you'll have problems.
> 
> This is where I suspect the issue might be hiding. I am using NDOUtils
> and

Actually, that advice is a holdover from the days of Nagios 1.x, when
database support could be integrated directly into the Nagios core if
compiled that way. That is very different from the current ndomod
support. With 2.x+, the main program and the CGIs always use the same
data storage option (text file). In any event, your logs show your ndo
module successfully connecting to the database immediately after restart.

If you have lots of hosts/services (tens of thousands) or complex
relationships between them, then that delay is likely normal for 2.x. If
not, you really need to determine what Nagios is doing during that
window. I would recompile Nagios with debugging enabled (./configure
--enable-DEBUGALL) and run the daemon as a foreground process. If that
doesn't reveal anything, I'd resort to strace to see what the process is
doing at a system level.
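
Before going that far, it may be worth measuring the window precisely so
you can line it up against timestamps in the Nagios log. The following
is only a sketch under my own assumptions: the function name is
hypothetical, and you would point it at whichever file the CGIs are
waiting on, starting it right after you issue the reload.

```python
import os
import time


def wait_until_nonempty(path, timeout=300.0, interval=1.0):
    """Poll until path exists with non-zero size.

    Returns the seconds waited, or None if the timeout expires first.
    """
    start = time.monotonic()
    while True:
        try:
            if os.stat(path).st_size > 0:
                return time.monotonic() - start
        except OSError:
            pass  # File is not there yet; keep polling.
        if time.monotonic() - start >= timeout:
            return None
        time.sleep(interval)
```

A result close to your 90 to 120 seconds would suggest the CGIs are
simply waiting for that file to be written. A result near zero would
point somewhere else.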

--
Marc
