Continuing issues with retention file causing schedule/actions to be ignored.

Eli Stair estair at ilm.com
Wed Mar 8 21:12:28 CET 2006


I'm continuing to have problems where the retention.dat file gets into a 
state in which the nagios process stops functioning properly.  In the 
past, the symptom was that increasing numbers of hosts, or entire 
hostgroups, stopped executing their service checks.  Today, the event 
handler for one particular service stopped being executed (while all 
others continue to work).

In this and all previous cases, stopping nagios and moving the retention 
file out of the way resolves the issue.  A reload or a hard stop/start 
of nagios on its own, with the retention file left in place, has no 
effect.  There has never appeared to be anything "wrong" with the 
retention file.
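For anyone who wants to try the same workaround while keeping the 
suspect file for later inspection, here is a minimal sketch.  The 
function name and the ".aside-" suffix are made up for illustration, the 
path to retention.dat depends on the local install and has to be passed 
in, and nagios must already be stopped (the script does not check).

```python
import os
import time


def set_aside_retention(path, suffix=None):
    """Rename the retention file so nagios starts without it.

    Returns the new path. Hypothetical helper: it assumes nagios has
    already been stopped and does nothing to verify that.
    """
    if suffix is None:
        # Timestamp the copy so repeated incidents do not collide.
        suffix = time.strftime("%Y%m%d-%H%M%S")
    new_path = "{}.aside-{}".format(path, suffix)
    if os.path.exists(new_path):
        # Refuse to overwrite an earlier set-aside copy.
        raise FileExistsError(new_path)
    os.rename(path, new_path)
    return new_path
```

Call it with the full path to the retention file, then start nagios 
again; the renamed copy stays next to the original location so it can be 
diffed against a fresh retention.dat afterwards.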

The only problems with my installation are this one, the 
all-too-frequent "premature end of script headers" errors in all the 
CGIs, and "Warning: Size of service_message struct (528 bytes) is > 
POSIX-guaranteed atomic write size (512 bytes). " due to compiling on 
x86_64.  That said, they are serious: there are dozens of daily 
"premature script header/Internal Server Error" failures wreaking havoc 
with production, and the event handler failures are extremely critical.  
The script header problem appeared immediately upon upgrading from 
2.0b6 to 2.0rc2+, and the scheduling/retention problem has been present 
to varying degrees in every 2.0b+ I've tried.
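As a side check on the struct-size warning, the sketch below compares 
the 528 bytes it reports against both the POSIX minimum of 512 bytes 
and the limit the local platform actually advertises.  It assumes the 
warning is about PIPE_BUF, the largest pipe write guaranteed to be 
atomic; the constant and function names are made up for illustration.

```python
import select

STRUCT_SIZE = 528    # size reported in the warning
POSIX_MINIMUM = 512  # smallest PIPE_BUF that POSIX allows


def fits_atomic_write(size, limit):
    """Return True if one pipe write of `size` bytes is within `limit`."""
    return size <= limit


# Against the POSIX minimum the struct is too large, hence the warning.
print("within POSIX minimum:", fits_atomic_write(STRUCT_SIZE, POSIX_MINIMUM))
# select.PIPE_BUF is the value for the platform running this script (Unix only).
print("within local PIPE_BUF:", fits_atomic_write(STRUCT_SIZE, select.PIPE_BUF))
```

If the second line prints True, the warning is probably not the cause of 
the other problems on that machine, though that is only a guess until 
someone confirms how the struct is actually written.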

I would be happy to find that these are configuration/optimization 
issues on my end that I can resolve, but my suspicion is that they are 
bugs.  I will do anything I can to help provide a debug testbed for 
identifying and tracking them down.  Attached is my main nagios config 
(objects are not included), and I can provide any other data (object 
configs, logs, retention.dat, etc.) privately if needed (security 
concerns).

Please let me know what I can do to help address this and find a resolution.

Regards,

/eli

