Reproducible Nagios crash

Jason Ahrens Jason.Ahrens at telus.com
Wed Sep 11 22:19:24 CEST 2002


I'm having a problem with Nagios processing external commands when there are
a lot of them.

I have a very large backlog of external commands (from a distributed
monitoring environment), and Nagios is unable to process the amount of
command traffic this generates.

When we attempt to dump the command log (56M) into the command pipe, Nagios
starts chewing CPU and memory. Eventually, when CPU usage reaches nearly 50%
(it usually hovers around 1-2% at most) and memory usage reaches almost 20MB
(it is usually about 2MB or so), Nagios dies. The log message is always the
same:

[1031774837] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;
[1031774837] Caught SIGSEGV, shutting down...

It seems to be a problem with processing the large number of backlogged
check results.
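
As an illustration only, here is a minimal Python sketch of replaying such a
backlog in small, throttled batches instead of all at once, while skipping
entries that carry no arguments (like the one in the log above). It assumes
the backlog file holds one command per line in the usual external command
form "[timestamp] PROCESS_SERVICE_CHECK_RESULT;host;service;return_code;output".
The function names, batch size, pause, and paths are hypothetical and not
part of Nagios, and whether throttling avoids the SIGSEGV is untested.

```python
import re
import time

# Expected shape of a passive service check result command:
# [timestamp] PROCESS_SERVICE_CHECK_RESULT;host;service;return_code;output
CMD_RE = re.compile(
    r"^\[\d+\] PROCESS_SERVICE_CHECK_RESULT;[^;]+;[^;]+;[0-3];.*$"
)


def is_well_formed(line):
    """Return True if the line looks like a complete check result command."""
    return CMD_RE.match(line.rstrip("\n")) is not None


def batches(lines, size):
    """Yield lists of up to `size` well-formed, newline-terminated commands."""
    batch = []
    for line in lines:
        if is_well_formed(line):
            batch.append(line if line.endswith("\n") else line + "\n")
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def replay(log_path, pipe_path, size=100, pause=1.0):
    """Write the backlog to the command pipe one batch at a time.

    `size` and `pause` are arbitrary starting values (assumptions), not
    tuned or recommended settings.
    """
    with open(log_path) as log, open(pipe_path, "w") as pipe:
        for batch in batches(log, size):
            pipe.writelines(batch)
            pipe.flush()
            time.sleep(pause)  # give Nagios time to drain the pipe
```

It could be called as replay("backlog.cmd", "/path/to/nagios.cmd"), where
both paths are placeholders for the saved command log and the command file
configured in nagios.cfg.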

Has anyone seen this before? This is on Solaris 8, E420, compiled with gcc
2.95 19990728.

Thanks

Jason

--
Jason Ahrens, System Analyst
TELUS Enterprise Solutions
http://www.telus.com



