Nagios crash, apparently in service_result_worker_thread

Andreas Ericsson ae at op5.se
Wed Jan 5 16:26:30 CET 2005


Ethan Galstad wrote:
> Andreas -
> 
> Did you have any luck(?) in having this happen again, so as to be 
> able to track it down? 
> 

It's been happening at irregular intervals ever since. Always with the 
same backtrace, but not always on Friday nights any more.

Here's how it's compiled:
CFLAGS="-pipe -march=i386 -mcpu=i686 -O2 -momit-leaf-frame-pointer -mpreferred-stack-boundary=3 -ggdb3 -g"
export CFLAGS
./configure --prefix=/opt/monitor --disable-statuswrl \
    --with-nagios-user=monitor --with-nagios-group=httpd \
    --disable-event-broker

It's worth noting that the customers this has happened to are running 
lots of custom plugins, and I'm not sure whether those plugins kill 
themselves in a timely manner. A few other customers also run custom 
plugins, but they don't seem to be affected at all. All other software 
is identical on all systems.
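
For illustration only, one common way for a plugin written in C to bound 
its own runtime is to arm alarm() and exit from the SIGALRM handler. 
This is a minimal sketch, not code from any of the plugins discussed 
here: the names arm_plugin_timeout and on_timeout are hypothetical, and 
the exit code 3 assumes the usual plugin convention for UNKNOWN.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <unistd.h>

#define STATE_UNKNOWN 3 /* assumed: conventional plugin exit code for UNKNOWN */

/* Hypothetical handler: report the timeout and terminate the plugin. */
static void on_timeout(int signo)
{
    static const char msg[] = "UNKNOWN - plugin timed out\n";
    ssize_t n;

    (void)signo;
    /* Only async-signal-safe calls (write, _exit) are used here. */
    n = write(STDOUT_FILENO, msg, sizeof(msg) - 1);
    (void)n;
    _exit(STATE_UNKNOWN);
}

/* Hypothetical helper: make the process exit after `seconds` seconds.
 * Returns 0 on success, -1 if the handler could not be installed. */
int arm_plugin_timeout(unsigned int seconds)
{
    struct sigaction sa;

    sa.sa_handler = on_timeout;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;
    alarm(seconds);
    return 0;
}
```

A plugin would call arm_plugin_timeout() once at startup, before doing 
any work that can block, and alarm(0) to cancel it when the check 
finishes in time.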

I'll send a patch to allow coredumps in a clean way.
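
As a rough idea of what allowing core dumps from inside a daemon can 
look like (this is not the patch itself), the soft RLIMIT_CORE limit can 
be raised to the hard limit at startup. The helper name 
enable_core_dumps is hypothetical; getrlimit() and setrlimit() are the 
standard POSIX calls.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper, not part of Nagios: raise the soft core-file
 * size limit to the hard limit so a crash can leave a core file.
 * Returns 0 on success, -1 on failure. */
int enable_core_dumps(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        perror("getrlimit(RLIMIT_CORE)");
        return -1;
    }

    /* An unprivileged process may raise its soft limit up to the hard limit. */
    lim.rlim_cur = lim.rlim_max;

    if (setrlimit(RLIMIT_CORE, &lim) != 0) {
        perror("setrlimit(RLIMIT_CORE)");
        return -1;
    }
    return 0;
}
```

If the hard limit is already 0, this succeeds but still yields no core 
file. The working directory must also be writable by the user the 
daemon runs as.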

Cheers.

> 
> 
> On 20 Dec 2004 at 11:40, Andreas Ericsson wrote:
> 
> 
>>gdb nagios core
>>
>>(gdb) bt
>>#0 0x00000000 in ?? ()
>>#1 0x001c100b in __libc_malloc (bytes=512) at malloc.c:2695
>>#2 0x08060971 in service_result_worker_thread(arg=0x0) at utils.c:4666
>>#3 0x00162de2 in pthread_start_thread(arg=0xbf5ffe40) at manager.c:241
>>#4 0x0020f70a in thread_start () from /lib/libc.so.6
>>(gdb)
>>
>>What strikes me as weird is the fact that this crash happened after
>>Nagios had been running for 4 days (and always seems to happen at
>>friday nights between 9 PM and 11:30 PM in this particular network). I
>>would have expected service_result_worker_thread() to fail at
>>start-time, if at all.
>>
>>Mind though, I've made some modifications to allow it to dump core
>>(which should be either default, ./configure-able or a command
>>argument since debugging without it is not nearly as efficient, and
>>"ulimit c none" can be used to prevent it from doing so any way), but
>>only very minor such that shouldn't affect stability at all.
>>

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Lead Developer

