Bug in ndoutils / ndo2db

Tilo Renz trenz at tagwork-one.de
Tue Dec 16 17:04:35 CET 2008


Hello List,

I believe I have found a bug in ndo2db.c in ndoutils.

The problem is:
ndo2db forks a child for each accepted connection.
The parent process calls waitpid _once_ for each received SIGCHLD.

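But *nix does not guarantee the delivery of every SIGCHLD if multiple signals of the same type occur: standard signals are not queued, so SIGCHLDs that arrive while one is already pending are merged into a single delivery.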
--> increasing number of zombies.

Usually this is no problem, because the number of forks and waits is quite small and the probability of lost SIGCHLDs is even smaller (practically zero).
In our setup, clients close the connection frequently and we have over 500 Nagios instances reporting to one central ndo2db, which raises the number of forks, waits and lost SIGCHLDs to a significant level.

This patch repeats the call to waitpid until no finished children (zombies) are left.

Regards,
Joey5337 / Tilo Renz
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Lost_SIGCHLD.diff
Type: text/x-patch
Size: 719 bytes
Desc: Lost_SIGCHLD.diff
URL: <https://www.monitoring-lists.org/archive/developers/attachments/20081216/82208cb4/attachment.bin>
-------------- next part --------------
_______________________________________________
Nagios-devel mailing list
Nagios-devel at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-devel