[PATCH 1/5] base/events: Don't get stuck in busy-wait loop when poll returns EINVAL

robin.sonefors at op5.com
Mon Nov 5 14:58:29 CET 2012


From: Robin Sonefors <robin.sonefors at op5.com>

In a couple of instances, by completely breaking libnagios, I've managed
to create situations where there are no workers connected to nagios.
When that happens, nagios gets stuck in an infinite loop: polling for
events always returns -1 (EINVAL) immediately, and nagios responds by
immediately trying again, making my machine warm and tired. Exiting the
event loop when that happens seems more reasonable: we can't run any
more checks, and we can't notify anyone about this, because we have no
worker to do so for us.

It would be possible to set sigrestart to TRUE to force nagios to
restart all its workers, but that could create the same crash loop
that I'm trying to fix, only somewhat larger and slower. If all sockets
died, something is seriously broken, so exit instead and make it
possible for an external watchdog daemon to find out.

Signed-off-by: Robin Sonefors <robin.sonefors at op5.com>
---
 base/events.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/base/events.c b/base/events.c
index 9f059c2..cdaa188 100644
--- a/base/events.c
+++ b/base/events.c
@@ -1003,6 +1003,11 @@ int event_execution_loop(void) {
 		               poll_time_ms, iobroker_get_num_fds(nagios_iobs),
 		               squeue_size(nagios_squeue), nagios_iobs);
 		inputs = iobroker_poll(nagios_iobs, poll_time_ms);
+		if (inputs < 0) {
+			logit(NSLOG_RUNTIME_ERROR, TRUE, "Error polling for input, giving up");
+			break;
+		}
+
 		log_debug_info(DEBUGL_IPC, 2, "## %d descriptors had input\n", inputs);
 
 		/* 100 milliseconds allowance for firing off events early */
-- 
1.7.11.7

