[PATCH] base/workers: Write only to initialized memory when we run out of slots

robin.sonefors at op5.com
Thu Nov 1 22:03:23 CET 2012


From: Robin Sonefors <robin.sonefors at op5.com>

get_job_id returns -1 when there are no free slots in the worker. We
didn't handle this case with anything other than a comment. The end
result is that the Nagios core stores the job in slot -1 of the worker's
job array, which lies outside the array and causes memory errors.
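
As a rough illustration of the failure mode and the guard, here is a
minimal standalone sketch. It is not the code in base/workers.c: every
demo_* name, the fixed array size, and the linear slot search are
assumptions made up for this example. Note that in C99 and later
-1 % n evaluates to -1 for positive n, so a modulo on the index does
not pull a negative id back into range.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_JOBS 4

/* Hypothetical, simplified stand-ins for the structures in base/workers.c. */
struct demo_job {
	int id;
};

struct demo_worker {
	int max_jobs;
	struct demo_job *jobs[DEMO_MAX_JOBS];
};

/* Return the first free slot index, or -1 when every slot is taken. */
static int demo_get_job_id(const struct demo_worker *wp)
{
	int i;

	for (i = 0; i < wp->max_jobs; i++) {
		if (wp->jobs[i] == NULL)
			return i;
	}
	return -1;
}

/* Store the job in a free slot; return NULL instead of writing to jobs[-1]. */
static struct demo_worker *demo_assign(struct demo_worker *wp, struct demo_job *job)
{
	assert(wp->max_jobs > 0 && wp->max_jobs <= DEMO_MAX_JOBS);

	job->id = demo_get_job_id(wp);
	if (job->id < 0)
		return NULL; /* worker is full: refuse rather than index out of bounds */

	wp->jobs[job->id] = job;
	return wp;
}
```

In this sketch, a caller that gets NULL back knows the worker is full
and has to report or retry the job itself; nothing is written outside
the jobs array.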

This seems to be what sometimes created crashes when shutting
down/restarting the process. While I haven't been able to create a
completely reproducible test case, I have a fairly large, completely
unrelated test suite that used to cause a crash roughly half the time it
was executed, and this patch stopped that.

The FIXME to figure out somewhere else to put the check is still in its
place - I still don't do that - but not dumping core because we've got a
sizable workload seems reasonable. It's unlikely that another worker is
available to spread the workload for us, so the error returned about us
being too busy to do anything should normally already be noticeable,
if only indirectly.

Signed-off-by: Robin Sonefors <robin.sonefors at op5.com>
---
 base/workers.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/base/workers.c b/base/workers.c
index 5338558..9f9aecd 100644
--- a/base/workers.c
+++ b/base/workers.c
@@ -830,8 +830,9 @@ static worker_process *get_worker(worker_job *job)
 
 	if (job->id < 0) {
 		/* XXX FIXME Fiddle with finding a new, less busy, worker here */
+		return NULL;
 	}
-	wp->jobs[job->id % wp->max_jobs] = job;
+	wp->jobs[job->id] = job;
 	job->wp = wp;
 	return wp;
 
-- 
1.7.11.7

