Multiple Nagios processes running.

Chris Wilson chris at aidworld.org
Wed Jul 27 19:12:26 CEST 2005


Hi Andreas,

> What you think and don't think is, sadly, irrelevant. It's a fact that 
> Ethan doesn't actively track down bugs or prioritise bug-reports on the 
> 1.x-branch. If you're interested you could of course backport the fixes 
> from the 2.x branch. I'm sure Ethan would welcome a patch.

Irrelevant to who? I will try and find time to maintain it myself if
nobody else wants to, and it doesn't annoy anyone too much (but it would
be a "fork" of Nagios 1.2). It will probably be a few years before I
trust 2.x enough to use it. I guess from the number of people reporting
this issue on the mailing list that I'm not the only one.

If Ethan will accept patches for 1.2, then great. I could even take some
responsibility for maintaining the official 1.x branch if that would
help.

> Nothing's wrong with it per se. To work around it I added the redhatish 
> concept of lockfiles that are created by the init-script. Several nagios 
> instances can still be spawned so long as you don't use the init-script, 
> but on platforms that have the "service" script it's not often useful to 
> do so anyways.

I think my patch makes nagios.lock work the way it should, so a separate
lockfile isn't necessary. But I would definitely welcome comments.

> > Nagios tries to do the
> > mutual exclusion, but fails for reasons that I don't understand yet.
> > 
> 
> I take it you haven't read the code. The mutex part simply isn't there 
> (it's fairly easy to follow, if you take it from main() and just read on 
> down to event_execution_loop() (or something)).

How do you think I wrote a patch without reading the code? base/utils.c
daemon_init() doesn't use mutexes at all in 1.2. It uses fcntl(F_SETLK),
but that apparently doesn't work (at least there is no mutual exclusion
on Linux). I made it tougher by checking whether the process listed in
the PID file is still running, and aborting with an appropriate error if
it is.
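
For illustration only, here is a minimal sketch of that kind of check. It
is not the patch itself: the function name lockfile_owner_alive() and its
return convention are hypothetical, and it assumes the lockfile holds
nothing but the PID as a decimal number.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not taken from Nagios or from the patch).
 * Returns 1 if the PID stored in lockfile names a live process,
 * 0 if the file is missing, unparsable, or stale,
 * and -1 on any other error. */
int lockfile_owner_alive(const char *lockfile)
{
    FILE *fp = fopen(lockfile, "r");
    long pid = 0;
    int parsed;

    if (fp == NULL)
        return (errno == ENOENT) ? 0 : -1;

    parsed = fscanf(fp, "%ld", &pid);
    fclose(fp);

    if (parsed != 1 || pid <= 0)
        return 0;   /* empty or garbled file: treat as stale */

    /* Signal 0 delivers nothing; kill() only performs the
     * existence and permission checks. */
    if (kill((pid_t)pid, 0) == 0)
        return 1;
    if (errno == EPERM)
        return 1;   /* process exists but belongs to another user */
    if (errno == ESRCH)
        return 0;   /* no such process: the lockfile is stale */
    return -1;
}
```

A daemon would call this before writing its own PID and abort with an
error if it returns 1. The check is not atomic, and a recycled PID can
make a stale lockfile look live, so it complements file locking rather
than replacing it.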

How is the version in 2.x "more complete"? What can be more complete
than properly checking that the process specified by the lockfile is not
still running?

Cheers, Chris.
-- 
(aidworld) chris wilson | chief engineer (chris at aidworld.org)







