Multiple Nagios processes running.

Andreas Ericsson ae at op5.se
Thu Jul 28 20:43:43 CEST 2005


Chris Wilson wrote:
> Hi Andreas,
> 
>>>How do you think I wrote a patch without reading the code?
>>
>>Guess-work? I've seen a fair amount of it from you on other topics.
> 
> 
> Sorry, I wasn't aware of that. Please could you give me some examples? I
> would like to do better.
> 

The hashing issue. I just noticed however that that was Ken Dyke and not 
you, so I'd like to revoke that statement. Sorry.

>>I'll have to look at that patch. Can you send it again?
> 
> 
> Sure, it's attached.
> 

There are a couple of issues with it.

* C++ style comments aren't portable enough.
* It isn't in "Ethan style" (nothing but Ethan's code is, but patches 
just aren't accepted unless they are).
* I don't understand where you're reading the pid from the lockfile. All 
I see are two calls to getpid(2). If this is actually true, the patch 
will fix absolutely nothing.
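
To illustrate the third point: getpid(2) only ever returns the caller's own pid, so the pid of a possibly running instance has to be parsed out of the lockfile itself. A minimal sketch of that step follows. The function name read_pid_from_lockfile() is made up for this example and is not taken from the patch or from the Nagios source, and it assumes the lockfile holds the pid as a decimal number at the start of the file.

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not from the patch or from Nagios.
 * Returns the pid stored at the start of the file, or -1 if the file
 * is missing, unreadable or doesn't start with a positive number.
 */
pid_t read_pid_from_lockfile(const char *path)
{
	FILE *fp;
	long val = 0;
	int matched;

	fp = fopen(path, "r");
	if (!fp)
		return -1;

	matched = fscanf(fp, "%ld", &val);
	fclose(fp);

	if (matched != 1 || val <= 0)
		return -1;

	return (pid_t)val;
}
```

Knowing the stored pid is only half of the problem. Deciding whether that pid still belongs to a running Nagios is the hard part.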

More below.

> 
>>I don't have high hopes though, considering the fact that you proposed 
>>using the extremely non-portable setproctitle() to discern the master 
>>process from the slaves.
> 
> 
> Well, one can always #ifdef such things to platforms where they are
> known to work. I wasn't aware that it wasn't portable, but I must admit
> that I have only programmed C on Windows, Linux and OBSD, and only
> extensively on Linux.
> 
> 
>>How exactly do you check that the process listed isn't running? AFAIK 
>>there aren't any portable syscalls available for getting the process 
>>name from a pid, and the /proc filesystem works differently on different 
>>platforms.
> 
> 
> kill(pidno, 0).
> 

This isn't portable. On some systems it will return -1 and set errno to 
EINVAL. This is particularly true for systems that follow the pre-2001 
POSIX.1 standard. This behaviour is actually also POSIX-compatible. The 
result you're seeing on your system when testing this is an extension to 
the standard.
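
For anyone who wants to probe with kill() anyway, here is a rough sketch that treats every unexpected errno as "can't tell" instead of "not running". The name pid_probe() and its return convention are invented for this example.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/*
 * Hypothetical helper. Returns:
 *   1  a process with this pid exists
 *   0  no such process
 *  -1  can't tell
 */
int pid_probe(pid_t pid)
{
	/* 0 and negative values address process groups, not one process */
	if (pid <= 0)
		return -1;

	if (kill(pid, 0) == 0)
		return 1;

	switch (errno) {
	case ESRCH:
		return 0;	/* nothing has that pid */
	case EPERM:
		return 1;	/* exists, but we may not signal it */
	default:
		return -1;	/* e.g. EINVAL where signal 0 is rejected */
	}
}
```

Even a return value of 1 only says that some process has that pid. It says nothing about whether that process is Nagios, since pids get reused.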

> 
>>man fcntl. If a process is already holding the lock one is trying to 
>>acquire it will return -1. It's the right way to do it. Checking the pid 
>>and trying to find a matching process is the cumbersome, incorrect and 
>>non-portable way.
> 
> 
> I read the man page. It's all very well in theory, but it's just not
> working on Linux, and I don't know why yet.
> 

Then the bug is in either the Linux kernel, GNU libc or the compiler, 
with the kernel being the most likely. What filesystem are you using?

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Lead Developer


Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users