NRPE way too fragile ?

Guillaume Rousse Guillaume.Rousse at inria.fr
Wed Oct 8 12:43:39 CEST 2008


Hello list.

I'm using nrpe quite heavily to test lots of local services on all my 
machines. It usually works well, but seems a bit unreliable: too often, 
nrpe itself fails to accept incoming connections, and the test fails:
CHECK_NRPE: Socket timeout after 10 seconds.

Running strace on the nrpe process shows it is probably itself waiting 
on another connection:
[root@denfert ~]# strace -p 22444
Process 22444 attached - interrupt to quit
select(6, [5], NULL, [5], {0, 170000})  = 0 (Timeout)
accept(5, 0, NULL)                      = -1 EAGAIN (Resource temporarily unavailable)
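
To make the trace above easier to read, here is a minimal sketch in Python of the same select()/accept() pattern on a non-blocking listening socket. It is only an illustration, not NRPE's actual code: the names poll_once and demo, the loopback address and the 0.17 second timeout are assumptions chosen to mirror the trace. With no client connecting, select() times out and the following accept() fails with EAGAIN, which is what strace reports.

```python
import errno
import select
import socket


def poll_once(listener, timeout=0.17):
    """Run one select()+accept() round on a non-blocking listener.

    Returns (readable, err): the sockets select() reported as readable,
    and the errno from accept() (0 if a connection was accepted).
    """
    # Same shape as the traced call: watch the listener for reads and errors.
    readable, _, _ = select.select([listener], [], [listener], timeout)
    try:
        # accept() is attempted even when select() timed out.
        conn, _addr = listener.accept()
    except BlockingIOError as exc:
        # No pending connection: EAGAIN / EWOULDBLOCK.
        return readable, exc.errno
    conn.close()
    return readable, 0


def demo():
    """Hypothetical idle listener on an ephemeral loopback port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)
    listener.setblocking(False)
    try:
        return poll_once(listener)
    finally:
        listener.close()


if __name__ == "__main__":
    readable, err = demo()
    print(readable, errno.errorcode.get(err))
```

If this reading is right, the two traced calls by themselves only show that no connection was pending on the listening socket at that instant. A longer trace covering an actual check_nrpe timeout would be needed to see where the daemon really stalls.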

It usually recovers by itself, but that's enough to cause many 
unwanted notifications, even though all monitored services have nrpe 
itself as a dependency. I'm using SSL encryption, as usually advised, 
but I'm planning to switch to plain-text connections (everything 
happens on a distinct VLAN, without user access).

Does anyone else have a similar experience?
-- 
Guillaume Rousse
Moyens Informatiques - INRIA Futurs
Tel: 01 69 35 69 62
