NRPE 2.0 + static NAT = trouble

Frederic Vanden Poel IT at barclab.com
Thu Jun 17 11:51:52 CEST 2004


We are using nrpe 2.0 on all our Linux servers.

On the internal machines, nrpe runs fine and we never get bogus
alerts.

On the DMZ machines, nrpe daemons started from xinetd sometimes remain
in an ESTABLISHED state (according to netstat), and strace shows nrpe
blocked in a read() syscall. The nrpe setup in xinetd is pretty standard:

service nrpe
{
        disable         = no
        flags           = REUSE
        socket_type     = stream
        wait            = no
        user            = nagios
        server          = /usr/local/nagios/nrpe
        server_args     = -i -c /etc/nrpe.cfg
        only_from       = 127.0.0.1 aaa.bbb.ccc.ddd
        log_on_failure  += USERID
}
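
To watch the stuck connections pile up, the netstat check described above
can be scripted. The following is only a minimal sketch: it assumes nrpe
listens on its default port 5666 (the port is not stated in this post) and
that the text comes from `netstat -tn`. The function name and the sample
addresses (taken from the 192.0.2.0/24 documentation range) are made up
for illustration.

```python
def count_established(netstat_output, port=5666):
    """Count TCP connections in ESTABLISHED state on the given local port.

    Expects text in the layout printed by `netstat -tn`:
    Proto Recv-Q Send-Q Local-Address Foreign-Address State
    The default port 5666 is an assumption (NRPE's usual default).
    """
    count = 0
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip banner/header lines and anything that is not a TCP row.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local_address, state = fields[3], fields[5]
        local_port = local_address.rsplit(":", 1)[-1]
        if state == "ESTABLISHED" and local_port == str(port):
            count += 1
    return count


# Hypothetical sample input, not taken from the affected servers.
SAMPLE = """\
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 192.0.2.10:5666         192.0.2.1:40112         ESTABLISHED
tcp        0      0 192.0.2.10:5666         192.0.2.1:40113         TIME_WAIT
tcp        0      0 192.0.2.10:22           192.0.2.7:51000         ESTABLISHED
"""

print(count_established(SAMPLE))
```

Feeding it real `netstat -tn` output every few minutes and logging the
result would show how quickly the ESTABLISHED count grows between checks.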

After a couple of hours there are so many nrpe processes in this state
that new connections from the nagios server result in the following
error:

[06-08-2004 16:14:49] SERVICE ALERT: svr.dmz.com;/var disk;CRITICAL;HARD;4;CHECK_NRPE: Error - Could not complete SSL handshake.

This is quite annoying, as we receive bogus alerts for the services
checked through nrpe.

The nagios server has a one-to-one static NAT address aaa.bbb.ccc.ddd in
the DMZ range, which means that the other DMZ machines see the nagios
probes coming from within the DMZ.

We also tried running nrpe as a standalone daemon without xinetd, and
the problem remains. After a while, the nrpe daemon just gives up without
a syslog message. The nrpe problem occurs on all our Linux DMZ servers.

We have tried changing connection timeouts on the firewall and in
nagios, but none of these changes seems to help. Our less-than-ideal
workaround is to kill all hanging nrpe processes from cron.
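
For illustration, a cron-driven cleanup along these lines could look like
the sketch below. It is not the script actually in use here. It assumes
the xinetd setup (one nrpe process per connection), that
`ps -eo pid,etime,comm` is available, and a hypothetical age threshold of
300 seconds. All function names are invented for the example. In
standalone daemon mode an age-based rule like this would also match the
long-running parent daemon, so it would need an extra filter.

```python
import os
import signal
import subprocess


def etime_to_seconds(etime):
    """Convert a ps ELAPSED value ([[dd-]hh:]mm:ss) to seconds."""
    days = 0
    if "-" in etime:
        day_part, etime = etime.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in etime.split(":")]
    # Pad missing leading fields (hours) with zero.
    while len(parts) < 3:
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def stale_pids(ps_output, name="nrpe", max_age=300):
    """Return PIDs of processes called `name` that are older than max_age seconds.

    max_age=300 is an arbitrary, hypothetical threshold.
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        # Skip the header row and malformed lines.
        if len(fields) != 3 or not fields[0].isdigit():
            continue
        pid, etime, comm = fields[0], fields[1], fields[2].strip()
        if comm == name and etime_to_seconds(etime) > max_age:
            pids.append(int(pid))
    return pids


def kill_stale(max_age=300):
    """Send SIGTERM to every stale nrpe process; intended to be run from cron."""
    result = subprocess.run(
        ["ps", "-eo", "pid,etime,comm"],
        capture_output=True, text=True, check=True,
    )
    for pid in stale_pids(result.stdout, max_age=max_age):
        os.kill(pid, signal.SIGTERM)
```

Only `kill_stale()` touches the system; the two helpers are pure functions
and can be checked against captured `ps` output before anything is killed.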

Any ideas on how to debug the issue are very welcome.

-- 
        Frederic Vanden Poel
        Network Engineer @ BARC N.V.

        Phone : +32 (9) 329 23 29
        Fax   : +32 (9) 329 23 30


