How to debug nrpe not connecting?

david palm dvdplm at gmail.com
Wed May 16 18:35:03 CEST 2007


Hi all,
Disclaimer: I'm new here and new to Nagios, so please bear with me. I've
searched the lists, the FAQ and whatnot to sort this out on my own, but to no
avail.

My goal (for now) is to make a dead simple nagios install:
Server A runs nagios, does some basic localhost checks
Server B runs nrpe daemon
Server A runs a check_load command on Server B through nrpe

I have done two separate installations from source on two different sets of
servers, running on different networks and SuSE 10.1/9.2 in one case, Debian
Unstable/Gentoo in the other. The error is identical on both sets of
servers, so I have concluded it is not an OS-related problem.

Yes, you probably guessed it, it is the oh so common "Warning: Return code
of 127 for check of service 'CPU Load' on host 'gerald' was out of bounds.
Make sure the plugin you're trying to run actually exists.".
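
For reference, my understanding is that 127 is the status a POSIX shell gives back when it cannot find the command it was told to run, and that Nagios only accepts plugin return codes 0 to 3. A minimal Python sketch of that, assuming `sh` is available; the helper name `run_via_sh` and the `/nonexistent/...` path are made up for illustration:

```python
import subprocess

# Plugin exit codes that Nagios treats as valid service states.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_via_sh(command_line):
    """Run a command line through sh and classify its exit code."""
    result = subprocess.run(
        ["sh", "-c", command_line],
        capture_output=True,
        text=True,
    )
    # Anything outside 0..3 is what the warning calls "out of bounds".
    state = STATES.get(result.returncode, "out of bounds")
    return result.returncode, state, result.stderr.strip()


# Hypothetical path that does not exist, to provoke the same failure mode.
code, state, err = run_via_sh("/nonexistent/libexec/check_nrpe -H 192.168.1.200")
print(code, state)
print(err)
```

So the warning says less about nrpe itself than about the shell failing to find whatever command line it was handed.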

Ok, so here are some details ("helena" is Server A, where nagios runs;
"gerald" is Server B, where the nrpe daemon runs). All commands are run as
the "nagios" user.

    nagios@helena:/$ /opt/nagios/libexec/check_nrpe -H localhost

returns "NRPE v2.8.1" just as it should

    nagios@helena:/$ /opt/nagios/libexec/check_nrpe -H gerald -c check_load

returns a correct reply and I can see the two servers speaking by doing a
tail -f /var/log/syslog on Server B:

    May 16 18:15:42 gerald nrpe[20766]: Connection from 192.168.1.10 port 29830
    May 16 18:15:42 gerald nrpe[20766]: Host address is in allowed_hosts
    May 16 18:15:42 gerald nrpe[20766]: Handling the connection...
    May 16 18:15:42 gerald nrpe[20766]: Host is asking for command 'check_load' to be run...
    May 16 18:15:42 gerald nrpe[20766]: Running command: /opt/nagios//libexec/check_load -w 15,10,5 -c 30,25,20
    May 16 18:15:42 gerald nrpe[20766]: Command completed with return code 0 and output: OK - load average: 0.01, 0.01, 0.00|load1=0.010;15.000;30.000;0; load5=0.010;10.000;25.000;0; load15=0.000;5.000;20.000;0;
    May 16 18:15:42 gerald nrpe[20766]: Return Code: 0, Output: OK - load average: 0.01, 0.01, 0.00|load1=0.010;15.000;30.000;0; load5=0.010;10.000;25.000;0; load15=0.000;5.000;20.000;0;
    May 16 18:15:42 gerald nrpe[20766]: Connection from 192.168.1.10 closed.

So, am I correct in assuming that nrpe is correctly running and functioning
on both servers? In this case the nrpe daemon is running as a stand-alone
daemon, but results are exactly the same when running under xinetd on the
SuSE servers (first set of servers).

When I launch nagios as a foreground process I see something interesting on
the console:
    root@helena:~/custom_compiles/nrpe-2.8.1# /opt/nagios/bin/nagios /opt/nagios/etc/nagios.cfg

    Nagios 2.9
    Copyright (c) 1999-2007 Ethan Galstad (http://www.nagios.org)
    Last Modified: 04-10-2007
    License: GPL

    Nagios 2.9 starting... (PID=5406)
!!  sh: line 1: /opt/nagios/libexecHOSTADDRESS$: No such file or directory
    Warning: Return code of 127 for check of service 'CPU Load' on host 'gerald' was out of bounds. Make sure the plugin you're trying to run actually exists.
!!  sh: line 1: /opt/nagios/libexecHOSTADDRESS$: No such file or directory
    Warning: Return code of 127 for check of service 'CPU Load' on host 'gerald' was out of bounds. Make sure the plugin you're trying to run actually exists.

See those errors from sh? The command line seems to have been stripped of
the actual command and the $HOSTADDRESS$ macro lacks the leading "$"...

Now, who/what is doing this to the command? I've checked and double checked
my config files and believe they're correct. The relevant bits follow:

commands.cfg:
    define command{
              command_name  check_nrpe
              command_line  $USER1/check_nrpe -H $HOSTADDRESS$ -c $ARGS1$
    }

nrpe.cfg (on Server B, "gerald"):
    command[check_load]=/opt/nagios/libexec/check_load -w 15,10,5 -c 30,25,20

resource.cfg:
    $USER1$=/opt/nagios/libexec

gerald.cfg:
    define host{
            use                     linux-server
            host_name               gerald
            alias                   gerald
            address                 192.168.1.200
    }

    define service{
            use                     generic-service         ; Name of service template to use
            host_name               gerald
            service_description     CPU Load

            check_period                    24x7            ; The service can be checked at any time of the day
            max_check_attempts              600             ; Re-check the service up to 600 times in order to determine its final (hard) state
            normal_check_interval           2               ; Check the service every 2 minutes under normal conditions
            retry_check_interval            1               ; Re-check the service every minute until a hard state can be determined
            contact_groups                  admins          ; Notifications get sent out to everyone in the 'admins' group
            notification_options            w,u,c,r         ; Send notifications about warning, unknown, critical, and recovery events
            notification_interval           360             ; Re-notify about service problems every 6 hours
            notification_period             24x7            ; Notifications can be sent out at any time

            check_command           check_nrpe!check_load
    }

As you can see, not much has been changed from the basic installation
instructions.
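
Since typos in macro delimiters are easy to miss by eye, a small lint pass over the command_line from commands.cfg may give a lead. What follows is only a sketch: it is not how Nagios itself parses macros, the helper name `find_macro_problems` and the short allow-list of macro names are assumptions made for illustration, and `$$` escapes are not handled.

```python
import re

# Well-formed macro: $NAME$ with an upper-case name (assumption of this sketch).
MACRO_RE = re.compile(r"\$([A-Z][A-Z0-9_]*)\$")
# Deliberately short allow-list; real Nagios knows many more macros.
KNOWN_RE = re.compile(r"^(USER\d+|ARG\d+|HOSTADDRESS|HOSTNAME)$")


def find_macro_problems(command_line):
    """Return a list of human-readable warnings for one command_line."""
    problems = []
    for name in MACRO_RE.findall(command_line):
        if not KNOWN_RE.match(name):
            problems.append(f"unrecognised macro name: ${name}$")
    # Whatever '$' is left after removing well-formed macros is unpaired.
    leftover = MACRO_RE.sub("", command_line)
    for match in re.finditer(r"\$\S*", leftover):
        problems.append(f"unpaired '$' near: {match.group(0)}")
    return problems


# The command_line exactly as it appears in commands.cfg above.
line = "$USER1/check_nrpe -H $HOSTADDRESS$ -c $ARGS1$"
for problem in find_macro_problems(line):
    print(problem)
```

Anything it prints is only a hint about where to look, not a verdict from Nagios.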

Ideas anyone? :-(

How can I configure nagios to provide more leads (debug info) than that
miserable sh error above? Are the debug_file (and related) options
3.0-only? (http://nagios.sourceforge.net/docs/3_0/configmain.html#debug_file)