Help please - test works fine EXCEPT over NRPE???

Andrew Davis nccomp at gmail.com
Thu Apr 2 00:27:12 CEST 2009


I set up the "check_logs.pl" 
(http://www.nagiosexchange.org/cgi-bin/page.cgi?g=Detailed%2F1752.html;d=1) 
test and its config file on some local Linux servers that are checked via 
NRPE. All other NRPE tests work fine (including some custom ones). The 
check_logs.pl test works fine locally, but fails over NRPE. I've enabled 
debugging in NRPE, but it's not telling me much more...

Client local test:

    atum:/etc/init.d # /usr/local/nagios/libexec/check_logs.pl -c /etc/nagios/check_logs_linux.cfg
    faillog => OK; lastlog => OK; messages => OK; wtmp => OK;


Server test to client via NRPE:

    /usr/local/nagios/libexec/check_nrpe -H atum -c check_logs
    CHECK_NRPE: No output returned from daemon.


Local log (/var/log/messages) on client when test is run from server:

    Apr  1 18:05:52 atum nrpe[1412]: Added command[check_logs]=/usr/local/nagios/libexec/check_logs.pl -c /etc/nagios/check_logs_linux.cfg
    Apr  1 18:05:52 atum nrpe[1412]: INFO: SSL/TLS initialized. All network traffic will be encrypted.
    Apr  1 18:05:52 atum nrpe[1412]: Handling the connection...
    Apr  1 18:05:52 atum nrpe[1412]: Host is asking for command 'check_logs' to be run...
    Apr  1 18:05:52 atum nrpe[1412]: Running command: /usr/local/nagios/libexec/check_logs.pl -c /etc/nagios/check_logs_linux.cfg
    Apr  1 18:05:52 atum nrpe[1412]: Command completed with return code 0 and output:
    Apr  1 18:05:52 atum nrpe[1412]: Return Code: 0, Output:


The response is immediate, so it's not a timeout issue. Other NRPE tests 
work fine:

    /usr/local/nagios/libexec/check_nrpe -H atum -c check_load
    OK - load average: 0.00, 0.00, 0.00|load1=0.000;5.000;10.000;0; load5=0.000;5.000;10.000;0; load15=0.000;5.000;10.000;0;
    /usr/local/nagios/libexec/check_nrpe -H atum -c check_memory
    CHECK_MEMORY OK - 1702M free | free=1785552896b;210236620.8:;105118310.4:


And on the client:

    Apr  1 18:09:25 atum nrpe[1799]: INFO: SSL/TLS initialized. All network traffic will be encrypted.
    Apr  1 18:09:25 atum nrpe[1799]: Handling the connection...
    Apr  1 18:09:25 atum nrpe[1799]: Host is asking for command 'check_load' to be run...
    Apr  1 18:09:25 atum nrpe[1799]: Running command: /usr/local/nagios/libexec/check_load -r -w 5.0 -c 10.0
    Apr  1 18:09:25 atum nrpe[1799]: Command completed with return code 0 and output: OK - load average: 0.00, 0.00, 0.00|load1=0.000;5.000;10.000;0; load5=0.000;5.000;10.000;0; load15=0.000;5.000;10.000;0;
    Apr  1 18:09:25 atum nrpe[1799]: Return Code: 0, Output: OK - load average: 0.00, 0.00, 0.00|load1=0.000;5.000;10.000;0; load5=0.000;5.000;10.000;0; load15=0.000;5.000;10.000;0;
    Apr  1 18:09:26 atum nrpe[1802]: INFO: SSL/TLS initialized. All network traffic will be encrypted.
    Apr  1 18:09:26 atum nrpe[1802]: Handling the connection...
    Apr  1 18:09:26 atum nrpe[1802]: Host is asking for command 'check_memory' to be run...
    Apr  1 18:09:26 atum nrpe[1802]: Running command: /usr/local/nagios/libexec/check_memory.pl -w 10% -c 5%
    Apr  1 18:09:26 atum nrpe[1802]: Command completed with return code 0 and output: CHECK_MEMORY OK - 1703M free | free=1786134528b;210236620.8:;105118310.4:
    Apr  1 18:09:26 atum nrpe[1802]: Return Code: 0, Output: CHECK_MEMORY OK - 1703M free | free=1786134528b;210236620.8:;105118310.4:


Here's the local command in my /etc/nagios/nrpe.cfg:

    command[check_logs]=/usr/local/nagios/libexec/check_logs.pl -c /etc/nagios/check_logs_linux.cfg


And on the server (when done in services.cfg, though it's failing with 
manual tests too):

    define service {
            hostgroup_name                  linux-servers
            service_description             LOGS
            check_command                   check_nrpe!check_logs
            max_check_attempts              3
            normal_check_interval           15
            retry_check_interval            5
            check_period                    24x7
            notification_interval           120
            notification_period             24x7
            notification_options            w, u, c, r, f, s
            contact_groups                  unixadmins
    }



Considering it fails with a manual test (command line), I doubt it's my 
services.cfg entry. It runs fine when called locally, so I'm thinking it 
could be an issue on the client in the nrpe.cfg, but if so I can't find 
it...

I *do* see the obvious... namely, the other two tests that run over NRPE 
have something after "Output:" and the check_logs.pl does not. However, 
called at the command line it does... which is what stumps me.

What would cause the test to run fine locally, but return nothing when 
called via NRPE? (BTW: I'm running Nagios 3.x with the latest set of 
plugins and NRPE.)
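
One way to narrow this down (a sketch under assumptions, not a confirmed 
diagnosis): the local test above ran from a root shell (the "#" prompt), 
while NRPE normally runs plugins as an unprivileged user with a much smaller 
environment and, as far as I know, only relays what the plugin prints to 
stdout. The helper below, with the hypothetical name run_clean, runs any 
command with an empty environment and shows the exit code, stdout, and 
stderr separately, so a message that only goes to stderr becomes visible.

```shell
#!/bin/sh
# run_clean is a hypothetical helper name, not part of Nagios or NRPE.
# It runs the given command with an empty environment (env -i) and prints
# the exit code, stdout, and stderr on separate labelled lines.
# Pass the command by absolute path, since PATH is cleared as well.
run_clean() {
    out_file=$(mktemp) || return 1
    err_file=$(mktemp) || { rm -f "$out_file"; return 1; }

    # Capture stdout and stderr separately so they can be told apart.
    env -i "$@" >"$out_file" 2>"$err_file"
    rc=$?

    printf 'rc=%s\n' "$rc"
    printf 'stdout=%s\n' "$(cat "$out_file")"
    printf 'stderr=%s\n' "$(cat "$err_file")"

    rm -f "$out_file" "$err_file"
}
```

To mimic NRPE more closely, the function could be called from a shell 
started as the account NRPE runs under (the nrpe_user setting in nrpe.cfg, 
assumed here to be a non-root user), for example: 
run_clean /usr/local/nagios/libexec/check_logs.pl -c /etc/nagios/check_logs_linux.cfg 
If stdout is empty there too, file permissions on the logs or the seek file, 
or a missing environment variable, would be the first things to check.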

One more thing: I know *someone* is going to ask why I don't just use 
the built-in check_log test. The answer is that check_logs.pl allows for 
multiple files and pattern matches and a "seek" file to speed things up.

-- 


  A. Davis
  Email:     nccomp at gmail.com

  "There is no limit to what a man can accomplish
   if he doesn't care who gets the credit." - Ronald Reagan
