Status oscillates w/check_by_ssh.

Scott Zak ZakS at so.ct.edu
Tue Dec 28 15:22:23 CET 2004


We actually have four separate nagios daemons running to provide 
different views to the people who monitor groups of hosts and services.  
Their configurations are parallel in some respects, but they shouldn't 
overlap. 
Thanks for the tip.  I'll check it out.

Scott Zak
__________________




I had this happening when I had multiple nagios processes running.  You 
should check that out.
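
One way to look for this is to list the nagios daemons and see whether 
two of them were started against the same main config file.  The sketch 
below is only an illustration: duplicate_configs is a made-up helper 
name, and it assumes each daemon is a child of PID 1 and was started 
with the config file as its last argument (for example 
"nagios -d /path/to/nagios.cfg").  Adjust it if your setup differs.

```shell
#!/bin/sh
# Hypothetical helper: reads "PPID ARGS" lines on stdin and prints every
# config path used by more than one nagios daemon.
# Assumptions: daemons have parent PID 1, the executable name ends in
# "nagios", and the config file is the last argument.
duplicate_configs() {
    awk '$1 == 1 && $2 ~ /nagios$/ { print $NF }' | sort | uniq -d
}

# Feed it the live process table; no output means no duplicates found.
ps -eo ppid,args | duplicate_configs
```

Several daemons with distinct config files, as in a deliberate 
multi-daemon setup, produce no output here; only a config path shared by 
two or more daemons is printed.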

 

Scott Yem

Research Computing Services

Agilent Laboratories



From: nagios-users-admin at lists.sourceforge.net 
[mailto:nagios-users-admin at lists.sourceforge.net] On Behalf Of Scott Zak
Sent: Monday, December 27, 2004 1:27 PM
To: nagios-users at lists.sourceforge.net
Subject: [Nagios-users] Re: Status oscillates w/check_by_ssh.

 


After allowing the service checks to run for a while, I'm finding that the 
service status is oscillating between OK and UNKNOWN.  So it's working 
sometimes, and sometimes it's not.  Yikes. 

It doesn't seem likely that this has anything to do with timeouts or 
missed thresholds (I'm going to try increasing them anyway). 



__________________ 



It's doing what it is supposed to do -- returning zero. 

That is also the return when running check_by_ssh on the nagios box. 

Scott 


----- Forwarded by Scott Zak/IST/CSUSO on 12/27/04 01:18 PM ----- 


 


D Brian Hendrix <dhendrix2 at csc.com> 
12/27/04 12:57 PM 

        To:        "Scott Zak" <ZakS at sysoff.ctstateu.edu> 
        cc:         
        Subject:        Re: [Nagios-users] Status Unknown w/check_by_ssh.  Command line OK.

Scott, 

When you run the command on the Sun box, what is the error code returned? 

Use the following command: 
> echo $? 

You should get a zero (0) if successful, or a one (1) or higher if not 
successful. 
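
For Nagios plugins specifically, the exit code is what sets the service 
state: 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN.  A rough 
sketch of a wrapper that prints the code next to the state it maps to 
follows.  The names state_name and run_and_report are made up for this 
example, and "true" is only a stand-in for the real check_by_ssh command 
line.

```shell
#!/bin/sh
# Map a plugin exit code to the Nagios state it stands for.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "out of range" ;;
    esac
}

# Run any command, pass its output through, then report its exit code
# together with the matching state name.
run_and_report() {
    "$@"
    rc=$?
    echo "exit code: $rc ($(state_name "$rc"))"
    return "$rc"
}

# Stand-in command; substitute the actual check_by_ssh invocation.
run_and_report true    # prints "exit code: 0 (OK)"
```

Running the same wrapper as the nagios user, with the exact command line 
from the command definition, makes it easier to compare what the shell 
sees with what Nagios records.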

Brian Hendrix 
Senior System Administrator - DCI 
IT/9KIDD, Baptist Hospital 
2000 Church Street, Nashville, TN    37062 
(615) 284-5297 work 
(615) 222-1704 fax 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
Whatever you do, do your work heartily, 
as for the Lord rather than for men 
- Colossians 3:23 
~~~ ><> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 


"Scott Zak" <ZakS at sysoff.ctstateu.edu> 
Sent by: nagios-users-admin at lists.sourceforge.net 
12/27/2004 11:52 AM 

        To:        nagios-users at lists.sourceforge.net 
        cc:         
        Subject:        [Nagios-users] Status Unknown w/check_by_ssh.  Command line OK.





Hi, 

I'm setting up check_by_ssh to invoke a script via forced command on a 
remote server to check a service status.  The script returns the correct 
status on the server where it lives, and when I run check_by_ssh on the 
command-line, all is well.  Nagios runs the remote command (and displays 
the contents of stdout on the status detail page), but the status always 
comes back 'Unknown'. 

Command-line: 
[nagios@nagios1]$ libexec/check_by_ssh -H 149.152.10.183 -l remoteuser -i /path/to/rsa/identity_key -C 'LDAP' 
LDAP daemon is running. 

When nagios runs the service check, the same 'LDAP daemon is running.' 
message appears in the status information, but the status is nevertheless 
marked as hard  'UNKNOWN'. 

Here's checkcommands.cfg: 

define command{ 
       command_name    check-cp-ldap 
       command_line    $USER1$/check_by_ssh -H $HOSTADDRESS$ -l remoteuser -i /path/to/rsa/identity_key -C 'LDAP' 
} 

This is the status log entry: 
[1104169159] SERVICE;soluminis1;LDAP;UNKNOWN;3/3;HARD;1104169025;1104169205;ACTIVE;1;1;1;1104157840;0;UNKNOWN;3706;335959;419;530;1104165074;2;1;0;1;1;0;0.00;0;1;1;1;LDAP daemon is running. 


Nagios is running on RH Linux, and the target host is Solaris 9.  Nagios' 
SSH is OpenSSH and the Solaris box is running Sun_SSH_1.0, protocol 
versions 1.5/2.0.  The remote script worked correctly when monitoring 
services on a Solaris 8 box which was running SSH Secure Shell 
(non-commercial license).  Check_by_ssh is from nagios-plugins 
1.4.0alpha1, version 1.18. 

What am I not seeing?  It's probably some bonehead maneuver on my part, 
but has anyone else run into (and found their way around) this? 

Scott Zak 
Connecticut State University System 

