"UNKNOWN" problem, with a return code of 0 in Solaris 9.

Patrick Walentiny Patrick.Walentiny at stellent.com
Thu Dec 11 17:37:31 CET 2003


I looked through a lot of the mailing list archives and didn't see this
problem reported by anyone else.  It's a bit long-winded, but I figure
being verbose is better than not being verbose enough.  So here goes.
            We are using Nagios to monitor multiple UNIX systems,
including Solaris 2.6, 8, and 9.  We authenticate with public keys
saved in the "authorized_keys" file in the home directory on each
Nagios client.  When our server runs "check_by_ssh" against Solaris 9
clients, the Nagios server reports a status of "UNKNOWN".  I dug into
the documentation and found that the status is determined by the
plugin's return code, i.e. 0, 1, 2, 3, etc.  So I ran check_by_ssh by
hand against the system in question to see for myself what the problem
was.  Run that way, I only ever get a return code of 0, "OK"; the syntax
I used is shown below.  I can also get return codes of 1 and 2 if I
intentionally trigger a warning or critical condition.  But when the
check is run by the Nagios process itself, it shows up as "UNKNOWN" and
occasionally flaps to OK for brief periods, even though the status
output shows the disks are perfectly okay.
            Here is the output from my tests:
 
 
[...]
 
$ /usr/lib/nagios/plugins/check_by_ssh -H 12.40.185.175 -C \
    '/opt/nagios/libexec/check_disk -c 10% -w 20%'
$ echo $?
0
$
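
For reference, the return codes mentioned above follow the usual plugin
convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).  A minimal sketch of
that mapping is below; the function name state_name is made up for
illustration and is not part of Nagios or its plugins.

```shell
#!/bin/sh
# Map a plugin exit code to the conventional Nagios state name.
# state_name is a hypothetical helper, not something shipped with Nagios.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        # Anything else is outside the documented 0-3 range.
        *) echo "OUT_OF_RANGE" ;;
    esac
}
```

Calling it as `state_name $?` right after a manual check_by_ssh run would
print the state name that corresponds to the exit code just seen.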
 
[...]
 
I even had this running in a continual loop to see if maybe 1 out of 10
runs would go into an UNKNOWN state, but it doesn't.  I will paste the
portions of my config that should matter for this output.  I really
appreciate any help you guys can give me.
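
A loop of the kind described above could look roughly like the sketch
below.  It is not the exact loop that was run; the helper name
count_failures and its argument layout (repeat count first, then the
command to run) are assumptions made for illustration.

```shell
#!/bin/sh
# Run a command a fixed number of times and count the non-zero exits.
# count_failures is a hypothetical helper: $1 is the number of runs,
# the remaining arguments are the command and its arguments.
count_failures() {
    runs=$1
    shift
    failures=0
    i=1
    while [ "$i" -le "$runs" ]; do
        # Discard plugin output; only the exit code matters here.
        "$@" >/dev/null 2>&1
        rc=$?
        if [ "$rc" -ne 0 ]; then
            failures=$((failures + 1))
            # Report each failing run on stderr so stdout stays a bare count.
            echo "run $i: exit code $rc" >&2
        fi
        i=$((i + 1))
    done
    echo "$failures"
}
```

For example, `count_failures 10 /usr/lib/nagios/plugins/check_by_ssh -H
12.40.185.175 -C '/opt/nagios/libexec/check_disk -c 10% -w 20%'` would
print how many of the ten runs returned something other than 0.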
 
/*
 *  Command Definition
 ********************************/
# 'check_remote_disk' command definition
define command {
        command_name    check_remote_disk
        command_line    $USER1$/check_by_ssh -H $HOSTADDRESS$ -C '/opt/nagios/libexec/check_disk -c 10% -w 20%'
        }
 
/*
 *  Host Definition
 ********************************/
 
define host {
        use                     generic-host
        host_name               gondor
        alias                   Minneapolis Production Webserver (gondor)
        address                 12.40.185.175
        check_command           check-host-alive
        max_check_attempts      10
        notification_interval   10
        notification_period     24x7
        notification_options    d,u,r
        parents                 mspfw1
        }
 
 
 
Thanks again for any help.  If you need any more output than this, let
me know.
 
Patrick.