(No output!) Errors in Nagios 2.4

Andy Shellam andy.shellam-lists at mailnetwork.co.uk
Sun Aug 27 16:48:37 CEST 2006


Hugo,

I didn't think it relevant to post full details of hosts/services, as 
sometimes the commands work and sometimes they don't.
It's not a problem with command syntax or with a specific host or service - 
it's a global thing.  If I run the commands manually they work fine, as 
shown below.

Here's an example of a failing PostgreSQL service: "(Return code of 127 
is out of bounds - plugin may be missing)"

Run it manually:
> su -c '/usr/local/nagios/libexec/check_pgsql -H <HOST_IP> -P 5432 -d 
> <DB_NAME> -l <LOGIN_USER> -w 30 -c 60' - nagios
>  OK - database <DB_NAME> (0 
> sec.)|time=0.000000s;30.000000;60.000000;0.000000
If I force the service to re-poll for an active check then that error 
will clear and come up OK, but then another service will fail.  
Currently I've got 3 failures on services that are actually up and working.
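
For reference, a minimal sketch of why 127 is reported as "out of bounds": Nagios plugins are expected to exit with 0 to 3, and 127 is the status a POSIX shell returns when it cannot find or start the command. The helper names and the non-existent path below are hypothetical and used only for illustration; this does not establish what is causing the 127 on this system.

```python
import subprocess

# Standard Nagios plugin return codes; anything else is outside the expected range.
NAGIOS_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def classify(return_code):
    """Map a plugin exit status to a Nagios state name (hypothetical helper)."""
    return NAGIOS_STATES.get(return_code, "OUT_OF_BOUNDS")


def shell_status(command_line):
    """Run a command line through sh and return its exit status."""
    result = subprocess.run(
        ["sh", "-c", command_line],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


if __name__ == "__main__":
    # Assumption: this path does not exist, so the shell cannot start it.
    status = shell_status("/nonexistent/libexec/check_pgsql -H 127.0.0.1")
    print(status, classify(status))
```

A plugin that runs and exits normally would land in the 0 to 3 range, so a 127 points at the command never starting rather than at the check itself failing.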

Take another example - the SSH service on the Nagios machine - currently 
reading "CRITICAL - Server answer:".  The flapping state is "Percent 
State Change:72.70%", which suggests the service is going up and down 
at random; however, the machine and the SSH service are working fine.

The command for this is:

define command {
    command_name        Check_SSH
    command_line        /usr/local/nagios/libexec/check_ssh -H $ARG1$ -p 3322
}

And the service definition:

define service {
    host_name            Perth,Sydney-1
    use                Service_Template
    service_description             Encrypted Remote Access - SSH
    check_command            Check_SSH!$HOSTADDRESS$
}
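
To make the relationship between the two definitions concrete, here is a simplified sketch of how the check_command value is split on "!" and substituted into the command_line template. The function name and the example address 192.0.2.10 are assumptions for illustration; real Nagios macro processing handles many more macros and edge cases than this.

```python
def expand_command(check_command, command_line, host_macros):
    """Build the final command line from a check_command and a command_line template."""
    # "Check_SSH!$HOSTADDRESS$" -> name "Check_SSH", arguments ["$HOSTADDRESS$"]
    name, *args = check_command.split("!")

    # Resolve host macros inside each argument first.
    resolved = []
    for arg in args:
        for macro, value in host_macros.items():
            arg = arg.replace(macro, value)
        resolved.append(arg)

    # Then substitute $ARG1$, $ARG2$, ... into the command_line template.
    for index, arg in enumerate(resolved, start=1):
        command_line = command_line.replace(f"$ARG{index}$", arg)
    return name, command_line


if __name__ == "__main__":
    name, line = expand_command(
        "Check_SSH!$HOSTADDRESS$",
        "/usr/local/nagios/libexec/check_ssh -H $ARG1$ -p 3322",
        {"$HOSTADDRESS$": "192.0.2.10"},  # assumed example address
    )
    print(name, "->", line)
```

The resulting string is what would be run by hand, as the nagios user, to compare against what Nagios reports.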

And the same goes for an HTTP service which is reading "(No status!)": 
running its check manually works fine:
> [root at dns zones]# su -c '/usr/local/nagios/libexec/check_http -H 
> www.andyshellam.eu -N -p 80 -A "Nagios/2.4/dns.mailnetwork.co.uk" -f 
> follow -w 30 -c 60 -t 120' - nagios
> HTTP OK HTTP/1.1 200 OK - 1023 bytes in 0.006 seconds 
> |time=0.006282s;30.000000;60.000000;0.000000 size=1023B;;;0
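
Since the failures are intermittent, a single manual run may simply miss them. A rough sketch of repeating a check and tallying its exit statuses follows; the helper name is hypothetical, and running it by hand does not reproduce the environment Nagios itself runs checks in.

```python
import subprocess
from collections import Counter


def tally_exit_codes(argv, runs):
    """Run argv `runs` times and count how often each exit status occurs."""
    counts = Counter()
    for _ in range(runs):
        result = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        counts[result.returncode] += 1
    return counts


if __name__ == "__main__":
    # Stand-in command; substitute the plugin's argv list when run as the nagios user.
    print(tally_exit_codes(["sh", "-c", "exit 0"], 5))
```

Any status other than 0 in the tally would show that the manual runs can fail too, which would narrow down where the problem lies.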

Andy.

Hugo van der Kooij wrote:
> On Sun, 27 Aug 2006, Andy Shellam wrote:
>
>   
>> I've been using Nagios for around 5 months now with no problems.  I've
>> recently added a new server onto my network, which has added somewhere
>> in the region of another 3 hosts and 12 services onto Nagios.
>>
>> Since then I now keep getting random errors in the "Status Information"
>> for services only.
>>
>> For example I've got a HTTP monitor which monitors
>> http://photos.andyshellam.eu:80, and this has started saying "Name or
>> service not known" or "(No output!)" and labelled with either an OK or
>> CRITICAL state (when the site is actually OK.)
>>     
>
> I think you could improve the likelihood of getting help by providing:
>  - host definition (+ template if needed)
>  - service definition (+ template if needed)
>  - checkcommand definition
>  - Results of check command as user nagios from the commandline
>
> Hugo.
>
>   

