Nagios false positive returns critical but service is really OK

Bryn Smith bryn at aas.duke.edu
Mon Feb 12 22:49:22 CET 2007



Marc Powell wrote:
>   
>> -----Original Message-----
>> From: nagios-users-bounces at lists.sourceforge.net [mailto:nagios-users-
>> bounces at lists.sourceforge.net] On Behalf Of Bryn Smith
>> Sent: Monday, February 12, 2007 1:02 PM
>> To: Morris, Patrick
>> Cc: Nagios Users Mailinglist
>> Subject: Re: [Nagios-users] Nagios false positive returns critical but
>> service is really OK
>>
>>
>> Morris, Patrick wrote:
>>     
>>>>>> Hi all,
>>>>>> I've got a setup of CentOS 4.4 and nagios-2.3.1, and I have one
>>>>>> server that started reporting critical errors over the weekend. I
>>>>>> logged in and checked the server, and it was fine.  I restarted
>>>>>> nagios, and the problem state persisted,
>
> [chop]
>
>   
>> I did find it in /usr/local/nagios/var, which I guess I could have
>> expected.
>> This is all it says: [1171170000] CURRENT SERVICE STATE:
>> aas4.aas.duke.edu;NonBackupLoad;CRITICAL;HARD;3;<A
>> href="https://bb.aas.duke.edu/nagios/mon_data/load/loadmsg152.3.56.13.txt">Cannot
>> Obtain Load Info, please check on 152.3.56.13
>
> [chop]
>
>   
>> returns an answer too.  It's like it just doesn't make it into the web
>> interface, even though it does for every other machine in that service
>> definition (and, of course, worked for this one until yesterday).
>>     
>
> But it is ;). The error message "Cannot Obtain Load Info, please check
> on 152.3.56.13" is not a nagios error message. It's almost certainly
> coming directly from your home-grown check scripts. It would appear to
> me that some condition is occurring during the check that your
> home-grown script can't deal with so it generates a general failure
> message for nagios to display. When you're testing the script during
> this failure time, are you doing so as the nagios user and exactly as
> nagios is calling it? When this was occurring, was the nagios Last Check
> time reasonable and in line with your normal_check_interval? That will
> help determine if the check was being performed and you were seeing a
> consistent error reported by your plugin.
>
>   
Ok, I at least found the real problem. It wasn't that the script
worked; I had been testing the wrong script.  My real problem is with
snmpwalk, so I'm off to figure that out.
Thanks; all of your help let me track down the real problem.
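
For anyone hitting the same thing, the advice quoted above (run the check
exactly as nagios calls it and look at what comes back) can be sketched
roughly as below. This is only an illustration, not the script from this
thread: the helper name run_check and the 30-second timeout are
assumptions. The exit-code mapping (0 OK, 1 WARNING, 2 CRITICAL,
3 UNKNOWN) is the usual plugin convention.

```python
import subprocess
import sys

# Conventional Nagios plugin exit codes; any other code is treated as UNKNOWN.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_check(argv, timeout=30):
    """Run a check command (hypothetical helper) and return (state, first output line)."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired) as exc:
        # The command could not be started or did not finish in time.
        return "UNKNOWN", f"could not run check: {exc}"
    lines = proc.stdout.strip().splitlines()
    first_line = lines[0] if lines else ""
    return STATES.get(proc.returncode, "UNKNOWN"), first_line


# Only acts when a command is given, e.g.: python run_check.py COMMAND [ARGS...]
if __name__ == "__main__" and len(sys.argv) > 1:
    state, message = run_check(sys.argv[1:])
    print(f"{state}: {message}")
```

Running this as the nagios user, with the exact command line from the
command definition, makes it easier to tell whether you are testing the
same script that nagios is actually calling.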
-- 
Bryn Smith (Ms) A&SIST 660-2434 jabber IM: bryn at jabber.duke.edu