Echo nagios status?

Paul L. Allen pla at softflare.com
Fri Apr 2 01:28:12 CEST 2004


Marc Powell writes: 

> On Wednesday, March 31, 2004 12:21 PM, Paul L. Allen shared with us:
>> As I recall if the check goes stale then Nagios will try to perform
>> the active check that you defined.  This is useful if you want
>> distributed monitoring to take the load off the central monitor but
>> want the central machine to take over if one of your distributed
>> monitors goes down.    
> 
> Presuming freshness checking is enabled...

Moot point.  If it's not enabled you'll never know if your passive
checks have stopped coming in.  So if you're doing passive checks
without freshness checking you have a sub-optimal configuration. 

> Also from the documentation on freshness checking "It is important to
> note that an active service check which is being forced because the 
> service was detected as being "stale" gets executed even if active
> service checks are disabled on a program-wide or service-specific
> basis."

I'd forgotten about that one.  When I replied (as now) it was late at
night because I'd spent most of the day dealing with important issues.
I'm now working through my mailbox dealing with the issues of lesser
importance through a haze of red wine.  You do need to set up a dummy
active check in some circumstances, IST(vaguely)R. 
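
For what it's worth, here is a rough sketch of what such a dummy check
could look like.  A plugin only has to print one line of text and exit
with a status code (0 OK, 1 warning, 2 critical, 3 unknown).  The
function names, the default state and the message text below are made
up for illustration; none of this ships with Nagios.

```python
import sys

# Standard plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}

# Hypothetical default message; word it however suits your site.
DEFAULT_DETAIL = "no passive result received within the freshness threshold"


def stale_result(state=CRITICAL, detail=DEFAULT_DETAIL):
    """Return (exit_code, status_line) for the dummy check."""
    if state not in STATE_NAMES:
        state = UNKNOWN  # anything outside 0-3 is reported as unknown
    return state, f"{STATE_NAMES[state]}: {detail}"


def main(argv):
    """Print the status line and return the exit code to use.

    Optional first argument: the numeric state to report
    (assumption for this sketch: CRITICAL when omitted).
    """
    try:
        state = int(argv[1]) if len(argv) > 1 else CRITICAL
    except ValueError:
        state = UNKNOWN
    code, line = stale_result(state)
    print(line)
    return code
```

Ending the script with sys.exit(main(sys.argv)) and naming it in the
service's check_command would make a forced freshness check say in
plain words that the passive result is stale, instead of reporting a
failure from a check that could never have worked.  Passing 3 as the
argument would report the state as unknown rather than critical.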

>> If the reason you have a machine submitting passive checks is because
>> an intervening firewall won't let you do active checks then you don't
>> want active checks enabled unless you want misleading error messages.  
> 
> This seems to me to be irrelevant. He already recognizes that in his
> situation, active checks from Nagios aren't acceptable to him and the
> desired solution is the same either way.

The distinction is whether you see a critical error that tells you
the results are stale or a critical error that tells you that
such-and-such a check failed.  If the passive check is stale you get
told it is stale, which points you in one direction.  But if an active
check gets run and fails (because there's no way it could work) you
get led in another direction.  In an ideal world you remember every
detail of your configs and understand that if the active check gets
run it will necessarily fail.  In the real world you get woken at an
ungodly hour by an alert and start trying to figure things out with
excess blood in your caffeinestream.  Also you have a PHB screaming
at you that an active check has failed when really a passive check has
gone stale. 

> If you were to see 'misleading error messages' because of this then I 
> would say the installation wasn't thought through or implemented very 
> well.

You do have to understand that Nagios is designed to run active checks
when passive results go stale, so that failover monitoring is possible,
and that you need to take additional steps if active checks are not
possible. 
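
To make that behaviour concrete, here is a toy model of the decision
as I understand it from the documentation quoted above.  It is not
Nagios's actual code; the Service class, its field names and the
strings returned are invented for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    last_result_time: float      # when the last check result arrived (epoch seconds)
    freshness_threshold: float   # maximum acceptable age of a result, in seconds
    check_freshness: bool = True


def is_stale(service, now):
    """A result is stale once it is older than the freshness threshold."""
    if not service.check_freshness:
        # Without freshness checking, missing passive results go unnoticed.
        return False
    return now - service.last_result_time > service.freshness_threshold


def next_action(service, now):
    """Decide what the central monitor does for this service (toy model)."""
    if is_stale(service, now):
        # Per the quoted documentation, this happens even when active
        # checks are disabled program-wide or for this service.
        return "force active check"
    return "wait for passive result"
```

If the forced active check cannot possibly succeed, for example
because a firewall is in the way, the "force active check" branch is
where the misleading failure comes from.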

However, I still think Nagios is wrong in reporting stale checks as
critical.  This is a transport issue and I believe the correct
interpretation is that the service state is unknown.  But that's another
issue. 

-- 
Paul Allen
Softflare Support 
