R: services check stale

Andreas Ericsson ae at op5.se
Fri Nov 26 13:31:20 CET 2004


Marco Borsani wrote:
> -}Marco Borsani wrote:
> -}> Hi,
> -}>
> -}> I have about 150 hosts and 400 services checked via Nagios.
> -}>
> -}> Sometimes many services "are stale by XXX seconds". At the
> -}> same time one check_host_alive stops working properly and goes
> -}> into a CRITICAL state, but I can reach/ping the host correctly!
> -}>
> -}
> -}Are you pinging it from the server Nagios is running on? This sort of
> -}thing often happens when the network load skyrockets for a short period
> -}of time, bringing a router or switch to its knees, from which it takes a
> -}while to recuperate.
> 
> Yes, I ping that host at the same moment Nagios does, and from the
> server Nagios is running on. I don't understand how it is possible to
> receive a "Socket timeout" from the Nagios check_ping while pinging
> succeeds from the command line!
> 

ICMP messages don't get "Socket Timeout". Are you absolutely sure you're 
not using a TCP based host check? What's your host check_command, and 
what does that command object look like?
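
For illustration, this is roughly what the two kinds of host check look like as command objects. It is only a sketch: the ICMP variant is similar to the stock sample configuration, while the name check-host-alive-tcp and port 80 are made up for the example and are not taken from your setup.

```
# ICMP-based host check (similar to the stock sample configuration)
define command{
        command_name    check-host-alive
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 1
        }

# TCP-based host check (hypothetical name and port, for comparison only)
define command{
        command_name    check-host-alive-tcp
        command_line    $USER1$/check_tcp -H $HOSTADDRESS$ -p 80
        }
```

If the check_command on the affected hosts resolves to something like the second definition, a timeout on the TCP connection could make the host check fail while a manual ping still succeeds.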

> 
> -}> I don't know if the two situations are related. Could someone
> -}> suggest which parameters I can modify to:
> -}> 1) reduce the "stale" situations
> -}
> -}Increase the freshness_threshold
> -}
> -}> 2) reach/ping the hosts always correctly
> -}>
> -}
> -}Make sure your network is intact and make ICMP a prioritized protocol on
> -}all your network equipment as well as your servers. If the problem is
> -}related to the "offending" server being temporarily overloaded it might
> -}not be fixable by any other means than new hardware.
> -}
> -}You can try raising the max_check_attempts value for the host object. It
> -}might help you avoid the problem if it's only temporary spikes.
> 
> Raising the max_check_attempts value can be done for a set of known
> hosts... but I have this problem with different hosts, often not the same
> one. I'd like to solve the problem more generally.
> 


-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Lead Developer






