Last ditch effort

Voon, Ton Ton.Voon at egg.com
Thu Mar 25 08:27:11 CET 2004


I sympathise. In our company, I was getting a lot of these timeouts. Running a
continuous snoop on our Nagios server against a randomly chosen target showed
that the Nagios server had made a check_nrpe request, but the target did not
respond. The Nagios server then sent a follow-up packet after 10 seconds, to
which the target responded almost immediately.

I don't know where this 10 second network retry setting is held (running on
Solaris 5.6), but I am blaming sporadic network problems, which
unfortunately are outside of my remit. I have set as many plugins as I can to
time out after 30 seconds instead, so that there are fewer failures.
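
For illustration only, here is a minimal Python sketch of what a longer
plugin-side timeout amounts to. It is not how Nagios or check_nrpe is
implemented. The helper name run_check and the 30 second default are
assumptions for this example; the exit codes follow the usual plugin
convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), and the timeout message
mimics the one quoted in the original post.

```python
import subprocess
import sys

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def run_check(command, timeout_seconds=30):
    """Run a check command (hypothetical helper); return (exit_code, message).

    If the command has not finished within timeout_seconds, it is killed
    and the result is reported as CRITICAL.
    """
    try:
        result = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        return CRITICAL, f"CRITICAL - Plugin timed out after {timeout_seconds} seconds"
    except OSError as exc:
        # The command could not be started at all (missing binary, permissions).
        return UNKNOWN, f"UNKNOWN - could not run check: {exc}"

    # Plugins report their status on the first line of standard output.
    output = result.stdout.strip()
    first_line = output.splitlines()[0] if output else ""
    return result.returncode, first_line


if __name__ == "__main__":
    # The check command and its arguments are taken from the command line.
    code, message = run_check(sys.argv[1:], timeout_seconds=30)
    print(message)
    sys.exit(code)
```

With a 30 second limit, a check that stalls for about 10 seconds on a lost
packet and then gets an answer still completes, instead of being reported as
a timeout.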

Ton

-----Original Message-----
From: Aaron Levitt [mailto:alevitt at navis.com] 
Sent: Wednesday, March 24, 2004 4:22 PM
To: nagios-users at lists.sourceforge.net
Subject: [Nagios-users] Last ditch effort


Greets everyone-

I posted this question before, but never got to a solution.  I wanted to
post one last time in the hope of finding a solution before I am forced
to go back to netsaint.

I have recently upgraded to nagios 1.1 from an older release of netsaint
after a long time of faithful service.  Since the upgrade, we have been
getting some random timeouts, with an average of 1 or 2 a day.  All the
information I really have to go on is the output from nagios.  The mail
contains "Info: CRITICAL - Plugin timed out after 10 seconds".  The logs
have the same information, but nothing more helpful.  I'm not sure where
it's getting the 10 seconds from.  Initially I thought it was nrpe
timing out, but it seems to affect random services and hosts as well.

So far, I have changed max_concurrent_checks and various timeout values
in nagios.cfg, as well as max_check_attempts and normal_check_interval,
to make sure there wasn't too much going on at the same time (which
really shouldn't matter since nagios is only monitoring about 60 hosts).
I poked through the source code but couldn't find anything with a
10 second timeout.

Currently nagios is running on its own box, with no other services
running on it.  It's a 2.4.20 kernel on Red Hat 9, and the hardware is a
PIII 800 MHz with 384 MB of RAM.  Nothing very special, but that
should be enough, I would think.

All of these false alarms are making nagios completely unreliable, but
it has so many new enhancements that I would really like to continue to
use it.  If anyone has any suggestions, please send them my way.  Any
help would be greatly appreciated.

Thanks in advance.

-=Aaron


-------------------------------------------------------
This SF.Net email is sponsored by: IBM Linux Tutorials
Free Linux tutorial presented by Daniel Robbins, President and CEO of
GenToo technologies. Learn everything from fundamentals to system
administration. http://ads.osdn.com/?ad_id=1470&alloc_id=3638&op=click
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting
any issue. 
::: Messages without supporting info will risk being sent to /dev/null


This private and confidential e-mail has been sent to you by Egg.
The Egg group of companies includes Egg Banking plc
(registered no. 2999842), Egg Financial Products Ltd (registered
no. 3319027) and Egg Investments Ltd (registered no. 3403963) which
is authorised and regulated by the Financial Services Authority. Egg
Investments Ltd. is entered in the FSA register under number 190518. 

Registered in England and Wales. Registered offices: 1 Waterhouse
Square, 138-142 Holborn, London EC1N 2NA.

If you are not the intended recipient of this e-mail and have received
it in error, please notify the sender by replying with 'received in
error' as the subject and then delete it from your mailbox.


