nsca / distributed monitoring result problem

Chris Goosen cgoosen at jhb.artec.co.za
Mon Jan 16 11:19:35 CET 2006


Hello all,

I am running my Nagios central server on an HP 2.4 GHz machine with 512 MB of RAM.

At present, I am monitoring 65 hosts with approx. 400 services.

After a reboot, everything works perfectly, but the longer my server
runs, the more sluggish it gets, and eventually the nsca processes
consume all the memory and the server stops responding. I also start
getting hosts that are reported as down even though they respond
correctly to ping. The error says "Plugin timed out after 10 seconds".

Here is an example of what I mean:

Host State Information

Host Status:                  DOWN
Status Information:           CRITICAL - Plugin timed out after 10 seconds
Last Status Check:            01-16-2006 12:06:28
Status Data Age:              0d 0h 2m 57s
Last State Change:            01-16-2006 10:20:44
Current State Duration:       0d 1h 48m 41s
Last Host Notification:       01-16-2006 10:20:44
Current Notification Number:  2
Is This Host Flapping?        N/A

OK 01-16-2006 12:05:47 63d 19h 30m 59s 1/3 PING OK - Packet loss = 0%, RTA = 0.42 ms

I assume that these are related and that the lack of memory caused this
problem. Would an upgrade from Nagios 1.2 to Nagios 1.3 fix this? If
so, what is the best way to perform that upgrade?
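
One way to test the assumption that the nsca processes are what exhaust the memory would be to count them and total their resident memory at intervals while the server slows down. The following is only a minimal Python sketch (Python 3.7 or later), not part of the setup described here: the function names `summarize_nsca` and `current_nsca_usage` are hypothetical, and it assumes a `ps` that accepts `-e -o rss= -o comm=` (resident size in KB and command name, without header lines) and that the daemon's command name is `nsca`.

```python
import subprocess


def summarize_nsca(ps_output, name="nsca"):
    """Return (process_count, total_rss_kb) for processes named `name`.

    `ps_output` is text in the form printed by `ps -e -o rss= -o comm=`:
    one process per line, resident size in KB followed by the command name.
    """
    count = 0
    total_kb = 0
    for line in ps_output.splitlines():
        parts = line.split(None, 1)  # split into "rss" and "command name"
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        rss, comm = parts
        if comm.strip() == name and rss.isdigit():
            count += 1
            total_kb += int(rss)
    return count, total_kb


def current_nsca_usage():
    """Run ps once and summarize the nsca processes (assumes a Linux-style ps)."""
    out = subprocess.run(
        ["ps", "-e", "-o", "rss=", "-o", "comm="],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return summarize_nsca(out)
```

Calling `current_nsca_usage()` every few minutes, for example from cron, and logging the result should show whether the number of nsca processes or their total size keeps growing between reboots.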

 

My /etc/xinetd.d/nsca file:

# default: on
# description: NSCA
service nsca
{
    flags           = REUSE
    socket_type     = stream
    wait            = no
    user            = nagios
    group           = nagios
    server          = /usr/sbin/nsca
    server_args     = -c /home/e-smith/nagios/nsca.cfg --inetd
    cps             = 9000 30
    instances       = UNLIMITED
    log_on_failure  += USERID
    disable         = no
    only_from       = ip1, ip2, ip3, etc..
}

In nagios.cfg:

command_check_interval=-1

System info:

SME server 6.01 (2.4.20-18.7, i686)
Perl v5.6.1
Apache/1.3.27
Nagios 1.2

Any advice would be great... thanks.

Chris
