Nagios 3 distributed monitoring and NSCA

Marc Powell marc at ena.com
Wed Sep 10 22:27:59 CEST 2008


On Sep 10, 2008, at 2:45 PM, Jonathan Call wrote:

> In Nagios 2.x the Obsessive Compulsive Service Processor (OCSP) is
> not very robust. Even with a few hundred service checks the OCSP stuff
> on the distributed servers bogs down and does not send anything out.
> This forced people like me to use tools like OCP_daemon.

I have to disagree with this as a general statement. I've used Nagios
2.x (currently 2.9), sending/receiving thousands of passive results
every 5 minutes, successfully for years. My 'largest' data collector
(not dedicated to Nagios) has all checks completed, or in progress,
within the 5 minute interval --

Total Services:                       2198
Services Checked:                     2198
Services Scheduled:                   2198
Active Service Checks:                2198
Passive Service Checks:               0
Total Service State Change:           0.000 / 6.250 / 0.011 %
Active Service Latency:               38.626 / 68.765 / 59.834 sec
Active Service Execution Time:        0.064 / 60.015 / 0.679 sec
Active Service State Change:          0.000 / 6.250 / 0.011 %
Active Services Last 1/5/15/60 min:   377 / 1804 / 2198 / 2198
Passive Service State Change:         0.000 / 0.000 / 0.000 %
Passive Services Last 1/5/15/60 min:  0 / 0 / 0 / 0
Services Ok/Warn/Unk/Crit:            2189 / 0 / 0 / 9
Services Flapping:                    0
Services In Downtime:                 0
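
For anyone following along who hasn't set this up: each result on a collector like the one above leaves through the ocsp_command, which usually hands a tab-delimited line (host, service description, return code, plugin output) to send_nsca on stdin. A rough sketch of that hand-off in Python follows. The function names, the example host and the file paths are made up for illustration; only the line format and the -H / -c options come from send_nsca itself, and this is not necessarily how my own collectors are wired.

```python
import subprocess

VALID_CODES = (0, 1, 2, 3)  # OK, WARNING, CRITICAL, UNKNOWN


def _flatten(value):
    # A tab or newline inside a field would corrupt the record.
    return str(value).replace("\t", " ").replace("\n", " ")


def format_passive_result(host, service, return_code, output):
    """Build one line in the tab-delimited form send_nsca reads on stdin."""
    if return_code not in VALID_CODES:
        raise ValueError("return code must be 0, 1, 2 or 3")
    fields = [_flatten(host), _flatten(service), str(return_code), _flatten(output)]
    return "\t".join(fields) + "\n"


def submit(lines, central_host,
           send_nsca="/usr/local/nagios/bin/send_nsca",       # assumed path
           config="/usr/local/nagios/etc/send_nsca.cfg"):     # assumed path
    """Send a batch of formatted lines with a single send_nsca invocation."""
    payload = "".join(lines).encode()
    subprocess.run([send_nsca, "-H", central_host, "-c", config],
                   input=payload, check=True)


if __name__ == "__main__":
    print(repr(format_passive_result("web01", "HTTP", 0, "HTTP OK - 200")))
```

Batching several lines into one send_nsca call, as submit() allows, is the same idea tools like OCP_daemon use to avoid forking a process per result.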

One of my central receivers (2.9) --

Total Services:                       6137
Services Checked:                     6136
Services Scheduled:                   26
Active Service Checks:                28
Passive Service Checks:               6109
Total Service State Change:           0.000 / 17.960 / 0.034 %
Active Service Latency:               0.000 / 4.686 / 0.346 sec
Active Service Execution Time:        0.000 / 2.529 / 0.444 sec
Active Service State Change:          0.000 / 11.970 / 0.428 %
Active Services Last 1/5/15/60 min:   3 / 3 / 26 / 26
Passive Service State Change:         0.000 / 17.960 / 0.033 %
Passive Services Last 1/5/15/60 min:  1104 / 5680 / 6107 / 6107
Services Ok/Warn/Unk/Crit:            6107 / 1 / 0 / 29
Services Flapping:                    0
Services In Downtime:                 0
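
To put those counters in terms of sustained rate, here is the arithmetic as a small Python sketch. The helper names are mine, and the inputs are just the "Last 5 min" figures copied from the two nagiostats outputs above.

```python
def per_second(count, minutes):
    """Average results per second over a window of the given length."""
    return count / (minutes * 60.0)


def percent_of(part, total):
    """Share of `total` represented by `part`, as a percentage."""
    return 100.0 * part / total


# Collector: active services checked in the last 5 minutes, out of 2198.
collector_rate = per_second(1804, 5)
collector_share = percent_of(1804, 2198)

# Central receiver: passive results in the last 5 minutes, out of 6109.
receiver_rate = per_second(5680, 5)
receiver_share = percent_of(5680, 6109)

print("collector: %.1f checks/sec, %.1f%% in 5 min" % (collector_rate, collector_share))
print("receiver:  %.1f results/sec, %.1f%% in 5 min" % (receiver_rate, receiver_share))
```

That works out to roughly 6 checks per second on the collector and roughly 19 passive results per second on the receiver.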

One of my central receivers is still running nagios-1.3, with a  
database backend, and even it can keep up --

Passive Checks:

Time Frame             Checks Completed
<= 1 minute:           628 (10.3%)
<= 5 minutes:          5191 (85.0%)
<= 15 minutes:         6105 (100.0%)
<= 1 hour:             6105 (100.0%)
Since program start:   6108 (100.0%)

> Has the OCSP infrastructure improved in Nagios 3? I need it to be
> robust enough to handle ~2500 service checks.

I'm doing nearly that now with nagios-2.9.

--
Marc



_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
::: Messages without supporting info will risk being sent to /dev/null




