Nagios 3 distributed monitoring and NSCA

Frederik Vanhee fvanhee at gmail.com
Wed Sep 24 20:19:48 CEST 2008


Marc Powell wrote:
> On Sep 10, 2008, at 2:45 PM, Jonathan Call wrote:
>
>   
>> In Nagios 2.x the Obsessive Compulsive Service Processor  
>> (OCSP) is
>> not very robust. Even with a few hundred service checks the OCSP stuff
>> on the distributed servers bogs down and does not send anything out.
>> This forced people like me to use tools like OCP_daemon.
>>     
>
> I have to disagree with this as a general statement. I've used Nagios  
> 2.x (currently .9), sending/receiving thousands of passive results  
> every 5 minutes successfully for years. My 'largest' data collector  
> (not dedicated to nagios) has all checks completed, or in progress, in  
> the 5 minute interval  --
>
> Total Services:                       2198
> Services Checked:                     2198
> Services Scheduled:                   2198
> Active Service Checks:                2198
> Passive Service Checks:               0
> Total Service State Change:           0.000 / 6.250 / 0.011 %
> Active Service Latency:               38.626 / 68.765 / 59.834 sec
> Active Service Execution Time:        0.064 / 60.015 / 0.679 sec
> Active Service State Change:          0.000 / 6.250 / 0.011 %
> Active Services Last 1/5/15/60 min:   377 / 1804 / 2198 / 2198
> Passive Service State Change:         0.000 / 0.000 / 0.000 %
> Passive Services Last 1/5/15/60 min:  0 / 0 / 0 / 0
> Services Ok/Warn/Unk/Crit:            2189 / 0 / 0 / 9
> Services Flapping:                    0
> Services In Downtime:                 0
>
> One of my central receivers (2.9) --
>
> Total Services:                       6137
> Services Checked:                     6136
> Services Scheduled:                   26
> Active Service Checks:                28
> Passive Service Checks:               6109
> Total Service State Change:           0.000 / 17.960 / 0.034 %
> Active Service Latency:               0.000 / 4.686 / 0.346 sec
> Active Service Execution Time:        0.000 / 2.529 / 0.444 sec
> Active Service State Change:          0.000 / 11.970 / 0.428 %
> Active Services Last 1/5/15/60 min:   3 / 3 / 26 / 26
> Passive Service State Change:         0.000 / 17.960 / 0.033 %
> Passive Services Last 1/5/15/60 min:  1104 / 5680 / 6107 / 6107
> Services Ok/Warn/Unk/Crit:            6107 / 1 / 0 / 29
> Services Flapping:                    0
> Services In Downtime:                 0
>
> One of my central receivers is still running nagios-1.3, with a  
> database backend, and even it can keep up --
>
> Passive Checks:
> 	
> Time Frame	Checks Completed
> <= 1 minute:	628 (10.3%)
> <= 5 minutes:	5191 (85.0%)
> <= 15 minutes:	6105 (100.0%)
> <= 1 hour:	6105 (100.0%)
> Since program start:  	6108 (100.0%)
>
>   
>> Has the OCSP infrastructure improved in Nagios 3? I need it to be  
>> robust
>> enough to handle ~2500 service checks.
>>     
>
> I'm doing nearly that now with nagios-2.9.
>
> --
> Marc
>
I agree with Marc: I use OCSP in a distributed setup with 8000 
passive services.
This has worked fine on Nagios 1.x, 2.x and 3.0.3.
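
As a rough illustration of what OCSP hands off for each service check,
here is a minimal Python sketch that builds the tab-delimited line
send_nsca reads on standard input (host, service description, return
code, plugin output). The function name, the sample host and service,
and the choice to map UNKNOWN to 3 are assumptions for illustration
only; none of it is taken from the setups described in this thread.

```python
# Sketch only: format one passive service check result for send_nsca.
# send_nsca expects: host<TAB>service<TAB>return_code<TAB>output<NEWLINE>

# Assumed mapping from the $SERVICESTATE$ macro text to a return code.
STATE_CODES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def format_passive_result(host, service, state, output):
    """Return one send_nsca input line for a service check result."""
    code = STATE_CODES[state]
    # Collapse tabs/newlines in the plugin output so they cannot be
    # mistaken for field or record separators.
    clean_output = " ".join(output.split())
    return f"{host}\t{service}\t{code}\t{clean_output}\n"


# Hypothetical example values.
line = format_passive_result("web01", "HTTP", "CRITICAL", "Connection refused")
print(repr(line))
```

An ocsp_command on a distributed server would typically pipe a line
like this into send_nsca, pointing it at the central server with -H
and at its config file with -c.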

Frederik

-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
::: Messages without supporting info will risk being sent to /dev/null




