Nagios Scalability Issues

Andrew Tjang andrew.tjang at ask.com
Tue Jun 5 02:40:27 CEST 2007


Hello everyone,

I think we are facing a scalability issue in Nagios.

We are currently monitoring approximately 11,000 services spread out
over a thousand or so hosts.

Everything is done with passive checks; no active checks are run.

We have a cron job that feeds external service check results into the
nagios.cmd file. This is a nightmare, as Nagios does not finish
processing one set of passive service checks before the next set
comes in. This leads to the forking of many Nagios processes that
never finish.

In an attempt to fix this situation, we have broken up monitoring into
smaller chunks, each with its own Nagios daemon (all integrated into
one GUI). This has stopped the fork-bomb effect described above from
occurring. We have divided the logical partitions into groups of 1000,
3000, and 4000 services. The multiple groups of 1000 run fine with the
cron job feeding service checks at a rate of one batch every 5 minutes.
However, with the 3K/4K instances, we must lengthen the interval
between batches to more than 15 minutes to avoid the fork-bombing
problem.

My questions are these:
1) Is this scalability problem normal?
2) Is there a way to fix this scalability problem?
3) Is there anything we can do to increase the frequency of the checks?
	- One idea is to spread the actual service checks out to give
Nagios time to process them: rather than 1K at a time, perhaps
give a few hundred, sleep a bit, then give a few more, and so on.
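
To make the idea in 3) concrete, here is a rough sketch of what the
feeder might look like. It is not what our cron job does today. The
command file path, the batch size, the pause length, and all function
names are assumptions made up for illustration; only the
PROCESS_SERVICE_CHECK_RESULT line format is the standard external
command syntax.

```python
import time

# Assumed location; the real path depends on how Nagios was installed.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"


def format_result(host, service, return_code, output, now=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line."""
    ts = int(time.time() if now is None else now)
    return (f"[{ts}] PROCESS_SERVICE_CHECK_RESULT;"
            f"{host};{service};{return_code};{output}\n")


def chunks(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def submit_in_batches(results, write, batch_size=200, pause=5.0,
                      sleep=time.sleep):
    """Write results a batch at a time, sleeping between batches.

    `results` is a list of tuples accepted by format_result().
    `write` is any callable taking a string, so it can be swapped
    for a dry run. Returns the number of batches written.
    """
    batches = 0
    for batch in chunks(results, batch_size):
        if batches:
            sleep(pause)  # give Nagios time to drain the previous batch
        write("".join(format_result(*r) for r in batch))
        batches += 1
    return batches


def write_to_command_file(text, path=COMMAND_FILE):
    """Hypothetical writer: push one batch into the command pipe."""
    with open(path, "w") as pipe:
        pipe.write(text)
```

A cron run would then call something like
submit_in_batches(results, write_to_command_file), with the batch size
and pause tuned so that the whole run still finishes inside the cron
interval.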

Thanks in advance for all your input.
-Andrew

_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
::: Messages without supporting info will risk being sent to /dev/null




