Distributed Monitoring woes and performance issues.

Jason Rojas jrojas at shopzilla.com
Tue Nov 8 17:15:17 CET 2005


Here is a good one for you guys.
I am currently monitoring roughly 4357 services on 700 hosts, and that 
is not yet all of the hosts/services I need to monitor.
According to the output of nagios -s -c nagios.cfg, one complete run 
checking all of the configured services/hosts will take roughly 885 
seconds (about 14.75 minutes).
That's bad.
Correct my math if I am wrong. It can take up to 14 minutes for nagios 
to get around to checking HOSTX. OK, that's kind of bad. Now say HOSTX 
dies right after nagios checks it: it can then take up to an additional 
14 minutes for nagios to notice that it is down.
That gives me a huge response time on downed machines, which is bad in 
my case, since I am not even monitoring everything yet.
I have tested my config over and over, and it seems that no matter what 
I do nagios is just not going to cut it.
I could raise the check interval from every 5 minutes to every 15, but 
that still gives me latency issues.
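
The arithmetic behind these numbers can be checked with a short Python 
sketch. It takes the 885-second estimate and the 4357-service count 
from this message at face value. The function names are made up for 
illustration, and the projection assumes the cycle time grows linearly 
with the number of checks, which may not hold in practice.

```python
# Figures quoted in the message above.
SERVICES = 4357
CYCLE_SECONDS = 885  # estimated time for one full pass over all checks


def cycle_minutes(cycle_seconds: float) -> float:
    """Convert a full-cycle estimate from seconds to minutes."""
    return cycle_seconds / 60


def seconds_per_check(cycle_seconds: float, checks: int) -> float:
    """Average share of the cycle spent on each check."""
    return cycle_seconds / checks


def fits_interval(cycle_seconds: float, interval_minutes: float) -> bool:
    """True if one full cycle completes within the check interval."""
    return cycle_seconds <= interval_minutes * 60


def projected_cycle_seconds(cycle_seconds: float,
                            current_checks: int,
                            future_checks: int) -> float:
    """Estimate the cycle time for a larger number of checks.

    ASSUMPTION: cycle time scales linearly with the check count.
    """
    return cycle_seconds * future_checks / current_checks


if __name__ == "__main__":
    print("cycle length (min):", cycle_minutes(CYCLE_SECONDS))
    print("avg seconds per check:",
          round(seconds_per_check(CYCLE_SECONDS, SERVICES), 3))
    print("fits a 5 minute interval:", fits_interval(CYCLE_SECONDS, 5))
    print("fits a 15 minute interval:", fits_interval(CYCLE_SECONDS, 15))
```

If a host dies just after it has been checked, the failure is not seen 
until the next pass reaches it, so under these assumptions the worst 
case is roughly one full cycle. A 15 minute interval (900 seconds) only 
just covers the 885-second cycle, which leaves almost no headroom for 
the hosts and services that still have to be added.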
When I realized this, I decided that a distributed setup would be the 
way to go, since my company is deploying multiple co-locations. My 
master server stores its data in a MySQL DB. The problem with going 
distributed is that the remote (scan-only) nodes cannot send data back 
via NSCA, because nagios is pulling its data from the DB, so some hosts 
don't show up, etc.
So I went ahead and modified the distributed nodes to write directly to 
the DB. Not a good idea: there were so many inserts going on that the 
database became useless and the web interface took forever to load.

Does anyone have any ideas for a solution to this besides an enterprise 
grade monitoring system?


-Jason Rojas



_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
::: Messages without supporting info will risk being sent to /dev/null




