Adaptive Features in 2.0

Omeara, Randy randy.omeara at lmco.com
Thu May 1 18:15:33 CEST 2003


Thank you all for your responses, and sorry for my delayed reply.

>>>>
From: Ethan Galstad [mailto:nagios at nagios.org]
I'd be more inclined to improve retention support across restarts 
than add the ability to add/remove objects during runtime.  The 
overhead of doing so (consistency checking) doesn't make sense, 
especially when Nagios is designed to do that when it (re)starts.  
Retention in 2.0 has been improved (e.g. flap detection is now 
retained), so I'd be inclined to focus efforts there.  Restarting 
Nagios with a SIGHUP shouldn't take more than a few seconds and the 
only real thing that's lost is scheduling information (which is 
recalculated at startup).
<<<<

Improved retention would also be a big plus. I arrived at my initial request
while thinking about deployments of Nagios with many thousands of objects,
where users are able to manage their own objects. I'm thinking about
efficiency and scalability here...

(The following sizes are arbitrary. Scale them up or down as you wish to
make them believable in your experience.)

So, picture (say) 1,000 groups where each group has 4 members (people), 5
hosts, and 5 services per host. We're talking 30,000 host+service objects
(5,000 hosts plus 25,000 services) and (at least) 1,000 contacts, 1,000
contact groups, 1,000 hostgroups, etc. Let's just round it out to 35,000
objects, about 25,000 of which are being actively monitored.

For each change, Nagios has to be run with the verification option to check
for errors, and once the change is validated, Nagios must be sent a SIGHUP
to restart with the new change. In the best case, assume that the UI used to
enter the object changes never allows an unacceptable change to reach the
object validation phase of Nagios. Even then, for each change, Nagios goes
through its object check twice: once to verify and once on restart. That is,
70,000 objects are parsed and checked per change.
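
A minimal sketch of that verify-then-restart cycle, to make the two passes
explicit. It assumes the verification run is "nagios -v <config>" and that
it exits with a non-zero status when it finds errors. The function name, the
config path and the PID are hypothetical, and the run/kill parameters exist
only so the logic can be exercised without a real Nagios install.

```python
import os
import signal
import subprocess


def verify_then_reload(config_path, nagios_pid, nagios_bin="nagios",
                       run=subprocess.run, kill=os.kill):
    """Verify the config, then SIGHUP the daemon only if verification passed.

    Returns True if the SIGHUP was sent, False if verification failed.
    """
    # Pass 1: the verification run parses and checks every object.
    result = run([nagios_bin, "-v", config_path])
    if result.returncode != 0:
        # Assumption: non-zero exit means errors; leave the daemon untouched.
        return False
    # Pass 2: on SIGHUP the running daemon re-reads and re-checks everything.
    kill(nagios_pid, signal.SIGHUP)
    return True
```

Every accepted change costs both passes over the full object set, no matter
how small the change itself is.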

Now, assume that objects are being defined and modified at a rate of
10/minute (might be high, but not extremely so). Over a one-minute period,
Nagios then validates and reloads 700,000 objects.

I can't say how much hardware and processing power would be required to run
this type of site right now. I only know that, with the changes I proposed,
the number of objects checked to do the same work would be reduced by a
factor of 70,000 (700,000 object checks versus 10).
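
The figures above can be re-derived with a few lines of arithmetic. This is
only a sketch of the estimate as stated in this message; the variable names
are mine, and the 35,000 total is the rounded figure, not a computed one.

```python
# Sizes from the example scenario.
groups = 1_000
hosts_per_group = 5
services_per_host = 5

hosts = groups * hosts_per_group          # hosts across all groups
services = hosts * services_per_host      # services across all hosts
host_service_objects = hosts + services   # the "30,000" figure

total_objects = 35_000      # rounded total, including contacts, groups, etc.
passes_per_change = 2       # one verification pass plus one restart pass
changes_per_minute = 10     # assumed rate of object changes

objects_per_change = total_objects * passes_per_change
objects_per_minute = objects_per_change * changes_per_minute

# With runtime add/remove, each change would touch one object instead.
reduction_factor = objects_per_minute // changes_per_minute

print(host_service_objects, objects_per_change,
      objects_per_minute, reduction_factor)
```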

Worthwhile?

Randy

