Nagios and a Real time performance issue (for a huge undertaken)

Marc Powell marc at ena.com
Mon Nov 22 16:58:44 CET 2004



> -----Original Message-----
> From: nagios-users-admin at lists.sourceforge.net
> [mailto:nagios-users-admin at lists.sourceforge.net] On Behalf Of sushilkumar
> Sent: Monday, November 22, 2004 6:10 AM
> To: nagios-users at lists.sourceforge.net
> Subject: [Nagios-users] Nagios and a Real time performance issue (for a
> huge undertaken)
> 
> Hi nagios community,
> 
> I want to start a new thread about running Nagios in a huge network
> environment. If I use Nagios to monitor hundreds of hosts and thousands
> of services, don't you feel it will consume a large amount of network
> bandwidth and quickly flood our whole network backbone with check
> packets, along with the acknowledgement and output packets sent back by
> the different hosts and services?

It's hardly an issue. Monitoring 2748 services on 1750 hosts every 5
minutes + NSCA traffic reporting those checks to 2 central servers uses
about 700 kbits/s. The actual check portion of that is probably only
about 200-250 kbits/s. This will of course vary depending on the types
of checks you are doing and their frequency. For example, a simple ping
check every 5 minutes uses less bandwidth than an http check of a web
page every two minutes. We use approximately 75% ping checks, with the
remainder being NRPE, HTTP, SMTP, etc. If you're concerned about
bandwidth use, just make sure the checks you implement use the least
amount of bandwidth per check and/or check less often. IMHO, you should
use whatever checks and frequency give you the greatest level of comfort
that your network and devices are operating as they should be, and then
make sure you have the bandwidth to accommodate that.
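
If you want a rough estimate before deploying, the arithmetic is simple: average bandwidth is roughly (number of checks x bytes per check x 8) / check interval. Below is a minimal Python sketch of that calculation. The function name and the per-check byte counts are hypothetical placeholders, not measurements; only the split of 2748 services into roughly 75% ping checks comes from the figures above. Measure your own plugins' traffic to get real numbers.

```python
def avg_kbits_per_sec(num_checks, bytes_per_check, interval_seconds):
    """Average bandwidth in kbit/s, assuming checks are spread evenly
    across the check interval (no bursts)."""
    return num_checks * bytes_per_check * 8 / interval_seconds / 1000


# (label, number of checks, ASSUMED bytes on the wire per check, interval in s)
# The byte counts are illustrative guesses only.
check_mix = [
    ("ping", 2061, 1_000, 300),    # about 75% of 2748 services
    ("other", 687, 10_000, 300),   # NRPE, HTTP, SMTP, ...
]

total = sum(avg_kbits_per_sec(n, size, interval)
            for _, n, size, interval in check_mix)

for label, n, size, interval in check_mix:
    print(f"{label}: {avg_kbits_per_sec(n, size, interval):.1f} kbit/s")
print(f"total: {total:.1f} kbit/s")
```

Halving the interval doubles the average, and a heavier check type shifts the total far more than adding a few extra ping checks does.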

> 
> Again, how can we forget the refresh rate of 90 seconds (which itself
> means everything is going to be checked every 90 seconds)? Also, there
> is one strange feature of the Nagios service check: "when a particular
> service is not in the OK state at the first check, it will get 3
> chances".

The refresh rate doesn't have anything to do with the execution of
checks, only how often the results are redisplayed in the GUI. Check
scheduling is controlled by normal_check_interval and
retry_check_interval, together with max_check_attempts, and each can be
defined on a per-service basis. max_check_attempts is probably what you
are looking for with regard to your 'strange feature of nagios'.
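
To make the "3 chances" concrete: when a check first returns a non-OK result, the service is rechecked at retry_check_interval until max_check_attempts results have been collected, and only then is the problem treated as confirmed. The Python sketch below estimates how long that takes. The function name is hypothetical, and it assumes the intervals are counted in units of interval_length seconds (60 unless changed in the main config) and that every retry also fails.

```python
def seconds_to_confirmed_problem(max_check_attempts, retry_check_interval,
                                 interval_length=60):
    """Time from the first failed check until the last retry has run.

    The first failure counts as attempt 1, so only
    (max_check_attempts - 1) retries remain. Assumes every retry fails
    and that intervals are expressed in units of interval_length seconds.
    """
    retries = max_check_attempts - 1
    return retries * retry_check_interval * interval_length


# Example values only: 3 attempts with a 1-minute retry interval.
print(seconds_to_confirmed_problem(max_check_attempts=3, retry_check_interval=1))
```

Setting max_check_attempts to 1 removes the retries entirely, at the cost of alerting on every transient failure.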
 
> There is the concept of parallelized checks, but I think it has not
> been justified well enough analytically.
> Can anybody justify it statistically (analytically)?

Why should we? You can check these things yourself _and_ it will be more
applicable as only you know the types of checks and their frequency that
you want.

--
Marc


