Infrastructure help!

Mathew Walker lmw94002 at hotmail.com
Mon Jun 22 16:21:40 CEST 2009


A lot depends on the type of checks, the check frequency, and the timeout values.

 

For example, we have a LOT of WebInject (Perl script) tests, which can take 20-40 seconds to complete.  With a retry_interval of 1 minute, when we have some "issues" we can quickly end up with many checks running simultaneously (some even hanging, waiting for the 60-second timeout).  This can quickly eat up memory and may cause Nagios to crash.
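
To put rough numbers on that pile-up, here is a small Python sketch based on Little's law (average checks in flight = start rate x run time). The check count, the normal check interval and the per-process memory figure are made-up assumptions for illustration, not measurements; only the 1-minute retry, the ~30-second typical run time and the 60-second timeout come from the description above.

```python
def concurrent_checks(num_checks, interval_s, duration_s):
    """Little's law: average checks in flight = (checks started per second) * (seconds each runs)."""
    return num_checks * duration_s / interval_s

# Hypothetical: 100 webinject tests, normally run every 5 minutes, ~30 s each.
healthy = concurrent_checks(100, 300, 30)

# During "issues": retried every 60 s, each hanging until the 60 s timeout.
degraded = concurrent_checks(100, 60, 60)

# Assumed (not measured) memory footprint of one Perl check process, in MB.
ASSUMED_MB_PER_CHECK = 20
print(f"healthy:  {healthy:.0f} checks in flight")
print(f"degraded: {degraded:.0f} checks in flight, "
      f"~{degraded * ASSUMED_MB_PER_CHECK:.0f} MB")
```

With these assumed inputs the number of simultaneous checks grows tenfold once everything is retrying and timing out, which is the kind of jump that exhausts memory on a small box.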

 

In my situation, I have several small VPS systems that do external monitoring, testing and development.  They also check that Nagios is running on our core monitoring box and page me to death if that check fails.  :-)


On a reasonably spec'd server, I'm sure you should be able to handle it.
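
As a rough sanity check, the average check rate can be estimated from the service counts quoted below. This is only a sketch: the 5-minute check interval is an assumption (neither post states one), and the function name is made up for illustration.

```python
def checks_per_second(num_services, check_interval_s):
    """Average service checks the scheduler must start per second."""
    return num_services / check_interval_s

# Assumption: every service is checked once every 5 minutes.
ASSUMED_INTERVAL_S = 300

planned = checks_per_second(6000, ASSUMED_INTERVAL_S)     # the ~6000 services planned
reference = checks_per_second(13000, ASSUMED_INTERVAL_S)  # the ~13000 on the single server quoted below

print(f"planned: {planned:.1f} checks/s, reference: {reference:.1f} checks/s")
```

Under that assumption the planned setup needs less than half the check rate of the single server described in the quoted reply, provided the individual checks are not much heavier.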


-- 
Mat W. - http://www.techadre.com


 
> From: t.h.amundsen at usit.uio.no
> To: harald.boehmecke at bertelsmann.de
> Date: Mon, 22 Jun 2009 13:44:41 +0200
> CC: nagios-users at lists.sourceforge.net
> Subject: Re: [Nagios-users] Infrastructure help!
> 
> Harald Böhmecke <harald.boehmecke at bertelsmann.de> writes:
> 
> > We are currently about to fully engage with Nagios.
> >
> > Our current VM, which has approx. 1200 services and 200 hosts, will be
> > deleted and a new "distributed monitoring" will be set up.
> >
> > Thing is, we need to monitor approx. 500 hosts with approx. 6000
> > services.
> >
> > Almost all Servers are in a single location. So there should be no
> > need for Nagios "probes" distributed on big locations...
> >
> > Has anyone had any experience with Nagios with this amount of
> > hosts/services to be monitored? I need basic guidance regarding
> > the hardware and distribution model to be used.
> 
> We currently have ~850 hosts and ~13000 service checks running on a
> single, standalone server. The hardware is nothing fancy, a plain HP
> ProLiant DL360 G5 with 8GB of RAM. You should be fine with a single
> server, but there are a few things to look out for:
> 
> * The 'enable_environment_macros' option should be set to 0. Enabling
> this config option seriously increased the load on our server.
> 
> * PNP4Nagios and similar addons may significantly increase the load on
> your server. We collect perfdata and use PNP4Nagios for only a few
> select services.
> 
> * The Nagios embedded perl interpreter (ePN) was a major PITA, it
> leaked memory and we ended up compiling Nagios without ePN
> completely.
> 
> * You may have to adjust the command_check_interval and similar config
> settings.
> 
> * We mostly use NRPE plugins and only monitor UNIX/Linux hosts. You
> may have to take care not to use plugins that consume a lot of
> resources on your Nagios server.
> 
> These are the only tuning tips I can think of for now, but YMMV :)
> 
> We don't use distributed monitoring, but are looking into it. When a
> more streamlined clustered Nagios setup becomes available for production
> use (probably with the Merlin backend), we will definitely be
> interested.
> 
> Cheers,
> -- 
> Trond Hasle Amundsen <t.h.amundsen at usit.uio.no>
> Center for Information Technology Services, University of Oslo
> 
> ------------------------------------------------------------------------------
> Are you an open source citizen? Join us for the Open Source Bridge conference!
> Portland, OR, June 17-19. Two days of sessions, one day of unconference: $250.
> Need another reason to go? 24-hour hacker lounge. Register today!
> http://ad.doubleclick.net/clk;215844324;13503038;v?http://opensourcebridge.org
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
> ::: Messages without supporting info will risk being sent to /dev/null


