Designing distributed and failover architecture

shadih rahman shadhin71 at gmail.com
Fri Apr 16 21:52:30 CEST 2010


I have not implemented distributed Nagios, but I do have a failover setup.
A few comments:

1) If you are using NDOUtils as the backend, make sure you have looked at
the options for speeding up startup.  There is a patch for faster startup
in Opsview.

2) If you are using NSCA to transfer acknowledgements and comments, make
sure you do some research on that first.  Scalability is a problem.

3) Make sure you understand the failover architecture properly.  I made a
few mistakes when it came to NDOUtils.  The ndomod data processing option
on the failover server should be different from the one on the master.
Your master should push configuration definitions into the database, but
your failover should not push these configuration definitions into the
database.

4) Keep in mind that scalability will be a problem.  Data retention is a
really difficult problem for large setups.
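
To illustrate point 3, here is a minimal Python sketch of how the two
values for data_processing_options in ndomod.cfg could be derived.  The
bit-flag values are assumptions taken from my recollection of the NDOUtils
ndomod.h header, and processing_mask() is a hypothetical helper, so check
the numbers against the header shipped with your version before using
them.

```python
# Bit flags for ndomod's data_processing_options, as recalled from the
# NDOUtils ndomod.h header (assumption: verify against your version).
NDOMOD_PROCESS_OBJECT_CONFIG_DATA = 262144
NDOMOD_PROCESS_MAIN_CONFIG_DATA = 524288
NDOMOD_PROCESS_EVERYTHING = 67108863


def processing_mask(push_config: bool) -> int:
    """Return a data_processing_options value for ndomod.cfg (hypothetical helper)."""
    mask = NDOMOD_PROCESS_EVERYTHING
    if not push_config:
        # Clear the two configuration bits so this node never writes
        # object or main config definitions to the database.
        mask &= ~(NDOMOD_PROCESS_OBJECT_CONFIG_DATA
                  | NDOMOD_PROCESS_MAIN_CONFIG_DATA)
    return mask


master_mask = processing_mask(push_config=True)
failover_mask = processing_mask(push_config=False)

print(f"master:   data_processing_options={master_mask}")
print(f"failover: data_processing_options={failover_mask}")
```

Under these assumptions the master keeps every bit set, while the failover
drops only the two configuration bits and still processes status and check
data.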




On Sat, Apr 10, 2010 at 10:44 AM, Shanti Katta <shantikatta at gmail.com> wrote:

> Need suggestions on designing a distributed and failover Nagios monitoring
> infrastructure for ~1500 Linux RHEL hosts spread across 2 datacenters and
> DMZ networks.
> Reading through different archives, it appears DNX is the most preferred
> method for distribution/cluster setup and having a secondary Nagios server
> as a fail over option managed via Linux HA/DRBD.
> What are some of the cons in following setup:
>
> - Primary and secondary(failover) Nagios servers managed by Linux
> HA/DRBD/cron etc. Have MySQL replication between them.
> - Primary Nagios server performing active checks via N DNX worker nodes in
> both datacenters.
> - Primary Nagios server monitoring DMZ hosts via NRPE (Custom regular
> expression for services).
>
> Thanks
>
>
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when
> reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null
>



-- 
Cordially,
Shadhin Rahman

