[PATCH] common/macros.c:2185:grab_standard_servicegroup_macro() speed up & Service check execution problem report

Andreas Ericsson ae at op5.se
Tue Jan 4 13:51:13 CET 2011


On 01/04/2011 12:40 PM, Stephane LAPIE wrote:
> On 01/04/2011 07:54 PM, Andreas Ericsson wrote:
>> On 01/04/2011 11:25 AM, Stephane LAPIE wrote:
>>> On 01/04/2011 06:38 PM, Andreas Ericsson wrote:
>>>> http://www.op5.org/community/plugin-inventory/op5-projects/merlin
>>>> http://git.op5.org/git/?p=nagios/merlin.git;a=blob;f=HOWTO;hb=master
>>>> http://git.op5.org/git/?p=nagios/merlin.git;a=blob;f=README;hb=HEAD
>>>>
>>>> Make especially sure you read the first paragraph of the README.
>>>
>>> Oh, I see.
>>>
>>> I have been working with Nagios 3.2.0 since around Nov 2009 for our
>>> monitoring setup, so I went and implemented my own thing using ssh keys,
>>> a control shell script with specific commands, tuned configurations to
>>> limit duplication of files, and enhancements to NSCA client/server to
>>> make them workable, and such.
>>>
>>
>> Sounds like you made the detour more pleasant, whereas Merlin cuts a
>> new path. If your way works to satisfaction, I guess that should keep
>> on working for you.
> 
> To be fair, my way is still very "static" and sub-optimal (it requires a
> lot of care to keep a configuration the specific way I made it).
> 
> Also, it is not so much a cluster, as a "master node holding the GUI and
> doing no checks", and "slave nodes doing all the checking for their
> assigned servers".
> 
> This means there is no real redundancy should the master blow up (then
> its status information would be lost, unless I go through the trouble of
> rsync'ing it or holding it on NFS or something for a standby master,
> which introduces its own share of troubles).
> 
> (No redundancy of information on the slaves is not so important since
> they are only here to send the latest information to the master node.)
> 
> So, while I have some level of "distribution" and "static load
> balancing", this setup can't do automagic load balancing intuitively,
> and it can't hope to provide complete redundancy, as it is. :)
> 

Sounds like you'd want to set up Merlin with a bunch of peers then. That
way you get complete loadbalancing and redundancy, all in one go.

> 
>>> I never tried to touch the DB side of things with a vanilla Nagios base,
>>> because it wouldn't be proper to handle that as a side-hack, and would
>>> most likely kill any measure of performance.
>>>
>>
>> You needn't bother with the db parts of Merlin if you don't want to,
>> but its distributed/loadbalanced nature has some perks that you just
>> can't get without an eventbroker module (such as command forwarding
>> and automagic loadbalancing).
> 
> Actually, I'd really gladly welcome having a database (with its own
> solid redundancy system) to keep the monitoring data, if the Nagios GUI
> cgi scripts could use it, which is not the case as I understand it.
> 

They can't, but Ninja can.
http://www.op5.org/community/plugin-inventory/op5-projects/ninja

The two are built to cooperate nicely.

>> You're welcome. Let me know how it pans out. We've had some troubles
>> on *BSD systems in the past, but they should all be ironed out by now.
>> I have limited testing capabilities though, so feedback is most welcome.
> 
> I would be strongly inclined to say it is solid on OpenBSD, in the 3.2.0
> incarnation at least. The core process can keep on running for months on
> end without a hitch.
> 

I meant troubles with Merlin, not with Nagios. Hopefully it's sorted out
now though. My single BSD system at least plays nicely enough with its
Linux playmates in the lab, but the troubles we had were BSD <-> BSD,
which is really odd.

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.
