multiple nagios monitoring that have to agree?

Dirk H. Schulz dirk.schulz at kinzesberg.de
Thu Mar 30 15:06:41 CEST 2006


Hi folks,

I would prefer a completely different approach:

Set up both Nagios servers independently and have both of them send their
notifications to an email address that hands each mail over to a small
script. All this script does is check whether the same notification has
already arrived from the other host and, if so, delete the duplicate.
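
To make the idea concrete, here is a minimal sketch of such a filter in Python. It is only an illustration under assumptions that are not part of the proposal itself: the class name, the (host, service, state) key, the 300-second window and the in-memory dictionary are all hypothetical. A real script invoked once per mail would have to keep this state on disk and parse the key out of the message.

```python
import time


class NotificationFilter:
    """Drop a notification if the other Nagios host already sent the same one."""

    def __init__(self, window_seconds=300):
        # window_seconds is an assumed value, not something Nagios defines
        self.window = window_seconds
        self.seen = {}  # (host, service, state) -> (source, timestamp)

    def should_deliver(self, source, host, service, state, now=None):
        now = time.time() if now is None else now
        # Forget entries that are older than the window.
        self.seen = {
            key: value
            for key, value in self.seen.items()
            if now - value[1] <= self.window
        }
        key = (host, service, state)
        previous = self.seen.get(key)
        if previous is not None and previous[0] != source:
            # The other server reported this event first: drop the duplicate.
            return False
        self.seen[key] = (source, now)
        return True
```

Keying on the source as well as the event means that a repeat notification from the same server still gets through; only the copy from the other server is dropped.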

The advantage of this approach is that you can add further functionality
to the script over time (e.g. filtering out the mass notifications from
Nagios that can occur even if you use topological dependencies). You also
do not have to mess around with layers of service definitions, and you can
even use rsync to copy config changes from one Nagios instance to the other.

Any arguments for or against this approach?

Dirk

John P. Rouillard wrote:

>In message <A7B0A9F02975A74A845FE85D0B95B8FA03468E18 at misex01.ena.com>,
>"Marc Powell" writes:
>  
>
>>>From: On Behalf Of John P. Rouillard
>>>Sent: Wednesday, March 29, 2006 4:45 PM
>>>In message <A7B0A9F02975A74A845FE85D0B95B8FA03468E0E at misex01.ena.com>,
>>>"Marc Powell" writes:
>>>      
>>>
>>>>>-----Original Message-----
>>>>>From: On Behalf Of Philip Hallstrom
>>>>>Sent: Wednesday, March 29, 2006 3:54 PM
>>>>>I'm wondering if two nagios instances can be set up to monitor the
>>>>>same hosts/services and have to agree with each other before
>>>>>sending a notification?
>>>>>          
>>>>>
>>[chop]
>>
>>    
>>
>>>>For an off-the-cuff suggestion, if you used multiple retries and didn't
>>>>specifically require that both servers see the state as HARD you could
>>>>embed that logic in your notification script.
>>>>
>>>>- NagiosA always sends notifications.
>>>>        
>>>>
>>>If you have a redundant setup, only one server A or B would have to
>>>send notifications for the service B.
>>>
>>>      
>>>
>>I presume that you're referring to this from your previous e-mail --
>>"On both nagios 1 and 2 create service B that does notify (and poll)
>>that uses check_cluster to require that both be in error condition to
>>generate an error notification."
>>    
>>
>
>Correct.
>
>  
>
>>How would you prevent duplicate notifications? Nagios 1 wouldn't know
>>that Nagios 2 had already sent a notification and vice-versa unless you
>>kept track of that externally.
>>    
>>
>
>The site where it was set up originally had the second server as a
>backup notifier. If it lost connectivity to the primary server it
>switched on notifications.
>
>Later a separate SEC process on the second server monitored the
>primary's notifications and would release notifications queued up by
>the second nagios process (keyed by host, service, severity) if the
>notifications from the first and second didn't come through within 5
>minutes of each other. It worked and made sure that alerts weren't
>delayed more than 5 minutes, but frankly the original setup with the
>second server not notifying unless it lost heartbeat on the original
>server (or the original server detected it couldn't get pages out) had
>a lot fewer issues. Then again I didn't have to work there.
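
The hold-and-release behaviour described above can be sketched roughly as follows. This is plain Python rather than SEC rules, and the class and method names are hypothetical; only the (host, service, severity) key and the 5-minute limit come from the description.

```python
class BackupNotifier:
    """Queue the second server's alerts; release them if the primary stays silent."""

    def __init__(self, hold_seconds=300):
        self.hold = hold_seconds  # 5 minutes, as in the description above
        self.queued = {}          # key -> time the backup alert was queued
        self.primary_seen = {}    # key -> time of the primary's last alert

    def primary_notification(self, host, service, severity, now):
        key = (host, service, severity)
        self.primary_seen[key] = now
        # The primary got its alert out, so the queued copy is not needed.
        self.queued.pop(key, None)

    def secondary_notification(self, host, service, severity, now):
        key = (host, service, severity)
        seen = self.primary_seen.get(key)
        if seen is not None and now - seen <= self.hold:
            return  # primary already covered this event recently
        self.queued.setdefault(key, now)

    def release_due(self, now):
        """Return and forget the queued alerts that have waited the full hold time."""
        due = sorted(k for k, t in self.queued.items() if now - t >= self.hold)
        for key in due:
            del self.queued[key]
        return due
```

Calling release_due() periodically gives the "not delayed more than 5 minutes" property: an alert leaves the queue either when the primary confirms it or when the hold time runs out.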
>
>  
>
>>>>- ServiceX on HostY reaches hard state.
>>>>- NagiosA initiates notification for ServiceX on HostY
>>>>- Notification script searches status.log on NagiosB or performs HTTP
>>>>screen scrape on NagiosB to determine state of ServiceX on HostY as
>>>>seen from there.
>>>>- If NagiosB shows CRITICAL, send notification
>>>>- If only one shows critical do nothing(?)
>>>>- repeat at regular intervals in case NagiosB was slow to pick up the
>>>>state (or use the vice-versa logic to also send notifications from
>>>>NagiosB)
>>>>        
>>>>
>>>Neat idea, however you would need to handle the case where nagios B
>>>isn't properly updating the service (and therefore isn't providing
>>>valid data).
>>>      
>>>
>>Looking at Last Update should cover that scenario.
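
The decision logic of the cross-check discussed in this sub-thread, including the Last Update test, might look roughly like this. It is a sketch only: the function name, the return values and the 600-second max_age are assumptions, and how the peer's state is actually fetched (status.log or a screen scrape) is deliberately left out.

```python
def cross_check(local_state, peer_state, peer_last_update, now, max_age=600):
    """Decide what NagiosA's notification script should do.

    peer_state and peer_last_update are what NagiosB reports for the same
    service; max_age (seconds) is an assumed freshness limit.
    """
    if local_state != "CRITICAL":
        return "skip"        # nothing to confirm
    if now - peer_last_update > max_age:
        return "peer_stale"  # NagiosB is not updating, its state proves nothing
    if peer_state == "CRITICAL":
        return "send"        # both servers agree
    return "retry"           # only one sees the problem; check again later
```

What to do on "peer_stale" is a policy choice the thread does not settle; sending anyway is probably the safer default, since a silent NagiosB is itself a problem.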
>>    
>>
>
>True.
> 
>  
>
>>>>There are probably pitfalls but I think that's how I would approach it
>>>>at first.
>>>>        
>>>>
>>>Yeah. It's a bit dicey regardless of how you slice it.
>>>      
>>>
>>Agreed. Interesting problem though.
>>    
>>
>
>Yup, then again so is automatically rewriting the nagios config files
>and correcting the parent links so they can be used on a redundant
>host.
>
>				-- rouilj
>John Rouillard
>===========================================================================
>My employers don't acknowledge my existence much less my opinions.
>
>
>-------------------------------------------------------
>This SF.Net email is sponsored by xPML, a groundbreaking scripting language
>that extends applications into web and mobile media. Attend the live webcast
>and join the prime developer group breaking into this new coding territory!
>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642
>_______________________________________________
>Nagios-users mailing list
>Nagios-users at lists.sourceforge.net
>https://lists.sourceforge.net/lists/listinfo/nagios-users
>::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
>::: Messages without supporting info will risk being sent to /dev/null
>  
>







