High Availability

Robert Holman Robert.Holman at jeppesen.com
Mon May 13 19:20:51 CEST 2013


I would look into MK_Livestatus as a backend. This would allow you to "cluster" web frontends, and by simply replicating config files across nodes, you could have "cold" standby servers in case the backend(s) actually fail.
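
For anyone who has not used it: a Livestatus query is a few lines of text written to a socket, which is what makes it easy to put several web frontends in front of one backend. Below is a minimal sketch in Python. The socket path is an assumption (it is whatever was configured for the broker module), and the helper names are made up for illustration.

```python
import json
import socket

# Assumed socket path; it is whatever was given to the livestatus broker module.
LIVE_SOCKET = "/var/lib/nagios/rw/live"


def build_query(table, columns, filters=()):
    """Build a Livestatus (LQL) request asking for JSON output."""
    lines = ["GET " + table, "Columns: " + " ".join(columns)]
    lines += ["Filter: " + f for f in filters]
    lines.append("OutputFormat: json")
    # A blank line terminates the request.
    return "\n".join(lines) + "\n\n"


def parse_rows(raw, columns):
    """Turn the JSON list-of-lists reply into a list of dicts."""
    return [dict(zip(columns, row)) for row in json.loads(raw)]


def query(table, columns, filters=(), path=LIVE_SOCKET):
    """Send one query over the UNIX socket and return the parsed rows."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(path)
        s.sendall(build_query(table, columns, filters).encode())
        s.shutdown(socket.SHUT_WR)  # signal that the request is complete
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:
                break
            chunks.append(data)
    return parse_rows(b"".join(chunks).decode(), columns)
```

On a host where the socket exists, something like query("hosts", ["name", "state"], ["state = 1"]) would list the hosts currently down.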

Regards,
Rob


-----Original Message-----
From: nagios-users-request at lists.sourceforge.net [mailto:nagios-users-request at lists.sourceforge.net]
Sent: Saturday, May 11, 2013 5:31 AM
To: nagios-users at lists.sourceforge.net
Subject: Nagios-users Digest, Vol 84, Issue 3


Today's Topics:

   1. Re: Variables for determining time before first alert
      (Justin T Pryzby)
   2. High Availabilty with Nagios (Steve Shipway)
   3. Re: High Availabilty with Nagios
      (Supporto Tecnico - Crazy Network)
   4. Re: High Availabilty with Nagios (William Leibzon)
   5. Re: High Availabilty with Nagios (Edward St Pierre)
   6. Re: check_http with spaces problem (Claudio Kuenzler)
   7. Re: check_http with spaces problem (???????? ?????????)
   8. Re: check_http with spaces problem (Claudio Kuenzler)
   9. Re: High Availabilty with Nagios (Andrew Widdersheim)
  10. Re: High Availabilty with Nagios (frank)
  11. Re: High Availabilty with Nagios (Jim Winkle)
  12. Re: High Availabilty with Nagios (Andreas Ericsson)
  13. Re: High Availabilty with Nagios (Andreas Ericsson)
  14. Trying to figure out the PCRE expression for Nagiosgraph Map
      (Percy Kwong)
  15. Re: Trying to figure out the PCRE expression for Nagiosgraph
      Map (Claudio Kuenzler)
  16. Re: servicegroup overview not restricted for htaccess users
      (Jonas Meurer)
  17. Re: Trying to figure out the PCRE expression for Nagiosgraph
      Map (Percy Kwong)


----------------------------------------------------------------------

Message: 1
Date: Tue, 7 May 2013 22:14:17 -0700
From: Justin T Pryzby <justinp at norchemlab.com>
Subject: Re: [Nagios-users] Variables for determining time before
        first alert
To: nagios-users at lists.sourceforge.net
Message-ID: <20130508051417.GA28622 at norchemlab.com>
Content-Type: text/plain; charset=us-ascii

On Wed, May 08, 2013 at 12:33:19AM -0400, Alex wrote:
> > http://nagios.sourceforge.net/docs/3_0/objectdefinitions.html
>
> Thanks for your help. I've actually read quite a bit of that, and I'm
> still confused. It wasn't clear that max_check_attempts is the number
> of attempts that are made for each iteration, before another alert is

http://nagios.sourceforge.net/docs/3_0/notifications.html

max_check_attempts is the number of FAILED attempts (each made "retry_interval" after the previous failing attempt) before a service moves from a "soft" failure state to a "hard" failure state.  Notifications are sent once max_check_attempts have been made and the service is in a "hard" state.  Notifications are also sent when a hard-failing service is rechecked (at "check_interval") and at least notification_interval has passed since the last notification.
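
The arithmetic behind that can be sketched as follows. This is a rough illustration, not Nagios code. It assumes the default interval_length of 60 seconds (so all values are minutes), no first_notification_delay, and a notification_interval greater than zero. The function names are invented for the example.

```python
def minutes_until_hard_state(max_check_attempts, retry_interval):
    """Minutes between the first failed check and the hard state.

    The first failure counts as attempt 1; each further attempt happens
    retry_interval minutes after the previous one.
    """
    return (max_check_attempts - 1) * retry_interval


def notification_times(max_check_attempts, retry_interval,
                       notification_interval, count=3):
    """Earliest minutes (after the first failure) of the first `count` notifications."""
    first = minutes_until_hard_state(max_check_attempts, retry_interval)
    return [first + i * notification_interval for i in range(count)]
```

With max_check_attempts=3 and retry_interval=1, the first notification goes out about 2 minutes after the first failed check. Repeat notifications are only sent when a recheck happens, so the later values are lower bounds.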

Justin



------------------------------

Message: 2
Date: Thu, 9 May 2013 09:19:17 +0000
From: Steve Shipway <s.shipway at auckland.ac.nz>
Subject: [Nagios-users] High Availabilty with Nagios
To: "nagios-users at lists.sourceforge.net"
        <nagios-users at lists.sourceforge.net>
Message-ID:
        <7294716191A1E142B80615ED2C633BCA6830F61E at uxcn10-tdc02.UoA.auckland.ac.nz>

Content-Type: text/plain; charset="iso-8859-1"

Does anyone have an HA setup for Nagios that works?

I'm thinking of creating a NEB module that will link two Nagios setups, and replicate over all status changes, config changes, downtime, comments, etc etc and then set the 'standby' Nagios to be checks/notifications disabled when in standby mode, and enabled when in active mode.  Then put the two behind a failover load balancer (F5, Foundry or apache reverse proxy).

However this would be too much work if someone else has already found an equivalent solution.

I've looked at Merlin but it doesn't seem to do what I'm after (and the documentation is practically nonexistent - much the same as the NEB API documentation, in fact).  Mod_gearman lets me have redundant checks and replicate *active* checks, but not commands, downtime or passive checks.

Does anyone out there have a workable way to get an active/standby or active/active Nagios setup?  Would be interested in hearing all ideas...

Steve


Steve Shipway
University of Auckland ITS
UNIX Systems Design Lead
s.shipway at auckland.ac.nz
Ph: +64 9 373 7599 ext 86487


------------------------------

Message: 3
Date: Thu, 09 May 2013 11:50:02 +0200
From: Supporto Tecnico - Crazy Network <support at crazynetwork.it>
Subject: Re: [Nagios-users] High Availabilty with Nagios
To: nagios-users at lists.sourceforge.net
Message-ID: <518B714A.1040800 at crazynetwork.it>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

I would be interested too. I'm actually using merlind for this right now, but I would like to avoid, for example, double notifications if a server goes down. I do want both Nagios instances set to notify, since if one is down (for any reason) the other should be able to check and notify, and vice versa.

Regards


On 09/05/2013 11:19, Steve Shipway wrote:
> Does anyone have an HA setup for Nagios that works?
>
> I'm thinking of creating a NEB module that will link two Nagios
> setups, and replicate over all status changes, config changes,
> downtime, comments, etc etc and then set the 'standby' Nagios to be
> checks/notifications disabled when in standby mode, and enabled when
> in active mode.  Then put the two behind a failover load balancer (F5,
> Foundry or apache reverse proxy).
>
> However this would be too much work if someone else has already found
> an equivalent solution.
>
> I've looked at Merlin but it doesn't seem to do what I'm after (and
> the documentation is practically nonexistant - much the same as the
> NEB API documentation, in fact).  Mod_gearman lets me have redundant
> checks and replicate *active* checks, but not commands, downtime or
> passive checks.
>
> Does anyone out there have a workable way to get an active/standby or
> active/active Nagios setup?  Would be interested in hearing all ideas...
>
> Steve
>
>
> *Steve Shipway*
> University of Auckland ITS
> /UNIX Systems Design Lead/
> s.shipway at auckland.ac.nz <mailto:s.shipway at auckland.ac.nz>
> Ph: +64 9 373 7599 ext 86487
> //
>
>


--
Andrea Iannucci
----------------------------

----------------------------
Crazy Network di Iannucci Andrea
Viale G.B. Lulli, 24
00050 Cerveteri - RM
(w) www.crazynetwork.it
(e) andrea.iannucci at crazynetwork.it
(t) +39 06 62279876
(f) +39 06 62298767
(m) +39 338 8552885






------------------------------

Message: 4
Date: Thu, 9 May 2013 02:51:57 -0700
From: William Leibzon <william at leibzon.org>
Subject: Re: [Nagios-users] High Availabilty with Nagios
To: Nagios Users List <nagios-users at lists.sourceforge.net>
Message-ID:
        <CAFCy1BiXoic=Jcq+kh-jr_yBCWVEk2EPi6ZhZUTO00=7jBFZwA at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

On Thu, May 9, 2013 at 2:19 AM, Steve Shipway <s.shipway at auckland.ac.nz> wrote:
> Does anyone have an HA setup for Nagios that works?
>
> I'm thinking of creating a NEB module that will link two Nagios
> setups, and replicate over all status changes, config changes,
> downtime, comments, etc etc and then set the 'standby' Nagios to be
> checks/notifications disabled when in standby mode, and enabled when
> in active mode.  Then put the two behind a failover load balancer (F5, Foundry or apache reverse proxy).

I've thought several times of doing it but never actually got started, although I have it all planned out kind of like you.

In the meantime, my HA setup, which I've done for several customers, involves config synced using git or svn (a script run by cron checks whether there is something new and then restarts nagios if the config passes tests). Both servers do checks, but the config is such that on one server all notifications are disabled except for cross-checking of the other nagios. This is achieved by having a common template from which all services are derived; this template is in a file specific to each server, so one has notifications disabled and the other enabled.
This is not full HA, in that if one server dies you have to execute a script that enables notifications on the other server (this can be done automatically too, but I prefer people to do it).
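
The cron-driven sync step could look roughly like this. It is a sketch and not the actual script: the repository path, the location of nagios.cfg and the restart command are all assumptions, and it only covers the git case. The per-server template file that turns notifications on or off would be kept out of the synced repository.

```python
import subprocess

# All paths and commands below are assumptions for illustration.
CONFIG_REPO = "/etc/nagios"
VERIFY_CMD = ["nagios", "-v", "/etc/nagios/nagios.cfg"]
RESTART_CMD = ["service", "nagios", "restart"]


def run(cmd, cwd=None):
    """Run a command and return (exit code, stripped stdout)."""
    p = subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)
    return p.returncode, p.stdout.strip()


def sync_config(runner=run):
    """Pull config changes; restart Nagios only if something changed and verifies."""
    _, before = runner(["git", "rev-parse", "HEAD"], cwd=CONFIG_REPO)
    code, _ = runner(["git", "pull", "--ff-only"], cwd=CONFIG_REPO)
    if code != 0:
        return "pull-failed"
    _, after = runner(["git", "rev-parse", "HEAD"], cwd=CONFIG_REPO)
    if before == after:
        return "unchanged"
    code, _ = runner(VERIFY_CMD)
    if code != 0:
        return "invalid-config"  # leave the running daemon untouched
    code, _ = runner(RESTART_CMD)
    return "restarted" if code == 0 else "restart-failed"
```

Run from cron, a wrapper would call sync_config() and log or mail anything other than "unchanged" or "restarted". The runner argument is there only so the logic can be exercised without touching a real server.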

> However this would be too much work if someone else has already found
> an equivalent solution.
>
> I've looked at Merlin but it doesn't seem to do what I'm after (and
> the documentation is practically nonexistant - much the same as the
> NEB API documentation, in fact).  Mod_gearman lets me have redundant
> checks and replicate *active* checks, but not commands, downtime or passive checks.
>
> Does anyone out there have a workable way to get an active/standby or
> active/active Nagios setup?  Would be interested in hearing all ideas...
>
> Steve
>
>
> Steve Shipway
> University of Auckland ITS
> UNIX Systems Design Lead
> s.shipway at auckland.ac.nz
> Ph: +64 9 373 7599 ext 86487
>
>



------------------------------

Message: 5
Date: Thu, 9 May 2013 10:59:30 +0100
From: Edward St Pierre <edward.stpierre at gmail.com>
Subject: Re: [Nagios-users] High Availabilty with Nagios
To: Nagios Users List <nagios-users at lists.sourceforge.net>
Message-ID:
        <CAHryeXGwcWNougRAnS4A+Z27K5ephjRj3TaLfXtiR3Uaj==vVg at mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi,

I have done this before using DRBD for block-based replication and clustering on Red Hat; it could also be done with Pacemaker/Corosync clusters.

Ed


On 9 May 2013 10:51, William Leibzon <william at leibzon.org> wrote:

> On Thu, May 9, 2013 at 2:19 AM, Steve Shipway
> <s.shipway at auckland.ac.nz> wrote:
> > Does anyone have an HA setup for Nagios that works?
> >
> > I'm thinking of creating a NEB module that will link two Nagios
> > setups, and replicate over all status changes, config changes,
> > downtime, comments, etc etc and then set the 'standby' Nagios to be
> > checks/notifications disabled when in standby mode, and enabled when
> > in active mode.  Then put the two behind a failover load balancer
> > (F5, Foundry or apache reverse proxy).
>
> I've thought several times of doing it but never actually get started
> although I have it all planned out kinda like you.
>
> In the mean time my HA setup which I've done for several customers
> involves config synced using git or svn (script run by cron that
> checks if its something new and then restart nagios if config passes
> tests). Both servers doing checks but config is such that for one
> server all notifications are disabled except for cross-checking of the
> other nagios This is achieved by having common template from which all
> services are derived and this template is in a file specific to each
> server and so one has notifications disabled and the other enabled.
> This is not a full HA in a way that if one server dies you have to
> execute a script that would enable the other servers for notifications
> (this can be done automatically too but I prefer people to do it).
>
> > However this would be too much work if someone else has already
> > found an equivalent solution.
> >
> > I've looked at Merlin but it doesn't seem to do what I'm after (and
> > the documentation is practically nonexistant - much the same as the
> > NEB API documentation, in fact).  Mod_gearman lets me have redundant
> > checks and replicate *active* checks, but not commands, downtime or passive checks.
> >
> > Does anyone out there have a workable way to get an active/standby
> > or active/active Nagios setup?  Would be interested in hearing all ideas...
> >
> > Steve
> >
> >
> > Steve Shipway
> > University of Auckland ITS
> > UNIX Systems Design Lead
> > s.shipway at auckland.ac.nz
> > Ph: +64 9 373 7599 ext 86487
> >
> >
> >

------------------------------

Message: 6
Date: Thu, 9 May 2013 13:23:24 +0200
From: Claudio Kuenzler <ck at claudiokuenzler.com>
Subject: Re: [Nagios-users] check_http with spaces problem
To: Nagios Users List <nagios-users at lists.sourceforge.net>
Message-ID:
        <CAF-yqgj3Y=56Mag2FH4Lci4fT-j4wWczn4EspLMkptYpOq5bxQ at mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

> Sun May 5 22:29:03 EEST 2013 /usr/lib64/nagios/plugins/check_http -H
> granma.gr -u http://granma.gr/index.html -R "Web " -w 10 -c 20 Name or
> service not known HTTP CRITICAL - Unable to open TCP socket
>

You have to break up the -u argument. -u expects the path, not the complete URI. So in this case:

/usr/lib64/nagios/plugins/check_http -H granma.gr -u /index.html -R "Web" -w 10 -c 20
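
If the checks are generated from a list of full URLs, the split into host and path can be done mechanically. A small sketch in Python; the helper name is made up, and it only produces the -H and -u arguments discussed here:

```python
from urllib.parse import urlsplit


def check_http_args(url):
    """Split a full URL into the -H (host) and -u (path) arguments."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return ["-H", parts.hostname, "-u", path]
```

For http://granma.gr/index.html this gives -H granma.gr -u /index.html, which matches the command line above.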

------------------------------

Message: 7
Date: Thu, 9 May 2013 14:29:54 +0300
From: ???????? ????????? <dkokmadis at gmail.com>
Subject: Re: [Nagios-users] check_http with spaces problem
To: Nagios Users List <nagios-users at lists.sourceforge.net>
Message-ID:
        <CAFY9zEw92mX255_sA_5i+TLmxqxn0B=qQZ4-4udkwCsXvTPWBQ at mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Thank you for the answer,

The problem doesn't seem to be with the URL but with the -R option.

If I use -R "Web" the response is OK, but if I use -R "Web somethin" it returns an error!


2013/5/9 Claudio Kuenzler <ck at claudiokuenzler.com>

>
> Sun May 5 22:29:03 EEST 2013 /usr/lib64/nagios/plugins/check_http -H
>> granma.gr -u http://granma.gr/index.html -R "Web " -w 10 -c 20 Name
>> or service not known HTTP CRITICAL - Unable to open TCP socket
>>
>
> You have to break up the -u argument. -u expects the path, not the
> complete URI. So in this case:
>
> /usr/lib64/nagios/plugins/check_http -H granma.gr -u /index.html -R "Web"
> -w 10 -c 20
>
>
>

------------------------------

Message: 8
Date: Thu, 9 May 2013 13:45:09 +0200
From: Claudio Kuenzler <ck at claudiokuenzler.com>
Subject: Re: [Nagios-users] check_http with spaces problem
To: Nagios Users List <nagios-users at lists.sourceforge.net>
Message-ID:
        <CAF-yqggd5xXGv6QohgbLS7aG8W5BFKYQ=ep3gz4=sN9iisvEkw at mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

>
> If I use -R "Web" the response is ok but if i use -R "Web somethin" it
> returns error!
>

Because the pattern needs to exist in the source of the page that the server returns.

./check_http -H granma.gr -u /index.html -R "Web somethin"
HTTP CRITICAL: HTTP/1.1 200 OK - pattern not found - 4342 bytes in 0.126 second response time |time=0.125853s;;;0.000000 size=4342B;;;0

./check_http -H granma.gr -u /index.html -R "Web Design"
HTTP OK: HTTP/1.1 200 OK - 4342 bytes in 0.125 second response time |time=0.124846s;;;0.000000 size=4342B;;;0
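
The behaviour can be reproduced outside the plugin. The sketch below is only an analogue: check_http uses POSIX regular expressions while this uses Python's re module (for plain strings like these they behave the same), and the page content is made up for the example.

```python
import re


def pattern_found(pattern, page):
    """Rough analogue of -R: case-insensitive regex search anywhere in the page."""
    return re.search(pattern, page, re.IGNORECASE) is not None


# Made-up page content standing in for what the server returns.
page = "<html><head><title>Granma - Web Design</title></head></html>"

for pattern in ("Web", "Web Design", "Web somethin"):
    status = "OK" if pattern_found(pattern, page) else "CRITICAL - pattern not found"
    print(repr(pattern), "->", status)
```

The space in the pattern is not the issue; the whole string has to occur in the page.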




>
>
> 2013/5/9 Claudio Kuenzler <ck at claudiokuenzler.com>
>
>>
>> Sun May 5 22:29:03 EEST 2013 /usr/lib64/nagios/plugins/check_http -H
>>> granma.gr -u http://granma.gr/index.html -R "Web " -w 10 -c 20 Name
>>> or service not known HTTP CRITICAL - Unable to open TCP socket
>>>
>>
>> You have to break up the -u argument. -u expects the path, not the
>> complete URI. So in this case:
>>
>> /usr/lib64/nagios/plugins/check_http -H granma.gr -u /index.html -R
>> "Web" -w 10 -c 20
>>
>>
>>

------------------------------

Message: 9
Date: Thu, 9 May 2013 10:48:54 -0400
From: Andrew Widdersheim <awiddersheim at hotmail.com>
Subject: Re: [Nagios-users] High Availabilty with Nagios
To: Nagios Users List <nagios-users at lists.sourceforge.net>
Message-ID: <SNT143-W18B68E12BD581B3471CA95DDA40 at phx.gbl>
Content-Type: text/plain; charset="iso-8859-1"

I did a talk at last year's conference that touches on an HA Nagios setup using DRBD and Pacemaker. There were also talks about mod_gearman and Merlin that might be helpful. The slides (and maybe video?) are available on nagios.org. Here is a link to my slides:

http://www.slideshare.net/nagiosinc/andrew-widdersheim-nagiosisdownbosswantstosee-you


------------------------------

Message: 10
Date: Thu, 9 May 2013 11:33:53 -0500 (CDT)
From: frank <ratty at they.org>
Subject: Re: [Nagios-users] High Availabilty with Nagios
To: Nagios Users List <nagios-users at lists.sourceforge.net>
Message-ID: <alpine.LRH.2.03.1305091125050.5637 at they.org>
Content-Type: text/plain; charset="iso-8859-1"

While HA can be a great thing, I've always been of the opinion that a monitoring setup needs to have as few moving parts as possible. The more complexity in the monitor, the more chance you'll be chasing monitoring issues rather than site issues. And everything you add on top of the monitor also needs to be monitored. So somehow that F5 is going to need an out-of-band monitor, because if it dies your Nagios host may well have no way to contact you about it unless you've dual-homed it, which brings up a whole other set of issues.

The closest I got to HA at my last gig was creating a CNAME for the active Nagios host, so in a failover you point the CNAME to the new box and at least passive checks can still roll in (after the DNS timeout of course, which I say is better than reconfiguring every NSCA client).

-f

On Thu, 9 May 2013, Steve Shipway wrote:

> Date: Thu, 9 May 2013 09:19:17 +0000
> From: Steve Shipway <s.shipway at auckland.ac.nz>
> Reply-To: Nagios Users List <nagios-users at lists.sourceforge.net>
> To: "nagios-users at lists.sourceforge.net" <nagios-users at lists.sourceforge.net>
> Subject: [Nagios-users] High Availabilty with Nagios
>
> Does anyone have an HA setup for Nagios that works?
>
> I'm thinking of creating a NEB module that will link two Nagios setups, and replicate over all
> status changes, config changes, downtime, comments, etc etc and then set the 'standby' Nagios to
> be checks/notifications disabled when in standby mode, and enabled when in active mode.  Then
> put the two behind a failover load balancer (F5, Foundry or apache reverse proxy).
>
> However this would be too much work if someone else has already found an equivalent solution.
>
> I've looked at Merlin but it doesn't seem to do what I'm after (and the documentation is
> practically nonexistant - much the same as the NEB API documentation, in fact).  Mod_gearman
> lets me have redundant checks and replicate *active* checks, but not commands, downtime or
> passive checks.
>
> Does anyone out there have a workable way to get an active/standby or active/active Nagios
> setup?  Would be interested in hearing all ideas...
>
> Steve
>
>
> Steve Shipway
> University of Auckland ITS
> UNIX Systems Design Lead
> s.shipway at auckland.ac.nz
> Ph: +64 9 373 7599 ext 86487
>
>
>

------------------------------

Message: 11
Date: Thu, 09 May 2013 13:33:50 -0500
From: Jim Winkle <jrwinkle at wisc.edu>
Subject: Re: [Nagios-users] High Availabilty with Nagios
To: Nagios Users List <nagios-users at lists.sourceforge.net>
Message-ID: <7750eefa25b76.518ba5be at wiscmail.wisc.edu>
Content-Type: text/plain; CHARSET=US-ASCII

On 05/09/13, Steve Shipway  wrote:

> Does anyone have an HA setup for Nagios that works?
>
> I'm thinking of creating a NEB module that will link two Nagios setups, and replicate over all status changes, config changes, downtime, comments, etc etc and then set the 'standby' Nagios to be checks/notifications disabled when in standby mode, and enabled when in active mode. Then put the two behind a failover load balancer (F5, Foundry or apache reverse proxy).

We use rsync (run out of cron every minute) and a floating VIP between two hosts. Nagios is running on only one host at a time. It's a trivial (manual) process to switch between hosts.

Files which are synced: all Nagios files except logs and transient results. Files synced include Nagios configs, binaries and CGIs, helper apps, plugins, local plugins and NRPE configs, docs, HTML files, status files, all files in ~nagios, and the crontab for user nagios.
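
As a rough illustration of the sync step, the rsync invocation could be assembled like this. The paths, exclude patterns and standby host name are hypothetical and not taken from the setup described above; the rsync options themselves (-a, --relative, --delete, --exclude) are standard.

```python
# Hypothetical paths and patterns; adjust to the local install.
SYNC_PATHS = ["/etc/nagios", "/usr/lib64/nagios/plugins", "/var/spool/nagios"]
EXCLUDES = ["*.log", "checkresults/"]


def rsync_command(standby_host, paths=SYNC_PATHS, excludes=EXCLUDES):
    """Build the rsync invocation that pushes the Nagios tree to the standby."""
    # --relative keeps the full source paths under "/" on the standby.
    cmd = ["rsync", "-a", "--relative", "--delete"]
    cmd += ["--exclude=" + pattern for pattern in excludes]
    cmd += list(paths)
    cmd.append(standby_host + ":/")
    return cmd
```

A cron job running every minute would pass the result to subprocess.run(cmd, check=True) on whichever host currently holds the VIP.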

-- Jim



------------------------------

Message: 12
Date: Fri, 10 May 2013 10:57:28 +0200
From: Andreas Ericsson <ae at op5.se>
Subject: Re: [Nagios-users] High Availabilty with Nagios
To: support at crazynetwork.it,    Nagios Users List
        <nagios-users at lists.sourceforge.net>
Message-ID: <518CB678.1080909 at op5.se>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

On 2013-05-09 11:50, Supporto Tecnico - Crazy Network wrote:
> I would be interested too, i'm actually using merlind for this right
> now, but i would like to dont have for example double notifications if a
> server goes down.. and i do want both nagios set for notify, since if
> one is down (for any reason) the other one should be able to check and
> notify and vice-versa....
>

Double notifications are a bug, unless you send passive checkresults to
both masters, in which case it's by design. Usually people want to solve
passive checks by arranging a single target ip or hostname to send to
and then add peered nodes at that tier as necessary, so as to not have
to send checkresults to multiple nodes from all the monitored machines.

--
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.



------------------------------

Message: 13
Date: Fri, 10 May 2013 10:58:12 +0200
From: Andreas Ericsson <ae at op5.se>
Subject: Re: [Nagios-users] High Availabilty with Nagios
To: Nagios Users List <nagios-users at lists.sourceforge.net>
Message-ID: <518CB6A4.4040701 at op5.se>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

On 2013-05-09 11:19, Steve Shipway wrote:
> Does anyone have an HA setup for Nagios that works?
>
> I'm thinking of creating a NEB module that will link two Nagios
> setups, and replicate over all status changes, config changes,
> downtime, comments, etc etc and then set the 'standby' Nagios to be
> checks/notifications disabled when in standby mode, and enabled when
> in active mode.  Then put the two behind a failover load balancer
> (F5, Foundry or apache reverse proxy).
>
> However this would be too much work if someone else has already found
> an equivalent solution.
>
> I've looked at Merlin but it doesn't seem to do what I'm after (and
> the documentation is practically nonexistant - much the same as the
> NEB API documentation, in fact).  Mod_gearman lets me have redundant
> checks and replicate *active* checks, but not commands, downtime or
> passive checks.


Merlin would do exactly that if you set one of the nodes as a poller
but having all hosts assigned to it. When the poller goes down, the
master will by default take over checks for it.

Merlin is actually pretty well documented, but as textfiles that you
have to read the oldschool way. If there's anything you find lacking
from the HOWTO document or the README, please let me know and I'll
amend it.

>
> Does anyone out there have a workable way to get an active/standby or
> active/active Nagios setup?  Would be interested in hearing all
> ideas...
>

Well, we have about 800 of them.

--
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.



------------------------------

Message: 14
Date: Fri, 10 May 2013 15:46:38 -0400
From: Percy Kwong <psk at psk.net>
Subject: [Nagios-users] Trying to figure out the PCRE expression for
        Nagiosgraph Map
To: nagios-users at lists.sourceforge.net
Message-ID: <518D4E9E.1030409 at psk.net>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

I'm writing a pcre rule for a nagios map file.

The output for one query would be:

PROCS OK: 11 processes with args 'apache'

What would the map rule look like that would do the following?

1. Begin with "PROCS OK:"
2. End with "args 'apache'"
3. Extract only the numeric value before the word processes?

Assuming it would be a nested regex within the regex.

So basically, the map regex would only return 11, but enforce the rules
above?

Just trying to understand the logic behind this.

Thanks.



------------------------------

Message: 15
Date: Fri, 10 May 2013 23:11:42 +0200
From: Claudio Kuenzler <ck at claudiokuenzler.com>
Subject: Re: [Nagios-users] Trying to figure out the PCRE expression
        for Nagiosgraph Map
To: psk at psk.net, Nagios Users List
        <nagios-users at lists.sourceforge.net>
Message-ID:
        <CAF-yqgiw6aJ_w_quRdzRk-6PTcaW+_JfzVFkbAU6sdCui9J0kA at mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

> The output for one query would be:
>
> PROCS OK: 11 processes with args 'apache'
>

Well, first of all you'd have to make sure that nagiosgraph also takes the
output into account.
It's always better to do that with perfdata...

You have the choice to also take the output as the source to parse, although I
strongly recommend using perfdata. That's what it is for.


>
> What would the map rule look like that would do the following?
>
> 1. Begin with "PROCS OK:"
> 2. End with "args 'apache'"
> 3. Extract only the numeric value before the word processes?


The regex would look something like this:

/output:PROCS.*:\s*(\d+) processes.*/

assuming that you don't care about the args and the status (OK, WARNING,
CRITICAL) part.
Only the digit (11) would be taken out of the output in this case.
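
A rule like this can be tried against the sample output outside nagiosgraph before it goes into the map file. The following is a minimal sketch in Python rather than the Perl used by the map file; the helper name extract_proc_count and the exact pattern are assumptions for illustration. The pattern allows optional whitespace after the status colon because the sample output has a space between "OK:" and the count.

```python
import re

# Assumed pattern for illustration: any status word after PROCS, a colon,
# optional whitespace, then the process count captured in group 1.
PROCS_RULE = re.compile(r"output:PROCS.*:\s*(\d+) processes.*")


def extract_proc_count(record):
    """Return the captured process count as an int, or None if no match."""
    match = PROCS_RULE.search(record)
    return int(match.group(1)) if match else None


sample = "output:PROCS OK: 11 processes with args 'apache'"
print(extract_proc_count(sample))
```

Only the capture group is returned; the rest of the pattern just decides whether the line qualifies at all.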

------------------------------

Message: 16
Date: Sat, 11 May 2013 13:24:27 +0200
From: Jonas Meurer <jonas at freesources.org>
Subject: Re: [Nagios-users] servicegroup overview not restricted for
        htaccess users
To: nagios-users at lists.sourceforge.net
Message-ID: <518E2A6B.3090209 at freesources.org>
Content-Type: text/plain; charset=ISO-8859-1

Hello,

On 06.05.2013 10:42, Jonas Meurer wrote:
> I fear that I discovered a security issue in Nagios 3.4.4 status.cgi:

No comments on that?

> All htaccess users, even if not listed in any authorized_for_* config
> option, have full access to service group overview, summary and grid:
> /nagios/cgi-bin/status.cgi?servicegroup=all&style=overview
> /nagios/cgi-bin/status.cgi?servicegroup=all&style=summary
> /nagios/cgi-bin/status.cgi?servicegroup=all&style=grid
>
> I hope that this is not intended. Is this issue known?
>
> Kind regards,
>   jonas




------------------------------

Message: 17
Date: Sat, 11 May 2013 07:30:36 -0400
From: Percy Kwong <psk at psk.net>
Subject: Re: [Nagios-users] Trying to figure out the PCRE expression
        for Nagiosgraph Map
To: Claudio Kuenzler <ck at claudiokuenzler.com>
Cc: Nagios Users List <nagios-users at lists.sourceforge.net>
Message-ID: <518E2BDC.2070200 at psk.net>
Content-Type: text/plain; charset="iso-8859-1"

OK.  So to make more sense of the whole thing, the only thing that is
taken into account is the actual numerical value?  In other words, it's
automatically parsed?  This is what I wasn't sure of.

Here is the entry in the mapfile I was using:



I guess the reason I'm having issues with this is the following snippet
from the nagiosgraph.log:

Fri May 10 12:57:51 2013 insert.pl warn output/perfdata not recognized:
hostname:mymachine
servicedesc:Apache Processes
output:PROCS OK: 11 processes with args apache
perfdata:

The problem is that there is no perfdata and the rrd file isn't being
populated (and, obviously, there is no graph).  I'm attributing this to the
fact that the map file entry is wrong.  This is really where my problem
lies.  Am I looking in the wrong place?

Thanks.
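
One thing that may be worth checking is whether a rule matches the record exactly as it was logged. The logged output above reads "args apache" without quotes, while the plugin output quoted earlier in the thread reads "args 'apache'". The sketch below is Python, not the Perl of a real map file, and both patterns and the helper first_capture are hypothetical. It compares a rule that requires the quotes with one that makes them optional, and this may or may not be the actual cause of the warning.

```python
import re

# Record as shown in the nagiosgraph.log excerpt above.
record = (
    "hostname:mymachine\n"
    "servicedesc:Apache Processes\n"
    "output:PROCS OK: 11 processes with args apache\n"
    "perfdata:"
)

# Hypothetical rules for illustration; neither comes from a real map file.
# Both require the output line to begin with "PROCS OK:" and end with the
# args text, and both capture only the process count.
quoted_rule = re.compile(
    r"^output:PROCS OK:\s*(\d+) processes with args 'apache'$", re.M
)
lenient_rule = re.compile(
    r"^output:PROCS OK:\s*(\d+) processes with args '?apache'?$", re.M
)


def first_capture(rule, text):
    """Return the first capture group as a string, or None if no match."""
    match = rule.search(text)
    return match.group(1) if match else None


print(first_capture(quoted_rule, record))
print(first_capture(lenient_rule, record))
```

With the record as logged, the rule that insists on the quotes finds nothing, while the lenient one captures the count.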





On 5/10/2013 5:11 PM, Claudio Kuenzler wrote:
>
>     The output for one query would be:
>
>     PROCS OK: 11 processes with args 'apache'
>
>
> Well, first of all you'd have to make sure that nagiosgraph also takes
> the output into account.
> It's always better to do that with perfdata...
>
> You have the choice to also take the output as the source to parse,
> although I strongly recommend using perfdata. That's what it is for.
>
>
>     What would the map rule look like that would do the following?
>
>     1. Begin with "PROCS OK:"
>     2. End with "args 'apache'"
>     3. Extract only the numeric value before the word processes?
>
>
> The regex would look something like this:
>
> /output:PROCS.*:\s*(\d+) processes.*/
>
> assuming that you don't care about the args and the status (OK,
> WARNING, CRITICAL) part.
> Only the digit (11) would be taken out of the output in this case.


------------------------------


_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users


End of Nagios-users Digest, Vol 84, Issue 3
*******************************************
