Failover nagios server

Steve Shipway s.shipway at auckland.ac.nz
Tue Mar 7 22:03:55 CET 2006


> I have a standalone server with Nagios running on it. I want
> to set up another server for distributed monitoring. But the
> main task is to provide failover for the overall Nagios
> configuration: if one of the servers fails, the other
> server must take over data collection from all
> monitoring servers. Are there any solutions for this?

We have such a setup here.

I have two Linux servers, each with two network cards and one Adaptec
ServeRAID card.  There is an external SCSI disk unit, connected to both
servers' SCSI cards, with a pair of disks configured as a mirror.  The servers
are joined by a crossover Ethernet cable on the second network card.  The
primary network card of each is on the network.

I have installed Linux-HA on both servers, and have a service group
consisting of a virtual IP, the filesystem on the external disk, and the
Nagios service.  This is set to fail over between servers, with one server
being the primary home, with automatic failback.

(Actually, there is also a MySQL database on the Nagios server, plus the
BigBrother/Nagios gateway and a couple of other services, and the other
server normally runs our MRTG setup on a separate filesystem, but this is
just extra.)

Since the Adaptec ServeRAID natively supports this configuration, it works
very well with Linux-HA and the failover goes nicely.  The only thing to add
is some cleanup code: if server1 dies, server2 picks up the filesystem and
must delete any nagios.cmd pipe that may have been left lying around before
it starts Nagios.
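
As a rough illustration of that cleanup step, here is a minimal Python sketch
that could be run on the surviving node just before Nagios is started.  It is
not the script we actually use.  The path is an assumption (the usual default
for a source install) and the function name is made up for the example; take
the real path from the command_file setting in your own nagios.cfg.

```python
import os
import stat

# Assumed path: replace with the command_file value from your nagios.cfg.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"


def remove_stale_command_pipe(path: str = COMMAND_FILE) -> bool:
    """Delete a leftover named pipe at `path`.

    Returns True if a stale pipe was removed, False if nothing was there.
    """
    try:
        mode = os.lstat(path).st_mode
    except FileNotFoundError:
        return False  # the previous node shut down cleanly
    if not stat.S_ISFIFO(mode):
        # Be conservative: never delete something that is not a FIFO.
        raise RuntimeError(f"{path} exists but is not a named pipe")
    os.unlink(path)
    return True


if __name__ == "__main__":
    if remove_stale_command_pipe():
        print("removed stale command pipe")
    else:
        print("no stale command pipe found")
```

Something like this would be called from the start action of the Nagios
resource, after the shared filesystem has been mounted.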

This was surprisingly easy to set up, once the RAID config had been done.
There is a bit of a problem with how to define 'down': if network
interface 1 is down (so people cannot see Nagios) but interface 2 is up (so
the heartbeat still works), should it fail over?  I would say no.  You should
also send the heartbeat over the crossover Ethernet cable, since otherwise a
switch going down would make the two servers fight over the services (but
thanks to the ServeRAID having internal locking, you'll never get both
accessing the filesystem at once and corrupting data).
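
To make the 'down' policy concrete, here is a small Python sketch of the
decision as I would define it.  It is only an illustration: Linux-HA makes
this decision itself, and the class and function names here are invented for
the example.

```python
from dataclasses import dataclass


@dataclass
class PeerStatus:
    """What the standby node believes about the active node (hypothetical)."""
    heartbeat_alive: bool  # heartbeat still arriving over the crossover cable
    public_link_up: bool   # active node's primary (public) interface state


def standby_should_take_over(peer: PeerStatus) -> bool:
    """Take over only when the heartbeat is lost.

    A dead public interface on its own does not trigger failover,
    because the heartbeat over the crossover cable shows the node is alive.
    """
    return not peer.heartbeat_alive


if __name__ == "__main__":
    # Public interface down, heartbeat fine: stay put.
    print(standby_should_take_over(PeerStatus(True, False)))
    # Heartbeat lost: the standby takes the services.
    print(standby_should_take_over(PeerStatus(False, False)))
```

If you do want a dead public interface to trigger failover, that condition
would have to be added explicitly, which is exactly the policy question above.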

Steve



