Redundant Switches/Routers/Network Interfaces and parent configuration question

Stefan Giesen Stefan.Giesen at firstgate.de
Tue Aug 3 10:33:14 CEST 2004


Hi all,

I've got a question/problem regarding redundant network connections. We
have the setup shown in the graphic below (sorry for the image,
but a picture says more than a thousand words and it makes it easier to
understand what I mean).

As you can see, we have two redundant routers connected to the backbone
(or internet or whatever). Connected to each router is a master switch.
The frontend switches are each connected to both master switches. All
clients have two network interfaces, each connected to one corresponding
frontend switch. There are several virtual IPs (VIP) on each client
(RIP/Zebra is used for routing) and different services running on each
VIP. I added a logical client to the picture (which has no IP address,
since it's only a concept) to clarify the explanations below.

What we need is the following:
1) If anything fails, we want to get notifications for it, but
2) if anything blocking fails, we don't want to get notifications for
anything below, only for the blocking parts.
3) Blocking means that both of the structural elements are not
reachable/down (e.g. both routers, both master switches or both client
interfaces and so on).
4) We want ONE entry in the host/services list for every host (that's
why I added the "logical" host in the diagram), so the overview won't
get huge because every service is listed twice (once for each network
interface of the client).

So far I have tried the following (I only list the relevant parts of the
entries; all hosts are checked via ping, except the logical host, which has
its own check script to check whether both interfaces are up; the system is
Debian woody with Nagios version 1.2-0 from backports.org):

checkcommands.cfg:
define command{
 command_name    check_myping
 command_line    /usr/lib/nagios/plugins/check_ping -H $ARG1$ -w 100:25% -c 250:100%
}
define command{
 command_name    check_vhost
 command_line    /usr/lib/nagios/plugins/check_http -H $ARG1$ -I $ARG1$ -w $ARG2$ -c $ARG3$ -e "HTTP/1.1 200 OK"
}


hosts.cfg:
define host{
 host_name               router1
}
define host{
 host_name               router2
}
define host{
 host_name               switch1
 parents                 router1
}
define host{
 host_name               switch2
 parents                 router2
}
define host{
 host_name               switch3
 parents                 switch1,switch2
}
define host{
 host_name               switch4
 parents                 switch1,switch2
}
define host{
 host_name               eth0.client1
 parents                 switch3
}
define host{
 host_name               eth1.client1
 parents                 switch4
}
define host{
 host_name               client1
 parents                 eth0.client1,eth1.client1
}


hostgroups.cfg:
define hostgroup{
 hostgroup_name  all-routers
 members         router1,router2
}
define hostgroup{
 hostgroup_name  all-switches
 members         switch1,switch2,switch3,switch4
}
define hostgroup{
 hostgroup_name  all-pclients
 members         eth0.client1,eth1.client1
}


services.cfg:
define service{
 use             ping-template
 hostgroup_name  all-routers
}
define service{
 use             ping-template
 hostgroup_name  all-switches
}
define service{
 use             ping-template
 hostgroup_name  all-pclients
}
define service{
 host_name           client1
 service_description Logical Host Check
 check_command       check_lhost!eth0.client1!eth1.client1
}
define service{
 host_name           client1
 service_description PING servive1 IP
 check_command       check_myping!service1-vip.client1
}
define service{
 host_name           client1
 service_description PING servive2 IP
 check_command       check_myping!service2-vip.client1
}
define service{
 host_name           client1
 service_description PING servive3 IP
 check_command       check_myping!service3-vip.client1
}
define service{
 host_name           client1
 service_description HTTP Service1 on client1
 check_command       check_vhost!service1-vip.client1!150!500
}
define service{
 host_name           client1
 service_description HTTP Service2 on client1
 check_command       check_vhost!service2-vip.client1!150!500
}
define service{
 host_name           client1
 service_description HTTP Service3 on client1
 check_command       check_vhost!service3-vip.client1!150!500
}
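
The check_lhost command used by "Logical Host Check" is not listed in this
post. As a rough sketch only (this is not the original script), a plugin of
that kind could combine the state of the two interfaces into a Nagios exit
code roughly like this. The function names, the Linux-style ping options and
the choice of WARNING for a single failed interface are assumptions made for
illustration.

```python
"""Sketch of a check_lhost-style plugin (hypothetical, not the original script)."""
import subprocess
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def interface_up(address):
    """Return True if one ICMP echo to `address` succeeds.

    Assumes a Linux-style ping that accepts -c (count) and -W (timeout).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", address],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def logical_host_state(eth0_up, eth1_up):
    """Combine two interface states into (exit_code, message)."""
    if eth0_up and eth1_up:
        return OK, "OK - both interfaces reachable"
    if eth0_up or eth1_up:
        # Assumption: losing one of two redundant interfaces is only a warning.
        return WARNING, "WARNING - one interface unreachable"
    return CRITICAL, "CRITICAL - both interfaces unreachable"


def main(argv):
    """Plugin entry point: argv is [program, address1, address2]."""
    if len(argv) != 3:
        print("UNKNOWN - usage: check_lhost <address1> <address2>")
        return UNKNOWN
    code, message = logical_host_state(interface_up(argv[1]), interface_up(argv[2]))
    print(message)
    return code

# An installed plugin would end with: sys.exit(main(sys.argv))
```

With the service definition above, the two arguments would be eth0.client1
and eth1.client1.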


dependencies.cfg:
define servicedependency{
 host_name                     client1
 service_description           PING servive1 IP
 dependent_host_name           client1
 dependent_service_description HTTP Service1 on client1
 execution_failure_criteria    u,w,c
 notification_failure_criteria u,w,c
}
define servicedependency{
 host_name                     client1
 service_description           PING servive2 IP
 dependent_host_name           client1
 dependent_service_description HTTP Service2 on client1
 execution_failure_criteria    u,w,c
 notification_failure_criteria u,w,c
}
define servicedependency{
 host_name                     client1
 service_description           PING servive3 IP
 dependent_host_name           client1
 dependent_service_description HTTP Service3 on client1
 execution_failure_criteria    u,w,c
 notification_failure_criteria u,w,c
}
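
The intent of these dependencies is that the HTTP check on a VIP is neither
executed nor notified while the PING check for the same VIP is in one of the
listed states (u = UNKNOWN, w = WARNING, c = CRITICAL). A small model of that
rule, purely illustrative and not Nagios code (the function name and the
lookup table are assumptions):

```python
# Single-letter criteria as used in servicedependency definitions.
CRITERIA = {"o": "OK", "w": "WARNING", "u": "UNKNOWN", "c": "CRITICAL"}


def dependency_fails(master_state, failure_criteria):
    """Return True if the master service state matches one of the criteria.

    `failure_criteria` is the comma-separated option string from the
    config, e.g. "u,w,c". Letters not in CRITERIA are ignored.
    """
    states = set()
    for option in failure_criteria.split(","):
        option = option.strip()
        if option in CRITERIA:
            states.add(CRITERIA[option])
    return master_state in states
```

For example, with "u,w,c" a CRITICAL ping on service1-vip.client1 would hold
back the matching HTTP check, while an OK ping would not.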

We tried it, and the result is that we get no notifications from
the services as soon as just ONE switch/router/interface goes down. So it
looks like the "parents" rule means "_all_ parents have to be up"
instead of "at least _one_ of the parents has to be up" (which would make
much more sense for "normal" network structures: how often do you
have a client that fails if ONE of its parents fails? And no, if you
have two switches in a row, the client would have just ONE parent,
because the second switch has the first one as its parent, so there is no
problem there).
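
To make the two interpretations concrete, here is a minimal model of the
topology described above, with both interfaces as parents of the logical
host. It only illustrates the difference between "all parents" and "at least
one parent" and the notification rule from points 1) to 3); it does not claim
to reproduce what Nagios does internally. The names PARENTS, reachable and
hosts_to_notify are made up for this sketch.

```python
# Parent relationships as intended by hosts.cfg (routers have no parents).
PARENTS = {
    "router1": [],
    "router2": [],
    "switch1": ["router1"],
    "switch2": ["router2"],
    "switch3": ["switch1", "switch2"],
    "switch4": ["switch1", "switch2"],
    "eth0.client1": ["switch3"],
    "eth1.client1": ["switch4"],
    "client1": ["eth0.client1", "eth1.client1"],
}


def reachable(host, down, combine):
    """Return True if `host` is up and can be reached through its parents.

    `down` is the set of failed hosts; `combine` is the builtin `any`
    ("at least one parent reachable") or `all` ("every parent reachable").
    """
    if host in down:
        return False
    parents = PARENTS[host]
    if not parents:
        return True  # top-level host, nothing above it can block it
    return combine(reachable(p, down, combine) for p in parents)


def hosts_to_notify(down):
    """Failed hosts that are not hidden behind a fully failed layer above."""
    result = []
    for host in sorted(down):
        parents = PARENTS[host]
        if not parents or any(reachable(p, down, any) for p in parents):
            result.append(host)
    return result
```

With only switch1 down, reachable("client1", {"switch1"}, any) is True while
the same call with all is False, which matches the behaviour we are seeing.
With switch1, switch2 and switch3 down, hosts_to_notify returns only switch1
and switch2, which is what we would like to be notified about.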

Does anyone have a similar network layout or know a solution for this?

Thanks in advance,
Stefan
-- 
Stefan Giesen, Systemadministration Frankfurt
FIRSTGATE Internet AG, Im MediaPark 5, 50670 Koeln
Telefon: +49 (0) 2 21 / 45 45-745, Telefax: +49 (0) 2 21 / 45 45-710
Internet: www.firstgate.de         eMail: Stefan.Giesen at firstgate.de

-------------- next part --------------
A non-text attachment was scrubbed...
Name: nagios-konzept.png
Type: image/png
Size: 19724 bytes
Desc: not available
URL: <https://www.monitoring-lists.org/archive/users/attachments/20040803/420e5d17/attachment.png>