Modeling a VPN network in nagios to suppress spurious notifications.

Andreas Ericsson ae at op5.se
Tue Oct 11 00:24:06 CEST 2005


John P. Rouillard wrote:
> Hello all:
> 
> I have reached a bit of a fork in the road in a nagios deployment. I have
> a physical network that has a VPN running on it across the Internet to
> multiple sites. A large number of the hosts/services I need to monitor
> are on the private network only.
> 
> Here is the legend for the ASCII art network diagrams below:
> 
>     P(h1)     - Private interface for host 1 (172.16...)
>     U(h1)     - pUblic interface for host 1 (some public IP)
>     P(h1..h3) - Private interfaces for host 1 to host 3.
>     P(s1)     - Private traffic on switch 1
>     U(s1)     - public traffic on switch 1
>     UP(s1)    - public and private traffic on switch 1
>     U(I)      - Internet (public)
>     P(I)      - Internet (private VPN tunneled traffic)
>     P(v1)     - private address for VPN box
>     U(v1)     - public address for VPN box
>     U(r1)     - public router 1
>     U(nagios) - public interface for nagios 
>     P(nagios) - private interface for nagios 
> 
> Hopefully you can make some sense of the ASCII art. BTW does anybody
> have a tool to generate ASCII diagrams from some sort of file
> specification like dot/graphviz?
> 
> So the question is: what is the most efficient way to represent the
> dependencies (via parent or host dependencies) to prevent
> notifications on the VPN when the public network goes down?
> 

The whole concept of parents was designed for this very purpose, so you 
might as well put it to use.
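
To make the effect of parents concrete, here is a rough Python sketch of
the reachability logic as I understand it: a host whose check fails is
DOWN only if it has no parents or at least one parent is UP, and
UNREACHABLE otherwise. The host names (vpn-gw, site-switch, site-host)
and the classify() helper are made up for illustration, and this is a
simplified model that assumes no loops in the parent map, not Nagios'
actual code.

```python
from typing import Dict, List, Set


def classify(parents: Dict[str, List[str]], failing: Set[str]) -> Dict[str, str]:
    """Label each host UP, DOWN or UNREACHABLE from its parents' states."""
    states: Dict[str, str] = {}

    def state_of(host: str) -> str:
        if host in states:
            return states[host]
        if host not in failing:
            states[host] = "UP"
        else:
            host_parents = parents.get(host, [])
            # A failing host is blamed (DOWN) if it has no parents or if
            # at least one parent is UP; otherwise it is only UNREACHABLE.
            if not host_parents or any(state_of(p) == "UP" for p in host_parents):
                states[host] = "DOWN"
            else:
                states[host] = "UNREACHABLE"
        return states[host]

    for host in parents:
        state_of(host)
    return states


# Hypothetical chain: nagios -> vpn-gw -> site-switch -> site-host.
parents = {
    "vpn-gw": [],
    "site-switch": ["vpn-gw"],
    "site-host": ["site-switch"],
}

# The public link dies, so every check behind the VPN gateway fails.
result = classify(parents, failing={"vpn-gw", "site-switch", "site-host"})
for host, state in result.items():
    print(host, state)
```

If the host notification_options leave out 'u', only the host reported
as DOWN (vpn-gw in this example) should produce a notification.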

> I was thinking of creating two separate networks. Each would have the
> parents defined as seen within its topology. So:
> 
>   private:
>  P(h1..h3) <- P(s1) <- P(v1) <- private_internet <- P(v2) <- P(s2) <- P(nagios)
>                                        v
>                                        |
>         P(h14..h16) <- P(s3) <- P(v3) -+
> 
> with switches and the private VPN addresses going to a "private
> internet" and creating all the parents as though the private network
> was a regular network.
> 
> Also create a similar one for the public traffic portion of the net.
> 
>   public:
>    U(h3..h5) <- U(s1) <- U(r1) <- U(internet) <- U(r2) <- U(s2) <- U(nagios)
>         U(v1) <---+                    v                    v
>                                        |                    +-> U(v2)
>                            U(v3) <- U(s3)
>       U(h17..h19) <- U(s4) <-+
> 
> Then at each VPN point create a host dependency of the private VPN
> "host" on the public host. The problem is that I don't think that host
> dependencies will work in the same way as the 'parents' directive
> does when determining failure.
> 
> Another way to represent these networks is to merge the two so that
> each private VPN host [P(vN)] has its public host [U(vN)] as its
> parent
> 
>   private+public:
> 
>    U(h3..h5) <- U(s1) <- public_internet <- U(s2) <- U(nagios)
>                   |            |              ^ U(v2)
>             +--<--+            |                  ^P(v2) <- P(s2) <- P(nagios)
>           U(v1)                |                    ** Note this
>             |                  |
>           P(v1)                |
>             |                  |
>           P(s1)                |
>             |                  |
>         P(h1..h3)              |
>                                |
>                     U(v3) <- U(s3)
>                       |        ^----<  U(s4) <-- U(h17..h19)
>                     P(v3)
>                       |
>      P(h14..h16) <- P(s3)
> 
> 
> However, if I do this then an outage of P(v2) due to failure of the
> VPN software, makes it look to Nagios like there is still connectivity
> to the other VPN sites since the routes/parent dependencies for
> P(nagios) and U(nagios) are identical due to both having U(s2) (or
> public_internet) as a common child. However in reality you can't reach
> P anything downstream of P(v2) if P(v2) or P(s2) are down. That's why
> the parentage graph directions at ** note are pointing away from
> P(nagios) rather than away from U(nagios).
> 


Just follow the packets from the Nagios server to the target. Where they 
go into the VPN, monitor both endpoints of the tunnel; everywhere else, 
set up parents as you would on an ordinary network, like so:

nagios -> vpn-entry -> vpn-exit -> monitored-host

It's usually favourable to monitor the far side of the interface so you 
get a notification if the internal routing is acting up in the device 
(this generally holds for all devices routing traffic). Never mind the 
devices that need to be up in between the two VPN endpoints unless you 
can do something about it when they act up.
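
As a sketch of how the chain nagios -> vpn-entry -> vpn-exit ->
monitored-host might end up in the object config, the Python below
prints one host definition per hop, each naming the previous hop in its
parents directive. The addresses and the "generic-host" template name
are assumptions for illustration only; render_host() and render_chain()
are hypothetical helpers, not part of Nagios.

```python
from typing import List, Optional, Tuple


def render_host(name: str, address: str, parent: Optional[str]) -> str:
    """Render one Nagios host definition as text."""
    lines = [
        "define host{",
        # "generic-host" is an assumed template name; use your own.
        "    use        generic-host",
        f"    host_name  {name}",
        f"    address    {address}",
    ]
    if parent is not None:
        lines.append(f"    parents    {parent}")
    lines.append("    }")
    return "\n".join(lines)


def render_chain(chain: List[Tuple[str, str]]) -> str:
    """Render hosts in packet order, each with the previous hop as parent."""
    blocks = []
    previous = None  # the first hop sits next to the Nagios server: no parent
    for name, address in chain:
        blocks.append(render_host(name, address, previous))
        previous = name
    return "\n\n".join(blocks)


# Made-up addresses: a public one for the near endpoint, private ones beyond.
chain = [
    ("vpn-entry", "192.0.2.1"),
    ("vpn-exit", "172.16.0.1"),
    ("monitored-host", "172.16.0.10"),
]

config = render_chain(chain)
print(config)
```

Devices between the two VPN endpoints are left out on purpose, in line
with the advice above.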

It gets tricky if you have several redundant routes or some kind of 
circular redundancy set up, but I don't see that particular problem in 
your case.


> Also to make things a bit more complex, the switches usually have
> internal IP addresses but are partitioned so that they have both public
> and private traffic on them.
> 

You can add the separate addresses as separate hosts if you like. That 
probably makes the most sense if you're having a lot of trouble 
figuring out the route for something, or if you'd otherwise get weird 
results (VLAN routing going twice through the same switch, so that it 
becomes its own parent or grandparent or some such; I've seen all sorts, 
really).
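
One way to spot the "own parent or grandparent" situation before it
bites is to walk the parent map and look for loops. This is only a
sketch: find_parent_loop() is a hypothetical helper, and the host names
(host-a, switch1, router1) are invented for the example.

```python
from typing import Dict, List, Optional


def find_parent_loop(parents: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one loop in the parent map as a list of hosts, or None."""
    done = set()  # hosts whose whole ancestry is known to be loop-free

    def walk(host: str, path: List[str]) -> Optional[List[str]]:
        if host in path:
            # We came back to a host already on the current path.
            return path[path.index(host):] + [host]
        if host in done:
            return None
        for parent in parents.get(host, []):
            loop = walk(parent, path + [host])
            if loop:
                return loop
        done.add(host)
        return None

    for host in parents:
        loop = walk(host, [])
        if loop:
            return loop
    return None


# The switch modeled as ONE host although traffic crosses it twice.
single = {
    "host-a": ["switch1"],
    "switch1": ["router1"],
    "router1": ["switch1"],
}

# The same switch split into one host per address.
split = {
    "host-a": ["switch1-private"],
    "switch1-private": ["router1"],
    "router1": ["switch1-public"],
    "switch1-public": [],
}

print(find_parent_loop(single))
print(find_parent_loop(split))
```

Splitting the switch into one host per address removes the loop in this
example, which is the point of adding the addresses separately.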

> Well thanks for making it this far in the email.
> 
> Even if you don't have an answer, does what I am asking even make
> sense to anybody?

I'm not sure. I'm just winging it and hoping for the best. :)

In general though, just follow the packets from the Nagios server to the 
target. It gets tricky if you have several redundant routes or a circular 
redundancy thing set up, but that doesn't seem to apply in your case.

> If you have any questions, it might help me to see
> things more clearly. If you have struggled with it and decided that
> nagios isn't up to the task that is good to know as well. The multiple
> hosts for public and private interfaces bothers me, but until nagios
> becomes "network aware" in its outage determination for multi-homed
> hosts, I think this is the only way to do it.
> 
> Also if anybody thinks I am nuts after reading this email, don't worry
> about it, I think I am nuts too, but that's a good thing (TM) 8-) .
> 
> 				-- rouilj
> John Rouillard

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

