Lots of hosts, only a couple of services?

Jason Byrns jason-sourceforge at microlnk.net
Tue Aug 24 16:07:08 CEST 2004


Good morning, fellow Nagios users!

Right now, we have about 250 hosts.  But only 6 of them have services 
like DNS, HTTP, SMTP, and POP3 monitored.  All the rest are only being 
pinged, since they are very simple network devices: switches and 
wireless access points, mainly.  These devices DO have telnet and, 
generally, web interfaces, though.

Here's part of the problem: if any device misses a single service check, 
a host check is immediately triggered.  But sometimes a device can miss 
a ping even though there is no problem, just a burst of network traffic.  
Unfortunately, the service checks do not respect max_check_attempts in 
this regard.  AND, if that host check also fails (quite likely if the 
other ping failed just a few seconds earlier), then a notification is 
sent out immediately, again ignoring the max_check_attempts value.  I 
have already confirmed that this behavior is by design!

Right now, I'm using a dummy check for host checks.  That took care of 
the immediate notifications whenever a device missed one service check.  
But now our host status map doesn't show any problems at all; everything 
there is always green.

So here's my question: how can I improve our Nagios setup?

Here are my goals:
1) Prevent false positives with max_check_attempts (set to 5)
2) Get Nagios to respect max_check_attempts
3) Have the Status Map correctly show the situation if any devices are down.

Could I...
1) Check telnet instead of just pinging these devices?  (And change the 
host checks back to the regular host_check_alive?)
2) Not check services at all, unless necessary, and only do host checks?  
(Nagios throws lots of warnings if you do this, and I suppose I'd 
rather avoid that.)
3) ...?  (Profit?)

I haven't written my own plugins yet, so I'm trying to figure out how 
hard it'd be to check telnet.  The devices are different enough that I 
doubt I can count on very similar responses from any telnet attempts...
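
As a rough idea of how small such a plugin could be, here is a minimal 
sketch in Python.  It assumes that "checking telnet" only needs to mean 
"the device accepts a TCP connection on port 23", which sidesteps the 
differing banners entirely.  The names check_tcp_port and main, the 
default port, and the timeout are all hypothetical choices for 
illustration; the only Nagios convention relied on is the plugin exit 
codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) plus one line of output.

```python
import socket

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_tcp_port(host, port=23, timeout=5.0):
    """Try a plain TCP connect; return (exit_code, one-line message).

    Assumption: a successful connect is enough to call the device "up".
    Nothing is read from the socket, so the banner text does not matter.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return OK, "TCP OK - %s:%d accepts connections" % (host, port)
    except OSError as exc:
        # Covers refused connections, timeouts, and name lookup failures.
        return CRITICAL, "TCP CRITICAL - %s:%d: %s" % (host, port, exc)


def main(argv):
    """Parse HOST [PORT], print the status line, return the exit code."""
    if len(argv) < 2:
        print("UNKNOWN - usage: check_telnet_port HOST [PORT]")
        return UNKNOWN
    try:
        port = int(argv[2]) if len(argv) > 2 else 23
    except ValueError:
        print("UNKNOWN - port must be an integer")
        return UNKNOWN
    code, message = check_tcp_port(argv[1], port)
    print(message)
    return code
```

To run this as a plugin, the file would end with 
`import sys; sys.exit(main(sys.argv))` so that Nagios sees the exit 
code.  The standard plugin set also ships check_tcp, which performs 
this kind of connect test, so a custom script may not be needed at all.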

Suggestions?  Advice?  Ideas?

Thanks very much for anything you can offer!

-- 
Jason Byrns
System Administrator, MicroLnk

