Keeping the Nagios Configuration Sane

Trond Hasle Amundsen t.h.amundsen at usit.uio.no
Wed Mar 10 21:21:44 CET 2010


David Wallis <wallis at aps.anl.gov> writes:

> Matt Simmons wrote:
>> Hi All,
>>
>> I'm attending the 2010 Professional IT Community Conference
>> (http://www.picconf.org) being held in New Brunswick, NJ, and I'm
>> giving a talk about staying sane while working with the Nagios
>> configuration.
>>
>> The talk will be 45 minutes long, and will primarily be an outshoot
>> from this article that I wrote on my blog:
>> http://www.standalone-sysadmin.com/blog/2009/07/nagios-config/
>>
>> I could talk about that and some other things that I've been figuring
>> out, but I was wondering if anyone had any tricks or tips for dealing
>> with the Nagios config? Is there anything special that you do to keep
>> things straight?
>>
>> I'm going to be putting my slides and any additional material online
>> following the conference, so hopefully someone else can get some use
>> from it.
>>
>> By the way, if anyone on this list is in the north east of the US, you
>> should come visit the conference. Without training, it's only $275 for
>> 2 days. With a full day and a half of training, it's still only $400
>> for the whole shebang. Anyway, this isn't a sales email.
>>
>> I'm looking forward to any tips you would want to share. Thanks in advance!
>>
>> --Matt
>>   
>
> I manage the Nagios installation for 3 different domains at work, each 
> domain with several hundred servers and clients. I quickly reached the 
> "There's got to be a better way!" point when trying to maintain 
> configuration files that were getting pretty big. I was using all the 
> tricks listed in the Nagios docs, but it was still pretty crazy.
>
> The approach I took was to write a configuration generator program that 
> uses a meta-config file to generate the hosts.cfg, hostgroups.cfg and 
> services.cfg config files. The meta-config file allows one to set up 
> cascading configuration variables, and then has one line per monitored 
> host, that includes things like host groups, parents, etc, and then a 
> list of services to monitor.
>
> I also created the idea of "meta-services" that allow the program to 
> generate configuration data for any number of related services with a 
> single service name in the meta-config file. For instance, including the 
> service "weball" will cause the configuration generator to create 
> service entries for every plumbed interface on the web server, checks 
> for every virtual server (http and https), and checks for every SSL cert 
> that it finds. In one domain, a 400 line meta-config file generates a 
> 20,000 line services.cfg file.
>
> Rather than updating individual config files, I just update the 
> meta-config file and then regenerate all of the *.cfg files. I've been 
> using this for several years with very good results.

That's an interesting approach, and we do something similar. It goes
without saying that when the number of hosts grows to several hundred,
maintaining the Nagios config for hosts and hostgroups etc. the regular
way becomes an arduous task. This is especially true if your environment
is largely heterogeneous.

We maintain a list of our servers in a homegrown application built on a
topic map, and large parts of the Nagios config are generated from it. I
think this is an important point: you usually already have a list of
your servers, and that list can serve as the base for the Nagios config
as well. The format of the host list is not important; what matters is
deciding that it is the starting point for the Nagios hosts config. When
a host is added to or removed from the list, it is added to or removed
from Nagios. This is very much like David's approach, i.e. a list of
hosts in a format that is easier to handle and maintain.
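
To make the idea concrete, here is a minimal sketch in Python of
generating host definitions from a host list. It is not our actual
generator: the inventory layout, the function names and the
"generic-host" template name are assumptions made up for illustration.
Only the Nagios directives themselves (use, host_name, address, parents)
are real.

```python
# Hypothetical inventory; in practice this would be read from whatever
# system already holds the authoritative list of servers.
HOSTS = [
    {"name": "web01", "address": "192.0.2.10", "parents": ["core-sw1"]},
    {"name": "db01", "address": "192.0.2.20", "parents": []},
]


def render_host(host, template="generic-host"):
    """Render one Nagios host object definition as text.

    The template name is an assumption; use whatever host template
    your installation defines.
    """
    lines = [
        "define host {",
        f"    use        {template}",
        f"    host_name  {host['name']}",
        f"    address    {host['address']}",
    ]
    # The parents directive is only emitted when the host has parents.
    if host.get("parents"):
        lines.append(f"    parents    {','.join(host['parents'])}")
    lines.append("}")
    return "\n".join(lines)


def render_hosts_cfg(hosts):
    """Render every host, sorted by name so regenerated files diff cleanly."""
    ordered = sorted(hosts, key=lambda h: h["name"])
    return "\n\n".join(render_host(h) for h in ordered) + "\n"
```

Writing the result of render_hosts_cfg(HOSTS) to hosts.cfg on every run
means a host that disappears from the list also disappears from Nagios,
with no hand editing.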

In addition, we have defined several "roles" that a server may have,
such as dell-hardware, hp-hardware, mail-mx-server, web-server,
dns-server etc. A simple Perl script runs every day on each host and
determines its roles. This information is collected and kept
centrally. Parts of the Nagios config (hostgroups, servicegroups) are
generated based on these roles.
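
As a rough illustration of the role-to-hostgroup step, the sketch below
inverts a host-to-roles mapping and emits one hostgroup per role. It is
written in Python for brevity and is not our actual tooling; the
HOST_ROLES data and the function names are hypothetical, while
hostgroup_name, alias and members are standard Nagios directives.

```python
from collections import defaultdict

# Hypothetical collected data: host name -> roles reported by that host.
HOST_ROLES = {
    "web01": ["dell-hardware", "web-server"],
    "mx01": ["hp-hardware", "mail-mx-server"],
    "web02": ["hp-hardware", "web-server"],
}


def group_by_role(host_roles):
    """Invert host -> roles into role -> sorted list of member hosts."""
    groups = defaultdict(list)
    for host, roles in host_roles.items():
        for role in roles:
            groups[role].append(host)
    # Sort roles and members so the generated file is stable between runs.
    return {role: sorted(members) for role, members in sorted(groups.items())}


def render_hostgroups_cfg(host_roles):
    """Render one Nagios hostgroup definition per role."""
    blocks = []
    for role, members in group_by_role(host_roles).items():
        blocks.append(
            "define hostgroup {\n"
            f"    hostgroup_name  {role}\n"
            f"    alias           {role}\n"
            f"    members         {','.join(members)}\n"
            "}"
        )
    return "\n\n".join(blocks) + "\n"
```

A host that gains or loses a role moves between hostgroups on the next
regeneration, so services attached to a hostgroup follow automatically.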

The NRPE config is the same on all hosts. It is maintained centrally and
distributed to each host daily. Entries in the sudoers file (needed for
some plugins) are added automatically based on the host's roles.

Another point: We generally don't use plugins that require us to
configure the plugin and tailor it for each individual host. For
example, for filesystem monitoring we have created a custom plugin that
monitors all partitions by default. It has an optional configuration file
locally on each host where we can set individual thresholds if needed.
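
The pattern of sensible defaults plus an optional local override file
could look roughly like the Python sketch below. This is not our plugin:
the default thresholds, the /etc/check_fs.conf path and its
"mountpoint warn crit" line format are all assumptions for illustration.
The exit codes (0 OK, 1 WARNING, 2 CRITICAL) follow the usual Nagios
plugin convention, and the mount discovery shown is Linux-only.

```python
import os
import shutil

DEFAULT_WARN, DEFAULT_CRIT = 90.0, 95.0  # percent used; assumed defaults
OVERRIDES_FILE = "/etc/check_fs.conf"    # hypothetical path and format


def load_overrides(path):
    """Parse lines of 'mountpoint warn crit'; a missing file means no overrides."""
    overrides = {}
    if not os.path.exists(path):
        return overrides
    with open(path) as fh:
        for line in fh:
            line = line.split("#", 1)[0].strip()  # allow comments
            if not line:
                continue
            mount, warn, crit = line.split()
            overrides[mount] = (float(warn), float(crit))
    return overrides


def classify(used_pct, warn, crit):
    """Return a Nagios status code: 0 OK, 1 WARNING, 2 CRITICAL."""
    if used_pct >= crit:
        return 2
    if used_pct >= warn:
        return 1
    return 0


def check(mounts, overrides, usage=shutil.disk_usage):
    """Check every mount point; return (worst status code, status line)."""
    worst, messages = 0, []
    for mount in mounts:
        warn, crit = overrides.get(mount, (DEFAULT_WARN, DEFAULT_CRIT))
        u = usage(mount)
        if u.total == 0:  # skip pseudo filesystems that report no size
            continue
        pct = 100.0 * u.used / u.total
        status = classify(pct, warn, crit)
        worst = max(worst, status)
        if status:
            messages.append(f"{mount} {pct:.0f}% used")
    label = ("OK", "WARNING", "CRITICAL")[worst]
    detail = ": " + ", ".join(messages) if messages else ""
    return worst, f"DISK {label}{detail}"


def local_mounts(path="/proc/mounts"):
    """Linux-only: mount points whose source is a /dev/ block device."""
    with open(path) as fh:
        fields = (line.split() for line in fh)
        return [f[1] for f in fields if f[0].startswith("/dev/")]
```

A real plugin would call check(local_mounts(),
load_overrides(OVERRIDES_FILE)), print the status line and exit with the
returned code. Because every partition is checked with default
thresholds, the Nagios side needs only one identical service definition
for all hosts.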

Thinking like this should come easily to system administrators who are
used to dealing with large installations. It's all about automation :)

Cheers,
-- 
Trond H. Amundsen <t.h.amundsen at usit.uio.no>
Center for Information Technology Services, University of Oslo

_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users