Reporting and misc rave.

Steve Shipway s.shipway at auckland.ac.nz
Fri Feb 3 00:39:25 CET 2006


> I am writing with mainly a rant about Graphing and Reporting.

Oh yes, I have problems with these, also.  Please feel free to rant away at
length.

> 1 About graphing with Nagios
> Why would one bother when 
>   1.1 Cacti does such a good job
>   1.2 Nagios could check the Cacti RRDs with either 

We use MRTG/routers2 (on a different server) to do the graphing for a
similar reason.  MRTG can query the Nagios agents and extract the numbers
for graphing, and routers2 has a Nagios plugin to link back to Nagios.  The
only minor drawback is that you end up with two queries going to the host
(one from Nagios, one from MRTG) rather than just the one you'd get if using
the Nagios graphing extensions.  This also doesn't work for passive services.

> 2 About reporting
>   2.1 put the availability data in a DB table (prob with an 
> auto-incremented index)
>   2.2 use either
>     2.2.1 ad-hoc SQL queries, or
>     2.2.2 the reporting package of your choice (eg iReport)

Definitely.  Querying the Nagios .log files is a dead end that I've finally
given up on.  It is a simple matter to install MySQL on the same server, set
up a log table, and process your text logs into the database nightly.  Note
that you need a unique index for many applications, and time/host/service is
not unique.  I found we can use time/host/service/state as a unique index
for alerts, and discard any duplicates - but this would not be appropriate
for people who, for example, send multiple syslog critical entries.  We
only load the alerts, not notification and other log entries.
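
As a rough illustration of the load step described above (not the Perl
loader mentioned below), here is a minimal Python sketch.  It assumes the
usual Nagios log layout of "[epoch] SERVICE ALERT: host;service;state;type;
attempt;output" (host alerts omit the service field), and it uses the sqlite3
module as a stand-in for MySQL so that it runs anywhere; in MySQL the
equivalent of INSERT OR IGNORE is INSERT IGNORE.  The table and function
names are hypothetical.

```python
import re
import sqlite3

# Only HOST ALERT and SERVICE ALERT lines match; notifications and
# other log entries are skipped.
ALERT_RE = re.compile(r"^\[(\d+)\] (HOST|SERVICE) ALERT: (.*)$")


def parse_alert(line):
    """Return (time, host, service, state, state_type, output) or None."""
    m = ALERT_RE.match(line.strip())
    if not m:
        return None
    ts, kind, rest = int(m.group(1)), m.group(2), m.group(3)
    if kind == "SERVICE":
        fields = rest.split(";", 5)
        if len(fields) != 6:
            return None
        host, service, state, state_type, _attempt, output = fields
    else:
        fields = rest.split(";", 4)
        if len(fields) != 5:
            return None
        host, state, state_type, _attempt, output = fields
        service = ""  # empty string (not NULL) so the unique index still applies
    return (ts, host, service, state, state_type, output)


def create_schema(conn):
    """Create the (hypothetical) alerts table with the unique index."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS alerts (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               time INTEGER NOT NULL,
               host TEXT NOT NULL,
               service TEXT NOT NULL,
               state TEXT NOT NULL,
               state_type TEXT,
               output TEXT,
               UNIQUE (time, host, service, state))"""
    )


def load_alerts(conn, lines):
    """Insert alert lines, discarding duplicates; return the rows added."""
    rows = [r for r in map(parse_alert, lines) if r is not None]
    before = conn.total_changes
    conn.executemany(
        "INSERT OR IGNORE INTO alerts "
        "(time, host, service, state, state_type, output) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn.total_changes - before
```

Because duplicates are ignored rather than rejected, the same log file can
be fed in again on the next nightly run without creating extra rows.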

I can let people have a copy of my Perl nagioslog->mysql alert data loader
if they want.

Once you've got it into MySQL, there is an ODBC driver for MySQL that you
can use under Windows.  Reporting is massively faster this way!  Not only
that, but you can post-process the data, for example to add an 'in scheduled
downtime' flag to the alert records, which is very useful (why doesn't Nagios
log the downtime state?)
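
The downtime flag could be added along these lines.  This is only a sketch:
since the logs do not record downtime state, it assumes the scheduled
downtime windows are available from somewhere else as (host, service, start,
end) tuples, with a service of None meaning the whole host.  Those tuples
and the function names are hypothetical.

```python
def in_downtime(alert_time, host, service, windows):
    """True if the alert falls inside a downtime window for its host/service."""
    for w_host, w_service, start, end in windows:
        if w_host != host:
            continue
        # A window with service None covers every service on the host.
        if w_service is not None and w_service != service:
            continue
        if start <= alert_time < end:
            return True
    return False


def flag_alerts(alerts, windows):
    """Return copies of the alert records with an 'in_downtime' flag added."""
    return [
        dict(a, in_downtime=in_downtime(a["time"], a["host"], a["service"], windows))
        for a in alerts
    ]
```

The end of each window is treated as exclusive, so an alert logged at the
exact second the downtime ends counts as unscheduled.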

My wishlist for Nagios?
0) Two levels of access - readonly and manage - on a per contact, per
service level.  Don't just give 'manage' access to everyone listed as a
contact for that service.
1) Add downtime (and parent-in-downtime) flags to the log entries, also
alert-disabled flags and all the other status flags.
2) Add optional database plugins for mysql, mssql, oracle.... instead of the
.log files
3) More features in the map functions
4) Something so you can see who will be alerted by a particular service at
any given time
5) A reporting tool for SLA reporting to give % time in unscheduled downtime
over a whole hostgroup.
6) cmd.cgi should have dropdown lists where possible, and make checks for
hostname/servicename validity.  I've already coded this in on our (1.2)
version.
7) Downtime schedules should have an optional flag for 'repeat' that will
reschedule themselves for next day/week etc.
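
To make item 5 concrete, the figure wanted is roughly: outage time not
covered by scheduled downtime, summed over the hostgroup, divided by the
reporting period times the number of hosts.  A minimal sketch, assuming the
outage and scheduled-downtime intervals have already been worked out per
host as (start, end) pairs that do not overlap each other within a host
(the function names and data layout are hypothetical):

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length of the overlap between two intervals (0 if they are disjoint)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def unscheduled_pct(outages, scheduled, hosts, period_start, period_end):
    """Percentage of host-time spent in unscheduled downtime for a hostgroup.

    outages and scheduled map host name -> list of (start, end) intervals.
    """
    total = (period_end - period_start) * len(hosts)
    if total <= 0:
        return 0.0
    down = 0
    for host in hosts:
        for o_start, o_end in outages.get(host, []):
            # Clip each outage to the reporting period.
            o_start, o_end = max(o_start, period_start), min(o_end, period_end)
            if o_end <= o_start:
                continue
            # Subtract the part of the outage covered by scheduled downtime.
            covered = sum(
                overlap(o_start, o_end, s, e) for s, e in scheduled.get(host, [])
            )
            down += (o_end - o_start) - covered
    return 100.0 * down / total
```

For example, with two hosts over a 1000-second period, one outage from 100
to 300 and scheduled downtime from 200 to 400 on the same host, 100 seconds
count as unscheduled out of 2000 host-seconds.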

Thank you for your time!  Maybe I should be posting to the nagios-devel list,
or finding some time to help code these myself...

Steve




_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
::: Messages without supporting info will risk being sent to /dev/null