Nagios-users digest, Vol 1 #2880 - 20 msgs

Stanley.Hopcroft at Dest.gov.au Stanley.Hopcroft at Dest.gov.au
Tue Nov 1 02:42:53 CET 2005


Dear Folks,

This Nagios site (government business/network node admin) is obliged to
report periodically on node availability.

The Nagios availability reporting is good but is not flexible enough to
satisfy our requirements, which include

1 documents suitable for management - graphics but _no_ metrics (ie no
hard SLA processing or need to report on 'aggregate availability'). This
requirement is met here by transforming the availability data into an
Excel workbook and adding hyperlinks to the Nagios availability report
and the trend graph for each host.

2 automatable

3 arbitrary selection of fields; adding new fields

4 arbitrary ordering of availability records

To meet our requirements, I have developed a small and stupid class that
sucks in the Nagios availability data for a specified time period and
then allows client code to mangle it, sort it etc by providing callbacks.

Here is an example of client code extracting records for nodes matching
the regex ^Armi, reporting on two time periods and sorting (descending,
ie worst first) by the max of TOTAL_TIME_DOWN and TOTAL_TIME_UNREACHABLE.
The output shown here is debug format (the field widths can be expanded)
but output as CSV or an Excel workbook is also provided - that is the
means of meeting requirement 1.


[sh1517 at acisf011 examples]$ perl -Mblib  ex4a ^Armi

                          DEST_Foo_SLA_hours

HOST_NAME                 %_TOTAL_TIME_UP  TOTAL_TIME_DOWN  TIME_DOWN_HHMMSS TOTAL_TIME_UNREA TIME_UNREACH_HHM AVAIL_URL

Armidale_Optus_router_PE_ 95.656%          33170            9h 15m           0                                 http://nms/nagio

Armidale_DEST_router      95.673%          0                                 33040            9h 10m           http://nms/nagio



                          24x7

HOST_NAME                 %_TOTAL_TIME_UP  TOTAL_TIME_DOWN  TIME_DOWN_HHMMSS TOTAL_TIME_UNREA TIME_UNREACH_HHM AVAIL_URL

Armidale_DEST_router      99.882%          290              5m               2120             35m              http://nms/nagio

Armidale_Optus_router_PE_ 99.897%          2110             35m              0                                 http://nms/nagio

[sh1517 at acisf011 examples]$ 
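
The HHMMSS columns in this output look as if they are rounded to the
nearest five minutes (33170 seconds shows as 9h 15m, 290 seconds as 5m)
and left empty for zero. A minimal sketch of a formatter with that
behaviour is below. It is an assumption drawn from the sample output
only, secs_to_hm is a hypothetical name, and it is not the t2hms helper
called in the example script.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT the module's t2hms): format a duration in
# seconds as "Xh Ym", rounded to the nearest five minutes. Anything that
# rounds to zero minutes gives an empty string, as in the sample output.
sub secs_to_hm {
    my ($secs) = @_;
    my $mins = 5 * int( $secs / 300 + 0.5 );    # nearest multiple of 5 minutes
    return '' unless $mins;
    my ( $h, $m ) = ( int( $mins / 60 ), $mins % 60 );
    return join ' ', ( $h ? "${h}h" : () ), ( $m ? "${m}m" : () );
}

# Durations taken from the sample output above.
printf "%-6d => '%s'\n", $_, secs_to_hm($_) for 33170, 33040, 2120, 290, 0;
```

Note that under this assumption an outage shorter than 150 seconds would
also print as an empty cell, which may or may not be what is wanted.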

Here is the code to produce this

use Nagios::Report ;

my $hostname_re = shift @ARGV ;
$hostname_re
  or die <<USAGE;
$0 <hostname | hostname_pattern>

Extracts Nagios Availability report data for host(s) matching the regex
argument.
eg $0 ^Alb
USAGE

my $x = Nagios::Report->new(q<dev_debug>)
  or die "Can't construct Nagios::Report object." ;

sub by_down_time { 
  my %f = @_ ;                          # field name => column index
  my $d = $f{TOTAL_TIME_DOWN} ;
  my $u = $f{TOTAL_TIME_UNREACHABLE} ;
  # sort sets $a and $b as package globals of the package containing the
  # sort call (Nagios::Report). A callback compiled in the default package
  # (main::) would refer to $main::a and $main::b, which are unset there,
  # hence the package switch below.
  package Nagios::Report ;
  my $x = $a->[$d] >= $a->[$u] ? $a->[$d] : $a->[$u] ;
  my $y = $b->[$d] >= $b->[$u] ? $b->[$d] : $b->[$u] ;
  $y <=> $x ;                           # descending: largest outage first
}

$x->mkreport(
                [ qw(HOST_NAME PERCENT_TOTAL_TIME_UP TOTAL_TIME_DOWN
                     TIME_DOWN_HHMMSS TOTAL_TIME_UNREACHABLE
                     TIME_UNREACH_HHMMSS AVAIL_URL) ],

                sub { my %F = @_; $F{HOST_NAME} =~ /$hostname_re/ },

                \&by_down_time,

                sub {   $_ = shift @_; my %F = @_;
                        my $d = $F{TOTAL_TIME_DOWN} ;
                        my $u = $F{TOTAL_TIME_UNREACHABLE} ;
                        push @$_, 
                                &t2hms($d),
                                &t2hms($u) ;
                        qw(TIME_DOWN_HHMMSS TIME_UNREACH_HHMMSS)
                }
) ;



$x->debug_dump ;
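
The remark in by_down_time about $a and $b can be checked in isolation.
The sketch below uses a stand-in package, Toy::Sorter (a hypothetical
name, not part of Nagios::Report), whose sort_rows calls sort with a
comparator supplied by the caller. The comparator switches package so
that its $a and $b are the ones sort actually sets.

```perl
use strict;
use warnings;

package Toy::Sorter;

# Stand-in for a class that sorts rows with a caller-supplied comparator.
# sort sets $Toy::Sorter::a and $Toy::Sorter::b, because the sort call
# is compiled in this package.
sub sort_rows {
    my ( $cmp, @rows ) = @_;
    return sort $cmp @rows;
}

package main;

# A comparator written as plain sub { $b->[1] <=> $a->[1] } here would be
# compiled in main:: and would read $main::a and $main::b, which sort
# never sets. Switching package inside the sub fixes that.
my $by_second_desc = sub {
    package Toy::Sorter;
    $b->[1] <=> $a->[1];    # descending on the second column
};

my @rows   = ( [ r1 => 290 ], [ r2 => 33170 ], [ r3 => 2120 ] );
my @sorted = Toy::Sorter::sort_rows( $by_second_desc, @rows );

print join( ' ', map { $_->[0] } @sorted ), "\n";    # prints "r2 r3 r1"
```

Writing the comparator with fully qualified names ($Toy::Sorter::a and
$Toy::Sorter::b) would be an alternative to the package switch.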

One constructs a report object and then calls mkreport() with arguments
that

- specify the names and order of the fields in the report (no list
  means get all the fields specified in the CSV version of the Nagios
  availability report)
- select availability data (sub { my %F = @_; $F{HOST_NAME} =~
  /$hostname_re/ })
- sort the records (the fairly unattractive &by_down_time sort
  subroutine)
- munge the data (by adding two new fields, the hhmmss formatted
  downtimes).
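
The four steps above can be sketched without the module at all. What
follows is a toy, not the Nagios::Report implementation: toy_report,
worst, WORST_SECS and the Other_router record are hypothetical names
made up for illustration, and the comparator is handed the two records
as arguments instead of reading $a and $b, which is a deliberate
simplification.

```perl
use strict;
use warnings;

# Toy records standing in for parsed availability data (first two rows
# use the 24x7 figures from the sample output; the third is made up).
my @fields = qw(HOST_NAME TOTAL_TIME_DOWN TOTAL_TIME_UNREACHABLE);
my @rows   = (
    [ 'Armidale_Optus_router_PE_', 2110, 0    ],
    [ 'Armidale_DEST_router',      290,  2120 ],
    [ 'Other_router',              0,    0    ],
);
my @cols = qw(TOTAL_TIME_DOWN TOTAL_TIME_UNREACHABLE);

sub worst { my ( $d, $u ) = @_; return $d >= $u ? $d : $u }

# Hypothetical stand-in for mkreport(): select, sort, mangle, then project.
sub toy_report {
    my ( $want, $select, $cmp, $mangle ) = @_;
    my @names   = @fields;
    my %idx     = map { $names[$_] => $_ } 0 .. $#names;    # name => column index
    my $as_hash = sub { my $r = shift; map { $_ => $r->[ $idx{$_} ] } @fields };

    # Copy each row so the mangler does not alter the source data.
    my @kept = grep { $select->( $as_hash->($_) ) } map { [@$_] } @rows;
    @kept = sort { $cmp->( $a, $b, \%idx ) } @kept;

    # The mangler appends values to the row and returns the new field names.
    my @new;
    @new = $mangle->( $_, $as_hash->($_) ) for @kept;
    push @names, @new;
    %idx = map { $names[$_] => $_ } 0 .. $#names;

    return map { my $r = $_; [ map { $r->[ $idx{$_} ] } @$want ] } @kept;
}

my @out = toy_report(
    [qw(HOST_NAME WORST_SECS)],
    sub { my %F = @_; $F{HOST_NAME} =~ /^Armi/ },
    sub { my ( $x, $y, $i ) = @_; worst( @$y[ @$i{@cols} ] ) <=> worst( @$x[ @$i{@cols} ] ) },
    sub { my $r = shift; my %F = @_; push @$r, worst( @F{@cols} ); 'WORST_SECS' },
);

print join( ',', @$_ ), "\n" for @out;
```

With the toy data this keeps the two Armidale records, puts the one with
the larger outage first, and appends the computed column to each.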

Finally, one calls either excel_dump, csv_dump or debug_dump to get some output.

The code is Perl; support is best effort.

Is anyone interested in such a thing? Or, if there is a better mousetrap,
I would be very grateful to hear about it.

Yours sincerely.




