Nagios SLAs .... [SEC=UNCLASSIFIED]

Stanley.Hopcroft at Dest.gov.au Stanley.Hopcroft at Dest.gov.au
Fri Aug 17 07:36:28 CEST 2007


Dear Folks,

Is anyone interested in a toy like this? Here is Nagios::SLA running on
the CLI:


foo$ host_down_report -t thismonth | perl -MNagios::SLA -lane '
    BEGIN { $x = Nagios::SLA->new(undef, undef, "24x7", undef, "Nagios") }
    next unless /\d{2}-\d{2}/;
    print $_, "\t", $x->sla_down("$F[1] $F[2]", "$F[3] $F[4]")'
Broken_Hill_pe        02-08-2007 01:12:33  02-08-2007 01:18:33  6m 0s       360
Wollongong            02-08-2007 15:02:03  02-08-2007 15:08:33  6m 30s      360
Orange                08-08-2007 13:21:05  08-08-2007 13:26:15  5m 10s      300
Sydney_pe             13-08-2007 07:31:51  13-08-2007 09:00:11  1h 28m 20s  5340
Darwin-backup_pe      14-08-2007 16:32:01  14-08-2007 17:45:11  1h 13m 10s  4380
Hobart_Harrington_St  15-08-2007 12:28:40  15-08-2007 16:00:10  3h 31m 30s  12720


foo$ host_down_report -t thismonth | perl -MNagios::SLA -lane '
    BEGIN { $x = Nagios::SLA->new(undef, undef, undef, undef, "Nagios") }
    next unless /\d{2}-\d{2}/;
    print $_, "\t", $x->sla_down("$F[1] $F[2]", "$F[3] $F[4]")'
Broken_Hill_pe        02-08-2007 01:12:33  02-08-2007 01:18:33  6m 0s       0
Wollongong            02-08-2007 15:02:03  02-08-2007 15:08:33  6m 30s      360
Orange                08-08-2007 13:21:05  08-08-2007 13:26:15  5m 10s      300
Sydney_pe             13-08-2007 07:31:51  13-08-2007 09:00:11  1h 28m 20s  3600
Darwin-backup_pe      14-08-2007 16:32:01  14-08-2007 17:45:11  1h 13m 10s  4380
Hobart_Harrington_St  15-08-2007 12:28:40  15-08-2007 16:00:10  3h 31m 30s  12720


host_down_report is a small public application (see Nagios::Report) that
outputs the host availability report on the CLI.

Nagios::SLA is an unpublished Perl module that computes the amount of
down time in an outage according to an SLA.

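In the first case, the SLA is called "24x7", so the SLA outage (last
column) is simply the outage interval in seconds. The seconds part of
each time stamp appears to be dropped: the last outage, shown as 3h 31m
30s, is counted from 12:28 to 16:00, i.e. 212 minutes or 12,720 seconds.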

In the second case, the SLA argument is left undefined, so the default
SLA of Mon to Fri, 8 am to 6 pm, applies. The 6 minute outage between
01:12:33 and 01:18:33 on Thu 2 Aug 2007 (Australia uses day-month-year
dates) therefore contributes 0 seconds of downtime to the SLA.

The method sla_down() should take a pair of time stamps representing the
outage interval (i.e. DOWN, UP) and return the number of seconds for
which the outage overlapped the SLA.
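
To make the intended result concrete, here is a minimal sketch in Python
of the overlap arithmetic. It is not the Nagios::SLA code (which is Perl
and uses bit maps); the function name sla_overlap_seconds and its
parameters are made up for illustration. It assumes the day-month-year
stamps shown in the report output, and it drops the seconds part of each
stamp, which is only an inference from the figures posted above.

```python
from datetime import datetime, timedelta

FMT = "%d-%m-%Y %H:%M:%S"  # day-month-year, as in the report output


def sla_overlap_seconds(down, up, start_hour=8, end_hour=18, weekdays=range(5)):
    """Seconds of the outage [down, up) that fall inside the SLA window."""
    # Assumption: drop seconds, to mirror the whole-minute figures in the post.
    t0 = datetime.strptime(down, FMT).replace(second=0)
    t1 = datetime.strptime(up, FMT).replace(second=0)
    total = 0
    day = t0.replace(hour=0, minute=0)
    while day <= t1:
        if day.weekday() in weekdays:  # Monday is 0
            w0 = day + timedelta(hours=start_hour)
            w1 = day + timedelta(hours=end_hour)
            overlap = (min(t1, w1) - max(t0, w0)).total_seconds()
            total += max(0, int(overlap))
        day += timedelta(days=1)
    return total


# Default Mon-Fri 8-18 window: the Sydney_pe outage only counts from 08:00.
print(sla_overlap_seconds("13-08-2007 07:31:51", "13-08-2007 09:00:11"))
# A "24x7" window is every day, hours 0 to 24.
print(sla_overlap_seconds("15-08-2007 12:28:40", "15-08-2007 16:00:10",
                          0, 24, range(7)))
```

With these inputs the sketch prints 3600 and 12720, matching the
Sydney_pe row of the second run and the Hobart_Harrington_St row of the
first.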

Computation is done with bit maps that encode the SLA (i.e. 3 bytes/day
in a monthly SLA) and the hour part of the outage. The sla_down() method
supports MySQL and Nagios time stamps and possibly others.
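
The bit map idea can be sketched roughly as follows, again in Python and
not taken from the module. One bit per hour gives 24 bits, or 3 bytes,
per day. The names sla_bitmap, outage_bitmap and sla_hours_hit are
hypothetical, and the sketch only counts whole SLA hours touched by an
outage; how the module itself handles the minutes within an hour is not
described in this post.

```python
import calendar
from datetime import date


def sla_bitmap(year, month, start_hour=8, end_hour=18, weekdays=range(5)):
    """One bit per hour of the month (24 bits = 3 bytes per day)."""
    bits = 0
    days_in_month = calendar.monthrange(year, month)[1]
    for day in range(1, days_in_month + 1):
        if date(year, month, day).weekday() in weekdays:  # Monday is 0
            for hour in range(start_hour, end_hour):
                bits |= 1 << ((day - 1) * 24 + hour)
    return bits


def outage_bitmap(day, first_hour, last_hour):
    """Set a bit for each hour from first_hour to last_hour (inclusive) on one day."""
    bits = 0
    for hour in range(first_hour, last_hour + 1):
        bits |= 1 << ((day - 1) * 24 + hour)
    return bits


def sla_hours_hit(sla_bits, outage_bits):
    """Count the hours set in both maps."""
    return bin(sla_bits & outage_bits).count("1")


sla = sla_bitmap(2007, 8)                           # Mon-Fri 8-18, August 2007
print(sla_hours_hit(sla, outage_bitmap(2, 1, 1)))   # hour 01 on 2 Aug
print(sla_hours_hit(sla, outage_bitmap(13, 7, 8)))  # hours 07 and 08 on 13 Aug
```

For August 2007 the whole map fits in 31 x 3 = 93 bytes. The two calls
print 0 and 1: the early morning hour is outside the SLA, and of hours 07
and 08 only 08 is inside it.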

sla_down() does not work with outages that span months (e.g. from 1
minute to midnight until, say, midday on the following workday, which
happens to be the first of the following month). The dumb workaround for
this would be to split the outage into one piece per month that it
spans. On the other hand, if you have an outage like this, perhaps the
focus should be somewhere else.
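
A minimal sketch of that workaround, in Python rather than Perl, might
look like this. split_by_month is a hypothetical helper, not part of
Nagios::SLA. Each piece ends at the first instant of the next month;
whether sla_down() accepts an UP stamp of 00:00:00 on the first of the
next month is an open question, so the last second of the month may be
the safer end point in practice.

```python
from datetime import datetime


def split_by_month(down, up):
    """Split [down, up) into pieces that each lie within one calendar month."""
    pieces = []
    start = down
    while (start.year, start.month) != (up.year, up.month):
        # First instant of the month after `start`.
        if start.month == 12:
            boundary = datetime(start.year + 1, 1, 1)
        else:
            boundary = datetime(start.year, start.month + 1, 1)
        pieces.append((start, boundary))
        start = boundary
    if start < up:
        pieces.append((start, up))
    return pieces


# Fri 31 Aug 2007, one minute to midnight, until midday on Mon 3 Sep 2007.
for piece_down, piece_up in split_by_month(datetime(2007, 8, 31, 23, 59),
                                           datetime(2007, 9, 3, 12, 0)):
    print(piece_down, "->", piece_up)
```

Each piece would then be passed to sla_down() separately and the results
added up.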

This behaviour of sla_down() arises because the SLA is constructed to
meet my requirement for monthly reports. There are no plans to change
this, although if you report on smaller intervals, e.g. weekly, the SLA
computation should be OK. You simply construct the standard monthly SLA
and compute only on the part of the month that you want to report on.

I am using it to report availability against an SLA of Mon-Fri 8 am to 6
pm (Mon-Fri 8-18). In our case, the outages are stored as rows in a
MySQL table. The report is constructed by iterating over the rows and
computing the SLA outage for that row.

I would say this is not a particularly good module as far as Perl modules
go (where an alpha module would be WWW::Mechanize or LWP), but it may be
helpful, or at least annoy an alpha developer enough to do it properly.

If there is any interest, you will find it in the usual locations.
However, given the problems with ePN and version 3.x, there may be no
time for responding to bugs.

Yours sincerely.

Classification: UNCLASSIFIED

_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users