Nagios 3.0 SLA Reporting

Alan Cooper ajcooper80 at googlemail.com
Tue May 27 16:36:57 CEST 2008


Hi,

We had a similar requirement. The quick and dirty way we managed to get
reports out of Nagios and NDO was to write our own DB interrogation script
in Perl (once you have the schema from the docs and a copy of SQLyog, the
DB is easy to navigate).

To get round the problem of downtime reports for events in the past, I wrote a
script that simply parses the Nagios logs (scheduled downtime events get
logged by Nagios with epoch times, just not acted on), so it's
straightforward to enter a downtime start / stop epoch pair and then,
for each outage, check whether it occurred in a scheduled downtime period.
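
A minimal sketch of that log-parsing idea, written in Python rather than the Perl mentioned above. It assumes the downtime requests appear in nagios.log as EXTERNAL COMMAND lines for SCHEDULE_HOST_DOWNTIME / SCHEDULE_SVC_DOWNTIME, with the start and end epochs as arguments. The names parse_downtimes and in_scheduled_downtime are made up for illustration, and treating a host-level downtime as covering every service on that host is a reporting-policy assumption, not Nagios behaviour.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class Downtime(NamedTuple):
    host: str
    service: Optional[str]  # None means a host-level downtime
    start: int              # epoch seconds
    end: int                # epoch seconds


# Assumed log line shape (check your own nagios.log before relying on it):
# [1211898000] EXTERNAL COMMAND: SCHEDULE_SVC_DOWNTIME;web01;HTTP;1211800000;1211803600;1;0;3600;admin;patching
LINE_RE = re.compile(
    r"^\[(\d+)\] EXTERNAL COMMAND: "
    r"(SCHEDULE_HOST_DOWNTIME|SCHEDULE_SVC_DOWNTIME);(.*)$"
)


def parse_downtimes(lines: Iterable[str]) -> List[Downtime]:
    """Collect (host, service, start, end) tuples from downtime commands in the log."""
    downtimes = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        command, args = match.group(2), match.group(3).split(";")
        try:
            if command == "SCHEDULE_HOST_DOWNTIME":
                host, service = args[0], None
                start, end = int(args[1]), int(args[2])
            else:
                host, service = args[0], args[1]
                start, end = int(args[2]), int(args[3])
        except (IndexError, ValueError):
            continue  # skip malformed lines instead of aborting the whole run
        downtimes.append(Downtime(host, service, start, end))
    return downtimes


def in_scheduled_downtime(downtimes: List[Downtime], host: str,
                          service: Optional[str], outage_start: int) -> bool:
    """True if the outage began inside a matching downtime window."""
    for d in downtimes:
        if d.host != host:
            continue
        # Policy assumption: a host-level downtime excuses all services on that host.
        if d.service is not None and d.service != service:
            continue
        if d.start <= outage_start <= d.end:
            return True
    return False


sample_log = [
    "[1211898000] EXTERNAL COMMAND: SCHEDULE_SVC_DOWNTIME;web01;HTTP;"
    "1211800000;1211803600;1;0;3600;admin;patching",
    "[1211898100] SERVICE ALERT: web01;HTTP;CRITICAL;HARD;3;Connection refused",
]
sample_downtimes = parse_downtimes(sample_log)
```

With the sample above, in_scheduled_downtime(sample_downtimes, "web01", "HTTP", 1211801000) reports the outage as excused, because the downtime was logged even though its window was already in the past when it was entered.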

Would love to hear if anyone has a more elegant solution.

Mohr James wrote:
> Hi All!
>
> We are in the process of moving from Nagios 2.5 to Nagios 3.0 with
> MySQL. We monitor and report services for several customers and thus
> have a number of SLAs to consider. Currently we have a self-written
> reporting mechanism, but the developer is no longer with the company and
> the documentation is lacking in many areas. Since we are using the
> Nagios NDO, we would prefer not to try to force the old mechanism to
> work with 3.0. So, we need a new reporting mechanism. 
>
> I looked at a couple of tools, but found nothing which seems to be close
> to finished and none that address adding downtimes after the fact. 
>
> We cannot simply define a check_period or notification_period and
> consider that, because we need to monitor 24x7 and more or less prove we
> are monitoring even if there is scheduled maintenance. Also there are cases
> where the service is down and it is not our fault and per the SLA we do
> not subtract that time from availability. Therefore, we need a mechanism
> to be able to somehow add downtimes after the fact which then prevents
> the reporting mechanisms from counting that time. 
>
> NagiosSLA seems promising and I downloaded it from SourceForge. However,
> I do not find any mechanism to manage the SLA periods other than simply
> saying to report everything within the check_period. Since we are
> using NDO, creating an extra EventHandler seems like a waste and the
> report_script.pl seems to depend on the DB tables filled by the event
> handler. Looking at the script, I do not see much of a problem changing
> the table and column names. However, as far as I can tell, the
> sla_exclusion table is never really used. The exclusions are read into an
> array ( my @exclusion = retrieveData("sla_exclusion"); ) but @exclusion is
> never used after that. This means that every outage is reported.
>
> Since we already have the data in MySQL, I thought about simply using
> the nagios_scheduleddowntime tables. However, I see a problem with
> outages in the past. As far as I can tell, if you schedule a downtime
> in the past, it is silently ignored. Also, from what I see, the table is
> cleared when the outage is over. Both of these are logical to some
> extent and I think my C is good enough to be able to modify the code to
> either add all outages and not delete them, or, maybe more straightforwardly,
> simply write to a completely different table and avoid changing too much
> existing code.
>
> So, the first question is whether there are any tools available to do
> SLA Reporting properly, FOSS or commercial. If not, does anyone have any
> suggestions about making changes to the existing code as I suggested?
>
> I would be grateful for any input.
>
> Regards,
>
> Jim Mohr
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Microsoft
> Defy all challenges. Microsoft(R) Visual Studio 2008.
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
> ::: Messages without supporting info will risk being sent to /dev/null
>   

