Feature Request

Andreas Ericsson ae at op5.se
Tue Apr 10 17:06:13 CEST 2007


Adam Augustine wrote:
> On 4/4/07, Andreas Ericsson <ae at op5.se> wrote:
>> My apologies. I'll rephrase:
>> As the scheduled downtime only affects notifications, all state changes get
>> logged as usual. Hence, they do not affect availability reports in a negative
>> way. However, the start and end of scheduled downtime get logged as well,
>> and those numbers get displayed on the availability reports too. For detailed
>> explanations on how to read availability reports, refer to the docs.
>>
>>> I'll continue to research it.
>>>
>> Do so by reading the docs. You won't find a better explanation elsewhere.
>> www.nagios.org. Click your way from there.
>>
>> --
>> Andreas Ericsson                   andreas.ericsson at op5.se
> 
> We use the availability reports extensively, and there are a few bugs
> that we have fixed over the last couple of years as well as some
> additional functionality we have added. As I recall, the bugs had to
> do with improper calculation of the "Time Undetermined" field in
> certain circumstances, and the correct assignment of values in the
> parentheses based on what you tell avail.cgi that the undetermined
> time should be counted towards.
> 
> We had to patch the Nagios core to handle a new variable ("SLA
> Target") for our modified version of the Availability report, and we
> patched some of the Scheduled Downtime logging to make sure the
> downtime states got logged in a way that would properly report the
> state across log file rotations and reloads. A side effect of this is
> that you no longer need to backtrack through all the logs to the
> start of scheduled downtime for it to show up properly.
> 
> We added a new "SERVICEGROUP SUMMARY" report page that takes all the
> service groups and summarizes their availability numbers. We also sort
> based first on whether they are in SLA or out, and then based on
> "Actual SLA", meaning the "uptime" they actually achieved. We also
> changed the number in parentheses to represent the "scheduled
> downtime" spent in that state.
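
The two-level sort described above could be sketched roughly as follows. This is an illustration in Python, not the patched avail.cgi code: the `GroupRow` type, its field names, and the sample figures are hypothetical, and the direction of the sort (out-of-SLA groups first, then ascending "Actual SLA") is an assumption, since the message does not say which way it runs.

```python
from dataclasses import dataclass


@dataclass
class GroupRow:
    """One line of a hypothetical service group summary."""
    name: str
    actual_sla: float  # percent "uptime" actually achieved
    sla_target: float  # percent required, standing in for "SLA Target"


def in_sla(row):
    """A group is in SLA when it met or exceeded its target."""
    return row.actual_sla >= row.sla_target


def sort_summary(rows):
    # Assumed ordering: out-of-SLA groups first (False sorts before True),
    # then ascending Actual SLA within each half.
    return sorted(rows, key=lambda r: (in_sla(r), r.actual_sla))


rows = [
    GroupRow("web", 99.95, 99.9),
    GroupRow("mail", 99.20, 99.5),
    GroupRow("dns", 99.99, 99.9),
    GroupRow("backup", 98.70, 99.0),
]

for row in sort_summary(rows):
    status = "in SLA" if in_sla(row) else "OUT"
    print(f"{row.name:8} {row.actual_sla:6.2f}%  target {row.sla_target:5.1f}%  {status}")
```

Sorting on a tuple keeps both criteria in a single key, so reversing either assumption only means negating or swapping one element of the tuple.
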
> 
> So time for a particular service gets divided up into 9 "buckets":
> 
> 1) OK - Not scheduled down (called "Unscheduled" in the availability
> detail reports)
> 2) OK - Scheduled Down
> 3) WARN - Not scheduled down
> 4) WARN - Scheduled Down
> 5) CRIT - Not scheduled down
> 6) CRIT - Scheduled Down
> 7) UNKN - Not scheduled down
> 8) UNKN - Scheduled Down
> 9) Undetermined
> 
> For purposes of our reporting, we decided that only #5 (Unscheduled
> CRITICAL time) should count against the SLA, so basically the "Actual
> SLA" column is 100% minus the unscheduled CRIT percentage.
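
A minimal sketch of that arithmetic in Python, for illustration only. The bucket names, the `actual_sla` function, and the sample durations are hypothetical and do not come from avail.cgi. It also assumes the percentage is taken over the total of all nine buckets, Undetermined included, which the message does not specify.

```python
# Hypothetical names for the nine buckets listed above; these are not
# identifiers from avail.cgi.
BUCKETS = (
    "ok_unscheduled", "ok_scheduled",
    "warn_unscheduled", "warn_scheduled",
    "crit_unscheduled", "crit_scheduled",
    "unkn_unscheduled", "unkn_scheduled",
    "undetermined",
)


def actual_sla(seconds_by_bucket):
    """Return 100% minus the share of time spent in unscheduled CRITICAL."""
    total = sum(seconds_by_bucket.get(name, 0) for name in BUCKETS)
    if total == 0:
        # Assumption: with no recorded time, nothing counts against the SLA.
        return 100.0
    crit = seconds_by_bucket.get("crit_unscheduled", 0)
    return 100.0 * (1 - crit / total)


# One week (604800 seconds) of made-up data for a single service.
week = {
    "ok_unscheduled": 590_000,
    "crit_scheduled": 7_200,
    "crit_unscheduled": 6_048,
    "undetermined": 1_552,
}

print(f"Actual SLA: {actual_sla(week):.2f}%")  # prints "Actual SLA: 99.00%"
```

Only the unscheduled CRITICAL bucket lowers the result here; the CRITICAL time inside scheduled downtime is part of the total but is not counted against the SLA.
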
> 
> We also made a change so that you could select different "views" from
> a drop down (which basically changed who you were authenticated as,
> from avail.cgi's perspective). This probably isn't useful outside our
> company.
> 
> I had attached some pictures showing how the summary page looks, and
> the view of one of the service group links (which looks almost exactly
> like the original service group detail availability report), but the
> mailing list thought they were too big :-(. So I threw them on flickr
> (http://www.flickr.com/photos/7665289@N05/).
> 
> There may be other things we changed that I am not recalling at the
> moment, but those are the highlights.
> 
> We have never posted the patches because (as I understood it) no more
> work was going into updating the CGIs because the new interface was
> coming out. Since that isn't going to happen until well after 3.0,
> maybe there is interest in the patch set.
> 
> If there is interest, I think we have them mostly broken out enough
> that I could post them somewhere or to the list.
> 

This sounds very interesting indeed. If you could post the patches to this
mailing list I'd be most grateful. :)

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231
