performance data from plugins - does it exist?

Karl DeBisschop karl at debisschop.net
Sat Sep 14 13:50:01 CEST 2002


On Fri, 2002-09-13 at 23:59, Jeremy Tinley wrote:
> On 9/13/02 9:23 PM, Karl DeBisschop (karl at debisschop.net)  said: 
> 
> >These are not currently tracked separately. They could be, but I'd like
> >to get basic performance stuff in, do some bug squashing, and make a
> >formal release. It just doesn't make sense to delay a formal release for
> >these (IMHO).
> 
> Hmm, it wasn't terribly tough to do.  In the check_disk.c, I return 2 
> variables, percentage used and kb free.  Just modify the printf line.  I 
> finished about 4 modifications today (plus some of my custom scripts) so 
> I could get some sample data from the weekend.  Ping, Disk, HTTP and Swap.

Nobody at all said it was tough. I just said there were infants with
diapers needing to be changed, jobs with project deadlines, houses that
need to be sold, NT servers with catastrophic disc failures, relatives
getting married....

It simply isn't the only thing on my plate. I suspect the same is true
for most of us.

As for your patches, I'd be glad to have them posted to the list
(nagiosplug-devel). People can comment on them, and they can get added
to CVS. 

> Maybe you can help me with saving my diff as a patch, or committing to CVS.

Forgive my bluntness, but if I need to teach you how to make a patch,
then I'm certainly not going to authorize you to commit directly to CVS.
Nor does your posting to me alone instead of the plugin development list
suggest a great commitment to working with the group, another
prerequisite for CVS access.

As for how to make a patch, I refer you to
http://nagiosplug.sourceforge.net/developer-guidelines.html#SUBMITTINGCHANGES
with the additional comment that a patch is simply the output of diff. I
prefer unified diffs. I think the following works:

cvs diff -U 4 Makefile.am | tail -n +5

Ah, but now I've spent 20 minutes looking this up, which means I
will basically not have time to work on actual coding or to look at
patches :-(

> >To start with, let's say "time" is one. Units will be defined to be
> >seconds.
> >
> >I have taken a quick stab at logging time for check_http.
> >
> >Turns out it's not quite as clear as one might think. If the plugin 
> >returns a CRITICAL state because a desired string is not found, should
> >a time be logged? Probably depends on the user. There are at least a few
> >judgement calls like that. I assume errors on 404s should not be logged,
> >generally, but that again may depend on the application.
> >
> >But I took a crack at it, and I'll commit it for comment.
> 
> 
> In this example, I would return response time of null and a status code 
> of 404 (or whatever HTTP status code was returned).

Does everyone else agree? Are all handlers able to accept a response
time of null? I assume that would be 'time='. IIRC, that is not in the
syntax, so I'm not even sure nagios will like it.
 
> Part of the information that gets logged is the state (OK, CRIT, WARN, 
> UNKNOWN).  I suppose you could log EVERYTHING to a database that is 
> returned from the plugin...  but that seemed a bit overkill to me.

So state (returned by the plugin) is another standard variable? Probably
'status', such as '404', should be one as well.

Should 'state' be the number or the word?
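
One consideration: if the number is logged, the word can always be
derived later. A rough sketch in C, assuming the usual plugin return
codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN); the function name
state_word is hypothetical and does not exist in the plugins:

```c
#include <stddef.h>

/*
 * Hypothetical helper: map a plugin return code to its state word.
 * Anything outside 0..3 is reported as "UNKNOWN".
 */
const char *state_word(int state)
{
    static const char *const words[] = {
        "OK", "WARNING", "CRITICAL", "UNKNOWN"
    };

    if (state < 0 || state > 3)
        return "UNKNOWN";
    return words[state];
}
```

Going the other way (word to number) needs string comparisons and an
agreed spelling, so the number is probably the easier thing to store.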

> My database consists of host, timestamp, Nagios description, State, 
> variable and metric with a key off of host,timestamp,description and 
> variable.  The PHP code takes a host, description and variable as the 
> query, and optionally a timestamp.
> 
> Getting back to the task at hand, the code should be sufficiently 
> modified to include what each possible variable returned and its 
> corresponding unit in the --help/-h screen.

Above you said the task was trivial -- just modify the printf line. But
now we also need to modify the help page. And there are 38 plugins in C
alone, and many plugins have more than one printf line, and many have
interesting variables that are not currently tracked - like the DNS
resolution time that one user requested for check_http.

I've now spent a full half hour just on this letter. Is it now becoming
clear how going into too much detail could distract from other things
that are essential for a true release?

I do feel that performance data should be part of the formal 1.3.0
release, although it's never been discussed to my knowledge. I just don't
want to spend time logging performance data for every conceivable
variable on this pass, especially if it requires tracking new timers and
variables for them, and other ancillary coding. I'd rather focus on
getting a formal release that includes an acceptable level of detail,
done in a way that most (all?) users and developers feel makes
sense.

--
Karl

> -J
> 
