Really slow log2ndo

Andreas Ericsson ae at op5.se
Thu Sep 18 12:20:20 CEST 2008


Benjamin Krein wrote:
> On Sep 18, 2008, at 4:04 AM, Andreas Ericsson wrote:
> 
>> Mikael Fridh wrote:
>>> On Wed, Sep 17, 2008 at 7:46 PM, Andreas Ericsson <ae at op5.se> wrote:
>>>> Benjamin Krein wrote:
>>>>> Is there a reason log entries aren't split up into multiple fields in
>>>>> the NDO DB?  It seems kind of silly to put the entire log line in a
>>>>> single field when there are very clear delimiters in the line.
>>> These are a few excerpts of the logentries;
>>>
>>> Auto-save of retention data completed successfully.
>> Junk.
>>
>>> LOG ROTATION: DAILY
>> Implementation detail junk.
>>
>>> CURRENT HOST STATE: bernicla2;UP;HARD;1;PING OK - Packet loss = 0%,
>>> RTA = 0.50 ms
>> Useful, but superfluous (especially with LOG_ROTATION: Daily).
> 
> I don't understand this response.  I see some important stats in there  
> that I'd want to put on a graph.  Not sure why you consider it  
> superfluous.
> 

Because it's CURRENT HOST STATE, which just re-iterates the latest
HOST ALERT, but does it on every log rotation. In other words, it's
superfluous for data analysis.

>>> ndomod: Successfully reconnected to data sink!  0 items lost, 88
>>> queued items to flush.
>> NEB junk. Will go to separate logfile in a not too far off future.
>>
>>> CURRENT SERVICE STATE: rt-130-238-128-160-27;PING;OK;HARD;1;PING OK -
>>> Packet loss = 0%, RTA = 2.77 ms
>>>
>> See comment for CURRENT HOST STATE entry.
>>
>>> There are only clear delimiters for some types of entries, not all.
>>> And if you do separate the types into different tables it will stop
>>> being a Log... A log is where you go to check historically what
>>> happened during a certain period of time. If you split it up in
>>> separate tables it will be more complex to get an excerpt on just that
>>> - all events during a certain period.
> 
> Well, each of the various log types has clear delimiters.  Doesn't
> seem like the logic involved in deciding which items belong in which  
> field based on the log type would be that difficult.
> 

You're contradicting yourself. No, it's not hard to determine where
each logentry should go. Nagios does it automatically when issuing
its callbacks.
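
For illustration only, splitting the delimited entries quoted above
could look like the sketch below. It is Python rather than the C of
ndoutils, the field names are assumptions inferred from the two
excerpts in this thread (not NDOutils column names), and entries
without a known layout are passed through as raw text.

```python
# Hypothetical field layouts, inferred from the excerpts quoted in this
# thread; these names are assumptions, not NDOutils columns.
FIELDS = {
    "CURRENT HOST STATE": (
        "host", "state", "state_type", "attempt", "output"),
    "CURRENT SERVICE STATE": (
        "host", "service", "state", "state_type", "attempt", "output"),
}


def split_logentry(line):
    """Split one log message into (entry_type, fields).

    Entries with no known delimiter layout come back as
    (None, {"raw": line}) so nothing is silently dropped.
    """
    entry_type, sep, rest = line.partition(": ")
    names = FIELDS.get(entry_type)
    if not sep or names is None:
        return None, {"raw": line}
    # Limit the split so semicolons in the plugin output stay intact.
    values = rest.split(";", len(names) - 1)
    if len(values) != len(names):
        return None, {"raw": line}
    return entry_type, dict(zip(names, values))
```

Free-form lines such as "Auto-save of retention data completed
successfully." fall through to the raw case, which matches the point
that only some entry types have clear delimiters.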

>>>
>>>> I imagine the current structure was designed to facilitate displaying
>>>> the entire log though. Your guess is as good as mine, as it was a long
>>>> time ago I took a look at the ndoutils code.
>>> It's a log on disk, thus it's a log in the database. At least that's my
>>> take on it.
>>> I would use MyISAM MERGE tables or PARTITIONS for this if you want to
>>> keep an "online" archive of the logs.
>>> With the MERGE trick you can compress old tables and rejoin them in
>>> the merge periodically if you'd like.
>>>
>> Or one can just print the logfiles from disk. They're already partitioned
>> into rotated logfiles so it's not that much of a chore.
>>
>>>>> I'm contemplating writing my own parser for the archived logs, but I'm
>>>>> tempted to modify the NDO code to make use of multiple fields.
>>> You could modify the ndo code to recognize more event types. There are
>>> a few logentry_type IDs that are duplicates, it seems.
> 
> My knowledge of C is pretty weak.  I'll see what I can come up with.
> 

Don't bother with the LOG_DATA type. Just use the other nebcallbacks
and insert their data into various tables. That'll be a whole lot
better actually.
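
As a rough sketch of what "various tables" could buy you: the example
below uses Python with an in-memory SQLite database purely for
illustration. The table names, columns and demo rows are all
hypothetical (this is not the NDOutils schema), but it shows per-type
tables with indexed fields that can still be stitched back into one
time-ordered log view.

```python
import sqlite3


def build_demo_db():
    """Create hypothetical per-event-type tables with placeholder rows."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE notification (ts INTEGER, host TEXT,
                                   contact TEXT, output TEXT);
        CREATE TABLE downtime (ts INTEGER, host TEXT, event TEXT);
        CREATE INDEX notification_host ON notification (host);
        CREATE INDEX downtime_host ON downtime (host);
    """)
    # Placeholder demo data, not taken from any real log.
    db.execute("INSERT INTO notification VALUES (?, ?, ?, ?)",
               (1221733200, "bernicla2", "admin", "PING CRITICAL"))
    db.execute("INSERT INTO downtime VALUES (?, ?, ?)",
               (1221733100, "bernicla2", "STARTED"))
    return db


def combined_log(db, start, end):
    """Concatenate the per-type tables into one time-ordered log view."""
    rows = db.execute("""
        SELECT ts, 'NOTIFICATION' AS kind, host,
               contact || ';' || output AS detail
          FROM notification WHERE ts BETWEEN :start AND :end
        UNION ALL
        SELECT ts, 'DOWNTIME', host, event
          FROM downtime WHERE ts BETWEEN :start AND :end
        ORDER BY ts
    """, {"start": start, "end": end})
    return rows.fetchall()
```

Calling combined_log(build_demo_db(), 0, 2000000000) returns both demo
rows ordered by timestamp, while each table remains queryable on its
own indexed host column.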

>>>
>>> The event history of objects (services and hosts) is already in
>>> nagios_statehistory, so what else is there in the log that you could
>>> gain so much from parsing out into separate tables/fields?
>>>
>> Notifications. Program start and stop. Downtime start and stop.
>> Adding a *raw* log to the database just duplicates the log that's
>> already saved on disk, so it buys us absolutely nothing while not
>> taking advantage of the indexing that a database can offer. And
>> even if the logfiles would ever go away (not likely), it's totally
>> trivial to concatenate logentries from several tables when one
>> wants to view them.
> 
> This is exactly my point.  What's the point of using a DB if you're  
> just dumping a log entry into it? The timestamps make sense for  
> querying for specific log entries, but as Andreas said, they already  
> live in logically rotated files on disk that are easy to grep.  Since  
> they're already delimited, it would really be more beneficial to be  
> able to query on some of the other fields as well (ie, host names,  
> service descriptions, states, etc.)
> 
> My ultimate goal for using these log entries in the DB is to compile  
> reports based on various metrics that are gathered by Nagios already.   
> The basic trend reports in Nagios & things like NagiosGrapher are ok,  
> but they aren't very flexible.  Pulling that data from the DB would be  
> far easier, but not in the format it's in now.  As it is now, it's no  
> different than just grepping files on disk.
> 

http://www.op5.org/git/nagios/reports-module.git
http://www.op5.org/git/nagios/reports-gui.git

The code is already there. It's already open source, and it's already
being used in production by our 300+ customers.

If you want area graphs from it (as opposed to better, prettier and
more accurate availability reports) I suggest you take a look at the
Netways grapher. I believe you'll find it at http://www.nagiosforge.org.
That uses NDOutils data-format though, so you won't get away from that
log2ndo stuff.

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231
