Really slow log2ndo

Benjamin Krein benk at aweber.com
Thu Sep 18 11:43:49 CEST 2008


On Sep 18, 2008, at 4:04 AM, Andreas Ericsson wrote:

> Mikael Fridh wrote:
>> On Wed, Sep 17, 2008 at 7:46 PM, Andreas Ericsson <ae at op5.se> wrote:
>>> Benjamin Krein wrote:
>>>> Is there a reason log entries aren't split up into multiple fields in
>>>> the NDO DB?  It seems kind of silly to put the entire log line in a
>>>> single field when there are very clear delimiters in the line.
>>
>> These are a few excerpts of the logentries;
>>
>> Auto-save of retention data completed successfully.
>
> Junk.
>
>> LOG ROTATION: DAILY
>
> Implementation detail junk.
>
>> CURRENT HOST STATE: bernicla2;UP;HARD;1;PING OK - Packet loss = 0%,
>> RTA = 0.50 ms
>
> Useful, but superfluous (especially with LOG_ROTATION: Daily).

I don't understand this response.  I see some important stats in there  
that I'd want to put on a graph.  Not sure why you consider it  
superfluous.

>> ndomod: Successfully reconnected to data sink!  0 items lost, 88
>> queued items to flush.
>
> NEB junk. Will go to separate logfile in a not too far off future.
>
>> CURRENT SERVICE STATE: rt-130-238-128-160-27;PING;OK;HARD;1;PING OK -
>> Packet loss = 0%, RTA = 2.77 ms
>>
>
> See comment for CURRENT HOST STATE entry.
>
>> There are only clear delimiters for some types of entries, not all.
>> And if you do separate the types into different tables it will stop
>> being a Log... A log is where you go to check historically what
>> happened during a certain period of time. If you split it up in
>> separate tables it will be more complex to get an excerpt on just that
>> - all events during a certain period.

Well, each of the various log types has clear delimiters.  It doesn't  
seem like the logic involved in deciding which items belong in which  
field based on the log type would be that difficult.
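
A minimal sketch of that logic in Python, assuming only the two delimited
entry types quoted above. The field names and the `parse_log_entry` helper
are illustrative assumptions, not NDO column names or ndoutils code, and
the leading timestamp that the on-disk log carries is left out because the
excerpts above do not show it.

```python
# Map each known log type to the fields that follow the "TYPE: " prefix.
# Field names are illustrative assumptions, not NDO column names.
FIELDS_BY_TYPE = {
    "CURRENT HOST STATE": [
        "host", "state", "state_type", "attempt", "output",
    ],
    "CURRENT SERVICE STATE": [
        "host", "service", "state", "state_type", "attempt", "output",
    ],
}


def parse_log_entry(line):
    """Split a log line into named fields when its type is recognized.

    Unrecognized lines are returned with type None and the raw text intact,
    so free-form entries are kept rather than dropped.
    """
    entry_type, sep, rest = line.partition(": ")
    names = FIELDS_BY_TYPE.get(entry_type)
    if not sep or names is None:
        return {"type": None, "raw": line}

    # Limit the split so any ';' inside the plugin output stays in "output".
    values = rest.split(";", len(names) - 1)
    if len(values) != len(names):
        return {"type": None, "raw": line}

    record = dict(zip(names, values))
    record["type"] = entry_type
    return record


if __name__ == "__main__":
    samples = [
        "CURRENT HOST STATE: bernicla2;UP;HARD;1;"
        "PING OK - Packet loss = 0%, RTA = 0.50 ms",
        "LOG ROTATION: DAILY",
    ]
    for sample in samples:
        print(parse_log_entry(sample))
```

Entries without a known type fall through unchanged, which covers the
free-form lines such as the auto-save and ndomod messages.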

>>
>>
>>> I imagine the current structure was designed to facilitate displaying
>>> the entire log though. Your guess is as good as mine, as it was a long
>>> time ago I took a look at the ndoutils code.
>>
>> It's a log on disk, thus it's a log in the database. At least that's my
>> take on it.
>> I would use MyISAM MERGE tables or PARTITIONS for this if you want to
>> keep an "online" archive of the logs.
>> With the MERGE trick you can compress old tables and rejoin them in
>> the merge periodically if you'd like.
>>
>
> Or one can just print the logfiles from disk. They're already partitioned
> into rotated logfiles so it's not that much of a chore.
>
>>>> I'm contemplating writing my own parser for the archived logs, but I'm
>>>> tempted to modify the NDO code to make use of multiple fields.
>>
>> You could modify the ndo code to recognize more event types. There are
>> a few logentry_type IDs that are duplicates, it seems.

My knowledge of C is pretty weak.  I'll see what I can come up with.

>>
>>
>> The event history of objects (services and hosts) is already in
>> nagios_statehistory, so what else is there in the log that you could
>> gain so much from parsing out into separate tables/fields?
>>
>
> Notifications. Program start and stop. Downtime start and stop.
> Adding a *raw* log to the database just duplicates the log that's
> already saved on disk, so it buys us absolutely nothing while not
> taking advantage of the indexing that a database can offer. And
> even if the logfiles would ever go away (not likely), it's totally
> trivial to concatenate logentries from several tables when one
> wants to view them.

This is exactly my point.  What's the point of using a DB if you're  
just dumping a log entry into it? The timestamps make sense for  
querying for specific log entries, but as Andreas said, they already  
live in logically rotated files on disk that are easy to grep.  Since  
they're already delimited, it would really be more beneficial to be  
able to query on some of the other fields as well (i.e., host names,  
service descriptions, states, etc.).

My ultimate goal for using these log entries in the DB is to compile  
reports based on various metrics that are gathered by Nagios already.   
The basic trend reports in Nagios & things like NagiosGrapher are ok,  
but they aren't very flexible.  Pulling that data from the DB would be  
far easier, but not in the format it's in now.  As it is now, it's no  
different than just grepping files on disk.
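
To illustrate the kind of query that separate fields would allow, here is a
rough sketch using Python's built-in sqlite3 module as a stand-in for the
NDO database. The `parsed_logentries` table, its columns, and both helper
functions are hypothetical and are not part of the NDO schema; the rows in
the usage example are made-up sample data.

```python
import sqlite3


def build_demo_db(rows):
    """Create an in-memory table with one column per parsed log field.

    Table and column names are hypothetical, not part of the NDO schema.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE parsed_logentries ("
        " entry_time INTEGER, host TEXT, service TEXT,"
        " state TEXT, output TEXT)"
    )
    # An index on the split-out fields is what a single raw text column
    # cannot offer.
    conn.execute(
        "CREATE INDEX idx_host_state ON parsed_logentries (host, state)"
    )
    conn.executemany(
        "INSERT INTO parsed_logentries VALUES (?, ?, ?, ?, ?)", rows
    )
    return conn


def state_counts(conn, host):
    """Count log entries per state for a single host."""
    cur = conn.execute(
        "SELECT state, COUNT(*) FROM parsed_logentries"
        " WHERE host = ? GROUP BY state ORDER BY state",
        (host,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    sample_rows = [
        (1, "bernicla2", None, "UP", "PING OK"),
        (2, "bernicla2", None, "DOWN", "PING CRITICAL"),
        (3, "bernicla2", None, "UP", "PING OK"),
    ]
    conn = build_demo_db(sample_rows)
    print(state_counts(conn, "bernicla2"))
```

With the whole log line in one text column, the same report would need a
pattern match against every row instead of an indexed lookup.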


Benjamin Krein
Systems Administrator
AWeber Communications, Inc.





