Patch submission for comments : CGI speed improvement (XNG)

Andreas Ericsson ae at op5.se
Thu Jun 9 22:42:30 CEST 2005


François Laupretre wrote:
>>Yes and no. The comments were in french, much of the code wasn't 
>>indented and mnemonic identifiers doesn't appear to be in the 
>>programming style you use, so it took a while to grok all of it.
> 
> 
> Right. The only thing I wanted was to know if it could be interesting from
> your point of view. I didn't expect people to look at the code. Actually, I
> wanted to know if it was worth the pain to cleanup the code, to comment it,
> and to write a documentation. I spent more than one week on this code and,
> before going on, I wanted to submit an alpha version.
> 

Good plan. :)

> And if you prefer a different indenting style, it is not a problem.
>  

That was just me being stupid. I was reading the patch rather than 
applying it and running indent (I can't read Ethan's code either).

> 
>>A simpler solution to the same problem would be to implement real 
>>hashing instead of the algorithm in current use. Presently it 
>>generates 
>>worse hashes than an internet checksum function and the 
>>compare_hashdata* functions (called from find_*) are 
>>extremely expensive.
> 
> 
> I agree but I saw it as another issue. Of course, the current pseudo-hashing
> system is not very efficient but, until it gets suppressed, it is better to
> transmit the hash tables to the CGIs instead of having them rebuild them.

The largest problem isn't building the hash-tables, but using them to 
find things. In many cases the cgi's won't need all the data but will 
read all of the configuration anyways. This is a major slowdown that 
won't go away until there's some other engine doing (proper) hashing and 
serving what the cgi's want (i.e. an RDBMS).

> There are also too many mallocs in the current nagios code, and several
> memory leaks. But, once again, I tried to find the best ratio of improvement
> versus quantity of changes in code.
> 

I think you're wrong about the memory leaks (although I whole-heartedly 
agree about the malloc()'s). Nagios has struggled recently with severe 
memory leaks and some valgrind sessions has narrowed it down to 
virtually nothing. Perhaps you aren't running recent CVS code?

> 
>>nagios-db isn't the ultimate solution, although using a 
>>database is far superior to flat-files (regardless of format).
> 
> 
> Certainly for the version where CGIs are replaced by some PHP code. And,
> when it is time to work on it, I will be happy to participate. But I have a
> problem now, and I need a solution before one or two years. If you tell me
> that this version 3 is planned for 3rd quarter 2005, it is OK and I won't
> try to add such code.
> 
> Yes, storing information in a database can be better as an alternative, but
> I am not sure that the flat file option will disappear soon.
> 

Neither am I, but it will disappear sooner or later. How about starting 
to work on that PHP Gui right now then? I think the nagios-db 
neb-modules also updates tables which aren't materialized, so using the 
data collected by nagios-db but without materialized views (I really 
don't like those) is always an option.

> 
>>I'm not so sure about this patch you've proposed though. By 
>>necessity it 
>>involves a fair deal of pointer voodoo. Code such as that is 
>>always hell 
>>to maintain, and it gets increasingly difficult to ensure support for 
>>various compilers and architectures. F.e., gcc aligns components of 
>>datastructures on sizeof(register) boundaries with certain 
>>optimization 
>>options, and squeezes them with others, making pointer-hopping using 
>>anything but relative offsets a debugging nightmare. Other compilers 
>>have similar symptoms.
> 
> 
> Yes. This is the drawback. The current structure of the memory used by the
> core makes it quite complex to serialize. Let me just add that we could
> assume that the core and CGI code are compiled on the same machine by the
> same compiler. I know that the code is complex and I already spent much
> debugging time on it. Actually, the really complex part of the code is in
> serialize_object_section(). As I wanted to keep it small, I did a generic
> function, which takes an object description as parameter. This way, the code
> is twice more complex, but 14 times smaller :-).
> 

I noticed. ;) It could do with some optimization though, but that's for 
a rainy day.

> Another thing is the serialize/unserialize functions for status data. If the
> program status data used a structure, and if the core and the CGIs used
> common structures to store the host & service status data, this code could
> use the object generic code, which would suppress nearly everything from
> xsdng.c.
> 
> And, another idea, involving more changes, can be to reorganize all these
> structures in the core after the configuration is read from the obj cfg
> files. At this time, except for strings that would be treated as individual
> mallocs, we know how much size we need for the structures, we can allocate a
> single block of memory for them, and we can replace the linked lists with
> arrays, even for the sub-object lists.

I've made this same suggestion twice already. It's done (partially) by 
the hashing routine, but it's inadequate and the bucket's not big enough 
(at least not with the current hash code).

> The code needed to do that is nearly
> the same as serialize_object_section() but, there, it would be used only
> once. It would provide a much more efficient way of managing the lists,
> would suppress nearly every pointer manipulation, and would maximize the
> serialize/unserialize process speed (nearly everything would be ready to be
> transferred as is to the CGIs).
> 

When it comes to this, I would prefer an in-core thread to service the 
cgi's over local (or remote, even) sockets. Perhaps even with a simple 
query language to speed up serving pages.

> In conclusion : I consider the complexity of this code as a rather low price
> to pay to speed up the CGI execution by a factor of 10 to 30. Now, the
> choice depends of the value you give to this improvement...
> 

It is indeed interesting. I'm a bit of a chicken though, and I don't 
want to test code that I cannot guarantee I can fix in case it breaks 
something.

> Maybe other people with large configurations can say how interesting it
> seems to them.
> 
> Regards
> 
> François
> 

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Lead Developer


-------------------------------------------------------
This SF.Net email is sponsored by: NEC IT Guy Games.  How far can you shotput
a projector? How fast can you ride your desk chair down the office luge track?
If you want to score the big prize, get to know the little guy.  
Play to win an NEC 61" plasma display: http://www.necitguy.com/?r=20




More information about the Developers mailing list