Nagios 2.3 internal server error.

Eli Stair estair at ilm.com
Wed May 10 05:05:32 CEST 2006


Yeah, I think it was that (or b2, have to check, just keep the dir symlinked
for use) that was the last "stable" version for me, without the CGIs
crapping out with alarming regularity.  I've actually seen the daemon die
twice today (the typical 'caught sigsegv, shutting down...') since running
2.3 (not present in 2.2 for me).

Any input from anyone anywhere on the cause?  I still haven't heard a peep
in response other than that I'm not the only one this is happening to....
Offering to try and be of use doesn't seem to be well regarded.  I'm trying
not to sound whiny, but this has been a fairly unresponsive project WRT
acknowledging and fixing problems.   I don't know what more I can do, I
can't even find anyone who seems potentially interested in helping to throw
money at :)

I'm leaving 2.3 running overnight, doing a ps -jHF on it every second, maybe
I'll catch it in some bad act of wedging during some child process spawn, or
amidst a mem leak phase right before it dies... It won't have anything to do
(I'd imagine) with the CGIs dying, but maybe the cause of the segfaults...
Not going to recompile and run the daemon in heavy debug until I get a hint
that it's wanted or useful to my cause.
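
For anyone who wants to do the same kind of once-a-second sampling in a form that is easier to grep afterwards, here is a minimal sketch. It is not what I actually ran. It assumes a Linux procps "ps" that accepts repeated -o options, and the function names, the "nagios" match string, and the log file name are all made up for illustration.

```python
import subprocess
import time

def parse_ps(text, needle="nagios"):
    """Return (pid, ppid, rss_kb, stat, command) for lines whose command contains needle."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 4)
        if len(parts) < 5:
            continue  # skip blank or truncated lines
        pid, ppid, rss, stat, cmd = parts
        if needle in cmd:
            rows.append((int(pid), int(ppid), int(rss), stat, cmd))
    return rows

def sample(needle="nagios"):
    """Take one snapshot of matching processes (assumes procps ps)."""
    # One -o per column; an empty header ("pid=") suppresses the header line.
    cmd = ["ps", "-e", "-o", "pid=", "-o", "ppid=", "-o", "rss=",
           "-o", "stat=", "-o", "args="]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_ps(out, needle)

def watch(interval=1.0, logfile="nagios-ps.log"):
    """Append a timestamped snapshot every `interval` seconds until interrupted."""
    with open(logfile, "a") as fh:
        while True:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            for pid, ppid, rss, stat, cmd in sample():
                fh.write(f"{stamp} pid={pid} ppid={ppid} rss_kb={rss} "
                         f"stat={stat} {cmd}\n")
            fh.flush()
            time.sleep(interval)

# To start sampling, call watch() and stop it with Ctrl-C.
```

Watching rss_kb climb across snapshots, or a pile-up of children in state Z or D just before the daemon dies, would be the kind of thing to look for in the resulting log.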

Thanks for the input Alessandro,

/eli


On 5/9/06 7:30 PM, "Alessandro Ren" <alessandro.ren at opservices.com.br>
wrote:

> 
>     Eli,
> 
> try to use nagios-2.0b4, it doesn't give any errors to me so far.
>     Nagios 2.3 seems to be generating more errors in the CGIs than the
> previous 2.2.
>     I will try to find a pattern in this and look in the code for memory leaks
> and the like.
> 
>     []s.
> 
> Eli Stair wrote: 
>>  Re: [Nagios-devel] Re: Nagios 2.3 internal server error.
>> Sorry for the clutter,  my earlier post was too optimistic... A total of 5000
>> requests via elinks for a host detail shows only three 500 (internal server
>> error) responses on the client side.  At the same time, this generated 63
>> "Premature end of script headers: status.cgi" errors in the apache logs.
>> Only one revealed the referrer URL:
>>  
>>   [Tue May 09 16:23:57 2006] [error] [client 10.73.16.108] Premature end of
>> script headers: status.cgi, referer:
>> https://monitor02/nagios/cgi-bin/status.cgi?hostgroup=deathstar-opteron-850-32G&style=overview
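
The repeated-request test described above can be approximated without elinks. This is only a sketch under assumptions: the function names are hypothetical, the URL is whatever CGI page you want to hammer, and it does not handle HTTP authentication or certificate problems (a self-signed certificate would raise an exception instead of being counted).

```python
import urllib.error
import urllib.request

def http_status(url, timeout=10):
    """Return the HTTP status code for a single GET request."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx; the status code is on the exception.
        return err.code

def count_statuses(url, attempts, fetch=http_status):
    """Request url `attempts` times and tally the status codes seen."""
    counts = {}
    for _ in range(attempts):
        code = fetch(url)
        counts[code] = counts.get(code, 0) + 1
    return counts
```

Calling count_statuses(url, 5000) and comparing counts.get(500, 0) with the number of "Premature end of script headers" lines logged over the same period would show the same kind of client/server mismatch reported here.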
>>  
>> During this period of testing (several hours, 5000+ checks with links, and 10
>> windows open with various UI views refreshing every 5 minutes), only this one
>> message was verbose; the rest were to the effect of:
>>  
>>   [Tue May 09 16:23:52 2006] [error] [client 10.73.16.108] Premature end of
>> script headers: status.cgi
>>  
>> And through the entirety firefox received four 500 pages, links three, and
>> yet 63 "premature" errors were generated by the CGIs.
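
Counting those log entries by hand gets old quickly. A rough sketch of a tally follows; it assumes the Apache error-log line format shown in the excerpts above, with each entry on a single unwrapped line, and the names PATTERN and tally are made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
# [Tue May 09 16:23:52 2006] [error] [client 10.73.16.108] Premature end of script headers: status.cgi
PATTERN = re.compile(
    r"^\[(?P<when>[^\]]+)\] \[error\] \[client (?P<client>[^\]]+)\] "
    r"Premature end of script headers: (?P<script>[^,\s]+)"
    r"(?:, referer: (?P<referer>\S+))?"
)

def tally(lines):
    """Return (per-script error counts, list of referer URLs that were logged)."""
    per_script = Counter()
    referers = []
    for line in lines:
        m = PATTERN.match(line.strip())
        if m is None:
            continue  # not a "Premature end of script headers" entry
        per_script[m.group("script")] += 1
        if m.group("referer"):
            referers.append(m.group("referer"))
    return per_script, referers
```

Feeding it the open error log, for example tally(open(path)) with path pointing at your own Apache error log, gives the per-CGI totals plus the few referer URLs that did get recorded.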
>>  
>> Still looks broken, my bad for being excited.  Will compiling with any of the
>> debug flags set cause the CGIs to output more useful info, or are they only
>> for the nagios daemon as it seems to be?
>>  
>> /eli
>>  
>>  
>> On 5/9/06 2:31 PM, "Eli Stair" <estair at ilm.com> wrote:
>>  
>>   
>>> Hmm, missed this.
>>>  
>>> Gave this code a shot and am still seeing the problem, though it seems at a
>>> _MUCH_ lower rate of frequency.  Of 1200 hits I got only three 500s returned by
>>> the client (previous failure rate was around 1:40).  The other significant
>>> change is I'm not seeing segfaults reported by the CGIs, nor the "premature
>>> end of script headers" message in the apache logs that used to correspond
>>> with these 500s.  I'm guessing this fixed the major problem and the
>>> symptoms of it (segvs, script header issue)...
>>>  
>>> Unless something else was changed, I'd say it's "mostly fixed", or at least
>>> better, likely due to the content_length issue?
>>>  
>>> Thanks devs, I had actually given up hope that this would be tracked down
>>> and addressed.  This is great news (pre-emptively).  No idea if this was
>>> randomly spotted or someone went looking for it due to my (and others')
>>> reports, but either way I appreciate it.
>>>  
>>> Cheers,
>>>  
>>> /eli
>>>  
>>> 2.3 - 05/03/2006
>>>     * Bug fix for negative HTTP content_length header in CGIs
>>>  
>>>  
>>> On 5/9/06 5:22 AM, "Alessandro Ren" <alessandro.ren at opservices.com.br> wrote:
>>>  
>>>   
>>>> 
>>>>     I've updated to nagios 2.3 and I am still getting the internal server
>>>> error from time to time in the CGIs refresh.
>>>>     Eli, have you tried the 2.3 already?
>>>>     Just to let the list know.
>>>>  
>>>  
>>>  
>>>  
>>  
>>  
> 

