strange nagios main process

Stanley Hopcroft Stanley.Hopcroft at IPAustralia.Gov.AU
Wed May 4 11:35:45 CEST 2005


On Wed, May 04, 2005 at 10:01:07AM +0200, Christophe Yayon wrote:
> Hi all,
> 
> I am upgrading our nagios 1.2 (on FreeBSD 5.3-RELEASE) to nagios 2.0
> (currently the latest CVS after 2.0b3) on FreeBSD 5.4-RC2, and I saw a very
> strange thing.
> 
> After a few hours, the nagios main process (nagios -d ...) uses a lot of CPU
> time, and when I run truss on the pid (like strace on Linux), I get a
> looping 'kse_release' message.
> 
> # top
> last pid: 75729;  load averages:  1.81,  2.08,  2.03
> 63 processes:  2 running, 61 sleeping
> CPU states: 12.5% user,  0.0% nice, 16.0% system,  0.0% interrupt, 71.5% idle
> Mem: 36M Active, 1639M Inact, 219M Wired, 68M Cache, 112M Buf, 44M Free
> Swap: 5000M Total, 52K Used, 5000M Free
> 
>   PID USERNAME PRI NICE   SIZE    RES STATE  C   TIME   WCPU    CPU COMMAND
> 40435 nagios   112    0  4688K  3544K CPU0   0 569:46 93.99% 93.99% nagios
> [...]
> 
> 
> # truss -p 40435
> kse_release(0xbfbf9b70)                          ERR#22 'Invalid argument'
> kse_release(0xbfbf9b70)                          ERR#22 'Invalid argument'
> kse_release(0xbfbf9b70)                          ERR#22 'Invalid argument'
> [...]
> 
> What does it mean? I know that there are some issues with pthreads on
> FreeBSD, but is it the same problem?
>

This may be _related_ to the 'Known issues' problem documented at
http://<your_nag_host>/nagios/docs/whatsnew.html 

I was hoping that the different thread (KSE) implementation on FreeBSD 
5.x might deal with this.

(FWIW, FreeBSD 4 definitely shows the documented problem with the pthread 
lib, but it is possible to live with it.)

Knowledgeable people say that the problem is in the thread library 
(pthread); you could follow the documented advice and try the Linux thread 
port.

If you do, please let the list know the result.

Sending your observations to FreeBSD-Stable at FreeBSD.ORG may yield a 
better answer.

You could perhaps live with this by running check_procs from cron and 
invoking the restart script when the %CPU exceeds a threshold.
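
A rough sketch of that kind of watchdog, for illustration only. It does not 
call check_procs; instead it parses `ps -axo pid,pcpu,comm` output directly, 
which is a swapped-in substitute for the plugin. The 90% threshold, the 
restart script path, and the names `busy_pids` and `main` are all assumptions 
made for this example, so adjust them to your installation.

```python
import subprocess
import sys

CPU_THRESHOLD = 90.0      # assumed threshold in percent; tune for your host
PROCESS_NAME = "nagios"   # command name as reported by ps
# Hypothetical restart script path; replace with the one on your system.
RESTART_CMD = ["/usr/local/etc/rc.d/nagios.sh", "restart"]


def busy_pids(ps_output, name=PROCESS_NAME, threshold=CPU_THRESHOLD):
    """Return PIDs of processes called `name` whose %CPU exceeds `threshold`.

    `ps_output` is the text printed by `ps -axo pid,pcpu,comm`.
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) != 3:
            continue
        pid_text, cpu_text, command = fields
        try:
            pid = int(pid_text)
            cpu = float(cpu_text)
        except ValueError:
            continue  # header line or anything else that is not numeric
        if command.strip() == name and cpu > threshold:
            pids.append(pid)
    return pids


def main():
    """Run ps once and restart nagios if a nagios process is spinning."""
    result = subprocess.run(
        ["ps", "-axo", "pid,pcpu,comm"],
        capture_output=True, text=True, check=True,
    )
    if busy_pids(result.stdout):
        subprocess.run(RESTART_CMD, check=False)


# Only act when explicitly asked to, so importing or test-running is harmless.
if __name__ == "__main__" and sys.argv[1:] == ["--run"]:
    main()
```

Saved under a name of your choosing and run from cron every few minutes with 
the `--run` argument, this would restart nagios only when a process named 
nagios is above the threshold at the moment of the check. A single sample of 
%CPU can be noisy, so you may prefer to require several consecutive hits 
before restarting.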

> Thanks in advance...
> 

Yours sincerely.

-- 
Stanley Hopcroft
