[Apan-users] Re: nagios and apan cause server to crash...

Igor Kurtovic igor.kurtovic at qsc.de
Tue Oct 14 14:01:21 CEST 2003


Step back to RH 8.0.

I had similar problems; the only difference was that mine crashed daily :P

Even after changing the reaper frequency there was no improvement.
After going back to RH 8.0 everything is fine again.
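
For reference, the reaper setting is the one Evan and Fredrik mention in the quoted mails below. A sketch of the change as reported in this thread (the values are theirs, not a recommendation, and it did not help in my case):

```
# nagios.cfg
# default reported in the thread: service_reaper_frequency=10
service_reaper_frequency=4
```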

300 hosts
1500 services
400 apan services
150 mrtg-hosts

all on this box:

Dual Xeon III 1 GHz
2 GB RAM

Never had any performance issues or stability problems before moving to RH 9.0.
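
For anyone who still wants the kind of watcher jeff describes in the quoted thread below, here is a minimal sketch in Python, not something anyone in this thread has run. Assumptions that are mine and not from the thread: a 300-second age limit, a procps `ps` that supports the `etimes` column, and the helper names `find_stale_pids` and `kill_stale`. It defaults to a dry run because of the risk jeff mentions of killing valid processes.

```python
import os
import signal
import subprocess

# Assumption (not from the thread): anything older than 300 seconds is "hung".
MAX_AGE_SECONDS = 300


def find_stale_pids(ps_output, script_name="apan.sh", max_age=MAX_AGE_SECONDS):
    """Return PIDs whose command line mentions script_name and whose
    elapsed time exceeds max_age seconds.

    ps_output is text with one process per line: "<pid> <etimes> <args...>".
    """
    stale = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        pid, elapsed, args = fields
        if not (pid.isdigit() and elapsed.isdigit()):
            continue  # skip the header line printed by ps
        if script_name in args and int(elapsed) > max_age:
            stale.append(int(pid))
    return stale


def kill_stale(dry_run=True):
    """Report long-running apan.sh processes; send SIGTERM only if dry_run is False."""
    out = subprocess.run(
        ["ps", "-eo", "pid,etimes,args"],
        capture_output=True, text=True, check=True,
    ).stdout
    for pid in find_stale_pids(out):
        print("stale apan.sh process:", pid)
        if not dry_run:
            try:
                os.kill(pid, signal.SIGTERM)
            except ProcessLookupError:
                pass  # process exited between the ps call and the kill
```

Run `kill_stale()` from cron first and only switch to `kill_stale(dry_run=False)` once the reported PIDs look right. Note that the substring match also catches unrelated commands that merely mention apan.sh (an editor with the script open, for example).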

Regards, Igor



On Tue, 2003-10-14 at 09:25, Fredrik Wänglund wrote:

> I have service_reaper_frequency=3, and I remember that before I changed 
> it from the default, my load used to be 8-10.
> 
> /FredrikW
> 
> Evan Weston wrote:
> 
> >I was having a similar problem under Red Hat 9 on a PIII 900 with 512 MB RAM.
> >
> >I set 'service_reaper_frequency=4' instead of the default 'service_reaper_frequency=10' in the 'nagios.cfg' file and it's completely stable now.
> >
> >Evan Weston
> >
> >
> >-----Original Message-----
> >From: Fredrik Wänglund [mailto:fredrik.wanglund at datavis.se]
> >Sent: Tuesday, 14 October 2003 4:21 PM
> >To: jeff vier
> >Cc: Matthew Wilson; nagios-users; Apan-users List
> >Subject: Re: [Apan-users] Re: [Nagios-users] nagios and apan cause server to crash...
> >
> >What platform/version are you running on?
> >
> >I'm running without any problem under RedHat 8.0 on a PIII 1400MHz with
> >170 hosts, 200 apan-services and 300 'normal' services.
> >My system-load stays between 1 and 2, CPU is mainly >80% idle
> >
> >jeff vier wrote:
> >
> >>I'm having the same problem here.
> >>
> >>I have been capturing dumps of the top command, pulling only active
> >>processes.  It looks like something causes an instance of apan.sh to
> >>hang, and then they just start piling up (fast).
> >>
> >>The load is usually under 1.0 (sometimes jumping up to 1.xx - no big
> >>deal).  When it died, my load was over 80 (yes eighty) with 46 (maybe
> >>more) *active* apan processes (not sure of the actual count, top dump
> >>only shows 62 lines of processes.  It said 73 running, though, so likely
> >>more were apan.sh - also, unknown count of inactive apan.sh process
> >>sitting and waiting), 17 zombies (unknown parent, alas). 99% CPU usage
> >>on CPU0, 100% on CPU1.  Yikes.  This jump happened over 16 minutes, at
> >>which point my crons no longer ran, so who knows how badly it kept
> >>piling up.
> >>
> >>apan.debug log file doesn't show anything abnormal (whee.)
> >>
> >>I'm going to have to write a watcher to manually kill the hanging
> >>apan.sh procs, which I don't want to do for fear of inadvertently
> >>killing valid processes, but I am quite sick of having to go over to the
> >>colo to poke the power button once a week (only been in production 3
> >>weeks - 4 crashes so far).
> >>
> >>I'm going to increase my level of manual debugging, too, of processes,
> >>etc.  I'll post any new insight.
> >>
> >>--jeff
> >>
> >>On Wed, 2003-10-08 at 10:31, Matthew Wilson wrote:
> >>
> >>>UPDATE: I have checked and my nagios installation does not have ePN compiled
> >>>in.  So this is not the cause.  I would greatly appreciate any suggestions
> >>>on how to prevent/cure this problem.
> >>>
> >>>>Thanks
> >>>>Matthew Wilson.
> >>>>
> >>>>>Matthew Wilson wrote:
> >>>>>
> >>>>>>Hi guys,
> >>>>>>I have read in the list archives in the last couple of months a few
> >>>>>>threads about nagios and apan chewing up memory.  I have tried a few
> >>>>>>of the solutions posted but still have no joy.
> >>>
> >>>-------------------------------------------------------
> >>>This SF.net email is sponsored by: SF.net Giveback Program.
> >>>SourceForge.net hosts over 70,000 Open Source Projects.
> >>>See the people who have HELPED US provide better services:
> >>>Click here: http://sourceforge.net/supporters.php
> >>>_______________________________________________
> >>>Nagios-users mailing list
> >>>Nagios-users at lists.sourceforge.net
> >>>https://lists.sourceforge.net/lists/listinfo/nagios-users
> >>>::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
> >>>::: Messages without supporting info will risk being sent to /dev/null
> >>>  
> >>>
> >>>      
> >>>
> >>
> >>-------------------------------------------------------
> >>This SF.net email is sponsored by: SF.net Giveback Program.
> >>SourceForge.net hosts over 70,000 Open Source Projects.
> >>See the people who have HELPED US provide better services:
> >>Click here: http://sourceforge.net/supporters.php
> >>_______________________________________________
> >>Apan-users mailing list
> >>Apan-users at lists.sourceforge.net
> >>https://lists.sourceforge.net/lists/listinfo/apan-users
> 

-- 
********************************

Igor Kurtovic
Technische Systemlösungen
QSC AG

Phone:   +49 221 6698 404
Mobile:  +49 163 6698 075
Fax:     +49 221 6698 469
WWW:     www.q-dsl.de
Email:   igor.kurtovic at qsc.de

********************************
