[Apan-users] Re: nagios and apan cause server to crash...

Fredrik Wänglund fredrik.wanglund at datavis.se
Tue Oct 14 09:25:17 CEST 2003


I have service_reaper_frequency=3, and I remember that before I changed 
it from the default, my load used to be 8-10.
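
For anyone who wants to check what their own config is using, here is a minimal Python sketch. It assumes the plain `name=value` layout of nagios.cfg with `#` comment lines, and falls back to the default of 10 mentioned in this thread when the directive is missing. The helper name and the sample fragment are made up for illustration, and "last occurrence wins" is my assumption, not something checked against Nagios itself.

```python
DEFAULT_REAPER_FREQUENCY = 10  # default quoted in this thread (seconds)


def reaper_frequency(cfg_text, default=DEFAULT_REAPER_FREQUENCY):
    """Return the effective service_reaper_frequency from nagios.cfg text.

    Hypothetical helper: assumes `name=value` lines and `#` comments.
    If the directive appears more than once, the last one is returned
    (an assumption, not verified against Nagios).
    """
    value = default
    for raw in cfg_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        name, sep, rest = line.partition("=")
        if sep and name.strip() == "service_reaper_frequency":
            value = int(rest.strip())
    return value


# Illustrative fragment only, not a complete nagios.cfg.
sample = """
# main config (excerpt)
log_file=/var/log/nagios/nagios.log
service_reaper_frequency=3
"""

print(reaper_frequency(sample))
```

To use it on a real system, read the file contents first (the path depends on the install) and pass the text to `reaper_frequency`.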

/FredrikW

Evan Weston wrote:

>I was having a similar problem under Red Hat 9 on a PIII 900 with 512 MB of RAM.
>
>I set 'service_reaper_frequency=4' instead of the default 'service_reaper_frequency=10' in the 'nagios.cfg' file, and it's completely stable now.
>
>Evan Weston
>
>
>-----Original Message-----
>From: Fredrik Wänglund [mailto:fredrik.wanglund at datavis.se]
>Sent: Tuesday, 14 October 2003 4:21 PM
>To: jeff vier
>Cc: Matthew Wilson; nagios-users; Apan-users List
>Subject: Re: [Apan-users] Re: [Nagios-users] nagios and apan cause server to crash...
>
>What platform/version are you running on?
>
>I'm running without any problem under RedHat 8.0 on a PIII 1400MHz with
>170 hosts, 200 apan-services and 300 'normal' services.
>My system load stays between 1 and 2, and the CPU is mostly more than 80% idle.
>
>jeff vier wrote:
>
>>I'm having the same problem here.
>>
>>I have been capturing dumps of the top command, pulling only active
>>processes.  It looks like something causes an instance of apan.sh to
>>hang, and then they just start piling up (fast).
>>
>>The load is usually under 1.0 (sometimes jumping up to 1.xx - no big
>>deal).  When it died, my load was over 80 (yes, eighty) with at least 46
>>*active* apan processes.  I'm not sure of the actual count: the top dump
>>only shows 62 lines of processes, but it reported 73 running, so likely
>>more of them were apan.sh, plus an unknown number of inactive apan.sh
>>processes sitting and waiting.  There were also 17 zombies (unknown
>>parent, alas), with 99% CPU usage on CPU0 and 100% on CPU1.  Yikes.
>>This jump happened over 16 minutes, at which point my crons no longer
>>ran, so who knows how badly it kept piling up.
>>
>>The apan.debug log file doesn't show anything abnormal (whee).
>>
>>I'm going to have to write a watcher to manually kill the hanging
>>apan.sh procs, which I don't want to do for fear of inadvertently
>>killing valid processes, but I am quite sick of having to go over to the
>>colo to poke the power button once a week (only been in production 3
>>weeks - 4 crashes so far).
>>
>>I'm also going to step up my manual debugging of processes, etc.
>>I'll post any new insight.
>>
>>--jeff
>>
>>On Wed, 2003-10-08 at 10:31, Matthew Wilson wrote:
>>
>>>UPDATE: I have checked and my nagios installation does not have ePN compiled
>>>in.  So this is not the cause.  I would greatly appreciate any suggestions
>>>on how to prevent/cure this problem.
>>>
>>>>Thanks
>>>>Matthew Wilson.
>>>>
>>>>>Matthew Wilson wrote:
>>>>>
>>>>>>Hi guys,
>>>>>>I have read in the list archives in the last couple of months a few
>>>>>>threads about nagios and apan chewing up memory.  I have tried a few
>>>>>>of the solutions posted but still have no joy.
>>>>>>
>>>-------------------------------------------------------
>>>This SF.net email is sponsored by: SF.net Giveback Program.
>>>SourceForge.net hosts over 70,000 Open Source Projects.
>>>See the people who have HELPED US provide better services:
>>>Click here: http://sourceforge.net/supporters.php
>>>_______________________________________________
>>>Nagios-users mailing list
>>>Nagios-users at lists.sourceforge.net
>>>https://lists.sourceforge.net/lists/listinfo/nagios-users
>>>::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
>>>::: Messages without supporting info will risk being sent to /dev/null
>>
>>_______________________________________________
>>Apan-users mailing list
>>Apan-users at lists.sourceforge.net
>>https://lists.sourceforge.net/lists/listinfo/apan-users



