checking process time

Andreas Ericsson ae at op5.se
Thu Sep 8 20:36:20 CEST 2005


Marc Powell wrote:
> 
>>-----Original Message-----
>>From: nagios-users-admin at lists.sourceforge.net [mailto:nagios-users-admin at lists.sourceforge.net] On Behalf Of Rossz Vamos-Wentworth
>>Sent: Thursday, September 08, 2005 11:26 AM
>>To: nagios-users at lists.sourceforge.net
>>Subject: [Nagios-users] checking process time
>>
>>I have a Perl script used as a pipe for email that does some special
>>processing of data.  Occasionally, unfortunately, it gets "stuck" and
>>does not terminate.  When this happens, it ends up using most of the CPU
>>and pretty much screws up the system.  Until I can track down what is
>>causing the infinite loop, I was wondering if there was a way to check
>>the life of a process of a specific name and execute an event handler if
>>it's been running too long.  The script should only take a few seconds
>>to run, so I figure if it is more than a few minutes old I can simply
>>have nagios kill the problem process (e.g. "kill -9 pid" should do the
>>job).
> 
> 
> Nagios-plugins-1.4.1 check_procs *under linux* adds an additional metric
> called ELAPSED which appears to allow for checking how long a process
> has been running. I've tried testing it, but the call to ps doesn't
> include the 'etime' option, a la "/bin/ps -axwo 'stat uid ppid vsz rss
> pcpu comm args etime'", so it isn't working properly. It looks to me like
> configure tests less informative variations of the ps command first, and
> if one of those matches it uses that as the ps format instead of
> progressing to more informative variations, including the one that has
> etime. From configure.log --
> 
> configure:14078: result: /bin/ps
> configure:14086: checking for ps syntax
> configure:14095: result: /bin/ps axwo 'stat uid pid ppid vsz rss pcpu comm args'
> 
> when in fact, the one that includes etime works correctly (taken from
> configure) --
> 
> $ ps -weo 'stat comm vsz rss user uid pid ppid etime args'
> STAT COMMAND            VSZ  RSS USER       UID   PID  PPID     ELAPSED COMMAND
> S    init              1376  368 root         0     1     0 132-06:03:18 init
> SW   keventd              0    0 root         0     2     1 132-06:03:17 [keventd]
> SWN  ksoftirqd_CPU0       0    0 root         0     3     1 132-06:03:17 [ksoftirqd_CPU0]
> 
> Can anyone else confirm this as a bug? I don't see anything in the
> tracker.
> 
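
The ordering problem described above can be sketched outside of autoconf: try the most informative ps variation first, and only fall back to a plainer one when the richer one does not produce the expected header and at least one process row. This is a rough Python illustration, not what configure actually does; the fallback format, the required-column lists and the function names are assumptions made for the example.

```python
import subprocess

# Candidate ps invocations, most informative first. The first is the etime
# variation quoted above; the second is a plainer fallback chosen for this
# sketch (an assumption, not taken from configure).
CANDIDATES = [
    (["ps", "-weo", "stat comm vsz rss user uid pid ppid etime args"],
     ["STAT", "PID", "ELAPSED"]),
    (["ps", "-eo", "pid ppid comm args"],
     ["PID", "PPID"]),
]


def run_ps(argv):
    """Run one ps variation; return its stdout, or None if it failed."""
    try:
        proc = subprocess.run(argv, stdout=subprocess.PIPE,
                              stderr=subprocess.DEVNULL,
                              universal_newlines=True)
    except OSError:
        return None
    return proc.stdout if proc.returncode == 0 else None


def pick_ps_variation(candidates=CANDIDATES, runner=run_ps):
    """Return the argv of the first variation whose output looks usable."""
    for argv, required in candidates:
        output = runner(argv)
        if not output:
            continue  # could not run, non-zero exit, or printed nothing
        lines = output.splitlines()
        header = lines[0].split()
        # Require the expected columns and at least one process row, so a
        # variation that exits cleanly but prints nothing useful is skipped.
        if len(lines) > 1 and all(col in header for col in required):
            return argv
    return None
```

Calling pick_ps_variation() with no arguments probes the local ps; passing a different runner makes it possible to exercise the selection logic without running ps at all.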

I have a vague memory that the tests are ordered this way because some
systems failed silently in the configure test, causing check_procs to
segfault (SIGSEGV) in whatever configuration it ran. I believe the ps
variations were re-arranged rather than dropped so that it would be easy
to re-enable the etime one later. The cvs log should tell you.
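
As for the original question: until check_procs is built with a ps format that includes etime, the age check can be done outside the plugin. Below is a minimal sketch in Python, assuming a ps that accepts the POSIX form "ps -eo pid,etime,comm" and prints ELAPSED as [[dd-]hh:]mm:ss, as in the output quoted above. The function names are made up for illustration.

```python
import re

# ps prints ELAPSED as [[dd-]hh:]mm:ss, e.g. "132-06:03:18", "06:03:18", "00:42".
_ETIME_RE = re.compile(r"^(?:(?:(\d+)-)?(\d+):)?(\d+):(\d+)$")


def etime_to_seconds(etime):
    """Convert a ps ELAPSED value to a number of seconds."""
    match = _ETIME_RE.match(etime.strip())
    if match is None:
        raise ValueError("unrecognised etime value: %r" % etime)
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def find_old_processes(ps_output, name, max_age_seconds):
    """Return (pid, age_in_seconds) for each process called `name` that is
    older than `max_age_seconds`.

    `ps_output` is the text printed by: ps -eo pid,etime,comm
    """
    old = []
    for line in ps_output.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue
        pid, etime, comm = fields
        if comm.strip() != name:
            continue
        age = etime_to_seconds(etime)
        if age > max_age_seconds:
            old.append((int(pid), age))
    return old
```

Feed find_old_processes() the output of "ps -eo pid,etime,comm" and hand the returned PIDs to whatever does the killing, for instance an event handler. Bear in mind that Linux truncates comm to 15 characters, and that a script started as "perl script.pl" shows up as perl, so matching on args may be needed instead.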

> --
> Marc 
> 
> 
> -------------------------------------------------------
> SF.Net email is Sponsored by the Better Software Conference & EXPO
> September 19-22, 2005 * San Francisco, CA * Development Lifecycle Practices
> Agile & Plan-Driven Development * Managing Projects & Teams * Testing & QA
> Security * Process Improvement & Measurement * http://www.sqe.com/bsce5sf
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
> ::: Messages without supporting info will risk being sent to /dev/null
> 

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Lead Developer






