attempt recovery on service CRITICAL with nrpe

Jamin jragle at unm.edu
Thu Jan 23 02:05:15 CET 2003


Oh yeah, by the way, I did this before I put this script in place, but I
trigger the script to run via a hack of a notification command and nrpe
for the ldap service so I can still run my check_ldap tests to make sure
the ldap server is functioning properly.
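
For anyone curious what the notification hack looks like, here is a rough sketch, not my exact setup. The check_nrpe path, the NRPE command name, and the notify_restart function are all assumptions made up for illustration. The idea is just that the notification command only pokes the remote nrpe daemon when the service has gone CRITICAL:

```shell
#!/bin/sh
# Sketch of a notification-command wrapper; every name here is an assumption.
# CHECK_NRPE: path to the check_nrpe plugin on the Nagios host (varies by install).
# NRPE_COMMAND: hypothetical command name defined in the remote nrpe.cfg.
CHECK_NRPE="${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}"
NRPE_COMMAND="${NRPE_COMMAND:-check_ldap_procs}"

# notify_restart STATE HOST
# Calls the remote NRPE command only when the service state is CRITICAL;
# any other state just prints a note and does nothing.
notify_restart() {
    state="$1"
    host="$2"
    case "$state" in
        CRITICAL)
            "$CHECK_NRPE" -H "$host" -c "$NRPE_COMMAND"
            ;;
        *)
            echo "state $state: nothing to do"
            ;;
    esac
}
```

The notification command definition would then pass the $SERVICESTATE$ and $HOSTADDRESS$ macros as the two arguments, e.g. notify_restart "$1" "$2" at the bottom of the wrapper.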

It would probably be better to set up two service checks, one being the
network based check_ldap plugin and the other, a check_ldap_procs check
to see if the processes are running and restart the server if needed via
nrpe (or whatever).
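
A check_ldap_procs along those lines might look roughly like the sketch below. It is untested against a real iPlanet box. The process name and restart command are the ones from my script quoted further down; the overridable variables and function names are assumptions added for illustration:

```shell
#!/bin/sh
# Sketch of a hypothetical check_ldap_procs plugin (untested against a real
# iPlanet install). The defaults come from the script quoted below; the
# overridable variables are assumptions added for illustration.
PROC_NAME="${PROC_NAME:-ns-slapd}"
RESTART_CMD="${RESTART_CMD:-/sbin/service ldap start}"
RECHECK_DELAY="${RECHECK_DELAY:-10}"

# Print the number of running processes matching $PROC_NAME,
# leaving the grep itself out of the count.
count_procs() {
    ps -eaf | grep "$PROC_NAME" | grep -v grep | wc -l | tr -d ' '
}

# Print one Nagios status line; return 0 (OK) or 2 (CRITICAL).
check_ldap_procs() {
    n=$(count_procs)
    if [ "$n" -eq 0 ]; then
        # Look again after a pause in case the server is mid-restart.
        sleep "$RECHECK_DELAY"
        n=$(count_procs)
    fi
    if [ "$n" -ge 1 ]; then
        echo "LDAP OK - $n $PROC_NAME processes running"
        return 0
    fi
    # Still nothing running: try to start it and report CRITICAL anyway,
    # so the next check confirms whether the restart actually worked.
    $RESTART_CMD >/dev/null 2>&1
    echo "LDAP CRITICAL - $PROC_NAME not running, attempted to restart"
    return 2
}
```

Saved as a plugin, the file would end with check_ldap_procs followed by exit $? so nrpe sees the status code. Running it as a regular service check means the restart happens on the normal check interval instead of waiting for a notification.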

My nrpe check only gets called on a service notification event now.  I
should probably change it to something like I described above so it is
more responsive...  

I made a fake contact person and contact group and everything.  muhaha I
shouldn't abuse nagios like this... poor poor nagios.  I know, I'm sick.

Thanks for the help guys.
-Jamin

On Wed, 22 Jan 2003, Jamin wrote:

> Since I'm checking an ldap service (iplanet) I changed the code like so.
> While I was at it, I made it report the number of processes running,
> not counting the grep itself.
> 
> here is my modified script:
> #!/bin/sh
> 
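> # Count ns-slapd processes, excluding the grep itself.
> ERS=`ps -eaf | grep -v grep | grep ns-slapd | wc -l`
> 
> # One or more processes means the server is up.
> if [ "$ERS" -ge 1 ]
>   then
>     /bin/echo "LDAP OK - LDAP running $ERS processes"
>     exit 0
> fi
> 
> if [ "$ERS" -eq 0 ]
>   then
>     # Wait and look again in case the server is in the middle of a restart.
>     sleep 10
>     ERSLATE=`ps -eaf | grep -v grep | grep ns-slapd | wc -l`
>     if [ "$ERSLATE" -eq 0 ]
>       then
>         /sbin/service ldap start
>       else
>         # Report the second count; the first one was 0.
>         /bin/echo "LDAP OK - LDAP running $ERSLATE processes"
>         exit 0
>     fi
>     /bin/echo "LDAP CRITICAL - not running, attempted to restart"
>     exit 2
> fi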
> 
> It seems to be working great.  We shall see...
> Hopefully we can track down the bug that makes the iplanet server keep
> crashing...
> -Jamin
> 
> 
> On Wed, 22 Jan 2003, donavan nelson wrote:
> 
> > I'll try this again and include the list :)
> > 
> > I would really clean up the following a bit.  
> > 
> > ERS=`ps -eaf | grep ersjsf | wc -l`
> > 
> > This removes the ambiguity of counting the grep process
> > 
> > ERS=`ps -ef | grep ersjsf | grep -v  "grep" -c`
> > 
> > Because the way your script is written, the first grep could return 0 and you
> > would fall out and exit with 0.
> > 
> > .dn
> > --
> > Donavan Nelson
> > 4wx Networks
> > www.4wx.net
> > 
> > ---------- Original Message -----------
> > From: Rasmus Plewe <rplewe at ess.nec.de>
> > To: nagios-users at lists.sourceforge.net
> > Sent: Thu, 23 Jan 2003 00:16:12 +0100
> > Subject: Re: [Nagios-users] attempt recovery on service CRITICAL with nrpe
> > 
> > > Hello,
> > > 
> > > On Wed, Jan 22, 2003 at 10:23:49AM -0700, Jamin wrote:
> > > > Hey all,
> > > > 	I was wondering if any of you have tried to use nrpe to fix 
> > > > problems on systems before any paging occurs.  Basically I have an LDAP 
> > > > service running on a remote machine and I would like to set up nagios with 
> > > > nrpe to try to restart the service when it detects that it has gone down.
> > > 
> > > if you don't mind an example without the context, where you have to
> > > extract the principle yourself:
> > > 
> > > #!/bin/sh
> > > 
> > > ERS=`ps -eaf | grep ersjsf | wc -l`
> > > 
> > > if [ $ERS -eq 2 ] || [ $ERS -eq 3 ]
> > >   then
> > >     /bin/echo "ERS OK - ERS running"
> > >     exit 0
> > > fi
> > > 
> > > if [ $ERS -eq 1 ]
> > >   then
> > >     sleep 20
> > >     ERSLATE=`ps -eaf | grep ersjsf | wc -l`
> > >     if [ $ERSLATE -eq 1 ]
> > >       then
> > >         /etc/rc2.d/S99ers start
> > >       else
> > >         /bin/echo "ERS OK - ERS running" 
> > >         exit 0 
> > >     fi
> > >     /bin/echo "ERS CRITICAL - not running, attempted to restart"
> > >     exit 2
> > > fi
> > > 
> > > Regards,
> > >          Rasmus
> > > 
> > > -------------------------------------------------------
> > > This SF.net email is sponsored by: Scholarships for Techies!
> > > Can't afford IT training? All 2003 ictp students receive 
> > > scholarships. Get hands-on training in Microsoft, Cisco, Sun,
> > >  Linux/UNIX, and more. www.ictp.com/training/sourceforge.asp
> > _______________________________________________
> > > Nagios-users mailing list
> > > Nagios-users at lists.sourceforge.net
> > > https://lists.sourceforge.net/lists/listinfo/nagios-users
> > ------- End of Original Message -------
> > 
> > 
> > 
> > 
> 
> ----------------------------------------------------------------------
> Jesus must have been a party animal that none could match to this day.
> Think about it.  When I get slobbering drunk I sometimes wake up face down
> in the hallway in the Hokona dorm.  He woke up on a cross.  Man, that
> hangover must have sucked.  No wonder people thought he was dead.
>                                                   -- March 12th, 1999
> ----------------------------------------------------------------------
> 
> 
> 
> 

----------------------------------------------------------------------
 "I had to perform an act of faith. I had to prove to myself that I
  was a man. Not just a producing-consuming economical animal ... but
  a man."                                -- Robert A. Heinlein
----------------------------------------------------------------------




