Nag event handlers restarting failed programs on NT ?

Carroll, Jim P [Contractor] jcarro10 at sprintspectrum.com
Thu Jan 16 21:43:51 CET 2003


FWIW, FireDaemon might be of value to you:

http://www.firedaemon.com/

jc

> -----Original Message-----
> From: Stanley Hopcroft [mailto:Stanley.Hopcroft at IPAustralia.Gov.AU]
> Sent: Thursday, January 16, 2003 2:32 PM
> To: Carroll, Jim P [Contractor]
> Cc: nagios-users at lists.sourceforge.net
> Subject: Re: [Nagios-users] Nag event handlers restarting failed
> programs on NT ?
> 
> 
> Dear Sir,
> 
> I am writing to thank you for your reply (I will certainly take your
> advice) and to summarise some options for the archives.
> 
> On Thu, Jan 16, 2003 at 09:33:42AM -0600, Carroll, Jim P [Contractor] wrote:
> 
> > > Is anyone using Nagios (event handlers) to restart failed 
> > > programs on NT hosts ?
> 
> 
> > An interesting thought.  I don't have an answer off the top of my head
> > (still working on my first coffee).
> > 
> > You might wish to check out www.infrastructures.org, subscribe to the
> > mailing list (it's quite low-volume) and post a variant of your query
> > there.
> > 
> 
> Probably in order of seriousness/helpfulness (although I think option 2
> is probably the most durable):
> 
> 1 Convert the program - ask someone else to do it - to a service and use
> the rpcclient program (from Samba-tng or Samba-2.2.x or Samba-alpha) to
> start the service.
> 
> This requires that
> 
> . the Nag host be set up with a machine account on the MS host
> that is running the program/service
> 
> . the program can be converted to a service (I understand from a Windows
> programmer that in the case of Java applications this can be kludgy).
> 
> 2 Suggested by Mr T De Blende,
> 
> '
> * NSClient checks to see if the program is still running.
> 
> In our case, the culprit program will be appending heartbeat messages to
> a text file in a shared directory. A Nag service check will 'tail' that
> file and return CRITICAL if it can find no log records (the records will
> have time stamps) newer than the current time minus a threshold interval
> (for example, the last record in the file was logged more than 10
> minutes ago).
> 
> * If the program is not running, it puts a simple text file on a
> Windows share on the server that is supposed to be running that
> program. Just share a directory with write rights only for a certain
> account that is used by the Nagios box to make the SMB connection.
> 
> * Create a small script on the Windows server that checks for the
> existence of that text file in that shared directory, and if it is
> there: 1) delete it and 2) restart the program.'
> 
> (This latter program may be run by AT periodically).
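
The heartbeat check described in option 2 could be sketched roughly as
follows. This is only an illustration, not anything posted in the thread:
the timestamp format at the start of each record, the script name and the
10 minute default are all assumptions. The exit codes (0 for OK, 2 for
CRITICAL, 3 for UNKNOWN) follow the usual Nagios plugin convention.

```python
import sys
from datetime import datetime, timedelta

# Assumptions (not from the original post): every heartbeat record starts
# with a timestamp in this format, and 10 minutes is the staleness limit.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"
TIMESTAMP_LENGTH = 19
THRESHOLD = timedelta(minutes=10)

# Conventional Nagios plugin exit codes.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def last_heartbeat(lines):
    """Return the timestamp of the last parseable record, or None."""
    for line in reversed(lines):
        try:
            return datetime.strptime(line[:TIMESTAMP_LENGTH], TIMESTAMP_FORMAT)
        except ValueError:
            continue  # blank or malformed line, keep looking backwards
    return None


def check_heartbeat(lines, now, threshold=THRESHOLD):
    """Return (exit_code, message) for the given log lines."""
    stamp = last_heartbeat(lines)
    if stamp is None:
        return CRITICAL, "CRITICAL: no heartbeat records found"
    if now - stamp > threshold:
        return CRITICAL, "CRITICAL: last heartbeat at %s" % stamp
    return OK, "OK: last heartbeat at %s" % stamp


def main(argv):
    """Plugin entry point; argv[1] is the (hypothetical) heartbeat log path."""
    if len(argv) != 2:
        print("UNKNOWN: usage: check_heartbeat.py <logfile>")
        return UNKNOWN
    try:
        # Reads the whole file for simplicity; a real check might only
        # read the tail of a large log.
        with open(argv[1]) as handle:
            lines = handle.readlines()
    except OSError as err:
        print("UNKNOWN: cannot read %s: %s" % (argv[1], err))
        return UNKNOWN
    status, message = check_heartbeat(lines, datetime.now())
    print(message)
    return status
```

To use it as a plugin, one would add `sys.exit(main(sys.argv))` at the
bottom of the file and point the service check at the heartbeat file on
the mounted share.
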
> 
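
The small Windows-side script from option 2 (check for the flag file,
delete it, restart the program) might look something like this. Again a
sketch under assumptions: the flag file path and the command are
hypothetical placeholders, and it presumes a Python interpreter is
available on the NT host. It does one pass and exits, so it suits being
run periodically by AT.

```python
import os
import subprocess

# Hypothetical locations, purely for illustration.
FLAG_FILE = r"C:\nagios_share\restart.flag"
COMMAND = [r"C:\apps\start_program.bat"]


def restart_if_flagged(flag_file, command, launcher=subprocess.Popen):
    """Restart the program if the Nagios box has dropped the flag file.

    Returns True if a restart was triggered, False otherwise.
    """
    if not os.path.exists(flag_file):
        return False
    os.remove(flag_file)   # 1) delete the flag so it is acted on only once
    launcher(command)      # 2) restart the program (not waited for)
    return True
```

A scheduled run would simply call `restart_if_flagged(FLAG_FILE, COMMAND)`.
The `launcher` parameter exists only so the logic can be exercised without
actually starting anything.
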
> 3 Wait for it ... this is my idea.
> 
> Write a Tk/Expect or Perl/Tk program to drive a VNC session with the
> host and use this VNC session to start the program.
> 
> It would probably be a good thing to have the program set up to be run
> from the GUI (by clicking an icon that runs a bat file for example).
> 
> Sheesh.
> 
> Yours sincerely.
> 
> -- 
> ------------------------------------------------------------------------
> Stanley Hopcroft
> ------------------------------------------------------------------------
> 
> '...No man is an island, entire of itself; every man is a piece of the
> continent, a part of the main. If a clod be washed away by the sea,
> Europe is the less, as well as if a promontory were, as well as if a
> manor of thy friend's or of thine own were. Any man's death diminishes
> me, because I am involved in mankind; and therefore never send to know
> for whom the bell tolls; it tolls for thee...'
> 
> from Meditation 17, J Donne.
> 

