Nagios kept from restarting after reboot by lock file

Mike Lindsey mike-nagios at 5dninja.net
Tue Dec 21 07:37:48 CET 2010


On 12/20/10 8:16 AM, eric.berg at barclayscapital.com wrote:
> Alternatively, could you recommend a good system/resource monitoring tool that would be able to let me know if nagios is down and restart it automatically?
>
Add a cron job that runs every five minutes (or whatever interval
you're comfortable with), similar to:
#!/bin/bash

PATH=/bin:/usr/bin:/usr/local/bin
LOCKFILE=/home/nagios/nagios/var/nagios.lock
PID=`cat "${LOCKFILE}" 2>/dev/null`

# kill -0 sends no signal; its exit status only says whether the PID
# can be signalled. Test that exit status, not the command's output.
if ! kill -0 "${PID}" 2>/dev/null
then
     rm -f "${LOCKFILE}"
     # INSERT RESTART COMMAND HERE
     echo "Removed stale lock file and restarted Nagios" | mail -s "Nagios restart `hostname`" your-email@here.com
fi
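
For the five-minute interval, a crontab entry along these lines should
do. The path /usr/local/bin/nagios-lockcheck.sh is a made-up name for
wherever you save the script above, and the entry is assumed to go in
the crontab of a user that is allowed to signal the nagios process:

```
# min  hour  day-of-month  month  day-of-week  command
*/5    *     *             *      *            /usr/local/bin/nagios-lockcheck.sh
```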

Just be aware that the if block will also trigger if nagios is running
under a different username than the cron job, because kill -0 fails
when the caller lacks permission to signal the process.  You can check
for that by doing some tests in the script with ps and grep.
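
A rough sketch of such a test, using ps alone rather than ps piped
through grep. The function name pid_is_nagios is made up for
illustration, and the process name "nagios" is an assumption about how
the daemon shows up in ps on your box:

```shell
#!/bin/bash
# Sketch only: pid_is_nagios is a hypothetical helper, not part of Nagios.

# Succeeds if PID $1 exists and its command name equals $2 (default
# "nagios"). Unlike kill -0, ps -p does not need permission to signal
# the process, so it works across usernames.
pid_is_nagios() {
    local pid="$1" name="${2:-nagios}" comm
    # Reject empty or non-numeric input before calling ps.
    case "${pid}" in
        ''|*[!0-9]*) return 1 ;;
    esac
    # "-o comm=" prints only the command name, with no header line.
    comm=$(ps -p "${pid}" -o comm= 2>/dev/null) || return 1
    [ "${comm}" = "${name}" ]
}
```

In the cron script, the condition could then become
"if ! pid_is_nagios "${PID}"" in place of the kill -0 test.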

> _____________________________________________
> From:   Berg, Eric: IT (NYK)
> Sent:   Monday, December 20, 2010 11:03 AM
> To:     'nagios-users at lists.sourceforge.net'
> Subject:        Nagios kept from restarting after reboot by lock file
>
> Gee, this seems like an annoying newbie problem, but if Nagios crashes or is killed (as on system reboot), it leaves a lock file around that prevents it from starting again until the lock file is manually removed.
>
> I see this on Monday mornings after weekend reboots on a Red Hat Linux box:
>
> nagios: Lockfile '/home/nagios/nagios/var/nagios.lock' looks like its already held by another instance of Nagios (PID 0).  Bailing out...

Sounds like something in the shutdown process is throwing a 0 into the
pid file, or the startup in the rc script is.

Either way, you should never have a 0 in there.  Either the rc script
is putting the wrong data in there, or the PID is being reported
incorrectly.
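
A wrapper can guard against a bogus value like that before trusting the
lock file. This is only a sketch, and lock_pid_is_valid is a made-up
helper name:

```shell
#!/bin/bash
# Sketch only: lock_pid_is_valid is a hypothetical helper name.

# Succeeds only if the file $1 holds a positive integer PID.
# 0 is rejected explicitly: "kill -0 0" targets the caller's own process
# group and so succeeds, which would make a stale lock file look live.
lock_pid_is_valid() {
    local pid
    pid=$(cat "$1" 2>/dev/null) || return 1
    case "${pid}" in
        ''|*[!0-9]*) return 1 ;;   # empty, or not purely digits
    esac
    [ "${pid}" -gt 0 ]
}
```

If lock_pid_is_valid fails for /home/nagios/nagios/var/nagios.lock, the
lock file can be treated as stale without running kill at all.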
> Does anyone know if there's a config option or something else that obviates the need to write a wrapper script to check to see if Nagios is really running and remove the lock file (looks like Nagios already knows it's not running by virtue of the value of the PID in this very message!) so that it can cleanly start up again?

-- 
Mike Lindsey

