check_nagios isn't very smart.

jeff vier jeff.vier at tradingtechnologies.com
Tue Sep 30 19:59:11 CEST 2003


That ignores the possibility of zombies, which is the same as the first
problem I raised in my original message.

If check_procs had an option that matched on the pid (not the ppid),
it would be useful, because I could feed it the pid from nagios.lock.
But it doesn't.
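
For what it's worth, here is a rough sketch of the kind of pid check I mean, written as a standalone shell function rather than anything check_procs does today. The function name check_pid_file is made up for illustration, and the sketch assumes a ps that accepts "-o stat= -p PID" (Linux procps and the BSDs do; other systems may name the state column differently). It reads the pid from a lock file, then reports CRITICAL (exit 2) if the process is gone or is a zombie, and OK (exit 0) otherwise.

```shell
# Sketch only: check that the pid stored in a lock file is alive and not a zombie.
# check_pid_file is a hypothetical helper, not part of any Nagios plugin.
check_pid_file() {
    lockfile=$1

    if [ ! -r "$lockfile" ]; then
        echo "CRITICAL: cannot read $lockfile"
        return 2
    fi

    # Use the first line only and strip any whitespace around the pid.
    pid=$(head -n 1 "$lockfile" | tr -d '[:space:]')

    # Reject empty or non-numeric content.
    case $pid in
        ''|*[!0-9]*)
            echo "CRITICAL: no valid pid in $lockfile"
            return 2
            ;;
    esac

    # "stat=" prints the state column without a header.
    # Assumption: this ps supports the "stat" column name.
    state=$(ps -o stat= -p "$pid" 2>/dev/null | tr -d '[:space:]')

    case $state in
        '')
            echo "CRITICAL: pid $pid is not running"
            return 2
            ;;
        Z*)
            echo "CRITICAL: pid $pid is a zombie"
            return 2
            ;;
        *)
            echo "OK: pid $pid is running (state $state)"
            return 0
            ;;
    esac
}
```

Calling it would look something like check_pid_file /path/to/nagios.lock, where the path is whatever lock_file is set to in nagios.cfg. It only proves the daemon's pid is alive and not defunct; it says nothing about whether nagios is actually scheduling checks.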

On Tue, 2003-09-30 at 12:41, Williams, P. Lane wrote:
> use one of the check_procs plugins.
> 
> -----Original Message-----
> From: jeff vier [mailto:jeff.vier at tradingtechnologies.com]
> Sent: Tuesday, September 30, 2003 12:51 PM
> To: nagios-users
> Subject: [Nagios-users] check_nagios isn't very smart.
> 
> 
> Okay, I tuned our nagios system, here.
> 
> With the increase in efficiency and "intelligence" there are a lot fewer
> false alerts.
> 
> However, that in itself is causing another problem.
> 
> Since check_nagios depends on the log being updated to figure out if
> nagios is running, it often thinks it's dead.  We can easily go an hour
> without an update to the log file.
> 
> I fixed this by setting log_service_retries=1, but that seems
> ridiculous: turning on what amounts to debugging just to trick another
> element of nagios.
> 
> So, my question is, is there another way to watch nagios that doesn't
> cause me to have to pile tons of garbage into my filesystem?
> 
> Some things I was considering, and the reasons I haven't [yet?]:
> 
> option 1 - cron once per 1 min (and have a 2 min nagios_check max):
> 	if [ "`ps -ef | grep nagios | grep -v grep | wc -l`" -gt 2 ]; then echo "[`date +%s`] Heartbeat" >> nagios.log; fi
> 
>   problem - What about zombied processes?  I'm falsely assuming 1 or
> more nagios processes means it's okay.
> 
> option 2 - change the nagios_check_command in cgi.cfg to use a script
> with a bunch more logic, but basically use
> 'lynx -head -dump -auth=user:pwd \
> "http://localhost/nagios/cgi-bin/extinfo.cgi?type=1&host=hostname"'
> 
>   problem - I'm depending on http, which I guess is okay, since if http
> is failing, I'd be updating the nagios.log anyway with that error and
> sending out alerts.  Also, I'd have to re-invent the process with, so far,
> unknown feasibility, and I don't have much time to waste if it turns out
> this is a bad idea for reasons I didn't think of (hence my asking).
> 
> Thoughts?  If I do end up figuring out a new way to do it, I'll
> certainly post it.
> 
> 
> 
> -------------------------------------------------------
> This sf.net email is sponsored by:ThinkGeek
> Welcome to geek heaven.
> http://thinkgeek.com/sf
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when reporting
> any issue. 
> ::: Messages without supporting info will risk being sent to /dev/null


