Design question

Sean McAfee smcafee at collaborativefusion.com
Thu Jul 31 22:56:14 CEST 2008


Michael Weiner wrote:
> My only question after reading the redundant network monitoring
> documentation is, how to do a remote check on the nagios process? Can
> this be wrapped in nrpe?
> 
> Thanks

Exactly that: the slaves monitor the master (and vice versa) via an 
NRPE-wrapped check_nagios that checks /var/spool/nagios/nagios.log and 
runs every minute.  The event handler attached to that check is kicked 
off on any HARD state change.
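
For illustration only, here is a minimal sketch of the decision such an 
event handler might make.  It assumes that "promotion" means enabling 
notifications on the slave and "demotion" means disabling them (via the 
ENABLE_NOTIFICATIONS / DISABLE_NOTIFICATIONS external commands); the 
function names are hypothetical and are not taken from the actual setup.

```python
import time


def handle_master_check(state, state_type):
    """Pick an external command for a state change of the master check.

    `state` and `state_type` are the values Nagios passes through the
    $SERVICESTATE$ and $SERVICESTATETYPE$ macros. Returns a command
    name, or None when nothing should be done.
    """
    # Only act on HARD state changes; SOFT states are still being retried.
    if state_type != "HARD":
        return None
    if state == "CRITICAL":
        # Assumption: the master's Nagios is down, so this slave promotes
        # itself by turning its own notifications on.
        return "ENABLE_NOTIFICATIONS"
    if state == "OK":
        # Assumption: the master is back, so this slave demotes itself.
        return "DISABLE_NOTIFICATIONS"
    # WARNING / UNKNOWN: leave things as they are.
    return None


def format_external_command(command, now=None):
    """Format one line for the Nagios external command file."""
    timestamp = int(time.time() if now is None else now)
    return f"[{timestamp}] {command}\n"
```

The formatted line would then be appended to the external command file 
configured on the slave; its location depends on the installation.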

Since any large-scale problem will result in moderate-to-severe latency 
as checks time out and reschedule (especially if send_nsca is also timing 
out on passive result submissions), there's also a cron job that runs 
every two minutes and does basically the same thing, just outside of the 
Nagios daemon itself.

The only drawback is that the cron job can't self-demote (or you'd be 
spamming the external command file every time it runs).  Prompt 
self-promotion is the primary goal here, so some latency in self-demotion 
is perfectly acceptable, since the only risk is receiving redundant 
notifications from each slave for any globally monitored hosts.
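
As a rough sketch of the promote-only cron check described above: the 
host name, plugin path, and NRPE command name below are placeholders, 
not the values used in the real setup, and treating only exit code 2 
(CRITICAL) as "master down" is an assumption.

```python
import subprocess

CRITICAL = 2  # standard Nagios plugin exit code for CRITICAL


def run_nrpe_check(host="master.example.com",
                   command="check_nagios",
                   plugin="/usr/lib/nagios/plugins/check_nrpe"):
    """Run check_nrpe against the master and return its exit code.

    The host, command name, and plugin path are placeholders.
    """
    return subprocess.call([plugin, "-H", host, "-c", command])


def cron_action(run_check=run_nrpe_check):
    """Decide what the cron job should do on this run.

    Promote-only: with no state-change tracking, writing a demote
    command whenever the master looks healthy would hit the external
    command file on every run, so a healthy result means "do nothing".
    """
    if run_check() == CRITICAL:
        return "promote"
    return None
```

A crontab entry running the wrapper script every two minutes would call 
cron_action() and, on "promote", submit the same command the event 
handler uses.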

Sean McAfee
System Engineer

Collaborative Fusion, Inc.
  smcafee at collaborativefusion.com
  412-422-3463 x 4025

5849 Forbes Avenue
Pittsburgh, PA 15217

