Understanding Passive Checks

Cliff Riggs cliff at proteris.com
Tue Mar 30 23:52:44 CEST 2004


Hello,

With some research and help from the list, I got my passive checks 
working over NSCA. I'm now trying to interpret the results, because I'm 
seeing quite a bit of flapping.

I have freshness checking enabled on the primary as follows:

check_service_freshness=1
freshness_check_interval=600
command_check_interval=-1
use_retained_program_state=0
retain_state_information=1

and I have it configured to use the freshness checking commands as 
described here: http://nagios.sourceforge.net/docs/1_0/distributed.html

The primary's service definition sets the check_command to 
"service_is_stale", while the remote cisco-test service definition sets 
it to "check_ping!100.0,20%!500.0,60%". The service on the primary 
system also has "active_checks_enabled 0".

The Nagios event log looks like this:

[03-30-2004 16:25:25] SERVICE ALERT: cisco-test;PING;OK;HARD;3;PING OK - Packet loss = 0%, RTA = 3.48 ms
[03-30-2004 16:25:25] SERVICE ALERT: cisco-test;PING;CRITICAL;HARD;3;CRITICAL: Service results are stale!
[03-30-2004 16:25:19] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;cisco-test;PING;0;PING OK - Packet loss = 0%, RTA = 3.48 ms
[03-30-2004 16:24:25] SERVICE ALERT: cisco-test;PING;CRITICAL;SOFT;2;CRITICAL: Service results are stale!
[03-30-2004 16:23:25] SERVICE ALERT: cisco-test;PING;CRITICAL;SOFT;1;CRITICAL: Service results are stale!
[03-30-2004 16:22:25] SERVICE ALERT: cisco-test;PING;OK;SOFT;2;PING OK - Packet loss = 0%, RTA = 3.45 ms
[03-30-2004 16:22:25] SERVICE ALERT: cisco-test;PING;CRITICAL;SOFT;1;CRITICAL: Service results are stale!
[03-30-2004 16:22:18] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;cisco-test;PING;0;PING OK - Packet loss = 0%, RTA = 3.45 ms
[03-30-2004 16:19:25] SERVICE ALERT: cisco-test;PING;OK;HARD;3;PING OK - Packet loss = 0%, RTA = 3.34 ms

As a result, the service is flapping, even though the external command 
results are being received correctly. Am I right that, judging from the 
timing, each external check result is received but not read in a timely 
fashion? Is there something I am missing in this equation? I am 
especially confused by the timing. It looks like freshness is being 
checked every 60 seconds (the default, which I changed to 600). Either 
that, or the timing is coincidental: this worked fine for the first 21 
minutes or so and then started flapping 22 minutes and 15 seconds after 
restart.
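
To make the timing concrete, here is a small Python sketch that works 
out the intervals from the log excerpt above. The timestamps are copied 
from the log; the helper names (parse, gaps) and variable names are made 
up for illustration and are not part of Nagios or NSCA.

```python
from datetime import datetime

# Timestamp format used in the Nagios event log excerpt.
FMT = "%m-%d-%Y %H:%M:%S"

# Copied from the log above, listed oldest first.
passive_results = [            # EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT
    "03-30-2004 16:22:18",
    "03-30-2004 16:25:19",
]
stale_alerts = [               # CRITICAL: Service results are stale!
    "03-30-2004 16:22:25",
    "03-30-2004 16:23:25",
    "03-30-2004 16:24:25",
    "03-30-2004 16:25:25",
]
ok_alerts = [                  # OK alert logged after each passive result
    "03-30-2004 16:22:25",
    "03-30-2004 16:25:25",
]


def parse(stamps):
    """Convert log timestamp strings to datetime objects."""
    return [datetime.strptime(s, FMT) for s in stamps]


def gaps(stamps):
    """Seconds between consecutive timestamps (hypothetical helper)."""
    times = parse(stamps)
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


stale_gaps = gaps(stale_alerts)        # spacing of the stale alerts
passive_gaps = gaps(passive_results)   # spacing of the NSCA submissions
# Delay between a passive result arriving and its OK alert being logged.
processing_delay = [
    int((ok - sub).total_seconds())
    for sub, ok in zip(parse(passive_results), parse(ok_alerts))
]

print("stale alert spacing (s):", stale_gaps)
print("passive result spacing (s):", passive_gaps)
print("submission-to-OK delay (s):", processing_delay)
```

For this excerpt, the stale alerts come out 60 seconds apart rather than 
600, the two passive results are 181 seconds apart, and each OK alert is 
logged 6 to 7 seconds after its result was submitted, in the same second 
as a stale alert.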

Thanks as usual for your insights!

Cliff
-- 
--------------------------------------------
Clifford Riggs
CCIE #9314, CISSP
--------------------------------------------
Proteris Group LLC
Information Security Consultants
Trust. Expertise. Results.
--------------------------------------------
www.proteris.com
--------------------------------------------