check_cluster2 does not work as expected

Werner Flamme werner.flamme at ufz.de
Mon Sep 7 14:16:37 CEST 2009


Werner Flamme [07.09.2009 11:06]:
> Jim Avery [07.09.2009 10:03]:
>> 2009/9/5 Werner Flamme <werner.flamme at ufz.de>:
>>> Hi,
>>>
>>> I want to check a cluster consisting of 2 nodes. The task is simple:
>>> show how many nodes are up (or down; there are two nodes).
>>>
>>> The command definition is:
>>>
>>> $USER1$/check_cluster2 -h -l $HOSTALIAS$ -w 1 -c 2 -d $ARG1$
>>>
>>> So, the host alias of the cluster will be the label, the plugin should
>>> give a "warning" when 1 node is down, and should cry "critical" when
>>> both nodes are down.
>>>
>>> That's what I thought this command would do.
>>>
>>> And this is what I see in the meantime:
>>>
>>> CLUSTER OK: FW-Cluster: 1 up, 1 down, 0 unreachable
>>>
>>> Sorry? Why is "1 down" not seen as a warning? What am I doing wrong?
>>>
>>> TIA
>>> Werner
>>
>> I'm not familiar with check_cluster2, but came across a similar
>> situation when using check_cluster for a host check recently.  When I
>> run check_cluster --help, it tells me :-
>>
>>  See:
>>  http://nagiosplug.sourceforge.net/developer-guidelines.html#THRESHOLDFORMAT
>>  for THRESHOLD format and examples.
>>
>> I found I had to use "-w 0 -c 1" to make the plugin behave how I
>> wanted (warn if one host is down and critical if two are down).
> 
> Good gracious ;-) - it never occurred to me when reading
> 
>  -w, --warning=THRESHOLD
>     Specifies the range of hosts or services in cluster that must be in
>     a non-OK state in order to return a WARNING status level
> 
> that a THRESHOLD of zero hosts means 1 host. But this really seems to
> work, since when both nodes were down (yesterday afternoon and this
> morning), the status finally changed to WARNING.
> 
> All right, I changed -w 1 -c 2 to -w 0 -c 1. I still wonder how we
> managed to get the alarms on the other clusters weeks ago :-(
> 
>> Also, if you're using this as a host check (not a service check) note
>> that if the host check returns a warning state, Nagios will usually
>> interpret this to mean that the host is 'UP'.  See
>> http://nagios.sourceforge.net/docs/3_0/hostchecks.html for an
>> explanation of how plugin results are interpreted for host checks in
>> Nagios.
> 
> Thanks - luckily we use a test that only returns OK and CRITICAL.
> 
> Thanks for the hint - I think one of the nodes might go down any
> minute now, I will report whether this works ;-)

OK, the node was up longer than expected ;-) - but it worked with -w 0
-c 1: one node down causes WARNING state!

Thank you very much!

Regards
Werner
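
For anyone finding this thread later: the behaviour follows from the
THRESHOLD format in the developer guidelines linked above, where a plain
number N stands for the range 0..N and the alert fires only when the
value falls outside that range. Below is a minimal Python sketch of that
rule, not the plugin's actual code. The function names are made up for
illustration, only the plain-number form of a threshold is handled, and
checking the critical threshold before the warning one is an assumption.

```python
def outside_simple_range(value, threshold):
    """Return True when value lies outside the range 0..threshold.

    Covers only the plain-number threshold form ("N" means the range
    0 to N inclusive). The full format has other forms (for example
    "N:" or "@N:M") that this sketch ignores.
    """
    return value < 0 or value > threshold


def cluster_state(non_ok, warning, critical):
    """Map a count of non-OK nodes to a state name.

    Assumption: the critical threshold is tested before the warning one.
    """
    if outside_simple_range(non_ok, critical):
        return "CRITICAL"
    if outside_simple_range(non_ok, warning):
        return "WARNING"
    return "OK"


# Compare the two threshold pairs discussed in this thread.
for label, w, c in (("-w 1 -c 2", 1, 2), ("-w 0 -c 1", 0, 1)):
    for down in range(3):
        print(f"{label}: {down} down -> {cluster_state(down, w, c)}")
```

Under these assumptions, "-w 1 -c 2" stays OK with one node down and only
reaches WARNING with both down, while "-w 0 -c 1" gives WARNING for one
node down and CRITICAL for two, which matches what was observed above.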

_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users