plugins works on console but not reliable inside nagios

Heiko rupertt at gmail.com
Mon Jun 23 16:13:29 CEST 2008


On Mon, Jun 23, 2008 at 2:40 PM, Marc Powell <marc at ena.com> wrote:
>
> On Jun 23, 2008, at 3:18 AM, Heiko wrote:
>>
>
>> The plugin is used to monitor some F5 load balancers. I
>> already had a chat with the developer on this list, and the
>> conclusion was that
>> the problem seems to be inside nagios.
>
> Which plugin is it, where did you get it and was that his conclusion?
> For what reason?
>
Hello,
We are monitoring how many members of a pool are still active. If we
have a web server pool with
5 servers and 3 go offline, we want to be informed about it.

http://www.nagiosexchange.org/cgi-bin/page.cgi?g=Detailed%2F2447.html;d=1

You can find the discussion I had on this list under the title
"monitoring F5 bigIP Load Balancers".


>> Can I give you some information like config or logfiles to analyse
>> this problem`?
>
> I don't have access to an F5 to do any testing with but I can see if
> there's anything obvious. The host, service and command definitions
> would be helpful. Also, try running the plugin from the command line
> repeatedly, as the nagios user, from the nagios machine, as nagios
> would run it by substituting appropriate values for $MACROS$ you use.
> Does it return instantly with correct values or take some time to
> complete (> 10s)? Timeouts mean that the plugin is just taking longer
> to finish than you've told nagios to wait. Given what's known so far,
> you may just need to increase the service_check_timeout value in
> nagios.cfg.
>
Out of the 50 times I ran the plugin from bash as the nagios user,
I got 1 timeout:

/usr/local/nagios/libexec/check_bigip_pool -H 172.17.1.10 -C public -S 9 -vw 51 -c 26 -P cbw_dev -t 180
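
To put numbers on a sporadic failure like this, the repeated runs can be scripted instead of done by hand. The following is only a sketch under stated assumptions: `time_runs` and `summarize` are helper names made up for this example, the defaults of 50 runs, a 180 s timeout and a 10 s "slow" threshold are taken from this thread, and none of it has been tested against a real F5. It should be run as the nagios user on the nagios machine.

```python
import subprocess
import sys
import time


def time_runs(cmd, runs=50, timeout=180.0):
    """Run cmd `runs` times and return a list of (elapsed_seconds, exit_code).

    exit_code is None when a run exceeded `timeout` and was killed.
    Nagios plugins normally exit with 0 (OK), 1 (WARNING), 2 (CRITICAL)
    or 3 (UNKNOWN).
    """
    results = []
    for _ in range(runs):
        start = time.monotonic()
        try:
            proc = subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
            code = proc.returncode
        except subprocess.TimeoutExpired:
            code = None
        results.append((time.monotonic() - start, code))
    return results


def summarize(results, slow_after=10.0):
    """Count runs that timed out, and finished runs slower than slow_after."""
    timed_out = sum(1 for _, code in results if code is None)
    slow = sum(
        1 for secs, code in results if code is not None and secs > slow_after
    )
    return {"runs": len(results), "timed_out": timed_out, "slow": slow}


# Hypothetical usage: python time_runs.py <plugin command line as above>
if __name__ == "__main__" and len(sys.argv) > 1:
    print(summarize(time_runs(sys.argv[1:])))
```

A run that finishes but takes more than a few seconds is worth noting as well, since it hints at how close the plugin gets to the configured timeout.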

From my nagios.cfg:
service_check_timeout=180
host_check_timeout=180
event_handler_timeout=60

This is the service definition template I use for the machines:

define service{
        name                    tmpl-a6
        host_name               BigIP-Active
        max_check_attempts      3
        normal_check_interval   5
        retry_check_interval    3
        check_period            24x7
        notification_interval   60
        notification_period     24x7
        notification_options    w,c,r,f
        contact_groups          noc
        active_checks_enabled   1
        register                0
}



>> Am I right that nagios.log is chronological?
>
> It is (standard unix timestamp).
>
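
Since each nagios.log entry starts with the Unix timestamp in square brackets, the log is easier to read once the timestamps are converted. This is a minimal sketch, assuming lines of the form `[epoch] message`; `humanize` is a name invented for this example, and UTC is used by default only to keep the output unambiguous.

```python
import re
from datetime import datetime, timezone

# Matches a leading "[1234567890]" followed by the rest of the log line.
LINE_RE = re.compile(r"^\[(\d+)\]\s?(.*)$")


def humanize(line, tz=timezone.utc):
    """Replace a leading [epoch] with an ISO 8601 timestamp.

    Lines that do not start with [digits] are returned unchanged.
    """
    match = LINE_RE.match(line)
    if match is None:
        return line
    stamp = datetime.fromtimestamp(int(match.group(1)), tz=tz)
    return f"[{stamp.isoformat()}] {match.group(2)}"
```

Applied to every line of the log (with the trailing newline stripped first), this keeps the order of the entries and only changes how the time is displayed.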
>> In the following paste you can see that the service is alive one second
>> and in the next line it timed out,
>
> Does the problem only occur when there is a problem with the pool?
>
No, the problem seems to occur sporadically. We monitor
some other services on this machine that don't cause these problems,
and they use the same template as above.
We first thought it was a load problem on the LB, but since many other
services work fine, it can't really be that.


Thanks for your patience.


Heiko
> --
> Marc
>
> -------------------------------------------------------------------------
> Check out the new SourceForge.net Marketplace.
> It's the best place to buy or sell services for
> just about anything Open Source.
> http://sourceforge.net/services/buy/index.php
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null
>
