From support at crazynetwork.it  Mon Dec  3 16:41:30 2012
From: support at crazynetwork.it (Supporto Tecnico - Crazy Network)
Date: Mon, 03 Dec 2012 16:41:30 +0100
Subject: Nagios 3.99.95 contact_groups
In-Reply-To: <50AB8D20.2020806@crazynetwork.it>
References: <50AB8D20.2020806@crazynetwork.it>
Message-ID: <50BCC82A.8050607@crazynetwork.it>

Hi,

I re-installed Nagios (from the git repo) using:

git clone git://github.com/ageric/nagios.git

but I still get the same error at debug level 32:

[1354549057.568949] [032.0] [pid=4151] ** Service Notification Attempt ** Host: 'Server.SysAdminDiary.it', Service: 'Server FTP', Type: 0, Options: 0, Current State: 2, Last Notification: Thu Jan  1 01:00:00 1970
[1354549057.569017] [032.0] [pid=4151] Notification viability test passed.
[1354549057.569029] [032.1] [pid=4151] Current notification number: 1 (incremented)
[1354549057.569040] [032.1] [pid=4151] Service notification will NOT be escalated.
[1354549057.569049] [032.1] [pid=4151] Adding normal contacts for service to notification list.
[1354549057.569056] [032.0] [pid=4151] No contacts were found for notification purposes.  No notification was sent out.

Here is the contacts file:

define contact{
         name                            cn-contacts
         service_notification_period     24x7
         host_notification_period        24x7
         service_notification_options    c,r
         host_notification_options       d,u,r
         service_notification_commands   notify-service-by-email
         host_notification_commands      notify-host-by-email
         register                        0
}


define contact{
         contact_name                    andrea.iannucci
         use                             cn-contacts
         alias                           Andrea Iannucci
         email                           support at crazynetwork.it
         contact_groups                  sysadmindiary.it
         service_notification_commands   notify-service-by-email,notify-service-by-sms
         host_notification_commands      notify-host-by-email,notify-host-by-sms
         pager                           00391234567890
}

define contactgroup{
         contactgroup_name               sysadmindiary.it
         alias                           SysAdminDiary.it Staff
}

Here is the service definition:

define service{
         name                            cn-ftp-service
         active_checks_enabled           1
         passive_checks_enabled          1
         parallelize_check               1
         obsess_over_service             1
         check_freshness                 0
         notifications_enabled           1
         event_handler_enabled           1
         flap_detection_enabled          1
         process_perf_data               1
         retain_status_information       1
         retain_nonstatus_information    1
         is_volatile                     0
         servicegroups                   ftp-server
         service_description             Server FTP
         check_command                   check_ftp
         icon_image                      ftp.png
         check_period                    24x7
         max_check_attempts              3
         normal_check_interval           5
         retry_check_interval            2
         contact_groups                  sysadmindiary.it
         notification_options            c,r
         notification_interval           60
         notification_period             24x7
         register                        0
}

define servicegroup {
     servicegroup_name               ftp-server
     alias                           FTP Server
}

Here is the service definition in the server's host file:

define service{
     use                         cn-ftp-service
     host_name                   Server.SysAdminDiary.it
}

Any suggestions or hints?

Thanks

On 20/11/2012 15:01, Supporto Tecnico - Crazy Network wrote:
>    Hi,
>
> I updated Nagios to this alpha release for testing, using configurations
> that work on Nagios 3.4.1.
>
> The only big problem I'm having is that I don't receive any notifications.
>
> Here is contacts.cfg:
>
> define contact{
>           name                            cn-contacts
>           service_notification_period     24x7
>           host_notification_period        24x7
>           service_notification_options    c,r
>           host_notification_options       d,u,r
>           service_notification_commands   notify-service-by-email
>           host_notification_commands      notify-host-by-email
>           register                        0
> }
>
> define contact{
>           contact_name                    andrea.iannucci
>           use                             cn-contacts
>           alias                           Andrea Iannucci
>           email                           support at crazynetwork.it
>           contact_groups                   crazynetwork.it
>           service_notification_commands   notify-service-by-email,notify-service-by-sms
>           host_notification_commands      notify-host-by-email,notify-host-by-sms
>           pager                           *************
> }
>
> define contactgroup{
>           contactgroup_name               crazynetwork.it
>           alias                           CrazyNetwork.it Staff
> }
>
>
> I tried both contactgroups and contact_groups in the contact definition.
> I also tried adding 'members andrea.iannucci' to the contactgroup definition.
>
> The error I can see at debug_level=32 is that, for the group crazynetwork.it,
> it can't find any member (in either case).
>
> Setting contacts instead of contact_groups in the service definitions makes
> notifications work.
>
> Here is the debug output:
>
> [1353420007.497082] [032.0] [pid=14604] ** Service Notification Attempt ** Host: 'Test.CrazyNetwork.it', Service: 'MySQL', Type: 0, Options: 0, Current State: 0, Last Notification: Thu Jan  1 01:00:00 1970
> [1353420007.497154] [032.0] [pid=14604] Notification viability test passed.
> [1353420007.497181] [032.1] [pid=14604] Current notification number: 2 (incremented)
> [1353420007.497195] [032.2] [pid=14604] Creating list of contacts to be notified.
> [1353420007.497206] [032.1] [pid=14604] Service notification will NOT be escalated.
> [1353420007.497217] [032.1] [pid=14604] Adding normal contacts for service to notification list.
> [1353420007.497233] [032.2] [pid=14604] Adding members of contact group 'crazynetwork.it' for service to notification list.
> [1353420007.497245] [032.0] [pid=14604] No contacts were found for notification purposes.  No notification was sent out.
>
>
> I hope someone can help me fix this, because it worked on the older version
> and right now it looks like a bug to me.
>
> Thanks


-- 
Andrea Iannucci
----------------------------

----------------------------
Crazy Network di Iannucci Andrea
Viale G.B. Lulli, 24
00050 Cerveteri - RM
(w) www.crazynetwork.it
(e) andrea.iannucci at crazynetwork.it
(t) +39 06 62279876
(f) +39 06 62298767
(m) +39 338 8552885


------------------------------------------------------------------------------
Keep yourself connected to Go Parallel: 
BUILD Helping you discover the best ways to construct your parallel projects.
http://goparallel.sourceforge.net
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
::: Messages without supporting info will risk being sent to /dev/null


From ftlnagios at gmail.com  Mon Dec  3 18:38:58 2012
From: ftlnagios at gmail.com (FTL Nagios)
Date: Mon, 3 Dec 2012 17:38:58 -0000
Subject: Nagios is ignoring the retry_interval setting
In-Reply-To: <2d7c52241f12c9d3419ef557df8509a4.squirrel@picard.linux.it>
References: <A52C996F892310499AF1F3DDA60A9E6DA1944A8B@FTLMAIL.evesham.fulgent.co.uk>
	<2d7c52241f12c9d3419ef557df8509a4.squirrel@picard.linux.it>
Message-ID: <009401cdd17d$13ec1a90$3bc44fb0$@gmail.com>

Hi Giorgio,

Apologies for the delay.

I will do this first thing tomorrow morning (Tue 4th Dec) and post the
debug log then.

Thank you


-----Original Message-----
From: Giorgio Zarrelli [mailto:zarrelli at linux.it] 
Sent: 29 November 2012 19:24
To: Nagios Users List
Subject: Re: [Nagios-users] Nagios is ignoring the retry_interval setting

Hi,

I do not see anything wrong. Could you set debug_level=-1,
reproduce the problem and put the log online?

Giorgio

<quote who="Andrew Thompson">
> Hi Giorgio,
>
> The whole test cfg I am using to try to troubleshoot this can be found at:
>
> http://dl.dropbox.com/u/895609/test.cfg
>
> This is a direct copy of my main servers config but with the rest of 
> the servers and some templates for other server checks taken out
>
>
>
> Kind Regards
> Andrew
>
> From: Andrew Thompson
> Sent: 29 November 2012 16:11
> To: nagios-users at lists.sourceforge.net
> Subject: Nagios is ignoring the retry_interval setting
>
> Hi,
>
> My nagios box has decided to stop listening to the retry_interval 
> entry in my templates.
>
> My server template reads:
>
> define host{
>      name                       host-server
>      check_period              server_24x7
>      check_interval            1
>      retry_interval            3
>      max_check_attempts        3
>      notification_period       server_24x7
>      notification_interval      3
>      notification_options      d,r
>      notifications_enabled      1
>      contact_groups            servers email, servers sms
>      event_handler_enabled      1
>      process_perf_data         1
>      retain_status_information    1
>      retain_nonstatus_information 1
>      passive_checks_enabled          0
>      obsess_over_host          0
>      check_freshness          0
>      flap_detection_enabled          0
>      failure_prediction_enabled   0
>      }
>
> Now this is what happens:
>
>
> *         Server goes down at 1pm.
>
> *         I check the next scheduled check and it clearly states 1.03pm
>
> *         But at 1.01pm it checks again and then spits out an email and
> text message saying the server is down.
>
> Completely ignoring the retry_interval setting!!!
>
> I'd expect from the above:
>
>
> *         1pm server goes down
>
> *         1.03pm check 2 is done
>
> *         1.06pm check 3 is done and a hard state is determined.
>
> *         At 1.06pm the notification should be sent out.
>
> Why is this? Is something in my config wrong?
>
> Ubuntu 12.04 desktop and Nagios 3.4.1
>
> Thanks




From Edwin.Zoeller at ama-assn.org  Mon Dec  3 19:17:11 2012
From: Edwin.Zoeller at ama-assn.org (Edwin Zoeller)
Date: Mon, 3 Dec 2012 18:17:11 +0000
Subject: Listing Contacts, Groups, Services, etc.
Message-ID: <EFEEE1306BFAAD4890A78E609ECE72035288EC91@UTLP5162.ad.ama-assn.org>

I may have missed something along these lines and if I did, I do apologize.

What I need is a way to get a listing of all the contacts, contact groups, hosts and services. My management wants to see who, what and how. I am running XI, if that makes a difference.

Thanks,

Ed Zoeller

From rodrigo.gesswein at gmail.com  Mon Dec  3 23:57:22 2012
From: rodrigo.gesswein at gmail.com (Rodrigo Gesswein)
Date: Mon, 3 Dec 2012 19:57:22 -0300
Subject: Service scheduled for one week later....
Message-ID: <CAOpWx3vFGML1_9RDMCWmkHkRXVbx7R8Prbq7yW1o59gquDA2FA@mail.gmail.com>

Dear All:

        I have a curious question about 'timeperiod' and 'service'. My 'cfg'
files look like:

define timeperiod{
        timeperiod_name DayTime
        alias           DayTime
        monday          08:00-19:00
        tuesday         08:00-19:00
        wednesday       08:00-19:00
        thursday        08:00-19:00
        friday          08:00-19:00
        saturday        08:00-19:00
        sunday          08:00-19:00
}

define timeperiod{
        timeperiod_name NightTime
        alias           NightTime
        monday          19:01-07:59
        tuesday         19:01-07:59
        wednesday       19:01-07:59
        thursday        19:01-07:59
        friday          19:01-07:59
        saturday        19:01-07:59
        sunday          19:01-07:59
}

define service {
        host_name               XXXX
        service_description     NightTime Data Partition: /diska
        check_command           check_nrpe!check_disk!20%!10%!/diska!MB
        check_period            NightTime
}

define service {
        host_name               XXXX
        service_description     DayTime Data Partition: /diska
        check_command           check_nrpe!check_disk!40%!20%!/diska!MB
        check_period            DayTime
}

        However, 'NightTime Data Partition: /diska' was scheduled for
Dec. 11th 19:01, one week later, while 'DayTime Data Partition: /diska' was
scheduled as expected.

        Any ideas ??

        Thank you

Rodrigo.



From Martin_Hugo at hboe.org  Tue Dec  4 04:26:03 2012
From: Martin_Hugo at hboe.org (Martin Hugo)
Date: Tue, 4 Dec 2012 03:26:03 +0000
Subject: Weird Nagios Problem
Message-ID: <70D7A4219FE365439DA53582BA6D8F2EF5DF3F2D@HCSD-MAIL1>

I have been running Nagios for over a year with no issues.  All of a sudden, the current load checks on all of my Linux servers go into warning state at the same time, showing the exact same load, which then increments every hour until it reaches critical.  After a while (3 or 4 hours) they all come back down to normal.

Checking on the servers themselves using htop shows normal load levels throughout the time period.

Another issue is one check_Interface_Table check that returns "255 out of bounds", while over 30 others (in the same service group, using the same command) return normal.

The problem is obviously with Nagios but I don't know where to look.

Any suggestions?

Martin T. Hugo
Network Administrator
Hilliard City Schools
Tel: 614-921-7102

From ae at op5.se  Tue Dec  4 10:24:06 2012
From: ae at op5.se (Andreas Ericsson)
Date: Tue, 04 Dec 2012 10:24:06 +0100
Subject: servicedependecy is not working
In-Reply-To: <CAG+8EEZf9EPz2sXXth=GO9AjUe61XeAtDcNs=G=dj1FKqVqxKg@mail.gmail.com>
References: <CAG+8EEZf9EPz2sXXth=GO9AjUe61XeAtDcNs=G=dj1FKqVqxKg@mail.gmail.com>
Message-ID: <50BDC136.2010503@op5.se>

On 11/30/2012 05:21 PM, Leonardo Bacha Abrantes wrote:
> Hi guys,
> 
> I use NRPE to monitor my machines and I configured a servicedependency (see
> below); however, Nagios is still sending alerts when NRPE is in a critical state.
> 
> define servicedependency{
>           dependent_host_name            srv1
>           dependent_service_description  /var Partition
>           host_name                      srv1
>           service_description            NRPE plugin
>           execution_failure_criteria     c,u,p
>           notification_failure_criteria  c,u,p
>           dependency_period              24x7
> }
> 
> Can you help me please ?
> 

Well, you should be getting a notification about 'NRPE plugin' even
with this servicedependency in place. Are you saying you get one for
'/var partition' as well, even if the 'NRPE plugin' check is either
critical, unknown or pending?

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.



From ae at op5.se  Tue Dec  4 11:47:27 2012
From: ae at op5.se (Andreas Ericsson)
Date: Tue, 04 Dec 2012 11:47:27 +0100
Subject: Nagios 3.99.95 contact_groups
In-Reply-To: <50BCC82A.8050607@crazynetwork.it>
References: <50AB8D20.2020806@crazynetwork.it>
	<50BCC82A.8050607@crazynetwork.it>
Message-ID: <50BDD4BF.9010109@op5.se>

On 12/03/2012 04:41 PM, Supporto Tecnico - Crazy Network wrote:
>    Hi,
> 
> i did re-install nagios again (from git repo) using: git clone
> git://github.com/ageric/nagios.git
> 
> But still got same error from debug level 32:
> 
> [1354549057.568949] [032.0] [pid=4151] ** Service Notification Attempt
> ** Host: 'Server.SysAdminDiary.it', Service: 'Server FTP', Type: 0,
> Options: 0, Current State: 2, Last Notification: Thu Jan  1 01:00:00 1970
> [1354549057.569017] [032.0] [pid=4151] Notification viability test passed.
> [1354549057.569029] [032.1] [pid=4151] Current notification number: 1
> (incremented)
> [1354549057.569040] [032.1] [pid=4151] Service notification will NOT be
> escalated.
> [1354549057.569049] [032.1] [pid=4151] Adding normal contacts for
> service to notification list.
> [1354549057.569056] [032.0] [pid=4151] No contacts were found for
> notification purposes.  No notification was sent out.
> 
> Here contacts file:
> 
> define contact{
>           name                            cn-contacts
>           service_notification_period     24x7
>           host_notification_period        24x7
>           service_notification_options    c,r
>           host_notification_options       d,u,r
>           service_notification_commands   notify-service-by-email
>           host_notification_commands      notify-host-by-email
>           register                        0
> }
> 
> 
> define contact{
>           contact_name                    andrea.iannucci
>           use                             cn-contacts
>           alias                           Andrea Iannucci
>           email                           support at crazynetwork.it
>           contact_groups                  sysadmindiary.it
>           service_notification_commands   notify-service-by-email,notify-service-by-sms
>           host_notification_commands      notify-host-by-email,notify-host-by-sms
>           pager                           00391234567890
> }
> 
> define contactgroup{
>           contactgroup_name               sysadmindiary.it
>           alias                           SysAdminDiary.it Staff
> }
> 
> Here service definition:
> 
> define service{
>           name                            cn-ftp-service
>           active_checks_enabled           1
>           passive_checks_enabled          1
>           parallelize_check               1
>           obsess_over_service             1
>           check_freshness                 0
>           notifications_enabled           1
>           event_handler_enabled           1
>           flap_detection_enabled          1
>           process_perf_data               1
>           retain_status_information       1
>           retain_nonstatus_information    1
>           is_volatile                     0
>           servicegroups                   ftp-server
>           service_description             Server FTP
>           check_command                   check_ftp
>           icon_image                      ftp.png
>           check_period                    24x7
>           max_check_attempts              3
>           normal_check_interval           5
>           retry_check_interval            2
>           contact_groups                  sysadmindiary.it
>           notification_options            c,r
>           notification_interval           60
>           notification_period             24x7
>           register                        0
> }
> 
> define servicegroup {
>       servicegroup_name               ftp-server
>       alias                           FTP Server
> }
> 
> Here service call in server host file.
> 
> define service{
>       use                         cn-ftp-service
>       host_name                   Server.SysAdminDiary.it
> }
> 
> Any suggestions/hint?
> 

So the contact group is inherited from a template. That might be a clue. Can
you verify whether the contact group is present on the proper services in
your objects.cache file?

Also, if you're not keeping secrets in the configuration, it would
help if I could get my hands on it so I can see the problem for myself.
If template inheritance is broken for contacts and/or contactgroups,
that's a pretty serious issue.
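
For reference, a service that inherited the group correctly would be expected
to show up in objects.cache roughly like the sketch below. This is an
illustration built from the definitions quoted above, not output from a real
run; the other directives are omitted and the exact layout may differ
between versions.

```
define service {
	host_name	Server.SysAdminDiary.it
	service_description	Server FTP
	# ... other directives omitted ...
	contact_groups	sysadmindiary.it
	}
```

If the contact_groups line is missing from that entry, that would point at
template resolution rather than at the notification logic.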

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231



From ae at op5.se  Tue Dec  4 11:49:01 2012
From: ae at op5.se (Andreas Ericsson)
Date: Tue, 04 Dec 2012 11:49:01 +0100
Subject: Service scheduled for one week later....
In-Reply-To: <CAOpWx3vFGML1_9RDMCWmkHkRXVbx7R8Prbq7yW1o59gquDA2FA@mail.gmail.com>
References: <CAOpWx3vFGML1_9RDMCWmkHkRXVbx7R8Prbq7yW1o59gquDA2FA@mail.gmail.com>
Message-ID: <50BDD51D.1080708@op5.se>

On 12/03/2012 11:57 PM, Rodrigo Gesswein wrote:
> Dear All:
> 
>          I have a curios question about 'timeperiod' and 'service'. My 'cfg'
> files look like:
> 
> define timeperiod{
>          timeperiod_name DayTime
>          alias           DayTime
>          monday          08:00-19:00
>          tuesday         08:00-19:00
>          wednesday       08:00-19:00
>          thursday        08:00-19:00
>          friday          08:00-19:00
>          saturday        08:00-19:00
>          sunday          08:00-19:00
> }
> 
> define timeperiod{
>          timeperiod_name NightTime
>          alias           NightTime
>          monday          19:01-07:59
>          tuesday         19:01-07:59
>          wednesday       19:01-07:59
>          thursday        19:01-07:59
>          friday          19:01-07:59
>          saturday        19:01-07:59
>          sunday          19:01-07:59
> }
> 

This is wonky. Make it '00:00-07:59,19:01-24:00' instead.
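
Spelled out for the NightTime period, that might look like the sketch below.
It assumes the problem is that each 19:01-07:59 range ends earlier in the day
than it starts; the names are copied from the original post, and this has not
been tested against any particular Nagios version.

```
define timeperiod{
        timeperiod_name NightTime
        alias           NightTime
        # Each day is split into a morning part and an evening part,
        # so no single range runs past midnight.
        monday          00:00-07:59,19:01-24:00
        tuesday         00:00-07:59,19:01-24:00
        wednesday       00:00-07:59,19:01-24:00
        thursday        00:00-07:59,19:01-24:00
        friday          00:00-07:59,19:01-24:00
        saturday        00:00-07:59,19:01-24:00
        sunday          00:00-07:59,19:01-24:00
}
```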

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231



From benny at bennyvision.com  Tue Dec  4 12:04:48 2012
From: benny at bennyvision.com (C. Bensend)
Date: Tue, 4 Dec 2012 05:04:48 -0600
Subject: Weird Nagios Problem
In-Reply-To: <70D7A4219FE365439DA53582BA6D8F2EF5DF3F2D@HCSD-MAIL1>
References: <70D7A4219FE365439DA53582BA6D8F2EF5DF3F2D@HCSD-MAIL1>
Message-ID: <b3b2d20f4f0f18afc4abfced11979134.squirrel@webmail.stinkweasel.net>


> I have been running Nagios for over a year with no issues.  All of a
> sudden, all of my current loads on my linux servers all go into warning
> state at the same time, showing the exact same load, which then increments
> every hour to critical.  After a while (3 or 4 hours)  they all come back
> down to normal.
>
> Checking on the servers themselves using HTOP shows normal load levels
> throughout the time period.

Hmmm, yeah.  Check that service and checkcommand definition.  I bet
you're actually testing the load on the *Nagios* server, and not
the individual servers you think you're testing it on.

What's the Nagios server's load during that time?  I bet it matches
up...
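
As a rough illustration of the difference (a sketch only; the command names
follow the stock sample configs and the NRPE documentation, and may not match
what is actually defined on the server in question):

```
# Runs check_load on the Nagios server itself, whatever host it is attached to
define command{
        command_name    check_local_load
        command_line    $USER1$/check_load -w $ARG1$ -c $ARG2$
}

# Asks the NRPE daemon on the monitored host to run its own check_load
define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
```

A service that uses the first kind of command would report the Nagios
server's load for every host it is attached to, which would explain identical
values across all servers.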


-- 
"Unless you're a lawyer, you don't understand Oracle licensing.
That applies equally to Oracle employees as well as customers."
                                  -- Me, 2012-05-10




From rodrigo.gesswein at gmail.com  Tue Dec  4 12:32:41 2012
From: rodrigo.gesswein at gmail.com (Rodrigo Gesswein)
Date: Tue, 4 Dec 2012 08:32:41 -0300
Subject: Service scheduled for one week later....
In-Reply-To: <50BDD51D.1080708@op5.se>
References: <CAOpWx3vFGML1_9RDMCWmkHkRXVbx7R8Prbq7yW1o59gquDA2FA@mail.gmail.com>
	<50BDD51D.1080708@op5.se>
Message-ID: <CAOpWx3udN5oNweMd67i_CYFnnFLuhO0G+o1tTBPq7dw2r-Nh+g@mail.gmail.com>

Dear Andreas:

> This is wonky. Make it '00:00-07:59,19:01-24:00' instead.

I tried that, but the services are still scheduled for one week later
(Dec. 11th). I'm using Nagios version 3.3.1.

Thank you
Rodrigo


On Tue, Dec 4, 2012 at 7:49 AM, Andreas Ericsson <ae at op5.se> wrote:
> On 12/03/2012 11:57 PM, Rodrigo Gesswein wrote:
>> Dear All:
>>
>>          I have a curious question about 'timeperiod' and 'service'. My 'cfg'
>> files look like:
>>
>> define timeperiod{
>>          timeperiod_name DayTime
>>          alias           DayTime
>>          monday          08:00-19:00
>>          tuesday         08:00-19:00
>>          wednesday       08:00-19:00
>>          thursday        08:00-19:00
>>          friday          08:00-19:00
>>          saturday        08:00-19:00
>>          sunday          08:00-19:00
>> }
>>
>> define timeperiod{
>>          timeperiod_name NightTime
>>          alias           NightTime
>>          monday          19:01-07:59
>>          tuesday         19:01-07:59
>>          wednesday       19:01-07:59
>>          thursday        19:01-07:59
>>          friday          19:01-07:59
>>          saturday        19:01-07:59
>>          sunday          19:01-07:59
>> }
>>
>
> This is wonky. Make it '00:00-07:59,19:01-24:00' instead.
>
> --
> Andreas Ericsson                   andreas.ericsson at op5.se
> OP5 AB                             www.op5.se
> Tel: +46 8-230225                  Fax: +46 8-230231
>
> Considering the successes of the wars on alcohol, poverty, drugs and
> terror, I think we should give some serious thought to declaring war
> on peace.
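
For reference, a sketch of the NightTime period with the suggested ranges
applied. It assumes (as far as I know, this is how Nagios reads them)
that a time range must stay within a single day and cannot wrap past
midnight, which is why 19:01-07:59 is split into two ranges per day.

```cfg
# NightTime rewritten so that no range crosses midnight.
define timeperiod{
        timeperiod_name NightTime
        alias           NightTime
        monday          00:00-07:59,19:01-24:00
        tuesday         00:00-07:59,19:01-24:00
        wednesday       00:00-07:59,19:01-24:00
        thursday        00:00-07:59,19:01-24:00
        friday          00:00-07:59,19:01-24:00
        saturday        00:00-07:59,19:01-24:00
        sunday          00:00-07:59,19:01-24:00
}
```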



From hdperfors at che.nl  Tue Dec  4 12:34:24 2012
From: hdperfors at che.nl (Perfors, Henny)
Date: Tue, 4 Dec 2012 12:34:24 +0100
Subject: netapp
Message-ID: <944EFA13E9A01F47823F16D30BD9DAAB160E8261AD@CHECEXCL02.che.local>

Hi,

I can get the check_netapp command to check the volumes, which is what it is supposed to do, so that is fine.
But can it also check LUNs?

Does anyone have an idea?

Henry


From nagios at flatto.net  Tue Dec  4 12:49:08 2012
From: nagios at flatto.net (Assaf Flatto)
Date: Tue, 04 Dec 2012 11:49:08 +0000
Subject: netapp
In-Reply-To: <944EFA13E9A01F47823F16D30BD9DAAB160E8261AD@CHECEXCL02.che.local>
References: <944EFA13E9A01F47823F16D30BD9DAAB160E8261AD@CHECEXCL02.che.local>
Message-ID: <50BDE334.7030306@flatto.net>

On 04/12/12 11:34, Perfors, Henny wrote:
>
> Hi,
>
> I can get the check_netapp command to check the volumes which it is 
> supposed to do, so that is fine.
>
> But can it also check LUNS?
>
> Anyone an idea?
>
> Henry
>
>
Do the plugin docs say it can?

If a feature is not documented for a plugin, chances are the plugin
cannot do it.

Have you looked at the Monitoring Exchange for a plugin that actually
says it monitors LUNs?



From support at crazynetwork.it  Tue Dec  4 14:29:45 2012
From: support at crazynetwork.it (Supporto Tecnico - Crazy Network)
Date: Tue, 04 Dec 2012 14:29:45 +0100
Subject: Nagios 3.99.95 contact_groups
In-Reply-To: <50BDD4BF.9010109@op5.se>
References: <50AB8D20.2020806@crazynetwork.it>
	<50BCC82A.8050607@crazynetwork.it> <50BDD4BF.9010109@op5.se>
Message-ID: <50BDFAC9.9040508@crazynetwork.it>

  Here is the relevant part of objects.cache:

define servicegroup {
         servicegroup_name       ftp-server
         alias   FTP Server
         }

define service {
         host_name       Server.SysAdminDiary.it
         service_description     Server FTP
         check_period    24x7
         check_command   check_ftp
         contact_groups  sysadmindiary.it
         notification_period     24x7
         initial_state   o
         hourly_value    1
         check_interval  5.000000
         retry_interval  2.000000
         max_check_attempts      3
         is_volatile     0
         parallelize_check       1
         active_checks_enabled   1
         passive_checks_enabled  1
         obsess  1
         event_handler_enabled   1
         low_flap_threshold      0.000000
         high_flap_threshold     0.000000
         flap_detection_enabled  1
         flap_detection_options  a
         freshness_threshold     0
         check_freshness 0
         notification_options    r,c
         notifications_enabled   1
         notification_interval   60.000000
         first_notification_delay        0.000000
         stalking_options        n
         process_perf_data       1
         icon_image      ftp.png
         retain_status_information       1
         retain_nonstatus_information    1
         }


define contact {
         contact_name    andrea.iannucci
         alias   Andrea Iannucci
         service_notification_period     24x7
         host_notification_period        24x7
         service_notification_options    r,c

         host_notification_options       r,d,u
         service_notification_commands   notify-service-by-email,notify-service-by-sms
         host_notification_commands      notify-host-by-email,notify-host-by-sms
         email   support at crazynetwork.it
         pager   00393388552885
         minimum_value   1
         host_notifications_enabled      1
         service_notifications_enabled   1
         can_submit_commands     1
         retain_status_information       1
         retain_nonstatus_information    1
         }


Apparently the contact_groups directive disappears from the contact
definition in objects.cache, and an empty line appears in its place.

I'm preparing the box so I can send you the login data privately and you
can check for yourself.
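
A possible cross-check (an untested assumption on my side, not a
confirmed fix): declare the membership from the group side with the
standard members directive instead of relying on contact_groups in the
contact, then see whether the contact reaches the notification list.

```cfg
# Same contactgroup as in the original config, with the member listed explicitly.
define contactgroup{
        contactgroup_name               sysadmindiary.it
        alias                           SysAdminDiary.it Staff
        members                         andrea.iannucci   ; membership declared on the group side
}
```

If notifications work this way, that would point at how contact_groups
inside the contact definition is handled rather than at the service.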

Thanks

On 04/12/2012 11:47, Andreas Ericsson wrote:
> On 12/03/2012 04:41 PM, Supporto Tecnico - Crazy Network wrote:
>>     Hi,
>>
>> i did re-install nagios again (from git repo) using: git clone
>> git://github.com/ageric/nagios.git
>>
>> But still got same error from debug level 32:
>>
>> [1354549057.568949] [032.0] [pid=4151] ** Service Notification Attempt
>> ** Host: 'Server.SysAdminDiary.it', Service: 'Server FTP', Type: 0,
>> Options: 0, Current State: 2, Last Notification: Thu Jan  1 01:00:00 1970
>> [1354549057.569017] [032.0] [pid=4151] Notification viability test passed.
>> [1354549057.569029] [032.1] [pid=4151] Current notification number: 1
>> (incremented)
>> [1354549057.569040] [032.1] [pid=4151] Service notification will NOT be
>> escalated.
>> [1354549057.569049] [032.1] [pid=4151] Adding normal contacts for
>> service to notification list.
>> [1354549057.569056] [032.0] [pid=4151] No contacts were found for
>> notification purposes.  No notification was sent out.
>>
>> Here contacts file:
>>
>> define contact{
>>            name                            cn-contacts
>>            service_notification_period     24x7
>>            host_notification_period        24x7
>>            service_notification_options    c,r
>>            host_notification_options       d,u,r
>>            service_notification_commands   notify-service-by-email
>>            host_notification_commands      notify-host-by-email
>>            register                        0
>> }
>>
>>
>> define contact{
>>            contact_name                    andrea.iannucci
>>            use                             cn-contacts
>>            alias                           Andrea Iannucci
>>            email                           support at crazynetwork.it
>>            contact_groups                  sysadmindiary.it
>>            service_notification_commands   notify-service-by-email,notify-service-by-sms
>>            host_notification_commands      notify-host-by-email,notify-host-by-sms
>>            pager                           00391234567890
>> }
>>
>> define contactgroup{
>>            contactgroup_name               sysadmindiary.it
>>            alias                           SysAdminDiary.it Staff
>> }
>>
>> Here service definition:
>>
>> define service{
>>            name                            cn-ftp-service
>>            active_checks_enabled           1
>>            passive_checks_enabled          1
>>            parallelize_check               1
>>            obsess_over_service             1
>>            check_freshness                 0
>>            notifications_enabled           1
>>            event_handler_enabled           1
>>            flap_detection_enabled          1
>>            process_perf_data               1
>>            retain_status_information       1
>>            retain_nonstatus_information    1
>>            is_volatile                     0
>>            servicegroups                   ftp-server
>>            service_description             Server FTP
>>            check_command                   check_ftp
>>            icon_image                      ftp.png
>>            check_period                    24x7
>>            max_check_attempts              3
>>            normal_check_interval           5
>>            retry_check_interval            2
>>            contact_groups                  sysadmindiary.it
>>            notification_options            c,r
>>            notification_interval           60
>>            notification_period             24x7
>>            register                        0
>> }
>>
>> define servicegroup {
>>        servicegroup_name               ftp-server
>>        alias                           FTP Server
>> }
>>
>> Here service call in server host file.
>>
>> define service{
>>        use                         cn-ftp-service
>>        host_name                   Server.SysAdminDiary.it
>> }
>>
>> Any suggestions/hint?
>>
> So contactgroup is inherited from template. That might be a clue. Can
> you verify if the contactgroup is present on the proper services in
> your objects.cache file?
>
> Also, if you're not keeping secrets in the configuration, it would
> help if I could get my hands on it so I can see the problem for myself.
> If template inheritance is broken for contacts and/or contactgroups,
> that's a pretty serious issue.
>


-- 
Andrea Iannucci
----------------------------

----------------------------
Crazy Network di Iannucci Andrea
Viale G.B. Lulli, 24
00050 Cerveteri - RM
(w) www.crazynetwork.it
(e) andrea.iannucci at crazynetwork.it
(t) +39 06 62279876
(f) +39 06 62298767
(m) +39 338 8552885

-------------------------------------------------------------------------------
Please consider our environmental responsibility before printing this 
e-mail. Thank you.
-------------------------------------------------------------------------------
This e-mail is confidential and may also contain privileged information. 
If you are not the intended recipient you are not authorised to read, 
print, save, process or disclose this message. If you have received this 
message by mistake, please inform the sender immediately and delete this 
e-mail, its attachments and any copies.
Any use, distribution, reproduction or disclosure by any person other 
than the intended recipient is strictly prohibited and the person 
responsible may incur penalties.
-------------------------------------------------------------------------------- 




From hdperfors at che.nl  Tue Dec  4 14:54:41 2012
From: hdperfors at che.nl (Perfors, Henny)
Date: Tue, 4 Dec 2012 14:54:41 +0100
Subject: netapp
In-Reply-To: <50BDE334.7030306@flatto.net>
References: <944EFA13E9A01F47823F16D30BD9DAAB160E8261AD@CHECEXCL02.che.local>
	<50BDE334.7030306@flatto.net>
Message-ID: <944EFA13E9A01F47823F16D30BD9DAAB160E8261D6@CHECEXCL02.che.local>

That's right, it is not documented for this plugin.
I cannot find such a plugin. I hope someone has a good solution.


From: Assaf Flatto [mailto:nagios at flatto.net]
Sent: Tuesday, 4 December 2012 12:49
To: Nagios Users List
Subject: Re: [Nagios-users] netapp

On 04/12/12 11:34, Perfors, Henny wrote:
Hi,

I can get the check_netapp command to check the volumes which it is supposed to do, so that is fine.
But can it also check LUNS?

Anyone an idea?

Henry

Does the plugin docs say it can ?

If a feature is not documented for a plugin  , then the chances are it is not capable of doing that feature.

have you looked at the monitoring exchange for a plugin that actually say it monitors LUNS ?


From Martin_Hugo at hboe.org  Tue Dec  4 15:33:11 2012
From: Martin_Hugo at hboe.org (Martin Hugo)
Date: Tue, 4 Dec 2012 14:33:11 +0000
Subject: Weird Nagios Problem
In-Reply-To: <b3b2d20f4f0f18afc4abfced11979134.squirrel@webmail.stinkweasel.net>
References: <70D7A4219FE365439DA53582BA6D8F2EF5DF3F2D@HCSD-MAIL1>
	<b3b2d20f4f0f18afc4abfced11979134.squirrel@webmail.stinkweasel.net>
Message-ID: <70D7A4219FE365439DA53582BA6D8F2EF5DF4B53@HCSD-MAIL1>

You are right, it was using check_local_load. Is there a remote version of this command?

Thanks.

Marty

-----Original Message-----
From: C. Bensend [mailto:benny at bennyvision.com] 
Sent: Tuesday, December 04, 2012 6:05 AM
To: nagios-users at lists.sourceforge.net
Subject: Re: [Nagios-users] Weird Nagios Problem


> I have been running Nagios for over a year with no issues.  All of a 
> sudden, all of my current loads on my linux servers all go into 
> warning state at the same time, showing the exact same load, which 
> then increments every hour to critical.  After a while (3 or 4 hours)  
> they all come back down to normal.
>
> Checking on the servers themselves using HTOP shows normal load levels 
> throughout the time period.

Hmmm, yeah.  Check that service and checkcommand definition.  I bet you're actually testing the load on the *Nagios* server, and not the individual servers you think you're testing it on.

What's the Nagios server's load during that time?  I bet it matches up...


--
"Unless you're a lawyer, you don't understand Oracle licensing.
That applies equally to Oracle employees as well as customers."
                                  -- Me, 2012-05-10




From ck at claudiokuenzler.com  Tue Dec  4 15:50:13 2012
From: ck at claudiokuenzler.com (Claudio Kuenzler)
Date: Tue, 4 Dec 2012 15:50:13 +0100
Subject: Weird Nagios Problem
In-Reply-To: <70D7A4219FE365439DA53582BA6D8F2EF5DF3F2D@HCSD-MAIL1>
References: <70D7A4219FE365439DA53582BA6D8F2EF5DF3F2D@HCSD-MAIL1>
Message-ID: <CAF-yqgiYjcvwNjBtZw49TxPpcYrnRsP0_a0SVOdj+Sy0TAc-4A@mail.gmail.com>

On Tue, Dec 4, 2012 at 4:26 AM, Martin Hugo <Martin_Hugo at hboe.org> wrote:

> Another issue is one check_Interface_Table that returns 255 out of bounds
> but over 30 others (in the same service group using the same command)
> return normal.


An error like "255 out of bounds" can happen when the plugin returns more
output than Nagios can handle.
That happened to me, for example, when I tried to monitor all ports of a
48-port switch. You'll have to use a regex or some other way to narrow the
output down to the information you really need.

From jeffrey.w.watts at gmail.com  Tue Dec  4 15:53:53 2012
From: jeffrey.w.watts at gmail.com (Jeffrey Watts)
Date: Tue, 4 Dec 2012 08:53:53 -0600
Subject: Weird Nagios Problem
In-Reply-To: <70D7A4219FE365439DA53582BA6D8F2EF5DF4B53@HCSD-MAIL1>
References: <70D7A4219FE365439DA53582BA6D8F2EF5DF3F2D@HCSD-MAIL1>
	<b3b2d20f4f0f18afc4abfced11979134.squirrel@webmail.stinkweasel.net>
	<70D7A4219FE365439DA53582BA6D8F2EF5DF4B53@HCSD-MAIL1>
Message-ID: <CAMvPdm13eSuaDHAo6OBOrZz3Xo=LWC1eywrg0rOwL4sOoPcayw@mail.gmail.com>

Martin, I've always used NRPE to run check_load remotely.  If you use SNMP,
you can also write a custom plugin to gather the values that way.  There
might be a plugin that someone else has written, too.
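
A minimal sketch of the NRPE approach, assuming the check_nrpe plugin is
installed on the Nagios server and the NRPE daemon is running on each
remote host. The host name, template name, paths and thresholds are
placeholders, not taken from Martin's setup.

```cfg
# On the Nagios server: generic wrapper around the check_nrpe plugin.
define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }

# Service that asks the remote NRPE daemon to run its "check_load" command.
define service{
        use                     generic-service     ; hypothetical template name
        host_name               remote-linux-host   ; hypothetical host name
        service_description     Current Load
        check_command           check_nrpe!check_load
        }

# On the remote host, nrpe.cfg needs a matching entry, for example
# (plugin path and thresholds are assumptions):
#   command[check_load]=/usr/lib/nagios/plugins/check_load -w 15,10,5 -c 30,25,20
```

Because $HOSTADDRESS$ is passed to check_nrpe, each host reports its own
load instead of the load of the Nagios server.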

Jeffrey.


On Tue, Dec 4, 2012 at 8:33 AM, Martin Hugo <Martin_Hugo at hboe.org> wrote:

> You are right, it was using check_local_load, is there a remote version of
> this command?
>
>

From admin at dougware.net  Tue Dec  4 16:44:22 2012
From: admin at dougware.net (Doug Eubanks)
Date: Tue, 4 Dec 2012 10:44:22 -0500
Subject: Nagios running checks way too often
Message-ID: <CAG5Afr8w07ebZ=9UxAu+DQv_Tt2b9FNNDKDsOAphcXJGEza7yA@mail.gmail.com>

Nagios is checking services way too often.  It's supposed to check once
every 2 minutes, then fall back to checking once every minute on a failure.

I believe these are the relevant parts of the nagios.cfg file:
sleep_time=20
service_interleave_factor=s
max_concurrent_checks=0
service_reaper_frequency=5
interval_length=60
service_check_timeout=75
host_check_timeout=75
obsess_over_services=1
process_performance_data=1
check_for_orphaned_services=0
check_service_freshness=1
service_inter_check_delay_method=s
use_retained_scheduling_info=1
max_service_check_spread=1
host_inter_check_delay_method=s
max_host_check_spread=1
auto_reschedule_checks=1
auto_rescheduling_interval=15
auto_rescheduling_window=30

Nagios Version Information:
nagios-3.4.1-2.el6.i686

Here's the access log from the service being checked.  This is only one
vhost, and that one vhost is only in Nagios as a service once.
198.17.xxx.xxx - - [04/Dec/2012:10:31:41 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:31:41 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:31:44 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:31:50 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:31:50 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:31:50 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:32:37 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:32:37 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:32:38 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:33:46 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:33:46 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:33:48 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:33:55 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:33:55 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:33:55 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:34:43 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:34:43 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:34:43 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:35:52 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:35:52 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:35:55 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:36:01 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:36:01 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:36:01 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:36:49 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:36:49 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:36:49 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:37:57 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
198.17.xxx.xxx - - [04/Dec/2012:10:37:57 -0500] "GET / HTTP/1.1" 200 22414
"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"

Any suggestions?

Thanks,
Doug Eubanks
admin at dougware.net
K1DUG
(919) 201-8750

From ck at claudiokuenzler.com  Wed Dec  5 08:26:51 2012
From: ck at claudiokuenzler.com (Claudio Kuenzler)
Date: Wed, 5 Dec 2012 08:26:51 +0100
Subject: Nagios running checks way too often
In-Reply-To: <CAG5Afr8w07ebZ=9UxAu+DQv_Tt2b9FNNDKDsOAphcXJGEza7yA@mail.gmail.com>
References: <CAG5Afr8w07ebZ=9UxAu+DQv_Tt2b9FNNDKDsOAphcXJGEza7yA@mail.gmail.com>
Message-ID: <CAF-yqggMxNCuJfv+h5_YHNJ2vv7MrCd6g=WYMjK212WBo9DGmA@mail.gmail.com>

On Tue, Dec 4, 2012 at 4:44 PM, Doug Eubanks <admin at dougware.net> wrote:

> Nagios is checking services way too often.  It's supposed to check once
> every 2 minutes, then failback to checking once every 1 minute on a failure.
>
> I believe this is the relevant parts of the nagios.cfg file:
>

Actually, how often a check is executed is set in the service definition
of that check rather than in those nagios.cfg settings. In most setups the
service pulls in a template through the "use" directive, so you also have
to check the template (typically in your templates.cfg file).

If you don't find it, please post the relevant service definition and the
definition of the template being used by the service.
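
As a rough sketch of what to look for: with interval_length=60 (as in the
posted nagios.cfg), the directives below would give one check every 2
minutes and a retry every minute after a failure. The host name, template
name and description are placeholders, not taken from Doug's
configuration.

```cfg
define service{
        use                     generic-service   ; hypothetical template; it may set its own intervals
        host_name               webserver         ; hypothetical host name
        service_description     HTTP vhost
        check_command           check_http
        max_check_attempts      3
        check_interval          2                 ; in units of interval_length seconds, so 2 minutes
        retry_interval          1                 ; 1 minute while the service is in a soft non-OK state
        }
```

Values set directly in the service take precedence over those inherited
from the template, so both places are worth checking.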

From Eliot.Picken at wenaas.co.uk  Wed Dec  5 08:42:58 2012
From: Eliot.Picken at wenaas.co.uk (Eliot.Picken at wenaas.co.uk)
Date: Wed, 5 Dec 2012 07:42:58 +0000
Subject: AUTO: Eliot Picken is out of the office (returning
	06/12/2012)
Message-ID: <OF2D36638A.3B9B7520-ON80257ACB.002A6301-80257ACB.002A6301@kwintet.com>


I am out of the office until 06/12/2012.

I am currently out of the office. Your email has not been forwarded, and
will be actioned on my return.

For emergency issues, please contact Alex Lawrie on +44 (0) 1224 894 000


Note: This is an automated response to your message  "Re: [Nagios-users]
Nagios running checks way too often" sent on 05/12/2012 07:26:51.

This is the only notification you will receive while this person is away.

From ae at op5.se  Wed Dec  5 12:31:09 2012
From: ae at op5.se (Andreas Ericsson)
Date: Wed, 05 Dec 2012 12:31:09 +0100
Subject: servicedependecy is not working
In-Reply-To: <CAG+8EEZUYn3BQEMSinBGch+y43yN0zruFAu2-oSAqTHdoJ=dqg@mail.gmail.com>
References: <CAG+8EEZf9EPz2sXXth=GO9AjUe61XeAtDcNs=G=dj1FKqVqxKg@mail.gmail.com>
	<50BDC136.2010503@op5.se>
	<CAG+8EEZUYn3BQEMSinBGch+y43yN0zruFAu2-oSAqTHdoJ=dqg@mail.gmail.com>
Message-ID: <50BF307D.3030506@op5.se>

On 12/04/2012 03:52 PM, Leonardo Bacha Abrantes wrote:
> Hey Andreas,
> 
> yes, that is. however if I run an passive check of nrpe, nagios does not
> send alerts. I'm testing it to confirm.
> 

So the notification for '/var partition' only goes out when NRPE is in
PENDING state?

> 
> on doubt:
> 
> /var was checked and it failed. when it happens nagios will automaticaly
> re-check nrpe plugin or only check the current status ?

It re-checks only if active checks are enabled for that check and Nagios
is able to run it at the moment it evaluates the dependency. Otherwise it
goes by the current status.

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.



From ae at op5.se  Wed Dec  5 14:20:18 2012
From: ae at op5.se (Andreas Ericsson)
Date: Wed, 05 Dec 2012 14:20:18 +0100
Subject: Nagios 3.99.95 contact_groups
In-Reply-To: <50BDFAC9.9040508@crazynetwork.it>
References: <50AB8D20.2020806@crazynetwork.it>
	<50BCC82A.8050607@crazynetwork.it> <50BDD4BF.9010109@op5.se>
	<50BDFAC9.9040508@crazynetwork.it>
Message-ID: <50BF4A12.1090309@op5.se>

On 12/04/2012 02:29 PM, Supporto Tecnico - Crazy Network wrote:
>   Here from object.cache
> 
> define servicegroup {
>          servicegroup_name       ftp-server
>          alias   FTP Server
>          }
> 
> define service {
>          host_name       Server.SysAdminDiary.it
>          service_description     Server FTP
>          check_period    24x7
>          check_command   check_ftp
>          contact_groups  sysadmindiary.it
>          notification_period     24x7
>          initial_state   o
>          hourly_value    1
>          check_interval  5.000000
>          retry_interval  2.000000
>          max_check_attempts      3
>          is_volatile     0
>          parallelize_check       1
>          active_checks_enabled   1
>          passive_checks_enabled  1
>          obsess  1
>          event_handler_enabled   1
>          low_flap_threshold      0.000000
>          high_flap_threshold     0.000000
>          flap_detection_enabled  1
>          flap_detection_options  a
>          freshness_threshold     0
>          check_freshness 0
>          notification_options    r,c
>          notifications_enabled   1
>          notification_interval   60.000000
>          first_notification_delay        0.000000
>          stalking_options        n
>          process_perf_data       1
>          icon_image      ftp.png
>          retain_status_information       1
>          retain_nonstatus_information    1
>          }
> 
> 
> define contact {
>          contact_name    andrea.iannucci
>          alias   Andrea Iannucci
>          service_notification_period     24x7
>          host_notification_period        24x7
>          service_notification_options    r,c
> 
>          host_notification_options       r,d,u
>          service_notification_commands notify-service-by-email,notify-service-by-sms
>          host_notification_commands notify-host-by-email,notify-host-by-sms
>          email   support at crazynetwork.it
>          pager   00393388552885
>          minimum_value   1
>          host_notifications_enabled      1
>          service_notifications_enabled   1
>          can_submit_commands     1
>          retain_status_information       1
>          retain_nonstatus_information    1
>          }
> 
> 
> 
> Apparently in contact definition the contact_groups disappear and it come with an empty line.
> 

In objects.cache the contacts are always written to the contactgroup's
members variable. If it's available there, Nagios knows about it. If
not, then there's something seriously wrong.
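
As a rough way to check this, the following Python sketch lists the members of every contactgroup found in the text of an objects.cache file. The first sample is the contactgroup posted in this thread; the second, with a members line, is a hypothetical healthy entry added for comparison. The parsing is deliberately simplistic and is not how Nagios reads the file.

```python
import re

CONTACTGROUP_RE = re.compile(r"define\s+contactgroup\s*\{(.*?)\}", re.S)


def contactgroup_members(cache_text):
    """Map each contactgroup_name to its list of members (possibly empty)."""
    groups = {}
    for body in CONTACTGROUP_RE.findall(cache_text):
        attrs = {}
        for line in body.splitlines():
            parts = line.strip().split(None, 1)
            if len(parts) == 2:
                attrs[parts[0]] = parts[1]
        name = attrs.get("contactgroup_name")
        if name:
            members = attrs.get("members", "")
            groups[name] = [m.strip() for m in members.split(",") if m.strip()]
    return groups


# Contactgroup as it appears in the objects.cache posted in this thread.
BROKEN = """
define contactgroup {
        contactgroup_name       sysadmindiary.it
        alias   SysAdminDiary.it Staff
        }
"""

# Hypothetical healthy entry, for comparison only.
HEALTHY = """
define contactgroup {
        contactgroup_name       sysadmindiary.it
        alias   SysAdminDiary.it Staff
        members andrea.iannucci
        }
"""

for label, text in (("broken", BROKEN), ("healthy", HEALTHY)):
    for group, members in contactgroup_members(text).items():
        print(label, group, members if members else "NO MEMBERS")
```

A group that comes back with an empty list is one Nagios has no contacts for, which matches the "No contacts were found" debug message.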

> Im preparing the box to send privatly login data so you can check by yourself.
> 

Much appreciated. Many thanks.

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.



From support at crazynetwork.it  Wed Dec  5 14:28:22 2012
From: support at crazynetwork.it (Supporto Tecnico - Crazy Network)
Date: Wed, 05 Dec 2012 14:28:22 +0100
Subject: Nagios 3.99.95 contact_groups
In-Reply-To: <50BF4A12.1090309@op5.se>
References: <50AB8D20.2020806@crazynetwork.it>
	<50BCC82A.8050607@crazynetwork.it> <50BDD4BF.9010109@op5.se>
	<50BDFAC9.9040508@crazynetwork.it> <50BF4A12.1090309@op5.se>
Message-ID: <50BF4BF6.2070702@crazynetwork.it>

  define contactgroup {
         contactgroup_name       sysadmindiary.it
         alias   SysAdminDiary.it Staff
         }

This is the only contactgroup I can see in objects.cache.

I sent you a private mail with login data for the box.

Regards, and thanks



-- 
Andrea Iannucci
----------------------------

----------------------------
Crazy Network di Iannucci Andrea
Viale G.B. Lulli, 24
00050 Cerveteri - RM
(w) www.crazynetwork.it
(e) andrea.iannucci at crazynetwork.it
(t) +39 06 62279876
(f) +39 06 62298767
(m) +39 338 8552885

-------------------------------------------------------------------------------
Please consider our environmental responsibility before printing this 
E-Mail. Thank you.
-------------------------------------------------------------------------------
This e-mail message contains confidential information intended 
exclusively for the recipient indicated above.
Use, dissemination, distribution or reproduction by any other person 
is prohibited. If you have received this e-mail message in error, 
please notify the sender immediately and destroy what you received 
(including attached files) without making a copy.
Any unauthorised use of the content of this message constitutes a 
violation of the obligation not to read correspondence between other 
parties, without prejudice to any more serious offence, and exposes the 
person responsible to the relevant consequences.
--------------------------------------------------------------------------------
This e-mail is confidential and may also contain privileged information. 
If you are not the intended recipient you are not authorised to read, 
print, save, process or disclose this message. If you have received this 
message by mistake, please inform the sender immediately and delete this 
e-mail, its attachments and any copies.
Any use, distribution, reproduction or disclosure by any person other 
than the intended recipient is strictly prohibited and the person 
responsible may incur penalties.
-------------------------------------------------------------------------------- 




From ae at op5.se  Wed Dec  5 15:51:15 2012
From: ae at op5.se (Andreas Ericsson)
Date: Wed, 05 Dec 2012 15:51:15 +0100
Subject: Nagios 3.99.95 contact_groups
In-Reply-To: <50BF4BF6.2070702@crazynetwork.it>
References: <50AB8D20.2020806@crazynetwork.it>
	<50BCC82A.8050607@crazynetwork.it> <50BDD4BF.9010109@op5.se>
	<50BDFAC9.9040508@crazynetwork.it> <50BF4A12.1090309@op5.se>
	<50BF4BF6.2070702@crazynetwork.it>
Message-ID: <50BF5F63.90500@op5.se>

On 12/05/2012 02:28 PM, Supporto Tecnico - Crazy Network wrote:
>   define contactgroup {
>          contactgroup_name       sysadmindiary.it
>          alias   SysAdminDiary.it Staff
>          }
> 
> The only contactgroup i can see in object.cache
> 
> I did send you a private mail with login data for the box.
> 

You did, and I'm perplexed.

The problem arises when precaching objects (which is discouraged in
Nagios 4, since it actually makes config loading slower rather than
faster for almost all cases).

The objects.precache file has you as a member of the contactgroup,
but the objects.cache file does not.

If I don't precache objects, it doesn't matter how I specify that
you should be a member of the group. You still get notifications.

If I do use the precache file, you don't get notifications no matter
how I specify it.

Now that I noticed this discrepancy, it appears there are other
issues with group membership parsing as well, and they only appear
when reading the precached object file. I'm working on a patch as
we speak and it appears I've got things fixed. I'll just run some
more tests and then send it out muy pronto.

Thanks for letting me sneak a peek at your server. This one would
most likely have baffled me for quite some time otherwise.

This should sort out many of the "Nagios doesn't send notifications"
issues that have cropped up here and there and that I've been unable
to reproduce.
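
For anyone who wants to check their own installation for the same discrepancy, here is a small Python sketch that reports contactgroup memberships present in one object file but missing from another. The two sample strings are hypothetical stand-ins for the contents of objects.precache and objects.cache (the precache contents were not posted to the list), and the parsing is a simplification rather than the logic Nagios uses.

```python
import re

GROUP_RE = re.compile(r"define\s+contactgroup\s*\{(.*?)\}", re.S)


def memberships(text):
    """Return the set of (contactgroup_name, member) pairs found in `text`."""
    pairs = set()
    for body in GROUP_RE.findall(text):
        attrs = {}
        for line in body.splitlines():
            parts = line.strip().split(None, 1)
            if len(parts) == 2:
                attrs[parts[0]] = parts[1]
        group = attrs.get("contactgroup_name", "").strip()
        for member in attrs.get("members", "").split(","):
            if group and member.strip():
                pairs.add((group, member.strip()))
    return pairs


def lost_memberships(precache_text, cache_text):
    """Memberships that exist in the precache text but not in the cache text."""
    return sorted(memberships(precache_text) - memberships(cache_text))


# Hypothetical stand-in for the contents of objects.precache.
PRECACHE = """
define contactgroup {
        contactgroup_name       sysadmindiary.it
        alias   SysAdminDiary.it Staff
        members andrea.iannucci
        }
"""

# Stand-in for objects.cache, matching what was posted in this thread.
CACHE = """
define contactgroup {
        contactgroup_name       sysadmindiary.it
        alias   SysAdminDiary.it Staff
        }
"""

for group, member in lost_memberships(PRECACHE, CACHE):
    print(f"{member} is in {group} in the precache but not in the cache")
```

In practice the two strings would be read from the actual files; their locations depend on the local nagios.cfg.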

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.



From support at crazynetwork.it  Wed Dec  5 16:12:23 2012
From: support at crazynetwork.it (Supporto Tecnico - Crazy Network)
Date: Wed, 05 Dec 2012 16:12:23 +0100
Subject: Nagios 3.99.95 contact_groups
In-Reply-To: <50BF5F63.90500@op5.se>
References: <50AB8D20.2020806@crazynetwork.it>
	<50BCC82A.8050607@crazynetwork.it> <50BDD4BF.9010109@op5.se>
	<50BDFAC9.9040508@crazynetwork.it> <50BF4A12.1090309@op5.se>
	<50BF4BF6.2070702@crazynetwork.it> <50BF5F63.90500@op5.se>
Message-ID: <50BF6457.4080502@crazynetwork.it>

  Nice to know it's not a problem with my config.

Let me know when I can retry the install and when I can reset the 
testbox provided.

For now, thanks :)

Regards



-- 
Andrea Iannucci


From ae at op5.se  Wed Dec  5 16:47:00 2012
From: ae at op5.se (Andreas Ericsson)
Date: Wed, 05 Dec 2012 16:47:00 +0100
Subject: Nagios 3.99.95 contact_groups
In-Reply-To: <50BF6457.4080502@crazynetwork.it>
References: <50AB8D20.2020806@crazynetwork.it>
	<50BCC82A.8050607@crazynetwork.it> <50BDD4BF.9010109@op5.se>
	<50BDFAC9.9040508@crazynetwork.it> <50BF4A12.1090309@op5.se>
	<50BF4BF6.2070702@crazynetwork.it> <50BF5F63.90500@op5.se>
	<50BF6457.4080502@crazynetwork.it>
Message-ID: <50BF6C74.3060809@op5.se>

On 12/05/2012 04:12 PM, Supporto Tecnico - Crazy Network wrote:
>   Nice to know is not my config way problem..
> 
> Let me know once i can re-try to install and when i can reset the testbox provided.
> 

You can test it now.

> For now, thanks :)
> 

You're welcome, and thank you too :)

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.



From support at crazynetwork.it  Wed Dec  5 16:54:48 2012
From: support at crazynetwork.it (Supporto Tecnico - Crazy Network)
Date: Wed, 05 Dec 2012 16:54:48 +0100
Subject: Nagios 3.99.95 contact_groups
In-Reply-To: <50BF6C74.3060809@op5.se>
References: <50AB8D20.2020806@crazynetwork.it>
	<50BCC82A.8050607@crazynetwork.it> <50BDD4BF.9010109@op5.se>
	<50BDFAC9.9040508@crazynetwork.it> <50BF4A12.1090309@op5.se>
	<50BF4BF6.2070702@crazynetwork.it> <50BF5F63.90500@op5.se>
	<50BF6457.4080502@crazynetwork.it> <50BF6C74.3060809@op5.se>
Message-ID: <50BF6E48.30508@crazynetwork.it>

  OK, it seems to be working!

Recompiled, restarted nagios and...

[1354722485.871976] [032.1] [pid=22807] Service notification will NOT be escalated.
[1354722485.871983] [032.1] [pid=22807] Adding normal contacts for service to notification list.
[1354722485.871990] [032.0] [pid=22807] No contacts were found for notification purposes.  No notification was sent out.
[1354722682.351821] [032.0] [pid=25265] ** Service Notification Attempt ** Host: 'Server.SysAdminDiary.it', Service: 'Server FTP', Type: 0, Options: 0, Current State: 2, Last Notification: Thu Jan  1 01:00:00 1970
[1354722682.351891] [032.0] [pid=25265] Notification viability test passed.
[1354722682.351903] [032.1] [pid=25265] Current notification number: 1 (incremented)
[1354722682.351914] [032.1] [pid=25265] Service notification will NOT be escalated.
[1354722682.351925] [032.1] [pid=25265] Adding normal contacts for service to notification list.
[1354722682.352187] [032.0] [pid=25265] 1 contacts were notified.  Next possible notification time: Wed Dec  5 17:51:22 2012
[1354722682.352203] [032.0] [pid=25265] 1 contacts were notified.

And emails are arriving now!

Thanks a lot Andreas
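
The before/after difference in the debug output above can be summarised per Nagios process with a short Python sketch like the one below. It assumes each debug entry sits on a single line in the form "[timestamp] [level.verbosity] [pid=N] message", as in this excerpt; the function name and the outcome labels are made up for illustration.

```python
import re

LINE_RE = re.compile(r"^\[(\d+\.\d+)\] \[(\d+)\.(\d+)\] \[pid=(\d+)\] (.*)$")
NOTIFIED_RE = re.compile(r"(\d+) contacts were notified")


def notification_outcomes(log_text):
    """Return {pid: outcome} for each process that logged a notification result."""
    outcomes = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        pid, message = int(match.group(4)), match.group(5)
        if message.startswith("No contacts were found"):
            outcomes[pid] = "no contacts"
        else:
            notified = NOTIFIED_RE.match(message)
            if notified:
                outcomes[pid] = f"{notified.group(1)} notified"
    return outcomes


# Three entries taken from the debug log posted in this message.
SAMPLE = "\n".join([
    "[1354722485.871990] [032.0] [pid=22807] No contacts were found for "
    "notification purposes.  No notification was sent out.",
    "[1354722682.352187] [032.0] [pid=25265] 1 contacts were notified.  "
    "Next possible notification time: Wed Dec  5 17:51:22 2012",
    "[1354722682.352203] [032.0] [pid=25265] 1 contacts were notified.",
])

for pid, outcome in notification_outcomes(SAMPLE).items():
    print(pid, outcome)
```

Here pid 22807 is the process from before the rebuild and pid 25265 the one started after it.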



-- 
Andrea Iannucci
::: Messages without supporting info will risk being sent to /dev/null


From ae at op5.se  Wed Dec  5 17:22:21 2012
From: ae at op5.se (Andreas Ericsson)
Date: Wed, 05 Dec 2012 17:22:21 +0100
Subject: Nagios 3.99.95 contact_groups
In-Reply-To: <50BF6E48.30508@crazynetwork.it>
References: <50AB8D20.2020806@crazynetwork.it>
	<50BCC82A.8050607@crazynetwork.it> <50BDD4BF.9010109@op5.se>
	<50BDFAC9.9040508@crazynetwork.it> <50BF4A12.1090309@op5.se>
	<50BF4BF6.2070702@crazynetwork.it> <50BF5F63.90500@op5.se>
	<50BF6457.4080502@crazynetwork.it> <50BF6C74.3060809@op5.se>
	<50BF6E48.30508@crazynetwork.it>
Message-ID: <50BF74BD.5040200@op5.se>

On 12/05/2012 04:54 PM, Supporto Tecnico - Crazy Network wrote:
>   Ok, seems is working!
> 
> Recompiled, restarted nagios and...
> 
> [1354722485.871976] [032.1] [pid=22807] Service notification will NOT be escalated.
> [1354722485.871983] [032.1] [pid=22807] Adding normal contacts for service to notification list.
> [1354722485.871990] [032.0] [pid=22807] No contacts were found for notification purposes.  No notification was sent out.
> [1354722682.351821] [032.0] [pid=25265] ** Service Notification Attempt ** Host: 'Server.SysAdminDiary.it', Service: 'Server FTP', Type: 0, Options: 0, Current State: 2, Last Notification: Thu Jan  1 01:00:00 1970
> [1354722682.351891] [032.0] [pid=25265] Notification viability test passed.
> [1354722682.351903] [032.1] [pid=25265] Current notification number: 1 (incremented)
> [1354722682.351914] [032.1] [pid=25265] Service notification will NOT be escalated.
> [1354722682.351925] [032.1] [pid=25265] Adding normal contacts for service to notification list.
> [1354722682.352187] [032.0] [pid=25265] 1 contacts were notified.  Next possible notification time: Wed Dec  5 17:51:22 2012
> [1354722682.352203] [032.0] [pid=25265] 1 contacts were notified.
> 
> And email are arriving now!
> 
> Thanks a lot Andreas
> 

Excellent. Thanks for testing so promptly :)

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.



From admin at dougware.net  Thu Dec  6 01:13:24 2012
From: admin at dougware.net (Doug Eubanks)
Date: Wed, 5 Dec 2012 19:13:24 -0500
Subject: Nagios running checks way too often
In-Reply-To: <CAF-yqggMxNCuJfv+h5_YHNJ2vv7MrCd6g=WYMjK212WBo9DGmA@mail.gmail.com>
References: <CAG5Afr8w07ebZ=9UxAu+DQv_Tt2b9FNNDKDsOAphcXJGEza7yA@mail.gmail.com>
	<CAF-yqggMxNCuJfv+h5_YHNJ2vv7MrCd6g=WYMjK212WBo9DGmA@mail.gmail.com>
Message-ID: <CAG5Afr-drVMEHBRmriPfQ_-JvRKN3ykxxbb7DphB=Eh4Wtz_Dw@mail.gmail.com>

Of course you are correct; here's one of the services.  According to the
site log, Nagios appears to fire off three to four requests to the server
each time the service is checked.  These log entries are all from one
vhost log file, so it's not that Nagios is checking 4 sites; it's checking
the same site 4 times at once.

NagiosServer - - [05/Dec/2012:19:09:40 -0500] "GET / HTTP/1.1" 200 22459 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
NagiosServer - - [05/Dec/2012:19:09:40 -0500] "GET / HTTP/1.1" 200 22459 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
NagiosServer - - [05/Dec/2012:19:09:40 -0500] "GET / HTTP/1.1" 200 22459 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
NagiosServer - - [05/Dec/2012:19:09:40 -0500] "GET / HTTP/1.1" 200 22459 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"

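One way to quantify such bursts is to count check_http requests per log timestamp. This is a minimal sketch, assuming the vhost log keeps the bracketed timestamp format quoted in this message. The helper name hits_per_second is only illustrative, and the sample line is copied from the log excerpt here.

```python
import re
from collections import Counter

# Matches the bracketed timestamp of an Apache-style access log line.
TIMESTAMP = re.compile(r"\[([^\]]+)\]")

def hits_per_second(lines, agent="check_http"):
    """Count requests from the given user agent, grouped by log timestamp."""
    counts = Counter()
    for line in lines:
        if agent not in line:
            continue  # ignore requests that did not come from the plugin
        match = TIMESTAMP.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Four identical entries, as in the excerpt quoted in this message.
sample = [
    'NagiosServer - - [05/Dec/2012:19:09:40 -0500] "GET / HTTP/1.1" 200 22459 '
    '"-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"',
] * 4

for stamp, count in hits_per_second(sample).items():
    if count > 1:
        print(f"{stamp}: {count} requests in the same second")
```

Feeding the whole vhost log to the function (for example the lines of an open file) would show whether every check produces a burst or only some of them.
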
define service {
        host_name       server
        service_description     www.website.com
        initial_state   o
        is_volatile     0
        max_check_attempts      2
        normal_check_interval   2
        retry_interval  1
        first_notification_delay        0
        active_checks_enabled   1
        passive_checks_enabled  1
        check_period    24x7
        parallelize_check       1
        obsess_over_service     1
        check_freshness 1
        freshness_threshold     60
        event_handler_enabled   1
        process_perf_data       1
        retain_status_information       1
        retain_nonstatus_information    1
        notification_interval   4
        notification_period     24x7
        notifications_enabled   1
        action_url      /pnp4nagios/index.php?host=$HOSTNAME$&srv=$SERVICEDESC$
        check_command   check_http - vhost - url - string!www.website.com
!/!Ap$
        icon_image      www.png
        display_name    website.com
        notification_options    w,u,c,r,f,s
        stalking_options        o,w,u,c
        contact_groups  Null Placeholder Group
        servicegroups   Public Facing Services
}

Sincerely,
Doug Eubanks
admin at dougware.net
K1DUG
(919) 201-8750


On Wed, Dec 5, 2012 at 2:26 AM, Claudio Kuenzler <ck at claudiokuenzler.com> wrote:

>
>
> On Tue, Dec 4, 2012 at 4:44 PM, Doug Eubanks <admin at dougware.net> wrote:
>
>> Nagios is checking services way too often.  It's supposed to check once
>> every 2 minutes, then failback to checking once every 1 minute on a failure.
>>
>> I believe this is the relevant parts of the nagios.cfg file:
>>
>
> Actually the relevant part for how often a check should be executed is in
> the service definition of the check. Mostly the service itself uses a
> template with the "use" option. In this case you have to check your
> templates.cfg file.
>
> If you don't find it, please post the relevant service definition and the
> definition of the template being used by the service.

From admin at dougware.net  Thu Dec  6 01:15:47 2012
From: admin at dougware.net (Doug Eubanks)
Date: Wed, 5 Dec 2012 19:15:47 -0500
Subject: Nagios running checks way too often
In-Reply-To: <CAG5Afr-drVMEHBRmriPfQ_-JvRKN3ykxxbb7DphB=Eh4Wtz_Dw@mail.gmail.com>
References: <CAG5Afr8w07ebZ=9UxAu+DQv_Tt2b9FNNDKDsOAphcXJGEza7yA@mail.gmail.com>
	<CAF-yqggMxNCuJfv+h5_YHNJ2vv7MrCd6g=WYMjK212WBo9DGmA@mail.gmail.com>
	<CAG5Afr-drVMEHBRmriPfQ_-JvRKN3ykxxbb7DphB=Eh4Wtz_Dw@mail.gmail.com>
Message-ID: <CAG5Afr-ojHHNi-VPTVXx5N969J=y6j6y5rnSCokDr6rkzxpRHQ@mail.gmail.com>

We use Lilac as a configuration GUI, and as a sanity check I confirmed that
the service is specified only once in the configuration file.

Sincerely,
Doug Eubanks
admin at dougware.net
K1DUG
(919) 201-8750



From lilogohard at gmail.com  Thu Dec  6 06:30:38 2012
From: lilogohard at gmail.com (Luo Li)
Date: Thu, 6 Dec 2012 13:30:38 +0800
Subject: Help on Nagios about using Python to query the
	history status of service
Message-ID: <CAEJ23Ucbn84k9vAWHj41rAn_1J3Pm_i7n+6_dRQRtLvHffmi6Q@mail.gmail.com>

I have a question about how to use Python to query the status history of a
service.
I have looked through the "NDOUtils Database Model" document, but I can't
work out which table or tables hold this information.
Of course the information can be viewed in the web interface, but what I
need is to read the specific values stored in MySQL and redraw them in my
own way.

Apologies for my poor English.

Best Regards!
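
For readers of the archive, this is a minimal sketch of the kind of query being asked about. It assumes the stock NDOUtils names as recalled from the "NDOUtils Database Model" document: tables nagios_statehistory and nagios_objects, with objecttype_id 2 meaning "service" and name1/name2 holding the host name and service description. None of this is confirmed in this thread, so verify it against your own schema. The helper fetch_service_history is hypothetical.

```python
# Assumed NDOUtils names (verify against your own schema): tables
# nagios_objects and nagios_statehistory, objecttype_id 2 meaning "service".
HISTORY_SQL = """
SELECT h.state_time, h.state, h.state_type, h.output
FROM nagios_statehistory AS h
JOIN nagios_objects AS o ON o.object_id = h.object_id
WHERE o.objecttype_id = 2
  AND o.name1 = {p}
  AND o.name2 = {p}
ORDER BY h.state_time
"""

def fetch_service_history(cursor, host, service, placeholder="%s"):
    """Return (state_time, state, state_type, output) rows for one service.

    `cursor` is any DB-API 2.0 cursor; `placeholder` is the driver's
    parameter marker ("%s" for the common MySQL drivers, "?" for sqlite3).
    """
    cursor.execute(HISTORY_SQL.format(p=placeholder), (host, service))
    return cursor.fetchall()
```

With a MySQL driver, the cursor would come from that driver's connection object. The returned rows can then be handed to whatever plotting code is preferred.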

From andrew at fulgent.co.uk  Thu Dec  6 13:48:49 2012
From: andrew at fulgent.co.uk (Andrew Thompson)
Date: Thu, 6 Dec 2012 12:48:49 +0000
Subject: Anybody use check_mssql_health plugin?
Message-ID: <A52C996F892310499AF1F3DDA60A9E6DA1956EAD@FTLMAIL.evesham.fulgent.co.uk>

If so, ever come across this issue before?

Everything works fine apart from 1 server and 1 check.

1 particular Windows 2008R2 server reports its CPU usage as a crazy percentage:

[12-05-2012 14:47:35] SERVICE ALERT: XXXXXX;CPU USAGE;CRITICAL;HARD;3;CRITICAL - CPU busy 194180.74%
[12-05-2012 14:44:35] SERVICE ALERT: XXXXXX;CPU USAGE;CRITICAL;SOFT;2;CRITICAL - CPU busy 116508.44%
[12-05-2012 14:41:35] SERVICE ALERT: XXXXXX;CPU USAGE;CRITICAL;SOFT;1;CRITICAL - CPU busy 233016.89%

When I check the server the CPU isn't even using 10% most of the time.

It does this from the terminal as well, both as the nagios user and as root.

Anybody have any ideas as to what can cause this please?

Many Thanks



From ck at claudiokuenzler.com  Thu Dec  6 17:22:27 2012
From: ck at claudiokuenzler.com (Claudio Kuenzler)
Date: Thu, 6 Dec 2012 17:22:27 +0100
Subject: Nagios running checks way too often
In-Reply-To: <CAG5Afr-drVMEHBRmriPfQ_-JvRKN3ykxxbb7DphB=Eh4Wtz_Dw@mail.gmail.com>
References: <CAG5Afr8w07ebZ=9UxAu+DQv_Tt2b9FNNDKDsOAphcXJGEza7yA@mail.gmail.com>
	<CAF-yqggMxNCuJfv+h5_YHNJ2vv7MrCd6g=WYMjK212WBo9DGmA@mail.gmail.com>
	<CAG5Afr-drVMEHBRmriPfQ_-JvRKN3ykxxbb7DphB=Eh4Wtz_Dw@mail.gmail.com>
Message-ID: <CAF-yqgjxvFJrk5Er_OLG0T3OXcfXWHKYf4E+KM+68dWn7wMhhQ@mail.gmail.com>

On Thu, Dec 6, 2012 at 1:13 AM, Doug Eubanks <admin at dougware.net> wrote:

> define service {
>         host_name       server
>         service_description     www.website.com
>         initial_state   o
>         is_volatile     0
>         max_check_attempts      2
>         normal_check_interval   2
>         retry_interval  1
>         first_notification_delay        0
>         active_checks_enabled   1
>         passive_checks_enabled  1
>         check_period    24x7
>         parallelize_check       1
>         obsess_over_service     1
>         check_freshness 1
>         freshness_threshold     60
>         event_handler_enabled   1
>         process_perf_data       1
>         retain_status_information       1
>         retain_nonstatus_information    1
>         notification_interval   4
>         notification_period     24x7
>         notifications_enabled   1
>         action_url      /pnp4nagios/index.php?host=$HOSTNAME$&srv=$SERVICEDESC$
>         check_command   check_http - vhost - url - string!www.website.com
> !/!Ap$
>         icon_image      www.png
>         display_name    website.com
>         notification_options    w,u,c,r,f,s
>         stalking_options        o,w,u,c
>         contact_groups  Null Placeholder Group
>         servicegroups   Public Facing Services
> }
>
>
You're right, the check should only happen every 2 mins
(normal_check_interval).
But what looks strange to me is the check_command.
Do you actually have a command definition called "check_http - vhost - url -
string"? I'm not sure whether spaces are allowed in the definition.
Can you post the command definition?


What happens if you change the check_command to the following:

check_command    check_website!www.website.com!-u /

where the command definition of check_website looks like this:

define command{
        command_name    check_website
        command_line    $USER1$/check_http -H $ARG1$ $ARG2$
        }
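
To illustrate how the two pieces above fit together, here is a simplified model of the expansion: the check_command is split on "!" into a command name and arguments, which then fill $ARG1$, $ARG2$ and so on in the command_line. It is a sketch, not Nagios' actual parser (escaping rules are ignored). The function name expand_check_command and the $USER1$ path are assumptions for illustration.

```python
def expand_check_command(check_command, command_line, user_macros):
    """Split a check_command on '!' and fill $ARGn$ / $USERn$ macros."""
    name, *args = check_command.split("!")
    expanded = command_line
    for index, value in enumerate(args, start=1):
        expanded = expanded.replace(f"$ARG{index}$", value)
    for macro, value in user_macros.items():
        expanded = expanded.replace(f"${macro}$", value)
    return name, expanded

name, line = expand_check_command(
    "check_website!www.website.com!-u /",
    "$USER1$/check_http -H $ARG1$ $ARG2$",
    {"USER1": "/usr/local/nagios/libexec"},  # assumed plugin directory
)
print(name)  # command_name that must match a "define command" block
print(line)  # command line that would be executed
```

In this model everything before the first "!" is the command name, spaces included. Whether Nagios itself accepts such a name is exactly the open question above.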

From RWerner at pomwonderful.com  Thu Dec  6 23:04:44 2012
From: RWerner at pomwonderful.com (Werner, Robert)
Date: Thu, 6 Dec 2012 22:04:44 +0000
Subject: Anybody use check_mssql_health plugin?
In-Reply-To: <A52C996F892310499AF1F3DDA60A9E6DA1956EAD@FTLMAIL.evesham.fulgent.co.uk>
References: <A52C996F892310499AF1F3DDA60A9E6DA1956EAD@FTLMAIL.evesham.fulgent.co.uk>
Message-ID: <3D480E2907FD164191FE65820A669FFF4FD836F5@POM-LA-MBX02.pomwonderful.com>

I didn't know you could check the CPU status with that plugin.

What is the command definition that you are using?

--
Robert G. Werner
Oracle Apps Systems Administrator
rwerner at pomwonderful.com
559.521.5089

From adaugherity at tamu.edu  Thu Dec  6 23:31:28 2012
From: adaugherity at tamu.edu (Andrew Daugherity)
Date: Thu, 6 Dec 2012 22:31:28 +0000
Subject: check_openmanage: timeout vs. SNMP timeout
Message-ID: <57ED10AF40B31A42B5D540BEEF944813218E8185@mb03.ads.tamu.edu>

I'm troubleshooting an issue where one server is occasionally not responding (I think it's a firewall or snmpd issue, not this plugin), and I noticed that changing the timeout option to check_openmanage did not affect how long it took before receiving the
  SNMP CRITICAL: No response from remote host A.B.C.D

message.  Looking at the code I see the timeout option is _not_ passed to the Net::SNMP session object, so the SNMP connection timeout uses the default value (5 seconds according to the Net::SNMP man page, but 10 seconds in my testing).

If I pass the timeout option to the Net::SNMP->session object like so:
====
diff --git a/check_openmanage b/check_openmanage
index b6abec5..3558ed4 100755
--- a/check_openmanage
+++ b/check_openmanage
@@ -860,6 +860,7 @@ sub snmp_initialize {
         '-port'     => $opt{port},
         '-hostname' => $opt{hostname},
         '-version'  => $opt{protocol},
+        '-timeout'  => $opt{timeout},
        );
 
     # Setting the domain (IP version and transport protocol)
====
Then it does obey the timeout option and I instead get the
  PLUGIN TIMEOUT: check_openmanage timed out after 30 seconds

message.  This might be by design though, to have a shorter SNMP timeout and different error messages, but it was perplexing to me why the timeout option was seemingly not working.  Perhaps a different option for the SNMP timeout, or a documentation clarification, is a better way?
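
The 5-versus-10-second observation above may simply be the timeout multiplied by the number of attempts. This is a minimal sketch, assuming Net::SNMP's documented defaults of a 5-second timeout and 1 retry, that each attempt waits the full timeout, and that check_openmanage leaves the retry count at its default. None of these assumptions is confirmed in this thread. first_error is a hypothetical helper, not part of the plugin.

```python
def first_error(snmp_timeout, snmp_retries, plugin_timeout):
    """Return which failure a silent host would trigger first, and when.

    Assumes every SNMP attempt waits the full per-request timeout and that
    the plugin-wide timeout is a single alarm covering the whole run.
    """
    snmp_total = snmp_timeout * (snmp_retries + 1)
    if snmp_total < plugin_timeout:
        return "SNMP CRITICAL: No response from remote host", snmp_total
    return "PLUGIN TIMEOUT", plugin_timeout

# Unpatched: assumed Net::SNMP defaults (5 s, 1 retry), 30 s plugin timeout.
print(first_error(5, 1, 30))
# Patched: the 30 s plugin timeout is also used as the per-request timeout.
print(first_error(30, 1, 30))
```

Under these assumptions the unpatched plugin gives up on SNMP after 10 seconds, while the patched one always reaches the plugin-wide timeout first. That matches the two messages reported above.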


Thanks,

Andrew Daugherity
Systems Analyst
Division of Research, Texas A&M University



From admin at dougware.net  Fri Dec  7 06:34:28 2012
From: admin at dougware.net (Doug Eubanks)
Date: Fri, 7 Dec 2012 00:34:28 -0500
Subject: Nagios running checks way too often
In-Reply-To: <CAF-yqgjxvFJrk5Er_OLG0T3OXcfXWHKYf4E+KM+68dWn7wMhhQ@mail.gmail.com>
References: <CAG5Afr8w07ebZ=9UxAu+DQv_Tt2b9FNNDKDsOAphcXJGEza7yA@mail.gmail.com>
	<CAF-yqggMxNCuJfv+h5_YHNJ2vv7MrCd6g=WYMjK212WBo9DGmA@mail.gmail.com>
	<CAG5Afr-drVMEHBRmriPfQ_-JvRKN3ykxxbb7DphB=Eh4Wtz_Dw@mail.gmail.com>
	<CAF-yqgjxvFJrk5Er_OLG0T3OXcfXWHKYf4E+KM+68dWn7wMhhQ@mail.gmail.com>
Message-ID: <CAG5Afr9GZ7LMO0kFuLK+wp1QO3s110PDLPNm38MZbEdDc=+Aaw@mail.gmail.com>

I removed the spaces from the command.  I noticed that there were two
Nagios processes running, so I killed them both and restarted Nagios.

Within a few minutes, it was checking the site more often than it should:
Nagios - - [07/Dec/2012:00:29:05 -0500] "GET / HTTP/1.1" 200 22458 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
Nagios - - [07/Dec/2012:00:29:28 -0500] "GET / HTTP/1.1" 200 22459 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
Nagios - - [07/Dec/2012:00:30:13 -0500] "GET / HTTP/1.1" 200 22459 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
Nagios - - [07/Dec/2012:00:31:19 -0500] "GET / HTTP/1.1" 200 22459 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
Nagios - - [07/Dec/2012:00:31:22 -0500] "GET / HTTP/1.1" 200 22459 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
Nagios - - [07/Dec/2012:00:31:44 -0500] "GET / HTTP/1.1" 200 22459 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
Nagios - - [07/Dec/2012:00:32:29 -0500] "GET / HTTP/1.1" 200 22458 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
Nagios - - [07/Dec/2012:00:33:36 -0500] "GET / HTTP/1.1" 200 22459 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
Nagios - - [07/Dec/2012:00:33:38 -0500] "GET / HTTP/1.1" 200 22459 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"

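The spacing of these requests can be compared with the expected 120-second interval by computing the gap between consecutive timestamps. This is a minimal sketch using the timestamps copied from the log above. The helper name gaps_in_seconds is only illustrative, and the parsing assumes English month abbreviations.

```python
from datetime import datetime

def gaps_in_seconds(stamps, fmt="%d/%b/%Y:%H:%M:%S %z"):
    """Return the seconds elapsed between consecutive log timestamps."""
    times = [datetime.strptime(stamp, fmt) for stamp in stamps]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]

# Timestamps copied from the access log entries in this message.
stamps = [
    "07/Dec/2012:00:29:05 -0500", "07/Dec/2012:00:29:28 -0500",
    "07/Dec/2012:00:30:13 -0500", "07/Dec/2012:00:31:19 -0500",
    "07/Dec/2012:00:31:22 -0500", "07/Dec/2012:00:31:44 -0500",
    "07/Dec/2012:00:32:29 -0500", "07/Dec/2012:00:33:36 -0500",
    "07/Dec/2012:00:33:38 -0500",
]
print(gaps_in_seconds(stamps))
```

For this excerpt none of the gaps reaches 120 seconds, which is consistent with the complaint that the site is checked more often than normal_check_interval allows.
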
Doug Eubanks
admin at dougware.net
K1DUG
(919) 201-8750



From ck at claudiokuenzler.com  Fri Dec  7 08:56:30 2012
From: ck at claudiokuenzler.com (Claudio Kuenzler)
Date: Fri, 7 Dec 2012 08:56:30 +0100
Subject: Nagios running checks way too often
In-Reply-To: <CAG5Afr9GZ7LMO0kFuLK+wp1QO3s110PDLPNm38MZbEdDc=+Aaw@mail.gmail.com>
References: <CAG5Afr8w07ebZ=9UxAu+DQv_Tt2b9FNNDKDsOAphcXJGEza7yA@mail.gmail.com>
	<CAF-yqggMxNCuJfv+h5_YHNJ2vv7MrCd6g=WYMjK212WBo9DGmA@mail.gmail.com>
	<CAG5Afr-drVMEHBRmriPfQ_-JvRKN3ykxxbb7DphB=Eh4Wtz_Dw@mail.gmail.com>
	<CAF-yqgjxvFJrk5Er_OLG0T3OXcfXWHKYf4E+KM+68dWn7wMhhQ@mail.gmail.com>
	<CAG5Afr9GZ7LMO0kFuLK+wp1QO3s110PDLPNm38MZbEdDc=+Aaw@mail.gmail.com>
Message-ID: <CAF-yqgiKQT4o7TTFEXA30f--nX2ZbUM9vDKzEVyfupOOd63Uqw@mail.gmail.com>

On Fri, Dec 7, 2012 at 6:34 AM, Doug Eubanks <admin at dougware.net> wrote:

> I removed the spaces from the command.  I noticed that there were two
> Nagios processes running, so I killed them both and restarted Nagios.
>
> Within a few minutes, it was checking the site more often that it should:
> Nagios - - [07/Dec/2012:00:29:05 -0500] "GET / HTTP/1.1" 200 22458 "-"
> "check_http/v1.4.16 (nagios-plugins 1.4.16)"
> Nagios - - [07/Dec/2012:00:29:28 -0500] "GET / HTTP/1.1" 200 22459 "-"
> "check_http/v1.4.16 (nagios-plugins 1.4.16)"
> Nagios - - [07/Dec/2012:00:30:13 -0500] "GET / HTTP/1.1" 200 22459 "-"
> "check_http/v1.4.16 (nagios-plugins 1.4.16)"
> Nagios - - [07/Dec/2012:00:31:19 -0500] "GET / HTTP/1.1" 200 22459 "-"
> "check_http/v1.4.16 (nagios-plugins 1.4.16)"
> Nagios - - [07/Dec/2012:00:31:22 -0500] "GET / HTTP/1.1" 200 22459 "-"
> "check_http/v1.4.16 (nagios-plugins 1.4.16)"
> Nagios - - [07/Dec/2012:00:31:44 -0500] "GET / HTTP/1.1" 200 22459 "-"
> "check_http/v1.4.16 (nagios-plugins 1.4.16)"
> Nagios - - [07/Dec/2012:00:32:29 -0500] "GET / HTTP/1.1" 200 22458 "-"
> "check_http/v1.4.16 (nagios-plugins 1.4.16)"
> Nagios - - [07/Dec/2012:00:33:36 -0500] "GET / HTTP/1.1" 200 22459 "-"
> "check_http/v1.4.16 (nagios-plugins 1.4.16)"
> Nagios - - [07/Dec/2012:00:33:38 -0500] "GET / HTTP/1.1" 200 22459 "-"
> "check_http/v1.4.16 (nagios-plugins 1.4.16)"
>

Can you still post the command definition?

Did you try an alternative command definition, e.g. check_website (see my
last mail)?

From t.h.amundsen at usit.uio.no  Fri Dec  7 10:00:25 2012
From: t.h.amundsen at usit.uio.no (Trond Hasle Amundsen)
Date: Fri, 07 Dec 2012 10:00:25 +0100
Subject: check_openmanage: timeout vs. SNMP timeout
In-Reply-To: <57ED10AF40B31A42B5D540BEEF944813218E8185@mb03.ads.tamu.edu>
	(Andrew Daugherity's message of "Thu, 6 Dec 2012 22:31:28 +0000")
References: <57ED10AF40B31A42B5D540BEEF944813218E8185@mb03.ads.tamu.edu>
Message-ID: <15tzk1qkziu.fsf@tux.uio.no>

Andrew Daugherity <adaugherity at tamu.edu> writes:

> I'm troubleshooting an issue where one server is occasionally not responding (I think it's a firewall or snmpd issue, not this plugin), and I noticed that changing the timeout option to check_openmanage did not affect how long it took before receiving the
>   SNMP CRITICAL: No response from remote host A.B.C.D
>
> message.  Looking at the code I see the timeout option is _not_ passed to the Net::SNMP session object, so the SNMP connection timeout uses the default value (5 seconds according to the Net::SNMP man page, but 10 seconds in my testing).
>
> If I pass the timeout option to the Net::SNMP->session object like so:
> ====
> diff --git a/check_openmanage b/check_openmanage
> index b6abec5..3558ed4 100755
> --- a/check_openmanage
> +++ b/check_openmanage
> @@ -860,6 +860,7 @@ sub snmp_initialize {
>          '-port'     => $opt{port},
>          '-hostname' => $opt{hostname},
>          '-version'  => $opt{protocol},
> +        '-timeout'  => $opt{timeout},
>         );
>  
>      # Setting the domain (IP version and transport protocol)
> ====
> Then it does obey the timeout option and I instead get the
>   PLUGIN TIMEOUT: check_openmanage timed out after 30 seconds
>
> message.  This might be by design though, to have a shorter SNMP timeout and different error messages, but it was perplexing to me why the timeout option was seemingly not working.  Perhaps a different option for the SNMP timeout, or a documentation clarification, is a better way?

Hello Andrew,

Your analysis of this problem is correct: you're hitting the Net::SNMP
timeout, which defaults to 5 seconds. There are two reasons why the
--timeout parameter isn't passed to the SNMP object:

  1. I never saw any reason to :) This is the first time I've heard of
     problems relating to it.

  2. The SNMP object timeout has limitations: it can only be between 1
     and 60 seconds. I don't know how Net::SNMP reacts if the specified
     value is outside of this range.

The documentation is lacking on this, as you pointed out, and I'll fix
that. A new option to specify the SNMP object timeout would be easy to
add, and is in my opinion a cleaner approach than just passing the
plugin timeout.
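
A rough sketch of what a separate SNMP timeout option would need to handle, written in Python for illustration rather than in the plugin's Perl. The helper names are hypothetical. The 1 to 60 second range is the limit mentioned above, and the default of one retry is an assumption based on the Net::SNMP documentation; if it holds, it would also explain why a 5 second timeout showed up as roughly 10 seconds in testing.

```python
# Range the SNMP session timeout is said to accept (from the message above).
SNMP_TIMEOUT_MIN = 1
SNMP_TIMEOUT_MAX = 60


def validate_snmp_timeout(seconds):
    """Return the timeout unchanged, or raise if it is outside the accepted range."""
    if not SNMP_TIMEOUT_MIN <= seconds <= SNMP_TIMEOUT_MAX:
        raise ValueError(
            f"SNMP timeout must be between {SNMP_TIMEOUT_MIN} and "
            f"{SNMP_TIMEOUT_MAX} seconds, got {seconds}"
        )
    return seconds


def worst_case_wait(timeout, retries=1):
    """Upper bound on the wait: the first attempt plus every retry times out.

    retries=1 is an assumed default, not something stated in this thread.
    """
    return timeout * (retries + 1)


print(worst_case_wait(validate_snmp_timeout(5)))  # 10
```

Keeping the SNMP timeout separate from the plugin timeout lets the plugin reject out-of-range values up front instead of handing them to the SNMP session.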

PS. I'm going away for the weekend and I'm leaving in a few minutes, so
I'll get back to you on this early next week.

Regards,
-- 
Trond H. Amundsen <t.h.amundsen at usit.uio.no>
Center for Information Technology Services, University of Oslo



From ftlnagios at gmail.com  Fri Dec  7 11:56:16 2012
From: ftlnagios at gmail.com (FTL Nagios)
Date: Fri, 7 Dec 2012 10:56:16 -0000
Subject: Nagios is ignoring the retry_interval setting
In-Reply-To: <2d7c52241f12c9d3419ef557df8509a4.squirrel@picard.linux.it>
References: <A52C996F892310499AF1F3DDA60A9E6DA1944A8B@FTLMAIL.evesham.fulgent.co.uk>
	<2d7c52241f12c9d3419ef557df8509a4.squirrel@picard.linux.it>
Message-ID: <000001cdd469$7ba248a0$72e6d9e0$@gmail.com>

Hi,

Apologies for the delay, been very busy with other things.

Right, I put Nagios into debug mode this morning and reran the tests.

I let it get a couple of successful pings to the server then pulled the
network cable from it.

Behaviour is completely different this morning!!!!

The host check is behaving now and rechecking every 3 minutes, as it's told
to in the host template. I got my text and email alert to say the host was
down when I expected it!

But now it's the service check that is running every 1 minute, which it's
not told to do when in a problem state.

My service template clearly sets a retry_interval of 3 minutes for the
problem state:

define service{
    name                         service-server   ; The name of this host template (used above in the checks)
    check_period                 server_24x7      ; Servers are monitored at all times
    check_interval               1                ; Servers are checked every 1 minute when in OK state
    retry_interval               3                ; Servers are checked every 3 minutes if in problem state
    max_check_attempts           3                ; Servers are checked 3 times to determine if it's an Up or Down state
    notification_period          server_24x7      ; Emails and texts are sent out any time of day
    notification_interval        3                ; Resend notifications every 3 minutes
    notification_options         c,r              ; Only send alerts for servers in CRITICAL or RECOVERY state
    notifications_enabled        0                ; Notifications are disabled
    contact_groups               servers email, servers sms    ; Alerts are sent to contacts in these groups
    event_handler_enabled        1                ; Host event handler is enabled
    process_perf_data            1                ; Performance data is processed
    retain_status_information    1                ; Status info is kept between server restarts
    retain_nonstatus_information 1                ; Non-status information is kept between server restarts
    passive_checks_enabled       0                ; Passive checks are disabled
    obsess_over_service          0                ; We do not obsess over the server if in problem state
    check_freshness              0                ; We do not check this server for freshness
    flap_detection_enabled       0                ; Flap detection is disabled
    failure_prediction_enabled   0                ; We will wait for it to actually fail, thank you!!
    }

And even though it's checking every minute, it went straight to a hard state
on the first check that detected it down, and it has stayed on check 1/3,
hard state, throughout.
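
For reference, a minimal Python sketch of the scheduling rule as the Nagios 3 documentation describes it, not of what 3.4.1 does internally. In that description, retry_interval applies only while a service is in a SOFT non-OK state, check_interval applies otherwise, and a service whose host is DOWN or UNREACHABLE goes straight to a HARD state. The function names and the simplified state model are assumptions made for illustration.

```python
def service_state_type(host_is_up, attempt, max_check_attempts):
    """State type of a non-OK service result (simplified, illustrative model)."""
    if not host_is_up or attempt >= max_check_attempts:
        return "HARD"
    return "SOFT"


def next_check_delay(is_ok, state_type, check_interval, retry_interval,
                     interval_length=60):
    """Seconds until the next scheduled check under the documented rule."""
    if not is_ok and state_type == "SOFT":
        return retry_interval * interval_length
    return check_interval * interval_length


# Values from the service template above, with the host already down.
state = service_state_type(host_is_up=False, attempt=1, max_check_attempts=3)
print(state, next_check_delay(False, state, check_interval=1, retry_interval=3))
# HARD 60
```

If that reading of the documentation is right, a service on a host that is already down would be expected to sit at 1/3 in a hard state and be rechecked every check_interval, which is 1 minute here.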


I really don't understand what is happening here.

The only difference between this setup and my old Nagios box is the
version: the old box was 3.31, this new server is 3.4.1. I am using the same
config files that worked fine before.

Here are the debug log files from the above testing.

http://dl.dropbox.com/u/895609/nagios.debug1
http://dl.dropbox.com/u/895609/nagios.debug2


If you see anything please let me know, I'm getting angry with all the
alerts!!! :-)

Thank you


-----Original Message-----
From: Giorgio Zarrelli [mailto:zarrelli at linux.it] 
Sent: 29 November 2012 19:24
To: Nagios Users List
Subject: Re: [Nagios-users] Nagios is ignoring the retry_interval setting

Hi,

I do not see anything wrong. Could you set debug_level=-1,
repeat the problem and put the log online?

Giorgio

<quote who="Andrew Thompson">
> Hi Giorgio,
>
> The whole test cfg I am using to try to troubleshoot this can be found at:
>
> http://dl.dropbox.com/u/895609/test.cfg
>
> This is a direct copy of my main servers config but with the rest of 
> the servers and some templates for other server checks taken out
>
>
>
> Kind Regards
> Andrew
>
> From: Andrew Thompson
> Sent: 29 November 2012 16:11
> To: nagios-users at lists.sourceforge.net
> Subject: Nagios is ignoring the retry_interval setting
>
> Hi,
>
> My nagios box has decided to stop listening to the retry_interval 
> entry in my templates.
>
> My server template reads:
>
> define host{
>      name                       host-server
>      check_period              server_24x7
>      check_interval            1
>      retry_interval            3
>      max_check_attempts        3
>      notification_period       server_24x7
>      notification_interval      3
>      notification_options      d,r
>      notifications_enabled      1
>      contact_groups            servers email, servers sms
>      event_handler_enabled      1
>      process_perf_data         1
>      retain_status_information    1
>      retain_nonstatus_information 1
>      passive_checks_enabled          0
>      obsess_over_host          0
>      check_freshness          0
>      flap_detection_enabled          0
>      failure_prediction_enabled   0
>      }
>
> Now this is what happens:
>
>
> * Server goes down at 1pm.
> * I check the next scheduled check and it clearly states 1.03pm.
> * But at 1.01pm it checks again and then spits out an email and
>   text message saying the server is down.
>
> Completely ignoring the retry_interval setting!!!
>
> I'd expect from the above:
>
> * 1pm: server goes down.
> * 1.03pm: check 2 is done.
> * 1.06pm: check 3 is done and determines the hard state.
> * At 1.06pm the notification should be sent out.
>
> Why is this, is something in my config wrong?
>
> Ubuntu 12.04 desktop and Nagios 3.4.1
>
> Thanks
>
>


From ftlnagios at gmail.com  Fri Dec  7 12:15:45 2012
From: ftlnagios at gmail.com (FTL Nagios)
Date: Fri, 7 Dec 2012 11:15:45 -0000
Subject: Nagios is ignoring the retry_interval setting
References: <A52C996F892310499AF1F3DDA60A9E6DA1944A8B@FTLMAIL.evesham.fulgent.co.uk>
	<2d7c52241f12c9d3419ef557df8509a4.squirrel@picard.linux.it>
Message-ID: <000101cdd46c$345d2b10$9d178130$@gmail.com>

Re-tested after changing the max file size of the debug file.

This one should contain everything from the moment I started Nagios to the
moment I stopped it during testing (approx. 10 minutes)

http://dl.dropbox.com/u/895609/nagios.debug

Thank you



From andrew at fulgent.co.uk  Fri Dec  7 12:25:59 2012
From: andrew at fulgent.co.uk (Andrew Thompson)
Date: Fri, 7 Dec 2012 11:25:59 +0000
Subject: Anybody use check_mssql_health plugin?
In-Reply-To: <3D480E2907FD164191FE65820A669FFF4FD836F5@POM-LA-MBX02.pomwonderful.com>
References: <A52C996F892310499AF1F3DDA60A9E6DA1956EAD@FTLMAIL.evesham.fulgent.co.uk>
	<3D480E2907FD164191FE65820A669FFF4FD836F5@POM-LA-MBX02.pomwonderful.com>
Message-ID: <A52C996F892310499AF1F3DDA60A9E6DA19858A8@FTLMAIL.evesham.fulgent.co.uk>

Hello Robert,

This is my command:

# 'check_mssql_health CPU Usage' command definition
define command{
    command_name    check_sql_cpu
    command_line    $USER1$/check_mssql_health -server $HOSTADDRESS$ -username $ARG1$ -password $ARG2$ -mode $ARG3$
    }


This is my service check for said server:


define service{
    use            service-sql-server,srv-pnp
    host_name        ABCDEF
    service_description    CPU USAGE
    check_command        check_sql_cpu!"user"!pass!cpu-busy!
    }
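
To make the argument passing explicit, here is a small Python sketch of how the `!`-separated check_command fills the $ARGn$ macros in the command_line. It is a simplification (no escaping, only the macros used here), the function name is made up, and the plugin directory and host address are placeholder values, not taken from the real setup.

```python
def expand_command(command_line, check_command, host_address, user1):
    """Substitute $USER1$, $HOSTADDRESS$ and $ARGn$ as the definitions above imply."""
    name, *args = check_command.split("!")
    expanded = command_line.replace("$USER1$", user1)
    expanded = expanded.replace("$HOSTADDRESS$", host_address)
    for position, value in enumerate(args, start=1):
        expanded = expanded.replace(f"$ARG{position}$", value)
    return name, expanded


command_line = ("$USER1$/check_mssql_health -server $HOSTADDRESS$ "
                "-username $ARG1$ -password $ARG2$ -mode $ARG3$")
check_command = 'check_sql_cpu!"user"!pass!cpu-busy!'

name, expanded = expand_command(
    command_line,
    check_command,
    host_address="192.0.2.10",          # placeholder address
    user1="/usr/local/nagios/libexec",  # placeholder plugin directory
)
print(name)
print(expanded)
```

The trailing `!` only produces an empty $ARG4$, which the command_line never references, so it has no effect in this sketch.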


Cheers

From: Werner, Robert [mailto:RWerner at pomwonderful.com]
Sent: 06 December 2012 22:05
To: Nagios Users List
Subject: Re: [Nagios-users] Anybody use check_mssql_health plugin?

I didn't know you could check the CPU status with that plugin.

What is the command definition that you are using?

--
Robert G. Werner
Oracle Apps Systems Administrator
rwerner at pomwonderful.com
559.521.5089
________________________________
From: Andrew Thompson [andrew at fulgent.co.uk]
Sent: Thursday, December 06, 2012 4:48 AM
To: nagios-users at lists.sourceforge.net
Subject: [Nagios-users] Anybody use check_mssql_health plugin?
If so, ever come across this issue before?

Everything works fine apart from 1 server and 1 check.

1 particular Windows 2008R2 server reports its CPU usage as a crazy percentage:

[12-05-2012 14:47:35] SERVICE ALERT: XXXXXX;CPU USAGE;CRITICAL;HARD;3;CRITICAL - CPU busy 194180.74%
[12-05-2012 14:44:35] SERVICE ALERT: XXXXXX;CPU USAGE;CRITICAL;SOFT;2;CRITICAL - CPU busy 116508.44%
[12-05-2012 14:41:35] SERVICE ALERT: XXXXXX;CPU USAGE;CRITICAL;SOFT;1;CRITICAL - CPU busy 233016.89%

When I check the server the CPU isn't even using 10% most of the time.

It does this from the terminal as well, as both the nagios user and the root user.

Anybody have any ideas as to what can cause this please?

Many Thanks



From admin at dougware.net  Sat Dec  8 07:02:57 2012
From: admin at dougware.net (Doug Eubanks)
Date: Sat, 8 Dec 2012 01:02:57 -0500
Subject: Nagios running checks way too often
In-Reply-To: <CAF-yqgiKQT4o7TTFEXA30f--nX2ZbUM9vDKzEVyfupOOd63Uqw@mail.gmail.com>
References: <CAG5Afr8w07ebZ=9UxAu+DQv_Tt2b9FNNDKDsOAphcXJGEza7yA@mail.gmail.com>
	<CAF-yqggMxNCuJfv+h5_YHNJ2vv7MrCd6g=WYMjK212WBo9DGmA@mail.gmail.com>
	<CAG5Afr-drVMEHBRmriPfQ_-JvRKN3ykxxbb7DphB=Eh4Wtz_Dw@mail.gmail.com>
	<CAF-yqgjxvFJrk5Er_OLG0T3OXcfXWHKYf4E+KM+68dWn7wMhhQ@mail.gmail.com>
	<CAG5Afr9GZ7LMO0kFuLK+wp1QO3s110PDLPNm38MZbEdDc=+Aaw@mail.gmail.com>
	<CAF-yqgiKQT4o7TTFEXA30f--nX2ZbUM9vDKzEVyfupOOd63Uqw@mail.gmail.com>
Message-ID: <CAG5Afr_TF7q8pyTG-TSGhDyG=ZeG+ZmhXrZt9isbF6LJ53ZoHg@mail.gmail.com>

I fixed the problem!

It was an issue with our Check Result Reaper Frequency
(check_result_reaper_frequency) and Maximum Check Result Reaper Time
(max_check_result_reaper_time) settings; adjusting these fixed the issue.

Thanks for the pointers!

Doug Eubanks
admin at dougware.net
K1DUG
(919) 201-8750


On Fri, Dec 7, 2012 at 2:56 AM, Claudio Kuenzler <ck at claudiokuenzler.com> wrote:

> On Fri, Dec 7, 2012 at 6:34 AM, Doug Eubanks <admin at dougware.net> wrote:
>
>> I removed the spaces from the command.  I noticed that there were two
>> Nagios processes running, so I killed them both and restarted Nagios.
>>
>> Within a few minutes, it was checking the site more often than it should:
>> Nagios - - [07/Dec/2012:00:29:05 -0500] "GET / HTTP/1.1" 200 22458 "-"
>> "check_http/v1.4.16 (nagios-plugins 1.4.16)"
>> Nagios - - [07/Dec/2012:00:29:28 -0500] "GET / HTTP/1.1" 200 22459 "-"
>> "check_http/v1.4.16 (nagios-plugins 1.4.16)"
>> Nagios - - [07/Dec/2012:00:30:13 -0500] "GET / HTTP/1.1" 200 22459 "-"
>> "check_http/v1.4.16 (nagios-plugins 1.4.16)"
>> Nagios - - [07/Dec/2012:00:31:19 -0500] "GET / HTTP/1.1" 200 22459 "-"
>> "check_http/v1.4.16 (nagios-plugins 1.4.16)"
>> Nagios - - [07/Dec/2012:00:31:22 -0500] "GET / HTTP/1.1" 200 22459 "-"
>> "check_http/v1.4.16 (nagios-plugins 1.4.16)"
>> Nagios - - [07/Dec/2012:00:31:44 -0500] "GET / HTTP/1.1" 200 22459 "-"
>> "check_http/v1.4.16 (nagios-plugins 1.4.16)"
>> Nagios - - [07/Dec/2012:00:32:29 -0500] "GET / HTTP/1.1" 200 22458 "-"
>> "check_http/v1.4.16 (nagios-plugins 1.4.16)"
>> Nagios - - [07/Dec/2012:00:33:36 -0500] "GET / HTTP/1.1" 200 22459 "-"
>> "check_http/v1.4.16 (nagios-plugins 1.4.16)"
>> Nagios - - [07/Dec/2012:00:33:38 -0500] "GET / HTTP/1.1" 200 22459 "-"
>> "check_http/v1.4.16 (nagios-plugins 1.4.16)"
>>
>
> Can you still post the command definition?
>
> Did you try an alternative command definition, e.g. check_website? See my
> last mail.

From 1j1qtk at nottheoilrig.com  Sun Dec  9 09:39:24 2012
From: 1j1qtk at nottheoilrig.com (Jack Bates)
Date: Sun, 09 Dec 2012 00:39:24 -0800
Subject: Link directly to "hostdetail"?
Message-ID: <50C44E3C.2070300@nottheoilrig.com>

I'm trying to make an HTML link directly to the "hostdetail" Nagios 
page. Naively, I visit the "hostdetail" page, copy the URL, and add a 
link with that URL to another web page. But when I click the link, 
instead of the "hostdetail" page, I get the "Home" page of our Nagios 
installation.

I can instead copy the URL for the "hostdetail" frame, something like 
"http://wood.lan/cgi-bin/nagios3/status.cgi?hostgroup=all&style=hostdetail", 
but when I visit this URL, the navigation sidebar is missing.

Is there a way to link directly to the "hostdetail" Nagios page?



From t.h.amundsen at usit.uio.no  Mon Dec 10 17:33:38 2012
From: t.h.amundsen at usit.uio.no (Trond Hasle Amundsen)
Date: Mon, 10 Dec 2012 17:33:38 +0100
Subject: check_openmanage: timeout vs. SNMP timeout
In-Reply-To: <15tzk1qkziu.fsf@tux.uio.no> (Trond Hasle Amundsen's message of
	"Fri, 07 Dec 2012 10:00:25 +0100")
References: <57ED10AF40B31A42B5D540BEEF944813218E8185@mb03.ads.tamu.edu>
	<15tzk1qkziu.fsf@tux.uio.no>
Message-ID: <15tip89lvdp.fsf@tux.uio.no>

Trond Hasle Amundsen <t.h.amundsen at usit.uio.no> writes:

> A new option to specify the SNMP object timeout would be easy to add,
> and is in my opinion a cleaner approach than just passing the plugin
> timeout.

Such an option is now implemented in the Git version:

  http://git.uio.no/git/?p=check_openmanage.git;a=commit;h=32564b44c2631eeac03a920f0c180fb12e4b29c8

Please try this version (named 3.7.8-beta2) and let me know if it works
around your problem. Usage:

  check_openmanage --snmp-timeout <integer>

Regards,
-- 
Trond H. Amundsen <t.h.amundsen at usit.uio.no>
Center for Information Technology Services, University of Oslo



From Edwin.Zoeller at ama-assn.org  Mon Dec 10 21:03:05 2012
From: Edwin.Zoeller at ama-assn.org (Edwin Zoeller)
Date: Mon, 10 Dec 2012 20:03:05 +0000
Subject: Automated Reports
Message-ID: <EFEEE1306BFAAD4890A78E609ECE7203528BFAFF@UTLP5162.ad.ama-assn.org>

I am currently running Nagios XI and wondering if anyone knows if there are any automated reports out there.

Thanks,

Ed Zoeller

From mguthrie at nagios.com  Mon Dec 10 21:16:00 2012
From: mguthrie at nagios.com (Mike Guthrie)
Date: Mon, 10 Dec 2012 14:16:00 -0600
Subject: Automated Reports
In-Reply-To: <EFEEE1306BFAAD4890A78E609ECE7203528BFAFF@UTLP5162.ad.ama-assn.org>
References: <EFEEE1306BFAAD4890A78E609ECE7203528BFAFF@UTLP5162.ad.ama-assn.org>
Message-ID: <50C64300.8050705@nagios.com>

Just an FYI: this list is primarily intended for Nagios Core. For Nagios 
XI inquiries, please use the XI support forums:
http://support.nagios.com/forum

Nagios XI 2012 Enterprise Edition has the ability to do regular 
scheduled reports for almost all of the available reports in the web 
interface.




From Edwin.Zoeller at ama-assn.org  Mon Dec 10 21:36:03 2012
From: Edwin.Zoeller at ama-assn.org (Edwin Zoeller)
Date: Mon, 10 Dec 2012 20:36:03 +0000
Subject: Automated Reports
In-Reply-To: <50C64300.8050705@nagios.com>
References: <50C64300.8050705@nagios.com>
Message-ID: <EFEEE1306BFAAD4890A78E609ECE7203528BFBBC@UTLP5162.ad.ama-assn.org>

Will do, sorry.


From adaugherity at tamu.edu  Tue Dec 11 19:35:35 2012
From: adaugherity at tamu.edu (Andrew Daugherity)
Date: Tue, 11 Dec 2012 18:35:35 +0000
Subject: check_openmanage: timeout vs. SNMP timeout
Message-ID: <57ED10AF40B31A42B5D540BEEF944813218EB5BF@mb03.ads.tamu.edu>

Trond Hasle Amundsen <t.h.amundsen at ...> writes:
> > A new option to specify the SNMP object timeout would be easy to add,
> > and is in my opinion a cleaner approach than just passing the plugin
> > timeout.
> 
> Such an option is now implemented in the Git version:
>   
> http://git.uio.no/git/?p=check_openmanage.git;a=commit;h=32564b44c2631eeac03a920f0c180fb12e4b29c8
> 
> 
> Please try this version (named 3.7.8-beta2) and let me know if it works
> around your problem. Usage:
> 
>   check_openmanage --snmp-timeout <integer>

I think I fixed my problem (for the time being at least) by restarting OMSA on that server.  Restarting snmpd didn't solve anything, nor did my timeout hack (which just gave me an UNKNOWN status - plugin timeout instead of SNMP CRITICAL when it randomly failed).  Whenever the check failed, it would hang indefinitely, so it was not a case of slow SNMP.  Thanks for the added option, though; I think someone may find it useful.

Regarding your fix:
The timeout option does appear to get passed to SNMP; however, the actual timeout is twice what is specified.  E.g. with --snmp-timeout=1, I get the SNMP critical message after 2 seconds; with --snmp-timeout=14, SNMP critical at 28 seconds; with --snmp-timeout=15 or higher, I get the UNKNOWN: PLUGIN TIMEOUT message at 30 seconds.  (I used a host without snmpd running for the timeout tests.)  I can't see anything obviously wrong with your code, but it behaves this way on both SLES 11 SP1 (Perl 5.10, net-snmp 5.4.2.1, Net::SNMP 6.0.1) and OS X 10.8 (Perl 5.12.4, net-snmp 5.6, Net::SNMP 6.1 [from CPAN]).

You probably also want to add this option to the help/usage message.

-Andrew


From justinp at norchemlab.com  Tue Dec 11 20:22:10 2012
From: justinp at norchemlab.com (Justin T Pryzby)
Date: Tue, 11 Dec 2012 12:22:10 -0700
Subject: check_openmanage: timeout vs. SNMP timeout
In-Reply-To: <57ED10AF40B31A42B5D540BEEF944813218EB5BF@mb03.ads.tamu.edu>
References: <57ED10AF40B31A42B5D540BEEF944813218EB5BF@mb03.ads.tamu.edu>
Message-ID: <20121211192210.GA9659@norchemlab.com>

On Tue, Dec 11, 2012 at 06:35:35PM +0000, Andrew Daugherity wrote:
> Regarding your fix:
> The timeout option does appear to get passed to SNMP, however the
> actual timeout is twice what is specified.  E.g. --snmp-timeout=1,
> get SNMP critical message after 2 seconds; --snmp-timeout=14, SNMP
> critical at 28 seconds; --snmp-timeout=15 or higher, get UNKNOWN:
> PLUGIN TIMEOUT message at 30 seconds.  (I used a host without snmpd
> running for the timeout tests.)  I can't see anything obviously
> wrong with your code, but it behaves this way on both SLES 11
> SP1 (Perl 5.10, net-snmp 5.4.2.1, Net::SNMP 6.0.1) and OS X 10.8
> (Perl 5.12.4, net-snmp 5.6, Net::SNMP 6.1 [from CPAN]).

That may be explained if the SNMP client has --retries=2.

Justin



From t.h.amundsen at usit.uio.no  Tue Dec 11 20:54:17 2012
From: t.h.amundsen at usit.uio.no (Trond Hasle Amundsen)
Date: Tue, 11 Dec 2012 20:54:17 +0100
Subject: check_openmanage: timeout vs. SNMP timeout
In-Reply-To: <57ED10AF40B31A42B5D540BEEF944813218EB5BF@mb03.ads.tamu.edu>
	(Andrew Daugherity's message of "Tue, 11 Dec 2012 18:35:35 +0000")
References: <57ED10AF40B31A42B5D540BEEF944813218EB5BF@mb03.ads.tamu.edu>
Message-ID: <15t8v94jrfa.fsf@tux.uio.no>

Andrew Daugherity <adaugherity at tamu.edu> writes:

>> Please try this version (named 3.7.8-beta2) and let me know if it works
>> around your problem. Usage:
>> 
>>   check_openmanage --snmp-timeout <integer>
>
> I think I fixed my problem (for the time being at least) by restarting
> OMSA on that server.  Restarting snmpd didn't solve anything, nor did
> my timeout hack (which just gave me an UNKNOWN status - plugin timeout
> instead of SNMP CRITICAL when it randomly failed).  Whenever the check
> failed, it would hang indefinitely, so it was not a case of slow SNMP.
> Thanks for the added option, though; I think someone may find it
> useful.

Yes, I agree. I'll keep it.

> Regarding your fix:
> The timeout option does appear to get passed to SNMP, however the
> actual timeout is twice what is specified.  E.g. --snmp-timeout=1, get
> SNMP critical message after 2 seconds; --snmp-timeout=14, SNMP
> critical at 28 seconds; --snmp-timeout=15 or higher, get UNKNOWN:
> PLUGIN TIMEOUT message at 30 seconds.  (I used a host without snmpd
> running for the timeout tests.)  I can't see anything obviously wrong
> with your code, but it behaves this way on both SLES 11 SP1 (Perl
> 5.10, net-snmp 5.4.2.1, Net::SNMP 6.0.1) and OS X 10.8 (Perl 5.12.4,
> net-snmp 5.6, Net::SNMP 6.1 [from CPAN]).

Hmm.. kind of confusing. It is due to the fact that Net::SNMP does one
retry (with the same timeout) before it bails out. This is adjustable
with the '-retries' parameter to the SNMP object. The default is 1. If I
set it to 0, the plugin times out in the SNMP object at the specified
time as you would expect. Thanks for pointing this out, I should make a
note of it in the manual page.
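
The arithmetic behind the observed numbers can be sketched as follows. This
is only an illustration in Python, not code from check_openmanage: the
function name is hypothetical, the 30-second plugin timeout is assumed from
the figures reported above, and the status strings are shortened labels.

```python
def snmp_wait_seconds(snmp_timeout, retries=1, plugin_timeout=30):
    """Estimate when a check against a dead SNMP agent gives up.

    Each attempt waits `snmp_timeout` seconds, and there are
    `retries + 1` attempts in total (one initial try plus the retries).
    If that total reaches the plugin timeout (assumed here to be 30 s),
    the plugin timeout is reported instead.
    """
    total = snmp_timeout * (retries + 1)
    if total >= plugin_timeout:
        return plugin_timeout, "UNKNOWN: PLUGIN TIMEOUT"
    return total, "SNMP CRITICAL"


# The three cases reported in this thread, with the default of one retry:
for t in (1, 14, 15):
    print(t, snmp_wait_seconds(t))

# With retries set to 0 the SNMP timeout is hit at the specified time:
print(15, snmp_wait_seconds(15, retries=0))
```

With one retry, the sketch reproduces the 2, 28 and 30 second figures
quoted above; with zero retries the wait equals the specified timeout.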

> You probably also want to add this option to the help/usage message.

I won't add it to the help output, as that only covers the most popular
options, but I'll add it to the manual page.

Cheers,
-- 
Trond H. Amundsen <t.h.amundsen at usit.uio.no>
Center for Information Technology Services, University of Oslo



From DCeola at twgi.net  Wed Dec 12 15:53:01 2012
From: DCeola at twgi.net (Daniel Ceola)
Date: Wed, 12 Dec 2012 14:53:01 +0000
Subject: Simple check_ping question
Message-ID: <40B7337A7D70B149A34D5A0FD9FDFA8C2C9F83AF@twgiex.twgi.net>

Hello all,

Is the "check_ping" command from the Nagios plugins meant not to work when given a host name instead of an IP address?  I have a handful of hosts set up within Nagios on which I want to monitor the ping time (as well as a few other items), but since they are DHCP clients, I have each host set up within Nagios with its hostname on my local network instead of an IP address.

When I do a simple check_ping using the hostname (which I have entered as the local FQDN, not just a short name), it returns a "Network Unreachable" error.

Example:
nagios@nagios:/usr/local/nagios/libexec$ ./check_ping -H dceola.twgi.net -w 400,20% -c 600,50%
CRITICAL - Network Unreachable (dceola.twgi.net)

Thanks,

Daniel Ceola


From DCeola at twgi.net  Wed Dec 12 17:12:31 2012
From: DCeola at twgi.net (Daniel Ceola)
Date: Wed, 12 Dec 2012 16:12:31 +0000
Subject: Simple check_ping question
In-Reply-To: <40B7337A7D70B149A34D5A0FD9FDFA8C2C9F83AF@twgiex.twgi.net>
References: <40B7337A7D70B149A34D5A0FD9FDFA8C2C9F83AF@twgiex.twgi.net>
Message-ID: <40B7337A7D70B149A34D5A0FD9FDFA8C2C9F85D1@twgiex.twgi.net>

Never mind on this question.  I did a good bit more digging on Google and found a very simple suggestion: add the -4 flag to the check_ping command to force it to use IPv4.  With this flag, it works perfectly.
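
For anyone scripting this, a small Python sketch of the same command line
with the -4 flag added. The helper name and its parameters are hypothetical;
the host and thresholds are the ones from my original example, and the
plugin path is assumed to be the current directory.

```python
import shlex


def check_ping_argv(host, warn, crit, force_ipv4=True, plugin="./check_ping"):
    """Build the check_ping argument list, optionally forcing IPv4 with -4."""
    argv = [plugin]
    if force_ipv4:
        # -4 is placed before -H so the address family is set first.
        argv.append("-4")
    argv += ["-H", host, "-w", warn, "-c", crit]
    return argv


argv = check_ping_argv("dceola.twgi.net", "400,20%", "600,50%")
print(shlex.join(argv))
```

The resulting list can be passed to subprocess.run() as-is, without going
through a shell.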

Thanks,

Daniel Ceola


From grimm26+nagios at gmail.com  Thu Dec 13 16:43:16 2012
From: grimm26+nagios at gmail.com (Mark Keisler)
Date: Thu, 13 Dec 2012 09:43:16 -0600
Subject: service checks running too often
Message-ID: <CA+FAjA8UyE0RcVjA+iW-5==wnsKdyrEMz7w-FnaAGkXypXQy1w@mail.gmail.com>

I'm running Nagios 3.4.1 on RHEL6. I have an issue where I have a poller
(service check) that is running too often and I am not sure why. I have
"service_check_timeout=180" because I had trouble with the poller running
long. Relevant settings for the service check:

        check_period                    24x7
        max_check_attempts              1
        normal_check_interval           5
        retry_check_interval            5

I also set up a tracking logger in the poller to record "timestamp PID
started by PPID : Poll [Start|End] of poller"
2012-12-12_12:26:38 19448 started by 19442 : Poll Start of poller
2012-12-12_12:27:13 19448 started by 19442 : Poll End of poller
2012-12-12_12:28:14 19931 started by 19930 : Poll Start of poller
2012-12-12_12:30:14 19931 started by 19930 : Poll End of poller
2012-12-12_12:31:37 20467 started by 20460 : Poll Start of poller
2012-12-12_12:33:15 20949 started by 20946 : Poll Start of poller
2012-12-12_12:33:15 20467 started by 20460 : Poll End of poller
2012-12-12_12:33:41 20949 started by 20946 : Poll End of poller
2012-12-12_12:36:38 21483 started by 21478 : Poll Start of poller
2012-12-12_12:38:14 21971 started by 21964 : Poll Start of poller
2012-12-12_12:39:17 21483 started by 21478 : Poll End of poller
2012-12-12_12:39:18 21971 started by 21964 : Poll End of poller
2012-12-12_12:41:38 22500 started by 22492 : Poll Start of poller
2012-12-12_12:42:19 22500 started by 22492 : Poll End of poller
2012-12-12_12:43:14 23003 started by 22999 : Poll Start of poller
2012-12-12_12:45:20 23003 started by 22999 : Poll End of poller
2012-12-12_12:46:37 23540 started by 23535 : Poll Start of poller
2012-12-12_12:48:14 24025 started by 24024 : Poll Start of poller
2012-12-12_12:48:20 23540 started by 23535 : Poll End of poller
2012-12-12_12:48:41 24025 started by 24024 : Poll End of poller
2012-12-12_12:51:38 24558 started by 24554 : Poll Start of poller
2012-12-12_12:53:14 25044 started by 25041 : Poll Start of poller
2012-12-12_12:54:35 25044 started by 25041 : Poll End of poller

As you can see, I start to get overlapping pollers. I don't understand why
this would happen. Any hints or clues?
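
One way to quantify the log above is to measure the gaps between poll
starts and list the starts that happen while an earlier poll is still
running. This is a minimal Python sketch that assumes the log format shown;
the function names are hypothetical and only the first eight log lines are
included to keep it short.

```python
from datetime import datetime

# First eight lines of the tracking log quoted above.
LOG = """\
2012-12-12_12:26:38 19448 started by 19442 : Poll Start of poller
2012-12-12_12:27:13 19448 started by 19442 : Poll End of poller
2012-12-12_12:28:14 19931 started by 19930 : Poll Start of poller
2012-12-12_12:30:14 19931 started by 19930 : Poll End of poller
2012-12-12_12:31:37 20467 started by 20460 : Poll Start of poller
2012-12-12_12:33:15 20949 started by 20946 : Poll Start of poller
2012-12-12_12:33:15 20467 started by 20460 : Poll End of poller
2012-12-12_12:33:41 20949 started by 20946 : Poll End of poller
"""


def parse(log):
    """Return (time, pid, 'Start' or 'End') tuples in file order."""
    events = []
    for line in log.strip().splitlines():
        fields = line.split()
        when = datetime.strptime(fields[0], "%Y-%m-%d_%H:%M:%S")
        events.append((when, int(fields[1]), fields[7]))
    return events


def start_gaps(events, step=1):
    """Seconds between each start and the start `step` positions later."""
    starts = [when for when, _pid, kind in events if kind == "Start"]
    return [int((b - a).total_seconds()) for a, b in zip(starts, starts[step:])]


def overlaps(events):
    """List (new_pid, pids_still_running) for starts that overlap a run."""
    running, found = [], []
    for _when, pid, kind in events:
        if kind == "Start":
            if running:
                found.append((pid, list(running)))
            running.append(pid)
        elif pid in running:
            running.remove(pid)
    return found


events = parse(LOG)
print(start_gaps(events))          # gaps between consecutive starts
print(start_gaps(events, step=2))  # gaps between every other start
print(overlaps(events))            # starts that overlap a running poll
```

In this sample, consecutive starts are well under 5 minutes apart, while
every other start is about 5 minutes apart. That may point to two
interleaved 5-minute schedules, though the log alone does not show where a
second schedule would come from.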

From Peter.Shankland at ricoh-rpl.com  Thu Dec 13 17:03:29 2012
From: Peter.Shankland at ricoh-rpl.com (Peter.Shankland at ricoh-rpl.com)
Date: Thu, 13 Dec 2012 16:03:29 +0000
Subject: Peter Shankland is out of the office.
Message-ID: <OF371167D6.53D488A0-ON80257AD3.005835BC-80257AD3.005835BC@RICOH-RPL.COM>


I will be out of the office starting  13/12/2012 and will not return until
17/12/2012.

Please contact Tom Barnes if the request is urgent:

Tom Barnes
tom.barnes at ricoh-rpl.com
01952 205362

Regards.

________________________________________
Peter Shankland
TECHNICAL SPECIALIST
IT DEPARTMENT

Ricoh UK Products Limited
Priorslee
Telford, TF2 9NS
UK
Tel: +44 (0) 1952 290090
DD:+44 (0) 1952 205160
F:+44 (0) 1952 213100
M:+44 (0) 7919 444077
Peter.Shankland at ricoh-rpl.com


From mguthrie at nagios.com  Thu Dec 13 17:24:11 2012
From: mguthrie at nagios.com (Mike Guthrie)
Date: Thu, 13 Dec 2012 10:24:11 -0600
Subject: service checks running too often
In-Reply-To: <CA+FAjA8UyE0RcVjA+iW-5==wnsKdyrEMz7w-FnaAGkXypXQy1w@mail.gmail.com>
References: <CA+FAjA8UyE0RcVjA+iW-5==wnsKdyrEMz7w-FnaAGkXypXQy1w@mail.gmail.com>
Message-ID: <50CA012B.100@nagios.com>

Although some of those start times do seem close together, it's 
important to know that the check_interval in Nagios is not necessarily a 
hard number. Nagios continually adjusts and recalculates the check 
schedule, so if you need a check to run on a hard 5-minute schedule, you 
might be better off using cron and then pushing the result to Nagios 
passively.
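
As a rough illustration of the passive route, a cron-driven script can
submit its result with the PROCESS_SERVICE_CHECK_RESULT external command.
The Python sketch below only formats the command line; the helper name, the
host name "web01" and the service description "poller" are hypothetical, and
the command file path in the comment is an assumption that depends on your
install.

```python
import time


def passive_service_result(host, service, return_code, output, now=None):
    """Format a PROCESS_SERVICE_CHECK_RESULT external command line.

    return_code follows the plugin convention:
    0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
    """
    stamp = int(time.time() if now is None else now)
    return (
        f"[{stamp}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{return_code};{output}\n"
    )


line = passive_service_result("web01", "poller", 0, "OK - poll finished")
print(line, end="")

# To submit it, the cron job would append the line to the Nagios external
# command file, for example (path is an assumption, check command_file in
# nagios.cfg):
#
#   with open("/usr/local/nagios/var/rw/nagios.cmd", "a") as cmd:
#       cmd.write(line)
```

The service needs passive checks enabled for the result to be accepted.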

With that said, look at the service details for this service. When new 
results come in, does the scheduler set the Next Check 5 minutes out as expected?




-- 


Mike Guthrie
Technical Team
___
Nagios Enterprises, LLC
Email:  mguthrie at nagios.com
Web:    www.nagios.com


From grimm26+nagios at gmail.com  Thu Dec 13 19:38:39 2012
From: grimm26+nagios at gmail.com (Mark Keisler)
Date: Thu, 13 Dec 2012 12:38:39 -0600
Subject: service checks running too often
In-Reply-To: <50CA012B.100@nagios.com>
References: <CA+FAjA8UyE0RcVjA+iW-5==wnsKdyrEMz7w-FnaAGkXypXQy1w@mail.gmail.com>
	<50CA012B.100@nagios.com>
Message-ID: <CA+FAjA_nSb5OU0PtSmFHRmwBs9e87VB6v3PZDrey1eQ1YOsO3w@mail.gmail.com>

I understand that Nagios dynamically adjusts service check times, but the
puzzling thing is that there is a check that runs every 5 minutes, plus
an extra one or two in between.  And yes, the web interface shows the next
service check as 5 minutes out, and yet another check runs before that time arrives.


On Thu, Dec 13, 2012 at 10:24 AM, Mike Guthrie <mguthrie at nagios.com> wrote:

>  Although some of those start times do seem close together, it's
> important to know that the check_interval in Nagios is not necessarily a
> hard number. Nagios is continually adjusting and recalculating the check
> schedule, so if you need a check to run on a hard 5mn schedule, you might
> be better off using cron, and then pushing the result to Nagios passively.
>
> With that said, access the service details for this service. When new
> results come in does the scheduler set the Next Check 5mn out as expected?
>
>
>
> On 12/13/2012 9:43 AM, Mark Keisler wrote:
>
> I'm running Nagios 3.4.1 on RHEL6. I have an issue where I have a poller
> (service check) that is running too often and I am not sure why. I have
> "service_check_timeout=180" because I had trouble with the poller running
> long. Relevant settings for the service check:
>
>         check_period                    24x7
>         max_check_attempts              1
>         normal_check_interval           5
>         retry_check_interval            5
>
> I also set up a tracking logger in the poller to record "timestamp PID
> started by PPID : Poll [Start|End] of poller"
> 2012-12-12_12:26:38 19448 started by 19442 : Poll Start of poller
> 2012-12-12_12:27:13 19448 started by 19442 : Poll End of poller
> 2012-12-12_12:28:14 19931 started by 19930 : Poll Start of poller
> 2012-12-12_12:30:14 19931 started by 19930 : Poll End of poller
> 2012-12-12_12:31:37 20467 started by 20460 : Poll Start of poller
> 2012-12-12_12:33:15 20949 started by 20946 : Poll Start of poller
> 2012-12-12_12:33:15 20467 started by 20460 : Poll End of poller
> 2012-12-12_12:33:41 20949 started by 20946 : Poll End of poller
> 2012-12-12_12:36:38 21483 started by 21478 : Poll Start of poller
> 2012-12-12_12:38:14 21971 started by 21964 : Poll Start of poller
> 2012-12-12_12:39:17 21483 started by 21478 : Poll End of poller
> 2012-12-12_12:39:18 21971 started by 21964 : Poll End of poller
> 2012-12-12_12:41:38 22500 started by 22492 : Poll Start of poller
> 2012-12-12_12:42:19 22500 started by 22492 : Poll End of poller
> 2012-12-12_12:43:14 23003 started by 22999 : Poll Start of poller
> 2012-12-12_12:45:20 23003 started by 22999 : Poll End of poller
> 2012-12-12_12:46:37 23540 started by 23535 : Poll Start of poller
> 2012-12-12_12:48:14 24025 started by 24024 : Poll Start of poller
> 2012-12-12_12:48:20 23540 started by 23535 : Poll End of poller
> 2012-12-12_12:48:41 24025 started by 24024 : Poll End of poller
> 2012-12-12_12:51:38 24558 started by 24554 : Poll Start of poller
> 2012-12-12_12:53:14 25044 started by 25041 : Poll Start of poller
> 2012-12-12_12:54:35 25044 started by 25041 : Poll End of poller
>
> As you can see, I start to get overlapping pollers. I don't understand why
> this would happen. Any hints or clues?

From mguthrie at nagios.com  Thu Dec 13 21:37:25 2012
From: mguthrie at nagios.com (Mike Guthrie)
Date: Thu, 13 Dec 2012 14:37:25 -0600
Subject: service checks running too often
In-Reply-To: <CA+FAjA_nSb5OU0PtSmFHRmwBs9e87VB6v3PZDrey1eQ1YOsO3w@mail.gmail.com>
References: <CA+FAjA8UyE0RcVjA+iW-5==wnsKdyrEMz7w-FnaAGkXypXQy1w@mail.gmail.com>
	<50CA012B.100@nagios.com>
	<CA+FAjA_nSb5OU0PtSmFHRmwBs9e87VB6v3PZDrey1eQ1YOsO3w@mail.gmail.com>
Message-ID: <50CA3C85.7050704@nagios.com>


On 12/13/2012 12:38 PM, Mark Keisler wrote:
> I understand that nagios dynamically adjusts service check times, but 
> the puzzling thing is that there is a check that runs every 5 minutes 
> but then an extra or two in between.  And yes, the web interface shows 
> the next service check as 5 mins out and yet another runs before that 
> time hits.
Is there any chance that a second instance of Nagios is running?  Look
for multiple *parent* processes in the output of the first command
below.  If you find more than one, stop Nagios, kill anything left over,
and start it again:

# modify the nagios binary path to match your system

ps aux | grep /bin/nagios

/etc/init.d/nagios stop

killall -9 nagios

/etc/init.d/nagios start
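
To illustrate what "multiple parent processes" means here, the sketch
below picks out nagios processes whose own parent is not another nagios
process. It is an illustration only, under these assumptions: the
process is named "nagios", the input is the output of
"ps -eo pid,ppid,comm", and the function names and sample PIDs are made
up for the example.

```python
import subprocess

# Made-up sample in the format printed by `ps -eo pid,ppid,comm`.
SAMPLE = """\
  PID  PPID COMMAND
    1     0 init
 4151     1 nagios
19442  4151 nagios
19448 19442 check_poller
 5200     1 nagios
"""


def nagios_parents(ps_output, name="nagios"):
    """Return PIDs of `name` processes whose parent is not also `name`."""
    procs = {}
    for line in ps_output.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue
        pid, ppid, comm = fields
        procs[int(pid)] = (int(ppid), comm.strip())
    return sorted(
        pid
        for pid, (ppid, comm) in procs.items()
        if comm == name and procs.get(ppid, (0, ""))[1] != name
    )


def live_ps_output():
    """Run ps on the local machine and return its output as text."""
    result = subprocess.run(
        ["ps", "-eo", "pid,ppid,comm"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

For the sample above, nagios_parents(SAMPLE) returns two PIDs (4151 and
5200), which would point to two separate instances. On a live system,
nagios_parents(live_ps_output()) should return a single PID when only
one instance is running.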




-- 


Mike Guthrie
Technical Team
___
Nagios Enterprises, LLC
Email:  mguthrie at nagios.com
Web:    www.nagios.com


From grimm26+nagios at gmail.com  Thu Dec 13 22:33:46 2012
From: grimm26+nagios at gmail.com (Mark Keisler)
Date: Thu, 13 Dec 2012 15:33:46 -0600
Subject: service checks running too often
In-Reply-To: <50CA3C85.7050704@nagios.com>
References: <CA+FAjA8UyE0RcVjA+iW-5==wnsKdyrEMz7w-FnaAGkXypXQy1w@mail.gmail.com>
	<50CA012B.100@nagios.com>
	<CA+FAjA_nSb5OU0PtSmFHRmwBs9e87VB6v3PZDrey1eQ1YOsO3w@mail.gmail.com>
	<50CA3C85.7050704@nagios.com>
Message-ID: <CA+FAjA9x2No2OBC0ms36M0EL4GJmrfhaw_E5fctshAEe=MNMzg@mail.gmail.com>

There isn't a second nagios instance.  While I was watching the pollers
spawn, they all led back to the same master nagios instance.



From grimm26+nagios at gmail.com  Fri Dec 14 05:13:32 2012
From: grimm26+nagios at gmail.com (Mark Keisler)
Date: Thu, 13 Dec 2012 22:13:32 -0600
Subject: service checks running too often
In-Reply-To: <50CA3C85.7050704@nagios.com>
References: <CA+FAjA8UyE0RcVjA+iW-5==wnsKdyrEMz7w-FnaAGkXypXQy1w@mail.gmail.com>
	<50CA012B.100@nagios.com>
	<CA+FAjA_nSb5OU0PtSmFHRmwBs9e87VB6v3PZDrey1eQ1YOsO3w@mail.gmail.com>
	<50CA3C85.7050704@nagios.com>
Message-ID: <CA+FAjA81tdi09S_y-8xDfCkPmOV4vLM79mBDjhRAiqOZZbZi=A@mail.gmail.com>

I think I found the issue.  If I happen to send a reload (HUP) to Nagios
while a service check is in progress (fairly easy, since my service check is
rather long-lived), the reloaded Nagios doesn't seem to know about that
service check, so I end up with another one being scheduled in addition to
the original, which stays on its own schedule.  To reproduce, create a dummy
service check that just sleeps for 30 seconds or so, issue a reload while it
is running, and see whether your Nagios instance starts a second sequence of
service checks.
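
A dummy check along these lines could be as small as the sketch below.
It is not the poller from this thread: the log path, the function names
and the command-line convention (sleep time in seconds as the first
argument) are assumptions for illustration. It writes Start and End
lines in the same format as the tracking log shown elsewhere in this
thread, so a second sequence of checks would show up as interleaved
Start lines with different parent PIDs.

```python
#!/usr/bin/env python3
"""Dummy Nagios check: sleep for a while, logging start and end."""
import os
import sys
import time

LOG_PATH = "/tmp/dummy_poller.log"  # hypothetical log location


def log_line(event, now=None, pid=None, ppid=None):
    """Format one line like the tracking log in this thread."""
    now = time.localtime() if now is None else now
    pid = os.getpid() if pid is None else pid
    ppid = os.getppid() if ppid is None else ppid
    stamp = time.strftime("%Y-%m-%d_%H:%M:%S", now)
    return f"{stamp} {pid} started by {ppid} : Poll {event} of poller"


def main(sleep_seconds, log_path=LOG_PATH):
    """Log a Start line, sleep, log an End line, and report OK."""
    with open(log_path, "a") as log:
        log.write(log_line("Start") + "\n")
        log.flush()  # make the Start line visible while we sleep
        time.sleep(sleep_seconds)
        log.write(log_line("End") + "\n")
    print(f"OK - slept {sleep_seconds} seconds")
    return 0  # exit status 0 means OK in the Nagios plugin convention


# Runs only when a sleep time is given, e.g. `dummy_poller.py 30`.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(int(sys.argv[1])))
```

Pointing a service's check command at this script with an argument of
30, then sending a reload while the Start line is the last entry in the
log, should be enough to try the reproduction described above.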



From ae at op5.se  Fri Dec 14 10:41:24 2012
From: ae at op5.se (Andreas Ericsson)
Date: Fri, 14 Dec 2012 10:41:24 +0100
Subject: service checks running too often
In-Reply-To: <CA+FAjA81tdi09S_y-8xDfCkPmOV4vLM79mBDjhRAiqOZZbZi=A@mail.gmail.com>
References: <CA+FAjA8UyE0RcVjA+iW-5==wnsKdyrEMz7w-FnaAGkXypXQy1w@mail.gmail.com>
	<50CA012B.100@nagios.com>
	<CA+FAjA_nSb5OU0PtSmFHRmwBs9e87VB6v3PZDrey1eQ1YOsO3w@mail.gmail.com>
	<50CA3C85.7050704@nagios.com>
	<CA+FAjA81tdi09S_y-8xDfCkPmOV4vLM79mBDjhRAiqOZZbZi=A@mail.gmail.com>
Message-ID: <50CAF444.40300@op5.se>

On 12/14/2012 05:13 AM, Mark Keisler wrote:
> I think I found the issue.  If I happen to send a reload (HUP) to nagios
> while a service check is in progress (fairly easy since my service check is
> rather long lived), the reloaded nagios doesn't seem to know about that
> service check and so I'll end up with another being scheduled as well as
> the original on its schedule.  Create a dummy service check that just
> sleeps for 30 seconds or something and issue a reload while it is running
> and see if your nagios instance will start another sequence of service
> checks.
> 

This should be pretty easily fixed by just adding a check reaping event
before initializing the event queue and skipping all checks that have
already been scheduled.

I'll have to add a check for it in 4.x. Since we keep workers between
reloads, the same thing can easily happen there.

That means we'll reschedule all checks like normal when we're starting,
but if a check result comes in when a new check is already scheduled,
we'll remove the old event and reschedule a new one according to the
retry interval. I'd suggest doing something similar in the 3.4.x
branch, but I'm not sure I can commit to that one without doing a new
svn clone, and that takes at least a day.
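
The rescheduling rule described above can be sketched as a toy event
queue that holds at most one pending check per service. This is not
Nagios code; the class and method names are hypothetical, and times are
plain numbers of seconds.

```python
import heapq
import itertools


class CheckQueue:
    """Toy event queue holding at most one pending check per service."""

    def __init__(self):
        self._heap = []                # entries: [run_at, seq, service]
        self._pending = {}             # service -> its live heap entry
        self._seq = itertools.count()  # tie-breaker for equal run_at

    def schedule(self, service, run_at):
        """Schedule a check, replacing any event already pending."""
        old = self._pending.pop(service, None)
        if old is not None:
            old[2] = None              # mark the stale entry as removed
        entry = [run_at, next(self._seq), service]
        self._pending[service] = entry
        heapq.heappush(self._heap, entry)

    def on_result(self, service, now, retry_interval):
        """A result arrived: drop any old event and reschedule from now."""
        self.schedule(service, now + retry_interval)

    def pop_due(self, now):
        """Return services whose checks are due, skipping removed ones."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            _, _, service = heapq.heappop(self._heap)
            if service is not None:
                del self._pending[service]
                due.append(service)
        return due
```

For example, if startup schedules "poller" for t=400 and the result of a
check started before the reload arrives at t=150 with a retry interval
of 300, the event at 400 is dropped and the only check that runs is the
one at t=450.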

Mark, would that be acceptable to you?

Oh, and good catch :)

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.

------------------------------------------------------------------------------
LogMeIn Rescue: Anywhere, Anytime Remote support for IT. Free Trial
Remotely access PCs and mobile devices and provide instant support
Improve your efficiency, and focus on delivering more value-add services
Discover what IT Professionals Know. Rescue delivers
http://p.sf.net/sfu/logmein_12329d2d
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
::: Messages without supporting info will risk being sent to /dev/null


From grimm26+nagios at gmail.com  Fri Dec 14 16:19:11 2012
From: grimm26+nagios at gmail.com (Mark Keisler)
Date: Fri, 14 Dec 2012 09:19:11 -0600
Subject: service checks running too often
In-Reply-To: <50CAF444.40300@op5.se>
References: <CA+FAjA8UyE0RcVjA+iW-5==wnsKdyrEMz7w-FnaAGkXypXQy1w@mail.gmail.com>
	<50CA012B.100@nagios.com>
	<CA+FAjA_nSb5OU0PtSmFHRmwBs9e87VB6v3PZDrey1eQ1YOsO3w@mail.gmail.com>
	<50CA3C85.7050704@nagios.com>
	<CA+FAjA81tdi09S_y-8xDfCkPmOV4vLM79mBDjhRAiqOZZbZi=A@mail.gmail.com>
	<50CAF444.40300@op5.se>
Message-ID: <CA+FAjA-uhGE1ssqUHwK_eT4h=-KimosB8q=o=TNVEspyW6Kyrw@mail.gmail.com>

What you propose sounds acceptable.  In the meantime, I need to be careful
about reloading nagios :).  Once I get it in that state, I have to disable
use_retained_scheduling_info and then do a full restart.



From ae at op5.se  Fri Dec 14 16:27:10 2012
From: ae at op5.se (Andreas Ericsson)
Date: Fri, 14 Dec 2012 16:27:10 +0100
Subject: service checks running too often
In-Reply-To: <CA+FAjA-uhGE1ssqUHwK_eT4h=-KimosB8q=o=TNVEspyW6Kyrw@mail.gmail.com>
References: <CA+FAjA8UyE0RcVjA+iW-5==wnsKdyrEMz7w-FnaAGkXypXQy1w@mail.gmail.com>
	<50CA012B.100@nagios.com>
	<CA+FAjA_nSb5OU0PtSmFHRmwBs9e87VB6v3PZDrey1eQ1YOsO3w@mail.gmail.com>
	<50CA3C85.7050704@nagios.com>
	<CA+FAjA81tdi09S_y-8xDfCkPmOV4vLM79mBDjhRAiqOZZbZi=A@mail.gmail.com>
	<50CAF444.40300@op5.se>
	<CA+FAjA-uhGE1ssqUHwK_eT4h=-KimosB8q=o=TNVEspyW6Kyrw@mail.gmail.com>
Message-ID: <50CB454E.7010604@op5.se>

On 12/14/2012 04:19 PM, Mark Keisler wrote:
> What you propose sounds acceptable.  In the meantime  I need to be careful
> about reloading nagios :).  Once I get it in that state, I have to disable
> use_retained_scheduling_info and then do a full restart.
> 

I've actually checked Nagios 4 now, and it appears we don't do this there.
I didn't test it all that thoroughly (and I probably should), but it's
Friday and I'm two beers past my best-before-thinking hour, so I'll just
refrain from trying it further today.

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.



From leonardo at lbasolutions.com  Fri Dec 14 16:34:17 2012
From: leonardo at lbasolutions.com (Leonardo Bacha Abrantes)
Date: Fri, 14 Dec 2012 13:34:17 -0200
Subject: refresh and stay on current page
Message-ID: <CAG+8EEYAjLuzpK194iimgg29xhsXrdUfsdahDfyPGJ1g9HM9Yw@mail.gmail.com>

Hello guys!

I'm using nagios 3.4.1 and when I press F5 to refresh the page, nagios goes
back to the home page.
How can I configure it to stay on the current page when F5 is pressed?

many thanks!!

From ae at op5.se  Fri Dec 14 17:28:45 2012
From: ae at op5.se (Andreas Ericsson)
Date: Fri, 14 Dec 2012 17:28:45 +0100
Subject: refresh and stay on current page
In-Reply-To: <CAG+8EEYAjLuzpK194iimgg29xhsXrdUfsdahDfyPGJ1g9HM9Yw@mail.gmail.com>
References: <CAG+8EEYAjLuzpK194iimgg29xhsXrdUfsdahDfyPGJ1g9HM9Yw@mail.gmail.com>
Message-ID: <50CB53BD.4010209@op5.se>

On 12/14/2012 04:34 PM, Leonardo Bacha Abrantes wrote:
> Hello guys!
> 
> I'm using nagios 3.4.1 and when I press F5 to refresh the page, nagios goes
> back to the home page.
> How can I configure it to stay on the current page when F5 is pressed?
> 

If you're using Firefox, you can go to chrome://settings and set the
variable "pixie_dust_my_frames" and it will magically do the right thing.
It's possible it's only available through some plugin though.

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.



From james.osbourn at citrix.com  Sat Dec 15 15:43:25 2012
From: james.osbourn at citrix.com (James Osbourn)
Date: Sat, 15 Dec 2012 14:43:25 +0000
Subject: Nagios Graph converting figures to binary bytes
	rather than decimal
Message-ID: <09051C7A8945F944AB7AC4E86BEB1ED5011663DD1C6E@LONPMAILBOX01.citrite.net>

I have Nagios servers installed with a standard installation of nagiosgraph.  I am seeing some weird behaviour in the graphs that show the data returned from check_disk.

I have a filesystem which has usage as follows:
Filesystem            Size  Used Avail Use% Mounted on
filer02.uk.xensource.com:/vol/groups/images
                      450G  310G  141G  69% /usr/groups/images

check_disk returns the following output when run with this command:
check_disk -w 90 -c 95 -p /usr/groups/images --units=GB

DISK OK - free space: /usr/groups/images 144182 MB (31% inode=99%);| /usr/groups/images=316617MB;460710;460705;0;460800

However, nagiosgraph is showing the following information:
Size = 331.79, should be 310
Warning = 477.82, should be 445.5
Critical = 483.18, should be 450
Min = 0, which is correct
Max = 483.18, should be 450


Taking the GB value, converting it to bytes and then converting back to GB using decimal bytes gives these figures.

Is there any way to make nagiosgraph use binary bytes rather than decimal?  I am not that familiar with nagiosgraph or RRD and cannot work out how to make the change.
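
The mismatch can be checked with a little arithmetic. The sketch below, in Python, uses the maximum value from the perfdata above (460800) and assumes, as the figures suggest, that check_disk reports it in units of 1024*1024 bytes while the graph divides by powers of 1000. The helper names are made up for illustration.

```python
BYTES_PER_BINARY_MB = 1024 ** 2  # assumption: check_disk "MB" is 1024*1024 bytes


def mb_to_bytes(megabytes):
    # Convert the plugin's MB figure into plain bytes.
    return megabytes * BYTES_PER_BINARY_MB


def bytes_to_decimal_gb(num_bytes):
    # Decimal gigabytes: divide by 1000^3.
    return num_bytes / 1000 ** 3


def bytes_to_binary_gb(num_bytes):
    # Binary gigabytes: divide by 1024^3.
    return num_bytes / 1024 ** 3


max_bytes = mb_to_bytes(460800)
print(round(bytes_to_decimal_gb(max_bytes), 2))  # 483.18, the figure on the graph
print(bytes_to_binary_gb(max_bytes))             # 450.0, the figure df reports
```

The same byte count therefore reads as 483.18 or 450 depending only on which divisor the display side uses.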

Many Thanks

James


From ck at claudiokuenzler.com  Sat Dec 15 16:58:44 2012
From: ck at claudiokuenzler.com (Claudio Kuenzler)
Date: Sat, 15 Dec 2012 16:58:44 +0100
Subject: Nagios Graph converting figures to binary bytes
 rather than decimal
In-Reply-To: <09051C7A8945F944AB7AC4E86BEB1ED5011663DD1C6E@LONPMAILBOX01.citrite.net>
References: <09051C7A8945F944AB7AC4E86BEB1ED5011663DD1C6E@LONPMAILBOX01.citrite.net>
Message-ID: <CAF-yqgjRkg_91bx6yeqQohKdTd1nzL2Dvz20QG2JCUWwd1t8wQ@mail.gmail.com>

Hi James,


> Looking at the GB value and converting to bytes and then back to GB using
> decimal bytes gives these figures.
>
> Is there any way to make nagiosgraph using binary bytes rather than
> decimal.  I am not that familiar with nagiosgraph or RRD and cannot work
> out how to make the change.
>

You will have to manually add an entry into the 'map' file to tell
nagiosgraph to use different values.

My system (df -h) shows:
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg0-root   20G  5.4G   14G  29% /

Nagiosgraph shows:
total: 20.32G
used: 5.45G

You can use the following map entry:

# Service Type: check_disk
# Nagiosgraph regex by Claudio Kuenzler
# Check: /usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /
# Output: DISK OK - free space: / 235120 MB (66% inode=95%):
# Perfdata: /=119211MB;298635;335964;0;373294
/perfdata:(.*)=(\d+)MB;(\d+);(\d+);(\d+);(\d+).*/
#/perfdata:(\W)=(\d+)MB;(\d+);(\d+);(\d+);(\d+).*/ # only / partition
and push @s, [diskusage,
        ['used', GAUGE, $2*1000**2 ],
        ['total', GAUGE, $6*1000**2 ] ];
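
To see what the regular expression in this entry captures, here is a rough Python equivalent applied to the perfdata from James's message. It is only an illustration and not part of nagiosgraph: the function name, the returned dictionary and the `perfdata:` prefix on the sample line (taken from the pattern itself) are assumptions.

```python
import re

# Same pattern as the map entry: label, used, warn, crit, min, max.
PERFDATA_RE = re.compile(r"perfdata:(.*)=(\d+)MB;(\d+);(\d+);(\d+);(\d+)")


def parse_disk_perfdata(line, bytes_per_mb):
    """Return used/total in bytes, scaled by the chosen bytes-per-MB factor."""
    match = PERFDATA_RE.search(line)
    if match is None:
        return None
    label = match.group(1)
    used = int(match.group(2))   # $2 in the map entry
    total = int(match.group(6))  # $6 in the map entry
    return {
        "label": label,
        "used": used * bytes_per_mb,
        "total": total * bytes_per_mb,
    }


line = "perfdata:/usr/groups/images=316617MB;460710;460705;0;460800"
decimal = parse_disk_perfdata(line, 1000 ** 2)  # factor used in the entry above
binary = parse_disk_perfdata(line, 1024 ** 2)   # factor for 1024-based MB
print(decimal)
print(binary)
```

Only the multiplier differs between the two calls, which is the part of the map entry that decides how the stored byte values relate to the MB figures from the plugin.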

From james.osbourn at citrix.com  Sat Dec 15 17:15:55 2012
From: james.osbourn at citrix.com (James Osbourn)
Date: Sat, 15 Dec 2012 16:15:55 +0000
Subject: Nagios Graph converting figures to binary bytes
 rather than decimal
In-Reply-To: <CAF-yqgjRkg_91bx6yeqQohKdTd1nzL2Dvz20QG2JCUWwd1t8wQ@mail.gmail.com>
References: <09051C7A8945F944AB7AC4E86BEB1ED5011663DD1C6E@LONPMAILBOX01.citrite.net>
	<CAF-yqgjRkg_91bx6yeqQohKdTd1nzL2Dvz20QG2JCUWwd1t8wQ@mail.gmail.com>
Message-ID: <09051C7A8945F944AB7AC4E86BEB1ED5011663DD1C70@LONPMAILBOX01.citrite.net>

Hi Claudio,

Thanks for the feedback.  Just to make sure, whereabouts in the map file should these lines go?

Thanks

James


From ck at claudiokuenzler.com  Sat Dec 15 17:26:06 2012
From: ck at claudiokuenzler.com (Claudio Kuenzler)
Date: Sat, 15 Dec 2012 17:26:06 +0100
Subject: Nagios Graph converting figures to binary bytes
 rather than decimal
In-Reply-To: <09051C7A8945F944AB7AC4E86BEB1ED5011663DD1C70@LONPMAILBOX01.citrite.net>
References: <09051C7A8945F944AB7AC4E86BEB1ED5011663DD1C6E@LONPMAILBOX01.citrite.net>
	<CAF-yqgjRkg_91bx6yeqQohKdTd1nzL2Dvz20QG2JCUWwd1t8wQ@mail.gmail.com>
	<09051C7A8945F944AB7AC4E86BEB1ED5011663DD1C70@LONPMAILBOX01.citrite.net>
Message-ID: <CAF-yqgiJXvYVgBqhm1MUjWC0vuZ1j-GuY+fu1AbBtPjXiGTzNg@mail.gmail.com>

> Hi Claudio,
>
> Thanks for the feedback.  Just to make sure, whereabouts in the map file
> should these lines go?
>
> Thanks
>
> James
>

Just make sure you add these lines BEFORE the following part:

##############################################################################################
# default rule.  if none of the other rules did anything, then check for
# perfdata that meets the standard format.

From nibin.vm at piserve.com  Mon Dec 17 05:45:59 2012
From: nibin.vm at piserve.com (Nibin V M)
Date: Mon, 17 Dec 2012 10:15:59 +0530
Subject: check_snmp command
Message-ID: <CAM+xPJVqu6z7WmXJzf8qVUr5o9UvqXi1OSgg74WO49bKNMJkyQ@mail.gmail.com>

Hello,

I am trying to use the check_snmp command and it is not working as expected.
The command I tried is outlined below.

 ./check_snmp -H myhost -C public -o UCD-SNMP-MIB::extOutput.4 -w 500 -c 1000
SNMP OK - 1106 | UCD-SNMP-MIB::extOutput.4=1106

As per the check_snmp man page, it should show CRITICAL status, whereas it
shows OK status now! Or am I missing something?

Please advise.
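
For reference, this is a small Python sketch of the threshold range rules from the Nagios plugin development guidelines, where a bare number N means "alert if the value is outside 0..N". It is not check_snmp's own code and the function names are made up for illustration. It only shows what result those rules would give for the values above, not why the plugin reported OK.

```python
import math


def parse_range(spec):
    """Parse '[@]start:end' into (start, end, alert_when_inside)."""
    inside = spec.startswith("@")
    if inside:
        spec = spec[1:]
    if ":" in spec:
        start_text, end_text = spec.split(":", 1)
    else:
        start_text, end_text = "0", spec  # bare "N" means 0..N
    if start_text == "~":
        start = -math.inf                 # "~" means negative infinity
    elif start_text == "":
        start = 0.0
    else:
        start = float(start_text)
    end = math.inf if end_text == "" else float(end_text)
    return start, end, inside


def alerts(value, spec):
    start, end, inside = parse_range(spec)
    in_range = start <= value <= end
    # Default: alert when outside the range; with "@": alert when inside.
    return in_range if inside else not in_range


def status(value, warn, crit):
    if alerts(value, crit):
        return "CRITICAL"
    if alerts(value, warn):
        return "WARNING"
    return "OK"


print(status(1106, "500", "1000"))
```

Under these rules a value of 1106 with `-w 500 -c 1000` is outside both ranges, so the expected state would be CRITICAL.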

-- 
Regards,
Nibin.
---------------------------------

From james.osbourn at citrix.com  Mon Dec 17 16:14:44 2012
From: james.osbourn at citrix.com (James Osbourn)
Date: Mon, 17 Dec 2012 15:14:44 +0000
Subject: Nagios Graph converting figures to binary bytes
 rather than decimal
References: <09051C7A8945F944AB7AC4E86BEB1ED5011663DD1C6E@LONPMAILBOX01.citrite.net>
	<CAF-yqgjRkg_91bx6yeqQohKdTd1nzL2Dvz20QG2JCUWwd1t8wQ@mail.gmail.com>
Message-ID: <09051C7A8945F944AB7AC4E86BEB1ED5011663DD1DBD@LONPMAILBOX01.citrite.net>

Hi Claudio,

I have entered the map entry below based on your example and I am still seeing the results on the graph show as a decimal version of the Bytes value.

/perfdata:(.*)=(\d+)MB;(\d+);(\d+);(\d+);(\d+)/
and push @s, [$1,
        ['data', GAUGE, $2*1024*1024 ],
        ['warn', GAUGE, $3*1024*1024 ],
        ['crit', GAUGE, $4*1024*1024 ],
        ['min', GAUGE, $5*1024*1024 ],
        ['max', GAUGE, $6*1024*1024 ] ];

Is there a step after the map has been processed, where that value is taken and the graphs are produced, in which the calculation could be going wrong?

This is a really annoying issue that I would like to resolve if I can.

Cheers

James


From Samuel.Kidman at panres.com  Tue Dec 18 09:29:27 2012
From: Samuel.Kidman at panres.com (Samuel Kidman)
Date: Tue, 18 Dec 2012 08:29:27 +0000
Subject: Embedded Perl in Nagios 4
Message-ID: <CB04A43C154C0A4B8CC1BF6B9E036A4E0BE40105@PR-PRO-MDC-EX01.SMY.local>

Hello

I just had a look at Ethan's slide decks on Nagios 4 from the 2012 Nagios Conference, and found out that embedded perl won't be in Nagios 4. I'm running a distributed Nagios deployment that makes extensive use of perl plugins, and the performance of my Nagios servers is beginning to struggle a bit as a result. I was working on making my plugins work with embedded perl, but since this is now not an option, I was wondering what others will be doing to optimise perl plugins without embedded perl?

Thanks
Sam Kidman

From ae at op5.se  Tue Dec 18 10:02:55 2012
From: ae at op5.se (Andreas Ericsson)
Date: Tue, 18 Dec 2012 10:02:55 +0100
Subject: Embedded Perl in Nagios 4
In-Reply-To: <CB04A43C154C0A4B8CC1BF6B9E036A4E0BE40105@PR-PRO-MDC-EX01.SMY.local>
References: <CB04A43C154C0A4B8CC1BF6B9E036A4E0BE40105@PR-PRO-MDC-EX01.SMY.local>
Message-ID: <50D0313F.5080302@op5.se>

On 12/18/2012 09:29 AM, Samuel Kidman wrote:
> Hello
> 
> I just had a look at Ethan's slide decks on Nagios 4 from 2012 Nagios
> Conference, and found out that embedded perl won't be in Nagios 4.
> I'm running a distributed Nagios deployment that makes extensive use
> of perl plugins, and the performance of my Nagios servers is
> beginning to struggle a bit as a result. I was working on making my
> plugins work with embedded perl but since this is now not an option I
> was wondering what others will be doing to optimise perl plugins
> without embedded perl?
> 

For starters, you should just upgrade and it's entirely possible that
the performance issues go away completely. Nagios 4 has awesome check
scaling.

The second thing to do would be to look into running mod_gearman with
workers living on the same server as the master Nagios process.
mod_gearman still has embedded perl support (although it really only
makes a difference for "large" plugins).

The third thing to do would be to inspect your most "expensive" plugins
(expensive in terms of Perl loadtime multiplied by the number of times
the plugin is used for any given time interval) and see if rewriting
them in a different language makes a huge difference. We did that for
the snmp interface checks when one of our large customers wanted to
monitor some 60000 services. Perl simply wasn't fast enough. Embedding
it meant we couldn't fork() fast enough (embedding languages has its
own overhead too), and it still leaked memory, so we rewrote them in C
and we cut system load by more than 85%.
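
One way to find the "expensive" plugins in that sense is to multiply each plugin's load time by how often it runs and sort the result. The Python sketch below does exactly that; the plugin names and numbers are invented purely for illustration and would have to be replaced with your own measurements.

```python
def rank_by_cost(plugins):
    """plugins maps name -> (load_seconds, runs_per_interval).

    Returns (name, cost) pairs, most expensive first, where
    cost = load time * number of runs in the interval.
    """
    costs = {name: load * runs for name, (load, runs) in plugins.items()}
    return sorted(costs.items(), key=lambda item: item[1], reverse=True)


# Hypothetical measurements for one check interval.
sample = {
    "check_snmp_if.pl": (0.25, 2000),
    "check_disk.pl": (0.5, 100),
    "check_big_report.pl": (2.0, 10),
}

ranking = rank_by_cost(sample)
for name, cost in ranking:
    print(f"{name}: {cost:.1f} s of Perl load time per interval")
```

In this made-up data the plugin with the shortest load time still tops the list because it runs so often, which is the kind of candidate worth rewriting first.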

The fourth thing to do would be to either sponsor a developer or buy
development time to build a special-purpose Nagios worker that handles
perl checks and that has a perl interpreter and a cache embedded. It's
not certain that would be better than just running the perl interpreter
directly. Without a cache, embedded perl is completely useless, since
it still has to bytecompile all the modules, and with a cache it can grow
to consume all memory on the system, no matter how carefully you write
your plugins. Tradeoffs, compromises and possible leaks no matter what
you do if you start down that route. It might be awesome though, so I
guess it could be worth a shot.

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.



From ck at claudiokuenzler.com  Tue Dec 18 13:14:35 2012
From: ck at claudiokuenzler.com (Claudio Kuenzler)
Date: Tue, 18 Dec 2012 13:14:35 +0100
Subject: Nagios Graph converting figures to binary bytes
 rather than decimal
In-Reply-To: <09051C7A8945F944AB7AC4E86BEB1ED5011663DD1DBD@LONPMAILBOX01.citrite.net>
References: <09051C7A8945F944AB7AC4E86BEB1ED5011663DD1C6E@LONPMAILBOX01.citrite.net>
	<CAF-yqgjRkg_91bx6yeqQohKdTd1nzL2Dvz20QG2JCUWwd1t8wQ@mail.gmail.com>
	<09051C7A8945F944AB7AC4E86BEB1ED5011663DD1DBD@LONPMAILBOX01.citrite.net>
Message-ID: <CAF-yqggVXv1KL_1Ycg_wcw6moH0q2h1wyHneUG5RG2zZZ=ejtg@mail.gmail.com>

> Hi Claudio,
>
> I have entered the map entry below based on your example and I am still
> seeing the results on the graph show as a decimal version of the Bytes
> value.
>
> /perfdata:(.*)=(\d+)MB;(\d+);(\d+);(\d+);(\d+)/
> and push @s, [$1,
>         ['data', GAUGE, $2*1024*1024 ],
>         ['warn', GAUGE, $3*1024*1024 ],
>         ['crit', GAUGE, $4*1024*1024 ],
>         ['min', GAUGE, $5*1024*1024 ],
>         ['max', GAUGE, $6*1024*1024 ] ];
>

You didn't follow my example, as you're again multiplying by 1024.

Take _another_ look at my example:

        ['used', GAUGE, $2*1000**2 ],
        ['total', GAUGE, $6*1000**2 ] ];

From james.osbourn at citrix.com  Tue Dec 18 14:51:46 2012
From: james.osbourn at citrix.com (James Osbourn)
Date: Tue, 18 Dec 2012 13:51:46 +0000
Subject: Nagios Graph converting figures to binary bytes
 rather than decimal
In-Reply-To: <CAF-yqggVXv1KL_1Ycg_wcw6moH0q2h1wyHneUG5RG2zZZ=ejtg@mail.gmail.com>
References: <09051C7A8945F944AB7AC4E86BEB1ED5011663DD1C6E@LONPMAILBOX01.citrite.net>
	<CAF-yqgjRkg_91bx6yeqQohKdTd1nzL2Dvz20QG2JCUWwd1t8wQ@mail.gmail.com>
	<09051C7A8945F944AB7AC4E86BEB1ED5011663DD1DBD@LONPMAILBOX01.citrite.net>
	<CAF-yqggVXv1KL_1Ycg_wcw6moH0q2h1wyHneUG5RG2zZZ=ejtg@mail.gmail.com>
Message-ID: <09051C7A8945F944AB7AC4E86BEB1ED5011663DD1F73@LONPMAILBOX01.citrite.net>

Hi Claudio,

I modified your code because it was not working for me and I wanted to check what was going on.  I have now reverted to the example you gave and I am still getting the same result, as can be seen in the attached graph (image001.png).
The filesystem is only 450GB in size, yet the graph values still show 460.80, which is the byte value shown in decimal GB.

I cannot work out why the graph is showing the wrong values when all other information is correct.

James
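
The 460.80 figure can be reproduced with a little arithmetic. The sketch below is only an illustration and rests on two assumptions that the message does not state outright: that the plugin reports the 450GB filesystem as 450 * 1024 = 460800 MB, and that the graph scales the stored value by powers of 1000.

```python
# Assumption: the plugin reports a 450 GiB filesystem as 450 * 1024 "MB".
reported_mb = 450 * 1024

# The map entry from the earlier example multiplies by 1000**2 before storing.
stored_value = reported_mb * 1000**2

# Assumption: the graph divides by 1000 three times to label the value in "G".
graph_label = stored_value / 1000**3

print(f"{graph_label:.2f}")  # prints 460.80
```

Under these assumptions the multiplication and the division cancel out all but one factor of 1000, so the graph effectively shows the MB figure divided by 1000 instead of by 1024.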

-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 16703 bytes
Desc: image001.png
URL: <https://www.monitoring-lists.org/archive/users/attachments/20121218/713ff6c1/attachment.png>

From ck at claudiokuenzler.com  Wed Dec 19 15:18:59 2012
From: ck at claudiokuenzler.com (Claudio Kuenzler)
Date: Wed, 19 Dec 2012 15:18:59 +0100
Subject: Nagios Graph converting figures to binary bytes
 rather than decimal
In-Reply-To: <09051C7A8945F944AB7AC4E86BEB1ED5011663DD1F73@LONPMAILBOX01.citrite.net>
References: <09051C7A8945F944AB7AC4E86BEB1ED5011663DD1C6E@LONPMAILBOX01.citrite.net>
	<CAF-yqgjRkg_91bx6yeqQohKdTd1nzL2Dvz20QG2JCUWwd1t8wQ@mail.gmail.com>
	<09051C7A8945F944AB7AC4E86BEB1ED5011663DD1DBD@LONPMAILBOX01.citrite.net>
	<CAF-yqggVXv1KL_1Ycg_wcw6moH0q2h1wyHneUG5RG2zZZ=ejtg@mail.gmail.com>
	<09051C7A8945F944AB7AC4E86BEB1ED5011663DD1F73@LONPMAILBOX01.citrite.net>
Message-ID: <CAF-yqggki7WRWvXBvOLPN1d-21Ra8AkogOP7RTwacPzWLNzFYg@mail.gmail.com>

James and I continued troubleshooting off-list and found the solution,
which we of course want to share.
Here is more or less what I wrote to him:

---------------------------------------

You're absolutely right, the graphs were not correct, with either *1000**2
or *1024**2.

Actually, thanks to your e-mail I realized that I have been living in denial
for the past few years. I must have come up with the multiplication by 1000 in
the map file as a kind of workaround, because it brought the graph closer to
reality. Then I must have forgotten about it and moved on.

I broke it down to this:

# df

Filesystem      Size               Used              Avail             Use% Mounted on
/dev/md4         682587992     117456312     530731228      19% /

# df -h

Filesystem      Size            Used          Avail          Use% Mounted on
/dev/md4         651G          113G          507G          19% /

So df shows the size in KB (682587992).

The Nagios plugin presents the same size in MB (666589, which is
682587992 / 1024 rounded down):

/=114709MB;533271;599930;0;666589

So in order to hand Nagiosgraph usable values, we have to go down to the
lowest unit, which in this case is the byte.
To get the byte value from the Nagios output we have to multiply it by
1024^2: 666589*1024*1024 = 698969227264
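
The conversion above can be checked with a short Python sketch. It is not part of Nagiosgraph: the regular expression is a Python rendering of the map-file pattern quoted earlier in the thread, and the function name perfdata_to_bytes is made up for this illustration.

```python
import re

# Python rendering of the map-file pattern quoted earlier in the thread.
PERFDATA = re.compile(r"(.*)=(\d+)MB;(\d+);(\d+);(\d+);(\d+)")
MIB = 1024 * 1024  # the plugin's "MB" are multiples of 1024*1024 bytes


def perfdata_to_bytes(perfdata):
    """Return (label, used_bytes, total_bytes) for one perfdata string."""
    match = PERFDATA.fullmatch(perfdata)
    if match is None:
        raise ValueError(f"unexpected perfdata: {perfdata!r}")
    label = match.group(1)
    used_mb = int(match.group(2))   # second capture group: used space in MB
    total_mb = int(match.group(6))  # sixth capture group: total size in MB
    return label, used_mb * MIB, total_mb * MIB


label, used, total = perfdata_to_bytes("/=114709MB;533271;599930;0;666589")
print(label, used, total)  # total is 698969227264, the figure given above
```

Multiplying by 1024*1024 keeps the stored value in plain bytes, so the choice between 1000 and 1024 only has to be made once, when the graph is drawn.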

Nagiosgraph's job is now to take this value, 698969227264, and divide it
by 1024 repeatedly until a "reasonable", human-readable value results,
which would be the 651 GB.

But here's the problem: Nagiosgraph divides 698969227264 by 1000
instead of 1024, so the graph shows 698 GB.
But why? It took some guessing, which I then had to confirm: Nagiosgraph
divides by 1000 BY DEFAULT, probably because rrd graphs were originally
meant for graphing network traffic, which is usually measured in bits.
Either way, we need to tell Nagiosgraph to divide by 1024 for our disk
checks.
There's a special file for that called rrdopts.conf. I added the
following lines to it:

# disk values need to be divided by 1024 not 1000
Diskspace /=-b 1024
Root Partition=-b 1024

The string on the left is the service description in Nagios, so in my case
this is "Diskspace /". -b 1024 tells Nagiosgraph to use 1024 as the base
value.
See the following entry from the "rrdgraph" manpage:

[-b|--base value] If you are graphing memory (and NOT network
traffic) this switch should be set to 1024 so that one Kb is 1024 byte. For
traffic measurement, 1 kb/s is 1000 b/s.
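
To see what the base value changes, here is a small Python sketch of the scaling step. It only illustrates the idea of dividing by the base until the number is small enough. It is not rrdtool's actual code, and the function name humanize is made up for this example.

```python
def humanize(value, base):
    """Divide value by base until it drops below base; return (number, prefix)."""
    prefixes = ["", "k", "M", "G", "T", "P"]
    index = 0
    scaled = float(value)
    while scaled >= base and index < len(prefixes) - 1:
        scaled /= base
        index += 1
    return scaled, prefixes[index]


total_bytes = 698969227264  # byte value from the calculation above

for base in (1000, 1024):
    number, prefix = humanize(total_bytes, base)
    print(f"base {base}: {number:.2f} {prefix}")
```

With base 1000 the result is about 698.97 G, and with base 1024 it is about 650.97 G, which df -h rounds to 651G.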

Now you just have to make sure that the rrdopts.conf line is not commented
out in your nagiosgraph.conf file, and there you go.
The good news is that there is no need to recreate the rrd files. This rrd
option only affects how the graphs are drawn, which means that the correct
values are shown immediately.



From james.osbourn at citrix.com  Wed Dec 19 15:26:42 2012
From: james.osbourn at citrix.com (James Osbourn)
Date: Wed, 19 Dec 2012 14:26:42 +0000
Subject: Nagios Graph converting figures to binary bytes
 rather than decimal
In-Reply-To: <CAF-yqggki7WRWvXBvOLPN1d-21Ra8AkogOP7RTwacPzWLNzFYg@mail.gmail.com>
References: <09051C7A8945F944AB7AC4E86BEB1ED5011663DD1C6E@LONPMAILBOX01.citrite.net>
	<CAF-yqgjRkg_91bx6yeqQohKdTd1nzL2Dvz20QG2JCUWwd1t8wQ@mail.gmail.com>
	<09051C7A8945F944AB7AC4E86BEB1ED5011663DD1DBD@LONPMAILBOX01.citrite.net>
	<CAF-yqggVXv1KL_1Ycg_wcw6moH0q2h1wyHneUG5RG2zZZ=ejtg@mail.gmail.com>
	<09051C7A8945F944AB7AC4E86BEB1ED5011663DD1F73@LONPMAILBOX01.citrite.net>
	<CAF-yqggki7WRWvXBvOLPN1d-21Ra8AkogOP7RTwacPzWLNzFYg@mail.gmail.com>
Message-ID: <09051C7A8945F944AB7AC4E86BEB1ED5011663DD2164@LONPMAILBOX01.citrite.net>

Thanks to Claudio for finding the solution to this issue.  To add to Claudio's instructions: on my installation, the line in nagiosgraph.conf that enables the rrdopts.conf file was commented out.  You will need to uncomment it and make sure that it points to the correct location for the graphs to be drawn correctly.
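
A quick way to check for this is sketched below in Python. Note the assumptions: the setting name rrdoptsfile and the path in the sample text are guesses for illustration and are not taken from this thread, so check your own nagiosgraph.conf for the actual name. The helper find_setting is likewise hypothetical.

```python
import re


def find_setting(config_text, key):
    """Return (state, value); state is 'active', 'commented' or 'missing'."""
    pattern = re.compile(r"^\s*(#?)\s*" + re.escape(key) + r"\s*=\s*(.*?)\s*$")
    commented_value = None
    for line in config_text.splitlines():
        match = pattern.match(line)
        if not match:
            continue
        if match.group(1) == "":
            return "active", match.group(2)  # an uncommented line wins
        commented_value = match.group(2)     # remember the commented-out line
    if commented_value is not None:
        return "commented", commented_value
    return "missing", None


# Hypothetical config excerpt; the key name and path are assumptions.
sample = """\
# rrdoptsfile = /etc/nagiosgraph/rrdopts.conf
"""
print(find_setting(sample, "rrdoptsfile"))
```

If the state comes back as "commented", remove the leading "#" from that line and verify that the path points at the rrdopts.conf file you edited.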

James


From darose at darose.net  Fri Dec 21 00:20:31 2012
From: darose at darose.net (David Rosenstrauch)
Date: Thu, 20 Dec 2012 18:20:31 -0500
Subject: Plugin exists?
Message-ID: <50D39D3F.3020306@darose.net>

Hi, just wondering if anyone might know of a plugin that does what I'm
looking for.

The idea is that it would work similarly to check_mysql_query, in that it
checks the numeric value returned by a query against warning and critical
threshold values.  But instead of querying MySQL, it would just cat the
contents of a text file.  I.e., read the first line and first column of
a text file, parse it as a number, and then compare it against the
threshold values.

It obviously wouldn't be too hard to write a plugin like this, but I 
figured I'd try to see if one existed in order to not re-invent the wheel.

A search through Nagios Exchange didn't turn anything up.  But I thought 
someone on the list might know of one, or have already written one 
themselves.

Thanks,

DR
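
For illustration, a plugin along these lines could be sketched in Python roughly as follows. This is not an existing plugin. The function names are made up, and the threshold rule (WARNING or CRITICAL when the value is greater than or equal to the threshold) is a simplifying assumption, not the full Nagios range syntax.

```python
import os
import tempfile

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # standard Nagios plugin exit codes


def read_first_value(path):
    """Parse the first whitespace-separated column of the first line as a number."""
    with open(path) as handle:
        first_line = handle.readline()
    return float(first_line.split()[0])


def check_file(path, warning, critical):
    """Return (exit_code, output_line) for the value stored in path."""
    try:
        value = read_first_value(path)
    except (OSError, ValueError, IndexError) as error:
        return UNKNOWN, f"UNKNOWN - cannot read a number from {path}: {error}"
    if value >= critical:
        state, code = "CRITICAL", CRITICAL
    elif value >= warning:
        state, code = "WARNING", WARNING
    else:
        state, code = "OK", OK
    return code, f"{state} - value is {value:g}|value={value:g};{warning:g};{critical:g}"


# Demonstration with a temporary file standing in for the monitored text file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as demo:
    demo.write("42 widgets\n")
status, message = check_file(demo.name, warning=40, critical=50)
os.unlink(demo.name)
print(status, message)
```

A real plugin would take the file name and thresholds from the command line, print the output line and exit with the returned code, since Nagios derives the service state from the exit status.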



From mail at catsnest.co.uk  Fri Dec 21 15:50:19 2012
From: mail at catsnest.co.uk (RichTea)
Date: Fri, 21 Dec 2012 14:50:19 +0000
Subject: Plugin exists?
In-Reply-To: <50D39D3F.3020306@darose.net>
References: <50D39D3F.3020306@darose.net>
Message-ID: <CAFWLKdeBc4AXR=x0+a17=Yjjn6Pk=KzY=jOHxQWJyDbRQotg8g@mail.gmail.com>

On Thu, Dec 20, 2012 at 11:20 PM, David Rosenstrauch <darose at darose.net>wrote:

> Hi, just wondering if anyone might know of a plugin that does what I'm
> looking for.
>
> The idea is that it would work similarly to check_mysql_query, in that it
> checks the numeric value returned by a query against warning and critical
> threshold values.  But instead of querying MySQL, it would just cat the
> contents of a text file.  I.e., read the first line and first column of
> a text file, parse it as a number, and then compare it against the
> threshold values.
>
> It obviously wouldn't be too hard to write a plugin like this, but I
> figured I'd try to see if one existed in order to not re-invent the wheel.
>
> A search through Nagios Exchange didn't turn anything up.  But I thought
> someone on the list might know of one, or have already written one
> themselves.
>
> Thanks,
>
> DR
>
>
Hi,

I wrote something similar to this a while back; it looks at data pairs in a
text file, e.g. <DataName>:<DataValue>.

It should not be too hard to make it work with CSV; you will have to ignore
the shoddy coding though!

http://23.me.uk/scripts/check_jmxview.pl.text
http://23.me.uk/scripts/check_jmxview.server.cfg.text

I might even be able to find the Nagios-related config for this if need be.

Ritchie
--
<-- http://23.me.uk/2 -->
<--Time flies like an arrow; fruit flies like a banana.  -->



From darose at darose.net  Fri Dec 21 15:56:15 2012
From: darose at darose.net (David Rosenstrauch)
Date: Fri, 21 Dec 2012 09:56:15 -0500
Subject: Plugin exists?
In-Reply-To: <CAFWLKdeBc4AXR=x0+a17=Yjjn6Pk=KzY=jOHxQWJyDbRQotg8g@mail.gmail.com>
References: <50D39D3F.3020306@darose.net>
	<CAFWLKdeBc4AXR=x0+a17=Yjjn6Pk=KzY=jOHxQWJyDbRQotg8g@mail.gmail.com>
Message-ID: <50D4788F.7030200@darose.net>

On 12/21/2012 09:50 AM, RichTea wrote:
> Hi,
>
> I wrote something similar to this a while back; it looks at data pairs in a
> text file, e.g. <DataName>:<DataValue>.
>
> It should not be too hard to make it work with CSV; you will have to ignore
> the shoddy coding though!
>
> http://23.me.uk/scripts/check_jmxview.pl.text
> http://23.me.uk/scripts/check_jmxview.server.cfg.text
>
> I might even be able to find the Nagios-related config for this if need be.
>
> Ritchie

I'll take a look - tnx!

DR




From Peter.Shankland at ricoh-rpl.com  Fri Dec 21 18:26:22 2012
From: Peter.Shankland at ricoh-rpl.com (Peter.Shankland at ricoh-rpl.com)
Date: Fri, 21 Dec 2012 17:26:22 +0000
Subject: AUTO: Peter Shankland is out of the office.
	(returning 02/01/2013)
Message-ID: <OF698C5365.1145734A-ON80257ADB.005FCC5D-80257ADB.005FCC5D@RICOH-RPL.COM>


I am out of the office until 02/01/2013.

Please contact Tom Barnes if the request is urgent:

Tom Barnes
tom.barnes at ricoh-rpl.com
01952 205362

Regards.


Note: This is an automated response to your message  "Re: [Nagios-users]
Plugin exists?" sent on 21/12/2012 14:56:15.

This is the only notification you will receive while this person is away.

________________________________________
Peter Shankland
TECHNICAL SPECIALIST
IT DEPARTMENT

Ricoh UK Products Limited
Priorslee
Telford, TF2 9NS
UK
Tel: +44 (0) 1952 290090
DD:+44 (0) 1952 205160
F:+44 (0) 1952 213100
M:+44 (0) 7919 444077
Peter.Shankland at ricoh-rpl.com


From sda at torand.org  Sun Dec 23 00:40:48 2012
From: sda at torand.org (Scott Anderson)
Date: Sat, 22 Dec 2012 18:40:48 -0500
Subject: check_mem.pl returns NRPE: Unable to read output
Message-ID: <CAB5aYzRadER0r-K1HBB8dpj1J1SLkJKNjoF72fq0ZHN98pBXFQ@mail.gmail.com>

I am using the check_mem.pl script created by Justin Ellison. I can run it
on my 64-bit box as the user nagios without a problem. My nrpe.cfg line
looks like this:

command[check_mem]=/usr/lib64/nagios/plugins/check_mem.pl -f -C -w 20 -c 10

but on the Nagios page I get "NRPE: Unable to read output".

I cannot figure out why, since as the nagios user on the box I get
this output:

outland /etc/nagios $sudo su - nagios
-sh-4.1$ /usr/lib64/nagios/plugins/check_mem.pl -f -C -w 20 -c 10
OK - 89.9% (3292024 kB) free.|TOTAL=3663148KB;;;; USED=371124KB;;;; FREE=3292024KB;;;; CACHES=1043004KB;;;;

So it works standalone, but not within NRPE/Nagios.

From Eliot.Picken at wenaas.co.uk  Sun Dec 23 01:22:57 2012
From: Eliot.Picken at wenaas.co.uk (Eliot.Picken at wenaas.co.uk)
Date: Sun, 23 Dec 2012 00:22:57 +0000
Subject: AUTO: Eliot Picken is out of the office (returning
	31/12/2012)
Message-ID: <OF54524CFC.507B4A25-ON80257ADD.00021A1C-80257ADD.00021A1C@kwintet.com>


I am out of the office until 31/12/2012.

I am currently out of the office, on annual leave.

Your email has not been forwarded.


Note: This is an automated response to your message  "[Nagios-users]
check_mem.pl returns NRPE: Unable to read output" sent on 22/12/2012
23:40:48.

This is the only notification you will receive while this person is away.

From zarrelli at linux.it  Sun Dec 23 01:54:31 2012
From: zarrelli at linux.it (Giorgio Zarrelli)
Date: Sun, 23 Dec 2012 01:54:31 +0100
Subject: check_mem.pl returns NRPE: Unable to read output
Message-ID: <ge7ul82ddwh9l1mtcs9l1jy1.1356224010329@email.android.com>

Just to check whether it's a permissions problem, chmod 777 the plugin.
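
A less drastic way to test the same idea is to look at the plugin's execute bits before changing them. The Python sketch below is only an illustration: the function name `execute_bits` and the example path in the comment are assumptions, not something taken from this thread.

```python
import os
import stat


def execute_bits(path):
    """Report which permission classes may execute the file at `path`."""
    mode = os.stat(path).st_mode
    return {
        "owner": bool(mode & stat.S_IXUSR),
        "group": bool(mode & stat.S_IXGRP),
        "other": bool(mode & stat.S_IXOTH),
    }


# Hypothetical usage; the real plugin location depends on the install:
# print(execute_bits("/usr/local/nagios/libexec/check_mem.pl"))
```

If the class that the NRPE daemon's user falls into shows False, permissions are a plausible cause; if all three are already True, chmod 777 is unlikely to change anything.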

Scott Anderson <sda at torand.org> wrote:


From eoindickson at gmail.com  Mon Dec 31 13:24:16 2012
From: eoindickson at gmail.com (Eoin Dickson)
Date: Mon, 31 Dec 2012 12:24:16 +0000
Subject: Nagios Looking Glass - authentication
Message-ID: <CALwqRCLvyfnNE8_=UnMjQChpyaxiHJheC2DPhzb9UjREOsmCrQ@mail.gmail.com>

Hi,
I have set up Nagios Looking Glass (v110_b2) and it seems to work ok
except that it still asks for a username/password even though I have
specified the nagiosadmin username and password in
client/s3_config_stub.inc.php and server/sync-files/s3_config.inc.php.
When I go to the URL http://my_nagios_server/nagios/client I still
have to enter the username/password. The whole point of setting this up
was to give a "dashboard" view without having to authenticate.

Thanks for any help.
