From AdcockJ at leoncountyfl.gov Sun Oct 2 02:14:57 2011
From: AdcockJ at leoncountyfl.gov (Jon Adcock)
Date: Sat, 1 Oct 2011 20:14:57 -0400
Subject: Nagios - Check for Updates
Message-ID: <4E8774C1.D962.0075.0@leoncountyfl.gov>

This feature (check for updates) does not appear to be working for me. When 3.3.1 came out, I waited for a week, and never saw the update available banner displayed on the Nagios main landing page (main.php). I've been playing with the main.php page to get it to display the date/time of the last update check, and it returns blank (no value), so I'm assuming that to mean it's not actually checking.

I am currently running Nagios 3.3.1 on the Novell SLES v10 server (Nagios compiled from source). Can anyone give me some troubleshooting steps to get me started? For example, is there a way to enable logging of the check for updates feature, and is there a way to manually start the update check (the API, not the web page URL link)?

Jon Adcock
Network Systems Administrator
Leon County MIS
301 S. Monroe St.
Tallahassee, FL 32301
Office: (850) 606-5518
adcockj at leoncountyfl.gov

------------------------------------------------------------------------------
All of the data generated in your IT infrastructure is seriously valuable.
Why? It contains a definitive record of application performance, security
threats, fraudulent activity, and more. Splunk takes this data and makes
sense of it. IT sense. And common sense.
http://p.sf.net/sfu/splunk-d2dcopy2
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null

From stuart.browne at ausregistry.com.au Mon Oct 3 05:36:06 2011
From: stuart.browne at ausregistry.com.au (Stuart Browne)
Date: Mon, 3 Oct 2011 14:36:06 +1100
Subject: Average Check latency and execution time growth - 3.2.3
Message-ID: <8CEF048B9EC83748B1517DC64EA130FB609A2605B8@off-win2003-01.ausregistrygroup.local>

Hi,

I know this topic has been covered many times, but I've tried those tweaks and the issue remains.

After a few days, the latency on checks explodes. It goes along quite happily with small values, then after (about) 3 days, the values rise quite sharply.

I've recently been graphing performance statistics (nagiostats, mrtg) and as you can see from the two attachments (day, week), it's rather surprising.

We restart Nagios every few days (for other reasons) so thankfully the issue never gets completely out of control, but as you can see, it gets a bit crazy.

I can't think of any combination of settings that would cause such growth after such a long period of time. Does anybody have any knowledge as to why it would suddenly increase after running for days without issue?
Basic Nagios system stats:

2 x dual-core Xeon 5160 (3 GHz)
6GB Memory
4 x SAS, RAID1 (hardware, BBU, LVM over RAID1)
RHEL5, fully patched
Load average between 0.5 and 3.2

'nagios -s /etc/nagios/nagios.cfg' output (trimmed):

HOST SCHEDULING INFORMATION
---------------------------
Total hosts: 252
Total scheduled hosts: 252
Host inter-check delay method: SMART
Average host check interval: 300.00 sec
Host inter-check delay: 1.19 sec
Max host check spread: 30 min
First scheduled check: Mon Oct 3 14:31:17 2011
Last scheduled check: Mon Oct 3 14:36:15 2011

SERVICE SCHEDULING INFORMATION
-------------------------------
Total services: 1575
Total scheduled services: 1386
Service inter-check delay method: SMART
Average service check interval: 878.40 sec
Inter-check delay: 0.63 sec
Interleave factor method: SMART
Average services per host: 6.25
Service interleave factor: 6
Max service check spread: 30 min
First scheduled check: Mon Oct 3 14:33:43 2011
Last scheduled check: Mon Oct 3 14:48:21 2011

CHECK PROCESSING INFORMATION
----------------------------
Check result reaper interval: 5 sec
Max concurrent service checks: Unlimited

PERFORMANCE SUGGESTIONS
-----------------------
I have no suggestions - things look okay.

Stuart J. Browne
Senior Linux Administrator

-------------- next part --------------
A non-text attachment was scrubbed...
Name: nagios-a-day[1].png
Type: image/png
Size: 2551 bytes
Desc: nagios-a-day[1].png
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: nagios-a-week[1].png
Type: image/png
Size: 2368 bytes
Desc: nagios-a-week[1].png
URL:

------------------------------------------------------------------------------
All the data continuously generated in your IT infrastructure contains a
definitive record of customers, application performance, security
threats, fraudulent activity and more. Splunk takes this data and makes
sense of it. Business sense. IT sense. Common sense.
http://p.sf.net/sfu/splunk-d2dcopy1

From mad at b-care.net Mon Oct 3 08:44:56 2011
From: mad at b-care.net (Marc-André Doll)
Date: Mon, 03 Oct 2011 08:44:56 +0200
Subject: Certificate problems with check_ldap
In-Reply-To: <481090375.233622.1317407948324.JavaMail.root@sz0051a.emeryville.ca.mail.comcast.net>
References: <481090375.233622.1317407948324.JavaMail.root@sz0051a.emeryville.ca.mail.comcast.net>
Message-ID: <1317624296.1588.4.camel@MADness>

Hi,

I had this problem once. You have to get your root CA certificate and copy it to the default CA certificates directory on your Nagios server (on RedHat it is /etc/openldap/cacerts), or copy it wherever you want and add the line "TLS_CACERT /path/to/my/root/CA.pem" to your openldap configuration file.

It solved my problem.

Marc-André

On Fri, 2011-09-30 at 18:39 +0000, f.hugh at comcast.net wrote:
> I have been able to get check_ldap to work fine over the clear on port
> 389. When I try to use ssl 636 it fails. It can't verify the cert
> since it is our own CA and not a commercial CA that signed the cert.
>
> This is the error I get:
>
> ldap_bind: Can't contact LDAP server (-1)
> additional info: error:14090086:SSL
> routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
> Could not bind to the LDAP server
>
>
> I am certain that it is the trust of the cert that is the problem. I
> have googled this for half the day looking for the method to insert
> our Root CA as trusted, but have had no luck. Anyone been able to
> accomplish this? Think of it as a self signed cert installed on our
> AD domain controllers.
>
> -paul

From cyborg9799 at gmail.com Tue Oct 4 14:44:33 2011
From: cyborg9799 at gmail.com (Mark Thomas)
Date: Tue, 4 Oct 2011 08:44:33 -0400
Subject: check_snmp_storage.pl ver 1.3.3 ERROR: General time-out (Alarm signal)
Message-ID:

This script was running fine until we moved to another, more powerful machine.
Now all alerts for my Windows machines are failing with the above error.
My Unix machines are fine. I have not implemented the alert for Linux yet.
My other SNMP alerts (that check Windows services) are working fine.

Nagios 3.3.1
Ubuntu 10
2.6.35-30-server
64 bit

Thanks for any help

--
Thomas*

From mguthrie at nagios.com Tue Oct 4 16:33:00 2011
From: mguthrie at nagios.com (Mike Guthrie)
Date: Tue, 04 Oct 2011 09:33:00 -0500
Subject: Nagios - Check for Updates
In-Reply-To: <4E8774C1.D962.0075.0@leoncountyfl.gov>
References: <4E8774C1.D962.0075.0@leoncountyfl.gov>
Message-ID: <4E8B191C.3030405@nagios.com>

Jon Adcock wrote:
> This feature (check for updates) does not appear to be working for
> me. When 3.3.1 came out, I waited for a week, and never saw the
> update available banner displayed on the Nagios main landing page
> (main.php). I've been playing with the main.php page to get it to
> display the date/time of the last update check, and it returns blank
> (no value), so I'm assuming that to mean it's not actually checking.

Is your system running behind a proxy? If so, I'm wondering if the update check is attempting to check against the later version API directly...

>
> I am currently running Nagios 3.3.1 on the Novell SLES v10 server
> (Nagios compiled from source). Can anyone give me some
> troubleshooting steps to get me started?
> For example, is there a way
> to enable logging of the check for updates feature, and is there a way
> to manually start the update check (the API, not the web page URL link)?

--
Mike Guthrie
Technical Team
___
Nagios Enterprises, LLC
Email: mguthrie at nagios.com
Web: www.nagios.com

From AdcockJ at leoncountyfl.gov Tue Oct 4 17:08:13 2011
From: AdcockJ at leoncountyfl.gov (Jon Adcock)
Date: Tue, 4 Oct 2011 11:08:13 -0400
Subject: Nagios - Check for Updates
In-Reply-To: <4E8B191C.3030405@nagios.com>
References: <4E8774C1.D962.0075.0@leoncountyfl.gov> <4E8B191C.3030405@nagios.com>
Message-ID: <4E8AE91D.D962.0075.0@leoncountyfl.gov>

Mike,

No, my Nagios servers are not behind a proxy. Is there some way to force the API to check now, and is there a way to enable debug logging for the check for updates?

Jon Adcock
Network Systems Administrator
Leon County MIS
301 S. Monroe St.
Tallahassee, FL 32301
Office: (850) 606-5518
adcockj at leoncountyfl.gov

From AdcockJ at leoncountyfl.gov Tue Oct 4 17:09:37 2011
From: AdcockJ at leoncountyfl.gov (Jon Adcock)
Date: Tue, 4 Oct 2011 11:09:37 -0400
Subject: Nagios - Check for Updates
In-Reply-To: <4E8B191C.3030405@nagios.com>
References: <4E8774C1.D962.0075.0@leoncountyfl.gov> <4E8B191C.3030405@nagios.com>
Message-ID: <4E8AE971.D962.0075.0@leoncountyfl.gov>

Jon Adcock
Network Systems Administrator
Leon County MIS
301 S. Monroe St.
Tallahassee, FL 32301
Office: (850) 606-5518
adcockj at leoncountyfl.gov

From michael.friedrich at univie.ac.at Tue Oct 4 17:32:06 2011
From: michael.friedrich at univie.ac.at (Michael Friedrich)
Date: Tue, 04 Oct 2011 17:32:06 +0200
Subject: Nagios - Check for Updates
In-Reply-To: <4E8774C1.D962.0075.0@leoncountyfl.gov>
References: <4E8774C1.D962.0075.0@leoncountyfl.gov>
Message-ID: <4E8B26F6.8070102@univie.ac.at>

On 02.10.2011 02:14, Jon Adcock wrote:
> This feature (check for updates) does not appear to be working for
> me. When 3.3.1 came out, I waited for a week, and never saw the
> update available banner displayed on the Nagios main landing page
> (main.php). I've been playing with the main.php page to get it to
> display the date/time of the last update check, and it returns blank
> (no value), so I'm assuming that to mean it's not actually checking.
> I am currently running Nagios 3.3.1 on the Novell SLES v10 server
> (Nagios compiled from source). Can anyone give me some
> troubleshooting steps to get me started? For example, is there a way
> to enable logging of the check for updates feature, and is there a way
> to manually start the update check (the API, not the web page URL link)?

the core schedules call-home events and saves the version information and various other attributes in both status.dat and retention.dat. the cgis (or the php part of it) parse that information into a readable form for the web.

http://nagios.svn.sourceforge.net/viewvc/nagios/nagioscore/trunk/include/nagios.h?revision=1786&view=markup
line 129ff
http://nagios.svn.sourceforge.net/viewvc/nagios/nagioscore/trunk/base/utils.c?revision=1797&view=markup
line 3724ff
http://nagios.svn.sourceforge.net/viewvc/nagios/nagioscore/trunk/xdata/xsddefault.c?revision=1793&view=markup
line 405ff
http://nagios.svn.sourceforge.net/viewvc/nagios/nagioscore/trunk/xdata/xrddefault.c?revision=1787&view=markup
line 317ff
http://nagios.svn.sourceforge.net/viewvc/nagios/nagioscore/trunk/html/includes/utils.inc.php?revision=1242

so i would guess if you don't allow the nagios core to phone home, it won't show an updated version. the scheduling cycle is somewhere around 22 hours; changing it is only possible if you recompile.

--
DI (FH) Michael Friedrich

Vienna University Computer Center
Universitaetsstrasse 7 A-1010 Vienna, Austria

email: michael.friedrich at univie.ac.at
phone: +43 1 4277 14359
mobile: +43 664 60277 14359
fax: +43 1 4277 14338
web: http://www.univie.ac.at/zid
     http://www.aco.net

Icinga Core & IDOUtils Developer
http://www.icinga.org

From mad at b-care.net Tue Oct 4 17:50:51 2011
From: mad at b-care.net (Marc-André Doll)
Date: Tue, 04 Oct 2011 17:50:51 +0200
Subject: check_snmp_storage.pl ver 1.3.3 ERROR: General time-out (Alarm signal)
In-Reply-To:
References:
Message-ID: <1317743451.1656.2.camel@MADness>

Hi,

If you moved to a new server, check whether UDP 161 is open from your monitoring server to your monitored ones, and whether the SNMP configuration on your monitored servers allows queries from your Nagios server.

Marc-André
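
A minimal sketch of the reachability test suggested above, in Python with only the standard library. It hand-encodes an SNMPv1 GetRequest for sysDescr.0 and reports whether any reply comes back from UDP port 161. The community string "public", the five-second timeout, and the host name in the usage comment are assumptions for illustration and do not come from this thread; the plugin itself may be configured for a different SNMP version.

```python
import socket

# sysDescr.0 (1.3.6.1.2.1.1.1.0) in BER form: 1.3 collapses to 0x2B.
SYS_DESCR_OID = bytes([0x2B, 6, 1, 2, 1, 1, 1, 0])


def tlv(tag: int, payload: bytes) -> bytes:
    """Encode one BER tag-length-value triple (short-form length only)."""
    if len(payload) > 127:
        raise ValueError("payload too long for this sketch")
    return bytes([tag, len(payload)]) + payload


def build_get_request(community: str, request_id: int = 1) -> bytes:
    """Build an SNMPv1 GetRequest packet asking for sysDescr.0."""
    if not 0 <= request_id <= 127:
        raise ValueError("request_id must fit in one positive byte here")
    varbind = tlv(0x30, tlv(0x06, SYS_DESCR_OID) + tlv(0x05, b""))
    pdu = tlv(
        0xA0,                                  # GetRequest-PDU
        tlv(0x02, bytes([request_id]))         # request-id
        + tlv(0x02, b"\x00")                   # error-status
        + tlv(0x02, b"\x00")                   # error-index
        + tlv(0x30, varbind),                  # variable bindings
    )
    return tlv(
        0x30,
        tlv(0x02, b"\x00")                     # version 0 means SNMPv1
        + tlv(0x04, community.encode("ascii"))
        + pdu,
    )


def snmp_answers(host: str, community: str = "public", timeout: float = 5.0) -> bool:
    """Return True if the host sends any UDP reply to the GetRequest."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_get_request(community), (host, 161))
            sock.recvfrom(4096)
            return True
        except OSError:  # covers timeouts, refused ports, and lookup errors
            return False


# Hypothetical usage, run from the new Nagios server:
#   snmp_answers("winhost.example.org")
```

A False result cannot tell a blocked port apart from an agent that ignores the new server's address or community string, since both show up as a timeout; it only narrows the problem to the path between the two machines.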

From AdcockJ at leoncountyfl.gov Tue Oct 4 19:33:48 2011
From: AdcockJ at leoncountyfl.gov (Jon Adcock)
Date: Tue, 4 Oct 2011 13:33:48 -0400
Subject: Nagios - Check for Updates
In-Reply-To: <4E8B26F6.8070102@univie.ac.at>
References: <4E8774C1.D962.0075.0@leoncountyfl.gov> <4E8B26F6.8070102@univie.ac.at>
Message-ID: <4E8B0B3C.D962.0075.0@leoncountyfl.gov>

Michael,

Very helpful. The last link gave me something to latch onto. Here is the top of my status.dat file:

info {
    created=1317748445
    version=3.3.1
    last_update_check=1317680163
    update_available=0
    last_version=3.3.1
    new_version=3.3.1
    }

So it appears that Nagios is checking just fine. So my problem appears to be in main.php, which never did display the "update available" banner when I was running version 3.2.3 (and 3.3.1 had been out for a week). Any ideas?
Jon Adcock Network Systems Administrator Leon County MIS 301 S. Monroe St. Tallahassee, FL 32301 Office: (850) 606-5518 adcockj at leoncountyfl.gov >>> On 10/4/2011 at 11:32 AM, Michael Friedrich wrote: On 02.10.2011 02:14, Jon Adcock wrote: This feature (check for updates) does not appear to be working for me. When 3.3.1 came out, I waited for a week, and never saw the update available banner displayed on the Nagios main landing page (main.php). I've been playing with the main.php page to get it to display the date/time of the last update check, and it returns blank (no value), so I'm assuming that too mean it's not actually checking. I am currently running Nagios 3.3.1 on the Novell SLES v10 server (Nagios compiled from source). Can anyone give me some troubleshooting steps to get me started? For example, is there a way to enable logging of the check for updates feature, and is there a way to manually start the update check (the API, not the web page URL link)? the core is scheduling call home events and saves the version information and various other attributes in both status.dat and retention.dat. the cgis (or the php part of it) parse that information into a readable version onto the web. http://nagios.svn.sourceforge.net/viewvc/nagios/nagioscore/trunk/include/nagios.h?revision=1786&view=markup line 129ff http://nagios.svn.sourceforge.net/viewvc/nagios/nagioscore/trunk/base/utils.c?revision=1797&view=markup line 3724ff http://nagios.svn.sourceforge.net/viewvc/nagios/nagioscore/trunk/xdata/xsddefault.c?revision=1793&view=markup 405ff http://nagios.svn.sourceforge.net/viewvc/nagios/nagioscore/trunk/xdata/xrddefault.c?revision=1787&view=markup 317ff http://nagios.svn.sourceforge.net/viewvc/nagios/nagioscore/trunk/html/includes/utils.inc.php?revision=1242 so i would guess if you don't allow the nagios core to phone home, it won't show an updated version. scheduling cycle is somewhere around 22 hours, changing is only possible if you recompile. 
Jon Adcock Network Systems Administrator Leon County MIS 301 S. Monroe St. Tallahassee, FL 32301 Office: (850) 606-5518 adcockj at leoncountyfl.gov ------------------------------------------------------------------------------ All of the data generated in your IT infrastructure is seriously valuable. Why? It contains a definitive record of application performance, security threats, fraudulent activity, and more. Splunk takes this data and makes sense of it. IT sense. And common sense. http://p.sf.net/sfu/splunk-d2dcopy2 _______________________________________________ Nagios-users mailing list Nagios-users at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. ::: Messages without supporting info will risk being sent to /dev/null -- DI (FH) Michael Friedrich Vienna University Computer Center Universitaetsstrasse 7 A-1010 Vienna, Austria email: michael.friedrich at univie.ac.at phone: +43 1 4277 14359 mobile: +43 664 60277 14359 fax: +43 1 4277 14338 web: http://www.univie.ac.at/zid http://www.aco.net Icinga Core & IDOUtils Developer http://www.icinga.org -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpg Size: 2261 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpg Size: 2261 bytes Desc: not available URL: -------------- next part -------------- ------------------------------------------------------------------------------ All the data continuously generated in your IT infrastructure contains a definitive record of customers, application performance, security threats, fraudulent activity and more. Splunk takes this data and makes sense of it. Business sense. IT sense. Common sense. 
From michael.friedrich at univie.ac.at Tue Oct 4 19:47:48 2011
From: michael.friedrich at univie.ac.at (Michael Friedrich)
Date: Tue, 04 Oct 2011 19:47:48 +0200
Subject: Nagios - Check for Updates
In-Reply-To: <4E8B0B3C.D962.0075.0@leoncountyfl.gov>
References: <4E8774C1.D962.0075.0@leoncountyfl.gov> <4E8B26F6.8070102@univie.ac.at> <4E8B0B3C.D962.0075.0@leoncountyfl.gov>
Message-ID: <4E8B46C4.8050209@univie.ac.at>

On 04.10.2011 19:33, Jon Adcock wrote:
> Michael,
> Very helpful. The last link gave me something to latch onto. Here
> is the top of my status.dat file:
>
> info {
> created=1317748445
> version=3.3.1
> last_update_check=1317680163
> update_available=0
> last_version=3.3.1
> new_version=3.3.1
> }
> So it appears that Nagios is checking just fine. So my problem
> appears to be in main.php, which never did display the "update
> available" banner when I was running version 3.2.3 (and 3.3.1 had
> been out for a week). Any ideas?

that'll be a bug report for nagios developers then.

the only support from my side on that functionality - you can have a patchset which completely removes the home calling functionality from both, core and gui and re-adds the default look on the tac.cgi - but i don't think that you want that ;-))

>
> /**/
> /*Florida's Capital County*/
> /*Jon Adcock*/
> Network Systems Administrator
> Leon County MIS
> 301 S. Monroe St.
> Tallahassee, FL 32301 > Office: (850) 606-5518 > adcockj at leoncountyfl.gov > >>> On 10/4/2011 at 11:32 AM, Michael Friedrich > wrote: > On 02.10.2011 02:14, Jon Adcock wrote: >> This feature (check for updates) does not appear to be working for >> me. When 3.3.1 came out, I waited for a week, and never saw the >> update available banner displayed on the Nagios main landing page >> (main.php). I've been playing with the main.php page to get it to >> display the date/time of the last update check, and it returns blank >> (no value), so I'm assuming that too mean it's not actually checking. >> I am currently running Nagios 3.3.1 on the Novell SLES v10 server >> (Nagios compiled from source). Can anyone give me some >> troubleshooting steps to get me started? For example, is there a way >> to enable logging of the check for updates feature, and is there a >> way to manually start the update check (the API, not the web page URL >> link)? > > the core is scheduling call home events and saves the version > information and various other attributes in both status.dat and > retention.dat. the cgis (or the php part of it) parse that information > into a readable version onto the web. > > http://nagios.svn.sourceforge.net/viewvc/nagios/nagioscore/trunk/include/nagios.h?revision=1786&view=markup > line 129ff > http://nagios.svn.sourceforge.net/viewvc/nagios/nagioscore/trunk/base/utils.c?revision=1797&view=markup > line 3724ff > http://nagios.svn.sourceforge.net/viewvc/nagios/nagioscore/trunk/xdata/xsddefault.c?revision=1793&view=markup > 405ff > http://nagios.svn.sourceforge.net/viewvc/nagios/nagioscore/trunk/xdata/xrddefault.c?revision=1787&view=markup > 317ff > http://nagios.svn.sourceforge.net/viewvc/nagios/nagioscore/trunk/html/includes/utils.inc.php?revision=1242 > > so i would guess if you don't allow the nagios core to phone home, it > won't show an updated version. scheduling cycle is somewhere around 22 > hours, changing is only possible if you recompile. 
> > > >> /**/ >> /*Florida's Capital County*/ >> /*Jon Adcock*/ >> Network Systems Administrator >> Leon County MIS >> 301 S. Monroe St. >> Tallahassee, FL 32301 >> Office: (850) 606-5518 >> adcockj at leoncountyfl.gov >> >> >> ------------------------------------------------------------------------------ >> All of the data generated in your IT infrastructure is seriously valuable. >> Why? It contains a definitive record of application performance, security >> threats, fraudulent activity, and more. Splunk takes this data and makes >> sense of it. IT sense. And common sense. >> http://p.sf.net/sfu/splunk-d2dcopy2 >> >> >> _______________________________________________ >> Nagios-users mailing list >> Nagios-users at lists.sourceforge.net >> https://lists.sourceforge.net/lists/listinfo/nagios-users >> ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. >> ::: Messages without supporting info will risk being sent to /dev/null > > > -- > DI (FH) Michael Friedrich > > Vienna University Computer Center > Universitaetsstrasse 7 A-1010 Vienna, Austria > > email:michael.friedrich at univie.ac.at > phone: +43 1 4277 14359 > mobile: +43 664 60277 14359 > fax: +43 1 4277 14338 > web:http://www.univie.ac.at/zid > http://www.aco.net > > Icinga Core& IDOUtils Developer > http://www.icinga.org > > > > ------------------------------------------------------------------------------ > All the data continuously generated in your IT infrastructure contains a > definitive record of customers, application performance, security > threats, fraudulent activity and more. Splunk takes this data and makes > sense of it. Business sense. IT sense. Common sense. 
> http://p.sf.net/sfu/splunk-d2dcopy1 > > > _______________________________________________ > Nagios-users mailing list > Nagios-users at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/nagios-users > ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. > ::: Messages without supporting info will risk being sent to /dev/null -- DI (FH) Michael Friedrich Vienna University Computer Center Universitaetsstrasse 7 A-1010 Vienna, Austria email: michael.friedrich at univie.ac.at phone: +43 1 4277 14359 mobile: +43 664 60277 14359 fax: +43 1 4277 14338 web: http://www.univie.ac.at/zid http://www.aco.net Icinga Core& IDOUtils Developer http://www.icinga.org -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpg Size: 2261 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpg Size: 2261 bytes Desc: not available URL: -------------- next part -------------- ------------------------------------------------------------------------------ All the data continuously generated in your IT infrastructure contains a definitive record of customers, application performance, security threats, fraudulent activity and more. Splunk takes this data and makes sense of it. Business sense. IT sense. Common sense. http://p.sf.net/sfu/splunk-d2dcopy1 -------------- next part -------------- _______________________________________________ Nagios-users mailing list Nagios-users at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
::: Messages without supporting info will risk being sent to /dev/null From cyborg9799 at gmail.com Tue Oct 4 20:42:42 2011 From: cyborg9799 at gmail.com (Mark Thomas) Date: Tue, 4 Oct 2011 14:42:42 -0400 Subject: check_snmp_storage.pl ver 1.3.3 ERROR: General time-out (Alarm signal) In-Reply-To: <1317743451.1656.2.camel@MADness> References: <1317743451.1656.2.camel@MADness> Message-ID: Marc-Andre, Hey thanks for the reply. New server has same name as old. Old is out of production. I see 161snmp is in /etc/services both tcp and udp. My other windows alerts for windows services checks use snmp on the same server and monitored windows server and they are all working fine. I forgot to mention. The command works fine from command line: /check_snmp_storage.pl -H emerxxxxx -C xxxxxx_public -m C: -q FixedDisk -w 85% -c 90% C:\ Label: Serial Number 4485efbd: 30%used(21314MB/69966MB) (<85%) : OK Maybe I should try rebuilding the command from scratch and test it on a single server. Mark On Tue, Oct 4, 2011 at 11:50 AM, Marc-Andr? Doll wrote: > Hi, > > If you moved to a new server, checks if UDP 161 is open from your > monitoring server to your monitored ones and if the snmp configuration > on your monitored servers allows queries from your Nagios server. > > Marc-Andr? > > On Tue, 2011-10-04 at 08:44 -0400, Mark Thomas wrote: > > this script was running fine until we moved to another machine more > powerful. > > Now all alerts of my windows machines are failing. with the above error. > > My unix machines are fine. I have not implemented the alert for Linux > yet. > > My other snmp alerts (that check windows services are working fine. 
> > > > Nagios 3.3.1 > > Ubuntu 10 > > 2.6.35-30-server > > 64 bit > > > > Thanks for any help > > > > > ------------------------------------------------------------------------------ > All the data continuously generated in your IT infrastructure contains a > definitive record of customers, application performance, security > threats, fraudulent activity and more. Splunk takes this data and makes > sense of it. Business sense. IT sense. Common sense. > http://p.sf.net/sfu/splunk-d2dcopy1 > _______________________________________________ > Nagios-users mailing list > Nagios-users at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/nagios-users > ::: Please include Nagios version, plugin version (-v) and OS when > reporting any issue. > ::: Messages without supporting info will risk being sent to /dev/null -- *Mark Thomas* -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- ------------------------------------------------------------------------------ All the data continuously generated in your IT infrastructure contains a definitive record of customers, application performance, security threats, fraudulent activity and more. Splunk takes this data and makes sense of it. Business sense. IT sense. Common sense. http://p.sf.net/sfu/splunk-d2dcopy1 -------------- next part -------------- _______________________________________________ Nagios-users mailing list Nagios-users at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. ::: Messages without supporting info will risk being sent to /dev/null From rlh1533 at gmail.com Wed Oct 5 15:13:42 2011 From: rlh1533 at gmail.com (R. Leigh Hennig) Date: Wed, 5 Oct 2011 09:13:42 -0400 Subject: How can I change Nagios/NRPE log location? 
Message-ID: On my remote hosts, /var/log/messages is filling up with messages like this: Sep 26 06:33:53 xinetd[13362]: EXIT: nrpe status=0 pid=8099 duration=0(sec) Sep 26 06:34:01 xinetd[13362]: START: nrpe pid=8105 from= Sep 26 06:34:01 xinetd[13362]: EXIT: nrpe status=0 pid=8105 duration=0(sec) Sep 26 06:34:57 xinetd[13362]: START: nrpe pid=8113 from= How can I make it so that Nagios/NRPE throws these in a different file, and not just /var/log/messages? Thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- ------------------------------------------------------------------------------ All the data continuously generated in your IT infrastructure contains a definitive record of customers, application performance, security threats, fraudulent activity and more. Splunk takes this data and makes sense of it. Business sense. IT sense. Common sense. http://p.sf.net/sfu/splunk-d2dcopy1 -------------- next part -------------- _______________________________________________ Nagios-users mailing list Nagios-users at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. ::: Messages without supporting info will risk being sent to /dev/null From rosenski at wave-computer.de Wed Oct 5 15:41:26 2011 From: rosenski at wave-computer.de (Axel Rosenski) Date: Wed, 5 Oct 2011 15:41:26 +0200 Subject: How can I change Nagios/NRPE log location? In-Reply-To: References: Message-ID: <201110051541.26733.rosenski@wave-computer.de> Hi R., Am Mittwoch 05 Oktober 2011, 15:13:42 schrieb R. 
Leigh Hennig:
> On my remote hosts, /var/log/messages is filling up with messages like
> this:
>
> Sep 26 06:33:53 xinetd[13362]: EXIT: nrpe status=0 pid=8099 duration=0(sec)
> Sep 26 06:34:01 xinetd[13362]: START: nrpe pid=8105 from=
> Sep 26 06:34:01 xinetd[13362]: EXIT: nrpe status=0 pid=8105 duration=0(sec)
> Sep 26 06:34:57 xinetd[13362]: START: nrpe pid=8113 from=
>
> How can I make it so that Nagios/NRPE throws these in a different file, and
> not just /var/log/messages?

You have to configure your logserver.

Regards,
Axel

--
Axel Rosenski
- Administration -
______________________________
Wave Computersysteme GmbH
Philipp-Reis-Str. 1-3 / 9
35440 Linden
Geschäftsführer: Carsten Kellmann
Registergericht Gießen HRB 1823
Tel.: +49 (0)6403 / 9050 8317
Fax: +49 (0)6403 / 9050 5089
mailto:rosenski at wave-computer.de
http://www.wave-computer.de

From Schimpke.Thomas at bhn-services.com Wed Oct 5 15:51:18 2011
From: Schimpke.Thomas at bhn-services.com (Schimpke, Dr. Thomas - bhn)
Date: Wed, 5 Oct 2011 15:51:18 +0200
Subject: How can I change Nagios/NRPE log location?
In-Reply-To:
References:
Message-ID: <4E8C60D6.2050809@bhn-services.com>

These aren't messages from nrpe but from xinetd. You should set the log_type parameter in nrpe's config file for xinetd.
Either use SYSLOG with facility local0 and configure syslog to log local0 to a file .../nrpe.log, or use FILE as the log_type parameter, followed by the full path to the desired logfile.

Check out the xinetd.conf man page for more details.

Since you poll nrpe quite often it may be better to run nrpe as a daemon (nrpe -d ...) anyway to avoid the start overhead.

Thomas

On 10/05/2011 03:13 PM, R. Leigh Hennig wrote:
On my remote hosts, /var/log/messages is filling up with messages like this:

Sep 26 06:33:53 xinetd[13362]: EXIT: nrpe status=0 pid=8099 duration=0(sec)
Sep 26 06:34:01 xinetd[13362]: START: nrpe pid=8105 from=
Sep 26 06:34:01 xinetd[13362]: EXIT: nrpe status=0 pid=8105 duration=0(sec)
Sep 26 06:34:57 xinetd[13362]: START: nrpe pid=8113 from=

How can I make it so that Nagios/NRPE throws these in a different file, and not just /var/log/messages?

Thanks

From komodo at uvt.cz Wed Oct 5 15:43:37 2011
From: komodo at uvt.cz (komodo)
Date: Wed, 5 Oct 2011 15:43:37 +0200
Subject: How can I change Nagios/NRPE log location?
In-Reply-To:
References:
Message-ID: <201110051543.37976.komodo@uvt.cz>

Hi,

you can change

log_facility=daemon

to something like

log_facility=local4

and add this to your syslog.conf, for example:

local4.* /var/log/nrpe.log

Best regards
Martin

On Wednesday 05 October 2011 15:13:42 R. Leigh Hennig wrote:
> On my remote hosts, /var/log/messages is filling up with messages like
> this:
>
> Sep 26 06:33:53 xinetd[13362]: EXIT: nrpe status=0 pid=8099 duration=0(sec)
> Sep 26 06:34:01 xinetd[13362]: START: nrpe pid=8105 from=
> Sep 26 06:34:01 xinetd[13362]: EXIT: nrpe status=0 pid=8105 duration=0(sec)
> Sep 26 06:34:57 xinetd[13362]: START: nrpe pid=8113 from=
>
> How can I make it so that Nagios/NRPE throws these in a different file, and
> not just /var/log/messages?
>
> Thanks

From rlh1533 at gmail.com Wed Oct 5 16:24:26 2011
From: rlh1533 at gmail.com (R. Leigh Hennig)
Date: Wed, 5 Oct 2011 10:24:26 -0400
Subject: How can I change Nagios/NRPE log location?
In-Reply-To: <4E8C60D6.2050809@bhn-services.com>
References: <4E8C60D6.2050809@bhn-services.com>
Message-ID:

I've made the configuration change - now I'm guessing I need to restart the NRPE daemon to read in the changed config file. How do I restart NRPE?
I want it to run as a daemon, and I believe that it is...there's an nrpe file in /etc/xinte.d/...I already restarted xinted after I made the log file change there... On Wed, Oct 5, 2011 at 9:51 AM, Schimpke, Dr. Thomas - bhn < Schimpke.Thomas at bhn-services.com> wrote: > These aren't messages from nrpe but from xinetd. You should set the > log_type parameter in nrpe's config fiule for the xinetd. Either use SYSLOG > with facility local0 and configure syslog to log local0 to a file > .../nrpe.log or use > FILE as a parameter for the log_type and and the full path to the desired > logfile. > > Check out the xinetd.conf man page for more details. > > Since you poll nrpe quite often it may be better to run nrpe as a daemon > (nrpe -d ...) anyway to avoid the start overhead. > > Thomas > > On 10/05/2011 03:13 PM, R. Leigh Hennig wrote: > On my remote hosts, /var/log/messages is filling up with messages like > this: > > Sep 26 06:33:53 xinetd[13362]: EXIT: nrpe status=0 pid=8099 > duration=0(sec) > Sep 26 06:34:01 xinetd[13362]: START: nrpe pid=8105 > from= > Sep 26 06:34:01 xinetd[13362]: EXIT: nrpe status=0 pid=8105 > duration=0(sec) > Sep 26 06:34:57 xinetd[13362]: START: nrpe pid=8113 > from= > > How can I make it so that Nagios/NRPE throws these in a different file, and > not just /var/log/messages? > > Thanks > > > > ------------------------------------------------------------------------------ > All the data continuously generated in your IT infrastructure contains a > definitive record of customers, application performance, security > threats, fraudulent activity and more. Splunk takes this data and makes > sense of it. Business sense. IT sense. Common sense. > http://p.sf.net/sfu/splunk-d2dcopy1 > _______________________________________________ > Nagios-users mailing list > Nagios-users at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/nagios-users > ::: Please include Nagios version, plugin version (-v) and OS when > reporting any issue. 
> ::: Messages without supporting info will risk being sent to /dev/null > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- ------------------------------------------------------------------------------ All the data continuously generated in your IT infrastructure contains a definitive record of customers, application performance, security threats, fraudulent activity and more. Splunk takes this data and makes sense of it. Business sense. IT sense. Common sense. http://p.sf.net/sfu/splunk-d2dcopy1 -------------- next part -------------- _______________________________________________ Nagios-users mailing list Nagios-users at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. ::: Messages without supporting info will risk being sent to /dev/null From Schimpke.Thomas at bhn-services.com Wed Oct 5 16:43:51 2011 From: Schimpke.Thomas at bhn-services.com (Schimpke, Dr. Thomas - bhn) Date: Wed, 5 Oct 2011 16:43:51 +0200 Subject: How can I change Nagios/NRPE log location? In-Reply-To: References: <4E8C60D6.2050809@bhn-services.com> Message-ID: <4E8C6D27.70601@bhn-services.com> Hi, it's not nrpe's config file - it's xinetd's config file for the nrpe service. nrpe is started by xinetd as soon as a request from the nagios server arrives. So you simply need to restart xinetd (or reload its configuration). If you still use SYSLOG (and not FILE), then you should configure the facility in /etc/syslog.conf appropriately. You need to restart/reload syslog for the change to have effect. Thomas On 10/05/2011 04:24 PM, R. Leigh Hennig wrote: I've made the configuration change - now I'm guessing I need to restart NRPE daemon to read in the changed config file. How do I restart NRPE? 
I want it to run as a daemon, and I believe that it is...there's an nrpe file in /etc/xinte.d/...I already restarted xinted after I made the log file change there... On Wed, Oct 5, 2011 at 9:51 AM, Schimpke, Dr. Thomas - bhn > wrote: These aren't messages from nrpe but from xinetd. You should set the log_type parameter in nrpe's config fiule for the xinetd. Either use SYSLOG with facility local0 and configure syslog to log local0 to a file .../nrpe.log or use FILE as a parameter for the log_type and and the full path to the desired logfile. Check out the xinetd.conf man page for more details. Since you poll nrpe quite often it may be better to run nrpe as a daemon (nrpe -d ...) anyway to avoid the start overhead. Thomas On 10/05/2011 03:13 PM, R. Leigh Hennig wrote: On my remote hosts, /var/log/messages is filling up with messages like this: Sep 26 06:33:53 xinetd[13362]: EXIT: nrpe status=0 pid=8099 duration=0(sec) Sep 26 06:34:01 xinetd[13362]: START: nrpe pid=8105 from= Sep 26 06:34:01 xinetd[13362]: EXIT: nrpe status=0 pid=8105 duration=0(sec) Sep 26 06:34:57 xinetd[13362]: START: nrpe pid=8113 from= How can I make it so that Nagios/NRPE throws these in a different file, and not just /var/log/messages? Thanks ------------------------------------------------------------------------------ All the data continuously generated in your IT infrastructure contains a definitive record of customers, application performance, security threats, fraudulent activity and more. Splunk takes this data and makes sense of it. Business sense. IT sense. Common sense. http://p.sf.net/sfu/splunk-d2dcopy1 _______________________________________________ Nagios-users mailing list Nagios-users at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
::: Messages without supporting info will risk being sent to /dev/null

From ae at op5.se Wed Oct 5 16:46:31 2011
From: ae at op5.se (Andreas Ericsson)
Date: Wed, 05 Oct 2011 16:46:31 +0200
Subject: Nagios World Conference
Message-ID: <4E8C6DC7.2050200@op5.se>

Hi all.

I attended the Nagios World Conference North America last week and thought I'd dish out some kudos where such are due, and also condense the information for any newcomers who might get lucky when looking for solutions to any particular problems.

Overall, the standard of the conference was very, very high. It was the first Nagios conference I've gone to where I learned something new. A rare occasion indeed, so many thanks to Ethan, Mary and Nagios Enterprises for arranging such a high-quality event. I won't mention their talks, since I don't want to inflate their egos too much, but check out the one on visualizations by Mike Guthrie. Pretty cool stuff :)

Much of the focus was on scaling up Nagios. mod_gearman and livestatus seem to be the best-known and most-used projects for achieving that goal. Reading status files is just too slow when viewing the UI, and a single server just doesn't scale to enough checks (yet).
DNX also seemed very well investigated and used in some places, although a documentation mishap seems to have led many potential users away from it. For those wondering, DNX can indeed distribute checks to workers based on hostgroups, just as mod_gearman can. It's just not well documented. LivestatusSlave also got a lot of interest, although it didn't seem to be as widely used as any of the other three.

Kudos to Sven Nierlein (mod_gearman author/maintainer), Mathias Kettner (mk_livestatus author/maintainer) and Lars Michelsen (LivestatusSlave author). Your stuff is being used in production for positively *huge* installs, so well done guys :) I sure hope you go to the conference next year so you can talk about future development and gather even more interest for your projects.

Merlin wasn't much discussed, although the DNX maintainers (and I) recommend it as the only sane way to get redundancy and automagic loadbalancing. Probably because of the misconception that you're required to run a separate UI and a fork of Nagios when using it. At least that's what my slightly hurt ego wants to believe ;)

The general tip for running large installations is to offload the various spool directories to ramdisk, along with status.dat and objects.cache (since they're read quite frequently). Work is under way to make that unnecessary by simply getting rid of disk I/O as much as possible. There was pretty much unanimous head-nodding as these tips were repeated in one talk after another, so it seems the attending part of the Nagios community has reached consensus that that's the best way to do it. Mounting all disks with the noatime option is also a very good tip that'll get your disk write operations (the slow ones) down to a fraction of what they were before you enabled that option.

Many have large headaches with getting various graphing solutions to scale properly.
Some resorted to using Fusion I/O cards with exabyte performance (quite expensive...), since using ramdisk to store the tens or hundreds of gigabytes of rrd-files generated in large installs isn't really an option. It would be nice to hear Joerg Linge's (author of PNP4Nagios) take on other paths to increase performance next year. It seems his project is the most widely used for graphing, so getting it to perform exceptionally well would be time well spent.

Apart from that, there were plenty of other good presentations and very awesome drinking^H^H^H^H^H^H^H mingle sessions. I highly recommend you attend it next year if you're managing a Nagios install at $dayjob, or if you're working on a Nagios addon project and want to get immediate feedback on what users are looking for.

--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and terror, I think we should give some serious thought to declaring war on peace.

From rlh1533 at gmail.com Wed Oct 5 16:54:52 2011
From: rlh1533 at gmail.com (R. Leigh Hennig)
Date: Wed, 5 Oct 2011 10:54:52 -0400
Subject: How can I change Nagios/NRPE log location?
In-Reply-To: <4E8C6D27.70601@bhn-services.com>
References: <4E8C60D6.2050809@bhn-services.com> <4E8C6D27.70601@bhn-services.com>
Message-ID:

I made the change and restarted xinetd. How do I restart syslog?

On Wed, Oct 5, 2011 at 10:43 AM, Schimpke, Dr. Thomas - bhn
<Schimpke.Thomas at bhn-services.com> wrote:
> Hi,
>
> it's not nrpe's config file - it's xinetd's config file for the nrpe
> service. nrpe is started by xinetd as soon as a request from the nagios
> server arrives. So you simply need to restart xinetd (or reload its
> configuration).
>
> If you still use SYSLOG (and not FILE), then you should configure the
> facility in /etc/syslog.conf appropriately. You need to restart/reload
> syslog for the change to take effect.
>
> Thomas
>
> On 10/05/2011 04:24 PM, R. Leigh Hennig wrote:
> I've made the configuration change - now I'm guessing I need to restart
> the NRPE daemon to read in the changed config file. How do I restart NRPE?
> I want it to run as a daemon, and I believe that it is...there's an nrpe
> file in /etc/xinetd.d/...I already restarted xinetd after I made the log
> file change there...
>
> On Wed, Oct 5, 2011 at 9:51 AM, Schimpke, Dr. Thomas - bhn
> <Schimpke.Thomas at bhn-services.com> wrote:
> These aren't messages from nrpe but from xinetd. You should set the
> log_type parameter in xinetd's config file for the nrpe service. Either
> use SYSLOG with facility local0 and configure syslog to log local0 to a
> file .../nrpe.log, or use FILE as the log_type together with the full
> path to the desired logfile.
>
> Check out the xinetd.conf man page for more details.
>
> Since you poll nrpe quite often it may be better to run nrpe as a daemon
> (nrpe -d ...) anyway to avoid the start overhead.
>
> Thomas
>
> On 10/05/2011 03:13 PM, R. Leigh Hennig wrote:
> On my remote hosts, /var/log/messages is filling up with messages like
> this:
>
> Sep 26 06:33:53 xinetd[13362]: EXIT: nrpe status=0 pid=8099 duration=0(sec)
> Sep 26 06:34:01 xinetd[13362]: START: nrpe pid=8105 from=
> Sep 26 06:34:01 xinetd[13362]: EXIT: nrpe status=0 pid=8105 duration=0(sec)
> Sep 26 06:34:57 xinetd[13362]: START: nrpe pid=8113 from=
>
> How can I make it so that Nagios/NRPE throws these in a different file,
> and not just /var/log/messages?
>
> Thanks

From Schimpke.Thomas at bhn-services.com Wed Oct 5 17:24:29 2011
From: Schimpke.Thomas at bhn-services.com (Schimpke, Dr. Thomas - bhn)
Date: Wed, 5 Oct 2011 17:24:29 +0200
Subject: How can I change Nagios/NRPE log location?
In-Reply-To:
References: <4E8C60D6.2050809@bhn-services.com> <4E8C6D27.70601@bhn-services.com>
Message-ID: <4E8C76AD.5000604@bhn-services.com>

It depends somewhat on your operating system (and on your Linux
distribution, if you use Linux). On a Red Hat based system you may want to
try "service syslog restart". Typically syslog reloads its configuration
if you send it the SIGHUP signal. So you may want to use
ps -ef | grep syslog to determine syslog's pid and then kill -HUP pid.

If you're not on Linux, you may want to check syslog's man page to verify
that your syslog responds to signals.

Thomas

On 10/05/2011 04:54 PM, R. Leigh Hennig wrote:
> I made the change and restarted xinetd. How do I restart syslog?

From rlh1533 at gmail.com Wed Oct 5 17:34:11 2011
From: rlh1533 at gmail.com (R. Leigh Hennig)
Date: Wed, 5 Oct 2011 11:34:11 -0400
Subject: How can I change Nagios/NRPE log location?
In-Reply-To: <4E8C76AD.5000604@bhn-services.com>
References: <4E8C60D6.2050809@bhn-services.com> <4E8C6D27.70601@bhn-services.com> <4E8C76AD.5000604@bhn-services.com>
Message-ID:

"service syslog restart" did restart the syslog service, but nothing has
changed. Logs about NRPE are still going to /var/log/messages, even though
I added "local4.* /var/log/nrpe.log" to that file, and in nrpe.cfg I
changed it to local4 as well...

On Wed, Oct 5, 2011 at 11:24 AM, Schimpke, Dr. Thomas - bhn
<Schimpke.Thomas at bhn-services.com> wrote:
> It depends somewhat on your operating system (and on your Linux
> distribution, if you use Linux). On a Red Hat based system you may want to
> try "service syslog restart". Typically syslog reloads its configuration
> if you send it the SIGHUP signal. So you may want to use
> ps -ef | grep syslog to determine syslog's pid and then kill -HUP pid.
>
> If you're not on Linux, you may want to check syslog's man page to verify
> that your syslog responds to signals.
>
> Thomas

From mguthrie at nagios.com Wed Oct 5 19:43:42 2011
From: mguthrie at nagios.com (Mike Guthrie)
Date: Wed, 05 Oct 2011 12:43:42 -0500
Subject: Nagios World Conference
In-Reply-To: <4E8C6DC7.2050200@op5.se>
References: <4E8C6DC7.2050200@op5.se>
Message-ID: <4E8C974E.9010704@nagios.com>

Hi All,

Andreas Ericsson wrote:
> Hi all. I attended the Nagios World Conference North America last week
> and thought I'd dish out some kudos where such are due, and also dense
> up the information to any newcomers that might get lucky when looking
> for solutions to any particular problems.
>
> Overall, the standard of the conference was very, very high. It was the
> first Nagios conference I've gone to where I learned something new.
> A rare occasion indeed, so many thanks to Ethan, Mary and Nagios
> Enterprises for arranging such a high-quality event. I won't mention
> their talks, since I don't want to inflate their egos too much, but check
> out the one on visualizations by Mike Guthrie. Pretty cool stuff :)

I agree, the conference was very cool. I'm still decompressing from all of
the ideas that I got from other users. Thanks for the shout-out, I'm
flattered ;)

> Much of the focus was on scaling up Nagios. mod_gearman and livestatus
> seem to be the most known and used projects for achieving that goal.
> Reading status files is just too slow when viewing the UI, and a single
> server just doesn't scale to enough checks (yet). DNX also seemed very
> well investigated and used in some places, although a documentation
> mishap seems to have led many potential users away from it. For those
> wondering, DNX can indeed distribute checks to workers based on
> hostgroups, just as mod_gearman can. It's just not well documented.
> LivestatusSlave also got a lot of interest, although it didn't seem to
> be as well used as any of the other three.
> Kudos to Sven Nierlein (mod_gearman author/maintainer), Mathias Kettner
> (mk_livestatus author/maintainer) and Lars Michelsen (LivestatusSlave
> author). Your stuff is being used in production for positively *huge*
> installs, so well done guys :) I sure hope you go to the conference
> next year so you can talk about future development and gather even more
> interest for your projects.
>
> Merlin wasn't much discussed, although the DNX maintainers (and I)
> recommend it as the only sane way to get redundancy and automagic
> loadbalancing. Probably because of the misconception that you're
> required to run a separate UI and a fork of Nagios when using it. At
> least that's what my slightly hurt ego wants to believe ;)

I will admit my own ignorance of what Merlin can do, along with several
other projects I learned about last week. I think one of the biggest
things I took away from the conference is the enormous need to document
and better publicize existing projects, and also to explain how to
implement some of the powerful setups that many users have had to figure
out the hard way. Nagios is incredibly flexible and there's so much that
can be done with it; after this week I'm realizing that the existing
documentation just scratches the surface. If there's a need from the
community, it's more documentation, tutorials, and publicity. There's a
wealth of great projects and Nagios tricks out there, and I for one would
love to see users getting all of the information they need for the
environments they run.

> General tips for running large installations is to offload the various
> spool directories to ramdisk, along with status.dat and objects.cache
> (since they're read quite frequently). Work is under way to make that
> unnecessary by simply getting rid of disk I/O as much as possible. It
> was pretty much headnodding when these tips were iterated in one talk
> after another, so it seems the attending part of the Nagios community
> has reached consensus that that's the best way to do it. Mounting all
> disks with the noatime option is also a very good tip that'll get your
> disk write operations (the slow ones) down to a fraction of what they
> were before you latched that option on.
>
> Many have large headaches with getting various graphing solutions to
> scale properly. Some resorted to using Fusion I/O cards with exabyte
> performance (quite expensive...), since using ramdisk to store the
> tens or hundreds of gigabytes of rrd-files generated in large installs
> isn't really an option. It would be nice to hear Joerg Linge's (author of
> PNP4Nagios) take on other paths to increase performance next year. It
> seems his project is the most widely used for graphing, so getting it to
> perform exceptionally well would be time well spent.
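
As a rough illustration of the ramdisk and noatime tips quoted above, the
fragment below shows one way they might look on a Linux host. It is only a
sketch under assumptions: the mount point /var/nagios/ramdisk, the tmpfs
size, the /dev/sda1 device, the ext3 filesystem type and the nagios
user/group are made up for the example and do not come from the talks. The
nagios.cfg directive names (check_result_path, status_file,
object_cache_file) are the usual Nagios Core 3 ones.

```text
# /etc/fstab (paths, device and size are assumptions)
tmpfs      /var/nagios/ramdisk  tmpfs  size=256m,uid=nagios,gid=nagios,mode=0775  0 0
/dev/sda1  /                    ext3   defaults,noatime                           1 1

# nagios.cfg: point the frequently read and written files at the ramdisk
check_result_path=/var/nagios/ramdisk/checkresults
status_file=/var/nagios/ramdisk/status.dat
object_cache_file=/var/nagios/ramdisk/objects.cache
```

Note that a tmpfs is empty after every reboot, so with a layout like this
the checkresults directory would have to be recreated (for example from the
init script) before Nagios starts.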
>
> Apart from that, there were plenty of other good presentations and very
> awesome drinking^H^H^H^H^H^H^H mingle sessions. I highly recommend you
> attend it next year if you're managing a Nagios install at $dayjob, or
> if you're working on a Nagios addon project and want to get immediate
> feedback on what users are looking for.

--
Mike Guthrie
Technical Team
___
Nagios Enterprises, LLC
Email: mguthrie at nagios.com
Web: www.nagios.com

From bluethundr at jokefire.com Thu Oct 6 04:52:20 2011
From: bluethundr at jokefire.com (Tim Dunphy)
Date: Thu, 06 Oct 2011 02:52:20 -0000 (UTC)
Subject: disk checks unreliable
In-Reply-To:
References:
Message-ID: <93a735d2-34f3-463d-9919-365b975d51f9@li289-212>

Hello list!

I am running a Nagios disk check that reports OK even when the partition
is not mounted or the machine is shut down. How can I test the check and
adjust it so that it reports accurately?

## Machine info

CentOS release 5.6 (Final)
i686

## Nagios Version

Nagios Core 3.3.1

## Command definition

define command{
        command_name    check_store_disk
        command_line    $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
        }

## Service definition

define service{
        use                     local-service   ; Name of service template to use
        #host_name              localhost
        hostgroup_name          web-servers
        service_description     Store Partition
        check_command           check_store_disk!20%!10%!/
        }

The disk is mounted:

[root at VIRTCENT11:~] #df -h
nas2:/mnt/store
                      1.4T  370G  876G  30% /mnt/store

[root at VIRTCENT11:~] #/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /mnt/store
DISK OK - free space: /mnt/store 896088 MB (70% inode=99%);| /mnt/store=378829MB;1108624;1247202;0;1385780

In this case the check is accurate: the disk is mounted.

Now I unmount the partition:

[root at VIRTCENT11:~] #umount /mnt/store

I verify with df that the partition is not mounted and then run the check
again:

[root at VIRTCENT11:~] #/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /mnt/store
DISK OK - free space: / 5737 MB (68% inode=96%);| /=2581MB;7017;7894;0;8772

But the check still thinks the disk is OK.

How can I best address this problem?

Thank you,
Tim

From jeffrey.w.watts at gmail.com Thu Oct 6 07:42:09 2011
From: jeffrey.w.watts at gmail.com (Jeffrey Watts)
Date: Thu, 6 Oct 2011 00:42:09 -0500
Subject: disk checks unreliable
In-Reply-To: <93a735d2-34f3-463d-9919-365b975d51f9@li289-212>
References: <93a735d2-34f3-463d-9919-365b975d51f9@li289-212>
Message-ID:

The check is working correctly - /mnt/store is a valid path in both
circumstances. Remember, in Unix all mounted filesystems sit on top of the
/ filesystem, so when you umount the filesystem on /mnt/store, that
mountpoint directory still exists (on /), and check_disk then reports on
the filesystem that contains it, as your second output shows.

The way I've done it in the past is by using -r/-R to match against the
source path. For example, to match "//server/FooBar$" I had a check_disk
check with "-r FooBar" in it.

I imagine that you might also be able to do what you're looking to do by
using -X to exclude whatever filesystem type the / filesystem is (assuming
that the mounted filesystem is a different type, of course).

I'm sure others will have different ways of doing it too. Good luck.

Jeffrey.

On Wed, Oct 5, 2011 at 9:52 PM, Tim Dunphy wrote:
> hello.. I am running a nagios disk check that reports OK even when the
> partition is not mounted or the machine is shut down .. how can I test the
> check and adjust it so that it reports accurately?

From rosenski at wave-computer.de Thu Oct 6 10:47:56 2011
From: rosenski at wave-computer.de (Axel Rosenski)
Date: Thu, 6 Oct 2011 10:47:56 +0200
Subject: disk checks unreliable
In-Reply-To: <93a735d2-34f3-463d-9919-365b975d51f9@li289-212>
References: <93a735d2-34f3-463d-9919-365b975d51f9@li289-212>
Message-ID: <201110061047.56819.rosenski@wave-computer.de>

Hi Tim,

On Thursday, 06 October 2011, 04:52:20, Tim Dunphy wrote:
> hello.. I am running a nagios disk check that reports OK even when the
> partition is not mounted or the machine is shut down .. how can I test the
> check and adjust it so that it reports accurately?
> How can I best address this problem?

I've added -E and -M:

 -E, --exact-match
    For paths or partitions specified with -p, only check for exact paths
 -M, --mountpoint
    Display the mountpoint instead of the partition

Regards,
Axel

--
Axel Rosenski
- Administration -
______________________________
Wave Computersysteme GmbH
Philipp-Reis-Str. 1-3 / 9
35440 Linden
Managing Director: Carsten Kellmann
Register court: Gießen, HRB 1823
Tel.: +49 (0)6403 / 9050 8317
Fax: +49 (0)6403 / 9050 5089
mailto:rosenski at wave-computer.de
http://www.wave-computer.de

From Schimpke.Thomas at bhn-services.com Thu Oct 6 11:24:56 2011
From: Schimpke.Thomas at bhn-services.com (Schimpke, Dr. Thomas - bhn)
Date: Thu, 6 Oct 2011 11:24:56 +0200
Subject: How can I change Nagios/NRPE log location?
In-Reply-To:
References: <4E8C60D6.2050809@bhn-services.com> <4E8C6D27.70601@bhn-services.com> <4E8C76AD.5000604@bhn-services.com>
Message-ID: <4E8D73E8.5000409@bhn-services.com>

If you have

log_facility=local4

in your nrpe.cfg and

local4.* /var/log/nrpe.log

(or whatever file you choose) in syslog.conf, you should have log messages
from nrpe in your specified log file.

The log entries in your first post are, as I already mentioned, from
xinetd directly ... and xinetd seems to log to the daemon facility, and I
think that you cannot change this.

I tried (for the rsync service) to set

log_on_failure =
log_on_success =

so effectively clearing these two options for that service. Afterwards
the start/success messages generated by xinetd when the service starts up
were gone.
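
For illustration, the pieces described above might fit together roughly as
in the fragment below. This is a sketch under assumptions, not a tested
configuration: the stanza is modelled on the sample xinetd file that ships
with NRPE, the install paths, port 5666 and the local4 facility are
examples, and the empty log_on_success / log_on_failure assignments are the
ones reported above to work for the rsync service, so check them against
the xinetd.conf man page on your system.

```text
# /etc/xinetd.d/nrpe (paths are assumptions)
service nrpe
{
        socket_type     = stream
        port            = 5666
        wait            = no
        user            = nagios
        server          = /usr/local/nagios/bin/nrpe
        server_args     = -c /usr/local/nagios/etc/nrpe.cfg --inetd
        log_on_success  =
        log_on_failure  =
        disable         = no
}

# nrpe.cfg: facility used for nrpe's own messages
log_facility=local4

# /etc/syslog.conf: send that facility to its own file
local4.*        /var/log/nrpe.log
```

Both xinetd and syslog need a restart or reload after such a change, as
discussed earlier in the thread.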
You could try this with the nrpe service... Then your /var/log/messages
should be clean.

Thomas

On 10/05/2011 05:34 PM, R. Leigh Hennig wrote:
> "service syslog restart" did restart the syslog service, but nothing has
> changed. Logs about NRPE are still going to /var/log/messages, even though
> I added "local4.* /var/log/nrpe.log" to that file, and in nrpe.cfg I
> changed it to local4 as well...

From brodard.anthony at gmail.com Thu Oct 6 12:54:12 2011
From: brodard.anthony at gmail.com (Anthony BRODARD)
Date: Thu, 6 Oct 2011 12:54:12 +0200
Subject: Check
Message-ID:

Hi list,

I would like to detect whether any user is logged in directly on the console of my Linux servers (Debian, CentOS). I've created this simple bash script:

#!/bin/bash
# Nagios check: warn if anyone is logged in on a local console (tty)

WHO=`which who`
GREP=`which grep`
WC=`which wc`

# Count the login sessions attached to a tty device
RESULT=`$WHO | $GREP tty | $WC -l`

if [ "$RESULT" -ne 0 ]
then
    # Message reads: "N user(s) connected on console"
    echo "$RESULT utilisateur(s) connecte(s) en console"
    exit 1    # 1 = WARNING
else
    echo "OK"
    exit 0    # 0 = OK
fi

It works fine, but I would prefer another method, lighter than check_by_ssh. Do you know another way to do this, via SNMP for example?

Regards
From mad at b-care.net Thu Oct 6 13:25:54 2011
From: mad at b-care.net (Marc-André Doll)
Date: Thu, 06 Oct 2011 13:25:54 +0200
Subject: Check
In-Reply-To:
References:
Message-ID: <1317900354.1650.5.camel@MADness>

Hi,

I don't know of an SNMP OID that lists the users connected to a Linux server (if someone knows of one, I would be glad to hear about it). You can, however, expose your script through the UCD MIB on your monitored servers using the extTable table (1.3.6.1.4.1.2021.8).

Marc-André

On Thu, 2011-10-06 at 12:54 +0200, Anthony BRODARD wrote:
> Hi list,
>
> I would like to detect whether any user is logged in directly on the
> console of my Linux servers (Debian, CentOS).
> I've created this simple bash script:
>
> #!/bin/bash
>
> WHO=`which who`
> GREP=`which grep`
> WC=`which wc`
>
> RESULT=`$WHO | $GREP tty | $WC -l`
>
> if [ $RESULT -ne 0 ]
> then
>     echo $RESULT "utilisateur(s) connecte(s) en console"
>     exit 1
> else
>     echo "OK"
>     exit 0
> fi
>
> It works fine, but I would prefer another method, lighter than
> check_by_ssh.
> Do you know another way to do this, via SNMP for example?
>
> Regards
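One way the extTable suggestion could look in practice, as a sketch only: Net-SNMP's snmpd.conf has an exec directive that runs a command and publishes its exit code and first line of output in extTable. The entry name console_users and the script path below are hypothetical, and the community string is elided as elsewhere in the thread.

```conf
# /etc/snmp/snmpd.conf on the monitored server
# Syntax: exec NAME PROGRAM [ARGS]
exec console_users /usr/local/bin/check_console_users.sh

# Query from the Nagios server:
#   snmpwalk -v2c -c xxx <host> .1.3.6.1.4.1.2021.8
# extResult (column 100) carries the script's exit code,
# extOutput (column 101) the first line it printed.
```

Newer Net-SNMP releases favour the extend directive, which reports under NET-SNMP-EXTEND-MIB rather than extTable, so which directive fits depends on the agent version in use.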
From Schimpke.Thomas at bhn-services.com Thu Oct 6 14:32:03 2011
From: Schimpke.Thomas at bhn-services.com (Schimpke, Dr. Thomas - bhn)
Date: Thu, 6 Oct 2011 14:32:03 +0200
Subject: Check
In-Reply-To: <1317900354.1650.5.camel@MADness>
References: <1317900354.1650.5.camel@MADness>
Message-ID: <4E8D9FC3.6050903@bhn-services.com>

Hi,

what about HOST-RESOURCES-MIB::hrSystemNumUsers.0?

snmpget -v2c -c xxx belinda HOST-RESOURCES-MIB::hrSystemNumUsers.0
HOST-RESOURCES-MIB::hrSystemNumUsers.0 = Gauge32: 2

You have to load the MIB, of course. If you prefer the numerical OID:

snmptranslate -On HOST-RESOURCES-MIB::hrSystemNumUsers.0
.1.3.6.1.2.1.25.1.5.0

Thomas

On 10/06/2011 01:25 PM, Marc-André Doll wrote:
> Hi,
>
> I don't know of an SNMP OID that lists the users connected to a Linux
> server (if someone knows of one, I would be glad to hear about it).
> You can, however, expose your script through the UCD MIB on your
> monitored servers using the extTable table (1.3.6.1.4.1.2021.8).
>
> Marc-André
>
> On Thu, 2011-10-06 at 12:54 +0200, Anthony BRODARD wrote:
>> Hi list,
>>
>> I would like to detect whether any user is logged in directly on the
>> console of my Linux servers (Debian, CentOS).
>> I've created this simple bash script:
>>
>> #!/bin/bash
>>
>> WHO=`which who`
>> GREP=`which grep`
>> WC=`which wc`
>>
>> RESULT=`$WHO | $GREP tty | $WC -l`
>>
>> if [ $RESULT -ne 0 ]
>> then
>>     echo $RESULT "utilisateur(s) connecte(s) en console"
>>     exit 1
>> else
>>     echo "OK"
>>     exit 0
>> fi
>>
>> It works fine, but I would prefer another method, lighter than
>> check_by_ssh.
>> Do you know another way to do this, via SNMP for example?
>>
>> Regards
From benny at bennyvision.com Thu Oct 6 14:40:21 2011
From: benny at bennyvision.com (C. Bensend)
Date: Thu, 6 Oct 2011 07:40:21 -0500
Subject: Check
In-Reply-To:
References:
Message-ID: <5e8c07c34d764345f198f02416c9cd8f.squirrel@webmail.stinkweasel.net>

> It works fine, but I would prefer another method, lighter than
> check_by_ssh.
> Do you know another way to do this, via SNMP for example?

I run NRPE on my Linux systems... It is much lighter than using check_by_ssh.

Benny

--
"Open your door, or I open your wall."
   -- Seen on an image on fukung.net
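For comparison, exposing the same script through NRPE could look roughly like this. The command name check_console_users and both paths are assumptions chosen for illustration; they depend on how NRPE and the plugins were installed.

```conf
# nrpe.cfg on the monitored server
command[check_console_users]=/usr/local/nagios/libexec/check_console_users.sh

# Manual test from the Nagios server:
#   /usr/local/nagios/libexec/check_nrpe -H <host> -c check_console_users
```

The script itself stays unchanged; NRPE passes its output line and exit code back to Nagios.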
From brodard.anthony at gmail.com Thu Oct 6 14:54:25 2011
From: brodard.anthony at gmail.com (Anthony BRODARD)
Date: Thu, 6 Oct 2011 14:54:25 +0200
Subject: Check
In-Reply-To: <4E8D9FC3.6050903@bhn-services.com>
References: <1317900354.1650.5.camel@MADness> <4E8D9FC3.6050903@bhn-services.com>
Message-ID:

Thanks for your answers.

I don't know how the OID 1.3.6.1.4.1.2021.8 works; I will look for more information about it. Thanks for this solution.

As for HOST-RESOURCES-MIB::hrSystemNumUsers.0, it returns the number of open sessions, including ssh sessions. I just want to see whether a session is open on a tty. But thanks for your help too =)

2011/10/6 Schimpke, Dr. Thomas - bhn
> Hi,
>
> what about HOST-RESOURCES-MIB::hrSystemNumUsers.0?
>
> snmpget -v2c -c xxx belinda HOST-RESOURCES-MIB::hrSystemNumUsers.0
> HOST-RESOURCES-MIB::hrSystemNumUsers.0 = Gauge32: 2
>
> You have to load the MIB, of course. If you prefer the numerical OID:
>
> snmptranslate -On HOST-RESOURCES-MIB::hrSystemNumUsers.0
> .1.3.6.1.2.1.25.1.5.0
>
> Thomas
>
> On 10/06/2011 01:25 PM, Marc-André Doll wrote:
> > Hi,
> >
> > I don't know of an SNMP OID that lists the users connected to a Linux
> > server (if someone knows of one, I would be glad to hear about it).
> > You can, however, expose your script through the UCD MIB on your
> > monitored servers using the extTable table (1.3.6.1.4.1.2021.8).
> >
> > Marc-André
> >
> > On Thu, 2011-10-06 at 12:54 +0200, Anthony BRODARD wrote:
> >> Hi list,
> >>
> >> I would like to detect whether any user is logged in directly on the
> >> console of my Linux servers (Debian, CentOS).
> >> I've created this simple bash script:
> >>
> >> #!/bin/bash
> >>
> >> WHO=`which who`
> >> GREP=`which grep`
> >> WC=`which wc`
> >>
> >> RESULT=`$WHO | $GREP tty | $WC -l`
> >>
> >> if [ $RESULT -ne 0 ]
> >> then
> >>     echo $RESULT "utilisateur(s) connecte(s) en console"
> >>     exit 1
> >> else
> >>     echo "OK"
> >>     exit 0
> >> fi
> >>
> >> It works fine, but I would prefer another method, lighter than
> >> check_by_ssh.
> >> Do you know another way to do this, via SNMP for example?
> >>
> >> Regards
From brodard.anthony at gmail.com Thu Oct 6 15:02:17 2011
From: brodard.anthony at gmail.com (Anthony BRODARD)
Date: Thu, 6 Oct 2011 15:02:17 +0200
Subject: Check
In-Reply-To: <5e8c07c34d764345f198f02416c9cd8f.squirrel@webmail.stinkweasel.net>
References: <5e8c07c34d764345f198f02416c9cd8f.squirrel@webmail.stinkweasel.net>
Message-ID:

Actually, I don't use NRPE. But why not, I'll have a look at it.

2011/10/6 C. Bensend
> > It works fine, but I would prefer another method, lighter than
> > check_by_ssh.
> > Do you know another way to do this, via SNMP for example?
>
> I run NRPE on my Linux systems... It is much lighter than using
> check_by_ssh.
>
> Benny
>
> --
> "Open your door, or I open your wall."
>    -- Seen on an image on fukung.net
From metron6 at gmail.com Fri Oct 7 11:46:14 2011
From: metron6 at gmail.com (Metron 6 (six))
Date: Fri, 7 Oct 2011 12:46:14 +0300
Subject: checking ms sql...
Message-ID:

Hello,

I have a script that checks a Microsoft SQL Server. The script runs every hour and checks whether there are any new records. Its output looks like this:

400 added since last time 60' ago (33716 - 33316)
Participants an hour ago: 33316
Participants now: 33716
Difference: 400

I want to add it to Nagios and have notifications sent if the difference is 0, but I don't know how.

Can anyone help me?

rgds, george

--
regards,
Metron 6 (six)
Metron6 at gmail.com

From kirill.bychkov at gmail.com Fri Oct 7 12:08:12 2011
From: kirill.bychkov at gmail.com (Kirill Bychkov)
Date: Fri, 7 Oct 2011 14:08:12 +0400
Subject: checking ms sql...
In-Reply-To:
References:
Message-ID:

Hello,

For example, your script needs a conditional: if the difference is 0, exit with status CRITICAL (exit code 2).

On 7 October 2011 13:46, Metron 6 (six) wrote:
> Hello,
>
> I have a script that checks a Microsoft SQL Server.
> The script runs every hour and checks whether there are any new records.
> Its output looks like this:
>
> 400 added since last time 60' ago (33716 - 33316)
> Participants an hour ago: 33316
> Participants now: 33716
> Difference: 400
>
> I want to add it to Nagios and have notifications sent if the
> difference is 0, but I don't know how.
>
> Can anyone help me?
>
> rgds, george
>
> --
> regards,
> Metron 6 (six)
> Metron6 at gmail.com

--
Kirill
From nagios at flatto.net Fri Oct 7 11:54:00 2011
From: nagios at flatto.net (Nagios)
Date: Fri, 07 Oct 2011 09:54:00 -0000 (UTC)
Subject: checking ms sql...
In-Reply-To:
References:
Message-ID: <54843fe9-5931-4a82-a4d9-bc33d424ef5b@vhost1.heptagon.co.il>

Read this and incorporate it into your script's exit codes:
http://nagios.sourceforge.net/docs/nagioscore/3/en/pluginapi.html

Assaf

----- Original Message -----
From: "Metron 6 (six)"
To: "Nagios Users List"
Sent: Friday, 7 October, 2011 10:46:14 AM
Subject: [Nagios-users] checking ms sql...

Hello,

I have a script that checks a Microsoft SQL Server. The script runs every hour and checks whether there are any new records. Its output looks like this:

400 added since last time 60' ago (33716 - 33316)
Participants an hour ago: 33316
Participants now: 33716
Difference: 400

I want to add it to Nagios and have notifications sent if the difference is 0, but I don't know how.

Can anyone help me?

rgds, george

--
regards,
Metron 6 (six)
Metron6 at gmail.com
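Both replies in this thread come down to the plugin's exit code: Nagios treats 0 as OK, 1 as WARNING, 2 as CRITICAL and 3 as UNKNOWN, and notifications follow from the resulting service state. A minimal sketch of the conditional, assuming the existing script already has the two record counts at hand; the function name check_new_records is hypothetical:

```shell
#!/bin/sh
# Sketch of a Nagios-style check: alert when no new records were added.
# check_new_records is a hypothetical name; the two counts would come
# from the existing SQL query script.

check_new_records() {
    previous=$1
    current=$2
    diff=$(( current - previous ))

    if [ "$diff" -le 0 ]; then
        echo "CRITICAL - no new records (previous: $previous, now: $current)"
        return 2    # 2 = CRITICAL in the Nagios plugin API
    fi

    echo "OK - $diff added since last check ($current - $previous)"
    return 0        # 0 = OK
}

# In a real plugin the script would end with something like:
#   check_new_records "$previous" "$current"; exit $?
```

Treating a zero difference as CRITICAL rather than WARNING is a choice; returning 1 instead would raise a WARNING, and which states notify is then decided by the service's notification options.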
From cyborg9799 at gmail.com Fri Oct 7 20:28:05 2011
From: cyborg9799 at gmail.com (Mark Thomas)
Date: Fri, 7 Oct 2011 14:28:05 -0400
Subject: check_snmp_storage.pl ver 1.3.3 ERROR: General time-out (Alarm signal)
In-Reply-To:
References:
Message-ID:

I found the problem.

It looks like during the re-org of my services and groups I must have copied someone else's service, which was using another community. $USER4$ should have been $USER7$.

On Tue, Oct 4, 2011 at 8:44 AM, Mark Thomas wrote:
> This script was running fine until we moved to another, more powerful
> machine.
> Now all alerts for my Windows machines are failing with the above error.
> My Unix machines are fine. I have not implemented the alert for Linux yet.
> My other SNMP alerts (which check Windows services) are working fine.
>
> Nagios 3.3.1
> Ubuntu 10
> 2.6.35-30-server
> 64 bit
>
> Thanks for any help
> --
> Thomas

--
Mark Thomas

From jim at jimavery.me.uk Fri Oct 7 20:52:29 2011
From: jim at jimavery.me.uk (Jim Avery)
Date: Fri, 7 Oct 2011 19:52:29 +0100
Subject: Average Check latency and execution time growth - 3.2.3
In-Reply-To: <8CEF048B9EC83748B1517DC64EA130FB609A2605B8@off-win2003-01.ausregistrygroup.local>
References: <8CEF048B9EC83748B1517DC64EA130FB609A2605B8@off-win2003-01.ausregistrygroup.local>
Message-ID:

On 3 October 2011 04:36, Stuart Browne wrote:
> Hi,
>
> I know this topic has been covered many times, but I've tried those tweaks and I have the remaining issue.
>
> After a few days, the latency on checks explodes. It goes along quite happily with small values, then after (about) 3 days, the values rise quite sharply. I've recently been graphing performance statistics (nagiostats, mrtg) and as you can see by the two attachments (day, week), it's rather surprising.
I'm sorry I can't shed much light on it, but I've seen the same behaviour myself, although on my system the service check latency wouldn't start increasing until after a week or two. But you're right: the rate of increase when it starts is quite alarming. I've recently culled a lot of checks from the system, which has ameliorated the issue for the time being, but it would be good to get it fixed properly.

From pitchfork at ederdrom.de Sat Oct 8 13:03:53 2011
From: pitchfork at ederdrom.de (Jörg Linge)
Date: Sat, 8 Oct 2011 13:03:53 +0200
Subject: Average Check latency and execution time growth - 3.2.3
In-Reply-To:
References: <8CEF048B9EC83748B1517DC64EA130FB609A2605B8@off-win2003-01.ausregistrygroup.local>
Message-ID:

On 07.10.2011 at 20:52, Jim Avery wrote:
> On 3 October 2011 04:36, Stuart Browne wrote:
>> Hi,
>>
>> I know this topic has been covered many times, but I've tried those tweaks and I have the remaining issue.
>>
>> After a few days, the latency on checks explodes. It goes along quite happily with small values, then after (about) 3 days, the values rise quite sharply. I've recently been graphing performance statistics (nagiostats, mrtg) and as you can see by the two attachments (day, week), it's rather surprising.
>
> I'm sorry I can't shed much light on it, but I've seen the same
> behaviour myself, although on my system the service check latency
> wouldn't start increasing until after a week or two. But you're right:
> the rate of increase when it starts is quite alarming. I've recently
> culled a lot of checks from the system, which has ameliorated the issue
> for the time being, but it would be good to get it fixed properly.

Increasing latency is mostly an indicator of memory leaks. The Nagios core is well tested, but the embedded Perl interpreter in combination with badly written Perl plugins might cause a memory leak. Some event broker modules also lead to memory leaks. For example, mk_livestatus should never be used with environment macros enabled.

You have to monitor more than just the latency to get a feeling for what's going on.

Joerg

From bluethundr at jokefire.com Sat Oct 8 16:16:23 2011
From: bluethundr at jokefire.com (Tim Dunphy)
Date: Sat, 08 Oct 2011 14:16:23 -0000 (UTC)
Subject: problem resgistering new service
Message-ID: <05e0ba01-1b50-4335-b980-6b09902f43e8@li289-212>

Hello list!!

I am trying to set up a new plugin that will check haproxy. However, when I try to add the service definition to the config file, I get an error claiming that it cannot register the service.
I was wondering where I could best look to track down this error, and whether anyone has any suggestions that might help troubleshoot it.

# config error

Error: Could not register service (config file '/usr/local/nagios/etc/objects/lb.cfg', starting on line 197)
Error processing object config files!

## service definition from /usr/local/nagios/etc/objects/lb.cfg

define service {
        host_name               virtual   ## <- line 197
        service_description     HAProxy
        check_command           check_haproxy!http://virtual/admin?stats;csv
}

## host definition for 'virtual' host in /usr/local/nagios/etc/objects/lb.cfg

define host{
        use             linux-server
        host_name       virtual
        address         192.168.1.200
}

## command definition

define command {
        command_name    check_haproxy
        command_line    $USER1$/check_haproxy.pl -u "$ARG1$"
        #~ _comment Test url HAProxy
}

thanks in advance!
tim

From maxs at webwizarddesign.com Sat Oct 8 17:19:29 2011
From: maxs at webwizarddesign.com (Max Schubert)
Date: Sat, 8 Oct 2011 11:19:29 -0400
Subject: Average Check latency and execution time growth - 3.2.3
In-Reply-To: <8CEF048B9EC83748B1517DC64EA130FB609A2605B8@off-win2003-01.ausregistrygroup.local>
References: <8CEF048B9EC83748B1517DC64EA130FB609A2605B8@off-win2003-01.ausregistrygroup.local>
Message-ID:

What minor RHEL rev are you running?
We had one poller running RHEL 5.3 that had constantly increasing latency - a Compaq / AMD based host. None of the optimizations / configuration changes we made to the other pollers we ran at the time seemed to help this one - we updated the poller in place from 5.3 to 5.4 and voila - issue gone. As Joerg mentioned, it was probably a memory leak / bug in a library the parent Nagios poller process was using. We never did determine which one, and we haven't hit that same issue since then with any 5.4 or 5.5 pollers.

Even with stable software we end up bouncing our pollers every 2-3 days - 1) because we have an active customer base who make config changes often and 2) because we take the metrics from the checks and put them in a time series data warehouse that is sensitive to interval skew... any poller that hits 10 seconds latency has to be bounced. We are at 12 pollers or so right now and we will be up to almost 20 by next year at this time.

Max

On 10/2/11, Stuart Browne wrote:
> Hi,
>
> I know this topic has been covered many times, but I've tried those tweaks
> and I have the remaining issue.
>
> After a few days, the latency on checks explodes. It goes along quite
> happily with small values, then after (about) 3 days, the values rise quite
> sharply. I've recently been graphing performance statistics (nagiostats,
> mrtg) and as you can see by the two attachments (day, week), it's rather
> surprising.
>
> We restart Nagios every few days (for other reasons) so thankfully the issue
> never gets completely out of control, but as you can see, it gets a bit
> crazy.
>
> I can't think of any combination of settings that would cause such growth
> after such a long period of time. Does anybody have any knowledge as to why
> it would suddenly increase after running for days without issue?
> > Basic Nagios system stats: > 2 x dual-core Xeon 5160 (3Ghz) > 6GB Memory > 4 x SAS, RAID1 (hardware, BBU, LVM over RAID1) > RHEL5, fully patched > Load average between 0.5 and 3.2 > > 'nagios -s /etc/nagios/nagios.cfg' output (trimmed): > > HOST SCHEDULING INFORMATION > --------------------------- > Total hosts: 252 > Total scheduled hosts: 252 > Host inter-check delay method: SMART > Average host check interval: 300.00 sec > Host inter-check delay: 1.19 sec > Max host check spread: 30 min > First scheduled check: Mon Oct 3 14:31:17 2011 > Last scheduled check: Mon Oct 3 14:36:15 2011 > > > SERVICE SCHEDULING INFORMATION > ------------------------------- > Total services: 1575 > Total scheduled services: 1386 > Service inter-check delay method: SMART > Average service check interval: 878.40 sec > Inter-check delay: 0.63 sec > Interleave factor method: SMART > Average services per host: 6.25 > Service interleave factor: 6 > Max service check spread: 30 min > First scheduled check: Mon Oct 3 14:33:43 2011 > Last scheduled check: Mon Oct 3 14:48:21 2011 > > CHECK PROCESSING INFORMATION > ---------------------------- > Check result reaper interval: 5 sec > Max concurrent service checks: Unlimited > > > PERFORMANCE SUGGESTIONS > ----------------------- > I have no suggestions - things look okay. > > Stuart J. Browne > Senior Linux Administrator > ------------------------------------------------------------------------------ All of the data generated in your IT infrastructure is seriously valuable. Why? It contains a definitive record of application performance, security threats, fraudulent activity, and more. Splunk takes this data and makes sense of it. IT sense. And common sense. 
http://p.sf.net/sfu/splunk-d2dcopy2 _______________________________________________ Nagios-users mailing list Nagios-users at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. ::: Messages without supporting info will risk being sent to /dev/null From eng__amir at hotmail.com Mon Oct 10 11:05:35 2011 From: eng__amir at hotmail.com (Amir Saad) Date: Mon, 10 Oct 2011 02:05:35 -0700 Subject: Fwd: I figured I should share the wealth Message-ID: hello! I kept telling myself things would get better my expectations were more than exceeded this is proof that miracles do exist seriously consider this http://wbopole.home.pl/GrahamBailey94.html see you later. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- ------------------------------------------------------------------------------ All the data continuously generated in your IT infrastructure contains a definitive record of customers, application performance, security threats, fraudulent activity and more. Splunk takes this data and makes sense of it. Business sense. IT sense. Common sense. http://p.sf.net/sfu/splunk-d2dcopy1 -------------- next part -------------- _______________________________________________ Nagios-users mailing list Nagios-users at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. ::: Messages without supporting info will risk being sent to /dev/null From a31modela at hotmail.com Mon Oct 10 15:16:16 2011 From: a31modela at hotmail.com (steve f) Date: Mon, 10 Oct 2011 09:16:16 -0400 Subject: Questions on snmp checks Message-ID: Good Morning All, 1st off, please don't laugh at me, we do EVERYTHING the hard way here. 
I need to check 600+ remote Cisco routers for the status of their
primary and secondary ports. I am running Nagios Core in a distributed
environment with 600+ locations, each location running a Nagios
distributed server for 12 local clients. All clients and servers are
running SuSE Linux.

I do not have SNMP running on the remote distributed servers; SNMP is
running on the central server and on the remote clients.

Here's where the fun starts:

1. I assume I can't run an SNMP check against the SNMP-enabled router
   from the distributed Nagios server, since that server does not have
   SNMP installed (BTW, NOT an option).

2. Could I run a check from the distributed server to one of the clients
   (which has SNMP installed) and from there to the router? (I know,
   Rube Goldberg....)

3. If I run the check from my central server out to the 600 Cisco
   routers, how manageable would that be?

Any other thoughts on how I could monitor 2 separate ports on these
routers? The primary reason for the check is that the secondary port is
broadband and we run some security camera traffic through that port
only. I need to know when the broadband connection goes down, for
obvious reasons.

I have not tried any of my solutions, as some configuration is needed
before the routers will talk to anyone via SNMP. I didn't want my Comm
guys to do any config work if this idea wouldn't work.

Thanks,
Steve

From bphelps at gls.com Mon Oct 10 15:24:47 2011
From: bphelps at gls.com (Brandon Phelps)
Date: Mon, 10 Oct 2011 09:24:47 -0400
Subject: Questions on snmp checks
In-Reply-To:
References:
Message-ID: <4E92F21F.4080307@gls.com>

As far as I know, the SNMP checks do not require SNMP to be on the server
running the check, only on the remote host. Most of them are simply Perl
scripts which only require Net::SNMP (and perhaps other modules such as
Getopt::Long, etc).

You could always try running check_ifstatus (which communicates via SNMP)
and see if it works.
From kdavison at innosphere.ca Mon Oct 10 15:31:38 2011
From: kdavison at innosphere.ca (Kevin Davison)
Date: Mon, 10 Oct 2011 09:31:38 -0400
Subject: Monitoring disk space remote VPS hosts
Message-ID:

I've recently deployed some of our remote systems on a new VPS host. I'm
looking to monitor all of the usual suspects to keep an eye on these. I
monitor disk space on remote servers at a number of different hosting
companies, but this one seems to be different. The root file system is
mounted from /dev/simfs, so I'm assuming that it's a VZ virtual machine.

The oddity, for me, is that the simfs device doesn't technically exist.
A df shows that my file system is mounted from /dev/simfs, but there is
no device or link in the /dev directory that corresponds to it. I
updated check_disk in my nrpe config thinking that if df can see what's
mounted there, then nrpe may be able to as well, but no luck.

Is anyone else monitoring one of these virtual file systems? How did you
do it?

[root at p4pi0010-r ~]# df
Filesystem  1K-blocks    Used  Available  Use%  Mounted on
/dev/simfs    5242880  687524    4555356   14%  /

[root at p4pi0010-r ~]# ls /dev
{snip}
ptyac  ptydd  ptyqe  ptytf  ptyx0  ram1    ttycd  ttype  ttysf  ttyw0  ttyz1
ptyad  ptyde  ptyqf  ptyu0  ptyx1  random  ttyce  ttypf  ttyt0  ttyw1  ttyz2
ptyae  ptydf  ptyr0  ptyu1  ptyx2  shm     ttycf  ttyq0  ttyt1  ttyw2  ttyz3
ptyaf  ptye0  ptyr1  ptyu2  ptyx3  tty     ttyd0  ttyq1  ttyt2  ttyw3  ttyz4
{snip}

Kevin Davison
Network Administrator
Innosphere SDG Ltd.
147 Wyndham St. N., Ste 306
Guelph, ON, N1H 4E9
(519) 766-9726 X223
Email: kdavison at innosphere.ca
Website: www.innosphere.ca
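
If this is indeed an OpenVZ-style container, the root device reported by df (/dev/simfs) has no node under /dev, but the mounted file system can still be queried through its mount point. The following is a minimal sketch in Python using os.statvfs; the function names and thresholds are hypothetical, and the return codes simply follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL). It is an illustration of the idea, not a replacement for check_disk.

```python
import os


def disk_usage_percent(path="/"):
    """Return the used percentage of the file system that contains `path`."""
    st = os.statvfs(path)
    # Blocks in use, and blocks still available to unprivileged users.
    used = (st.f_blocks - st.f_bfree) * st.f_frsize
    available = st.f_bavail * st.f_frsize
    # Same ratio df uses for its Use% column: used / (used + available).
    denom = used + available
    return 100.0 * used / denom if denom else 0.0


def check(path="/", warn=80.0, crit=90.0):
    """Return (exit_code, message) in the style of a Nagios plugin."""
    pct = disk_usage_percent(path)
    if pct >= crit:
        return 2, "DISK CRITICAL - %s is %.1f%% full" % (path, pct)
    if pct >= warn:
        return 1, "DISK WARNING - %s is %.1f%% full" % (path, pct)
    return 0, "DISK OK - %s is %.1f%% full" % (path, pct)


if __name__ == "__main__":
    code, message = check("/")
    print(message)
    raise SystemExit(code)
```

The point of the sketch is that the query is keyed on the mount point (/) rather than on the device name. check_disk should be usable the same way, by giving it the mount point with -p / instead of /dev/simfs.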

From m.borsani at it.net Tue Oct 11 10:37:54 2011
From: m.borsani at it.net (Marco Borsani)
Date: Tue, 11 Oct 2011 10:37:54 +0200
Subject: nrpe install and start on sunos joyent
Message-ID: <005101cc87f1$17da43b0$478ecb10$@it.net>

Hello

I have to run the NRPE client on several SunOS systems, but I don't know
that OS well. I also ran into many problems while compiling, which I
solved by using the precompiled package, installed with "pkgin in nrpe".

1) I have prepared nrpe.xml

2) I started the service with "svcadm enable nrpe", but in the error log
   I read:

[ Oct 11 08:26:30 Enabled. ]
[ Oct 11 08:26:30 Executing start method ("/opt/local/sbin/nrpe -c /opt/local/etc/nagios/nrpe.cfg -d"). ]
[ Oct 11 08:26:30 svc.startd could not set context for method: ]
chdir: No such file or directory
[ Oct 11 08:26:30 Method "start" exited with status 96. ]

Any ideas? Has anyone ever done this kind of installation?

Regards
Marco

From jvela at s2grupo.es Tue Oct 11 13:23:18 2011
From: jvela at s2grupo.es (Javier Vela Diago)
Date: Tue, 11 Oct 2011 13:23:18 +0200
Subject: High check latency in a machine with low load
Message-ID:

Hi,

I have a Nagios 3.2.3 deployment with 1000+ hosts and 3000+ services.
This Nagios runs together with NDO and PNP (in bulk mode) on a server
with 4 GB of RAM and 4 CPUs.

One day I realized that the check latency shown in the performance CGI
was very high (300-400 seconds). It was very strange, so I took the
tuning guide from Nagios
(http://nagios.sourceforge.net/docs/3_0/tuning.html) and applied all the
points I could. In particular I set max_concurrent_checks to zero (no
limit):

max_concurrent_checks=0

The reaper event:

service_reaper_frequency=5
max_check_result_reaper_time=15

and checked that the host checks were not forced. In addition I
configured 15 seconds of host check cache:

cached_host_check_horizon=15

But the problem remains, and the load on the server is not very high: a
load of 2.5, 2 GB of free memory and an average disk utilization of 7%.
I disabled NDO and PNP but it was useless. After the first round of
checks, the delay returns, while the load of the server doesn't grow.

I have searched on Google, but all the problems reported there are caused
by load on the server, and here that is not the main problem.

So my question is: what can I do now? Is there some variable that shows
me where to look? I'm a bit lost right now and I don't know how to find
the problem. Or maybe the only way is to configure a master-slave Nagios
setup in order to maximize the server utilization?

In addition, I have pretty big timeouts (60 seconds) because of the high
latency on the network.

All your help is appreciated. Thank you in advance.
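
The min / max / average triples that nagiostats prints can be pulled out mechanically when tracking this kind of latency problem over time. The following is a minimal sketch in Python, assuming the human-readable line format seen in this thread; the function names and the sample text are illustrative, and the 10-second limit is only the figure Max Schubert mentions earlier for bouncing a poller, not a Nagios default.

```python
import re

# Matches lines such as:
#   Active Service Latency:   32.876 / 442.138 / 415.816 sec
# The three numbers are minimum / maximum / average, in seconds.
STAT_LINE = re.compile(
    r"^\s*(?P<name>[A-Za-z /]+?):\s+"
    r"(?P<min>[\d.]+) / (?P<max>[\d.]+) / (?P<avg>[\d.]+) sec\s*$"
)


def parse_timings(text):
    """Return {label: (min, max, avg)} for every 'sec' triple in the text."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line)
        if match:
            stats[match.group("name")] = tuple(
                float(match.group(key)) for key in ("min", "max", "avg")
            )
    return stats


def latency_over(stats, limit):
    """Return the latency labels whose average exceeds `limit` seconds."""
    return sorted(
        name
        for name, (_low, _high, avg) in stats.items()
        if "Latency" in name and avg > limit
    )


# Sample lines copied from the nagiostats output posted in this message.
SAMPLE = """\
Total Service State Change: 0.000 / 37.300 / 0.163 %
Active Service Latency: 32.876 / 442.138 / 415.816 sec
Active Service Execution Time: 0.051 / 60.097 / 1.545 sec
Active Host Latency: 0.000 / 441.308 / 416.063 sec
"""

if __name__ == "__main__":
    for label in latency_over(parse_timings(SAMPLE), limit=10.0):
        print("latency too high:", label)
```

Run periodically against saved nagiostats output, a script along these lines would show whether latency climbs steadily or jumps after the first scheduling round, which is the distinction the rest of this thread turns on.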

nagiostats

Nagios Stats 3.2.3
Copyright (c) 2003-2008 Ethan Galstad (www.nagios.org)
Last Modified: 10-03-2010
License: GPL

CURRENT STATUS DATA
------------------------------------------------------
Status File: /usr/local/argos/aplicaciones/nagios/var/status.dat
Status File Age: 0d 0h 0m 11s
Status File Version: 3.2.3

Program Running Time: 0d 20h 56m 7s
Nagios PID: 21834
Used/High/Total Command Buffers: 0 / 0 / 4096

Total Services: 4032
Services Checked: 4032
Services Scheduled: 4030
Services Actively Checked: 4032
Services Passively Checked: 0
Total Service State Change: 0.000 / 37.300 / 0.163 %
Active Service Latency: 32.876 / 442.138 / 415.816 sec
Active Service Execution Time: 0.051 / 60.097 / 1.545 sec
Active Service State Change: 0.000 / 37.300 / 0.163 %
Active Services Last 1/5/15/60 min: 237 / 1530 / 4020 / 4020
Passive Service Latency: 0.000 / 0.000 / 0.000 sec
Passive Service State Change: 0.000 / 0.000 / 0.000 %
Passive Services Last 1/5/15/60 min: 0 / 0 / 0 / 0
Services Ok/Warn/Unk/Crit: 3766 / 38 / 44 / 184
Services Flapping: 0
Services In Downtime: 0

Total Hosts: 931
Hosts Checked: 931
Hosts Scheduled: 931
Hosts Actively Checked: 931
Host Passively Checked: 0
Total Host State Change: 0.000 / 12.370 / 0.077 %
Active Host Latency: 0.000 / 441.308 / 416.063 sec
Active Host Execution Time: 0.062 / 10.113 / 0.395 sec
Active Host State Change: 0.000 / 12.370 / 0.077 %
Active Hosts Last 1/5/15/60 min: 74 / 423 / 931 / 931
Passive Host Latency: 0.000 / 0.000 / 0.000 sec
Passive Host State Change: 0.000 / 0.000 / 0.000 %
Passive Hosts Last 1/5/15/60 min: 0 / 0 / 0 / 0
Hosts Up/Down/Unreach: 897 / 24 / 10
Hosts Flapping: 0
Hosts In Downtime: 1

Active Host Checks Last 1/5/15 min: 109 / 535 / 1583
   Scheduled: 87 / 433 / 1300
   On-demand: 22 / 102 / 283
   Parallel: 87 / 438 / 1323
   Serial: 0 / 0 / 0
   Cached: 22 / 97 / 260
Passive Host Checks Last 1/5/15 min: 0 / 0 / 0
Active Service Checks Last 1/5/15 min: 304 / 1605 / 4924
   Scheduled: 304 / 1605 / 4923
   On-demand: 0 / 0 / 1
   Cached: 0 / 0 / 0
Passive Service Checks Last 1/5/15 min: 0 / 0 / 0

External Commands Last 1/5/15 min: 0 / 0 / 0

nagios -s

Nagios Core 3.2.3
Copyright (c) 2009-2010 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 10-03-2010
License: GPL

Website: http://www.nagios.org
Warning: aggregate_status_updates directive ignored. All status file updates are now aggregated.
Warning: downtime_file variable ignored. Downtime entries are now stored in the status and retention files.
Warning: comment_file variable ignored. Comments are now stored in the status and retention files.

Timing information on object configuration processing is listed
below. You can use this information to see if precaching your
object configuration would be useful.

Object Config Source: Config files (uncached)

OBJECT CONFIG PROCESSING TIMES (* = Potential for precache savings with -u option)
----------------------------------
Read: 0.080036 sec
Resolve: 0.010660 sec *
Recomb Contactgroups: 0.002666 sec *
Recomb Hostgroups: 0.004086 sec *
Dup Services: 0.034632 sec *
Recomb Servicegroups: 0.001277 sec *
Duplicate: 0.010939 sec *
Inherit: 0.005594 sec *
Recomb Contacts: 0.000001 sec *
Sort: 0.000000 sec *
Register: 0.074413 sec
Free: 0.008730 sec
============
TOTAL: 0.234920 sec * = 0.071741 sec (30.54%) estimated savings

RETENTION DATA TIMES
----------------------------------
Read and Process: 0.495480 sec
============
TOTAL: 0.495480 sec

Timing information on configuration verification is listed below.

CONFIG VERIFICATION TIMES (* = Potential for speedup with -x option)
----------------------------------
Object Relationships: 0.060039 sec
Circular Paths: 0.026557 sec *
Misc: 0.005999 sec
============
TOTAL: 0.092595 sec * = 0.026557 sec (28.7%) estimated savings

EVENT SCHEDULING TIMES
-------------------------------------
Get service info: 0.014509 sec
Get host info info: 0.002853 sec
Get service params: 0.000078 sec
Schedule service times: 0.039947 sec
Schedule service events: 0.034656 sec
Get host params: 0.000001 sec
Schedule host times: 0.007519 sec
Schedule host events: 0.029519 sec
============
TOTAL: 0.129082 sec

Projected scheduling information for host and service checks
is listed below. This information assumes that you are going
to start running Nagios with your current config files.

HOST SCHEDULING INFORMATION
---------------------------
Total hosts: 931
Total scheduled hosts: 931
Host inter-check delay method: SMART
Average host check interval: 259.01 sec
Host inter-check delay: 0.28 sec
Max host check spread: 30 min
First scheduled check: Tue Oct 11 13:14:08 2011
Last scheduled check: Tue Oct 11 13:18:26 2011

SERVICE SCHEDULING INFORMATION
-------------------------------
Total services: 4032
Total scheduled services: 4030
Service inter-check delay method: SMART
Average service check interval: 299.55 sec
Inter-check delay: 0.07 sec
Interleave factor method: SMART
Average services per host: 4.33
Service interleave factor: 5
Max service check spread: 30 min
First scheduled check: Tue Oct 11 13:15:07 2011
Last scheduled check: Tue Oct 11 13:20:07 2011

CHECK PROCESSING INFORMATION
----------------------------
Check result reaper interval: 5 sec
Max concurrent service checks: Unlimited

PERFORMANCE SUGGESTIONS
-----------------------
I have no suggestions - things look okay.

--
Javier Vela Diago
S2 GRUPO
Ramiro de Maeztu, 7 bajo.
46022 Valencia
Tel: 963.110.300 Fax: 963.106.086
e-mail : jvela arroba s2grupo punto es
http://www.s2grupo.es

From Edwin.Zoeller at ama-assn.org Tue Oct 11 13:30:51 2011
From: Edwin.Zoeller at ama-assn.org (Edwin Zoeller)
Date: Tue, 11 Oct 2011 06:30:51 -0500
Subject: nrpe install and start on sunos joyent
In-Reply-To: <005101cc87f1$17da43b0$478ecb10$@it.net>
References: <005101cc87f1$17da43b0$478ecb10$@it.net>
Message-ID:

What version of Solaris are you running?

From m.borsani at it.net Tue Oct 11 14:10:09 2011
From: m.borsani at it.net (Marco Borsani)
Date: Tue, 11 Oct 2011 14:10:09 +0200
Subject: R: nrpe install and start on sunos joyent
In-Reply-To:
References: <005101cc87f1$17da43b0$478ecb10$@it.net>
Message-ID: <009101cc880e$bebf27a0$3c3d76e0$@it.net>

#> uname -a
SunOS mb-sm-512.local 5.11 joyent_20110922T212927Z i86pc i386 i86pc Solaris

I did not find any way to compile on it; I had to use the pkgin command.

Regards
marco
::: Messages without supporting info will risk being sent to /dev/null From daniel.wittenberg.r0ko at statefarm.com Tue Oct 11 15:44:37 2011 From: daniel.wittenberg.r0ko at statefarm.com (Daniel Wittenberg) Date: Tue, 11 Oct 2011 13:44:37 +0000 Subject: High check latency in a machine with low load In-Reply-To: References: Message-ID: <4288A518A157EC4C8873FEE74F778BF0058A13@WPSDGQHH.OPR.STATEFARM.ORG> I think you have the enable_high_latency option enabled :) j/k Do you have any particular checks that are taking a long time? i.e. can you watch top and see checks taking a while? Dan From: Javier Vela Diago [mailto:jvela at s2grupo.es] Sent: Tuesday, October 11, 2011 6:23 AM To: nagios-users at lists.sourceforge.net Subject: [Nagios-users] High check latency in a machine with low load Hi, I have a Nagios 3.2.3 deployment with 1000+ Hosts and 3000+ services. This Nagios runs together with NDO and PNP (in bulk mode) in a server with 4GB of Ram and 4 cpus. One day I realized that the check delay in the performance CGI was very high (300-400 seconds). It was very strange so I took the tunning guide form nagios (http://nagios.sourceforge.net/docs/3_0/tuning.html) and applied all the points I could. In particular I adjusted the max_concurrent_checks to zero (no limit): max_concurrent_checks=0 The reaper event: service_reaper_frequency=5 max_check_result_reaper_time=15 and checked that the host checks where not forced. In addition I configured 15 seconds of host check cache. cached_host_check_horizon=15 But the problem remains. And the load of the server is not very high. Load of 2,5, 2 GB of free memory and an average utilization of disc of 7%. I disabled NDO and PNP but it was useless. After the first round of checks, the delay returns, while the load of the server doesn't grow. I have searched in google but all the problems area because of the load in the server, but here this is not the main problem. 
So my question is: what can I do now? Is there some variable that shows me where to look? I'm a bit lost right now and I don't know how to find the problem.

Or maybe the only way is to configure a master-slave Nagios setup in order to maximize the server utilization?

In addition, I have pretty big timeouts (60 seconds) because of the high latency on the network. All your help is appreciated. Thank you in advance.

nagiostats

Nagios Stats 3.2.3
Copyright (c) 2003-2008 Ethan Galstad (www.nagios.org)
Last Modified: 10-03-2010
License: GPL

CURRENT STATUS DATA
------------------------------------------------------
Status File:                            /usr/local/argos/aplicaciones/nagios/var/status.dat
Status File Age:                        0d 0h 0m 11s
Status File Version:                    3.2.3

Program Running Time:                   0d 20h 56m 7s
Nagios PID:                             21834
Used/High/Total Command Buffers:        0 / 0 / 4096

Total Services:                         4032
Services Checked:                       4032
Services Scheduled:                     4030
Services Actively Checked:              4032
Services Passively Checked:             0
Total Service State Change:             0.000 / 37.300 / 0.163 %
Active Service Latency:                 32.876 / 442.138 / 415.816 sec
Active Service Execution Time:          0.051 / 60.097 / 1.545 sec
Active Service State Change:            0.000 / 37.300 / 0.163 %
Active Services Last 1/5/15/60 min:     237 / 1530 / 4020 / 4020
Passive Service Latency:                0.000 / 0.000 / 0.000 sec
Passive Service State Change:           0.000 / 0.000 / 0.000 %
Passive Services Last 1/5/15/60 min:    0 / 0 / 0 / 0
Services Ok/Warn/Unk/Crit:              3766 / 38 / 44 / 184
Services Flapping:                      0
Services In Downtime:                   0

Total Hosts:                            931
Hosts Checked:                          931
Hosts Scheduled:                        931
Hosts Actively Checked:                 931
Host Passively Checked:                 0
Total Host State Change:                0.000 / 12.370 / 0.077 %
Active Host Latency:                    0.000 / 441.308 / 416.063 sec
Active Host Execution Time:             0.062 / 10.113 / 0.395 sec
Active Host State Change:               0.000 / 12.370 / 0.077 %
Active Hosts Last 1/5/15/60 min:        74 / 423 / 931 / 931
Passive Host Latency:                   0.000 / 0.000 / 0.000 sec
Passive Host State Change:              0.000 / 0.000 / 0.000 %
Passive Hosts Last 1/5/15/60 min:       0 / 0 / 0 / 0
Hosts Up/Down/Unreach:                  897 / 24 / 10
Hosts Flapping:                         0
Hosts In Downtime:                      1

Active Host Checks Last 1/5/15 min:     109 / 535 / 1583
   Scheduled:                           87 / 433 / 1300
   On-demand:                           22 / 102 / 283
   Parallel:                            87 / 438 / 1323
   Serial:                              0 / 0 / 0
   Cached:                              22 / 97 / 260
Passive Host Checks Last 1/5/15 min:    0 / 0 / 0
Active Service Checks Last 1/5/15 min:  304 / 1605 / 4924
   Scheduled:                           304 / 1605 / 4923
   On-demand:                           0 / 0 / 1
   Cached:                              0 / 0 / 0
Passive Service Checks Last 1/5/15 min: 0 / 0 / 0

External Commands Last 1/5/15 min:      0 / 0 / 0

nagios -s

Nagios Core 3.2.3
Copyright (c) 2009-2010 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 10-03-2010
License: GPL

Website: http://www.nagios.org
Warning: aggregate_status_updates directive ignored. All status file updates are now aggregated.
Warning: downtime_file variable ignored. Downtime entries are now stored in the status and retention files.
Warning: comment_file variable ignored. Comments are now stored in the status and retention files.
Timing information on object configuration processing is listed below. You can use this information to see if precaching your object configuration would be useful.

Object Config Source: Config files (uncached)

OBJECT CONFIG PROCESSING TIMES (* = Potential for precache savings with -u option)
----------------------------------
Read:                 0.080036 sec
Resolve:              0.010660 sec *
Recomb Contactgroups: 0.002666 sec *
Recomb Hostgroups:    0.004086 sec *
Dup Services:         0.034632 sec *
Recomb Servicegroups: 0.001277 sec *
Duplicate:            0.010939 sec *
Inherit:              0.005594 sec *
Recomb Contacts:      0.000001 sec *
Sort:                 0.000000 sec *
Register:             0.074413 sec
Free:                 0.008730 sec
                      ============
TOTAL:                0.234920 sec * = 0.071741 sec (30.54%) estimated savings

RETENTION DATA TIMES
----------------------------------
Read and Process:     0.495480 sec
                      ============
TOTAL:                0.495480 sec

Timing information on configuration verification is listed below.

CONFIG VERIFICATION TIMES (* = Potential for speedup with -x option)
----------------------------------
Object Relationships: 0.060039 sec
Circular Paths:       0.026557 sec *
Misc:                 0.005999 sec
                      ============
TOTAL:                0.092595 sec * = 0.026557 sec (28.7%) estimated savings

EVENT SCHEDULING TIMES
-------------------------------------
Get service info:        0.014509 sec
Get host info info:      0.002853 sec
Get service params:      0.000078 sec
Schedule service times:  0.039947 sec
Schedule service events: 0.034656 sec
Get host params:         0.000001 sec
Schedule host times:     0.007519 sec
Schedule host events:    0.029519 sec
                         ============
TOTAL:                   0.129082 sec

Projected scheduling information for host and service checks is listed below. This information assumes that you are going to start running Nagios with your current config files.

HOST SCHEDULING INFORMATION
---------------------------
Total hosts:                     931
Total scheduled hosts:           931
Host inter-check delay method:   SMART
Average host check interval:     259.01 sec
Host inter-check delay:          0.28 sec
Max host check spread:           30 min
First scheduled check:           Tue Oct 11 13:14:08 2011
Last scheduled check:            Tue Oct 11 13:18:26 2011

SERVICE SCHEDULING INFORMATION
-------------------------------
Total services:                     4032
Total scheduled services:           4030
Service inter-check delay method:   SMART
Average service check interval:     299.55 sec
Inter-check delay:                  0.07 sec
Interleave factor method:           SMART
Average services per host:          4.33
Service interleave factor:          5
Max service check spread:           30 min
First scheduled check:              Tue Oct 11 13:15:07 2011
Last scheduled check:               Tue Oct 11 13:20:07 2011

CHECK PROCESSING INFORMATION
----------------------------
Check result reaper interval:       5 sec
Max concurrent service checks:      Unlimited

PERFORMANCE SUGGESTIONS
-----------------------
I have no suggestions - things look okay.

-- 
Javier Vela Diago
S2 GRUPO
Ramiro de Maeztu, 7 bajo.
46022 Valencia
Tel: 963.110.300 Fax: 963.106.086
e-mail: jvela at s2grupo dot es
http://www.s2grupo.es

From Edwin.Zoeller at ama-assn.org Tue Oct 11 15:53:29 2011
From: Edwin.Zoeller at ama-assn.org (Edwin Zoeller)
Date: Tue, 11 Oct 2011 08:53:29 -0500
Subject: R: nrpe install and start on sunos joyent
In-Reply-To: <009101cc880e$bebf27a0$3c3d76e0$@it.net>
References: <005101cc87f1$17da43b0$478ecb10$@it.net> <009101cc880e$bebf27a0$3c3d76e0$@it.net>
Message-ID:

I did not have to recompile nrpe to run on my Solaris servers, but I had to provide an older library for it to use. I currently have it running on Solaris 2.8, 2.9 and 2.10.

From: Marco Borsani [mailto:m.borsani at it.net]
Sent: Tuesday, October 11, 2011 7:10 AM
To: 'Nagios Users List'
Subject: [Nagios-users] R: nrpe install and start on sunos joyent

#> uname -a
SunOS mb-sm-512.local 5.11 joyent_20110922T212927Z i86pc i386 i86pc Solaris

I did not find any way to compile on it; I had to use the pkgin command.

Regards
marco

From: Edwin Zoeller [mailto:Edwin.Zoeller at ama-assn.org]
Sent: Tuesday, October 11, 2011 13:31
To: nagios-users at lists.sourceforge.net
Subject: Re: [Nagios-users] nrpe install and start on sunos joyent

What version of Solaris are you running?

From: Marco Borsani [mailto:m.borsani at it.net]
Sent: Tuesday, October 11, 2011 03:37 AM
To: NAGIOS
Subject: [Nagios-users] nrpe install and start on sunos joyent

Hello,

I have to run the NRPE client on several SunOS systems, but I don't know that OS well. I also ran into many problems while compiling, which I solved by using the precompiled package, installed with "pkgin in nrpe".

1) I have prepared nrpe.xml
2) I started the service with "svcadm enable nrpe", but in the error log I read:

[ Oct 11 08:26:30 Enabled. ]
[ Oct 11 08:26:30 Executing start method ("/opt/local/sbin/nrpe -c /opt/local/etc/nagios/nrpe.cfg -d"). ]
[ Oct 11 08:26:30 svc.startd could not set context for method: ]
chdir: No such file or directory
[ Oct 11 08:26:30 Method "start" exited with status 96. ]

Any ideas? Has anyone ever done this kind of installation?

Regards
Marco
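A note on the svc.startd error in the message above: "could not set context for method: chdir: No such file or directory" is consistent with the working directory named in the manifest's method_context not existing on the machine, although that is only one plausible cause. The following is a minimal Python sketch for checking it, not Marco's actual setup. It assumes nrpe.xml is an ordinary SMF manifest that carries a working_directory attribute on method_context; the helper name missing_working_dirs is hypothetical.

```python
import os
import xml.etree.ElementTree as ET


def missing_working_dirs(manifest_xml, isdir=os.path.isdir):
    """Return the working_directory values in an SMF manifest that do not exist.

    manifest_xml is the manifest content (str or bytes). Values starting with
    ":" are skipped because they are treated here as SMF tokens rather than
    literal paths (an assumption of this sketch).
    """
    root = ET.fromstring(manifest_xml)
    missing = []
    for context in root.iter("method_context"):
        workdir = context.get("working_directory")
        if workdir and not workdir.startswith(":") and not isdir(workdir):
            missing.append(workdir)
    return missing
```

To try it, read the manifest as bytes and pass the content in, for example missing_working_dirs(open("nrpe.xml", "rb").read()), where the file name stands for wherever the manifest was saved. An empty list means every listed directory exists, so the cause would have to be looked for elsewhere (for instance in the user or home directory the method runs under).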
From jvela at s2grupo.es Tue Oct 11 16:16:55 2011
From: jvela at s2grupo.es (Javier Vela Diago)
Date: Tue, 11 Oct 2011 16:16:55 +0200
Subject: High check latency in a machine with low load
In-Reply-To: <4288A518A157EC4C8873FEE74F778BF0058A13@WPSDGQHH.OPR.STATEFARM.ORG>
References: <4288A518A157EC4C8873FEE74F778BF0058A13@WPSDGQHH.OPR.STATEFARM.ORG>
Message-ID:

I have a lot of custom checks, written mostly in Perl and bash, with some in Python, and some of them take a lot of time.

Never mind, I think I found the solution, or at least part of it. I set use_large_installation_tweaks to 1. Six months ago this option almost crashed my system, so I discarded it. Now, with bigger problems, it was the last thing I wanted to test, but I finally tried it this afternoon.

When I restarted Nagios, the load grew to 6-8 and the latency problems disappeared. I was sceptical about the usefulness of this option, but a load change from 2.5 to 6 means that the machine is now doing a lot of work that it wasn't doing before.

Now the problem is that NDOUtils is causing some latency because of MySQL, but at least I know what to optimize. Any tips will be appreciated :)

Thank you and sorry for taking your time.

From: Daniel Wittenberg
To: Nagios Users List
Date: 11/10/2011 16:02
Subject: Re: [Nagios-users] High check latency in a machine with low load

I think you have the enable_high_latency option enabled :) j/k

Do you have any particular checks that are taking a long time? i.e. can you watch top and see checks taking a while?

Dan

From: Javier Vela Diago [mailto:jvela at s2grupo.es]
Sent: Tuesday, October 11, 2011 6:23 AM
To: nagios-users at lists.sourceforge.net
Subject: [Nagios-users] High check latency in a machine with low load

Hi,

I have a Nagios 3.2.3 deployment with 1000+ Hosts and 3000+ services.
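One way to read the figures posted in this thread is to compare the check rate the schedule requires with the rate that was actually achieved. The following Python sketch does that arithmetic using only numbers copied from the nagiostats and "nagios -s" output. Treating "Checks Last 5 min" as the achieved throughput is an assumption of this sketch, and the variable names are made up for the example.

```python
# Figures copied from the nagiostats / "nagios -s" output posted in the thread.
SCHEDULED_SERVICES = 4030           # "Total scheduled services"
AVG_SERVICE_INTERVAL_SEC = 299.55   # "Average service check interval"
SERVICE_CHECKS_LAST_5_MIN = 1605    # "Active Service Checks Last 1/5/15 min"

SCHEDULED_HOSTS = 931               # "Total scheduled hosts"
AVG_HOST_INTERVAL_SEC = 259.01      # "Average host check interval"
HOST_CHECKS_LAST_5_MIN = 433        # scheduled host checks in the last 5 min

WINDOW_SEC = 5 * 60

# Rate needed to run every check once per interval, in checks per second.
required_service_rate = SCHEDULED_SERVICES / AVG_SERVICE_INTERVAL_SEC
required_host_rate = SCHEDULED_HOSTS / AVG_HOST_INTERVAL_SEC

# Rate actually observed over the last five minutes (assumed interpretation).
observed_service_rate = SERVICE_CHECKS_LAST_5_MIN / WINDOW_SEC
observed_host_rate = HOST_CHECKS_LAST_5_MIN / WINDOW_SEC

for name, required, observed in (
    ("service", required_service_rate, observed_service_rate),
    ("host", required_host_rate, observed_host_rate),
):
    print(f"{name}: required {required:.2f}/s, observed {observed:.2f}/s, "
          f"ratio {observed / required:.0%}")
```

With these inputs the schedule asks for roughly 13 service checks per second while about 5 per second were being run, so checks queue up and latency keeps growing even though the machine is mostly idle. That would fit the report that the load rose and the latency went away once the large installation tweaks were enabled.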
From mguthrie at nagios.com Tue Oct 11 16:25:48 2011
From: mguthrie at nagios.com (Mike Guthrie)
Date: Tue, 11 Oct 2011 09:25:48 -0500
Subject: High check latency in a machine with low load
In-Reply-To:
References: <4288A518A157EC4C8873FEE74F778BF0058A13@WPSDGQHH.OPR.STATEFARM.ORG>
Message-ID: <4E9451EC.7000102@nagios.com>

If ndoutils starts to create a heavy burden on the system, you can also offload ndoutils/mysql to a second machine. We wrote the document below for Nagios XI, but it has the info you'd need to make it work for Nagios Core as well.

http://library.nagios.com/library/products/nagiosxi/documentation/462-offloading-mysql-to-remote-server

Javier Vela Diago wrote:
> I have a lot of custom checks, written mostly in perl, bash and some
> in python. And some take a lot of time.
>
> Never mind, I think I found the solution, or at least one part. I
> set use_large_installation_tweaks to 1. This option,
> 6 months ago, almost crashed my system, so I discarded it. Now, with
> bigger problems, it was the last thing that I wanted to test, but finally
> this afternoon I tested it.
>
> When I restarted Nagios, the load started to grow to 6-8, and
> the latency problems disappeared. I was sceptical about the utility of
> this option, but when the load changes from 2.5 to 6, it means that
> the machine is doing a lot of work that it wasn't doing before.
>
> Now the problem is that NDOUtils is causing some latency because of
> MySQL, but well, at least I know what to optimize. Some tips will be
> appreciated :)
>
> Thank you and sorry for your time.
-- 
Mike Guthrie
Technical Team
___
Nagios Enterprises, LLC
Email: mguthrie at nagios.com
Web: www.nagios.com
From jvela at s2grupo.es Tue Oct 11 16:50:51 2011
From: jvela at s2grupo.es (Javier Vela Diago)
Date: Tue, 11 Oct 2011 16:50:51 +0200
Subject: High check latency in a machine with low load
In-Reply-To: <4E9451EC.7000102@nagios.com>
References: <4288A518A157EC4C8873FEE74F778BF0058A13@WPSDGQHH.OPR.STATEFARM.ORG> <4E9451EC.7000102@nagios.com>
Message-ID:

Thank you for the advice, but due to some problems in the past, I already have the MySQL database on another machine with 2 CPUs and 2 GB of RAM. Also, because of the problems I suffered, I have a script that optimizes and repairs the NDOUtils database every night.

My goal now is to change the engine from MyISAM to InnoDB and apply some tuning to the database. The reason for the engine change is that when problems start with MyISAM, I have to truncate the database because the optimize hangs, whereas with InnoDB it has worked fine in the tests I've made.

Javi

From: Mike Guthrie
To: Nagios Users List
Date: 11/10/2011 16:39
Subject: Re: [Nagios-users] High check latency in a machine with low load

If ndoutils starts to create a heavy burden on the system, you can also offload ndoutils/mysql to a second machine. We wrote the document below for Nagios XI, but it has the info you'd need to make it work for Nagios Core as well.

http://library.nagios.com/library/products/nagiosxi/documentation/462-offloading-mysql-to-remote-server

Javier Vela Diago wrote:
> I have a lot of custom checks, written mostly in perl, bash and some
> in python. And some take a lot of time.
>
> Never mind, I think I found the solution, or at least one part. I
> set use_large_installation_tweaks to 1. This option,
> 6 months ago, almost crashed my system, so I discarded it. Now, with
> bigger problems, it was the last thing that I wanted to test, but finally
> this afternoon I tested it.
> > When I restarted Nagios, the load has started to grow until 6-8, and > the latency problems dissapeared. I was sceptical about the utility of > this options but when the load changes form 2,5 to 6, it means that > the machine is doing a lot of work that before wasn't doing. > > Now the problem is that NDOUtils is causing some latency because of > MYSQL, but well, at least I know what to optimize. Some tips will be > apreciated :) > > Thank you and sorry for your time. > > > De: Daniel Wittenberg > Para: Nagios Users List > Fecha: 11/10/2011 16:02 > Asunto: Re: [Nagios-users] High check latency in a machine with > low load > ------------------------------------------------------------------------ > > > > I think you have the enable_high_latency option enabled J j/k > > Do you have any particular checks that are taking a long time? i.e. > can you watch top and see checks taking a while? > > Dan > > > *From:* Javier Vela Diago [mailto:jvela at s2grupo.es] * > Sent:* Tuesday, October 11, 2011 6:23 AM* > To:* nagios-users at lists.sourceforge.net* > Subject:* [Nagios-users] High check latency in a machine with low load > > Hi, > > I have a Nagios 3.2.3 deployment with 1000+ Hosts and 3000+ services. > This Nagios runs together with NDO and PNP (in bulk mode) in a server > with 4GB of Ram and 4 cpus. > > One day I realized that the check delay in the performance CGI was > very high (300-400 seconds). It was very strange so I took the tunning > guide form nagios > (_http://nagios.sourceforge.net/docs/3_0/tuning.html_) and applied all > the points I could. In particular I adjusted the max_concurrent_checks > to zero (no limit): > > max_concurrent_checks=0 > > The reaper event: > > service_reaper_frequency=5 > max_check_result_reaper_time=15 > > and checked that the host checks where not forced. In addition I > configured 15 seconds of host check cache. > > cached_host_check_horizon=15 > > But the problem remains. And the load of the server is not very high. 
> The load is 2.5, with 2 GB of free memory and an average disk
> utilization of 7%. I disabled NDO and PNP, but it made no difference.
> After the first round of checks the delay returns, while the load on
> the server doesn't grow.
>
> I have searched on Google, but all the problems I found are caused by
> load on the server, which is not the main problem here. So my
> question is: what can I do now? Is there some variable that shows me
> where to look? I'm a bit lost right now and I don't know how to find
> the problem.
>
> Or maybe the only way is to configure a master-slave Nagios setup in
> order to maximize server utilization?
>
> In addition, I have pretty big timeouts (60 seconds) because of the
> high latency on the network. All your help is appreciated. Thank you
> in advance.
>
> *nagiostats*
> Nagios Stats 3.2.3
> Copyright (c) 2003-2008 Ethan Galstad (www.nagios.org)
> Last Modified: 10-03-2010
> License: GPL
>
> CURRENT STATUS DATA
> ------------------------------------------------------
> Status File:                            /usr/local/argos/aplicaciones/nagios/var/status.dat
> Status File Age:                        0d 0h 0m 11s
> Status File Version:                    3.2.3
>
> Program Running Time:                   0d 20h 56m 7s
> Nagios PID:                             21834
> Used/High/Total Command Buffers:        0 / 0 / 4096
>
> Total Services:                         4032
> Services Checked:                       4032
> Services Scheduled:                     4030
> Services Actively Checked:              4032
> Services Passively Checked:             0
> Total Service State Change:             0.000 / 37.300 / 0.163 %
> Active Service Latency:                 32.876 / 442.138 / 415.816 sec
> Active Service Execution Time:          0.051 / 60.097 / 1.545 sec
> Active Service State Change:            0.000 / 37.300 / 0.163 %
> Active Services Last 1/5/15/60 min:     237 / 1530 / 4020 / 4020
> Passive Service Latency:                0.000 / 0.000 / 0.000 sec
> Passive Service State Change:           0.000 / 0.000 / 0.000 %
> Passive Services Last 1/5/15/60 min:    0 / 0 / 0 / 0
> Services Ok/Warn/Unk/Crit:              3766 / 38 / 44 / 184
> Services Flapping:                      0
> Services In Downtime:                   0
>
> Total Hosts:                            931
> Hosts Checked:                          931
> Hosts Scheduled:
931 > Hosts Actively Checked: 931 > Host Passively Checked: 0 > Total Host State Change: 0.000 / 12.370 / 0.077 % > Active Host Latency: 0.000 / 441.308 / 416.063 sec > Active Host Execution Time: 0.062 / 10.113 / 0.395 sec > Active Host State Change: 0.000 / 12.370 / 0.077 % > Active Hosts Last 1/5/15/60 min: 74 / 423 / 931 / 931 > Passive Host Latency: 0.000 / 0.000 / 0.000 sec > Passive Host State Change: 0.000 / 0.000 / 0.000 % > Passive Hosts Last 1/5/15/60 min: 0 / 0 / 0 / 0 > Hosts Up/Down/Unreach: 897 / 24 / 10 > Hosts Flapping: 0 > Hosts In Downtime: 1 > > Active Host Checks Last 1/5/15 min: 109 / 535 / 1583 > Scheduled: 87 / 433 / 1300 > On-demand: 22 / 102 / 283 > Parallel: 87 / 438 / 1323 > Serial: 0 / 0 / 0 > Cached: 22 / 97 / 260 > Passive Host Checks Last 1/5/15 min: 0 / 0 / 0 > Active Service Checks Last 1/5/15 min: 304 / 1605 / 4924 > Scheduled: 304 / 1605 / 4923 > On-demand: 0 / 0 / 1 > Cached: 0 / 0 / 0 > Passive Service Checks Last 1/5/15 min: 0 / 0 / 0 > > External Commands Last 1/5/15 min: 0 / 0 / 0 > * > nagios -s* > > Nagios Core 3.2.3 > Copyright (c) 2009-2010 Nagios Core Development Team and Community > Contributors > Copyright (c) 1999-2009 Ethan Galstad > Last Modified: 10-03-2010 > License: GPL > > Website: _http://www.nagios.org_ > Warning: aggregate_status_updates directive ignored. All status file > updates are now aggregated. > Warning: downtime_file variable ignored. Downtime entries are now > stored in the status and retention files. > Warning: comment_file variable ignored. Comments are now stored in > the status and retention files. > Timing information on object configuration processing is listed > below. You can use this information to see if precaching your > object configuration would be useful. 
> > Object Config Source: Config files (uncached) > > OBJECT CONFIG PROCESSING TIMES (* = Potential for precache > savings with -u option) > ---------------------------------- > Read: 0.080036 sec > Resolve: 0.010660 sec * > Recomb Contactgroups: 0.002666 sec * > Recomb Hostgroups: 0.004086 sec * > Dup Services: 0.034632 sec * > Recomb Servicegroups: 0.001277 sec * > Duplicate: 0.010939 sec * > Inherit: 0.005594 sec * > Recomb Contacts: 0.000001 sec * > Sort: 0.000000 sec * > Register: 0.074413 sec > Free: 0.008730 sec > ============ > TOTAL: 0.234920 sec * = 0.071741 sec (30.54%) > estimated savings > > > RETENTION DATA TIMES > ---------------------------------- > Read and Process: 0.495480 sec > ============ > TOTAL: 0.495480 sec > > > Timing information on configuration verification is listed below. > > CONFIG VERIFICATION TIMES (* = Potential for speedup with -x > option) > ---------------------------------- > Object Relationships: 0.060039 sec > Circular Paths: 0.026557 sec * > Misc: 0.005999 sec > ============ > TOTAL: 0.092595 sec * = 0.026557 sec (28.7%) estimated > savings > > > EVENT SCHEDULING TIMES > ------------------------------------- > Get service info: 0.014509 sec > Get host info info: 0.002853 sec > Get service params: 0.000078 sec > Schedule service times: 0.039947 sec > Schedule service events: 0.034656 sec > Get host params: 0.000001 sec > Schedule host times: 0.007519 sec > Schedule host events: 0.029519 sec > ============ > TOTAL: 0.129082 sec > > > Projected scheduling information for host and service checks > is listed below. This information assumes that you are going > to start running Nagios with your current config files. 
> > HOST SCHEDULING INFORMATION > --------------------------- > Total hosts: 931 > Total scheduled hosts: 931 > Host inter-check delay method: SMART > Average host check interval: 259.01 sec > Host inter-check delay: 0.28 sec > Max host check spread: 30 min > First scheduled check: Tue Oct 11 13:14:08 2011 > Last scheduled check: Tue Oct 11 13:18:26 2011 > > > SERVICE SCHEDULING INFORMATION > ------------------------------- > Total services: 4032 > Total scheduled services: 4030 > Service inter-check delay method: SMART > Average service check interval: 299.55 sec > Inter-check delay: 0.07 sec > Interleave factor method: SMART > Average services per host: 4.33 > Service interleave factor: 5 > Max service check spread: 30 min > First scheduled check: Tue Oct 11 13:15:07 2011 > Last scheduled check: Tue Oct 11 13:20:07 2011 > > > CHECK PROCESSING INFORMATION > ---------------------------- > Check result reaper interval: 5 sec > Max concurrent service checks: Unlimited > > > PERFORMANCE SUGGESTIONS > ----------------------- > I have no suggestions - things look okay. > -- > Javier Vela Diago > S2 GRUPO > Ramiro de Maeztu, 7 bajo. 46022 Valencia > Tel: 963.110.300 Fax: 963.106.086 > e-mail : jvela arroba s2grupo punto es_ > __http://www.s2grupo.es_ > ------------------------------------------------------------------------------ > All the data continuously generated in your IT infrastructure contains a > definitive record of customers, application performance, security > threats, fraudulent activity and more. Splunk takes this data and makes > sense of it. Business sense. IT sense. Common sense. > http://p.sf.net/sfu/splunk-d2d-oct_______________________________________________ > Nagios-users mailing list > Nagios-users at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/nagios-users > ::: Please include Nagios version, plugin version (-v) and OS when > reporting any issue. 
> ::: Messages without supporting info will risk being sent to /dev/null > ------------------------------------------------------------------------ > > ------------------------------------------------------------------------------ > All the data continuously generated in your IT infrastructure contains a > definitive record of customers, application performance, security > threats, fraudulent activity and more. Splunk takes this data and makes > sense of it. Business sense. IT sense. Common sense. > http://p.sf.net/sfu/splunk-d2d-oct > ------------------------------------------------------------------------ > > _______________________________________________ > Nagios-users mailing list > Nagios-users at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/nagios-users > ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. > ::: Messages without supporting info will risk being sent to /dev/null -- Mike Guthrie Technical Team ___ Nagios Enterprises, LLC Email: mguthrie at nagios.com Web: www.nagios.com ------------------------------------------------------------------------------ All the data continuously generated in your IT infrastructure contains a definitive record of customers, application performance, security threats, fraudulent activity and more. Splunk takes this data and makes sense of it. Business sense. IT sense. Common sense. http://p.sf.net/sfu/splunk-d2d-oct _______________________________________________ Nagios-users mailing list Nagios-users at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. ::: Messages without supporting info will risk being sent to /dev/null -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From mguthrie at nagios.com Tue Oct 11 17:02:22 2011
From: mguthrie at nagios.com (Mike Guthrie)
Date: Tue, 11 Oct 2011 10:02:22 -0500
Subject: High check latency in a machine with low load
In-Reply-To:
References: <4288A518A157EC4C8873FEE74F778BF0058A13@WPSDGQHH.OPR.STATEFARM.ORG> <4E9451EC.7000102@nagios.com>
Message-ID: <4E945A7E.9090705@nagios.com>

Just to double-check: are you locking your tables or stopping MySQL when you do your repair runs? If the tables are being written to while a repair run is in progress, you risk corrupting your DB tables, which will tank your CPU.

As for the checks, how many checks per second is your machine running (on average)?

Javier Vela Diago wrote:
> Thank you for the advice, but because of some problems in the past, I
> already have the MySQL database on another machine with 2 CPUs and
> 2 GB of RAM.
>
> Also, because of those problems, I have a script that optimizes and
> repairs the ndoutils database every night. My goal now is to change
> the storage engine from MyISAM to InnoDB and apply some tuning to the
> database. The reason for the engine change is that when problems
> start with MyISAM, I have to truncate the database because the
> optimize run hangs, whereas with InnoDB it has worked fine in the
> tests I've made.
>
> Javi

--
Mike Guthrie
Technical Team
___
Nagios Enterprises, LLC
Email: mguthrie at nagios.com
Web: www.nagios.com

------------------------------------------------------------------------------
All the data continuously generated in your IT infrastructure contains a
definitive record of customers, application performance, security
threats, fraudulent activity and more. Splunk takes this data and makes
sense of it. Business sense. IT sense. Common sense.
http://p.sf.net/sfu/splunk-d2d-oct

From jvela at s2grupo.es Tue Oct 11 17:29:41 2011
From: jvela at s2grupo.es (Javier Vela Diago)
Date: Tue, 11 Oct 2011 17:29:41 +0200
Subject: High check latency in a machine with low load
In-Reply-To: <4E945A7E.9090705@nagios.com>
References: <4288A518A157EC4C8873FEE74F778BF0058A13@WPSDGQHH.OPR.STATEFARM.ORG> <4E9451EC.7000102@nagios.com> <4E945A7E.9090705@nagios.com>
Message-ID:

No, I don't stop MySQL when I run the optimize and repair. Should I? The database has to be running in order to run mysqlcheck --repair and mysqlcheck --optimize, doesn't it? I thought that optimize and repair locked the tables by themselves.

Checks per second? I don't know for sure, but 4000 services with a 5-minute interval gives us about 13 checks/second. Moreover, I know that I often have more than 80 concurrent checks (in the past I got some errors when I reached the old limit of max_concurrent_checks=80).

Javi

--
Javier Vela Diago
S2 GRUPO
Ramiro de Maeztu, 7 bajo. 46022 Valencia
Tel: 963.110.300 Fax: 963.106.086
e-mail : jvela at s2grupo dot es
http://www.s2grupo.es

From: Mike Guthrie
To: Nagios Users List
Date: 11/10/2011 17:17
Subject: Re: [Nagios-users] High check latency in a machine with low load

Just to double check, are you locking your tables or stopping mysql when you do your repair runs? You'll actually risk corrupting your DB tables which will tank your CPU if the tables are being written to while a repair run is occurring.

As far as the checks go, how many checks per second is your machine running (on average)?
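
The checks-per-second estimate in this exchange (4000 services at a 5-minute interval) can be reproduced with a few lines of arithmetic. The sketch below is only an illustration, not something posted in the thread: the function names are hypothetical, the input figures are copied from the nagiostats and nagios -s output quoted in this thread, and the concurrency figure uses Little's law (mean checks in flight = start rate x mean execution time), which is an assumption brought in here rather than anything Nagios reports.

```python
def checks_per_second(num_checks: int, interval_seconds: float) -> float:
    """Average number of checks that must start each second."""
    return num_checks / interval_seconds


def average_concurrency(rate_per_second: float, avg_execution_seconds: float) -> float:
    """Little's law: mean checks in flight = start rate * mean execution time."""
    return rate_per_second * avg_execution_seconds


# Round numbers used in the thread: 4000 services, 5-minute interval.
thread_rate = checks_per_second(4000, 5 * 60)

# Scheduled counts and average check intervals from the quoted "nagios -s" output.
service_rate = checks_per_second(4030, 299.55)
host_rate = checks_per_second(931, 259.01)

# Average execution times from the quoted nagiostats output
# (third value of each triple, assumed to be min / max / avg).
in_flight = (average_concurrency(service_rate, 1.545)
             + average_concurrency(host_rate, 0.395))

if __name__ == "__main__":
    print(f"thread estimate:   {thread_rate:.2f} service checks/s")
    print(f"scheduled services: {service_rate:.2f} checks/s")
    print(f"scheduled hosts:    {host_rate:.2f} checks/s")
    print(f"mean checks in flight (Little's law): {in_flight:.1f}")
```

This only gives a long-run average. It says nothing about bursts, and checks that run into the 60-second timeout hold a slot far longer than the mean, so the number of concurrent checks at a given moment can be much higher than this figure.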
Javier Vela Diago wrote: > Thank you for the advise, but due some problems in the past, I already > have the mysql database in another machine with 2 cpus and 2GB of ram. > > Also, because of the problems I suffered, I have a script that every > nigth optimizes and repairs the ndoutils database. My goal now is to > change the engine from MyISAM to INNODB and apply some tunnig to the > database. The engine change is because when problems start, with > MyISAM I have to truncate the database because optimize hangs out, but > with InnoDB, in the tests I've made, works fine. > > Javi > > > > De: Mike Guthrie > Para: Nagios Users List > Fecha: 11/10/2011 16:39 > Asunto: Re: [Nagios-users] High check latency in a machine with > low load > ------------------------------------------------------------------------ > > > > If ndoutils starts to create a heavy burden on the system you can also > offload ndoutils/mysql to a second machine. We wrote the below document > for Nagios XI, but the doc has the info you'd need to make it work for > Nagios Core as well. > > http://library.nagios.com/library/products/nagiosxi/documentation/462-offloading-mysql-to-remote-server > > > > Javier Vela Diago wrote: > > I have a lot of custom checks, written mostly in perl, bash and some > > in python. And some take a lo of time. > > > > Nevermind, I think I found the solution, or at least one part. I > > configured to 1 the enable_large_instalallation_tweaks. This options, > > 6 months ago, almost crashed my system, so i discarded it. Now, with > > bigger problems, is the last thing that I wanted to test, but finally > > this afternoon I tested it. > > > > When I restarted Nagios, the load has started to grow until 6-8, and > > the latency problems dissapeared. I was sceptical about the utility of > > this options but when the load changes form 2,5 to 6, it means that > > the machine is doing a lot of work that before wasn't doing. 
> > > > Now the problem is that NDOUtils is causing some latency because of > > MYSQL, but well, at least I know what to optimize. Some tips will be > > apreciated :) > > > > Thank you and sorry for your time. > > > > > > De: Daniel Wittenberg > > Para: Nagios Users List > > Fecha: 11/10/2011 16:02 > > Asunto: Re: [Nagios-users] High check latency in a machine with > > low load > > ------------------------------------------------------------------------ > > > > > > > > I think you have the enable_high_latency option enabled J j/k > > > > Do you have any particular checks that are taking a long time? i.e. > > can you watch top and see checks taking a while? > > > > Dan > > > > > > *From:* Javier Vela Diago [mailto:jvela at s2grupo.es] * > > Sent:* Tuesday, October 11, 2011 6:23 AM* > > To:* nagios-users at lists.sourceforge.net* > > Subject:* [Nagios-users] High check latency in a machine with low load > > > > Hi, > > > > I have a Nagios 3.2.3 deployment with 1000+ Hosts and 3000+ services. > > This Nagios runs together with NDO and PNP (in bulk mode) in a server > > with 4GB of Ram and 4 cpus. > > > > One day I realized that the check delay in the performance CGI was > > very high (300-400 seconds). It was very strange so I took the tunning > > guide form nagios > > (_http://nagios.sourceforge.net/docs/3_0/tuning.html_) and applied all > > the points I could. In particular I adjusted the max_concurrent_checks > > to zero (no limit): > > > > max_concurrent_checks=0 > > > > The reaper event: > > > > service_reaper_frequency=5 > > max_check_result_reaper_time=15 > > > > and checked that the host checks where not forced. In addition I > > configured 15 seconds of host check cache. > > > > cached_host_check_horizon=15 > > > > But the problem remains. And the load of the server is not very high. > > Load of 2,5, 2 GB of free memory and an average utilization of disc of > > 7%. I disabled NDO and PNP but it was useless. 
After the first round of checks, the delay returns, while the load of the server doesn't grow.
> >
> > I have searched on Google, but all the problems I found are caused by
> > load on the server, which is not the main problem here. So my
> > question is: what can I do now? Is there some variable that shows me
> > where to look? I'm a bit lost right now and I don't know how to find
> > the problem.
> >
> > Or maybe the only way is to configure a master-slave Nagios in order
> > to maximize the server utilization?
> >
> > In addition, I have pretty big timeouts (60 seconds) because of the
> > high latency on the network. All your help is appreciated. Thank you
> > in advance.
> >
> > nagiostats
> > Nagios Stats 3.2.3
> > Copyright (c) 2003-2008 Ethan Galstad (www.nagios.org)
> > Last Modified: 10-03-2010
> > License: GPL
> >
> > CURRENT STATUS DATA
> > ------------------------------------------------------
> > Status File: /usr/local/argos/aplicaciones/nagios/var/status.dat
> > Status File Age: 0d 0h 0m 11s
> > Status File Version: 3.2.3
> >
> > Program Running Time: 0d 20h 56m 7s
> > Nagios PID: 21834
> > Used/High/Total Command Buffers: 0 / 0 / 4096
> >
> > Total Services: 4032
> > Services Checked: 4032
> > Services Scheduled: 4030
> > Services Actively Checked: 4032
> > Services Passively Checked: 0
> > Total Service State Change: 0.000 / 37.300 / 0.163 %
> > Active Service Latency: 32.876 / 442.138 / 415.816 sec
> > Active Service Execution Time: 0.051 / 60.097 / 1.545 sec
> > Active Service State Change: 0.000 / 37.300 / 0.163 %
> > Active Services Last 1/5/15/60 min: 237 / 1530 / 4020 / 4020
> > Passive Service Latency: 0.000 / 0.000 / 0.000 sec
> > Passive Service State Change: 0.000 / 0.000 / 0.000 %
> > Passive Services Last 1/5/15/60 min: 0 / 0 / 0 / 0
> > Services Ok/Warn/Unk/Crit: 3766 / 38 / 44 / 184
> > Services Flapping: 0
> > Services In Downtime: 0
> >
> > Total Hosts: 931
> > Hosts Checked: 931
> > Hosts Scheduled: 931
> > Hosts Actively Checked: 931
> > Host Passively Checked: 0
> > Total Host State Change: 0.000 / 12.370 / 0.077 %
> > Active Host Latency: 0.000 / 441.308 / 416.063 sec
> > Active Host Execution Time: 0.062 / 10.113 / 0.395 sec
> > Active Host State Change: 0.000 / 12.370 / 0.077 %
> > Active Hosts Last 1/5/15/60 min: 74 / 423 / 931 / 931
> > Passive Host Latency: 0.000 / 0.000 / 0.000 sec
> > Passive Host State Change: 0.000 / 0.000 / 0.000 %
> > Passive Hosts Last 1/5/15/60 min: 0 / 0 / 0 / 0
> > Hosts Up/Down/Unreach: 897 / 24 / 10
> > Hosts Flapping: 0
> > Hosts In Downtime: 1
> >
> > Active Host Checks Last 1/5/15 min: 109 / 535 / 1583
> >    Scheduled: 87 / 433 / 1300
> >    On-demand: 22 / 102 / 283
> >    Parallel: 87 / 438 / 1323
> >    Serial: 0 / 0 / 0
> >    Cached: 22 / 97 / 260
> > Passive Host Checks Last 1/5/15 min: 0 / 0 / 0
> > Active Service Checks Last 1/5/15 min: 304 / 1605 / 4924
> >    Scheduled: 304 / 1605 / 4923
> >    On-demand: 0 / 0 / 1
> >    Cached: 0 / 0 / 0
> > Passive Service Checks Last 1/5/15 min: 0 / 0 / 0
> >
> > External Commands Last 1/5/15 min: 0 / 0 / 0
> >
> > nagios -s
> >
> > Nagios Core 3.2.3
> > Copyright (c) 2009-2010 Nagios Core Development Team and Community Contributors
> > Copyright (c) 1999-2009 Ethan Galstad
> > Last Modified: 10-03-2010
> > License: GPL
> >
> > Website: http://www.nagios.org
> > Warning: aggregate_status_updates directive ignored. All status file updates are now aggregated.
> > Warning: downtime_file variable ignored. Downtime entries are now stored in the status and retention files.
> > Warning: comment_file variable ignored. Comments are now stored in the status and retention files.
> > Timing information on object configuration processing is listed
> > below. You can use this information to see if precaching your
> > object configuration would be useful.
> >
> > Object Config Source: Config files (uncached)
> >
> > OBJECT CONFIG PROCESSING TIMES (* = Potential for precache savings with -u option)
> > ----------------------------------
> > Read: 0.080036 sec
> > Resolve: 0.010660 sec *
> > Recomb Contactgroups: 0.002666 sec *
> > Recomb Hostgroups: 0.004086 sec *
> > Dup Services: 0.034632 sec *
> > Recomb Servicegroups: 0.001277 sec *
> > Duplicate: 0.010939 sec *
> > Inherit: 0.005594 sec *
> > Recomb Contacts: 0.000001 sec *
> > Sort: 0.000000 sec *
> > Register: 0.074413 sec
> > Free: 0.008730 sec
> > ============
> > TOTAL: 0.234920 sec * = 0.071741 sec (30.54%) estimated savings
> >
> >
> > RETENTION DATA TIMES
> > ----------------------------------
> > Read and Process: 0.495480 sec
> > ============
> > TOTAL: 0.495480 sec
> >
> >
> > Timing information on configuration verification is listed below.
> >
> > CONFIG VERIFICATION TIMES (* = Potential for speedup with -x option)
> > ----------------------------------
> > Object Relationships: 0.060039 sec
> > Circular Paths: 0.026557 sec *
> > Misc: 0.005999 sec
> > ============
> > TOTAL: 0.092595 sec * = 0.026557 sec (28.7%) estimated savings
> >
> >
> > EVENT SCHEDULING TIMES
> > -------------------------------------
> > Get service info: 0.014509 sec
> > Get host info info: 0.002853 sec
> > Get service params: 0.000078 sec
> > Schedule service times: 0.039947 sec
> > Schedule service events: 0.034656 sec
> > Get host params: 0.000001 sec
> > Schedule host times: 0.007519 sec
> > Schedule host events: 0.029519 sec
> > ============
> > TOTAL: 0.129082 sec
> >
> >
> > Projected scheduling information for host and service checks
> > is listed below. This information assumes that you are going
> > to start running Nagios with your current config files.
> >
> > HOST SCHEDULING INFORMATION
> > ---------------------------
> > Total hosts: 931
> > Total scheduled hosts: 931
> > Host inter-check delay method: SMART
> > Average host check interval: 259.01 sec
> > Host inter-check delay: 0.28 sec
> > Max host check spread: 30 min
> > First scheduled check: Tue Oct 11 13:14:08 2011
> > Last scheduled check: Tue Oct 11 13:18:26 2011
> >
> >
> > SERVICE SCHEDULING INFORMATION
> > -------------------------------
> > Total services: 4032
> > Total scheduled services: 4030
> > Service inter-check delay method: SMART
> > Average service check interval: 299.55 sec
> > Inter-check delay: 0.07 sec
> > Interleave factor method: SMART
> > Average services per host: 4.33
> > Service interleave factor: 5
> > Max service check spread: 30 min
> > First scheduled check: Tue Oct 11 13:15:07 2011
> > Last scheduled check: Tue Oct 11 13:20:07 2011
> >
> >
> > CHECK PROCESSING INFORMATION
> > ----------------------------
> > Check result reaper interval: 5 sec
> > Max concurrent service checks: Unlimited
> >
> >
> > PERFORMANCE SUGGESTIONS
> > -----------------------
> > I have no suggestions - things look okay.
> > --
> > Javier Vela Diago
> > S2 GRUPO
> > Ramiro de Maeztu, 7 bajo. 46022 Valencia
> > Tel: 963.110.300 Fax: 963.106.086
> > e-mail: jvela at s2grupo dot es
> > http://www.s2grupo.es
> >
> > ------------------------------------------------------------------------------
> > All the data continuously generated in your IT infrastructure contains a
> > definitive record of customers, application performance, security
> > threats, fraudulent activity and more. Splunk takes this data and makes
> > sense of it. Business sense. IT sense. Common sense.
> http://p.sf.net/sfu/splunk-d2d-oct
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when
> reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null

--
Mike Guthrie
Technical Team
___
Nagios Enterprises, LLC
Email: mguthrie at nagios.com
Web: www.nagios.com

------------------------------------------------------------------------------
All the data continuously generated in your IT infrastructure contains a
definitive record of customers, application performance, security
threats, fraudulent activity and more. Splunk takes this data and makes
sense of it. Business sense. IT sense. Common sense.
http://p.sf.net/sfu/splunk-d2d-oct
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null

From xml.devel at gmail.com Wed Oct 12 15:04:27 2011
From: xml.devel at gmail.com (Kumar, Ashish)
Date: Wed, 12 Oct 2011 18:34:27 +0530
Subject: Monitoring clustered services
Message-ID:

Hello fellow Nagios users,

I have configured a couple of hosts in Nagios. Since they are the nodes of an HA cluster, the services are running on the active host only. As expected, Nagios is showing the services down on the passive host. I tried using check_cluster and check_cluster2, but due to the lack of information on the web and in the mailing list archives I couldn't figure out how they can be configured.

Would anyone actually using check_cluster like to lend me a hand? :)

We are using Nagios Core 3.2.0 on CentOS.
Thanks,
Ashish Kumar

From Schimpke.Thomas at bhn-services.com Wed Oct 12 15:41:25 2011
From: Schimpke.Thomas at bhn-services.com (Schimpke, Dr. Thomas - bhn)
Date: Wed, 12 Oct 2011 15:41:25 +0200
Subject: Monitoring clustered services
In-Reply-To:
References:
Message-ID: <4E959905.6010400@bhn-services.com>

Hello Ashish,

Is there a "virtual IP address" connected with the service (like the package IPs in ServiceGuard)?

If there is, simply monitor the service using the IP associated with it, and separately monitor the individual hosts the service might be running on. You may want to monitor the hosts, and notify on them, using the check_cluster plugin.

check_cluster is useful for active/active clusters. In active/passive clusters you'll get a lot of red services, which is probably not what you want and the reason for your question.

Kind regards,
Thomas

On 10/12/2011 03:04 PM, Kumar, Ashish wrote:
> Hello fellow Nagios users,
>
> I have configured a couple of hosts in Nagios, since they are the
> nodes of a HA cluster the services are running on the active host
> only.
As obvious Nagios is showing the services down on the passive
> host. I tried using check_cluster and check_cluster2 but due to the
> lack of information around the web and mailing list archives I
> couldn't figure out how it can be configured.
>
> Would anyone actually using check_cluster like to lend me a hand? :)
>
> We are using Nagios Core 3.2.0 on Centos.
>
> Thanks,
> Ashish Kumar

From cyborg9799 at gmail.com Wed Oct 12 15:42:19 2011
From: cyborg9799 at gmail.com (Mark Thomas)
Date: Wed, 12 Oct 2011 09:42:19 -0400
Subject: Monitoring clustered services
In-Reply-To:
References:
Message-ID:

You might want to see whether "negate" works for that plugin, so that it alerts when a service is up. I use it with an SNMP plugin that checks Windows services on two passive nodes that should not have certain application services running. It may not work for your cluster plugin, but it's worth looking into.

On Oct 12, 2011 9:10 AM, "Kumar, Ashish" wrote:
>
> Hello fellow Nagios users,
>
> I have configured a couple of hosts in Nagios, since they are the nodes of a HA cluster the services are running on the active host only. As obvious Nagios is showing the services down on the passive host.
I tried using check_cluster and check_cluster2 but due to the lack of information around the web and mailing list archives I couldn't figure out how it can be configured.
>
> Would anyone actually using check_cluster like to lend me a hand? :)
>
> We are using Nagios Core 3.2.0 on Centos.
>
> Thanks,
> Ashish Kumar

_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null

From prandal at herefordshire.gov.uk Wed Oct 12 15:32:47 2011
From: prandal at herefordshire.gov.uk (Randal, Phil)
Date: Wed, 12 Oct 2011 13:32:47 +0000
Subject: Monitoring clustered services
In-Reply-To:
References:
Message-ID: <7CA580B59C1ABD45B4614ED90D4C7B853B9C1B18@HC-EXMBX02.herefordshire.gov.uk>

One solution to that problem is to use Mathias Kettner's check_mk and use its clustered service support:

http://mathias-kettner.de/checkmk_clusters.html

That's what I do here.

Another alternative is to create a third 'host' in Nagios representing the cluster and monitor the services on that, and not on the 'physical' boxes.

Cheers,

Phil

--
Phil Randal | Infrastructure Engineer
NHS Herefordshire & Herefordshire Council | Deputy Chief Executive's Office | I.C.T. Services Division
Thorn Office Centre, Rotherwas, Hereford, HR2 6JT
Tel: 01432 260160

From: Kumar, Ashish [mailto:xml.devel at gmail.com]
Sent: 12 October 2011 14:04
To: nagios-users ML
Subject: [Nagios-users] Monitoring clustered services

Hello fellow Nagios users,

I have configured a couple of hosts in Nagios, since they are the nodes of a HA cluster the services are running on the active host only. As obvious Nagios is showing the services down on the passive host. I tried using check_cluster and check_cluster2 but due to the lack of information around the web and mailing list archives I couldn't figure out how it can be configured.

Would anyone actually using check_cluster like to lend me a hand? :)

We are using Nagios Core 3.2.0 on Centos.

Thanks,
Ashish Kumar

------------------------------------------------------------------------------
All the data continuously generated in your IT infrastructure contains a
definitive record of customers, application performance, security
threats, fraudulent activity and more.
Splunk takes this data and makes sense of it. Business sense. IT sense. Common sense.
http://p.sf.net/sfu/splunk-d2d-oct

From mad at b-care.net Wed Oct 12 15:52:13 2011
From: mad at b-care.net (Marc-André Doll)
Date: Wed, 12 Oct 2011 15:52:13 +0200
Subject: Monitoring clustered services
In-Reply-To:
References:
Message-ID: <1318427533.1663.4.camel@MADness>

Hi,

One way to use check_cluster is as follows.

First, define a command like:

define command {
    command_name    check_cluster_service
    command_line    $USER1$/check_cluster -s -d $ARG1$ -c $ARG2$
}

Then define your check:

define service {
    ...
    check_command   check_cluster_service!$SERVICESTATEID:node1:ha_service$,$SERVICESTATEID:node2:ha_service$!1
    ...
}

This will get the results of the "ha_service" checks on "node1" and "node2" and return a critical state if more than 1 check is in a non-OK state.

But the best way to check a cluster is to check only the system (disk, CPU, ...) on your nodes and check the cluster as if it were a classic server, through the VIP.

On Wed, 2011-10-12 at 18:34 +0530, Kumar, Ashish wrote:
> Hello fellow Nagios users,
>
> I have configured a couple of hosts in Nagios, since they are the
> nodes of a HA cluster the services are running on the active host
> only. As obvious Nagios is showing the services down on the passive
> host. I tried using check_cluster and check_cluster2 but due to the
> lack of information around the web and mailing list archives I
> couldn't figure out how it can be configured.
>
> Would anyone actually using check_cluster like to lend me a hand? :)
>
> We are using Nagios Core 3.2.0 on Centos.
>
> Thanks,
> Ashish Kumar

From michael.friedrich at univie.ac.at Wed Oct 12 17:03:22 2011
From: michael.friedrich at univie.ac.at (Michael Friedrich)
Date: Wed, 12 Oct 2011 17:03:22 +0200
Subject: Monitoring clustered services
In-Reply-To:
References:
Message-ID: <4E95AC3A.90703@univie.ac.at>

On 2011-10-12 15:04, Kumar, Ashish wrote:
> Hello fellow Nagios users,
>
> I have configured a couple of hosts in Nagios, since they are the
> nodes of a HA cluster the services are running on the active host
> only.
As obvious Nagios is showing the services down on the passive
> host. I tried using check_cluster and check_cluster2 but due to the
> lack of information around the web and mailing list archives I
> couldn't figure out how it can be configured.
>
> Would anyone actually using check_cluster like to lend me a hand? :)

I'd suggest using check_multi instead for checking real clustered services.

>
> We are using Nagios Core 3.2.0 on Centos.
>
> Thanks,
> Ashish Kumar

--
DI (FH) Michael Friedrich

Vienna University Computer Center
Universitaetsstrasse 7 A-1010 Vienna, Austria

email: michael.friedrich at univie.ac.at
phone: +43 1 4277 14359
mobile: +43 664 60277 14359
fax: +43 1 4277 14338
web: http://www.univie.ac.at/zid
http://www.aco.net

Icinga Core & IDOUtils Developer
http://www.icinga.org

------------------------------------------------------------------------------
All the data continuously generated in your IT infrastructure contains a
definitive record of customers, application performance, security
threats, fraudulent activity and more. Splunk takes this data and makes
sense of it. Business sense. IT sense.
Common sense.
http://p.sf.net/sfu/splunk-d2d-oct

From aravind at linuz.in Wed Oct 12 20:08:00 2011
From: aravind at linuz.in (Aravind M D)
Date: Wed, 12 Oct 2011 23:38:00 +0530
Subject: Monitoring clustered services
In-Reply-To: <4E95AC3A.90703@univie.ac.at>
References: <4E95AC3A.90703@univie.ac.at>
Message-ID: <20111012233800.Horde.XIsNa1wlEB5OldeA3rG2JvA@mail.linuz.in>

Quoting Michael Friedrich :

Hi Ashish,

Try configuring the cluster resource under a new virtual host in Nagios.

define host {
    host_name    cluster
    address      0.0.0.0
    parents      clusterserver1,clusterserver2
}

Define the clustered service under the virtual host.

define service {
    host_name        cluster
    check_command    servicename
}

Services will be checked under the servers configured as parents; if the service is available on any one of the nodes, it will be OK.

Rgds,
Aravind M D

> On 2011-10-12 15:04, Kumar, Ashish wrote:
>> Hello fellow Nagios users,
>>
>> I have configured a couple of hosts in Nagios, since they are
>> the nodes of a HA cluster the services are running on the active
>> host only. As obvious Nagios is showing the services down on the
>> passive host. I tried using check_cluster and check_cluster2 but
>> due to the lack of information around the web and mailing list
>> archives I couldn't figure out how it can be configured.
>>
>> Would anyone actually using check_cluster like to lend me a hand? :)
>
> i'd suggest using check_multi instead for checking real clustered services.
>
>>
>> We are using Nagios Core 3.2.0 on Centos.
>>
>> Thanks,
>> Ashish Kumar
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null

From morty+nagios at frakir.org Thu Oct 13 04:46:19 2011
From: morty+nagios at frakir.org (Morty)
Date: Wed, 12 Oct 2011 22:46:19 -0400
Subject: check_http and other response codes
Message-ID: <20111013024619.GA27623@red-sonja>

On some of our Apache servers, the normal response code is 401 (authentication required) rather than 200. I'd also like to use Nagios to make sure the Apache TRACE method stays disabled, with a response code of 405.

Problem: check_http returns a warning if the response code is anything but 200. In the first case, both 200 and 401 are acceptable; in the latter case, I want 405 rather than 200. Is there a way to require or allow a different response code?

I'm using check_http 1.4.15, as packaged with Debian 5.x. I googled, but didn't find a clue. :(

Thanks!

- Morty

From frnkblk at iname.com Thu Oct 13 06:16:17 2011
From: frnkblk at iname.com (Frank Bulk)
Date: Wed, 12 Oct 2011 23:16:17 -0500
Subject: check_http and other response codes
In-Reply-To: <20111013024619.GA27623@red-sonja>
References: <20111013024619.GA27623@red-sonja>
Message-ID: <00d701cc895e$da8f7e70$8fae7b50$@iname.com>

Isn't there some regex matching?
Frank

-----Original Message-----
From: Morty [mailto:morty+nagios at frakir.org]
Sent: Wednesday, October 12, 2011 9:46 PM
To: nagios-users at lists.sourceforge.net
Subject: [Nagios-users] check_http and other response codes

On some of our apache servers, the normal response code is 401 (authentication required) rather than 200. I'd also like to use nagios to make sure the apache TRACE method stays disabled, with a response code of 405.

Problem: check_http returns a warning if the response code is anything but 200. In the first case, both 200 and 401 are acceptable; in the latter case, I want 405 rather than 200. Is there a way to require or allow a different response code?

I'm using check_http 1.4.15, as packaged with Debian 5.x. I googled, but didn't find clue. :(

Thanks!

- Morty

------------------------------------------------------------------------------
All the data continuously generated in your IT infrastructure contains a
definitive record of customers, application performance, security
threats, fraudulent activity and more. Splunk takes this data and makes
sense of it. Business sense. IT sense. Common sense.

From yu.watanabe at jp.fujitsu.com Thu Oct 13 07:09:33 2011
From: yu.watanabe at jp.fujitsu.com (Yu Watanabe)
Date: Thu, 13 Oct 2011 14:09:33 +0900
Subject: About new release for Nagios
Message-ID: <201110130509.AA04397@S2007337.jp.fujitsu.com>

Hello all.

I was curious whether there will be any new releases coming out for Nagios. I remember there was a memory leak in 3.3.1.

Are there any plans for any new releases?

Thanks,
Yu

From morty+nagios at frakir.org Thu Oct 13 09:18:08 2011
From: morty+nagios at frakir.org (Morty)
Date: Thu, 13 Oct 2011 03:18:08 -0400
Subject: check_http and other response codes
In-Reply-To: <00d701cc895e$da8f7e70$8fae7b50$@iname.com>
References: <20111013024619.GA27623@red-sonja> <00d701cc895e$da8f7e70$8fae7b50$@iname.com>
Message-ID: <20111013071808.GB27623@red-sonja>

On Wed, Oct 12, 2011 at 11:16:17PM -0500, Frank Bulk wrote:
> Isn't there some regex matching?

There is.
But it didn't help me in either case. check_http apparently does an implicit test to make sure it gets a valid response code such as 200. And the regex checking is in content, not headers or response code. So check_http -H $host -S -r 401 still returns a warning with a server that requires auth, and check_http -H $host -S -j TRACE -r 405 still returns a warning on a server with TRACE disabled.

While reading a different thread on this mailing list, I found Mark Thomas's mention of "negate". That actually did work around my HTTP TRACE problem -- TRACE will cause check_http to return a warning when it's disabled and ok when it's enabled, so the following command definition will test for HTTP TRACE:

define command{
        command_name    check_http_trace
        command_line    $USER1$/negate -sw OK -o CRITICAL -c OK -- $USER1$/check_http -j TRACE -f sticky -H $HOSTADDRESS$ -p $ARG1$ $ARG2$
}

But IMHO, that's something of a hack. And it doesn't deal with the 401 issue.

- Morty
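
To make the state remapping in the check_http_trace command definition above explicit, here is a minimal Python sketch of what the options `-o CRITICAL -w OK -c OK` appear to ask negate to do with the wrapped plugin's exit code. The function name `remap_state` and the `TRACE_REMAP` table are illustrative assumptions, not negate's actual source; only the standard plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) are taken as given.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Mapping requested by "-o CRITICAL -w OK -c OK" in check_http_trace.
# Hypothetical table for illustration; UNKNOWN is assumed to pass through
# unchanged because the command line does not remap it.
TRACE_REMAP = {OK: CRITICAL, WARNING: OK, CRITICAL: OK}


def remap_state(exit_code: int, remap: dict = TRACE_REMAP) -> int:
    """Return the state Nagios would see after the remapping is applied."""
    return remap.get(exit_code, exit_code)


if __name__ == "__main__":
    names = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}
    for code in (OK, WARNING, CRITICAL, UNKNOWN):
        print(f"check_http {names[code]:8} -> reported {names[remap_state(code)]}")
```

Read this way, a server that answers TRACE normally (check_http OK) is reported CRITICAL, while the warning seen when TRACE is disabled is reported OK. The same table also turns a CRITICAL result into OK, so a separate plain check_http service is probably still wanted to catch a server that is down.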

From pitchfork at ederdrom.de Thu Oct 13 09:39:43 2011
From: pitchfork at ederdrom.de (Jörg Linge)
Date: Thu, 13 Oct 2011 09:39:43 +0200
Subject: check_http and other response codes
In-Reply-To: <20111013071808.GB27623@red-sonja>
References: <20111013024619.GA27623@red-sonja> <00d701cc895e$da8f7e70$8fae7b50$@iname.com> <20111013071808.GB27623@red-sonja>
Message-ID: <23E0C87F-FD86-4081-807C-D046362242B4@ederdrom.de>

On 13.10.2011 at 09:18, Morty wrote:

> On Wed, Oct 12, 2011 at 11:16:17PM -0500, Frank Bulk wrote:
>> Isn't there some regex matching?
>
> There is. But it didn't help me in either case. check_http
> apparently does an implicit test to make sure it gets a valid response
> code such as 200. And the regex checking is in content, not headers
> or response code. So check_http -H $host -S -r 401 still returns a
> warning with a server that requires auth, and check_http -H $host -S
> -j TRACE -r 405 still returns a warning on a server with TRACE
> disabled.
>
> While reading a different thread on this mailing list, I found Mark
> Thomas's mention of "negate". That actually did work around my HTTP
> TRACE problem -- TRACE will cause check_http to return a warning when
> it's disabled and ok when it's enabled, so the following command
> definition will test for HTTP TRACE:
>
> define command{
>         command_name    check_http_trace
>         command_line    $USER1$/negate -sw OK -o CRITICAL -c OK -- $USER1$/check_http -j TRACE -f sticky -H $HOSTADDRESS$ -p $ARG1$ $ARG2$
> }
>
> But IMHO, that's something of a hack. And it doesn't deal with the 401 issue.

http://nagiosplugins.org/man/check_http

Option -e

 -e, --expect=STRING
    Comma-delimited list of strings, at least one of them is expected in
    the first (status) line of the server response (default: HTTP/1.)
    If specified skips all other status line logic (ex: 3xx, 4xx, 5xx processing)

From ae at op5.se Thu Oct 13 12:39:46 2011
From: ae at op5.se (Andreas Ericsson)
Date: Thu, 13 Oct 2011 12:39:46 +0200
Subject: About new release for Nagios
In-Reply-To: <201110130509.AA04397@S2007337.jp.fujitsu.com>
References: <201110130509.AA04397@S2007337.jp.fujitsu.com>
Message-ID: <4E96BFF2.3040009@op5.se>

On 10/13/2011 07:09 AM, Yu Watanabe wrote:
> Hello all.
>
> I was curious whether there will be any new releases coming out for Nagios.
> I remember there was a memory leak in 3.3.1.
>
> Are there any plans for any new releases?
>

No, there will never be a new release of Nagios ever again. We've all decided to take up knitting and unicorn-breeding instead.

On a more serious note: of course there will be a new release of Nagios. The memory leaks are not very serious, and the fixes for them have not yet been merged into the Nagios core, so making a new release right now would be stupid. I still need more time to fully investigate the pros and cons of the patch sent in to handle the memory leak in the notification, for instance.
I believe the proposed fix either doesn't fix the leak completely or fixes it in a bad way that would cause other problems, so I need to run it through valgrind a couple of times, first of all to see the leak for myself and secondly to make sure nothing bad happens when the patch is applied and there are multiple notifications going out, of which some are sent to escalated contacts.

In the meantime, you can restart your Nagios daemon once a year to avoid any real-world problems from any potential leaks (although running the latest svn trunk would fix most of them too, so you could probably get away with restarting only every leap year or something).

--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.

From Deborah.Martin at kognitio.com Thu Oct 13 17:23:23 2011
From: Deborah.Martin at kognitio.com (Deborah Martin)
Date: Thu, 13 Oct 2011 08:23:23 -0700
Subject: Check_tcp - port monitoring best practice
In-Reply-To: <201110130509.AA04397@S2007337.jp.fujitsu.com>
References: <201110130509.AA04397@S2007337.jp.fujitsu.com>
Message-ID:

Hi,

I just wondered what would be the best practice for monitoring ports. At the moment we specify 10, 20 and 25 seconds for OK, warning and critical alerts. Is that too little? We are seeing a few critical alerts generated and wondered if increasing the thresholds would be better.

What does a "socket timeout" generally mean? Some of the ports we're monitoring are on systems which are idle at the moment. Could these be caused by network glitches between the Nagios box and the target ports? The systems we're monitoring aren't local on the Nagios network but in different parts of the world.

I wonder if a local Nagios instance would be better to try and reduce "socket timeouts", as otherwise people on call are going to get called out quite a lot.... (?)

Any input / ideas would be really appreciated.

Regards,
Deborah

From yu.watanabe at jp.fujitsu.com Fri Oct 14 02:01:26 2011
From: yu.watanabe at jp.fujitsu.com (Yu Watanabe)
Date: Fri, 14 Oct 2011 09:01:26 +0900
Subject: About new release for Nagios
In-Reply-To: <4E96BFF2.3040009@op5.se>
References: <4E96BFF2.3040009@op5.se>
Message-ID: <201110140001.AA04402@S2007337.jp.fujitsu.com>

Hello Andreas.

Thank you for the reply.

I understand the situation. So, is v3.2.3 the more stable version for now?

Thanks,
Yu

Andreas Ericsson wrote:
>On 10/13/2011 07:09 AM, Yu Watanabe wrote:
>> Hello all.
>>
>> I was curious whether there will be any new releases coming out for Nagios.
>> I remember there was a memory leak in 3.3.1.
>>
>> Are there any plans for any new releases?
>>
>
>No, there will never be a new release of Nagios ever again. We've all
>decided to take up knitting and unicorn-breeding instead.
>
>On a more serious note: of course there will be a new release of Nagios.
>The memory leaks are not very serious, and the fixes for them have not yet
>been merged into the Nagios core, so making a new release right now would
>be stupid.
>I still need more time to fully investigate the pros and cons of the patch
>sent in to handle the memory leak in the notification, for instance. I
>believe the proposed fix either doesn't fix the leak completely or fixes
>it in a bad way that would cause other problems, so I need to run it
>through valgrind a couple of times, first of all to see the leak for
>myself and secondly to make sure nothing bad happens when the patch is
>applied and there are multiple notifications going out, of which some
>are sent to escalated contacts.
>
>In the meantime, you can restart your Nagios daemon once a year to avoid
>any real-world problems from any potential leaks (although running the latest
>svn trunk would fix most of them too, so you could probably get away with
>restarting only every leap year or something).
>
>--
>Andreas Ericsson andreas.ericsson at op5.se
>OP5 AB www.op5.se
>Tel: +46 8-230225 Fax: +46 8-230231
>
>Considering the successes of the wars on alcohol, poverty, drugs and
>terror, I think we should give some serious thought to declaring war
>on peace.

From mailinglist at theflux.net Fri Oct 14 02:19:16 2011
From: mailinglist at theflux.net (Al)
Date: Thu, 13 Oct 2011 20:19:16 -0400
Subject: Suggestion on SNMP disk space checker
Message-ID: <138F24FA-568E-4DC3-9637-2585C0DFD563@theflux.net>

I'm open for any suggestions/urlz/code for disk space checking via SNMP. I'm trying not to do it with NRPE. Thanks in advance!

From nagiosusers at edcint.co.nz Fri Oct 14 03:08:09 2011
From: nagiosusers at edcint.co.nz (Matthew Jurgens)
Date: Fri, 14 Oct 2011 12:08:09 +1100
Subject: Suggestion on SNMP disk space checker
In-Reply-To: <138F24FA-568E-4DC3-9637-2585C0DFD563@theflux.net>
References: <138F24FA-568E-4DC3-9637-2585C0DFD563@theflux.net>
Message-ID: <4E978B79.9050800@edcint.co.nz>

If you are talking about disk space on Windows then take a look at

www.edcint.co.nz/checkwmiplus

This will check Windows disk space (and many other things) without having to install a client/proxy and without having to configure anything related to SNMP, as it uses WMI.

On 14/10/2011 11:19 AM, Al wrote:
> I'm open for any suggestions/urlz/code for disk space checking via SNMP. I'm trying not to do it with NRPE. Thanks in advance!
>
> --
> Smartmon System Monitoring
> www.smartmon.com.au

From mailinglist at theflux.net Fri Oct 14 03:43:04 2011
From: mailinglist at theflux.net (Al)
Date: Thu, 13 Oct 2011 21:43:04 -0400
Subject: Suggestion on SNMP disk space checker
In-Reply-To: <4E978B79.9050800@edcint.co.nz>
References: <138F24FA-568E-4DC3-9637-2585C0DFD563@theflux.net> <4E978B79.9050800@edcint.co.nz>
Message-ID:

It's to monitor Linux Fedora/CentOS/Debian systems... mostly. Thanks for the suggestion!

On Oct 13, 2011, at 9:08 PM, Matthew Jurgens wrote:

> If you are talking about disk space on Windows then take a look at
>
> www.edcint.co.nz/checkwmiplus
>
> This will check Windows disk space (and many other things) without having to install a client/proxy and without having to configure anything related to SNMP, as it uses WMI.
>
> On 14/10/2011 11:19 AM, Al wrote:
>>
>> I'm open for any suggestions/urlz/code for disk space checking via SNMP. I'm trying not to do it with NRPE. Thanks in advance!
>>
>> --
>> Smartmon System Monitoring
>> www.smartmon.com.au

From pitchfork at ederdrom.de Fri Oct 14 06:26:06 2011
From: pitchfork at ederdrom.de (Joerg Linge)
Date: Fri, 14 Oct 2011 06:26:06 +0200
Subject: Suggestion on SNMP disk space checker
In-Reply-To: <138F24FA-568E-4DC3-9637-2585C0DFD563@theflux.net>
References: <138F24FA-568E-4DC3-9637-2585C0DFD563@theflux.net>
Message-ID:

On 14.10.2011 at 02:19, Al wrote:

> I'm open for any suggestions/urlz/code for disk space checking via SNMP. I'm trying not to do it with NRPE. Thanks in advance!

http://www.google.com/search?q=nagios+check+snmp+disk
http://nagios.manubulon.com/snmp_storage.html

From xml.devel at gmail.com Fri Oct 14 10:07:58 2011
From: xml.devel at gmail.com (Kumar, Ashish)
Date: Fri, 14 Oct 2011 13:37:58 +0530
Subject: Monitoring clustered services
In-Reply-To: <20111012233800.Horde.XIsNa1wlEB5OldeA3rG2JvA@mail.linuz.in>
References: <4E95AC3A.90703@univie.ac.at> <20111012233800.Horde.XIsNa1wlEB5OldeA3rG2JvA@mail.linuz.in>
Message-ID:

Thank you all for taking the time to read my e-mail, and for your patience and valuable suggestions. I will try to evaluate them all and then post my findings.

Thank you ever so much once again.

Ashish Kumar

From wfournier at comscore.com Fri Oct 14 10:09:15 2011
From: wfournier at comscore.com (Fournier, Wim)
Date: Fri, 14 Oct 2011 04:09:15 -0400
Subject: Service/host escalation doesn't work for service/host groups
Message-ID:

Hi all,

I have a config of 250 hosts and 1800 services. I have set up a hostgroup and a servicegroup that contain all of them (using the hostgroups/servicegroups statement in each host/service definition). For these 2 groups I have configured an escalation for 2+ alerts, but it doesn't seem to work... I get no notifications after my first. The reason for doing this is that our management wants us to send messages out to the 'other on call' person if the main one doesn't respond in 30 minutes.

Some config:

define hostgroup{
        hostgroup_name  all
        alias           All hosts
}

define hostescalation {
        hostgroup_name          all
        contact_groups          SNM-oncall,TAA-oncall
        first_notification      2
        last_notification       0
        notification_interval   30
}

Does anyone have a clue why this doesn't work?

--
Wim Fournier

From ae at op5.se Fri Oct 14 16:09:27 2011
From: ae at op5.se (Andreas Ericsson)
Date: Fri, 14 Oct 2011 16:09:27 +0200
Subject: About new release for Nagios
In-Reply-To: <201110140001.AA04402@S2007337.jp.fujitsu.com>
References: <4E96BFF2.3040009@op5.se> <201110140001.AA04402@S2007337.jp.fujitsu.com>
Message-ID: <4E984297.1020507@op5.se>

On 10/14/2011 02:01 AM, Yu Watanabe wrote:
> Hello Andreas.
>
> Thank you for the reply.
>
> I understand the situation. So, is v3.2.3 the more stable version
> for now?
>

No. 3.2.3 has the same leaks but more other bugs. I'm still not entirely convinced that one of the reported leaks is actually a leak, though, as I can't see it in valgrind myself. The other leaks are primarily one-timers, and the downtime and comment removal patches only matter if you're using the new custom commands from altinity (or is it opsera?) that delete downtime and comments on remote hosts when using nsca as a distribution mechanism, and no one in their right mind should be doing that nowadays anyway.

--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.

From morty+nagios at frakir.org Fri Oct 14 18:19:07 2011
From: morty+nagios at frakir.org (Morty)
Date: Fri, 14 Oct 2011 12:19:07 -0400
Subject: check_http and other response codes
In-Reply-To: <23E0C87F-FD86-4081-807C-D046362242B4@ederdrom.de>
References: <20111013024619.GA27623@red-sonja> <00d701cc895e$da8f7e70$8fae7b50$@iname.com> <20111013071808.GB27623@red-sonja> <23E0C87F-FD86-4081-807C-D046362242B4@ederdrom.de>
Message-ID: <20111014161907.GA3757@red-sonja>

On Thu, Oct 13, 2011 at 09:39:43AM +0200, Jörg Linge wrote:
> http://nagiosplugins.org/man/check_http
>
> Option -e
>
> -e, --expect=STRING
> Comma-delimited list of strings, at least one of them is expected in
> the first (status) line of the server response (default: HTTP/1.)
> If specified skips all other status line logic (ex: 3xx, 4xx, 5xx processing)

Oh, thanks! I saw that option, but misunderstood it. Now I've got it working.

For TRACE, I think the "negate" implementation is actually better. CUPS IPP returns no response code at all to a TRACE request, which is good from my perspective.

- Morty

From bparish at cognex.com Fri Oct 14 21:03:14 2011
From: bparish at cognex.com (Parish, Brent)
Date: Fri, 14 Oct 2011 15:03:14 -0400
Subject: Suggestion on SNMP disk space checker
In-Reply-To: <138F24FA-568E-4DC3-9637-2585C0DFD563@theflux.net>
References: <138F24FA-568E-4DC3-9637-2585C0DFD563@theflux.net>
Message-ID: <6265B2EB12D194469B958F2E703D81830D6A0074CC@viper.pc.cognex.com>

I guess this is the part where I should put in a plug for my own? :)

http://exchange.nagios.org/directory/Plugins/System-Metrics/File-System/check_disk_snmp/details

-----Original Message-----
From: Al [mailto:mailinglist at theflux.net]
Sent: Thursday, October 13, 2011 8:19 PM
To: Nagios User List
Subject: [Nagios-users] Suggestion on SNMP disk space checker

I'm open for any suggestions/urlz/code for disk space checking via SNMP. I'm trying not to do it with NRPE. Thanks in advance!

From mailinglist at theflux.net Fri Oct 14 21:49:04 2011
From: mailinglist at theflux.net (Al)
Date: Fri, 14 Oct 2011 15:49:04 -0400
Subject: Suggestion on SNMP disk space checker
In-Reply-To: <6265B2EB12D194469B958F2E703D81830D6A0074CC@viper.pc.cognex.com>
References: <138F24FA-568E-4DC3-9637-2585C0DFD563@theflux.net> <6265B2EB12D194469B958F2E703D81830D6A0074CC@viper.pc.cognex.com>
Message-ID: <03F6DD73-037B-4B76-A35A-1CA4464148AC@theflux.net>

Thanks for all the suggestions!

On Oct 14, 2011, at 3:03 PM, Parish, Brent wrote:

> I guess this is the part where I should put in a plug for my own?
> :)
>
> http://exchange.nagios.org/directory/Plugins/System-Metrics/File-System/check_disk_snmp/details
>
> -----Original Message-----
> From: Al [mailto:mailinglist at theflux.net]
> Sent: Thursday, October 13, 2011 8:19 PM
> To: Nagios User List
> Subject: [Nagios-users] Suggestion on SNMP disk space checker
>
> I'm open for any suggestions/urlz/code for disk space checking via SNMP. I'm trying not to do it with NRPE. Thanks in advance!
From maillists0 at gmail.com Sun Oct 16 22:23:48 2011
From: maillists0 at gmail.com (maillists0 at gmail.com)
Date: Sun, 16 Oct 2011 16:23:48 -0400
Subject: Problems with scheduling
Message-ID:

Just built 3.3.1 on CentOS 5.6. Configured a single host for testing with nrpe, everything looked good. Stopped nrpe on the client and got the expected emails. The problem is that when I enabled nrpe on the client, the server continued to send me alerts and the web ui says that the service was recently checked, but there is nothing in the log. The checks aren't in fact happening but the server seems to think they have and keeps reporting the service as down.

I replicated this behavior with the default checks against localhost on the server. I've restarted nagios several times and confirmed that there aren't any processes hanging around. I'm using flat files for configuration, no database.

I immediately thought it was my configuration, but I don't immediately see anything wrong and also don't understand what I could have done to cause Nagios to think that it has run checks that didn't actually occur. What might I have done wrong? I've used Nagios in the past and don't remember ever having this problem.
Any help at all is appreciated.

Here's the running config, from status.dat:

servicestatus {
    host_name=mytestmachine
    service_description=Users
    modified_attributes=0
    check_command=check_nrpe!!check_users
    check_period=24x7
    notification_period=24x7
    check_interval=1.000000
    retry_interval=1.000000
    event_handler=
    has_been_checked=1
    should_be_scheduled=1
    check_execution_time=0.026
    check_latency=0.213
    check_type=0
    current_state=2
    last_hard_state=2
    last_event_id=0
    current_event_id=28
    current_problem_id=9
    last_problem_id=0
    current_attempt=10
    max_attempts=10
    state_type=1
    last_state_change=1318789732
    last_hard_state_change=1318790272
    last_time_ok=1318789672
    last_time_warning=0
    last_time_unknown=0
    last_time_critical=1318795508
    plugin_output=Connection refused by host
    long_plugin_output=
    performance_data=
    last_check=1318795508
    next_check=1318795568
    check_options=0
    current_notification_number=27
    current_notification_id=29
    last_notification=1318795212
    next_notification=1318795392
    no_more_notifications=0
    notifications_enabled=1
    active_checks_enabled=1
    passive_checks_enabled=1
    event_handler_enabled=1
    problem_has_been_acknowledged=1
    acknowledgement_type=2
    flap_detection_enabled=1
    failure_prediction_enabled=1
    process_performance_data=1
    obsess_over_service=1
    last_update=1318795534
    is_flapping=0
    percent_state_change=0.00
    scheduled_downtime_depth=0
    }
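A minimal sketch, not part of the original post, of how the timestamps in a servicestatus block like the one quoted above could be compared. It assumes the usual status.dat layout of one key=value pair per line between "servicestatus {" and "}". The function names parse_status_blocks and check_timing are hypothetical, and the sample values are copied from the block in the message.

```python
def parse_status_blocks(text):
    """Parse status.dat-style text into a list of (block_type, fields) tuples."""
    blocks = []
    current_type, fields = None, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith("{"):
            # Start of a block such as "servicestatus {"
            current_type = line[:-1].strip()
            fields = {}
        elif line == "}":
            if current_type is not None:
                blocks.append((current_type, fields))
            current_type, fields = None, None
        elif fields is not None and "=" in line:
            # Split on the first "=" only; values may be empty or contain "!"
            key, _, value = line.partition("=")
            fields[key] = value
    return blocks


def check_timing(fields):
    """Compare check timestamps against the time the status entry was written."""
    last_check = int(fields["last_check"])
    next_check = int(fields["next_check"])
    last_update = int(fields["last_update"])
    return {
        "seconds_since_last_check": last_update - last_check,
        "seconds_until_next_check": next_check - last_update,
        "scheduled_gap": next_check - last_check,
    }


# Values copied from the servicestatus block in the message.
SAMPLE = """\
servicestatus {
    host_name=mytestmachine
    service_description=Users
    check_command=check_nrpe!!check_users
    check_interval=1.000000
    plugin_output=Connection refused by host
    last_check=1318795508
    next_check=1318795568
    last_update=1318795534
    }
"""

if __name__ == "__main__":
    for block_type, fields in parse_status_blocks(SAMPLE):
        print(block_type, fields["service_description"], check_timing(fields))
```

For the posted values this gives a last check 26 seconds before the status entry was written, with the next one due 34 seconds later (a 60 second gap, matching check_interval=1). That would be consistent with checks still being scheduled, though on its own it does not explain the missing log entries.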
From f.hugh at comcast.net Sun Oct 16 23:01:17 2011
From: f.hugh at comcast.net (PRP)
Date: Sun, 16 Oct 2011 16:01:17 -0500
Subject: Certificate problems with check_ldap
In-Reply-To: <1317624296.1588.4.camel@MADness>
References: <481090375.233622.1317407948324.JavaMail.root@sz0051a.emeryville.ca.mail.comcast.net> <1317624296.1588.4.camel@MADness>
Message-ID: <000301cc8c46$c010fd40$4032f7c0$@comcast.net>

Thanks for the info. You led me in the right direction. Once I figured out the format of the default ca-bundle.crt, I added my corporation's intermediate and root certs. I then added that file and path to the openldap config file as you mention, and I was in business.

-prp

-----Original Message-----
From: Marc-André Doll [mailto:mad at b-care.net]
Sent: Monday, October 03, 2011 1:45 AM
To: nagios-users at lists.sourceforge.net
Subject: Re: [Nagios-users] Certificate problems with check_ldap

Hi,

I had this problem once. You have to get your root CA and copy it to your default CA certificates directory on your Nagios server (on RedHat it is /etc/openldap/cacerts) or copy it wherever you want and add the line "TLS_CACERT /path/to/my/root/CA.pem" to your openldap configuration file.

It solved my problem.

Marc-André

On Fri, 2011-09-30 at 18:39 +0000, f.hugh at comcast.net wrote:
> I have been able to get check_ldap to work fine over the clear on port
> 389. When I try to use ssl 636 it fails. It can't verify the cert
> since it is our own CA and not a commercial CA that signed the cert.
>
> This is the error I get:
>
> ldap_bind: Can't contact LDAP server (-1)
> additional info: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
> Could not bind to the LDAP server
>
> I am certain that it is the trust of the cert that is the problem. I
> have googled this for half the day looking for the method to insert
> our Root CA as trusted, but have had no luck.
> Anyone been able to
> accomplish this? Think of it as a self signed cert installed on our
> AD domain controllers.
>
> -paul
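A small sketch of the fix described in this thread, offered as an illustration only: the intermediate and root certificates are joined into one PEM bundle, and the openldap client configuration points at it with the TLS_CACERT directive Marc-André quotes. The helper names build_ca_bundle and tls_cacert_directive are hypothetical, and the certificate bodies in the usage lines are placeholders, not real certificates.

```python
PEM_BEGIN = "-----BEGIN CERTIFICATE-----"
PEM_END = "-----END CERTIFICATE-----"


def build_ca_bundle(pem_certs):
    """Join PEM-encoded certificates into a single bundle string."""
    cleaned = []
    for pem in pem_certs:
        pem = pem.strip()
        # Very light sanity check: each input should be one PEM certificate.
        if not (pem.startswith(PEM_BEGIN) and pem.endswith(PEM_END)):
            raise ValueError("input does not look like a single PEM certificate")
        cleaned.append(pem)
    return "\n".join(cleaned) + "\n"


def tls_cacert_directive(bundle_path):
    """Return the line to add to the openldap client configuration file."""
    return "TLS_CACERT " + bundle_path


if __name__ == "__main__":
    # Placeholder bodies; real certificates would be read from the CA's files.
    intermediate = PEM_BEGIN + "\n(intermediate cert body)\n" + PEM_END
    root = PEM_BEGIN + "\n(root cert body)\n" + PEM_END
    print(build_ca_bundle([intermediate, root]), end="")
    print(tls_cacert_directive("/path/to/my/root/CA.pem"))
```

The path in the last line is the example path from Marc-André's message; which configuration file is read depends on the distribution.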
From yu.watanabe at jp.fujitsu.com Mon Oct 17 02:57:48 2011
From: yu.watanabe at jp.fujitsu.com (Yu Watanabe)
Date: Mon, 17 Oct 2011 09:57:48 +0900
Subject: About new release for Nagios
In-Reply-To: <4E984297.1020507@op5.se>
References: <4E984297.1020507@op5.se>
Message-ID: <201110170057.AA04403@S2007337.jp.fujitsu.com>

Thank you for the reply.

I see.

Then I will think of using v 3.3.1.

Which one is better to use? The Latest snapshot or
the Latest stable release? It would be a great help if you
could give us your opinion.

Thanks,
Yu Watanabe

Andreas Ericsson wrote:
>On 10/14/2011 02:01 AM, Yu Watanabe wrote:
>> Hello Andreas.
>>
>> Thank you for the reply.
>>
>> I understood the situation. So, is v3.2.3 more stable version
>> for now?
>>
>
>No. 3.2.3 has the same leaks but more other bugs. I'm still not entirely
>convinced that one of the reported leaks is actually a leak though as I
>can't see it in valgrind myself. The other leaks are primarily onetimers,
>and the downtime and comment removal patches only matter if you're using
>the new custom commands from altinity (or is it opsera?) that delete
>downtime and comments on remote hosts when using nsca as a distribution
>mechanism, and noone in their right mind should be doing that nowadays
>anyway.
>
>--
>Andreas Ericsson andreas.ericsson at op5.se
>OP5 AB www.op5.se
>Tel: +46 8-230225 Fax: +46 8-230231
>
>Considering the successes of the wars on alcohol, poverty, drugs and
>terror, I think we should give some serious thought to declaring war
>on peace.
From michael.friedrich at univie.ac.at Mon Oct 17 09:31:37 2011
From: michael.friedrich at univie.ac.at (Michael Friedrich)
Date: Mon, 17 Oct 2011 09:31:37 +0200
Subject: About new release for Nagios
In-Reply-To: <201110170057.AA04403@S2007337.jp.fujitsu.com>
References: <4E984297.1020507@op5.se> <201110170057.AA04403@S2007337.jp.fujitsu.com>
Message-ID: <4E9BD9D9.4050804@univie.ac.at>

Yu Watanabe wrote:
> Thank you for the reply.
>
> I see.
>
> Then I will think of using v 3.3.1.

wait for 3.4.x

> Which one is better to use? The Latest snapshot or
> the Latest stable release? It would be a great help if you
> could give us your opinion.

3.2.3 is considered stable, 3.3.x is a developer release tree and
contains various things to be fixed or already fixed in svn. still,
empty perfdata is not re-enabled and breaks various graphing addons.
anyhow, that's up to the nagios core devs to decide what to fix and when to
release.

the developer guidelines on wiki.nagios.org are lost, but iirc it's
mentioned over there which versions indicate which release tree.

> Thanks,
> Yu Watanabe
>
> Andreas Ericsson wrote:
>> On 10/14/2011 02:01 AM, Yu Watanabe wrote:
>>> Hello Andreas.
>>>
>>> Thank you for the reply.
>>>
>>> I understood the situation.
>>> So, is v3.2.3 more stable version
>>> for now?
>>>
>> No. 3.2.3 has the same leaks but more other bugs. I'm still not entirely
>> convinced that one of the reported leaks is actually a leak though as I
>> can't see it in valgrind myself. The other leaks are primarily onetimers,
>> and the downtime and comment removal patches only matter if you're using
>> the new custom commands from altinity (or is it opsera?) that delete
>> downtime and comments on remote hosts when using nsca as a distribution
>> mechanism, and noone in their right mind should be doing that nowadays
>> anyway.
>>
>> --
>> Andreas Ericsson andreas.ericsson at op5.se
>> OP5 AB www.op5.se
>> Tel: +46 8-230225 Fax: +46 8-230231
>>
>> Considering the successes of the wars on alcohol, poverty, drugs and
>> terror, I think we should give some serious thought to declaring war
>> on peace.
--
DI (FH) Michael Friedrich

Vienna University Computer Center
Universitaetsstrasse 7 A-1010 Vienna, Austria

email: michael.friedrich at univie.ac.at
phone: +43 1 4277 14359
mobile: +43 664 60277 14359
fax: +43 1 4277 14338
web: http://www.univie.ac.at/zid
     http://www.aco.net

Icinga Core & IDOUtils Developer
http://www.icinga.org

From yu.watanabe at jp.fujitsu.com Mon Oct 17 09:43:09 2011
From: yu.watanabe at jp.fujitsu.com (Yu Watanabe)
Date: Mon, 17 Oct 2011 16:43:09 +0900
Subject: About new release for Nagios
In-Reply-To: <4E9BD9D9.4050804@univie.ac.at>
References: <4E9BD9D9.4050804@univie.ac.at>
Message-ID: <201110170743.AA04407@S2007337.jp.fujitsu.com>

Michael,

Thank you for the reply.

Your information is very useful.

Thanks,
Yu

Michael Friedrich wrote:
>Yu Watanabe wrote:
>> Thank you for the reply.
>>
>> I see.
>>
>> Then I will think of using v 3.3.1.
>
>wait for 3.4.x
>> Which one is better to use? The Latest snapshot or
>> the Latest stable release? It would be a great help if you
>> could give us your opinion.
>
>3.2.3 is considered stable, 3.3.x is a developer release tree and
>contains various things to be fixed or already fixed in svn.
>still,
>empty perfdata is not re-enabled and breaks various graphing addons.
>anyhow, that's up to the nagios core devs to decide what to fix and when to
>release.
>
>the developer guidelines on wiki.nagios.org are lost, but iirc it's
>mentioned over there which versions indicate which release tree.
>
>> Thanks,
>> Yu Watanabe
>>
>> Andreas Ericsson wrote:
>>> On 10/14/2011 02:01 AM, Yu Watanabe wrote:
>>>> Hello Andreas.
>>>>
>>>> Thank you for the reply.
>>>>
>>>> I understood the situation. So, is v3.2.3 more stable version
>>>> for now?
>>>>
>>> No. 3.2.3 has the same leaks but more other bugs. I'm still not entirely
>>> convinced that one of the reported leaks is actually a leak though as I
>>> can't see it in valgrind myself. The other leaks are primarily onetimers,
>>> and the downtime and comment removal patches only matter if you're using
>>> the new custom commands from altinity (or is it opsera?) that delete
>>> downtime and comments on remote hosts when using nsca as a distribution
>>> mechanism, and noone in their right mind should be doing that nowadays
>>> anyway.
>>>
>>> --
>>> Andreas Ericsson andreas.ericsson at op5.se
>>> OP5 AB www.op5.se
>>> Tel: +46 8-230225 Fax: +46 8-230231
>>>
>>> Considering the successes of the wars on alcohol, poverty, drugs and
>>> terror, I think we should give some serious thought to declaring war
>>> on peace.
>--
>DI (FH) Michael Friedrich
>
>Vienna University Computer Center
>Universitaetsstrasse 7 A-1010 Vienna, Austria
>
>email: michael.friedrich at univie.ac.at
>phone: +43 1 4277 14359
>mobile: +43 664 60277 14359
>fax: +43 1 4277 14338
>web: http://www.univie.ac.at/zid
>     http://www.aco.net
>
>Icinga Core & IDOUtils Developer
>http://www.icinga.org
From wfournier at comscore.com Mon Oct 17 09:53:59 2011
From: wfournier at comscore.com (Fournier, Wim)
Date: Mon, 17 Oct 2011 03:53:59 -0400
Subject: About new release for Nagios
In-Reply-To: <4E9BD9D9.4050804@univie.ac.at>
References: <4E9BD9D9.4050804@univie.ac.at>
Message-ID:

On 10/17/11 9:31 AM, "Michael Friedrich" wrote:

>3.2.3 is considered stable, 3.3.x is a developer release tree and

Are you sure? 3.3.1 is marked as the latest stable on
http://nagios.org/download/core/thanks/?registered=1 and 3.2.3 as the
previous stable.

-- Wim Fournier
From michael.friedrich at univie.ac.at Mon Oct 17 11:10:15 2011
From: michael.friedrich at univie.ac.at (Michael Friedrich)
Date: Mon, 17 Oct 2011 11:10:15 +0200
Subject: About new release for Nagios
In-Reply-To:
References:
Message-ID: <4E9BF0F7.2070007@univie.ac.at>

Fournier, Wim wrote:
> On 10/17/11 9:31 AM, "Michael Friedrich"
> wrote:
>
>> 3.2.3 is considered stable, 3.3.x is a developer release tree and
> Are you sure? 3.3.1 is marked as the latest stable on
> http://nagios.org/download/core/thanks/?registered=1 and 3.2.3 as the
> previous stable.

oh. i wasn't aware of that change, thanks for the pointer. well if they say so. i've encountered and fixed various bugs on my 3.3.1 github tree, so i don't consider it as stable as it should be.
anyhow, as stated before, that's nagios devs' decision not mine ;-)

>
> -- Wim Fournier
--
DI (FH) Michael Friedrich

Vienna University Computer Center
Universitaetsstrasse 7 A-1010 Vienna, Austria

email: michael.friedrich at univie.ac.at
phone: +43 1 4277 14359
mobile: +43 664 60277 14359
fax: +43 1 4277 14338
web: http://www.univie.ac.at/zid
     http://www.aco.net

Icinga Core & IDOUtils Developer
http://www.icinga.org

From ae at op5.se Mon Oct 17 14:00:20 2011
From: ae at op5.se (Andreas Ericsson)
Date: Mon, 17 Oct 2011 14:00:20 +0200
Subject: About new release for Nagios
In-Reply-To: <4E9BF0F7.2070007@univie.ac.at>
References: <4E9BF0F7.2070007@univie.ac.at>
Message-ID: <4E9C18D4.6060107@op5.se>

On 10/17/2011 11:10 AM, Michael Friedrich wrote:
> Fournier, Wim wrote:
>> On 10/17/11 9:31 AM, "Michael Friedrich"
>> wrote:
>>
>>> 3.2.3 is considered stable, 3.3.x is a developer release tree and
>> Are you sure? 3.3.1 is marked as the latest stable on
>> http://nagios.org/download/core/thanks/?registered=1 and 3.2.3 as the
>> previous stable.
>
> oh. i wasn't aware of that change, thanks for the pointer. well if they
> say so. i've encountered and fixed various bugs on my 3.3.1 github tree,
> so i don't consider it as stable as it should be.
> anyhow, as stated before, that's nagios devs' decision not mine ;-)
>

All non-trivial programs have bugs. How many have you fixed in Icinga
that were shipped in "stable" releases? The ones reported for Nagios have
all been fairly "safe" in that they're small or one-time leaks, exist in
code not normally exercised (recently added features without ui support),
changes in behaviour that might as well have been misdocumented in the
first place or only triggered by certain combinations of eventbroker
modules.

And yes, 3.3.1 is the latest stable. I'm not aware of any bugs in it,
apart from the potential one that Dorian sent me a patch for a few weeks
ago that I still haven't had time to review and test properly, so it's a
bug in potentia, but not actually verified.

--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.
From michael.friedrich at univie.ac.at Mon Oct 17 14:19:50 2011
From: michael.friedrich at univie.ac.at (Michael Friedrich)
Date: Mon, 17 Oct 2011 14:19:50 +0200
Subject: About new release for Nagios
In-Reply-To: <4E9C18D4.6060107@op5.se>
References: <4E9BF0F7.2070007@univie.ac.at> <4E9C18D4.6060107@op5.se>
Message-ID: <4E9C1D66.5010201@univie.ac.at>

Andreas Ericsson wrote:
> On 10/17/2011 11:10 AM, Michael Friedrich wrote:
>> Fournier, Wim wrote:
>>> On 10/17/11 9:31 AM, "Michael Friedrich"
>>> wrote:
>>>
>>>> 3.2.3 is considered stable, 3.3.x is a developer release tree and
>>> Are you sure? 3.3.1 is marked as the latest stable on
>>> http://nagios.org/download/core/thanks/?registered=1 and 3.2.3 as the
>>> previous stable.
>> oh. i wasn't aware of that change, thanks for the pointer. well if they
>> say so. i've encountered and fixed various bugs on my 3.3.1 github tree,
>> so i don't consider it as stable as it should be.
>> anyhow, as stated before, that's nagios devs' decision not mine ;-)
>>
> All non-trivial programs have bugs. How many have you fixed in Icinga
> that were shipped in "stable" releases? The ones reported for Nagios have
> all been fairly "safe" in that they're small or one-time leaks, exist in
> code not normally exercised (recently added features without ui support),
> changes in behaviour that might as well have been misdocumented in the
> first place or only triggered by certain combinations of eventbroker
> modules.

i've added my input on the empty perfdata behavioral change on the nagios-devel lists and i do think that this is a bug because it actually breaks compatibility. even if not intended, if the behaviour stayed there for a long time, i don't see the reason to "fix" that this way. but solely, that's just my opinion even if i break the abi myself from time to time. just a hint to reduce support questions.

>
> And yes, 3.3.1 is the latest stable.
> I'm not aware of any bugs in it,
> apart from the potential one that Dorian sent me a patch for a few weeks
> ago that I still haven't had time to review and test properly, so it's a
> bug in potentia, but not actually verified.

i've looked over them and i am still waiting for further input on what exactly leaks. i'm with you, free'ing the macros themselves rather than cleaning up the whole "mess".

--
DI (FH) Michael Friedrich

Vienna University Computer Center
Universitaetsstrasse 7 A-1010 Vienna, Austria

email: michael.friedrich at univie.ac.at
phone: +43 1 4277 14359
mobile: +43 664 60277 14359
fax: +43 1 4277 14338
web: http://www.univie.ac.at/zid
     http://www.aco.net

Icinga Core & IDOUtils Developer
http://www.icinga.org

From yu.watanabe at jp.fujitsu.com Tue Oct 18 04:19:53 2011
From: yu.watanabe at jp.fujitsu.com (Yu Watanabe)
Date: Tue, 18 Oct 2011 11:19:53 +0900
Subject: Service check latency rises when service notification event occurs frequently
Message-ID: <201110180219.AA04411@S2007337.jp.fujitsu.com>

Hi all!

We are doing some performance tests with nagios 3.3.1 in the following
environment.
Server 1

RHEL 5.5 64 bit
1CPU Xeon E3-1220 3.10 GHz
Memory 8GB
Disk 450GB (Raid 1)

Server 2

RHEL 5.5 64 bit
2CPU Xeon E5630 2.53 GHz
Memory 8GB
Disk 300GB (Raid 5)

3011 hosts
6173 services (3011 ping check)

We are also putting some load on in the background:

1. 347 syslog msg per sec
2. 1 passive service check per sec generating a notification event to two contact groups
3. 30 ms of network traffic latency
4. cacti polling

I have noticed that Server 1 has an average service check latency of 80 seconds, but
Server 2 averages below 10 seconds.

Does the notification process naturally affect service check scheduling?

Thanks,
Yu

From ae at op5.se Tue Oct 18 11:57:53 2011
From: ae at op5.se (Andreas Ericsson)
Date: Tue, 18 Oct 2011 11:57:53 +0200
Subject: Service check latency rises when service notification event occurs frequently
In-Reply-To: <201110180219.AA04411@S2007337.jp.fujitsu.com>
References: <201110180219.AA04411@S2007337.jp.fujitsu.com>
Message-ID: <4E9D4DA1.4080603@op5.se>

On 10/18/2011 04:19 AM, Yu Watanabe wrote:
> Hi all!
>
> We are doing some performance tests with nagios 3.3.1 in the following
> environment.
>
> Server 1
>
> RHEL 5.5 64 bit
> 1CPU Xeon E3-1220 3.10 GHz
> Memory 8GB
> Disk 450GB (Raid 1)
>
> Server 2
>
> RHEL 5.5 64 bit
> 2CPU Xeon E5630 2.53 GHz
> Memory 8GB
> Disk 300GB (Raid 5)
>
> 3011 hosts
> 6173 services (3011 ping check)
>
> We are also putting some load on in the background:
>
> 1. 347 syslog msg per sec
> 2. 1 passive service check per sec generating a notification event to two contact groups
> 3. 30 ms of network traffic latency
> 4. cacti polling
>
> I have noticed that Server 1 has an average service check latency of 80 seconds, but
> Server 2 averages below 10 seconds.
>

Server 2 has Raid 5 (superior to Raid 1) and an extra CPU. I'm not very
surprised that it performs better than server 1. What happens if you
put spool directories and objects.cache and status.sav on ramdisk?

> Does the notification process naturally affect service check scheduling?
>

Not by much, no.

--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.
::: Messages without supporting info will risk being sent to /dev/null From Sven.Nierlein at consol.de Tue Oct 18 12:57:54 2011 From: Sven.Nierlein at consol.de (Sven Nierlein) Date: Tue, 18 Oct 2011 12:57:54 +0200 Subject: Service check latency rises when service notification event occurs frequently In-Reply-To: <4E9D4DA1.4080603@op5.se> References: <201110180219.AA04411@S2007337.jp.fujitsu.com> <4E9D4DA1.4080603@op5.se> Message-ID: <4E9D5BB2.8000908@consol.de> On 18.10.2011 11:57, Andreas Ericsson wrote: >> Does notification process naturally effect the service check scheduling? >> > > Not by much, no. Not sure about that. Notifications are sent out during reaping the results and therefor block the main loop for as long as the notification takes. Usually notification commands are very fast and hopefully will not be necessary all the time, but depending on your notification command, it could have an impact on your scheduling. Sven ------------------------------------------------------------------------------ All the data continuously generated in your IT infrastructure contains a definitive record of customers, application performance, security threats, fraudulent activity and more. Splunk takes this data and makes sense of it. Business sense. IT sense. Common sense. http://p.sf.net/sfu/splunk-d2d-oct _______________________________________________ Nagios-users mailing list Nagios-users at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
::: Messages without supporting info will risk being sent to /dev/null From yu.watanabe at jp.fujitsu.com Tue Oct 18 13:23:24 2011 From: yu.watanabe at jp.fujitsu.com (Yu Watanabe) Date: Tue, 18 Oct 2011 20:23:24 +0900 Subject: Service check latency rises when service notification event occurs frequently In-Reply-To: <4E9D4DA1.4080603@op5.se> References: <4E9D4DA1.4080603@op5.se> Message-ID: <201110181123.AA04413@S2007337.jp.fujitsu.com> Andreas Ericsson ????????: >On 10/18/2011 04:19 AM, Yu Watanabe wrote: >> Hi all! >> >> We are doing some performance test with nagios 3.3.1 in following >> environement. >> >> Server 1 >> >> RHEL 5.5 64 bit >> 1CPU Xeon E3-1220 3.10 GHz >> Memory 8GB >> Disk 450GB (Raid 1) >> >> Server 2 >> >> RHEL 5.5 64 bit >> 2CPU Xeon E5630 2.53 GHz >> Memory 8GB >> Disk 300GB (Raid 5) >> >> 3011 hosts >> 6173 services (3011 ping check) >> >> Also putting some loads on the background, >> >> 1. 347 syslog msg per sec >> 2. 1 passvie service check per sec for notification event to two contact group >> 3. 30 ms of network traffic latency >> 4. cacti polling >> >> I have realized that Server 1 has service check latency for average 80 second but >> server 2 has average below 10 second. >> > >Server 2 has Raid 5 (superior to Raid 1) and an extra CPU. I'm not very >surprised that it performs better than server 1. What happens if you >put spool directories and objects.cache and status.sav on ramdisk? We didnt have time to do your suggestions , sorry about that... However, when I take off the notification event, the average service latency goes down below 1 sec in both servers. This was very strange. > >> Does notification process naturally effect the service check scheduling? >> > >Not by much, no. 
>
>--
>Andreas Ericsson andreas.ericsson at op5.se
>OP5 AB www.op5.se
>Tel: +46 8-230225 Fax: +46 8-230231
>
>Considering the successes of the wars on alcohol, poverty, drugs and
>terror, I think we should give some serious thought to declaring war
>on peace.
>

From yu.watanabe at jp.fujitsu.com Tue Oct 18 13:55:54 2011
From: yu.watanabe at jp.fujitsu.com (Yu Watanabe)
Date: Tue, 18 Oct 2011 20:55:54 +0900
Subject: Service check latency rises when service notification event occurs frequently
In-Reply-To: <4E9D5BB2.8000908@consol.de>
References: <4E9D5BB2.8000908@consol.de>
Message-ID: <201110181155.AA04414@S2007337.jp.fujitsu.com>

Sven Nierlein wrote:
>On 18.10.2011 11:57, Andreas Ericsson wrote:
>>> Does the notification process naturally affect service check scheduling?
>>>
>>
>> Not by much, no.
>
>
>Not sure about that. Notifications are sent out while reaping the results and therefore block the
>main loop for as long as the notification takes. Usually notification commands are very fast and
>hopefully will not be necessary all the time, but depending on your notification command, it
>could have an impact on your scheduling.

Thank you for the reply. I used the standard OS mail command, '/bin/mail'.
Do you think that command affected the scheduling process?

Thanks,
Yu

>
> Sven

From Sven.Nierlein at consol.de Tue Oct 18 14:33:27 2011
From: Sven.Nierlein at consol.de (Sven Nierlein)
Date: Tue, 18 Oct 2011 14:33:27 +0200
Subject: Service check latency rises when service notification event occurs frequently
In-Reply-To: <201110181155.AA04414@S2007337.jp.fujitsu.com>
References: <4E9D5BB2.8000908@consol.de> <201110181155.AA04414@S2007337.jp.fujitsu.com>
Message-ID: <4E9D7217.9080402@consol.de>

On 18.10.2011 13:55, Yu Watanabe wrote:
> Thank you for the reply.
> I used the standard OS mail command, '/bin/mail'.
> Do you think that command affected the scheduling process?

How should I know? In a simple test, I can send up to 3 mails per second with local mail delivery.

%> time echo "test" | mail sven

real    0m0.329s
user    0m0.010s
sys     0m0.000s

So depending on your mail configuration and the number of notifications, you can simply calculate how long your core is blocked.

Sven

From yu.watanabe at jp.fujitsu.com Wed Oct 19 02:38:22 2011
From: yu.watanabe at jp.fujitsu.com (Yu Watanabe)
Date: Wed, 19 Oct 2011 09:38:22 +0900
Subject: Service check latency rises when service notification event occurs frequently
In-Reply-To: <4E9D7217.9080402@consol.de>
References: <4E9D7217.9080402@consol.de>
Message-ID: <201110190038.AA04415@S2007337.jp.fujitsu.com>

Sven Nierlein wrote:
>On 18.10.2011 13:55, Yu Watanabe wrote:
>> Thank you for the reply. I used the standard OS mail command, '/bin/mail'.
>> Do you think that command affected the scheduling process?
>
>How should I know? In a simple test, I can send up to 3 mails per second with local mail delivery.
>
>%> time echo "test" | mail sven
>
>real    0m0.329s
>user    0m0.010s
>sys     0m0.000s
>
>So depending on your mail configuration and the number of notifications, you can simply calculate how long your core is blocked.

I see. Thank you for the advice!

Thanks,
Yu

>
> Sven

From mysqlstudent at gmail.com Wed Oct 19 07:38:10 2011
From: mysqlstudent at gmail.com (Alex)
Date: Wed, 19 Oct 2011 01:38:10 -0400
Subject: Monitoring clamd.amavisd
Message-ID:

Hi,

I need to monitor the clamd.amavisd binary, but I am having difficulty with the check_procs command.
When using the following check_procs call, it isn't able to identify any running processes:

# /usr/lib64/nagios/plugins/check_procs -w 1: -c 1: -C clamd.amavisd -u amavis
PROCS CRITICAL: 0 processes with command name 'clamd.amavisd', UID = 496 (amavis)

However, the process is there:

# ps ax|grep clam
 1066 ?        Ssl    1:13 clamd.amavisd -c /etc/clamd.d/amavisd.conf --pid /var/run/clamd.amavisd/clamd.pid

If I change the check_procs call to just look for "clamd", it matches, but it also matches clamdscan, which also runs periodically, and I don't want it to do that.

Do you have any suggestions for what the problem may be? Is it because of the dot in clamd.amavisd?

Thanks for any ideas.
Best regards,
Alex

From pitchfork at ederdrom.de Wed Oct 19 08:08:35 2011
From: pitchfork at ederdrom.de (Joerg Linge)
Date: Wed, 19 Oct 2011 08:08:35 +0200
Subject: Monitoring clamd.amavisd
In-Reply-To:
References:
Message-ID: <7A9E1416-FB8A-4D02-9672-EFF28BEDD1ED@ederdrom.de>

On 19.10.2011 at 07:38, Alex wrote:

> Hi,

Hi Alex

> I need to monitor the clamd.amavisd binary, but I am having difficulty with
> the check_procs command.
> When using the following check_procs call, it
> isn't able to identify any running processes:
>
> # /usr/lib64/nagios/plugins/check_procs -w 1: -c 1: -C clamd.amavisd -u amavis
> PROCS CRITICAL: 0 processes with command name 'clamd.amavisd', UID = 496 (amavis)

Append -vv to your check_procs call to get more information about the internals.

> However, the process is there:
>
> # ps ax|grep clam
> 1066 ? Ssl 1:13 clamd.amavisd -c /etc/clamd.d/amavisd.conf --pid /var/run/clamd.amavisd/clamd.pid
>
> If I change the check_procs call to just look for "clamd", it matches, but
> it also matches clamdscan, which also runs periodically, and I don't
> want it to do that.
>
> Do you have any suggestions for what the problem may be? Is it because
> of the dot in clamd.amavisd?

Example:

OMD[gearman]:~$ ps -ef | grep amavis
gearman    848 32594  0 08:03 pts/1    00:00:00 grep amavis
amavis    1571     1  0 Sep09 ?        00:00:39 amavisd (master)
amavis    2408  1571  0 Oct18 ?        00:00:09 amavisd (ch15-avail)
amavis   31143  1571  0 Oct18 ?        00:00:10 amavisd (ch16-avail)

Ahh, an amavisd process!

OMD[gearman]:~$ ./lib/nagios/plugins/check_procs -C amavisd
PROCS OK: 0 processes with command name 'amavisd'

WTF? No amavisd process?
So let's be more verbose ...

OMD[gearman]:~$ ./lib/nagios/plugins/check_procs -C amavisd -vvv |head -n1
CMD: /bin/ps axwo 'stat uid pid ppid vsz rss pcpu comm args'

Ahh, check_procs uses other ps options!

OMD[gearman]:~$ /bin/ps axwo 'stat uid pid ppid vsz rss pcpu comm args' | grep amavis
Ss     106  1571     1 207404 10208  0.0 amavisd-new     amavisd (master)
S+     999  1938 32594   9928   844  0.0 grep            grep amavis
S      106  2408  1571 213380 61968  0.0 amavisd-new     amavisd (ch15-avail)
S      106 31143  1571 212216 60476  0.0 amavisd-new     amavisd (ch16-avail)

So the process name is amavisd-new and not just amavisd!
Let's call check_procs again:

OMD[gearman]:~$ ./lib/nagios/plugins/check_procs -C amavisd-new
PROCS OK: 3 processes with command name 'amavisd-new'

Much better!
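
The lookup above can be folded into a small shell helper. This is only a sketch under assumptions: the name comm_for_pattern is made up for illustration, it assumes a ps that accepts "axwo comm,args" (as the check_procs debug output above suggests on Linux), and it assumes command names without spaces. It reads ps-style "comm args" lines on standard input and prints the distinct comm values, which is the column check_procs -C matched in the example above, for every process whose argument list matches a pattern.

```shell
#!/bin/sh
# comm_for_pattern (hypothetical helper name): read "comm args..." lines,
# e.g. from `ps axwo comm,args`, and print each distinct comm whose
# args column matches the extended regex given as $1.
comm_for_pattern() {
    awk -v pat="$1" '
        # Skip the header line that ps prints ("COMMAND COMMAND").
        NR == 1 && $1 == "COMMAND" { next }
        {
            comm = $1
            args = $0
            sub(/^ *[^ ]+ +/, "", args)   # drop the comm column, keep args
            # Best-effort filter: ignore the grep/awk helpers themselves.
            if (args ~ pat && comm !~ /^(grep|.*awk)$/)
                print comm
        }
    ' | sort -u
}

# Usage against the live process table (output depends on the host):
#   ps axwo comm,args | comm_for_pattern 'clamd'
```

Whatever name this prints is a candidate value for -C; if the args column is the only reliable identifier, matching on arguments instead of the command name may be the better route.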

HTH
Joerg

From benj at dotnul.com Wed Oct 19 14:08:06 2011
From: benj at dotnul.com (KRAFT Benjamin)
Date: Wed, 19 Oct 2011 14:08:06 +0200
Subject: Problem installing Nagios 3.3.1 with --enable-embedded-per in CentOS 6 64 bits
In-Reply-To: <4E703A14.7060705@free.fr>
References: <4E6FB0C1.1010503@free.fr> <3D480E2907FD164191FE65820A669FFF02E43F94@POM-LA-MBX01.pomwonderful.com> <4E703A14.7060705@free.fr>
Message-ID:

Hi,

I had a similar problem on a CentOS 6 64-bit platform, but ./configure showed me the following:

Can't locate ExtUtils/Embed.pm in @INC (@INC contains: /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .).
BEGIN failed--compilation aborted.

I then just installed perl-ExtUtils-Embed, and that resolved the problem.

Hope it helps.

Regards,
Benjamin

On Wed, Sep 14, 2011 at 7:22 AM, xesos wrote:
> Hello,
>
> Yes, perl-devel is installed:
>
> - yum install httpd php gcc make gd gd-devel perl perl-devel wget mailx
>
> Thanks,
>
> Regards.
>
> On 13/09/2011 23:53, Werner, Robert wrote:
>
> One quick guess: do you have the perl-devel rpm installed?
>
> --
> Robert G. Werner
> Oracle Apps Systems Administrator
> rwerner at pomwonderful.com
> 559.521.5089
>
>
> -----Original Message-----
> From: xesos [mailto:xesos at free.fr]
> Sent: Tuesday, September 13, 2011 12:37 PM
> To: nagios-users at lists.sourceforge.net
> Subject: [Nagios-users] Problem installing Nagios 3.3.1 with --enable-embedded-per in CentOS 6 64 bits
>
> Hello,
>
> I cannot compile Nagios with --enable-embedded-perl on CentOS 6 64
> bits. The system is a default (minimal) installation.
> Here is my whole procedure and the errors:
>
> - yum install httpd php gcc make gd gd-devel perl perl-devel wget mailx
>
> - groupadd nagios
> - useradd -g nagios -d /usr/local/nagios -M -s /bin/bash nagios
>
> - groupadd nagcmd
> - usermod -a -G nagcmd nagios
> - usermod -a -G nagcmd apache
>
> - cd /usr/local/src/
> - wget http://downloads.sourceforge.net/project/nagios/nagios-3.x/nagios-3.3.1/nagios-3.3.1.tar.gz?r=http%3A%2F%2Fwww.nagios.org%2Fdownload%2Fcore%2Fthanks%2F&ts=1315941098&use_mirror=freefr
> - tar -xzf nagios-3.3.1.tar.gz
> - cd nagios
> - ./configure --prefix=/usr/local/nagios --sysconfdir=/etc/nagios --localstatedir=/var/nagios --with-nagios-user=nagios --with-nagios-grp=nagios --with-command-user=nagios --with-command-group=nagcmd --with-gd-lib=/usr/local/lib --with-gd-inc=/usr/local/include --with-cgiurl=/nagios/cgi-bin --with-htmurl=/nagios --with-mail=/bin/mail --with-httpd-conf=/etc/httpd/conf.d --enable-nanosleep --enable-embedded-perl --with-perlcache
> - make all
>
> > cd ./base && make
> > make[1]: Entering directory '/usr/local/src/nagios/base'
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -c -o broker.o broker.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -c -o nebmods.o nebmods.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -c -o ../common/shared.o ../common/shared.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -c -o checks.o checks.c
> > In file included from checks.c:41:
> > ../include/epn_nagios.h:11:20: error: EXTERN.h: No such file or directory
> > ../include/epn_nagios.h:12:18: error: perl.h: No such file or directory
> > In file included from checks.c:41:
> > ../include/epn_nagios.h:31: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'void'
> > checks.c: In function 'run_async_service_check':
> > checks.c:348: error: 'SV' undeclared (first use in this function)
> > checks.c:348: error: (Each undeclared identifier is reported only once
> > checks.c:348: error: for each function it appears in.)
> > checks.c:348: error: 'plugin_hndlr_cr' undeclared (first use in this function)
> > checks.c:354: error: 'dSP' undeclared (first use in this function)
> > checks.c:551: error: 'ENTER' undeclared (first use in this function)
> > checks.c:552: error: 'SAVETMPS' undeclared (first use in this function)
> > checks.c:553: warning: implicit declaration of function 'PUSHMARK'
> > checks.c:553: error: 'SP' undeclared (first use in this function)
> > checks.c:554: warning: implicit declaration of function 'XPUSHs'
> > checks.c:554: warning: implicit declaration of function 'sv_2mortal'
> > checks.c:554: warning: implicit declaration of function 'newSVpv'
> > checks.c:558: error: 'PUTBACK' undeclared (first use in this function)
> > checks.c:562: warning: implicit declaration of function 'call_pv'
> > checks.c:562: error: 'G_SCALAR' undeclared (first use in this function)
> > checks.c:562: error: 'G_EVAL' undeclared (first use in this function)
> > checks.c:564: error: 'SPAGAIN' undeclared (first use in this function)
> > checks.c:566: warning: implicit declaration of function 'SvTRUE'
> > checks.c:566: error: 'ERRSV' undeclared (first use in this function)
> > checks.c:575: error: 'POPs' undeclared (first use in this function)
> > checks.c:578: warning: implicit declaration of function 'SvPVX'
> > checks.c:623: warning: implicit declaration of function 'newSVsv'
> > checks.c:628: error: 'FREETMPS' undeclared (first use in this function)
> > checks.c:629: error: 'LEAVE' undeclared (first use in this function)
> > checks.c:707: error: 'G_ARRAY' undeclared (first use in this function)
> > checks.c:711: error: 'POPpx' undeclared (first use in this function)
> > checks.c:712: error: 'POPi' undeclared (first use in this function)
> > make[1]: *** [checks.o] Error 1
> > make[1]: Leaving directory '/usr/local/src/nagios/base'
> > make: *** [all] Error 2
>
> - yum install mlocate
> - updatedb
> - locate EXTERN.h
>
> > /usr/lib64/perl5/CORE/EXTERN.h
>
> - locate perl.h
>
> > /usr/lib64/perl5/CORE/perl.h
>
> - export CPPFLAGS="-I/usr/lib64/perl5/CORE/"
> - make all
>
> > cd ./base && make
> > make[1]: Entering directory '/usr/local/src/nagios/base'
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -I/usr/lib64/perl5/CORE/ -c -o broker.o broker.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -I/usr/lib64/perl5/CORE/ -c -o nebmods.o nebmods.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -c -o ../common/shared.o ../common/shared.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -I/usr/lib64/perl5/CORE/ -c -o checks.o checks.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -I/usr/lib64/perl5/CORE/ -c -o config.o config.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -I/usr/lib64/perl5/CORE/ -c -o commands.o commands.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -I/usr/lib64/perl5/CORE/ -c -o events.o events.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -I/usr/lib64/perl5/CORE/ -c -o flapping.o flapping.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -I/usr/lib64/perl5/CORE/ -c -o logging.o logging.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -c -o macros-base.o ../common/macros.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -I/usr/lib64/perl5/CORE/ -c -o netutils.o netutils.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -I/usr/lib64/perl5/CORE/ -c -o notifications.o notifications.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -I/usr/lib64/perl5/CORE/ -c -o sehandlers.o sehandlers.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -c -o skiplist.o ../common/skiplist.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -I/usr/lib64/perl5/CORE/ -c -o utils.o utils.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -c -o retention-base.o sretention.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -c -o xretention-base.o ../xdata/xrddefault.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -c -o comments-base.o ../common/comments.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -c -o xcomments-base.o ../xdata/xcddefault.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -c -o objects-base.o ../common/objects.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -c -o xobjects-base.o ../xdata/xodtemplate.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -c -o statusdata-base.o ../common/statusdata.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -c -o xstatusdata-base.o ../xdata/xsddefault.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -c -o perfdata-base.o perfdata.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -c -o xperfdata-base.o ../xdata/xpddefault.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -c -o downtime-base.o ../common/downtime.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -c -o xdowntime-base.o ../xdata/xdddefault.c
> > gcc -Wall -g -O2 -I/usr/local/include -DHAVE_CONFIG_H -DNSCORE -o nagios nagios.c broker.o nebmods.o ../common/shared.o checks.o config.o commands.o events.o flapping.o logging.o macros-base.o netutils.o notifications.o sehandlers.o skiplist.o utils.o retention-base.o xretention-base.o comments-base.o xcomments-base.o objects-base.o xobjects-base.o statusdata-base.o xstatusdata-base.o perfdata-base.o xperfdata-base.o downtime-base.o xdowntime-base.o perlxsi.o -Wl,-export-dynamic -L/usr/local/lib -lm -lpthread -ldl
> > gcc: perlxsi.o: No such file or directory
> > make[1]: *** [nagios] Error 1
> > make[1]: Leaving directory '/usr/local/src/nagios/base'
> > make: *** [all] Error 2
>
> Any idea what the problem is?
>
> Thanks,
>
> Regards.

From noc at fohnet.co.uk Wed Oct 19 18:15:22 2011
From: noc at fohnet.co.uk (Richard Clark)
Date: Wed, 19 Oct 2011 17:15:22 +0100
Subject: Monitoring clamd.amavisd
In-Reply-To:
References:
Message-ID: <20111019161522.GA18166@fohnet.co.uk>

On Wed, Oct 19, 2011 at 01:38:10AM -0400, Alex wrote:
> Hi,
>
> I need to monitor the clamd.amavisd binary, but I am having difficulty with
> the check_procs command. When using the following check_procs call, it
> isn't able to identify any running processes:
>
> # /usr/lib64/nagios/plugins/check_procs -w 1: -c 1: -C clamd.amavisd -u amavis
> PROCS CRITICAL: 0 processes with command name 'clamd.amavisd', UID = 496 (amavis)
>
> However, the process is there:
>
> # ps ax|grep clam
> 1066 ? Ssl 1:13 clamd.amavisd -c /etc/clamd.d/amavisd.conf --pid /var/run/clamd.amavisd/clamd.pid
>
> If I change the check_procs call to just look for "clamd", it matches, but
> it also matches clamdscan, which also runs periodically, and I don't
> want it to do that.
>
> Do you have any suggestions for what the problem may be? Is it because
> of the dot in clamd.amavisd?
>
> Thanks for any ideas.
> Best regards,
> Alex

Try single-quoting the process name and see if that makes a difference.

Failing that: check_procs can be a bit fussy sometimes and doesn't see processes exactly the same way 'ps ax' does. See here for more explanation:

http://bangbangsoundslikemachinery.blogspot.com/2011/09/nagios-plugin-checkprocs-incorrectly.html

Cheers,
--
Richard Clark
richard at fohnet.co.uk

From Eliot.Picken at wenaas.co.uk Wed Oct 19 19:04:58 2011
From: Eliot.Picken at wenaas.co.uk (Eliot.Picken at wenaas.co.uk)
Date: Wed, 19 Oct 2011 18:04:58 +0100
Subject: AUTO: Eliot Picken is out of the office (returning 03/11/2011)
Message-ID:

I am out of the office until 03/11/2011.

I am currently out of the office on annual leave, and your email has not been forwarded. For emergency issues, please contact Alex Lawrie on +44 (0) 1224 894 000; otherwise I will respond to your email upon my return.

Note: This is an automated response to your message "Re: [Nagios-users] Monitoring clamd.amavisd" sent on 10/19/2011 5:15:22 PM.

This is the only notification you will receive while this person is away.

From nagios at walkwith.me.uk Wed Oct 19 19:24:33 2011
From: nagios at walkwith.me.uk (Alan Munday)
Date: Wed, 19 Oct 2011 18:24:33 +0100
Subject: Monitoring clamd.amavisd
In-Reply-To:
References:
Message-ID: <4E9F07D1.7070209@walkwith.me.uk>

Alex wrote the following on 19/10/11 06:38:
> Hi,
>
> I need to monitor the clamd.amavisd binary, but I am having difficulty with
> the check_procs command.
When using the following check_procs, it
> isn't able to identify any running processes:
>
> # /usr/lib64/nagios/plugins/check_procs -w 1: -c 1: -C clamd.amavisd -u amavis
> PROCS CRITICAL: 0 processes with command name 'clamd.amavisd', UID = 496 (amavis)
>
> However, the process is there:
>
> # ps ax|grep clam
> 1066 ? Ssl 1:13 clamd.amavisd -c /etc/clamd.d/amavisd.conf --pid /var/run/clamd.amavisd/clamd.pid
>

Alex

I'm checking remote clamd.amavisd processes using check_clamd & nrpe via the following:

On the remote servers in /etc/nagios/nrpe.cfg:

command[check_clamd]=/usr/lib/nagios/plugins/check_clamd -H /var/run/clamd/clamd.sock

Then from the nagios server defining a service to call this:

check_command check_nrpe!check_clamd

HTH

Alan

From mysqlstudent at gmail.com Wed Oct 19 20:43:48 2011
From: mysqlstudent at gmail.com (Alex)
Date: Wed, 19 Oct 2011 14:43:48 -0400
Subject: Monitoring clamd.amavisd
In-Reply-To: <4E9F07D1.7070209@walkwith.me.uk>
References: <4E9F07D1.7070209@walkwith.me.uk>
Message-ID:

Hi,

>> However, the process is there:
>>
>> # ps ax|grep clam
>> 1066 ? Ssl 1:13 clamd.amavisd -c /etc/clamd.d/amavisd.conf --pid /var/run/clamd.amavisd/clamd.pid
>>
>
> Alex
>
> I'm checking remote clamd.amavisd processes using check_clamd & nrpe via
> the following:
>
> On the remote servers in /etc/nagios/nrpe.cfg:
>
> command[check_clamd]=/usr/lib/nagios/plugins/check_clamd -H /var/run/clamd/clamd.sock
>
> Then from the nagios server defining a service to call this:
>
> check_command check_nrpe!check_clamd

Perfect, thanks. I should have realized there would be a check_clamd plugin (from nagios-plugins-tcp-1.4.15-4.fc15.x86_64 in my case).

Thanks everyone for the other tips as well. They will be helpful in the future.

Best,
Alex

From mysqlstudent at gmail.com Wed Oct 19 22:13:53 2011
From: mysqlstudent at gmail.com (Alex)
Date: Wed, 19 Oct 2011 16:13:53 -0400
Subject: Monitoring clamd.amavisd
In-Reply-To:
References: <4E9F07D1.7070209@walkwith.me.uk>
Message-ID:

Hi,

>> I'm checking remote clamd.amavisd processes using check_clamd & nrpe via
>> the following:
>>
>> On the remote servers in /etc/nagios/nrpe.cfg:
>>
>> command[check_clamd]=/usr/lib/nagios/plugins/check_clamd -H /var/run/clamd/clamd.sock
>>
>> Then from the nagios server defining a service to call this:
>>
>> check_command check_nrpe!check_clamd
>
> Perfect, thanks.
I should have realized there would be a check_clamd
> plugin (from nagios-plugins-tcp-1.4.15-4.fc15.x86_64 in my case).

I should add that in my specific case, I used the following:

command[check_procs_clamd]=/usr/lib64/nagios/plugins/check_clamd -H /var/spool/amavisd/clamd.sock

This failed with permission denied because /var/spool/amavisd was inaccessible to user nagios. I've changed the permissions to 711 so user nagios can access the directory, which is owned by user amavis.

Is this correct, or is it preferable to add user nagios to the amavis group, or somehow run check_clamd as user amavis?

Thanks again,
Alex

From xml.devel at gmail.com Thu Oct 20 07:55:57 2011
From: xml.devel at gmail.com (Kumar, Ashish)
Date: Thu, 20 Oct 2011 11:25:57 +0530
Subject: Suggestion on SNMP disk space checker
In-Reply-To:
References: <138F24FA-568E-4DC3-9637-2585C0DFD563@theflux.net>
Message-ID:

On 14 October 2011 09:56, Joerg Linge wrote:
>
> http://nagios.manubulon.com/snmp_storage.html
>

+1 to that. We use this plugin to monitor disk space on Linux and Windows servers and have never had any trouble.

From AHKAPLAN at PARTNERS.ORG Thu Oct 20 18:07:22 2011
From: AHKAPLAN at PARTNERS.ORG (Kaplan, Andrew H.)
Date: Thu, 20 Oct 2011 12:07:22 -0400
Subject: Internal Server Error when trying to access Nagios 3 on an Ubuntu 10.04 LTS system
Message-ID:

Hi there --

I went through the motions of installing the Nagios 3.2.0-4ubuntu2.2 package on an Ubuntu 10.04 LTS 32-bit system. The packages shown below were also installed with the core software:

nagcon                   0.0.30-0ubuntu1   K-O  console application interfacing to Nagios
nagios-images            0.5               K-O  Collection of images and icons for the nagios system
nagios-nrpe-plugin       2.12-4ubuntu1     K-O  Nagios Remote Plugin Executor Plugin
nagios-nrpe-server       2.12-4ubuntu1     K-O  Nagios Remote Plugin Executor Server
nagios-plugins           1.4.14-1ubuntu1   K-O  Plugins for the nagios network monitoring and management system
nagios-plugins-basic     1.4.14-1ubuntu1   K-O  Plugins for the nagios network monitoring and management system
nagios-plugins-extra     1.4.14-1ubuntu1   K-O  Plugins for the nagios network monitoring and manegement system.
nagios-plugins-standard  1.4.14-1ubuntu1   K-O  Plugins for the nagios network monitoring and management system
nagios-snmp-plugins      1.1.1-6           K-O  SNMP Plugins for nagios
nagios-statd-client      3.12-1            K-O  Nagios client for monitoring remote system information
nagios-statd-server      3.12-1            K-O  Nagios server for monitoring remote system information
nagios3-cgi              3.2.0-4ubuntu2.2  K-O  cgi files for nagios3
nagios3-common           3.2.0-4ubuntu2.2  K-O  support files for nagios3
nagios3-core             3.2.0-4ubuntu2.2  K-O  A host/service/network monitoring and management system core files
nagios3-dbg              3.2.0-4ubuntu2.2  K-O  debugging symbols and debug stuff for nagios3
nagios3-doc              3.2.0-4ubuntu2.2  K-O  documentation for nagios3
nagiosgrapher            1.7.1-1           K-O  Charting add-on for Nagios
nagvis                   1.3.1-3           K-O  Visualization addon for Nagios
ndoutils-common          1.4b7-11build1    K-O  This provides the NDOUtils for Nagios with MySQL support
nsca                     2.7.2             K-O  Nagios service monitor agent

The server is already configured as a LAMP server with the Apache 2.2.14, and MySQL 5.1.41 applications with the PHP 5.3.2 library. When I try to access the web interface of the server, using the URL /nagios3, I am confronted with an Internal Server Error message. The login user account is the nagios account.

What step(s) can I take to be able to access the web interface of the Nagios application?

Thanks.

The information in this e-mail is intended only for the person to whom it is addressed. If you believe this e-mail was sent to you in error and the e-mail contains patient information, please contact the Partners Compliance HelpLine at http://www.partners.org/complianceline . If the e-mail was sent to you in error but does not contain patient information, please contact the sender and properly dispose of the e-mail.

From AHKAPLAN at PARTNERS.ORG Thu Oct 20 19:14:28 2011
From: AHKAPLAN at PARTNERS.ORG (Kaplan, Andrew H.)
Date: Thu, 20 Oct 2011 13:14:28 -0400
Subject: Internal Server Error when trying to access Nagios 3on an Ubuntu 10.04 LTS system
In-Reply-To:
References:
Message-ID:

Hi there --

Sorry about the false alarm. The issue was the absence of an htpasswd.users file. Once that was put into place, access to the web interface was successful.

________________________________
From: Kaplan, Andrew H.
Sent: Thursday, October 20, 2011 12:07 PM
To: Nagios Users List
Subject: [Nagios-users] Internal Server Error when trying to access Nagios 3on an Ubuntu 10.04 LTS system

Hi there --

I went through the motions of installing the Nagios 3.2.0-4ubuntu2.2 package on an Ubuntu 10.04 LTS 32-bit system. The packages shown below were also installed with the core software:

nagcon                   0.0.30-0ubuntu1   K-O  console application interfacing to Nagios
nagios-images            0.5               K-O  Collection of images and icons for the nagios system
nagios-nrpe-plugin       2.12-4ubuntu1     K-O  Nagios Remote Plugin Executor Plugin
nagios-nrpe-server       2.12-4ubuntu1     K-O  Nagios Remote Plugin Executor Server
nagios-plugins           1.4.14-1ubuntu1   K-O  Plugins for the nagios network monitoring and management system
nagios-plugins-basic     1.4.14-1ubuntu1   K-O  Plugins for the nagios network monitoring and management system
nagios-plugins-extra     1.4.14-1ubuntu1   K-O  Plugins for the nagios network monitoring and manegement system.
nagios-plugins-standard  1.4.14-1ubuntu1   K-O  Plugins for the nagios network monitoring and management system
nagios-snmp-plugins      1.1.1-6           K-O  SNMP Plugins for nagios
nagios-statd-client      3.12-1            K-O  Nagios client for monitoring remote system information
nagios-statd-server      3.12-1            K-O  Nagios server for monitoring remote system information
nagios3-cgi              3.2.0-4ubuntu2.2  K-O  cgi files for nagios3
nagios3-common           3.2.0-4ubuntu2.2  K-O  support files for nagios3
nagios3-core             3.2.0-4ubuntu2.2  K-O  A host/service/network monitoring and management system core files
nagios3-dbg              3.2.0-4ubuntu2.2  K-O  debugging symbols and debug stuff for nagios3
nagios3-doc              3.2.0-4ubuntu2.2  K-O  documentation for nagios3
nagiosgrapher            1.7.1-1           K-O  Charting add-on for Nagios
nagvis                   1.3.1-3           K-O  Visualization addon for Nagios
ndoutils-common          1.4b7-11build1    K-O  This provides the NDOUtils for Nagios with MySQL support
nsca                     2.7.2             K-O  Nagios service monitor agent

The server is already configured as a LAMP server with the Apache 2.2.14, and MySQL 5.1.41 applications with the PHP 5.3.2 library. When I try to access the web interface of the server, using the URL /nagios3, I am confronted with an Internal Server Error message. The login user account is the nagios account.

What step(s) can I take to be able to access the web interface of the Nagios application?

Thanks.

The information in this e-mail is intended only for the person to whom it is addressed. If you believe this e-mail was sent to you in error and the e-mail contains patient information, please contact the Partners Compliance HelpLine at http://www.partners.org/complianceline . If the e-mail was sent to you in error but does not contain patient information, please contact the sender and properly dispose of the e-mail.

From AHKAPLAN at PARTNERS.ORG Thu Oct 20 19:25:47 2011
From: AHKAPLAN at PARTNERS.ORG (Kaplan, Andrew H.)
Date: Thu, 20 Oct 2011 13:25:47 -0400
Subject: Problem accessing the status map page
Message-ID:

Hi there --

This e-mail is sort of a sequel to a previous e-mail that I posted about a similar issue. The Nagios server is the 3.2.0 release on an Ubuntu LTS 10.04 32-bit system. I am able to access the web interface of the server, but one page of the interface, Status Map, returns an Internal Server Error message.
As part of the troubleshooting process, I ran the statusmap.cgi file as root from a terminal prompt. The output of the command is shown below:

./statusmap.cgi: symbol lookup error: ./statusmap.cgi: undefined symbol: gdImageCreateFromJpeg

I checked for the presence of the gd libraries on the server, and the packages that are installed are the following:

libgd-gd2-perl  2.39-2build1                K-O  Perl module wrapper for libgd - gd2 variant
libgd2-xpm      2.0.36~rc1~dfsg-3.1ubuntu1  K-O  GD Graphics Library version 2
libgd2-xpm-dev  2.0.36~rc1~dfsg-3.1ubuntu1  K-O  GD Graphics Library version 2 (development version)
libgdbm3        1.8.3-9                     K-O  GNU dbm database routines (runtime version)
php5-gd         5.3.2-1ubuntu4.10           P-T  GD module for php5

I did a similar check for the jpeg libraries on the server, and the packages shown below have been installed on the system:

libjpeg62      6b-15ubuntu1  K-O  The Independent JPEG Group's JPEG runtime library
libjpeg62-dbg  6b-15ubuntu1  K-O  Development files for the IJG JPEG library
libjpeg62-dev  6b-15ubuntu1  K-O  Development files for the IJG JPEG library

Are there any packages and/or libraries that I need to install on the server to get past this issue?

Thanks.

The information in this e-mail is intended only for the person to whom it is addressed. If you believe this e-mail was sent to you in error and the e-mail contains patient information, please contact the Partners Compliance HelpLine at http://www.partners.org/complianceline . If the e-mail was sent to you in error but does not contain patient information, please contact the sender and properly dispose of the e-mail.

From AdcockJ at leoncountyfl.gov Fri Oct 21 01:23:05 2011
From: AdcockJ at leoncountyfl.gov (Jon Adcock)
Date: Thu, 20 Oct 2011 19:23:05 -0400
Subject: Nagios Logout?
Message-ID: <4EA07519.D962.0075.0@leoncountyfl.gov>

Once I am logged into a particular Nagios account (Apache2 account) and viewing the Nagios webpage, is there any way to log out and log back in with a different account, short of closing the browser and re-opening it?

Jon Adcock
Network Systems Administrator
Leon County MIS
301 S. Monroe St.
Tallahassee, FL 32301
Office: (850) 606-5518
adcockj at leoncountyfl.gov

From anthony-nagios at hogan.id.au Fri Oct 21 02:04:14 2011
From: anthony-nagios at hogan.id.au (Anthony Hogan)
Date: Fri, 21 Oct 2011 11:04:14 +1100
Subject: Nagios Logout?
In-Reply-To: <4EA07519.D962.0075.0@leoncountyfl.gov>
References: <4EA07519.D962.0075.0@leoncountyfl.gov>
Message-ID:

Clearing your browser's login sessions (I believe some browsers allow this).

Alternatively, have a script within the nagios tree send an HTTP 401 code with the same auth message/description, even though you're sending the correct password. You'll get prompted for a password again.. just click cancel.. then I think the browser should have forgotten.

When using HTTP basic authentication, you're not actually "logging in" as such; your browser is just automatically sending the username and password to the 401 prompts within the folder/tree you've authenticated on.

HTTP is a stateless protocol (which is why cookies etc. were bolted onto the top of it).

On Fri, Oct 21, 2011 at 10:23, Jon Adcock wrote:
> Once I am logged into a particular Nagios account (Apache2 account) and
> viewing the Nagios webpage, is there any way to logout/log back into a
> different account, short of closing the browser and re-opening it?
>

From ratty at they.org Fri Oct 21 05:06:05 2011
From: ratty at they.org (frank)
Date: Thu, 20 Oct 2011 20:06:05 -0700 (PDT)
Subject: Nagios Logout?
In-Reply-To:
References: <4EA07519.D962.0075.0@leoncountyfl.gov>
Message-ID:

You could also try changing your URL manually to:

http://newusername@hostname/nagios/

A new http-auth prompt should pop up.

-f

On Fri, 21 Oct 2011, Anthony Hogan wrote:

> Date: Fri, 21 Oct 2011 11:04:14 +1100
> From: Anthony Hogan
> Reply-To: Nagios Users List
> To: Nagios Users List
> Subject: Re: [Nagios-users] Nagios Logout?
>
> Clearing your browser's login sessions (I believe some browsers allow this).
>
> Alternatively, have a script within the nagios tree send an HTTP 401 code with the same auth
> message/description, even though you're sending the correct password. You'll get prompted for a
> password again.. just click cancel.. then I think the browser should have forgotten.
>
> When using HTTP basic authentication, you're not actually "logging in" as such, your browser is
> just automatically sending the username and password to the 401 prompts within the folder/tree
> you've authenticated on.
>
> HTTP is a stateless protocol (which is why cookies etc. were bolted onto the top of it).
>
> On Fri, Oct 21, 2011 at 10:23, Jon Adcock wrote:
> Once I am logged into a particular Nagios account (Apache2 account) and viewing
> the Nagios webpage, is there any way to logout/log back into a different account,
> short of closing the browser and re-opening it?
>

From m.borsani at it.net Fri Oct 21 11:03:52 2011
From: m.borsani at it.net (Marco Borsani)
Date: Fri, 21 Oct 2011 11:03:52 +0200
Subject: R: Nagios Logout?
In-Reply-To:
References: <4EA07519.D962.0075.0@leoncountyfl.gov>
Message-ID: <007c01cc8fd0$61099810$231cc830$@it.net>

I tried this, but it does not work correctly.

-----Original Message-----
From: frank [mailto:ratty at they.org]
Sent: Friday, 21 October 2011 05:06
To: Nagios Users List
Subject: Re: [Nagios-users] Nagios Logout?

You could also try changing your URL manually to:

http://newusername@hostname/nagios/

A new http-auth prompt should pop up.

-f

On Fri, 21 Oct 2011, Anthony Hogan wrote:

> Date: Fri, 21 Oct 2011 11:04:14 +1100
> From: Anthony Hogan
> Reply-To: Nagios Users List
> To: Nagios Users List
> Subject: Re: [Nagios-users] Nagios Logout?
>
> Clearing your browser's login sessions (I believe some browsers allow this).
>
> Alternatively, have a script within the nagios tree send an HTTP 401 code with the same auth message/description, even though you're sending the correct password. You'll get prompted for a password again.. just click cancel.. then I think the browser should have forgotten.
>
> When using HTTP basic authentication, you're not actually "logging in" as such, your browser is just automatically sending the username and password to the 401 prompts within the folder/tree you've authenticated on.
>
> HTTP is a stateless protocol (which is why cookies etc. were bolted onto the top of it).
>
> On Fri, Oct 21, 2011 at 10:23, Jon Adcock wrote:
> Once I am logged into a particular Nagios account (Apache2 account) and viewing
> the Nagios webpage, is there any way to logout/log back into a different account,
> short of closing the browser and re-opening it?
>

From BChan at Shawcor.com Fri Oct 21 13:57:14 2011
From: BChan at Shawcor.com (Brian Chan)
Date: Fri, 21 Oct 2011 07:57:14 -0400
Subject: AUTO: Chan, Brian is away from the office on Business Travel (Returning 10/25//2011) (returning 25/10/2011)
Message-ID:

I am out of the office until 25/10/2011.

I will respond to your message when I return.
If this is a request for support, click here to open an ITRequest--:> mailto:itrequest at shawcor.com

Alternatively, all questions can be directed to the Help Desk at 416-744-5557

Brian Chan

Note: This is an automated response to your message "Nagios-users Digest, Vol 65, Issue 10" sent on 10/21/2011 7:54:54. This is the only notification you will receive while this person is away.

From james.osbourn at citrix.com Fri Oct 21 14:55:18 2011
From: james.osbourn at citrix.com (James Osbourn)
Date: Fri, 21 Oct 2011 13:55:18 +0100
Subject: postgresql monitoring
Message-ID: <09051C7A8945F944AB7AC4E86BEB1ED5B61AD07F8F@LONPMAILBOX01.citrite.net>

I have to start monitoring a postgresql server and am looking for some pointers on which plugin to use and the check command syntax.

Basically, I want to check that the database is up and responding, but any other recommended checks are welcome.

Thanks

James

From michael.friedrich at univie.ac.at Fri Oct 21 15:01:40 2011
From: michael.friedrich at univie.ac.at (Michael Friedrich)
Date: Fri, 21 Oct 2011 15:01:40 +0200
Subject: postgresql monitoring
In-Reply-To: <09051C7A8945F944AB7AC4E86BEB1ED5B61AD07F8F@LONPMAILBOX01.citrite.net>
References: <09051C7A8945F944AB7AC4E86BEB1ED5B61AD07F8F@LONPMAILBOX01.citrite.net>
Message-ID: <4EA16D34.3060201@univie.ac.at>

On 21.10.2011 14:55, James Osbourn wrote:
> I have to start monitoring a postgresql server and am looking for some pointers on which plugin to use and the check command syntax.

http://bucardo.org/wiki/Check_postgres

> Basically, I want to check that the database is up and responding, but any other recommended checks are welcome.
>
> Thanks
>
> James

--
DI (FH) Michael Friedrich
Vienna University Computer Center
Universitaetsstrasse 7
A-1010 Vienna, Austria
email: michael.friedrich at univie.ac.at
phone: +43 1 4277 14359
mobile: +43 664 60277 14359
fax: +43 1 4277 14338
web: http://www.univie.ac.at/zid http://www.aco.net
Icinga Core & IDOUtils Developer
http://www.icinga.org

From AdcockJ at leoncountyfl.gov Fri Oct 21 15:25:31 2011
From: AdcockJ at leoncountyfl.gov (Jon Adcock)
Date: Fri, 21 Oct 2011 09:25:31 -0400
Subject: R: Nagios Logout?
In-Reply-To: <007c01cc8fd0$61099810$231cc830$@it.net>
References: <4EA07519.D962.0075.0@leoncountyfl.gov> <007c01cc8fd0$61099810$231cc830$@it.net>
Message-ID: <4EA13A8B.D962.0075.0@leoncountyfl.gov>

Marco,

I tried this and it does work for me (Firefox 7.0.1 & IE 9).

I also tried a suggestion that I found online where you create a new directory /share/logout, copy the .htaccess file to that directory, add a blank htpassword.users file and a dummy index.html file, and modify the nagios.conf file (Apache2 config) to look at these files.

Both worked for me.

Jon Adcock
Network Systems Administrator
Leon County MIS
301 S. Monroe St.
Tallahassee, FL 32301
Office: (850) 606-5518
adcockj at leoncountyfl.gov

>>> On 10/21/2011 at 5:03 AM, "Marco Borsani" wrote:

I tried this, but it does not work correctly.

-----Original Message-----
From: frank [mailto:ratty at they.org]
Sent: Friday, 21 October 2011 05:06
To: Nagios Users List
Subject: Re: [Nagios-users] Nagios Logout?

You could also try changing your URL manually to:

http://newusername@hostname/nagios/

A new http-auth prompt should pop up.

-f

On Fri, 21 Oct 2011, Anthony Hogan wrote:

> Date: Fri, 21 Oct 2011 11:04:14 +1100
> From: Anthony Hogan
> Reply-To: Nagios Users List
> To: Nagios Users List
> Subject: Re: [Nagios-users] Nagios Logout?
>
> Clearing your browser's login sessions (I believe some browsers allow this).
>
> Alternatively, have a script within the nagios tree send an HTTP 401 code with the same auth message/description, even though you're sending the correct password. You'll get prompted for a password again.. just click cancel.. then I think the browser should have forgotten.
>
> When using HTTP basic authentication, you're not actually "logging in" as such, your browser is just automatically sending the username and password to the 401 prompts within the folder/tree you've authenticated on.
>
> HTTP is a stateless protocol (which is why cookies etc. were bolted onto the top of it).
>
> On Fri, Oct 21, 2011 at 10:23, Jon Adcock wrote:
> Once I am logged into a particular Nagios account (Apache2 account) and viewing
> the Nagios webpage, is there any way to logout/log back into a different account,
> short of closing the browser and re-opening it?
>

From stuart.browne at ausregistry.com.au Mon Oct 24 01:49:46 2011
From: stuart.browne at ausregistry.com.au (Stuart Browne)
Date: Mon, 24 Oct 2011 10:49:46 +1100
Subject: Average Check latency and execution time growth - 3.2.3
Message-ID: <8CEF048B9EC83748B1517DC64EA130FB67DB590B06@off-win2003-01.ausregistrygroup.local>

> -----Original Message-----
> From: Max Schubert [mailto:maxs at webwizarddesign.com]
> Sent: Sunday, 9 October 2011 2:19 AM
> Subject: Re: [Nagios-users] Average Check latency and execution time growth - 3.2.3

Sorry for the delay in response; I went on break for a few weeks.

> What minor RHEL rev are you running? We had one poller that was
> running RHEL 5.3 that had constantly increasing latency - a Compaq /
> AMD based host. None of the optimizations / configuration changes we
> made to the other pollers we ran at the time seemed to help this one -
> we updated the poller in-box from 5.3 to 5.4 and voila - issue gone.

Fully up-to-date EL5.7.

> As Joerge mentioned, probably was a memory leak / bug in a library the
> parent Nagios poller process was using, we never did determine which
> one and we haven't hit that same issue since then with any 5.4 or 5.5
> pollers.

Embedded perl is still in use on this box (too many perl-written plugins to change it without serious thought).

> Even with stable software we end up bouncing our pollers every 2-3
> days - 1) because we have an active customer base who make config
> changes often and 2) because we take the metrics from the checks and
> put them in a time series data warehouse that is sensitive to interval
> skew...any poller that hits 10 seconds latency has to be bounced.
>
> We are at 12 pollers or so right now and we will be up to almost 20 by
> next year at this time.

Sounds fun ;)

> Max
>
> On 10/2/11, Stuart Browne wrote:
> > Hi,
> >
> > I know this topic has been covered many times, but I've tried those tweaks
> > and I have the remaining issue.
> >
> > After a few days, the latency on checks explodes. It goes along quite
> > happily with small values, then after (about) 3 days, the values rise quite
> > sharply. I've recently been graphing performance statistics (nagiostats,
> > mrtg) and as you can see by the two attachments (day, week), it's rather
> > surprising.
> >
> > We restart Nagios every few days (for other reasons) so thankfully the issue
> > never gets completely out of control, but as you can see, it gets a bit
> > crazy.
> >
> > I can't think of any combination of settings that would cause such growth
> > after such a long period of time. Does anybody have any knowledge as to why
> > it would suddenly increase after running for days without issue?
> >
> > Basic Nagios system stats:
> > 2 x dual-core Xeon 5160 (3Ghz)
> > 6GB Memory
> > 4 x SAS, RAID1 (hardware, BBU, LVM over RAID1)
> > RHEL5, fully patched
> > Load average between 0.5 and 3.2
> >
> > 'nagios -s /etc/nagios/nagios.cfg' output (trimmed):
> >
> > HOST SCHEDULING INFORMATION
> > ---------------------------
> > Total hosts: 252
> > Total scheduled hosts: 252
> > Host inter-check delay method: SMART
> > Average host check interval: 300.00 sec
> > Host inter-check delay: 1.19 sec
> > Max host check spread: 30 min
> > First scheduled check: Mon Oct 3 14:31:17 2011
> > Last scheduled check: Mon Oct 3 14:36:15 2011
> >
> >
> > SERVICE SCHEDULING INFORMATION
> > -------------------------------
> > Total services: 1575
> > Total scheduled services: 1386
> > Service inter-check delay method: SMART
> > Average service check interval: 878.40 sec
> > Inter-check delay: 0.63 sec
> > Interleave factor method: SMART
> > Average services per host: 6.25
> > Service interleave factor: 6
> > Max service check spread: 30 min
> > First scheduled check: Mon Oct 3 14:33:43 2011
> > Last scheduled check: Mon Oct 3 14:48:21 2011
> >
> > CHECK PROCESSING INFORMATION
> > ----------------------------
> > Check result reaper interval: 5 sec
> > Max concurrent service checks: Unlimited
> >
> >
> > PERFORMANCE SUGGESTIONS
> > -----------------------
> > I have no suggestions - things look okay.

From msh.computing at gmail.com  Mon Oct 24 02:00:56 2011
From: msh.computing at gmail.com (Steve Kieu)
Date: Mon, 24 Oct 2011 11:00:56 +1100
Subject: Strange problem with perfomance data
Message-ID:

Hello everyone,

I have a strange problem with Nagios not writing performance data for some service checks, although it does for others, and checking it does not return any clue at all. The purpose is to parse this for nagios_grapher; to debug, I just echo it to a file, and that is where I see the symptom.

Basically all options look correct: services a and b both use generic service, the option process_perf_data is set to 1, and it is enabled in nagios.cfg.

For service b, Nagios does write service performance data to the file that I set. For service a, it does not.

Running the check command for service a prints a line like this:

su nagios -c '/usr/local/nagios/libexec/check_isdn_channels 10.200.200.33 40 50'
OK: There are 25 channels up out of 60

So nothing special, and similar to the service b check command. We do not use the pipe | after the main output at all and only parse $SERVICEOUTPUT$.

Any ideas about what to track to find out why this happens would be highly appreciated.

Nagios core version 3.3.1 running on RHEL 5.6 (built from source).

Many thanks in advance

--
Steve Kieu

From jim at jimavery.me.uk  Mon Oct 24 13:34:08 2011
From: jim at jimavery.me.uk (Jim Avery)
Date: Mon, 24 Oct 2011 12:34:08 +0100
Subject: Strange problem with perfomance data
In-Reply-To:
References:
Message-ID:

On 24 October 2011 01:00, Steve Kieu wrote:
> Hello everyone,
>
> I have strange problem with nagios not writing performance data for some
> services checks but it does for some others and checking it does not return
> any clue at all. The purpose is to parse this for nagios_grapher - and to
> debug I just echo it to a file and see that symptom.
>
> Basically all options look correct - service a and b all use generic service
> and option process_perf_data is set to 1, and nagios.cfg it is enabled.
>
> Service b nagios does write service performance data to a file that I set,
> Service a does not
>
> Run the command check for service a it print a line like this
>
> su nagios -c '/usr/local/nagios/libexec/check_isdn_channels 10.200.200.33 40
> 50'
> OK: There are 25 channels up out of 60
>
> so nothing special, and similar to the service b service check command, we do
> not use pipe | after the main one at all and only parse the $SERVICEOUTPUT$

What does service b output?

> any ideas about what to track to find out why this happened would be highly
> appreciated.

My wild guess is that Nagiosgrapher doesn't know which of the two numbers "25" or "60" to parse (how could it?) so it doesn't.
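
For reference, the plugin developer guidelines put performance data after a "|" in the form label=value;warn;crit;min;max. What follows is only a minimal Python sketch of adding that section to the check_isdn_channels line quoted above. It assumes the output always reads "There are N channels up out of M" and that the 40 and 50 arguments are the warning and critical thresholds; the function name add_perfdata and the label channels_up are made up for the example.

```python
import re

# Matches the plugin text quoted above, e.g.
# "OK: There are 25 channels up out of 60".
CHANNELS_RE = re.compile(r"There are (\d+) channels up out of (\d+)")


def add_perfdata(plugin_output, warn, crit):
    """Append Nagios-style performance data to one line of plugin output.

    The line is returned unchanged if it already has a "|" section or if
    the expected text is not found.
    """
    line = plugin_output.strip()
    if "|" in line:
        return line
    match = CHANNELS_RE.search(line)
    if match is None:
        return line
    up, total = match.groups()
    # Performance data format: label=value;warn;crit;min;max
    return f"{line} | channels_up={up};{warn};{crit};0;{total}"


if __name__ == "__main__":
    # Hypothetical thresholds taken from the command line shown in the thread.
    print(add_perfdata("OK: There are 25 channels up out of 60", 40, 50))
```

A real wrapper would also have to run the plugin itself and pass its exit code through unchanged, since Nagios takes the service state from the exit code and not from the text.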

I strongly recommend you use plugins which follow the developer guidelines, producing the performance data after the "|". Even if the plugin doesn't produce any performance data, I would either re-write it so it does or would write a wrapper script to give Nagios correctly formatted performance data.

Cheers,

Jim

From AHKAPLAN at PARTNERS.ORG  Mon Oct 24 19:27:14 2011
From: AHKAPLAN at PARTNERS.ORG (Kaplan, Andrew H.)
Date: Mon, 24 Oct 2011 13:27:14 -0400
Subject: check_mssql_health connection problem
Message-ID:

Hi there --

I have completed the installation of the check_mssql_health plugin onto a Nagios 3.2.3 64-bit server running on the CentOS 6.0 distribution.

When I try to connect to a remote server running MSSQL Server 2008 R2, the following command syntaxes are used:

./check_mssql_health --server= --username= --password= --mode=connection-time

./check_mssql_health --hostname= --username= --password= --mode=connection-time

The error message that I am receiving is the following:

CRITICAL - cannot connect to . DBI connect(';server=',',...) failed:

OpenClient message: LAYER = (0) ORIGIN = (0) SEVERITY = (78) NUMBER = (41)
Server , database
Message String: Server is unavailable or does not exist.
 at ./check_mssql_health line 2175

I have confirmed the hostname of the SQL server, as well as username and password to access the database. The syntax for the username was either or \username during the testing. The port of the server is the default port which, if I am not mistaken, is port 1433.

I have tried several sql modes with the same error message appearing each time. Can someone lend a hand here?

Thanks.

The information in this e-mail is intended only for the person to whom it is
addressed. If you believe this e-mail was sent to you in error and the e-mail
contains patient information, please contact the Partners Compliance HelpLine at
http://www.partners.org/complianceline . If the e-mail was sent to you in error
but does not contain patient information, please contact the sender and properly
dispose of the e-mail.

From mail at catsnest.co.uk  Tue Oct 25 17:00:15 2011
From: mail at catsnest.co.uk (mail at catsnest.co.uk)
Date: Tue, 25 Oct 2011 16:00:15 +0100
Subject: check_mssql_health connection problem
In-Reply-To:
References:
Message-ID:

On Mon, Oct 24, 2011 at 6:27 PM, Kaplan, Andrew H. wrote:
> Hi there --
>
> I have completed the installation of the check_mssql_health plugin onto a
> Nagios 3.2.3 64-bit server running on the CentOS 6.0 distribution.
>
> When I try to connect to a remote server running MSSQL Server 2008 R2, the
> following command syntaxes are used:
>
> ./check_mssql_health --server=
> --username= --password= --mode=connection-time
>
> ./check_mssql_health --hostname=
> --username= --password= --mode=connection-time
>
> The error message that I am receiving is the following:
>
> CRITICAL - cannot connect to . DBI
> connect(';server=',',...) failed:
>
> OpenClient message: LAYER = (0) ORIGIN = (0) SEVERITY = (78) NUMBER = (41)
> Server , database
> Message String: Server is unavailable or does not exist.
>  at ./check_mssql_health line 2175
>
> I have confirmed the hostname of the SQL server, as well as username and
> password to access the database. The syntax for the username
> was either or \username during the testing. The port
> of the server is the default port which, if I am not mistaken,
> is port 1433.
>
> I have tried several sql modes with the same error message appearing each
> time. Can someone lend a hand here?

Sorry if you have done this already...

I would suggest some basics first, e.g. from the nagios server can you do things such as:

# dig # to check the nagios server can do the dns lookup
# ping # to check that routing works
# telnet 1433 # to check that "firewall" / "routing" work.

If that is all ok, maybe try quoting the username / pw

--
Ritchie

> Thanks.
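
The telnet test suggested above can also be scripted, which is handy when it has to be repeated from the Nagios server. A minimal Python sketch using only the standard library; the function name port_open and the host name in the comment are placeholders, not anything taken from this thread.

```python
import socket


def port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts alike.
        return False


# Example call (hypothetical host name; 1433 is the default port mentioned above):
#   port_open("sqlserver.example.com", 1433)
```

A True result only shows that something accepts TCP connections on that port. It says nothing about whether the SQL login itself would succeed.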

From webdev.gk at gmail.com  Wed Oct 26 10:28:19 2011
From: webdev.gk at gmail.com (Gian Karlo)
Date: Wed, 26 Oct 2011 16:28:19 +0800
Subject: Question about Nagios Features
Message-ID:

Hi everyone. I would like to evaluate OpenNMS but I would like to ask if the features that I am looking for are supported out-of-the-box. Thank you.

1. Topology Based Navigation - Be able to navigate devices/elements as based on the topology map created
2. Automatic Device Discovery - Be able to discover devices for monitoring (requires enabling of SNMP)
3. Detailed Historical Reports - Be able to view historical reports on devices and elements
4. Fault Management and Network Availability Tools - Be able to display alerts, provide audible alarms, suppress alarms, create alarms as based on complex status and devices/element availability
5. Network Performance Monitor - Be able to display information about network devices/element health, utilization and status
6. Syslog Viewer - Be able to accept and display system logs from devices/elements
7. SNMP Compatibility - Be able to verify compatibility with different SNMP versions
8. CPU, Memory & Disk Space Monitoring - Be able to display device/element information such as CPU, Memory and Disk Space Utilization
9. Advance Reporting Engine - Be able to create custom reports as based on required information
10. Advance Alerting - Be able to create advance alerts as based on complex combination of inputs and send these alerts via email or thru audible alarms
11. Incident Alerting - Be able to send an alert based on different incidents experienced by the device/element
12. Network Map Making - Be able to create a map of the devices/elements monitored
13. Custom Property Editor - Be able to create a custom property to interface, node or volume
14. Custom HTML Resources - Be able to add resources to your HTML such as link or text
15. Custom MIB Support - Be able to view other statistics as based on customized MIB pollers
16. Role-based Access Control - Be able to create different users with different access to NPM resources
17. Customizable and Flexible Web Console - Be able to customize web console as based on customer requirement
18. Customizable Web Views - Be able to customize web views as based on customer requirement
19. Enterprise Scalability - Be able to provide scalability options to sustain network growth
20. Open Architecture - Be able to integrate with Open Architecture database SQL
21. Unified Monitoring Console - Be able to unify application performance information with other NPM information on the console
22. Advance Server Monitoring - Be able to monitor Windows, Unix and Linux servers key performance statistics
23. Services and Port Monitoring - Be able to monitor applications running on servers as based on services and ports used
24. User Experience Monitoring - Be able to provide user experience information on web-based applications
25. Application Templates - Be able to provide application monitoring information as based on included application templates
26. Out-of-the-box Reporting - Be able to display reports using out of the box reporting templates
27. Advance Application Alerting - Be able to create alerts on complex combinations of events
28. Top-to-bottom Analysis - Be able to provide analysis on traffic flow as based on NetFlow, SFlow and/or JFlow
29. Netflow, SFlow, JFlow Support - Be able to support devices/elements running IP Flows and display statistics as based on information
30. Netflow Reporting - Be able to provide Flow reports
31. Application Traffic Information - Be able to provide statistics as based on application traffic information
32. Demographics - Be able to provide information on top users and top applications
33. VoIP QoS Measurement - Be able to provide statistics as based on VoIP QoS measured values
34. Alerts, Graphs and Reporting - Be able to provide alerts, graphs and reports as based on statistics gathered
35. VoIP Infrastructure Monitoring - Be able to monitor VoIP network infrastructure and provide information on network health for support of VoIP service

From giles at coochey.net  Wed Oct 26 12:09:35 2011
From: giles at coochey.net (Giles Coochey)
Date: Wed, 26 Oct 2011 12:09:35 +0200
Subject: Question about Nagios Features
In-Reply-To:
References:
Message-ID: <9434fef3dd4509d92f0594c689e40cf2.squirrel@www.coochey.net>

On Wed, October 26, 2011 10:28, Gian Karlo wrote:
> Hi everyone. I would like to evaluate OpenNMS but I would like to ask if
> the features that I am looking is supported out-of-the-box. Thank you.
>

Wrong list... try the OpenNMS list.

From loki77 at gmail.com  Wed Oct 26 17:30:06 2011
From: loki77 at gmail.com (Michael Barrett)
Date: Wed, 26 Oct 2011 08:30:06 -0700
Subject: escalations question
Message-ID:

I was wondering - if a contact is only set to receive critical alerts, and via escalations the service is only set to contact that contact with a first_notification set as 3, what could cause that contact to get notified at the first notification?

If the service has been in a warning state for a while (more than 3 notifications, but none of them going to the critical only receiving contact since they aren't configured to get warnings) do those notifications count towards the first_notification count?

I thought we had a pretty cool setup going where our secondary pager would only be notified if the service went critical and only after its third critical notification - but this morning both the primary pager & secondary pager were notified at the same time for a disk space issue that had been in a warning state for a few hours and then went to critical.

Is there any way to get that sort of setup working btw? I can pull together the relevant config entries if that would help.

Thanks for your help!

--
Michael Barrett
loki77 at gmail.com

From work at paul.dubuc.org  Wed Oct 26 19:27:24 2011
From: work at paul.dubuc.org (Paul M. Dubuc)
Date: Wed, 26 Oct 2011 13:27:24 -0400
Subject: escalations question
In-Reply-To:
References:
Message-ID: <4EA842FC.20201@paul.dubuc.org>

Michael Barrett wrote:
> I was wondering - if a contact is only set to receive critical alerts, and
> via escalations the service is only set to contact that contact with a
> first_notification set as 3, what could cause that contact to get notified
> at the first notification?
>
> If the service has been in a warning state for a while (more than 3
> notifications, but none of them going to the critical only receiving
> contact since they aren't configured to get warnings) do those
> notifications count towards the first_notification count?

Yes. Each time Nagios generates a notification for any state, the notification count is incremented. After the recovery (OK state) notification is sent, the count is reset to 0.

>
> I thought we had a pretty cool setup going where our secondary pager would
> only be notified if the service went critical and only after its third
> critical notification - but this morning both the primary pager & secondary
> pager were notified at the same time for a disk space issue that had been
> in a warning state for a few hours and then went to critical.

All the warning notifications incremented the count, so the count was greater than 3 when the service went critical. There is no way to specify the 3rd CRITICAL notification with escalations. Notification counts do not take the state into account.

>
> Is there any way to get that sort of setup working btw?

You might re-think why you want to do this. If there has been a problem at the warning level for 2 or more notification intervals without it being acknowledged (which stops notifications) or fixed, maybe your secondary contact should be notified anyway when the critical threshold is exceeded.

If you really want it to work the way you describe then the best solution I can think of is to have 2 separate services with different contacts. One that issues only warnings and the other only critical problems. But then you've doubled the number of checks you are doing for the same problem.

Paul Dubuc

From mike-nagios at 5dninja.net  Thu Oct 27 01:55:29 2011
From: mike-nagios at 5dninja.net (Mike Lindsey)
Date: Wed, 26 Oct 2011 16:55:29 -0700
Subject: escalations question
In-Reply-To: <4EA842FC.20201@paul.dubuc.org>
References: <4EA842FC.20201@paul.dubuc.org>
Message-ID: <4EA89DF1.7040904@5dninja.net>

On 10/26/11 10:27 AM, Paul M. Dubuc wrote:
> Michael Barrett wrote:
>> Is there any way to get that sort of setup working btw?
> You might re-think why you want to do this. If there has been a problem at
> the warning level for 2 or more notification intervals without it being
> acknowledged (which stops notifications) or fixed, maybe your secondary
> contact should be notified anyway when the critical threshold is exceeded.
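
The counting described in this thread can be shown with a toy model. This is a Python sketch of the rule as stated here (every problem notification increments one counter regardless of state, and a recovery resets it). It is not Nagios code, and the function name simulate and its parameters are invented for the illustration.

```python
def simulate(states, first_notification=3, secondary_states=("CRITICAL",)):
    """Model one notification per entry in `states`.

    Returns a list of (notification_number, state, secondary_notified).
    The secondary contact is notified only when the escalation is active
    (counter >= first_notification) and the state is one it accepts.
    """
    count = 0
    log = []
    for state in states:
        if state == "OK":
            # Recovery: the notification counter goes back to 0.
            count = 0
            continue
        count += 1  # incremented for WARNING and CRITICAL alike
        escalated = count >= first_notification
        log.append((count, state, escalated and state in secondary_states))
    return log


if __name__ == "__main__":
    # Four warning notifications, then the service goes critical.
    for entry in simulate(["WARNING"] * 4 + ["CRITICAL"]):
        print(entry)
```

In this model the first CRITICAL notification is already number 5, so the secondary contact is notified straight away, which matches what was reported for the disk space alert.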
When you have multiple levels of management in your escalation trees, this particular kind of behavior is to be avoided at all cost. :) > If you really want it to work the way you describe then the best solution I > can think of is to have 2 separate services with different contacts. One that > issues only warnings and the other only critical problems. But then you've > doubled the number of checks you are doing for the same problem. > There's a split-tier notification patch that seems to handle this pretty well. Standard escalation configuration stanzas work the same, but a few new ones are added that allow discrete escalations based on notification number AND type. Barring that you'll have to handle it some monkey-patched after-thought, in your notification scripts. You can search the forums (or google) for the patch. If that fails, I can probably find it later. I believe it was compatible with Nagios 3.2. -- Mike Lindsey ------------------------------------------------------------------------------ The demand for IT networking professionals continues to grow, and the demand for specialized networking skills is growing even more rapidly. Take a complimentary Learning at Cisco Self-Assessment and learn about Cisco certifications, training, and career opportunities. http://p.sf.net/sfu/cisco-dev2dev _______________________________________________ Nagios-users mailing list Nagios-users at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
::: Messages without supporting info will risk being sent to /dev/null

From webdev.gk at gmail.com Thu Oct 27 04:39:24 2011
From: webdev.gk at gmail.com (Gian Karlo)
Date: Thu, 27 Oct 2011 10:39:24 +0800
Subject: Question about Nagios Features
In-Reply-To: <9434fef3dd4509d92f0594c689e40cf2.squirrel@www.coochey.net>
References: <9434fef3dd4509d92f0594c689e40cf2.squirrel@www.coochey.net>
Message-ID:

Sorry, it should be Nagios, not OpenNMS. Anyway, I would like to ask again whether these are possible in Nagios. Thanks a lot.

1. Topology Based Navigation - Be able to navigate devices/elements based on the topology map created
2. Automatic Device Discovery - Be able to discover devices for monitoring (requires enabling of SNMP)
3. Detailed Historical Reports - Be able to view historical reports on devices and elements
4. Fault Management and Network Availability Tools - Be able to display alerts, provide audible alarms, suppress alarms, and create alarms based on complex status and device/element availability
5. Network Performance Monitor - Be able to display information about network device/element health, utilization and status
6. Syslog Viewer - Be able to accept and display system logs from devices/elements
7. SNMP Compatibility - Be able to verify compatibility with different SNMP versions
8. CPU, Memory & Disk Space Monitoring - Be able to display device/element information such as CPU, Memory and Disk Space Utilization
9. Advanced Reporting Engine - Be able to create custom reports based on required information
10. Advanced Alerting - Be able to create advanced alerts based on complex combinations of inputs and send these alerts via email or through audible alarms
11. Incident Alerting - Be able to send an alert based on different incidents experienced by the device/element
12. Network Map Making - Be able to create a map of the devices/elements monitored
13. Custom Property Editor - Be able to create a custom property for an interface, node or volume
14. Custom HTML Resources - Be able to add resources to your HTML such as links or text
15. Custom MIB Support - Be able to view other statistics based on customized MIB pollers
16. Role-based Access Control - Be able to create different users with different access to NPM resources
17. Customizable and Flexible Web Console - Be able to customize the web console based on customer requirements
18. Customizable Web Views - Be able to customize web views based on customer requirements
19. Enterprise Scalability - Be able to provide scalability options to sustain network growth
20. Open Architecture - Be able to integrate with Open Architecture database SQL
21. Unified Monitoring Console - Be able to unify application performance information with other NPM information on the console
22. Advanced Server Monitoring - Be able to monitor key performance statistics of Windows, Unix and Linux servers
23. Services and Port Monitoring - Be able to monitor applications running on servers based on the services and ports used
24. User Experience Monitoring - Be able to provide user experience information on web-based applications
25. Application Templates - Be able to provide application monitoring information based on included application templates
26. Out-of-the-box Reporting - Be able to display reports using out-of-the-box reporting templates
27. Advanced Application Alerting - Be able to create alerts on complex combinations of events
28. Top-to-bottom Analysis - Be able to provide analysis of traffic flow based on NetFlow, SFlow and/or JFlow
29. Netflow, SFlow, JFlow Support - Be able to support devices/elements running IP Flows and display statistics based on that information
30. Netflow Reporting - Be able to provide Flow reports
31. Application Traffic Information - Be able to provide statistics based on application traffic information
32. Demographics - Be able to provide information on top users and top applications
33. VoIP QoS Measurement - Be able to provide statistics based on VoIP QoS measured values
34. Alerts, Graphs and Reporting - Be able to provide alerts, graphs and reports based on statistics gathered
35. VoIP Infrastructure Monitoring - Be able to monitor VoIP network infrastructure and provide information on network health for support of VoIP service

On Wed, Oct 26, 2011 at 6:09 PM, Giles Coochey wrote:
> On Wed, October 26, 2011 10:28, Gian Karlo wrote:
> > Hi everyone. I would like to evaluate OpenNMS but I would like to ask if
> > the features that I am looking for are supported out-of-the-box. Thank you.
>
> Wrong list... try the OpenNMS list.
From msh.computing at gmail.com Thu Oct 27 08:16:41 2011
From: msh.computing at gmail.com (Steve Kieu)
Date: Thu, 27 Oct 2011 17:16:41 +1100
Subject: Strange problem with perfomance data
In-Reply-To:
References:
Message-ID:

Yep, I found out before reading this message that it works if I put | in the output of the plugin. Thanks anyway.

Cheers

On Mon, Oct 24, 2011 at 10:34 PM, Jim Avery wrote:
> On 24 October 2011 01:00, Steve Kieu wrote:
> > Hello everyone,
> >
> > I have a strange problem with Nagios not writing performance data for some
> > service checks while it does for others, and checking it does not return
> > any clue at all. The purpose is to parse this for nagios_grapher; to
> > debug I just echo it to a file and see that symptom.
> >
> > Basically all options look correct: services a and b both use the generic
> > service, process_perf_data is set to 1, and it is enabled in nagios.cfg.
> >
> > For service b, Nagios does write service performance data to the file that
> > I set; for service a it does not.
> >
> > Running the check command for service a prints a line like this:
> >
> > su nagios -c '/usr/local/nagios/libexec/check_isdn_channels 10.200.200.33 40 50'
> > OK: There are 25 channels up out of 60
> >
> > so nothing special, and similar to the service b check command. We do
> > not use a pipe | after the main output at all and only parse $SERVICEOUTPUT$.
>
> What does service b output?
>
> > Any ideas about what to track to find out why this happened would be
> > highly appreciated.
>
> My wild guess is that Nagiosgrapher doesn't know which of the two
> numbers "25" or "60" to parse (how could it?) so it doesn't.
>
> I strongly recommend you use plugins which follow the developer
> guidelines, producing the performance data after the "|". Even if the
> plugin doesn't produce any performance data, I would either re-write
> it so it does or would write a wrapper script to give Nagios correctly
> formatted performance data.
>
> Cheers,
>
> Jim

-- 
Steve Kieu
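Jim's wrapper-script suggestion could look roughly like the sketch below. It assumes the plugin prints exactly one line in the form shown in the thread ("... 25 channels up out of 60"). The function name `add_perfdata` and the perfdata label `channels_up` are made up for illustration and are not part of any Nagios plugin; the warning and critical fields are left empty because the thread does not say how the plugin interprets its 40 and 50 arguments.

```shell
#!/bin/sh
# Hypothetical wrapper: append performance data after a "|" to a plugin's
# one-line output. Assumes the line contains "<up> channels up out of <total>".

add_perfdata() {
    # $1 = the plugin's single line of output
    line=$1
    pattern='s/.* \([0-9][0-9]*\) channels up out of \([0-9][0-9]*\).*'
    # Pull out the two numbers; both stay empty if the format does not match.
    up=$(printf '%s\n' "$line" | sed -n "$pattern"'/\1/p')
    total=$(printf '%s\n' "$line" | sed -n "$pattern"'/\2/p')
    if [ -n "$up" ] && [ -n "$total" ]; then
        # Perfdata layout: label=value;warn;crit;min;max
        printf '%s | channels_up=%s;;;0;%s\n' "$line" "$up" "$total"
    else
        # Unknown format: pass the output through unchanged.
        printf '%s\n' "$line"
    fi
}

# When called with arguments, treat them as the real plugin command,
# add perfdata to its output, and keep its exit status for Nagios.
if [ "$#" -gt 0 ]; then
    output=$("$@")
    status=$?
    add_perfdata "$output"
    exit "$status"
fi
```

Saved under some name of your choosing, the script would be called with the real plugin command as its arguments, for example with /usr/local/nagios/libexec/check_isdn_channels 10.200.200.33 40 50 appended. The plugin's exit status is passed through, so the service state should not change.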
From mariog at absi.be Thu Oct 27 12:15:25 2011
From: mariog at absi.be (Mario Garcia Ortiz)
Date: Thu, 27 Oct 2011 12:15:25 +0200
Subject: nagios 2.9 doesn't send emails anymore.
In-Reply-To:
References:
Message-ID:

Hello,

We have a strange issue on a Nagios server running version 2.9.

All of a sudden, we stopped receiving notifications. The web interface shows that the notifications are sent to all the contacts, but nothing is actually sent. There's nothing in the syslog of the server. If we manually send a mail via the command line, it is sent, but nothing sent by the Nagios process itself arrives.

What could be the problem here?
Thank you.

-- 
Mario Garcia Ortiz
ABSI SA
224 Bd de l'Humanité
1190 Brussels
Belgium
www.absi.be
Tel 00 322 333 40 00

From mike-nagios at 5dninja.net Fri Oct 28 05:30:19 2011
From: mike-nagios at 5dninja.net (Mike Lindsey)
Date: Thu, 27 Oct 2011 20:30:19 -0700
Subject: Question about Nagios Features
In-Reply-To:
References: <9434fef3dd4509d92f0594c689e40cf2.squirrel@www.coochey.net>
Message-ID: <4EAA21CB.1090608@5dninja.net>

Short answer: Yes.

Medium answer: The free version can provide all of this, but it takes some work to get what you want. The paid version will give you a lot more of this right out of the door.

Long answer: Go to nagios.org, read the docs, play with the demo, download the software. Install it somewhere, play with it for a day. Then go hit Nagios Exchange and download some addons that fix whatever you think doesn't work quite the way you want. It sounds like you're going to be monitoring a complex environment. That means that your monitoring environment is going to be complex as well - at my work we monitor over a hundred discrete host profiles on a vast (though still incredibly small in comparison to some other installations) with some shared monitoring, but a lot of unique custom monitoring that performs deep application health checks on custom apps.

You get what you put out. In a large environment you can put out half a million dollars a year in licensing and support, and potentially millions in professional services dollars, getting something that you might not understand - tying you and your business to an external org through a hefty financial leash. Alternately, you can download some free software (or pay an entirely reasonable sum for a better version and professional support) and put some effort into learning the tool, resulting in a monitoring environment that does everything you want it to, that you understand in and out, and hopefully the adoration of management -and- your operations teams.

On 10/26/11 7:39 PM, Gian Karlo wrote:
> Sorry it should be Nagios not OpenNMS. Anyway I would like to ask it
> again if it is possible in Nagios. Thanks a lot.

-- 
Mike Lindsey

From mike-nagios at 5dninja.net Fri Oct 28 05:40:09 2011
From: mike-nagios at 5dninja.net (Mike Lindsey)
Date: Thu, 27 Oct 2011 20:40:09 -0700
Subject: nagios 2.9 doesn't send emails anymore.
In-Reply-To:
References:
Message-ID: <4EAA2419.4050500@5dninja.net>

On 10/27/11 3:15 AM, Mario Garcia Ortiz wrote:
> Hello
> we have this strange issue on a nagios server running version 2.9.
>
> all of a sudden, we stopped receiving notifications
> the web interface shows that the notifications are sent to all the
> contacts but nothing is actually sent. there's nothing on the syslog
> of the server; if we send manually a mail via the command line that is
> sent but nothing that is sent by nagios process itself.
>
> what could be the problem here.
> thank you

Many things could be the problem here; only some of them would be "Nagios."

If you were running 3, I'd say turn on debug logging. Since you're not, this gets hard (or you could just upgrade to 3, see if the problem disappears, and if it doesn't, turn on that debug log).

What does your notification command config look like?
What happens if you add:
"""
 >/tmp/notif_log.out 2>&1
"""
to the end of it? That should trap the command's stdout and stderr and save them in that file.

-- 
Mike Lindsey

From michael.friedrich at univie.ac.at Fri Oct 28 11:19:27 2011
From: michael.friedrich at univie.ac.at (Michael Friedrich)
Date: Fri, 28 Oct 2011 11:19:27 +0200
Subject: Question about Nagios Features
In-Reply-To: <4EAA21CB.1090608@5dninja.net>
References: <9434fef3dd4509d92f0594c689e40cf2.squirrel@www.coochey.net> <4EAA21CB.1090608@5dninja.net>
Message-ID: <4EAA739F.7080709@univie.ac.at>

On 28.10.2011 05:30, Mike Lindsey wrote:
> Short answer: Yes.
>
> Medium answer: The free version can provide all of this, but it takes
> some work to get what you want. The paid version will give you a lot
> more of this right out of the door.
>
> Long answer: Go to nagios.org, read the docs,

Understand the docs!

> play with the demo,
> download the software. Install it somewhere, play with it for a day.
> Then go hit Nagios Exchange and download some addons that fix whatever
> you think doesn't work quite the way you want.

Don't do that. Come up with a design and an idea of your environment, including various use cases for various problems, some drawings and target answers on the real questions. The docs don't reflect all available addons which could possibly resolve the main questions, but users' experience will tell - if you ask straight ahead. Not with that list, which sounds rather "catch them all" ...

-- 
DI (FH) Michael Friedrich

Vienna University Computer Center
Universitaetsstrasse 7 A-1010 Vienna, Austria

email: michael.friedrich at univie.ac.at
phone: +43 1 4277 14359
mobile: +43 664 60277 14359
fax: +43 1 4277 14338
web: http://www.univie.ac.at/zid
     http://www.aco.net

Lead Icinga Core & IDOUtils Developer
http://www.icinga.org

From werner at aloah-from-hell.de Fri Oct 28 11:37:16 2011
From: werner at aloah-from-hell.de (werner at aloah-from-hell.de)
Date: Fri, 28 Oct 2011 11:37:16 +0200
Subject: nagios 2.9 doesn't send emails anymore.
In-Reply-To: <4EAA2419.4050500@5dninja.net>
References: <4EAA2419.4050500@5dninja.net>
Message-ID: <4EAA77CC.9030203@aloah-from-hell.de>

Hi,

>> what could be the problem here.

First, I would check your local MTA and the corresponding logs for hints :)

Regards,
Werner
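Mike's redirect trick from earlier in this thread can also be tried by hand before touching the Nagios config. The sketch below is only an illustration: `run_logged` is a hypothetical helper, and the command passed to it stands in for whatever your real notification command is. It appends the command's stdout and stderr to a log file (the thread used /tmp/notif_log.out) and records the exit status as well.

```shell
#!/bin/sh
# Hypothetical helper: run a command, append its stdout and stderr to a
# log file, and record the exit status in the same file.
run_logged() {
    log=$1
    shift
    # Everything after the log path is the command to run.
    "$@" >>"$log" 2>&1
    status=$?
    printf 'exit status: %s\n' "$status" >>"$log"
    return "$status"
}
```

Called as run_logged /tmp/notif_log.out followed by your mail command, ideally while logged in as the nagios user, this leaves a record of what the command printed and how it exited. If the log shows a clean exit and no errors, the MTA logs Werner mentions are probably the next place to look.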
Take a complimentary Learning at Cisco Self-Assessment and learn about Cisco certifications, training, and career opportunities. http://p.sf.net/sfu/cisco-dev2dev _______________________________________________ Nagios-users mailing list Nagios-users at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. ::: Messages without supporting info will risk being sent to /dev/null From ae at op5.se Fri Oct 28 14:53:24 2011 From: ae at op5.se (Andreas Ericsson) Date: Fri, 28 Oct 2011 14:53:24 +0200 Subject: Question about Nagios Features In-Reply-To: References: <9434fef3dd4509d92f0594c689e40cf2.squirrel@www.coochey.net> Message-ID: <4EAAA5C4.6000006@op5.se> On 10/27/2011 04:39 AM, Gian Karlo wrote: > Sorry it should be Nagios not OpenNMS. Anyway I would like to ask it again > if it is possible in Nagios. Thanks a lot. > A lot of questions here. Most of them seem to point to you wanting to start a business based on packaging and selling Nagios to others. If that's the case, you really should be doing your own homework, so go RTFM's (there are about 12 of them that you'll find very useful and about 50 others that will be merel "useful" and about 4000 others that will be completely worthless) and try it out for yourself. If you're really (I mean, really?) interested in getting everything on the list below, I'd suggest you contact one of the many companies that already supply pre-packaged nagios solutions with all the boss bling and buzzwords you mentioned on your list of requirements. I know at least $paycheck_delivering_firm provides all of the list that I could bare to read before my brain went all markety on me and I compulsively started googling for "search engine optimization", "brand recognition" and other crap that people with too much powerpoint time on their hands tend to invest a significant amount of time into. 
I'd rant further, but I must conserve my powers of evil sarcasm for tonight's party, where my friends are sure to ridicule me for spraining my back in the gym. Should be a good one.

Cheerios, and good luck

> 1. Topology Based Navigation - Be able to navigate devices/elements as based
> on the topology map created
> 2. Automatic Device Discovery - Be able to discover devices for
> monitoring (requires enabling of SNMP)
> 3. Detailed Historical Reports - Be able to view historical reports on
> devices and elements
> 4. Fault Management and Network Availability Tools - Be able to display
> alerts, provide audible alarms, suppress alarms, create alarms as based on
> complex status and devices/element availability
> 5. Network Performance Monitor - Be able to display information about
> network devices/element health, utilization and status
> 6. Syslog Viewer - Be able to accept and display system logs from
> devices/elements
> 7. SNMP Compatibility - Be able to verify compatibility with different SNMP
> versions
> 8. CPU, Memory & Disk Space Monitoring - Be able to display device/element
> information such as CPU, Memory and Disk Space Utilization
> 9. Advance Reporting Engine - Be able to create custom reports as based on
> required information
> 10. Advance Alerting - Be able to create advance alerts as based on complex
> combination of inputs and send these alerts via email or thru audible alarms
> 11. Incident Alerting - Be able to send an alert based on different
> incidents experienced by the device/element
> 12. Network Map Making - Be able to create a map of the devices/elements
> monitored
> 13. Custom Property Editor - Be able to create a custom property to
> interface, node or volume
> 14. Custom HTML Resources - Be able to add resources to your HTML such as
> link or text
> 15. Custom MIB Support - Be able to view other statistics as based on
> customized MIB pollers
> 16. Role-based Access Control - Be able to create different users with
> different access to NPM resources
> 17. Customizable and Flexible Web Console - Be able to customize web console
> as based on customer requirement
> 18. Customizable Web Views - Be able to customize web views as based on
> customer requirement
> 19. Enterprise Scalability - Be able to provide scalability options to
> sustain network growth
> 20. Open Architecture - Be able to integrate with Open Architecture database
> SQL
> 21. Unified Monitoring Console - Be able to unify application performance
> information with other NPM information on the console
> 22. Advance Server Monitoring - Be able to monitor Windows, Unix and Linux
> servers key performance statistics
> 23. Services and Port Monitoring - Be able to monitor applications running
> on servers as based on services and ports used
> 24. User Experience Monitoring - Be able to provide user experience
> information on web-based applications
> 25. Application Templates - Be able to provide application monitoring
> information as based on included application templates
> 26. Out-of-the-box Reporting - Be able to display reports using out of the
> box reporting templates
> 27. Advance Application Alerting - Be able to create alerts on complex
> combinations of events
> 28. Top-to-bottom Analysis - Be able to provide analysis on traffic flow as
> based on NetFlow, SFlow and/or JFlow
> 29. Netflow, SFlow, JFlow Support - Be able to support devices/elements
> running IP Flows and display statistics as based on information
> 30. Netflow Reporting - Be able to provide Flow reports
> 31. Application Traffic Information - Be able to provide statistics as based
> on application traffic information
> 32. Demographics - Be able to provide information on top users and top
> applications
> 33. VoIP QoS Measurement - Be able to provide statistics as based on VoIP
> QoS measured values
> 34. Alerts, Graphs and Reporting - Be able to provide alerts, graphs and
> reports as based on statistics gathered
> 35. VoIP Infrastructure Monitoring - Be able to monitor VoIP network
> infrastructure and provide information on network health for support of VoIP
> service

--
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.

From Rick.Garland at quantum.com Fri Oct 28 16:39:07 2011
From: Rick.Garland at quantum.com (Rick Garland)
Date: Fri, 28 Oct 2011 08:39:07 -0600
Subject: nagios reporting
Message-ID: <8324365F3DDCFE4EA4CBB01F275CDD650786D299@DENMSGV1.QUANTUM.COM>

Hi all:

Running Nagios 3.3.1 on a RH5.5 server. In an earlier version I discovered that the Histogram reports would not always run correctly, specifically when running the report for a service. The error message that appears is "It appears as though you are not authorized to view information for the specified service ..." This issue still exists in Nagios 3.3.1.

Some time ago I found a blurb about the service_description having a space in the text. Example: CPU LOAD vs CPU_LOAD.

Anybody else experience the issue?
Is there a patch or something available? How did you fix it?

Many thanks

----------------------------------------------------------------------
The information contained in this transmission may be confidential. Any disclosure, copying, or further distribution of confidential information is not permitted unless such privilege is explicitly granted in writing by Quantum. Quantum reserves the right to have electronic communications, including email and attachments, sent across its networks filtered through anti virus and spam software programs and retain such messages in order to comply with applicable data security and retention requirements. Quantum is not responsible for the proper and complete transmission of the substance of this communication or for any delay in its receipt.

From Rick.Garland at quantum.com Fri Oct 28 18:46:09 2011
From: Rick.Garland at quantum.com (Rick Garland)
Date: Fri, 28 Oct 2011 10:46:09 -0600
Subject: high Service Check Latency
Message-ID: <8324365F3DDCFE4EA4CBB01F275CDD650786D308@DENMSGV1.QUANTUM.COM>

Hi all:

Dell PE2950, 16GB RAM, plenty of disk space, etc.
Just upgraded to Nagios 3.3.1 from Nagios 3.2.3
MySQL 5.0.77
NDO2DB 1.4b9
RRDTool 1.4.5
NRPE 2.8.1

I have been using Nagios for a while (since Nagios 2.x) and have kept upgrading; the latest upgrade was from 3.2.3. Every other upgrade has gone off without problems except this one to Nagios 3.3.1.

The Service Check Latency has jumped from about 1 sec to 200+ seconds.

I have searched for tuning tips and have made the following changes in nagios.cfg, but with little effect.

max_concurrent_checks=100
check_result_reaper_frequency=15
max_check_result_reaper_time=25

The output below is the result of nagiostats.

Nagios Stats 3.3.1
Copyright (c) 2003-2008 Ethan Galstad (http://www.nagios.org)
Last Modified: 07-25-2011
License: GPL

CURRENT STATUS DATA
------------------------------------------------------
Status File:                            /usr/local/nagios/var/status.dat
Status File Age:                        0d 0h 0m 11s
Status File Version:                    3.3.1

Program Running Time:                   0d 16h 27m 5s
Nagios PID:                             6224
Used/High/Total Command Buffers:        0 / 3 / 4096

Total Services:                         2023
Services Checked:                       2023
Services Scheduled:                     2020
Services Actively Checked:              2023
Services Passively Checked:             0
Total Service State Change:             0.000 / 9.870 / 0.020 %
Active Service Latency:                 0.008 / 324.337 / 242.021 sec
Active Service Execution Time:          0.011 / 52.108 / 0.874 sec
Active Service State Change:            0.000 / 9.870 / 0.020 %
Active Services Last 1/5/15/60 min:     128 / 902 / 1850 / 1935
Passive Service Latency:                0.000 / 0.000 / 0.000 sec
Passive Service State Change:           0.000 / 0.000 / 0.000 %
Passive Services Last 1/5/15/60 min:    0 / 0 / 0 / 0
Services Ok/Warn/Unk/Crit:              2021 / 1 / 0 / 1
Services Flapping:                      0
Services In Downtime:                   0

Total Hosts:                            152
Hosts Checked:                          152
Hosts Scheduled:                        28
Hosts Actively Checked:                 152
Host Passively Checked:                 0
Total Host State Change:                0.000 / 0.000 / 0.000 %
Active Host Latency:                    0.000 / 471.193 / 284.723 sec
Active Host Execution Time:             0.007 / 0.162 / 0.044 sec
Active Host State Change:               0.000 / 0.000 / 0.000 %
Active Hosts Last 1/5/15/60 min:        4 / 16 / 29 / 29
Passive Host Latency:                   0.000 / 0.000 / 0.000 sec
Passive Host State Change:              0.000 / 0.000 / 0.000 %
Passive Hosts Last 1/5/15/60 min:       0 / 0 / 0 / 0
Hosts Up/Down/Unreach:                  152 / 0 / 0
Hosts Flapping:                         0
Hosts In Downtime:                      0

Active Host Checks Last 1/5/15 min:     5 / 17 / 49
   Scheduled:                           5 / 16 / 44
   On-demand:                           0 / 1 / 5
   Parallel:                            5 / 16 / 46
   Serial:                              0 / 0 / 0
   Cached:                              0 / 1 / 3
Passive Host Checks Last 1/5/15 min:    0 / 0 / 0
Active Service Checks Last 1/5/15 min:  179 / 939 / 2898
   Scheduled:                           179 / 939 / 2898
   On-demand:                           0 / 0 / 0
   Cached:                              0 / 0 / 0
Passive Service Checks Last 1/5/15 min: 0 / 0 / 0

External Commands Last 1/5/15 min:      0 / 0 / 0

Something I did find, though I don't know yet whether it's related: before the upgrade the CPU sys value would stay in the 3% range. Since the upgrade the CPU sys is running in the 20% range. I also see the run queue jump up by a factor of 5x at times.

I have been unable to find any reasons why, or any solutions. Anybody else?

Thanks

Rick Garland | Sr UNIX Systems Administrator | Quantum, Corp | Office: 720-249-5984 | cell: 720-210-4671

From discoverashwin at yahoo.co.uk Fri Oct 28 19:55:15 2011
From: discoverashwin at yahoo.co.uk (Ashwin)
Date: Fri, 28 Oct 2011 18:55:15 +0100 (BST)
Subject: Nagios Web Interface does not notify when nagios service is killed!
Message-ID: <1319824515.60852.YahooMailNeo@web28516.mail.ukl.yahoo.com>

Hi,

I work as a UNIX administrator in a data centre with more than a few hundred servers (RHEL 5.5/6). I have installed Nagios (core) 3.3.1, Nagios Plugins 1.4.15 and NRPE (Nagios Remote Plugin Executor) 2.12 to monitor all the servers in that data centre.

When the nagios process is killed manually on the Nagios monitoring server, the Nagios web interface does not automatically report that the process has been killed (unless the service is explicitly stopped).

This gives the support team looking at the web interface the impression that everything is working fine when the nagios process itself is killed/dead. I am new to Nagios/RHEL; can you please give me some pointers to solve this issue?

Thanks a ton,
LA

From jim at jimavery.me.uk Sat Oct 29 00:25:36 2011
From: jim at jimavery.me.uk (Jim Avery)
Date: Fri, 28 Oct 2011 23:25:36 +0100
Subject: Nagios Web Interface does not notify when nagios service is killed!
In-Reply-To: <1319824515.60852.YahooMailNeo@web28516.mail.ukl.yahoo.com>
References: <1319824515.60852.YahooMailNeo@web28516.mail.ukl.yahoo.com>
Message-ID:

On 28 October 2011 18:55, Ashwin wrote:
> Hi,
>
> I am working as UNIX administrator in Data Centre having more than few
> hundred servers(RHEL 5.5/6). I have installed Nagios (core) 3.3.1 and Nagios
> Plugins 1.4.15 and NRPE (Nagios Remote Plugin Executor) 2.12 to monitor the
> all servers in data centre mentioned above.
>
> When nagios process is killed manually at the nagios monitoring server, the
> nagios web interface does not automatically notify about the nagios process
> been killed (unless the service is explicitly stopped).
>
> This gives impression to support team looking at the web interface that
> everything is working fine when nagios process itself is killed/dead. I am
> new to nagios/RHEL, can you please give me some pointers to solve this
> issue.

In the (more years than I care to remember) that I've been running a Nagios system, I've not had a problem with the nagios daemon being killed manually. So long as you stop/start it using the init script, the front end will make it pretty obvious the daemon is stopped. If your admins are fool enough to kill stuff off willy-nilly, you need to lock them out of your Nagios system, and any other system that's important to you for that matter.

If you really want to know when the daemon is killed, then simply knock up a script, ideally running on another server, which queries your Nagios server using snmp or ssh or whatever to see if the nagios daemon is running, and emails or SMSs you if it isn't. The check_snmp_process plugin could be useful for that task.

http://nagios.manubulon.com/snmp_process.html

hth,

Jim
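
Editor's sketch of the watchdog script suggested in this thread. It is only an illustration under stated assumptions, not something any poster supplied: it assumes key-based ssh from the watchdog machine to the Nagios server, that pgrep is installed there, that the daemon's process name is exactly "nagios", and that a mail server listens on localhost. The function names, host name and addresses are hypothetical.

```python
import smtplib
import subprocess
from email.message import EmailMessage


def nagios_daemon_running(host, run=subprocess.run):
    """Return True if a process named exactly 'nagios' is found on `host`.

    Assumption: passwordless (key-based) ssh to `host` works and pgrep exists there.
    """
    try:
        result = run(
            ["ssh", "-o", "BatchMode=yes", host, "pgrep", "-x", "nagios"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except subprocess.TimeoutExpired:
        # An unreachable or hung server is reported the same way as a dead daemon.
        return False
    # pgrep exits 0 when at least one process matched; ssh passes that status on.
    # Any other status (no match, ssh failure) is treated as "not running".
    return result.returncode == 0


def build_alert(host, sender, recipient):
    """Build the notification mail for a missing daemon."""
    msg = EmailMessage()
    msg["Subject"] = f"nagios daemon not running on {host}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(f"No process named 'nagios' was found on {host}.")
    return msg


def check_and_alert(host, sender, recipient, run=subprocess.run, send=None):
    """Check the daemon once; send a mail and return False if it is missing."""
    if nagios_daemon_running(host, run=run):
        return True
    msg = build_alert(host, sender, recipient)
    if send is None:
        # Hypothetical default: hand the message to an MTA on localhost.
        with smtplib.SMTP("localhost") as smtp:
            smtp.send_message(msg)
    else:
        send(msg)
    return False
```

Run from cron on a second machine, a call such as check_and_alert("nagios.example.com", "watchdog@example.com", "oncall@example.com") would perform one check per invocation. The run and send parameters exist only so the logic can be exercised without a real ssh connection or mail server.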

From rick.garland at quantum.com Sun Oct 30 02:43:47 2011
From: rick.garland at quantum.com (Rick)
Date: Sun, 30 Oct 2011 01:43:47 +0000 (UTC)
Subject: nagios 2.9 doesn't send emails anymore.
References:
Message-ID:

Mario Garcia Ortiz absi.be> writes:

>
> Hello
> we have this strange issue on a nagios server running version 2.9.
>
> all of a sudden, we stopped receiving notifications
> the web interface shows that the notifications are sent to all the
> contacts but nothing is actually sent. there's nothing on the syslog
> of the server; if we send manually a mail via the command line that is
> sent but nothing that is sent by nagios process itself.
>
> what could be the problem here.
> thank you

Check the size of the perfdata logfile. When mine hit a limit of 2TB, Nagios stopped sending mail for alerts and notifications. I added the perfdata.log file to the regular log rotation and it never happened again.

From yu.watanabe at jp.fujitsu.com Mon Oct 31 03:24:01 2011
From: yu.watanabe at jp.fujitsu.com (Yu Watanabe)
Date: Mon, 31 Oct 2011 11:24:01 +0900
Subject: Question about the calculation of service check latency
Message-ID: <201110310224.AA04445@S2007337.jp.fujitsu.com>

Hi all!

I would like to ask a question about the calculation of service check latency.

In the code (events.c), the service check latency is calculated as:

latency = (double)((double)(tv.tv_sec - event->run_time) + (double)(tv.tv_usec / 1000) / 1000.0);

I was wondering what the part below is meant for:

(double)(tv.tv_usec / 1000) / 1000.0);

Thanks,
Yu

From yu.watanabe at jp.fujitsu.com Mon Oct 31 03:25:31 2011
From: yu.watanabe at jp.fujitsu.com (Yu Watanabe)
Date: Mon, 31 Oct 2011 11:25:31 +0900
Subject: Question about the calculation of service check latency
In-Reply-To: <201110310224.AA04445@S2007337.jp.fujitsu.com>
References: <201110310224.AA04445@S2007337.jp.fujitsu.com>
Message-ID: <201110310225.AA04446@S2007337.jp.fujitsu.com>

I forgot to mention the version: it is Nagios v3.3.1. Sorry.

Thanks,
Yu

Yu Watanabe wrote:
>Hi all!
>
>I would like to ask question about the calculation of service check latency.
>
>In the code (events.c), the service check latency is calculated as,
>
>latency = (double)((double)(tv.tv_sec - event->run_time) + (double)(tv.tv_usec / 1000) / 1000.0);
>
>I was wondering, what is the below calculation is meant for?
>
>(double)(tv.tv_usec / 1000) / 1000.0);
>
>Thanks,
>Yu

From cosmin.neagu at omnilogic.ro Mon Oct 31 09:22:01 2011
From: cosmin.neagu at omnilogic.ro (Cosmin Neagu)
Date: Mon, 31 Oct 2011 10:22:01 +0200
Subject: Unable to use check_bgp
Message-ID: <4EAE5AA9.2070300@omnilogic.ro>

Hello,

I'm trying to monitor a BGP session, and I came upon the check_bgp script. When I run it manually, it checks the BGP session OK:

nagios at mon2:/usr/local/nagios/libexec$ ./check_bgp.pl -H X.X.X.X -C xxx -p Y.Y.Y.Y
OK - Y.Y.Y.Y (AS12345) state is established(6). Established for 69d9h34m43s. Last error "Hold Timer Expired".

But when I define the command and add a service for it, it does not work. In Nagios, I get CRITICAL status and null Status Information. Using Nagios 3.2.3.

define command {
        command_name    check_bgp
        command_line    $USER1$/check_bgp.pl -H $HOSTADDRESS$ $ARG1$
}

define service {
        use                     generic-service
        host_name               EDGE
        service_description     eBGP to AS12345
        check_command           check_bgp!-C abcd -p Y.Y.Y.Y
}

So, what am I doing wrong?
I have about 300 different services and all their checks work fine; this is the first time a script works when invoked manually but not when invoked by Nagios. Please help.

--
Cosmin Neagu

From cosmin.neagu at omnilogic.ro Mon Oct 31 09:33:29 2011
From: cosmin.neagu at omnilogic.ro (Cosmin Neagu)
Date: Mon, 31 Oct 2011 10:33:29 +0200
Subject: Unable to use check_bgp and check_catalyst_mem
In-Reply-To: <4EAE5AA9.2070300@omnilogic.ro>
References: <4EAE5AA9.2070300@omnilogic.ro>
Message-ID: <4EAE5D59.9090107@omnilogic.ro>

Actually, I have another one in the same situation. When executed manually, it works:

nagios at mon2:/usr/local/nagios/libexec$ ./check_catalyst_mem.pl -s 172.31.0.100 -C xxx -w 30 -c 20
OK: I/O: valid, Used: 25189592B Free: 41919272B (62%)! Processor: valid, Used: 114067808B Free: 801243696B (87%)!|WARING <= 30%, CRITICAL <= 20%

But when I define a service for it, the check does not run correctly: the status is CRITICAL with null Status Information.

define command {
        command_name    check_catalyst_mem
        command_line    $USER1$/check_catalyst_mem.pl -s $HOSTADDRESS$ $ARG1$
}

define service {
        use                     generic-service
        host_name               VSS6509
        service_description     Memory
        check_command           check_catalyst_mem!-C xxx -w 30 -c 20
}

What's wrong with these definitions, and why does Nagios not run the check correctly?

Cosmin Neagu

On 10/31/2011 10:22 AM, Cosmin Neagu wrote:
> Hello,
> I'm trying to monitor bgp session, and i came upon the following
> script check_bgp.
> When i run it manually, it checks the bgp session OK:
>
> nagios at mon2:/usr/local/nagios/libexec$ ./check_bgp.pl -H X.X.X.X -C
> xxx -p Y.Y.Y.Y
> OK - Y.Y.Y.Y (AS12345) state is established(6). Established for
> 69d9h34m43s. Last error "Hold Timer Expired".
>
> But, when i define the command, and add a service for it, it does not
> work. In nagios, i get CRITICAL status and null Status information.
> Using nagios 3.2.3
>
> define command {
>         command_name check_bgp
>         command_line $USER1$/check_bgp.pl -H $HOSTADDRESS$ $ARG1$
> }
> define service {
>         use generic-service
>         host_name EDGE
>         service_description eBGP to AS12345
>         check_command check_bgp!-C abcd -p Y.Y.Y.Y
> }
>
> So, what am i doing wrong? I have about 300 different services, and all
> checks works fine, this is the first time when a script works when
> invoked manually but not when invoked by nagios. Please help.
>
> --
> Cosmin Neagu

From Schimpke.Thomas at bhn-services.com Mon Oct 31 10:42:34 2011
From: Schimpke.Thomas at bhn-services.com (Schimpke, Dr. Thomas - bhn)
Date: Mon, 31 Oct 2011 10:42:34 +0100
Subject: Question about the calculation of service check latency
In-Reply-To: <201110310224.AA04445@S2007337.jp.fujitsu.com>
References: <201110310224.AA04445@S2007337.jp.fujitsu.com>
Message-ID: <4EAE6D8A.9090103@bhn-services.com>

Hi Yu,

That part converts the microseconds to seconds, with the intermediate result cast to double.

Thomas

On 10/31/2011 03:24 AM, Yu Watanabe wrote:
> Hi all!
>
> I would like to ask question about the calculation of service check latency.
>
> In the code (events.c), the service check latency is calculated as,
>
> latency = (double)((double)(tv.tv_sec - event->run_time) + (double)(tv.tv_usec / 1000) / 1000.0);
>
> I was wondering, what is the below calculation is meant for?
>
> (double)(tv.tv_usec / 1000) / 1000.0);
>
> Thanks,
> Yu

From yu.watanabe at jp.fujitsu.com Mon Oct 31 11:08:11 2011
From: yu.watanabe at jp.fujitsu.com (Yu Watanabe)
Date: Mon, 31 Oct 2011 19:08:11 +0900
Subject: Question about the calculation of service check latency
In-Reply-To: <4EAE6D8A.9090103@bhn-services.com>
References: <4EAE6D8A.9090103@bhn-services.com>
Message-ID: <201110311008.AA04449@S2007337.jp.fujitsu.com>

Hi Thomas.

Thank you for the reply. I understand.

Thanks,
Yu

Schimpke, Dr. Thomas - bhn wrote:
>Hi Yu,
>
>conversion from Microseconds to seconds...and then convert the result to
>double.
>
>Thomas
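
Editor's worked example for the latency expression quoted from events.c in this thread. It is a minimal Python sketch that mirrors the C arithmetic step by step, not the Nagios source itself; the function name and the sample timestamps are made up for illustration. In C, tv.tv_usec / 1000 is an integer division, so the microseconds are first truncated to whole milliseconds, and only the later division by 1000.0 produces fractional seconds.

```python
def latency_seconds(tv_sec, tv_usec, run_time):
    """Mirror of the quoted C expression, one step per line."""
    # Whole seconds between the scheduled run time and "now".
    whole_seconds = tv_sec - run_time
    # C integer division: microseconds -> whole milliseconds (remainder dropped).
    millis = tv_usec // 1000
    # Floating-point division: milliseconds -> fractional seconds.
    return float(whole_seconds) + millis / 1000.0


# Hypothetical sample: the check ran 1 s and 250999 microseconds late.
print(latency_seconds(1320027841, 250999, 1320027840))  # prints 1.25
```

The sub-second part therefore has millisecond resolution: the trailing 999 microseconds in the sample are discarded before the value is added to the whole seconds.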

From mad at b-care.net Mon Oct 31 12:27:56 2011
From: mad at b-care.net (MAD)
Date: Mon, 31 Oct 2011 12:27:56 +0100
Subject: Next check jumps from 5min to 24h
Message-ID: <4EAE863C.1010709@b-care.net>

Hi list,

Something strange happened last night.
At about 23:00 the 30th of October, all checks (hosts and services) were scheduled to run the 31st of October at 23:00 instead of the usual 5min later. I had to rescheduled all checks manually this morning to restart my monitoring. I'm running Nagios 3.2.0 Have someone heard about a similar issue? Thanks in advance, Marc-Andr? ------------------------------------------------------------------------------ Get your Android app more play: Bring it to the BlackBerry PlayBook in minutes. BlackBerry App World™ now supports Android™ Apps for the BlackBerry® PlayBook™. Discover just how easy and simple it is! http://p.sf.net/sfu/android-dev2dev _______________________________________________ Nagios-users mailing list Nagios-users at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. ::: Messages without supporting info will risk being sent to /dev/null From Deborah.Martin at kognitio.com Mon Oct 31 12:34:45 2011 From: Deborah.Martin at kognitio.com (Deborah Martin) Date: Mon, 31 Oct 2011 04:34:45 -0700 Subject: Next check jumps from 5min to 24h In-Reply-To: <4EAE863C.1010709@b-care.net> References: <4EAE863C.1010709@b-care.net> Message-ID: Folks, I had exactly the same problem with the same version on SLES10 SP2, when I came in this morning and realised that the last_check time was wrong for most checks. I had to jump start Nagios by disabling all active checks and reenabling them. This is weird problem but wondered if it was anything to do with the clocks changing over the weekend in the UK ? Perhaps it isn't if people in a non-UK timezone had the same problem ? Regards, Deborah -----Original Message----- From: MAD [mailto:mad at b-care.net] Sent: 31 October 2011 11:28 To: Nagios-Users Subject: [Nagios-users] Next check jumps from 5min to 24h Hi list, Something strange happened last night. 
At about 23:00 the 30th of October, all checks (hosts and services) were scheduled to run the 31st of October at 23:00 instead of the usual 5min later. I had to rescheduled all checks manually this morning to restart my monitoring. I'm running Nagios 3.2.0 Have someone heard about a similar issue? Thanks in advance, Marc-Andr? ------------------------------------------------------------------------------ Get your Android app more play: Bring it to the BlackBerry PlayBook in minutes. BlackBerry App World™ now supports Android™ Apps for the BlackBerry® PlayBook™. Discover just how easy and simple it is! http://p.sf.net/sfu/android-dev2dev _______________________________________________ Nagios-users mailing list Nagios-users at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. ::: Messages without supporting info will risk being sent to /dev/null Complimentary Events and Webinars on In-Memory, Massively Parallel Processing and 'In the Cloud' - for more information click here This e-mail and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you are not the intended recipient, please delete this e-mail immediately. Any unauthorised distribution or copying is strictly prohibited. Whilst Kognitio endeavours to prevent the transmission of viruses via e-mail, we cannot guarantee that any e-mail or attachment is free from computer viruses and you are strongly advised to undertake your own anti-virus precautions. Kognitio grants no warranties regarding performance, use or quality of any e-mail or attachment and undertakes no liability for loss or damage, howsoever caused ------------------------------------------------------------------------------ Get your Android app more play: Bring it to the BlackBerry PlayBook in minutes. 
BlackBerry App World™ now supports Android™ Apps for the BlackBerry® PlayBook™. Discover just how easy and simple it is! http://p.sf.net/sfu/android-dev2dev _______________________________________________ Nagios-users mailing list Nagios-users at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. ::: Messages without supporting info will risk being sent to /dev/null From jim at jimavery.me.uk Mon Oct 31 13:09:45 2011 From: jim at jimavery.me.uk (Jim Avery) Date: Mon, 31 Oct 2011 12:09:45 +0000 Subject: Next check jumps from 5min to 24h In-Reply-To: References: <4EAE863C.1010709@b-care.net> Message-ID: I had no such problem here in the UK as far as I am aware. Using Nagios Core 3.3.1 on Ubuntu 8.04.1 and /etc/timezone contains "Europe/London". There was a fix in 3.2.2 "Fix for choosing next valid time on day of DST change when clocks go one hour backwards". See the changelog at:- http://www.nagios.org/projects/nagioscore/history/core-3x Cheers, Jim ------------------------------------------------------------------------------ Get your Android app more play: Bring it to the BlackBerry PlayBook in minutes. BlackBerry App World™ now supports Android™ Apps for the BlackBerry® PlayBook™. Discover just how easy and simple it is! http://p.sf.net/sfu/android-dev2dev _______________________________________________ Nagios-users mailing list Nagios-users at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 

From david.ribeiro at altitudeinfra.fr Mon Oct 31 12:47:06 2011
From: david.ribeiro at altitudeinfra.fr (David Ribeiro)
Date: Mon, 31 Oct 2011 12:47:06 +0100 (CET)
Subject: Next check jumps from 5min to 24h
In-Reply-To:
References:
Message-ID: <213762035.420314.1320061626878.JavaMail.root@zm-mbs01.altitudetelecom.fr>

Hi all,

I had the same problem on my monitoring setup with Nagios 3.2.0 from the Ubuntu repository. I'm in France; my monitoring stopped on Sunday at 23:00.

Regards,
David

----- Original Message -----
From: "Deborah Martin"
To: "Nagios-Users"
Sent: Monday 31 October 2011 12:34:45
Subject: Re: [Nagios-users] Next check jumps from 5min to 24h

Folks,

I had exactly the same problem with the same version on SLES10 SP2, when I came in this morning and realised that the last_check time was wrong for most checks.

I had to jump-start Nagios by disabling all active checks and re-enabling them.

This is a weird problem, but I wondered if it was anything to do with the clocks changing over the weekend in the UK? Perhaps it isn't, if people in a non-UK timezone had the same problem?

Regards,
Deborah
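
The manual rescheduling described in this thread can also be scripted through the Nagios external command file. The following is a minimal sketch, not a tested recovery procedure: it builds SCHEDULE_FORCED_HOST_CHECK and SCHEDULE_FORCED_HOST_SVC_CHECKS command lines for a list of hosts. The helper names and the host name in the usage comment are hypothetical, and the command file path shown there is only the usual default for a source install; check command_file in your own nagios.cfg.

```python
import time


def external_command(name, *args, now=None):
    """Format one external command line: "[timestamp] NAME;arg1;arg2"."""
    stamp = int(time.time() if now is None else now)
    fields = (name,) + tuple(str(arg) for arg in args)
    return "[%d] %s\n" % (stamp, ";".join(fields))


def reschedule_commands(hosts, check_time, now=None):
    """Build forced host and service check commands for each host."""
    lines = []
    for host in hosts:
        lines.append(external_command(
            "SCHEDULE_FORCED_HOST_CHECK", host, check_time, now=now))
        lines.append(external_command(
            "SCHEDULE_FORCED_HOST_SVC_CHECKS", host, check_time, now=now))
    return lines


def submit(lines, command_file):
    """Write the command lines to the external command file."""
    with open(command_file, "w") as handle:
        handle.writelines(lines)


# Hypothetical usage (host name and path are assumptions):
# submit(reschedule_commands(["web01"], int(time.time())),
#        "/usr/local/nagios/var/rw/nagios.cmd")
```

External commands must be enabled (check_external_commands=1) for Nagios to read the file, and the user running the script needs write access to it.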

From aravind at linuz.in Mon Oct 31 13:53:00 2011
From: aravind at linuz.in (Aravind M D)
Date: Mon, 31 Oct 2011 18:23:00 +0530
Subject: Next check jumps from 5min to 24h
In-Reply-To: <213762035.420314.1320061626878.JavaMail.root@zm-mbs01.altitudetelecom.fr>
References: <213762035.420314.1320061626878.JavaMail.root@zm-mbs01.altitudetelecom.fr>
Message-ID: <20111031182300.Horde._4gaf1wlEB5OrposD1cW3CA@mail.linuz.in>

Hi All,

I faced the same problem this morning. The link below helped me resolve the issue:
http://blog.centreon.com/post/2009/10/26/Nagios-disfunctional-due-to-time-change

Rgds,
Aravind M D

Quoting David Ribeiro:

> Hy all,
>
> I had the same problem on my supervision with a Nagios 3.2.0 from
> Ubuntu Repository.
> I'm in France, my supervision stop Sunday at 23h00.
>
> Regards,
>
> David

From cbeattie at geninfo.com Mon Oct 31 16:15:19 2011
From: cbeattie at geninfo.com (Chris Beattie)
Date: Mon, 31 Oct 2011 11:15:19 -0400
Subject: Nagios Web Interface does not notify when nagios service is killed!
In-Reply-To:
References: <1319824515.60852.YahooMailNeo@web28516.mail.ukl.yahoo.com>
Message-ID: <4EAEBB87.5000804@geninfo.com>

On 10/28/2011 6:25 PM, Jim Avery wrote:
> running and emails or SMSs you if it isn't. The check_snmp_process
> plugin could be useful for that task.
> http://nagios.manubulon.com/snmp_process.html

I'm echoing everything Jim said. I cannot remember if Nagios has ever crashed on me. I do restart it (rather than just reload the config files) on a fairly regular basis, though, as I get asked to make changes.

Do you know how the process is being terminated? Any clues in the system logs?

I use a system very much like the one Jim describes to provide failover redundancy, but I use the stock check_nagios plugin running over ssh. The check_nagios plugin also monitors the age of the status file to ensure Nagios has written fresh data.
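
To illustrate the status-file freshness idea mentioned above, here is a minimal Python sketch. It is not the check_nagios plugin or its implementation; it only shows the general technique of comparing a file's modification time against a maximum age and reporting a plugin-style state. The function name, the threshold, the path in the usage comment, and the choice of CRITICAL for a stale file are all assumptions made for this example.

```python
import os
import time

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def status_file_state(path, max_age_seconds, now=None):
    """Return (exit_code, message) based on the age of the file's mtime."""
    now = time.time() if now is None else now
    try:
        age = now - os.stat(path).st_mtime
    except OSError as exc:
        return CRITICAL, "CRITICAL: cannot stat %s (%s)" % (path, exc.strerror)
    if age > max_age_seconds:
        return CRITICAL, "CRITICAL: %s last updated %d seconds ago" % (path, age)
    return OK, "OK: %s updated %d seconds ago" % (path, age)


# Hypothetical usage (path and threshold are assumptions):
# code, message = status_file_state("/usr/local/nagios/var/status.dat", 300)
# print(message)
```

A check like this only helps if it runs somewhere other than the Nagios instance being watched, for example from a second monitoring host or from cron, since a dead Nagios process cannot report on itself.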