From nagios at nagios.org Tue Jan 1 20:28:41 2008
From: nagios at nagios.org (Ethan Galstad)
Date: Tue, 01 Jan 2008 13:28:41 -0600
Subject: error log messages
In-Reply-To: <60F082C0B436CF48A11CD57CD194B7C4094869C3@EXCHANGE.westernwats.com>
References: <60F082C0B436CF48A11CD57CD194B7C4094869C3@EXCHANGE.westernwats.com>
Message-ID: <477A9469.2010905@nagios.org>

Patrick Felt wrote:
> I've got a couple of error log messages that I'm trying to figure out.
> #nagios suggested that I submit them here as they might be bugs. The
> first is a large number of orphaned processors. According to Google,
> this is caused by high box utilization, however the nagios box is quite
> idle so I'm not sure what is going on
>
> The second is a large number of "Warning: Check result queue contained
> results for service 'Memory Utilization' on host hostname', but the
> service could not be found! Perhaps you forgot to define the service in
> your config files?" the preflight shows no errors so I'm unsure of what
> is happening here either.
>
> Are these bugs or have I configured something wrong?
>
> pat

Are you running 3.0rc1? The second item sounds like a bug that was fixed
in the rc1 release.

Ethan Galstad
Nagios Developer
___
Email: nagios at nagios.org
Web: www.nagios.org

From pfelt at westernwats.com Wed Jan 2 18:08:22 2008
From: pfelt at westernwats.com (Patrick Felt)
Date: Wed, 2 Jan 2008 10:08:22 -0700
Subject: error log messages
In-Reply-To: <477A9469.2010905@nagios.org>
References: <60F082C0B436CF48A11CD57CD194B7C4094869C3@EXCHANGE.westernwats.com> <477A9469.2010905@nagios.org>
Message-ID: <60F082C0B436CF48A11CD57CD194B7C40948709A@EXCHANGE.westernwats.com>

I am on rc1 yes. The really weird thing here is that I can't reproduce it now.
Perhaps it has to do with load. I started up nagios with fully 350
services to monitor that it had to query right off the bat. It's been up
for a while now and is pretty stable atm. I'm scanning the logs and don't
see either message, and I doubt that it was because of some change that I
made. I'll keep an eye on it.

Pat

-----Original Message-----
From: nagios-devel-bounces at lists.sourceforge.net [mailto:nagios-devel-bounces at lists.sourceforge.net] On Behalf Of Ethan Galstad
Sent: Tuesday, January 01, 2008 12:29 PM
To: Nagios Developers List
Subject: Re: [Nagios-devel] error log messages

Patrick Felt wrote:
> I've got a couple of error log messages that I'm trying to figure out.
> #nagios suggested that I submit them here as they might be bugs. The
> first is a large number of orphaned processors. According to Google,
> this is caused by high box utilization, however the nagios box is quite
> idle so I'm not sure what is going on
>
> The second is a large number of "Warning: Check result queue contained
> results for service 'Memory Utilization' on host hostname', but the
> service could not be found! Perhaps you forgot to define the service in
> your config files?" the preflight shows no errors so I'm unsure of what
> is happening here either.
>
> Are these bugs or have I configured something wrong?
>
> pat

Are you running 3.0rc1? The second item sounds like a bug that was fixed
in the rc1 release.

Ethan Galstad
Nagios Developer
___
Email: nagios at nagios.org
Web: www.nagios.org
_______________________________________________
Nagios-devel mailing list
Nagios-devel at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-devel

From frederic.schaer at cea.fr Thu Jan 3 11:25:27 2008
From: frederic.schaer at cea.fr (SCHAER Frederic)
Date: Thu, 3 Jan 2008 11:25:27 +0100
Subject: [Nagios-devel] nrpe RPM spec file bug(s)
In-Reply-To:
References:
Message-ID:

Hi lists,

I just checked the CVS tree, and the nagios/nrpe spec files still contain
the non-system groupadds...

It's not that I want to see these changes in the official spec files at
all costs, since I have patched ones for now, but I guess I'm not the only
one this kind of update would help... am I?

Regards

________________________________

From: nagios-devel-bounces at lists.sourceforge.net [mailto:nagios-devel-bounces at lists.sourceforge.net] On Behalf Of SCHAER Frederic
Sent: Friday, December 21, 2007 4:26 PM
To: Nagios Developers List
Subject: Re: [Nagios-devel] nrpe RPM spec file bug(s)

Hmmm... by the way, the nagios.spec file has the same defect: it's using
"groupadd" whereas it's using "useradd -r"... if this could also be
corrected...

Regards

________________________________

From: nagios-devel-bounces at lists.sourceforge.net [mailto:nagios-devel-bounces at lists.sourceforge.net] On Behalf Of SCHAER Frederic
Sent: Friday, December 21, 2007 2:06 PM
To: nagios-devel at lists.sourceforge.net
Subject: [Nagios-devel] nrpe RPM spec file bug(s)

Hi,

In the nagios spec file, the nagios user is added with the command
< useradd -r >, which creates a system account + system group.
In the nrpe spec file, there is first a standard groupadd on the Nagios
group, and then a useradd with the -r option for the user. It looks to me
like the first groupadd is then useless, isn't it?

By the way: when the groupadd is run, there's a grep on the hardcoded
"nagios" value, while just after that %{nsgrp} is used to add the group,
which is also a bug I think...

Nevertheless, if the groupadd has to be kept, would it be possible to use
it with the "-r" option as well? This way, a system group would be
created. This would avoid conflicts with standard users/groups, and it
would also make things more consistent.

I can easily patch the spec file, but I'm using nrpe on a few hundred
machines, and I guess other people managing clusters may face the same
problem of GID conflicts that I just did, which could probably be easily
avoided... so here I'm sharing the problem and a potential solution.

Thanks for patching && regards && merry Christmas! :]

Frederic

_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null

From tobias at scherbaum.info Thu Jan 3 17:01:44 2008
From: tobias at scherbaum.info (Tobias Scherbaum)
Date: Thu, 03 Jan 2008 17:01:44 +0100
Subject: Nagios 3.0rc1 segfaulting
Message-ID: <1199376104.2599.4.camel@homer.ob.libexec.de>

Hi,

I'm experiencing a segfault w/ Nagios 3.0rc1 in
process_check_result_queue(), it dies just a few seconds after starting
up the Nagios daemon. Backtrace attached.

tia,
Tobias

(gdb) r
Starting program: /usr/sbin/nagios3 /etc/nagios3/nagios.cfg
Failed to read a valid object file image from memory.
[Thread debugging using libthread_db enabled]
[New Thread 1077336608 (LWP 3529)]

Nagios 3.0rc1
Copyright (c) 1999-2007 Ethan Galstad (http://www.nagios.org)
Last Modified: 12-17-2007
License: GPL

Nagios 3.0rc1 starting... (PID=3529)
Local time is Thu Jan 03 16:01:09 CET 2008
[New Thread 1081805744 (LWP 3533)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1077336608 (LWP 3529)]
0x0808450c in process_check_result_queue (dirname=0x80d4078 "/var/spool/nagios3/checkresults") at utils.c:2244
2244 utils.c: No such file or directory.
in utils.c
(gdb) bt
#0 0x0808450c in process_check_result_queue (dirname=0x80d4078 "/var/spool/nagios3/checkresults") at utils.c:2244
#1 0x08062749 in reap_check_results () at checks.c:149
#2 0x08070227 in handle_timed_event (event=0x80f1c20) at events.c:1235
#3 0x080708a8 in event_execution_loop () at events.c:941
#4 0x08058a29 in main (argc=Cannot access memory at address 0xf0
) at nagios.c:795
(gdb) kill
Kill the program being debugged? (y or n) y
(gdb) q

From matthew-ln at itconsult.co.uk Thu Jan 3 18:32:18 2008
From: matthew-ln at itconsult.co.uk (Matthew Richardson)
Date: Thu, 03 Jan 2008 17:32:18 +0000
Subject: Sub-optimal host & service contact_group inheritance
Message-ID: <8b6qn39v7c5kvqo3pkqrhaq5c37o2f68ol@m27.itconsult.net>

I was trying to configure some contact-group inheritance today and tried
the following:-

|define host{
|    name                 je
|    contact_groups       aaa,bbb
|    register             0
|}

|define host{
|    use                  template-host-generic,je
|    host_name            hhh
|    alias                hhh
|    address              hhh
|}

|define service{
|    use                  template-service-whatever
|    host_name            hhh
|    contact_groups       +ccc
|    service_description  sss
|    check_command        check_whatever
|}

The two "template-" templates contain no contacts or contact groups.

Intuitively I had expected notifications for service sss to go to all
members of contact groups aaa, bbb & ccc. What actually occurs is that
the notifications go only to the members of ccc.

Might it be possible to adjust the contact inheritance to notify all
three contact groups?

Best wishes,
Matthew

From andurin at process-zero.de Thu Jan 3 21:02:26 2008
From: andurin at process-zero.de (Hendrik Bäcker)
Date: Thu, 03 Jan 2008 21:02:26 +0100
Subject: Nagios 3.0rc1 segfaulting
In-Reply-To: <1199376104.2599.4.camel@homer.ob.libexec.de>
References: <1199376104.2599.4.camel@homer.ob.libexec.de>
Message-ID: <477D3F52.8060502@process-zero.de>

Hi Tobias,

Tobias Scherbaum wrote:
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 1077336608 (LWP 3529)]
> 0x0808450c in process_check_result_queue (dirname=0x80d4078
> "/var/spool/nagios3/checkresults") at utils.c:2244
> 2244 utils.c: No such file or directory.
> in utils.c
> (gdb) bt
> #0 0x0808450c in process_check_result_queue (dirname=0x80d4078
> "/var/spool/nagios3/checkresults") at utils.c:2244

Can you verify that the path "/var/spool/nagios3/checkresults" exists
on your system?

Further information like OS, OS level, source install or distro package
would be nice, too.

Regards
Hendrik

From tobias at scherbaum.info Thu Jan 3 22:59:46 2008
From: tobias at scherbaum.info (Tobias Scherbaum)
Date: Thu, 03 Jan 2008 22:59:46 +0100
Subject: Nagios 3.0rc1 segfaulting
In-Reply-To: <477D3F52.8060502@process-zero.de>
References: <1199376104.2599.4.camel@homer.ob.libexec.de> <477D3F52.8060502@process-zero.de>
Message-ID: <1199397586.2596.9.camel@homer.ob.libexec.de>

Heya Hendrik,

Hendrik Bäcker wrote:
> > Program received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 1077336608 (LWP 3529)]
> > 0x0808450c in process_check_result_queue (dirname=0x80d4078
> > "/var/spool/nagios3/checkresults") at utils.c:2244
> > 2244 utils.c: No such file or directory.
> > in utils.c
> > (gdb) bt
> > #0 0x0808450c in process_check_result_queue (dirname=0x80d4078
> > "/var/spool/nagios3/checkresults") at utils.c:2244
>
>
> Can you verify that the path "/var/spool/nagios3/checkresults" exists
> on your system?

Sure it exists ... plus Nagios is able to create a handful of temporary
queue files, which makes this even more weird. Not even the debug.log
has any useful information which file Nagios fails to access exactly ...

(snip)
[1199372484.065535] [016.2] [pid=3545] Moving temp check result file '/var/spool/nagios3/checkresults/checkNmJmfN' to queue file '/var/spool/nagios3/checkresults/ccfuOCQ'...
(snip)
[1199372487.135649] [001.0] [pid=3529] reap_check_results() start
[1199372487.135691] [016.0] [pid=3529] Starting to reap check results.

... and then it segfaults

> Further information like OS, OS level, source install or distro package
> would be nice, too.

Debian Etch using packages from [1], nothing special ...

Tobias

[1] http://people.teamix.de/~svelt/debian/etch/nagios3/3.0-rc-1/

From tobias at scherbaum.info Thu Jan 3 23:23:12 2008
From: tobias at scherbaum.info (Tobias Scherbaum)
Date: Thu, 03 Jan 2008 23:23:12 +0100
Subject: Nagios 3.0rc1 segfaulting
In-Reply-To: <1199397586.2596.9.camel@homer.ob.libexec.de>
References: <1199376104.2599.4.camel@homer.ob.libexec.de> <477D3F52.8060502@process-zero.de> <1199397586.2596.9.camel@homer.ob.libexec.de>
Message-ID: <1199398992.2596.12.camel@homer.ob.libexec.de>

Tobias Scherbaum wrote:
> > Can you verify that the path "/var/spool/nagios3/checkresults" exists
> > on your system?
>
> Sure it exists ... plus Nagios is able to create a handful of temporary
> queue files, which makes this even more weird. Not even the debug.log
> has any useful information which file Nagios fails to
> access exactly ...
>
> (snip)
> [1199372484.065535] [016.2] [pid=3545] Moving temp check result file
> '/var/spool/nagios3/checkresults/checkNmJmfN' to queue file
> '/var/spool/nagios3/checkresults/ccfuOCQ'...
> (snip)
> [1199372487.135649] [001.0] [pid=3529] reap_check_results() start
> [1199372487.135691] [016.0] [pid=3529] Starting to reap check results.
>
> ... and then it segfaults

Funny, found exactly the same problem in a thread in the German Nagios
forum [1] ... not that it has a solution/fix though.

Tobias

[1] http://www.nagios-portal.de/wbb/index.php?page=Thread&threadID=7302&pageNo=1

From andurin at process-zero.de Fri Jan 4 07:22:40 2008
From: andurin at process-zero.de (Hendrik Bäcker)
Date: Fri, 04 Jan 2008 07:22:40 +0100
Subject: Nagios 3.0rc1 segfaulting
In-Reply-To: <1199397586.2596.9.camel@homer.ob.libexec.de>
References: <1199376104.2599.4.camel@homer.ob.libexec.de> <477D3F52.8060502@process-zero.de> <1199397586.2596.9.camel@homer.ob.libexec.de>
Message-ID: <477DD0B0.4010306@process-zero.de>

Tobias Scherbaum wrote:
>
> Sure it exists ... plus Nagios is able to create a handful of temporary
> queue files, which makes this even more weird. Not even the debug.log
> has any useful information which file Nagios fails to
> access exactly ...
>

Can you please:
1. Stop the nagios daemon
2.
move all existing files from the checkresults dir to a temp dir (please
don't delete them yet)
3. tell us whether you are using a 32-bit or 64-bit architecture

If there is anything strange, IMHO it is hidden in the distro-specific
stuff. I am running RC1 fine under:

SLES10 32bit
Ubuntu 6.06 LTS 64bit (I think it was called dapper)
Ubuntu 7.10 32bit (but only with minimal config)

>
> [1] http://people.teamix.de/~svelt/debian/etch/nagios3/3.0-rc-1/
>

I know this package maintainer personally, and I don't think he made
things worse in the code. If nothing else solves this problem, I guess
you should be able to install nagios from the sources and get rid of the
package.

Regards,
Hendrik

From michael_luebben at web.de Fri Jan 4 14:36:30 2008
From: michael_luebben at web.de (Michael Lübben)
Date: Fri, 04 Jan 2008 14:36:30 +0100
Subject: Problem with NSCA large and multiline output
Message-ID: <1413373335@web.de>

Hi all,

I use distributed monitoring and now want to update to version 3. But
NSCA does not send the complete plugin output. I have changed
MAX_PLUGINOUTPUT_LENGTH from 512 to 2512 and recompiled. NSCA now sends
more output, but still not enough. Does anyone know the maximum value I
can use? Another problem is that NSCA can only send the first line,
nothing more.
I am not a C developer and have no idea how to fix this.

Bye
Michael

From pitchfork at ederdrom.de Fri Jan 4 21:54:39 2008
From: pitchfork at ederdrom.de (Joerg Linge)
Date: Fri, 4 Jan 2008 21:54:39 +0100
Subject: Nagios 3.0rc1 extinfo.cgi Segmentation fault
Message-ID: <200801042154.39453.pitchfork@ederdrom.de>

Hi List,

calling extinfo.cgi with non-existent host or service values via
QUERY_STRING results in a segmentation fault.

Now some debug info ...

0 - [nagios at kassandra /usr/local/src/nagios-3.0rc1/cgi]$ export REMOTE_USER=linge
0 - [nagios at kassandra /usr/local/src/nagios-3.0rc1/cgi]$ export REQUEST_METHOD=GET
0 - [nagios at kassandra /usr/local/src/nagios-3.0rc1/cgi]$ export QUERY_STRING="type=2&host=wrong_host&service=wrong_service"
1 - [nagios at kassandra /usr/local/src/nagios-3.0rc1/cgi]$ ./extinfo.cgi
Cache-Control: no-store
Pragma: no-cache
Refresh: 90
Last-Modified: Fri, 04 Jan 2008 20:46:55 GMT
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Content-type: text/html

[... stripped content ...]
Segmentation fault
139 - [nagios at kassandra /usr/local/src/nagios-3.0rc1/cgi]$

gdb returns the following info:

[.... stripped content ..]
Program received signal SIGSEGV, Segmentation fault.
0x08050005 in main () at extinfo.c:452
452 if(temp_service->action_url!=NULL && strcmp(temp_service->action_url,"")){
(gdb)

So now it's your turn ;-)

Kind regards
Jörg

From pitchfork at ederdrom.de Sun Jan 6 20:08:55 2008
From: pitchfork at ederdrom.de (Joerg Linge)
Date: Sun, 6 Jan 2008 20:08:55 +0100
Subject: [patch] summary.cgi Nagios 3.0rc1 broken
Message-ID: <200801062008.55968.pitchfork@ederdrom.de>

Hi Ethan,

summary.cgi does not recognize the host query string.
So here is a small patch against summary.c Revision 1.25 (3.0rc1).

Kind regards,
Jörg

diff -u summary.c.orig summary.c
--- summary.c.orig 2008-01-06 19:34:00.000000000 +0100
+++ summary.c 2008-01-06 19:56:17.000000000 +0100
@@ -1170,7 +1170,7 @@
     break;
     }
 
-   if((target_host_name=(char *)strdup(target_host_name))==NULL)
+   if((target_host_name=(char *)strdup(variables[x]))==NULL)
     target_host_name="";
    strip_html_brackets(target_host_name);
 

From ojan at expertise-online.net Mon Jan 7 14:53:51 2008
From: ojan at expertise-online.net (Olivier JAN)
Date: Mon, 07 Jan 2008 14:53:51 +0100
Subject: .cgi problem
Message-ID: <20080107145351.u2xs5by57o4wwwog@intra.expertise-online.net>

Hi list,

A strange problem has been occurring on one of my Nagios 3.0rc1 servers
on Ubuntu 6.06 since this morning. I randomly get "premature end of
script" errors in the web interface:

Premature end of script headers: cmd.cgi

sometimes with a glibc error:
*** glibc detected *** malloc(): memory corruption: 0x080b4668 ***
or
*** glibc detected *** free(): invalid next size (fast): 0x080b6ad8 ***

I don't get any errors on the second server, which has the same Nagios
binaries and the same configuration. The configuration had not been
touched before these errors appeared.

I double-checked the permissions on the files and everything seems OK,
and they are the same on the two Nagios servers. I tried fetching the URL
with Firefox and lynx, with the same result. The error was on extinfo.cgi
this morning and cmd.cgi this afternoon.

I tried free_child_process_memory=0 in the Nagios config. There were no
more glibc errors after switching this parameter to 1 and back to 0,
which is definitely strange: the parameter is now set to the same value
it had when the glibc errors occurred.

Is there anything I missed, or is this a bug? Any ideas welcome, guys...
thanks

--
Olivier JAN

From ae at op5.se Mon Jan 7 14:59:10 2008
From: ae at op5.se (Andreas Ericsson)
Date: Mon, 07 Jan 2008 14:59:10 +0100
Subject: nrpe RPM spec file bug(s)
In-Reply-To:
References:
Message-ID: <4782302E.5020104@op5.se>

SCHAER Frederic wrote:
> Hmmm... by the way, the nagios.spec file has the same defect : it's
> using "groupadd" whereas it's using "useradd -r" ... if this could also
> be corrected...
>

From the useradd man-page on Fedora Core 7:

    -r  This flag is used to create a system account. That is, a user
        with a UID lower than the value of UID_MIN defined in
        /etc/login.defs and whose password does not expire. Note that
        useradd will not create a home directory for such an user,
        regardless of the default setting in /etc/login.defs. You have
        to specify -m option if you want a home directory for a system
        account to be created.
        This is an option added by Red Hat

In short, -r is used to make useradd give the account a UID less than 500
(or 1000), so as to be easily distinguishable from ordinary user
accounts.

Also note that if you're using this option on anything other than a
Red Hat compatible system (such as SuSE), you need to write your own
spec file.

--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

From thomas at zango.com Mon Jan 7 20:54:44 2008
From: thomas at zango.com (Thomas Guyot-Sionnest)
Date: Mon, 07 Jan 2008 14:54:44 -0500
Subject: {enable, disable}_notifications and file name expansion!
Message-ID: <47828384.2050608@zango.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

I just noticed that I recently started having problems with the
enable/disable notification commands when they are run from cron. After
some investigation it turns out the culprit is... file name expansion!

In those scripts the command line is built like this:

cmdline="[$datetime] COMMAND_NAME;$datetime"

and then the following command is run:

`$echocmd $cmdline >> $CommandFile`

This results in the following being run (given the current timestamp):

`/bin/echo [1199733486] COMMAND_NAME;1199733486 >> $CommandFile`

Because $cmdline is not quoted, [1199733486] is treated as a glob
pattern. If you happen to have a file in your current folder whose name
is any one of the digits in $datetime, bash will expand the pattern to
that file name. In my case I had a file named "1" in /root, and since the
script is run from the root crontab it turned into this:

`/bin/echo 1 COMMAND_NAME;1199733486 >> $CommandFile`

which obviously won't work.

There are a few ways to fix this problem:

1.
Quoting the $cmdline in the echocmd arguments:

- --- enable_notifications 2008-01-07 10:48:17.000000000 -0800
+++ enable_notifications 2008-01-07 11:42:01.000000000 -0800
@@ -23,7 +23,7 @@
 cmdline="[$datetime] ENABLE_NOTIFICATIONS;$datetime"

 # append the command to the end of the command file
- -`$echocmd $cmdline >> $CommandFile`
+`$echocmd "$cmdline" >> $CommandFile`

2. Escaping the brackets in $cmdline with backslashes:

- --- enable_notifications 2008-01-07 10:48:17.000000000 -0800
+++ enable_notifications 2008-01-07 11:44:42.000000000 -0800
@@ -20,7 +20,7 @@
 datetime=`date +%s`

 # create the command line to add to the command file
- -cmdline="[$datetime] ENABLE_NOTIFICATIONS;$datetime"
+cmdline="\[$datetime\] ENABLE_NOTIFICATIONS;$datetime"

 # append the command to the end of the command file
 `$echocmd $cmdline >> $CommandFile`

3. Using printf:

- --- enable_notifications 2008-01-07 10:48:17.000000000 -0800
+++ enable_notifications 2008-01-07 11:49:35.000000000 -0800
@@ -12,18 +12,15 @@
 # the check_external_commands option in the main
 # configuration file.

- -echocmd="/bin/echo"
+printfcmd="/bin/printf"

 CommandFile="/usr/local/nagios/var/rw/nagios.cmd"

 # get the current date/time in seconds since UNIX epoch
 datetime=`date +%s`

- -# create the command line to add to the command file
- -cmdline="[$datetime] ENABLE_NOTIFICATIONS;$datetime"
- -
 # append the command to the end of the command file
- -`$echocmd $cmdline >> $CommandFile`
+`$printfcmd "[%i] ENABLE_NOTIFICATIONS;%i\n" $datetime $datetime >> $CommandFile`

This should be fixed on both disable_notifications and
enable_notifications files in contrib/eventhandlers/

Thanks

- --
Thomas
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHgoOE6dZ+Kt5BchYRAivoAKD/QX4dEM5XQAPhctGzlEyaCdGf4wCgmOj1
ss+SuRzMEcl+D4klP8l9odg=
=CF5j
-----END PGP SIGNATURE-----

From Alexander.Leidinger at ext.publications.europa.eu Tue Jan 8 10:42:36 2008
From: Alexander.Leidinger at ext.publications.europa.eu (Alexander Leidinger)
Date: Tue, 08 Jan 2008 10:42:36 +0100
Subject: nagios 3.0rc1 compile problems on Solaris 10 (+workaround/fix)
Message-ID: <1199785356.18169.64.camel@phoenix>

Hi,

I encountered some compile problems on Solaris 10 with the Sun Studio
compiler/tools:

Error 1:
---snip---
ld: fatal: file ../common/snprintf.o: open failed: No such file or directory
ld: fatal: File processing errors. No output written to nagios
---snip---

My workaround:
---snip---
cd common
ln -s ../base/snprintf.o .
cd ..
---snip---

Error 2:
---snip---
"helloworld.c", line 39: warning: syntax error: empty declaration
"helloworld.c", line 76: warning: argument #8 is incompatible with prototype:
        prototype: pointer to void : "../include/nagios.h", line 491
        argument : pointer to function(pointer to char) returning void
ld: fatal: file helloworld.o: unknown file type
ld: fatal: File processing errors. No output written to helloworld.o
---snip---

My workaround (and probably the right fix):
---snip---
vi include/nagios.h
goto line 491
change
int schedule_new_event(int,int,time_t,int,unsigned long,void *,int,void *,void *,int); /* schedules a new timed event */
to
int schedule_new_event(int,int,time_t,int,unsigned long,void *,int,void (*)(char *),void *,int); /* schedules a new timed event */

vi include/nebmodules.h
search
#define NEB_API_VERSION(x) int __neb_api_version = x;
change to (remove the semicolon at the end)
#define NEB_API_VERSION(x) int __neb_api_version = x

perl -pi -e 's:helloworld.o:helloworld:g' module/Makefile*
---snip---

Bye,
Alexander.

From gmueller at netways.de Tue Jan 8 12:52:22 2008
From: gmueller at netways.de (Gerd Mueller)
Date: Tue, 08 Jan 2008 12:52:22 +0100
Subject: Problem with NSCA large and multiline output
In-Reply-To: <1413373335@web.de>
References: <1413373335@web.de>
Message-ID: <1199793142.12870.14.camel@netl-gm-01.int.netways.de>

Hi Michael,

according to the Nagios changelog, the maximum plugin output length is
now 8 KB. And right now NSCA and NRPE aren't able to handle more than
one line. :-(

Cheers,
Gerd

On Friday, 04.01.2008 at 14:36 +0100, Michael Lübben wrote:
> Hi all,
>
> I use distributed monitoring and now want to update to version 3.
> But NSCA does not send the complete plugin output. I have changed
> MAX_PLUGINOUTPUT_LENGTH from 512 to 2512 and recompiled. NSCA now sends
> more output, but still not enough. Does anyone know the maximum value I
> can use? Another problem is that NSCA can only send the first line,
> nothing more. I am not a C developer and have no idea how to fix this.
>
> Bye
> Michael

--
Gerd Mueller
Senior Consultant

NETWAYS GmbH | Deutschherrnstr. 47a | D-90429 Nürnberg
Tel: +49 911 92885-0 | Fax: +49 911 92885-33
GF: Julian Hein | AG Nürnberg HRB18461
http://www.netways.de | gmueller at netways.de

From herbert at linuxhacker.at Tue Jan 8 15:42:33 2008
From: herbert at linuxhacker.at (Herbert Straub)
Date: Tue, 08 Jan 2008 15:42:33 +0100
Subject: NDOUtils configure error on CentOS x86_64
Message-ID: <47838BD9.1090304@linuxhacker.at>

The configure options --with-mysql-lib and --with-pgsql-lib do not work:
the MySQL library path cannot be found. I think my patch fixes this
problem.

Second, the default library path /usr/lib/mysql does not work on x86_64.
I think my patch fixes this as well.

--- configure.in.orig 2008-01-08 15:14:33.000000000 +0100
+++ configure.in 2008-01-08 15:15:01.000000000 +0100
@@ -177,7 +177,7 @@
 	])
 if test "$withval" = "" ; then
 	dnl If no library path specified, add default (RedHat) path for good measure
-	DBLDFLAGS="$LDFLAGS -L/usr/lib/mysql"
+	DBLDFLAGS="$LDFLAGS -L/usr/lib/mysql -L/usr/lib64/mysql"
 fi
 AC_ARG_WITH(mysql-inc,--with-mysql-inc=DIR sets location of the MySQL client include files,[
 	DBCFLAGS="${DBCFLAGS} -I${withval}"
@@ -185,7 +185,7 @@
 dnl Optional PostgreSQL library and include paths
 AC_ARG_WITH(pgsql-lib,--with-pgsql-lib=DIR sets location of the PostgreSQL client library,[
-	DBLDFLAGS="-L${withval}"
+	DBLDFLAGS=" ${DBLDFLAGS} -L${withval}"
 	LD_RUN_PATH="${withval}${LD_RUN_PATH:+:}${LD_RUN_PATH}"
 	])
 AC_ARG_WITH(pgsql-inc,--with-pgsql-inc=DIR sets location of the PostgreSQL client include files,[

Download the patch from:
http://src.linuxhacker.at/patches/ndoutils-1.4b7-mysql.patch

Regards
Herbert Straub

From michael_luebben at web.de Tue Jan 8 16:39:38 2008
From: michael_luebben at web.de (Michael Lübben)
Date: Tue, 08 Jan 2008 16:39:38 +0100
Subject: Problem with NSCA large and multiline output
Message-ID: <1419589756@web.de>

Hi Gerd,

NRPE 2.10 and higher supports multiline output ;-)

Bye
Michael

-----Original Message-----
From: Nagios Developers List
Sent: 08.01.08 12:58:50
To: Nagios Developers List
Subject: Re: [Nagios-devel] Problem with NSCA large and multiline output

Hi Michael,

according to the Nagios changelog, the maximum plugin output length is now 8 KB. Right now, NSCA and NRPE aren't able to handle more than one line.
:-(

Cheers,
Gerd

On Friday, 04.01.2008, at 14:36 +0100, Michael Lübben wrote:
> Hi @all,
>
> I use distributed monitoring, and I now want to upgrade to version 3. But NSCA does not send the complete plugin output. I changed MAX_PLUGINOUTPUT_LENGTH from 512 to 2512 and recompiled. NSCA now sends more output, but still not enough. Does anyone know the maximum value I can use? Another problem is that NSCA can only send the first line, nothing more. I am not a C developer and have no idea how to fix this.
>
> Bye
> Michael

_______________________________________________
Nagios-devel mailing list
Nagios-devel at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-devel

From ton.voon at altinity.com Tue Jan 8 17:04:50 2008
From: ton.voon at altinity.com (Ton Voon)
Date: Tue, 8 Jan 2008 16:04:50 +0000
Subject: NDOUtils configure error on CentOS x86_64
In-Reply-To: <47838BD9.1090304@linuxhacker.at>
References: <47838BD9.1090304@linuxhacker.at>
Message-ID:

On 8 Jan 2008, at 14:42, Herbert Straub wrote:

> The specification of the configure options --with-mysql-lib and
> --with-pgsql-lib does not work. The mysql library path cannot be found.

Does this patch work for you? It uses mysql_config to work out the appropriate flags.

http://altinity.blogs.com/dotorg/2007/04/better_mysqlcli.html

Ton

http://www.altinity.com
UK: +44 (0)870 787 9243
US: +1 866 879 9184
Fax: +44 (0)845 280 1725
Skype: tonvoon

From thomas at zango.com Tue Jan 8 17:32:51 2008
From: thomas at zango.com (Thomas Guyot-Sionnest)
Date: Tue, 08 Jan 2008 11:32:51 -0500
Subject: Difference in CPU time with and without ePN
Message-ID: <4783A5B3.8070504@zango.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

If anyone's interested, here's what ePN versus pure Perl plugins can look like in terms of server CPU load. This is on a dual-processor (hyper-threaded, so 4 logical CPUs) server with hardware SCSI RAID (battery-backed write-back cache) and 2 GiB of RAM.

The server runs almost 1000 active checks per minute spread across nearly 100 hosts (Nagios 2.7).
In the attached graph you can see the server load as I changed the settings of one Perl plugin that runs 28 times every minute. There are two drops:

1. At the very beginning I removed the "/usr/bin/perl " prefix in the check command definition. The ePN compilation was failing, so the file was likely still being recompiled at every run. The drop in user time seems to come from the plug-in not running (but still being compiled).

2. After the weekend, the second drop shows what happened when I fixed the plug-in compilation. From then on the plugin was running again but no longer being recompiled at each run, which caused a rise in user time and a nice drop in system time.

This graph clearly shows that ePN is a must for any big system that relies heavily on Perl plugins. The difference you see here is caused by only 28 checks out of nearly 1000.

The CPU data is gathered every minute through SNMP counters and properly divided according to the number of CPUs.

- --
Thomas
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHg6Wz6dZ+Kt5BchYRAudkAJ0RCpgqnASiwUoU/moCWuLWnAn+FACfYdQA
eV+dnGKKiG0zYlDPZxtluwk=
=5yZ/
-----END PGP SIGNATURE-----
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graph_image.php.png
Type: image/png
Size: 30780 bytes
Desc: not available
URL:

From herbert at linuxhacker.at Tue Jan 8 18:11:24 2008
From: herbert at linuxhacker.at (Herbert Straub)
Date: Tue, 08 Jan 2008 18:11:24 +0100
Subject: NDOUtils configure error on CentOS x86_64
In-Reply-To:
References: <47838BD9.1090304@linuxhacker.at>
Message-ID: <4783AEBC.3070402@linuxhacker.at>

Ton Voon wrote:
> Does this patch work for you? It uses mysql_config to work out the
> appropriate flags.
>
> http://altinity.blogs.com/dotorg/2007/04/better_mysqlcli.html
>

Ton,

np_mysqlclient.m4 and the patches to Makefile.in and configure.in work very well. I created a complete patch for NDOUtils 1.4b7 that contains all the changes:
http://src.linuxhacker.at/patches/ndoutils-1.4b7-better_mysql_detection.patch
With it I was able to build NDOUtils using this spec file:
http://src.linuxhacker.at/rpmspec/ndoutils.spec

Herbert

From ton.voon at altinity.com Tue Jan 8 18:35:25 2008
From: ton.voon at altinity.com (Ton Voon)
Date: Tue, 8 Jan 2008 17:35:25 +0000
Subject: NSCA error with --single and aggregate writes enabled
Message-ID: <21E3E5FF-1C15-44C1-8BFA-01C286F3DC49@altinity.com>

Hi Ethan,

We've found a problem with NSCA when aggregate writes are enabled and NSCA starts writing to the command file before it gets created.
Details here: http://altinity.blogs.com/dotorg/2008/01/nscas-aggregate.html

Ton

http://www.altinity.com
UK: +44 (0)870 787 9243
US: +1 866 879 9184
Fax: +44 (0)845 280 1725
Skype: tonvoon

From thomas at zango.com Tue Jan 8 19:48:19 2008
From: thomas at zango.com (Thomas Guyot-Sionnest)
Date: Tue, 08 Jan 2008 13:48:19 -0500
Subject: NSCA error with --single and aggregate writes enabled
In-Reply-To: <21E3E5FF-1C15-44C1-8BFA-01C286F3DC49@altinity.com>
References: <21E3E5FF-1C15-44C1-8BFA-01C286F3DC49@altinity.com>
Message-ID: <4783C573.5000701@zango.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ton Voon wrote:
> Hi Ethan,
>
> We've found a problem with NSCA when aggregate writes are enabled and
> NSCA starts writing to the command file before it gets created.
> Details here: http://altinity.blogs.com/dotorg/2008/01/nscas-aggregate.html

Alternatively, Nagios could leave the pipe in place and have NSCA open the file in non-blocking read-write mode. NSCA would then be able to fill up the pipe, and Nagios would get the queued commands when it starts up. On write failures (pipe full) the dump file could be used. This would be the preferred method to avoid data loss.

That's pretty much the same as host/service_perfdata_file_mode=p in Nagios 3 with a daemon like OCP_Daemon to get the data, but the other way round.

- --
Thomas
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHg8Vz6dZ+Kt5BchYRAgyfAJ0Ya6YR+DXZydCAFnDf80KrlMBO7wCg/IgR
WqVovkd5ekbogibtcz//g0E=
=MQca
-----END PGP SIGNATURE-----
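
A minimal sketch of the non-blocking read-write idea described above, for illustration only. It is written in Python rather than NSCA's C and is not NSCA's actual code. It assumes Linux FIFO semantics, where opening a FIFO with O_RDWR succeeds even when no reader is attached (POSIX leaves that case undefined). The class name `CommandPipeWriter`, the method `submit`, and both file paths are hypothetical.

```python
import os
import select


class CommandPipeWriter:
    """Queue command lines in a FIFO, spilling to a dump file when it is full.

    Hypothetical helper, not part of NSCA or Nagios.
    """

    def __init__(self, fifo_path, dump_path):
        self.dump_path = dump_path
        # O_RDWR lets the open succeed before any reader exists (Linux);
        # O_NONBLOCK makes a write to a full pipe fail instead of hanging.
        self.fd = os.open(fifo_path, os.O_RDWR | os.O_NONBLOCK)

    def submit(self, line):
        """Return "pipe" if the line was queued in the FIFO, else "dump"."""
        data = (line.rstrip("\n") + "\n").encode()
        # Writes of at most PIPE_BUF bytes are all-or-nothing, so a command
        # line is never half-written into the pipe.
        if len(data) <= select.PIPE_BUF:
            try:
                os.write(self.fd, data)
                return "pipe"
            except BlockingIOError:
                pass  # pipe is full, fall through to the dump file
        with open(self.dump_path, "ab") as dump:
            dump.write(data)
        return "dump"

    def close(self):
        os.close(self.fd)
```

The descriptor is deliberately kept open for the life of the writer: once the last descriptor on a FIFO is closed, data still sitting in the pipe is discarded, so a writer that opened and closed the pipe for every command would lose whatever was queued before the reader started.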

From gmueller at netways.de Wed Jan 9 10:04:29 2008
From: gmueller at netways.de (Gerd Mueller)
Date: Wed, 09 Jan 2008 10:04:29 +0100
Subject: Problem with NSCA large and multiline output
In-Reply-To: <1419589756@web.de>
References: <1419589756@web.de>
Message-ID: <1199869469.13583.5.camel@netl-gm-01.int.netways.de>

Oops ;-) Of course you're right. Where possible I use ssh instead of NSCA and NRPE, so I never ran into this limit.

Cheers
Gerd

On Tuesday, 08.01.2008, at 16:39 +0100, Michael Lübben wrote:
> Hi Gerd,
> NRPE 2.10 and higher supports multiline output ;-)
>
> Bye
> Michael
>
> -----Original Message-----
> From: Nagios Developers List
> Sent: 08.01.08 12:58:50
> To: Nagios Developers List
> Subject: Re: [Nagios-devel] Problem with NSCA large and multiline output
>
> Hi Michael,
>
> according to the Nagios changelog, the maximum plugin output length is now 8 KB.
> Right now, NSCA and NRPE aren't able to handle more than one
> line. :-(
>
> Cheers,
> Gerd
>
> On Friday, 04.01.2008, at 14:36 +0100, Michael Lübben wrote:
> > Hi @all,
> >
> > I use distributed monitoring, and I now want to upgrade to version 3. But NSCA does not send the complete plugin output. I changed MAX_PLUGINOUTPUT_LENGTH from 512 to 2512 and recompiled. NSCA now sends more output, but still not enough. Does anyone know the maximum value I can use? Another problem is that NSCA can only send the first line, nothing more. I am not a C developer and have no idea how to fix this.
> >
> > Bye
> > Michael
--
Gerd Mueller
Senior Consultant

NETWAYS GmbH | Deutschherrnstr. 47a | D-90429 Nürnberg
Tel: +49 911 92885-0 | Fax: +49 911 92885-33
GF: Julian Hein | AG Nürnberg HRB18461
http://www.netways.de | gmueller at netways.de

From ae at op5.se Wed Jan 9 09:35:28 2008
From: ae at op5.se (Andreas Ericsson)
Date: Wed, 09 Jan 2008 09:35:28 +0100
Subject: Difference in CPU time with and without ePN
In-Reply-To: <4783A5B3.8070504@zango.com>
References: <4783A5B3.8070504@zango.com>
Message-ID: <47848750.2040200@op5.se>

Thomas Guyot-Sionnest wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> If anyone's interested here's what ePN versus pure Perl plugins can
> look like on the server's CPU load. This is on a dual proc
> (multi-threaded == 4 proc) server with hardware SCSI raid
> (battery-backed write-back) and 2GiB of RAM.
>
> The server is almost doing 1000 active checks/minute spread on nearly
> 100 hosts (Nagios 2.7). In the attached graph you can see the server
> load as I changed the settings of one Perl plugin that runs 28 times
> every minute. There are two drops:
>
> 1. At the very beginning I removed the "/usr/bin/perl " in the
> checkcommand definition.
> The ePN compilation was failing so the file was
> likely still being recompiled at every run. The drop in user time seems
> to be from the plug-in not running (but still being compiled).
>
> 2. After the week-end, the 2nd drop shows the drop when I fixed the
> plug-in compilation. At that time the plugin was running again but not
> being recompiled at each run, which caused a rise in user time and a
> nice drop in system time.
>
> This graph clearly shows that ePN is a must for any big system that relies
> heavily on Perl plugins. The difference you see here is caused by only
> 28 checks out of nearly 1000.
>

Have you found a way around the memory leakage? Otherwise, I still believe it's more hassle than it's worth, and effort would be better spent cutting the number of fork()s in half by having Nagios multiplex its checks.

Nice figures though.

--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

From kyoxu at hotmail.com Wed Jan 9 10:54:12 2008
From: kyoxu at hotmail.com (Kelvin Xu)
Date: Wed, 9 Jan 2008 17:54:12 +0800
Subject: nagios core dump and restart when check_nrpe
Message-ID:

Hi all,

I have just installed Nagios 3.0rc1 on a Solaris 10 machine. Everything works fine except when I try to run check_nrpe against a remote host or localhost. I checked my /var/adm/messages. Below is a section of the output:

Jan 4 10:16:39 pnsgsit1gw1 nagios[263]: [ID 702911 user.info] Caught SIGTERM, shutting down...
Jan 4 10:16:39 pnsgsit1gw1 nagios[263]: [ID 702911 user.info] Successfully shutdown... (PID=263)
Jan 4 10:16:39 pnsgsit1gw1 nagios[290]: [ID 702911 user.info] Nagios 3.0rc1 starting...
(PID=290)
Jan 4 10:16:39 pnsgsit1gw1 nagios[290]: [ID 702911 user.info] Local time is Fri Jan 04 10:16:39 SGT 2008
Jan 4 10:16:39 pnsgsit1gw1 nagios[290]: [ID 702911 user.info] LOG VERSION: 2.0
Jan 4 10:16:39 pnsgsit1gw1 nagios[291]: [ID 702911 user.info] Finished daemonizing... (New PID=291)
Jan 4 10:17:53 pnsgsit1gw1 genunix: [ID 603404 kern.notice] NOTICE: core_log: nagios[302] setid process, core not dumped: /var/core/core.nagios.302.pnsgsit1gw1.210033.65541.1199413073
Jan 4 10:17:53 pnsgsit1gw1 nagios[291]: [ID 702911 user.info] Caught SIGTERM, shutting down...
Jan 4 10:17:53 pnsgsit1gw1 nagios[291]: [ID 702911 user.info] Successfully shutdown... (PID=291)
Jan 4 10:17:53 pnsgsit1gw1 nagios[305]: [ID 702911 user.info] Nagios 3.0rc1 starting... (PID=305)
Jan 4 10:17:53 pnsgsit1gw1 nagios[305]: [ID 702911 user.info] Local time is Fri Jan 04 10:17:53 SGT 2008
Jan 4 10:17:53 pnsgsit1gw1 nagios[305]: [ID 702911 user.info] LOG VERSION: 2.0
Jan 4 10:17:53 pnsgsit1gw1 nagios[306]: [ID 702911 user.info] Finished daemonizing... (New PID=306)

This repeats every few minutes and does not occur when I remove the NRPE service monitoring from the configuration.

I tried running /usr/local/nagios/libexec/check_nrpe -H pnsgsit1gw2 -c check_load. The output seems fine except that some additional characters are appended to the end:

OK - load average: 0.00, 0.00, 0.00|load1=0.000;15.000;30.000;0; load5=0.000;10.000;25.000;0; load15=0.000;5.000;20.000;0;???p?:

Below is the debug log that I extracted. It seems that Nagios dumps core when a check_nrpe request is sent out and a new process is created:

[1199869965.255643] [064.1] [pid=720] Making callbacks (type 13)...
[1199869965.255659] [016.0] [pid=720] Checking service 'NRPE' on host 'pnsgsit1web2a'...
[1199869965.255752] [001.0] [pid=720] get_raw_command_line()
[1199869965.255774] [2320.2] [pid=720] Raw Command Input: $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
[1199869965.255792] [001.0] [pid=720] process_macros()
[1199869965.255808] [2048.1] [pid=720] **** BEGIN MACRO PROCESSING ***********
[1199869965.255822] [2048.1] [pid=720] Processing: 'check_load'
[1199869965.255836] [2048.2] [pid=720] Processing part: 'check_load'
[1199869965.255851] [2048.2] [pid=720] Not currently in macro. Running output (10): 'check_load'
[1199869965.255866] [2048.1] [pid=720] Done. Final output: 'check_load'
[1199869965.255879] [2048.1] [pid=720] **** END MACRO PROCESSING *************
[1199869965.255892] [2320.2] [pid=720] Expanded Command Output: $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
[1199869965.255905] [001.0] [pid=720] process_macros()
[1199869965.255919] [2048.1] [pid=720] **** BEGIN MACRO PROCESSING ***********
[1199869965.255931] [2048.1] [pid=720] Processing: '$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$'
[1199869965.255945] [2048.2] [pid=720] Processing part: ''
[1199869965.255958] [2048.2] [pid=720] Not currently in macro. Running output (0): ''
[1199869965.255971] [2048.2] [pid=720] Processing part: 'USER1'
[1199869965.256010] [2048.2] [pid=720] Uncleaned macro. Running output (25): '/usr/local/nagios/libexec'
[1199869965.256025] [2048.2] [pid=720] Just finished macro. Running output (25): '/usr/local/nagios/libexec'
[1199869965.256039] [2048.2] [pid=720] Processing part: '/check_nrpe -H '
[1199869965.256054] [2048.2] [pid=720] Not currently in macro. Running output (40): '/usr/local/nagios/libexec/check_nrpe -H '
[1199869965.256068] [2048.2] [pid=720] Processing part: 'HOSTADDRESS'
[1199869965.256088] [2048.2] [pid=720] Uncleaned macro. Running output (52): '/usr/local/nagios/libexec/check_nrpe -H 10.106.65.18'
[1199869965.256103] [2048.2] [pid=720] Just finished macro.
Running output (52): '/usr/local/nagios/libexec/check_nrpe -H 10.106.65.18'
[1199869965.256118] [2048.2] [pid=720] Processing part: ' -c '
[1199869965.256132] [2048.2] [pid=720] Not currently in macro. Running output (56): '/usr/local/nagios/libexec/check_nrpe -H 10.106.65.18 -c '
[1199869965.256218] [2048.2] [pid=720] Processing part: 'ARG1'
[1199869965.256245] [2048.2] [pid=720] Uncleaned macro. Running output (66): '/usr/local/nagios/libexec/check_nrpe -H 10.106.65.18 -c check_load'
[1199869965.256260] [2048.2] [pid=720] Just finished macro. Running output (66): '/usr/local/nagios/libexec/check_nrpe -H 10.106.65.18 -c check_load'
[1199869965.256274] [2048.2] [pid=720] Processing part: ''
[1199869965.256288] [2048.2] [pid=720] Not currently in macro. Running output (66): '/usr/local/nagios/libexec/check_nrpe -H 10.106.65.18 -c check_load'
[1199869965.256302] [2048.1] [pid=720] Done. Final output: '/usr/local/nagios/libexec/check_nrpe -H 10.106.65.18 -c check_load'
[1199869965.256316] [2048.1] [pid=720] **** END MACRO PROCESSING *************
[1199869965.256595] [016.1] [pid=720] Check result output will be written to '/usr/local/nagios/var/spool/checkresults/checkCmaaAb' (fd=9)
[1199869965.256737] [064.1] [pid=720] Making callbacks (type 13)...
[1199869965.257854] [016.2] [pid=720] Service check is executing in child process (pid=758)
[1199869965.260733] [001.0] [pid=758] process_macros()
[1199869965.260821] [001.0] [pid=758] process_macros()
[1199869965.260852] [001.0] [pid=758] process_macros()
[1199869965.260879] [001.0] [pid=758] process_macros()
[1199869965.260907] [001.0] [pid=758] process_macros()
[1199869965.260934] [001.0] [pid=758] process_macros()
[1199869965.267584] [001.0] [pid=720] handle_timed_event() end
[1199869965.267647] [008.1] [pid=720] ** Event Check Loop
[1199869965.267718] [008.1] [pid=720] Next High Priority Event Time: Wed Jan 9 17:12:52 2008
[1199869965.267742] [008.1] [pid=720] Next Low Priority Event Time: Wed Jan 9 17:14:32 2008
[1199869965.267758] [008.1] [pid=720] Current/Max Service Checks: 1/0
[1199869965.267773] [008.2] [pid=720] No events to execute at the moment. Idling for a bit...
[1199869965.267788] [001.0] [pid=720] check_for_external_commands()
[1199869965.267806] [064.1] [pid=720] Making callbacks (type 8)...
[1199869965.302735] [001.0] [pid=720] event_execution_loop() end
[1199869965.303213] [064.1] [pid=720] Making callbacks (type 9)...
[1199869965.303244] [064.1] [pid=720] Making callbacks (type 7)...
[1199869965.303260] [064.1] [pid=720] Making callbacks (type 7)...
[1199869965.303276] [064.1] [pid=720] Making callbacks (type 26)...
[1199869965.303291] [001.0] [pid=720] xrddefault_save_state_information()
[1199869965.303480] [004.2] [pid=720] Writing retention data to temp file '/usr/local/nagios/var/nagios.tmpDmaaAb'
[1199869965.325858] [064.1] [pid=720] Making callbacks (type 26)...
[1199869965.350393] [064.1] [pid=720] Making callbacks (type 9)...
[1199869965.404567] [001.0] [pid=762] drop_privileges() start
[1199869965.404797] [004.0] [pid=762] Original UID/GID: 0/0
[1199869965.453908] [004.0] [pid=762] New UID/GID: 210033/65541
[1199869965.454562] [064.1] [pid=762] Making callbacks (type 9)...
[1199869965.454874] [064.1] [pid=762] Making callbacks (type 9)...
[1199869965.455046] [064.1] [pid=762] Making callbacks (type 9)...
[1199869965.455064] [064.1] [pid=762] Making callbacks (type 7)...
[1199869965.462889] [064.1] [pid=762] Making callbacks (type 7)...
[1199869965.465180] [064.1] [pid=763] Making callbacks (type 7)...
[1199869965.465827] [064.1] [pid=763] Making callbacks (type 9)...
[1199869965.482936] [064.1] [pid=763] Making callbacks (type 26)...
[1199869965.482993] [001.0] [pid=763] xrddefault_read_state_information() start
[1199869965.483484] [064.1] [pid=763] Making callbacks (type 19)...

Does anyone have any idea what the problem could be? Has anyone succeeded in using Nagios 3.0rc1 on Solaris 10?

Thanks

Regards,
Kelvin Xu
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From dermoth at aei.ca Wed Jan 9 13:48:32 2008
From: dermoth at aei.ca (Thomas Guyot-Sionnest)
Date: Wed, 09 Jan 2008 07:48:32 -0500
Subject: Difference in CPU time with and without ePN
In-Reply-To: <47848750.2040200@op5.se>
References: <4783A5B3.8070504@zango.com> <47848750.2040200@op5.se>
Message-ID: <4784C2A0.8020000@aei.ca>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 09/01/08 03:35 AM, Andreas Ericsson wrote:
> Thomas Guyot-Sionnest wrote:
>>
>> This graph clearly shows that ePN is a must for any big system that relies
>> heavily on Perl plugins. The difference you see here is caused by only
>> 28 checks out of nearly 1000.
>>
>
> Have you found a way around the memory leakage? Otherwise, I still believe
> it's more hassle than it's worth, and effort would be better spent to cut
> the number of fork()'s in half by having Nagios multiplex its checks.

I never noticed any memory problem with ePN, and my Nagios has often run for many consecutive months without being stopped (with SIGHUPs from time to time to update the config, though).

Could you point me to some documents or communication archives that describe the problem?

Thomas
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHhMKf6dZ+Kt5BchYRAgA9AJ9WR4dL2C5yeE4PSwBAe804btMMgACeM72V
QplmZPq6I8/yJUEmLD/e4FI=
=Vo8x
-----END PGP SIGNATURE-----

From ae at op5.se Wed Jan 9 14:36:12 2008
From: ae at op5.se (Andreas Ericsson)
Date: Wed, 09 Jan 2008 14:36:12 +0100
Subject: Difference in CPU time with and without ePN
In-Reply-To: <4784C2A0.8020000@aei.ca>
References: <4783A5B3.8070504@zango.com> <47848750.2040200@op5.se> <4784C2A0.8020000@aei.ca>
Message-ID: <4784CDCC.7050101@op5.se>

Thomas Guyot-Sionnest wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 09/01/08 03:35 AM, Andreas Ericsson wrote:
>> Thomas Guyot-Sionnest wrote:
>>> This graph clearly shows that ePN is a must for any big system that relies
>>> heavily on Perl plugins. The difference you see here is caused by only
>>> 28 checks out of nearly 1000.
>>>
>> Have you found a way around the memory leakage? Otherwise, I still believe
>> it's more hassle than it's worth, and effort would be better spent to cut
>> the number of fork()'s in half by having Nagios multiplex its checks.
>
> I never noticed any memory problem with the ePN and my Nagios often ran
> for many consecutive months without being stopped (doing SIGHUPs from
> time to time to update the config though)
>
> Could you direct me to some documents or communication archives that
> point out the problem?
>

http://www.google.se/search?q=%2Bnagios+%2B%22embedded+perl%22+%2B%22memory+leak%22

Embedded Perl leaks memory. A lot. If you have a setup where it doesn't, you're pretty much unique. Look for "memory leak" or "embedded perl" in the nagios-devel and nagios-users archives, apart from the link above.

Which versions of Nagios and Perl are you using? What system/hardware is this on? The ld, glibc and gcc versions might also be interesting, as well as which options you used when compiling Nagios.

If the plugins are custom ones, those could also be worth having a look at.
As far as I know, though, Stanley Hopcroft has been trying for well over a year to consign the leaks to oblivion, with some but far from complete success, and the result varies heavily depending on a lot of different things which aren't 100% clear to anyone.

--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

From thomas at zango.com Wed Jan 9 17:55:46 2008
From: thomas at zango.com (Thomas Guyot-Sionnest)
Date: Wed, 09 Jan 2008 11:55:46 -0500
Subject: Difference in CPU time with and without ePN
In-Reply-To: <4784CDCC.7050101@op5.se>
References: <4783A5B3.8070504@zango.com> <47848750.2040200@op5.se> <4784C2A0.8020000@aei.ca> <4784CDCC.7050101@op5.se>
Message-ID: <4784FC92.2070103@zango.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andreas Ericsson wrote:
> Thomas Guyot-Sionnest wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> On 09/01/08 03:35 AM, Andreas Ericsson wrote:
>>> Have you found a way around the memory leakage? Otherwise, I still believe
>>> it's more hassle than it's worth, and effort would be better spent to cut
>>> the number of fork()'s in half by having Nagios multiplex its checks.
>> I never noticed any memory problem with the ePN and my Nagios often ran
>> for many consecutive months without being stopped (doing SIGHUPs from
>> time to time to update the config though)
>>
>> Could you direct me to some documents or communication archives that
>> point out the problem?
>>
>
> http://www.google.se/search?q=%2Bnagios+%2B%22embedded+perl%22+%2B%22memory+leak%22
>
> Embedded perl leaks memory. A lot. If you have a setup where it doesn't,
> you're pretty much unique.
> Look for "memory leak" or "embedded perl" in
> the nagios-devel and nagios-users archives, apart from the link above.
>
> Which versions of Nagios and Perl are you using? What system/hw is this
> on? ld, glibc and gcc versions might also be interesting, as well as
> which options you used when compiling Nagios.
>
> If the plugins are custom ones, that could also be worth having a look
> at. In so far as I know though, Stanley Hopcroft has been trying well
> over a year to consign the leaks into oblivion, with some but far from
> complete success, and the result varies heavily depending on a lot of
> different things, all of which aren't 100% clear to anyone.
>

Well, yesterday I had to kill and restart Nagios to make changes to Perl modules take effect (a HUP wouldn't do), and it's true that Nagios now uses much less memory, so there do indeed seem to be leaks. However, with 2 GiB of RAM it would take months (maybe more than a year) to fill up the server's memory. I don't graph real memory usage yet (used memory minus cache/buffers), so I can't really tell how fast it increases, but it's certainly not a big deal with 2 GiB.

HUPs are said to cause memory leaks as well, and:

1. On average I probably HUP Nagios 2-3 times a week.
2. Automated scripts probably do the same on config pulls from AD (the HUP only happens if the config changes, but that happens quite often).

As for solutions, you can:

1. Kill and restart Nagios instead of HUP'ing it (why doesn't Nagios execve itself on HUPs, BTW?)
2. Monitor the Nagios process memory usage, optionally with an event handler for automated restarts
3. Schedule automated restarts every day/week/month or so
4. Add more memory

For me the benefits are definitely worth the downsides.
- -- Thomas -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFHhPyS6dZ+Kt5BchYRAoFoAJ9nAE/Iv5Ffxyet2+nxWMyYZowc6ACfXGGZ ++MhkdBwWwVST7fniyUPqyg= =sv3N -----END PGP SIGNATURE----- ------------------------------------------------------------------------- Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace From thomas at zango.com Wed Jan 9 18:07:42 2008 From: thomas at zango.com (Thomas Guyot-Sionnest) Date: Wed, 09 Jan 2008 12:07:42 -0500 Subject: Difference in CPU time with and without ePN In-Reply-To: <4784FC92.2070103@zango.com> References: <4783A5B3.8070504@zango.com> <47848750.2040200@op5.se> <4784C2A0.8020000@aei.ca> <4784CDCC.7050101@op5.se> <4784FC92.2070103@zango.com> Message-ID: <4784FF5E.10907@zango.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Thomas Guyot-Sionnest wrote: > Andreas Ericsson wrote: >> Thomas Guyot-Sionnest wrote: >>> -----BEGIN PGP SIGNED MESSAGE----- >>> Hash: SHA1 >>> >>> On 09/01/08 03:35 AM, Andreas Ericsson wrote: >>>> Have you found a way around the memory leakage? Otherwise, I still believe >>>> it's more hassle than it's worth, and effort would be better spent to cut >>>> the number of fork()'s in half by having Nagios multiplex its checks. >>> I never noticed any memory problem with the ePN and my Nagios often ran >>> for many consecutive months without being stopped (doing SIGHUPs from >>> time to time to update the config trough) >>> >>> Could you direct me to some documents of communication archives that >>> point out the problem? >>> >> http://www.google.se/search?q=%2Bnagios+%2B%22embedded+perl%22+%2B%22memory+leak%22 > >> Embedded perl leaks memory. Alot. If you have a setup where it doesn't, >> you're pretty much unique. 
Look for "memory leak" or "embedded perl" in >> the nagios-devel and nagios-users archives, apart from the link above. > >> Which versions of Nagios and Perl are you using? What system/hw is this >> on? ld, glibc and gcc versions might also be interesting, as well as >> which options you used when compiling Nagios. > >> If the plugins are custom ones, that could also be worth having a look >> at. In so far as I know though, Stanley Hopcroft has been trying well >> over a year to consign the leaks into oblivion, with some but far from >> complete success, and the result varies heavily depending on a lot of >> different things, all of which aren't 100% clear to anyone. > > > Well, yesterday I had to kill and restart Nagios to make changes to Perl > modules apply (HUP wouldn't do) and it's true that Nagios now use much > less memory, so indeed there seems to be leaks. However with 2GiB of RAM > it would take months (maybe more than a year) to fill up the server > memory. I don't graph real memory usage yet (used memory minus > cache/buffers) so I can't really tell how fast it increase, but it's for > sure not a big deal with 2GiB. > > HUPs have been said to cause memory leaks as well and > 1. In average I probably HUP Nagios 2-3 times a week. > 2. Automated scripts probably do the same on config pulls from AD (The > HUP only happens if the config changes, but that happens quite often) > > On the solutions side, you can. > > 1. Kill and restart Nagios instead of HUP'ing it (Why don't nagios > execve itself on HUPs BTW?) > 2. Monitor the Nagios process memory usage, with optionally an event > handler for automated restarts > 3. Schedule automated restarts every day/week/month or so > 4. Add more memory > > For me the benefits is definitely worth the downsides. Oh and I forgot to say... On the software side: Slackware 11, Nagios 2.7, Perl v5.8.8. GNU ld, gcc and glibc from Slackware 11 (Just like Perl btw). Dual Xeon (+ hyperthreading) == 4 logical CPUs. 
Relevant Nagios configure options:
'--enable-embedded-perl' '--with-perlcache' '--with-nanosleep'

For the plugins, I use most of those I posted in NagiosExchange (user
"dermoth"). They're all carefully written for ePN.

- --
Thomas
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHhP9e6dZ+Kt5BchYRAr+9AJ4zIK+2lOWq20tQekJGQ22oA8D/hwCg3dAf
6Syq4NQlqChc4MFMg2+o6kE=
=Rer0
-----END PGP SIGNATURE-----

From holger at CIS.FU-Berlin.DE Wed Jan 9 20:52:10 2008
From: holger at CIS.FU-Berlin.DE (Holger Weiss)
Date: Wed, 9 Jan 2008 20:52:10 +0100
Subject: [PATCH] send_nsca segfault on timeout
Message-ID: <20080109195209.GN28313306@CIS.FU-Berlin.DE>

If send_nsca runs into the timeout while it's in encrypt_init(), it can
segfault because encrypt_cleanup() (which is called from the signal
handler) calls mcrypt_generic_end() although mcrypt_generic_init() hasn't
completed yet. This happens every now and then for us; for some reason
send_nsca sometimes times out while it's in mcrypt_generic_init().

The attached patch checks whether mcrypt_generic_init() was called
before calling mcrypt_generic_end().

Holger
-------------- next part --------------
Index: configure.in
===================================================================
RCS file: /cvsroot/nagios/nsca/configure.in,v
retrieving revision 1.22
diff -u -r1.22 configure.in
--- configure.in 23 Nov 2007 17:32:14 -0000 1.22
+++ configure.in 9 Jan 2008 19:42:21 -0000
@@ -32,6 +32,7 @@
 dnl Checks for typedefs, structures, and compiler characteristics.
 AC_C_CONST
+AC_C_VOLATILE
 AC_STRUCT_TM
 AC_TYPE_MODE_T
 AC_TYPE_PID_T
@@ -92,6 +93,16 @@
 AC_SUBST(LIBWRAPLIBS)
 AC_CHECK_FUNCS(strdup strstr strtoul)
+dnl Define sig_atomic_t to int if it's not available.
+AC_CHECK_TYPE([sig_atomic_t],[],[
+	AC_DEFINE([sig_atomic_t],[int],
+	    [Define to 'int' if <signal.h> does not define.])
+	],[
+	#if HAVE_SIGNAL_H
+	#include <signal.h>
+	#endif
+	])
+
 dnl socklen_t check - from curl
 AC_CHECK_TYPE([socklen_t], ,[
 	AC_MSG_CHECKING([for socklen_t equivalent])
Index: src/utils.c
===================================================================
RCS file: /cvsroot/nagios/nsca/src/utils.c,v
retrieving revision 1.6
diff -u -r1.6 utils.c
--- src/utils.c 2 Feb 2006 18:45:06 -0000 1.6
+++ src/utils.c 9 Jan 2008 19:42:21 -0000
@@ -36,6 +36,9 @@
 /*#define DEBUG*/
 static unsigned long crc32_table[256];
+#ifdef HAVE_LIBMCRYPT
+static volatile sig_atomic_t mcrypt_initialized=FALSE;
+#endif
@@ -235,7 +238,7 @@
 	/* initialize encryption buffers */
 	mcrypt_generic_init(CI->td,CI->key,CI->keysize,CI->IV);
-
+	mcrypt_initialized=TRUE;
 #endif
 	return OK;
@@ -253,7 +256,8 @@
 #ifdef HAVE_LIBMCRYPT
 	/* mcrypt cleanup */
 	if(encryption_method!=ENCRYPT_NONE && encryption_method!=ENCRYPT_XOR){
-		mcrypt_generic_end(CI->td);
+		if(mcrypt_initialized==TRUE)
+			mcrypt_generic_end(CI->td);
 		free(CI->key);
 		CI->key=NULL;
 		free(CI->IV);
-------------- next part --------------
-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
-------------- next part --------------
_______________________________________________
Nagios-devel mailing list
Nagios-devel at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-devel

From michael_luebben at web.de Wed Jan 9 20:55:27 2008
From: michael_luebben at web.de (Michael Lübben)
Date: Wed, 09 Jan 2008 20:55:27 +0100
Subject: Problem with NSCA large and multiline output
Message-ID: <1421523726@web.de>

>>If possible I am using ssh instead of nsca and nrpe. So I did not have
>>this limit.

That's OK! But when you use distributed monitoring, ssh or nrpe is no
solution ;-)! I hope Ethan can fix this problem!

Bye
Michael

-----Original Message-----
From: Nagios Developers List
Sent: 09.01.08 10:05:02
To: Nagios Developers List
Subject: Re: [Nagios-devel] Problem with NSCA large and multiline output

Oops ;-)

Of course you're right. If possible I am using ssh instead of nsca and
nrpe. So I did not have this limit.

Cheers
Gerd

On Tuesday, 08.01.2008, at 16:39 +0100, Michael Lübben wrote:
> Hi Gerd,
> nrpe 2.10 and higher supports multiline output ;-)
>
> bye
> Michael
>
> -----Original Message-----
> From: Nagios Developers List
> Sent: 08.01.08 12:58:50
> To: Nagios Developers List
> Subject: Re: [Nagios-devel] Problem with NSCA large and multiline output
>
>
> Hi Michael,
>
> according to the Nagios changelog, the max plugin output length is now 8 KB.
> And right now NSCA and NRPE aren't able to handle more than 1
> line. :-(
>
> Cheers,
>
> Gerd
> On Friday, 04.01.2008, at 14:36 +0100, Michael Lübben wrote:
> > Hi @all,
> >
> > I use distributed monitoring and now want to update to version 3, but
> > NSCA does not send the complete plugin output. I changed
> > MAX_PLUGINOUTPUT_LENGTH from 512 to 2512 and recompiled. NSCA now sends
> > more output, but still not enough. Does anyone know the maximum value I
> > can use? Another problem is that NSCA can only send the first line, not
> > more. I am not a C developer and have no idea how to fix this.
> >
> > Bye
> > Michael

--
Gerd Mueller
Senior Consultant
NETWAYS GmbH | Deutschherrnstr. 47a | D-90429 Nürnberg
Tel: +49 911 92885-0 | Fax: +49 911 92885-33
GF: Julian Hein | AG Nürnberg HRB18461
http://www.netways.de | gmueller at netways.de

-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace From Mark.Limburg at police.sa.gov.au Thu Jan 10 03:15:23 2008 From: Mark.Limburg at police.sa.gov.au (Limburg, Mark (SAPOL)) Date: Thu, 10 Jan 2008 12:45:23 +1030 Subject: XHTML in Nagios Message-ID: Greetings, Been a while since I haunted these lists (back in v2 beta days), but I'm back and with a little project under my belt that may interest a few of you. I want to make the Nagios CGIs to produce XHTML1. I started off with grand ideas of PHP integration or perhaps a template engine, but I've returned to a more solid ground for now. Whilst I know my PHP, the integration of PHP into the existing CGI framework was beyond my immediate skills ... and the C template engine I was going to use had a memory leak (nothing is going to touch the stability of *MY* nagios install). So, my plan is to now alter the printf statements directly. This will be an acceptable middle ground for me, as it does nothing to alter the stability of the application, whilst providing a much cleaner/stylable set of output. For now, I'm just getting the output of my Nagios install (not that large) and rewriting the output into XHTML1 that makes some sense. I then alter the source accordingly. When this little project has finished phase one, it should then produce XHTML1 across the board. However, I doubt this will be where I finish, as what *I* see as well formed XHTML may not be the best way to do it. Phase two will be tweaking the XHTML so a wide (all?) variety of CSS magic can be applied. Which is where this email comes in .. As I'm starting the rewrite based on existing HTML output (because I can't code in C, making it slow and difficult to fully understand the output directly from the source), I'm asking for the community to send me their various rendered HTML files. 
In accordance with privacy concerns, I would more than understand if these HTML files are altered a bit - and that's fine - but I need well populated HTML so I can ensure the XHTML that I write has a better chance to be done "right". Hell, if you want to send me your version of a XHTML1 page, feel free :) I'm quite excited about this .. So far, I've altered the Tactical page and it's looking pretty good. Advice, warnings, code, ideas, anything .. Feel free to share :) Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- ------------------------------------------------------------------------- Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace -------------- next part -------------- _______________________________________________ Nagios-devel mailing list Nagios-devel at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-devel From ae at op5.se Thu Jan 10 10:15:16 2008 From: ae at op5.se (Andreas Ericsson) Date: Thu, 10 Jan 2008 10:15:16 +0100 Subject: Difference in CPU time with and without ePN In-Reply-To: <4784FC92.2070103@zango.com> References: <4783A5B3.8070504@zango.com> <47848750.2040200@op5.se> <4784C2A0.8020000@aei.ca> <4784CDCC.7050101@op5.se> <4784FC92.2070103@zango.com> Message-ID: <4785E224.2060803@op5.se> Thomas Guyot-Sionnest wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Andreas Ericsson wrote: >> Thomas Guyot-Sionnest wrote: >>> -----BEGIN PGP SIGNED MESSAGE----- >>> Hash: SHA1 >>> >>> On 09/01/08 03:35 AM, Andreas Ericsson wrote: >>>> Have you found a way around the memory leakage? Otherwise, I still believe >>>> it's more hassle than it's worth, and effort would be better spent to cut >>>> the number of fork()'s in half by having Nagios multiplex its checks. 
>>> I never noticed any memory problem with the ePN and my Nagios often ran >>> for many consecutive months without being stopped (doing SIGHUPs from >>> time to time to update the config trough) >>> >>> Could you direct me to some documents of communication archives that >>> point out the problem? >>> >> http://www.google.se/search?q=%2Bnagios+%2B%22embedded+perl%22+%2B%22memory+leak%22 >> >> Embedded perl leaks memory. Alot. If you have a setup where it doesn't, >> you're pretty much unique. Look for "memory leak" or "embedded perl" in >> the nagios-devel and nagios-users archives, apart from the link above. >> >> Which versions of Nagios and Perl are you using? What system/hw is this >> on? ld, glibc and gcc versions might also be interesting, as well as >> which options you used when compiling Nagios. >> >> If the plugins are custom ones, that could also be worth having a look >> at. In so far as I know though, Stanley Hopcroft has been trying well >> over a year to consign the leaks into oblivion, with some but far from >> complete success, and the result varies heavily depending on a lot of >> different things, all of which aren't 100% clear to anyone. >> > > Well, yesterday I had to kill and restart Nagios to make changes to Perl > modules apply (HUP wouldn't do) and it's true that Nagios now use much > less memory, so indeed there seems to be leaks. However with 2GiB of RAM > it would take months (maybe more than a year) to fill up the server > memory. I don't graph real memory usage yet (used memory minus > cache/buffers) so I can't really tell how fast it increase, but it's for > sure not a big deal with 2GiB. > That depends on the system. Some have reported leaks of 20MiB/minute. 2 gigs won't last very long at that rate. > HUPs have been said to cause memory leaks as well and > 1. In average I probably HUP Nagios 2-3 times a week. > 2. 
Automated scripts probably do the same on config pulls from AD (The
> HUP only happens if the config changes, but that happens quite often)
>
> On the solutions side, you can.
>
> 1. Kill and restart Nagios instead of HUP'ing it (Why don't nagios
> execve itself on HUPs BTW?)

I have no idea. It would definitely make it easier to write modules
for it, as each new load is 100% certain to provide a clean slate.

> 2. Monitor the Nagios process memory usage, with optionally an event
> handler for automated restarts

That feels decidedly icky though. It would be better to use the malloc
info from glibc to internally decide if we're leaking or not, and then
simply execve() self when the memory usage grows too large.

> 3. Schedule automated restarts every day/week/month or so
> 4. Add more memory
>

Adding more memory only puts off having to do one of the other
things though. It's not a real solution.

> For me the benefits is definitely worth the downsides.
>

You're not alone. Just be aware that some systems see enormous
leaks when embedded perl is used. I was curious to know what
your specs were, as you would definitely have noticed if you
were suffering from the huge kind of leaks.

--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace From frederic.schaer at cea.fr Thu Jan 10 10:52:22 2008 From: frederic.schaer at cea.fr (SCHAER Frederic) Date: Thu, 10 Jan 2008 10:52:22 +0100 Subject: nrpe RPM spec file bug(s) In-Reply-To: <4782302E.5020104@op5.se> References: <4782302E.5020104@op5.se> Message-ID: Hi, I have to manage servers with hundreds of local unix users created, and all of these servers must have consistent UIDs/GIDs, just because of local constraints (NFS for instance). I also need to monitor these (computing) servers, but I cannot afford having nrpe / Nagios / nagiosplugins create a nagios user whose UID will collide with one of those local user IDs. I don't mind creating my own specfile now that I know what went wrong last time I installed plugins - actually I did create one as this UID thing was no option for me -, and I was sharing this experience. You could say I can create user accounts before RPMs are installed, but actually it's more complicated for me than having the RPM scripts do it ;) So, you seem to agree with me the missing "-r" option is a bug at least for redhat systems spec file, don't you ? I think this option is nice, I'm wondering if other systems might use the same, which would be great... >-----Original Message----- >From: nagios-devel-bounces at lists.sourceforge.net [mailto:nagios-devel- >bounces at lists.sourceforge.net] On Behalf Of Andreas Ericsson >Sent: Monday, January 07, 2008 2:59 PM >To: Nagios Developers List >Subject: Re: [Nagios-devel] nrpe RPM spec file bug(s) > >SCHAER Frederic wrote: >> Hmmm... by the way, the nagios.spec file has the same defect : it's >> using "groupadd" whereas it's using "useradd -r" ... if this could also >> be corrected... >> > >>From the useradd man-page on Fedora Core 7: > > -r This flag is used to create a system account. That is, a user with a > UID lower than the value of UID_MIN defined in /etc/login.defs and > whose password does not expire. 
Note that useradd will not create a > home directory for such an user, regardless of the default setting > in /etc/login.defs. You have to specify -m option if you want a home > directory for a system account to be created. This is an option > added by Red Hat > > >In short, -r is used to make useradd give the account a UID less than >500 (or 1000), so as to be easily distinguishable from ordinary >user-accounts. > >Also note that if you're using this option on anything else than a Red >Hat compatible system (such as SuSE), you need to write your own spec-file. > >-- >Andreas Ericsson andreas.ericsson at op5.se >OP5 AB www.op5.se >Tel: +46 8-230225 Fax: +46 8-230231 > > >----------------------------------------------------------------------- -- >Check out the new SourceForge.net Marketplace. >It's the best place to buy or sell services for >just about anything Open Source. >http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/market place >_______________________________________________ >Nagios-devel mailing list >Nagios-devel at lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/nagios-devel ------------------------------------------------------------------------- Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. 
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace

From dermoth at aei.ca Thu Jan 10 11:41:42 2008
From: dermoth at aei.ca (Thomas Guyot-Sionnest)
Date: Thu, 10 Jan 2008 05:41:42 -0500
Subject: Difference in CPU time with and without ePN
In-Reply-To: <4785E224.2060803@op5.se>
References: <4783A5B3.8070504@zango.com> <47848750.2040200@op5.se> <4784C2A0.8020000@aei.ca> <4784CDCC.7050101@op5.se> <4784FC92.2070103@zango.com> <4785E224.2060803@op5.se>
Message-ID: <4785F666.5020704@aei.ca>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 10/01/08 04:15 AM, Andreas Ericsson wrote:
> Thomas Guyot-Sionnest wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Well, yesterday I had to kill and restart Nagios to make changes to Perl
>> modules apply (HUP wouldn't do) and it's true that Nagios now use much
>> less memory, so indeed there seems to be leaks. However with 2GiB of RAM
>> it would take months (maybe more than a year) to fill up the server
>> memory. I don't graph real memory usage yet (used memory minus
>> cache/buffers) so I can't really tell how fast it increase, but it's for
>> sure not a big deal with 2GiB.
>>
>
> That depends on the system. Some have reported leaks of 20MiB/minute.
> 2 gigs won't last very long at that rate.

For now, what I can see is that Nagios loses track of roughly 2 MiB per
HUP. That alone can probably explain the memory usage I had over the
last months. If I look at the virtual size and RSS directly from proc
(to avoid rounding), they seem very steady, with a big jump on each HUP:

cat /proc//stat|cut -d' ' -f23-24
45871104 6566
[Wait a few minutes]
cat /proc//stat|cut -d' ' -f23-24
45871104 6566
[killall -HUP nagios]
cat /proc//stat|cut -d' ' -f23-24
46919680 6700
[wait a few minutes again]
cat /proc//stat|cut -d' ' -f23-24
46919680 6700

>> HUPs have been said to cause memory leaks as well and
>> 1. In average I probably HUP Nagios 2-3 times a week.
>> 2. Automated scripts probably do the same on config pulls from AD (The
>> HUP only happens if the config changes, but that happens quite often)
>>
>> On the solutions side, you can.
>>
>> 1. Kill and restart Nagios instead of HUP'ing it (Why don't nagios
>> execve itself on HUPs BTW?)
>
> I have no idea. It would definitely make it easier to write modules
> for it, as each new load is 100% certain to provide a clean slate.

Ethan, are you reading this?

>> 2. Monitor the Nagios process memory usage, with optionally an event
>> handler for automated restarts
>
> That feels decidedly icky though. It would be better to use the malloc
> info from glibc to internally decide if we're leaking or not, and then
> simply execve() self when the memory usage grows too large.

The event handler would definitely need to do some sanity checking, but
I think it's a safe way. Maybe this could be made into a module as well...

>> 3. Schedule automated restarts every day/week/month or so
>> 4. Add more memory
>>
>
> Adding more memory only puts off having to do one of the other
> things though. It's not a real solution.

That's true for a 20MB/minute kind of leak, but in my case (2 MiB/HUP
leaks) I would definitely have noticed it earlier if I didn't have as
much memory...

>> For me the benefits is definitely worth the downsides.
>>
>
> You're not alone. Just be aware that some systems see enormous
> leaks when embedded perl is used. I was curious to know what
> your specs were, as you would definitely have noticed if you
> were suffering from the huge kind of leaks.

Was this on Linux? Older Perl release? Was it happening on all Perl
plugins or only on some of them?
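
The before/after comparison done with cut above can also be scripted. The sketch below is an assumption-laden illustration, not something from this thread: the function names are made up, the 4096-byte page size is assumed (os.sysconf("SC_PAGE_SIZE") would give the real value), and the pid has to be supplied by the caller. It reads fields 23 and 24 of /proc/<pid>/stat, which proc(5) documents as the virtual size in bytes and the resident set size in pages.

```python
# Hypothetical helpers for comparing two samples of /proc/<pid>/stat.

def parse_stat(stat_line):
    """Return (vsize_bytes, rss_pages) from one /proc/<pid>/stat line."""
    # Field 2 (the command name) is parenthesised and may contain spaces,
    # so only the text after the last ")" is split; it starts at field 3.
    rest = stat_line.rsplit(")", 1)[1].split()
    vsize_bytes = int(rest[23 - 3])  # field 23: virtual memory size, bytes
    rss_pages = int(rest[24 - 3])    # field 24: resident set size, pages
    return vsize_bytes, rss_pages


def growth(before, after, page_size=4096):
    """Return (vsize growth, RSS growth), both in bytes, between samples."""
    return after[0] - before[0], (after[1] - before[1]) * page_size


def sample(pid):
    """Read and parse the stat line of a running process (Linux only)."""
    with open("/proc/%d/stat" % pid) as handle:
        return parse_stat(handle.read())
```

Fed with the figures quoted above, growth((45871104, 6566), (46919680, 6700)) gives 1048576 bytes of virtual size and, with 4 KiB pages, 548864 bytes of RSS for that one HUP. Splitting after the closing parenthesis rather than on every space is a small robustness choice: a plain cut miscounts fields if the process name contains a space.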
Thomas -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFHhfZm6dZ+Kt5BchYRAgStAKDAkdVEv8O54SzSz0V9wvhNsSghqwCg0I1A rrJIUAWyRHmVndX1iLxaM9U= =U2MM -----END PGP SIGNATURE----- ------------------------------------------------------------------------- Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace From ae at op5.se Thu Jan 10 13:02:08 2008 From: ae at op5.se (Andreas Ericsson) Date: Thu, 10 Jan 2008 13:02:08 +0100 Subject: nrpe RPM spec file bug(s) In-Reply-To: References: <4782302E.5020104@op5.se> Message-ID: <47860940.90500@op5.se> SCHAER Frederic wrote: > > I don't mind creating my own specfile now that I know what went wrong > last time I installed plugins - actually I did create one as this UID > thing was no option for me -, and I was sharing this experience. You > could say I can create user accounts before RPMs are installed, but > actually it's more complicated for me than having the RPM scripts do it > ;) > I believe that's the case for most large organisations that have a wide array of RPM-based systems. > So, you seem to agree with me the missing "-r" option is a bug at least > for redhat systems spec file, don't you ? Not really, no. The spec-file should be portable enough to work just about everywhere. If you need a specific UID to be set to the same for every computer you want to install the RPM on, you really need to make sure that it ends up that way on your own. -- Andreas Ericsson andreas.ericsson at op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 ------------------------------------------------------------------------- Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. 
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace

From ae at op5.se Thu Jan 10 13:07:17 2008
From: ae at op5.se (Andreas Ericsson)
Date: Thu, 10 Jan 2008 13:07:17 +0100
Subject: Difference in CPU time with and without ePN
In-Reply-To: <4785F666.5020704@aei.ca>
References: <4783A5B3.8070504@zango.com> <47848750.2040200@op5.se> <4784C2A0.8020000@aei.ca> <4784CDCC.7050101@op5.se> <4784FC92.2070103@zango.com> <4785E224.2060803@op5.se> <4785F666.5020704@aei.ca>
Message-ID: <47860A75.6090001@op5.se>

Thomas Guyot-Sionnest wrote:
>
>>> For me the benefits is definitely worth the downsides.
>>>
>> You're not alone. Just be aware that some systems see enormous
>> leaks when embedded perl is used. I was curious to know what
>> your specs were, as you would definitely have noticed if you
>> were suffering from the huge kind of leaks.
>
> Was this on Linux? Older Perl release? Was it happening on all Perl
> plugins or only on some of them?
>

It was on a plethora of different systems, with an equally large number
of different versions of just about everything. In general, I think
FreeBSD (and the other BSDs, I have no doubt) gets hit the hardest, but
all systems I've seen so far suffer at least some from leaks.

The google-link I provided you with has 273 hits. I haven't checked them
all, but I *have* read most of the reports on nagios-users@, as well as
those on this list. HW/SW configuration seems to play a role in how much
ePN leaks, not whether it leaks at all or not.

Perhaps it's the plugins you use. Could you try adding a few plugins
that are clumsily written and don't do much of anything, perhaps? It
would be best if they randomize their return value, with roughly 33%
probability of returning OK, so as to maximize the checking frequency
and also determine if return value has any impact on the leak ratio.
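
The return-value logic of such a test plugin could look roughly like the sketch below. One caveat up front: to exercise ePN at all, the plugin has to be a Perl script run by the embedded interpreter, so this Python version only illustrates the randomization and would need to be ported. The names, the status text and the choice to spread the non-OK results evenly over WARNING, CRITICAL and UNKNOWN are assumptions, not something specified in this thread.

```python
import random

# Conventional plugin exit codes and their labels.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def pick_state(rng, ok_probability=1.0 / 3.0):
    """Return 0 (OK) with the given probability, otherwise 1, 2 or 3."""
    if rng.random() < ok_probability:
        return 0
    # Assumption: the remaining results are split evenly between the
    # three non-OK states.
    return rng.choice([1, 2, 3])


def run(rng=None):
    """Print a one-line status and return the exit code to hand back."""
    rng = rng or random.Random()
    code = pick_state(rng)
    print("%s - randomly chosen test state" % STATES[code])
    return code
```

Passing the random number generator in as a parameter keeps the sketch repeatable when a seed is wanted; a real plugin would simply exit with the value returned by run().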
-- Andreas Ericsson andreas.ericsson at op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 ------------------------------------------------------------------------- Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace From frederic.schaer at cea.fr Fri Jan 11 15:06:56 2008 From: frederic.schaer at cea.fr (SCHAER Frederic) Date: Fri, 11 Jan 2008 15:06:56 +0100 Subject: nrpe RPM spec file bug(s) In-Reply-To: <47860940.90500@op5.se> References: <4782302E.5020104@op5.se> <47860940.90500@op5.se> Message-ID: Hi, I have to admit I tend to agree with you about portability... but then the specfiles are still buggy, since they already contain the -r option in some places ;) > >> So, you seem to agree with me the missing "-r" option is a bug at least >> for redhat systems spec file, don't you ? > >Not really, no. The spec-file should be portable enough to work just >about everywhere. If you need a specific UID to be set to the same for >every computer you want to install the RPM on, you really need to make >sure that it ends up that way on your own. > ------------------------------------------------------------------------- Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. 
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace

From thomas at zango.com Fri Jan 11 17:00:02 2008
From: thomas at zango.com (Thomas Guyot-Sionnest)
Date: Fri, 11 Jan 2008 11:00:02 -0500
Subject: Difference in CPU time with and without ePN
In-Reply-To: <4785E224.2060803@op5.se>
References: <4783A5B3.8070504@zango.com> <47848750.2040200@op5.se> <4784C2A0.8020000@aei.ca> <4784CDCC.7050101@op5.se> <4784FC92.2070103@zango.com> <4785E224.2060803@op5.se>
Message-ID: <47879282.4030802@zango.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andreas Ericsson wrote:
> Thomas Guyot-Sionnest wrote:
>> 1. Kill and restart Nagios instead of HUP'ing it (Why don't nagios
>> execve itself on HUPs BTW?)
>
> I have no idea. It would definitely make it easier to write modules
> for it, as each new load is 100% certain to provide a clean slate.

Eh, I just had a flashback on this. I know why: it's simply because
Nagios opens the resource file (usually readable only by root) before
dropping privileges. The only way it can restart by itself without
losing access to that file is what it does right now.

- --
Thomas
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHh5KC6dZ+Kt5BchYRAie5AJsElQEu0Tn8ijAdqU0c2+F5+Dz+MwCeMKw9
PEAianCtetkXo/lCB34Df0o=
=ImRM
-----END PGP SIGNATURE-----

-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace From td at Dunkel.de Fri Jan 11 17:31:06 2008 From: td at Dunkel.de (td at Dunkel.de) Date: Fri, 11 Jan 2008 17:31:06 +0100 Subject: Bug in Custom Object Variables (Nagios 3.0RC1) In-Reply-To: <47860A75.6090001@op5.se> References: <4783A5B3.8070504@zango.com> <47848750.2040200@op5.se> <4784C2A0.8020000@aei.ca> <4784CDCC.7050101@op5.se> <4784FC92.2070103@zango.com> <4785E224.2060803@op5.se><4785F666.5020704@aei.ca> <47860A75.6090001@op5.se> Message-ID: <897F59F90972B94BBD80EACDFD0ABB4402563D93@internal.dunkel.de> Hi List, maybe there is a bug in the custom object variables. I've defined three usermacros. (in service, host and contact) But in the Environment Variable I only can see the usermacro definition from the host. The other two are missing. Best regards, Thomas Dohl ------------------------------------------------------------------------- Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace From thomas at zango.com Fri Jan 11 18:57:53 2008 From: thomas at zango.com (Thomas Guyot-Sionnest) Date: Fri, 11 Jan 2008 12:57:53 -0500 Subject: Nagios-plugins links on Nagios CVS page (www.nagios.org) Message-ID: <4787AE21.60201@zango.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 The links and CVS info for Nagios-plugins are wrong on the following page: http://www.nagios.org/development/cvs.php > Browse The CVS Tree This is now a Subversion Tree. URL: http://nagiosplug.svn.sourceforge.net/viewvc/nagiosplug/ > Daily CVS Snapshots Although the link is still good, this should be called a Subversion snapshot. > Anonymous CVS Access Subversion access: svn co https://nagiosplug.svn.sourceforge.net/svnroot/nagiosplug nagiosplug > Automatic Notification Of CVS Commits Still the same mailing list, but Subversion commits now. 
For more details: http://sourceforge.net/svn/?group_id=29880

- --
Thomas
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHh64h6dZ+Kt5BchYRAobFAJ99wJ3Aab2nf2klKV6ehMZVPXyPgQCg2yEJ
Kx6uGhNdDkLBTOZP5MCK0Fg=
=oR0j
-----END PGP SIGNATURE-----

From thomas at zango.com Fri Jan 11 22:58:11 2008
From: thomas at zango.com (Thomas Guyot-Sionnest)
Date: Fri, 11 Jan 2008 16:58:11 -0500
Subject: Nagios v3 minor fixes
Message-ID: <4787E673.1070201@zango.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

Today I decided to get my hands dirty and try running my biggest Nagios
server's config on Nagios 3. So far the process went pretty smoothly and
I'm quite impressed with the performance (all tweaks enabled), but I
noticed the following things:

1. In the main Nagios config file:

# STATUS FILE UPDATE INTERVAL
# Combined with the aggregate_status_updates option,
# this option determines the frequency (in seconds) that
# Nagios will periodically dump program, host, and
# service status data.

- -> aggregate_status_updates doesn't exist anymore. This comment should
be changed.

2. Nagios Makefiles:

I expected Nagios wouldn't overwrite the config files if they were
present, but apparently it does (I unfortunately learned that the hard
way; thankfully only cgi.cfg and nagios.cfg got overwritten, since all
the rest have custom names from my v2 config). Any chance that it could
write the configs as "name.cfg-sample" or something like that when the
file already exists? I know it creates backups, but due to an unrelated
issue I re-installed it twice (so the backups got overwritten).
So far Nagios 3 looks awesome :) - -- Thomas -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFHh+Zz6dZ+Kt5BchYRAq7PAJ4pnRqQx1hm2bQA8EJC4/8FYv77dwCg/suR rB4WEAB60vtQzakX5c4cV/I= =ATIW -----END PGP SIGNATURE----- ------------------------------------------------------------------------- Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace From Donald.Byington at synopsys.com Sat Jan 12 00:07:09 2008 From: Donald.Byington at synopsys.com (Don Byington) Date: Fri, 11 Jan 2008 15:07:09 -0800 Subject: access object data inside of nebmodule_init() Message-ID: I'm working on a NEB module and would like to access host and service objects during the module init. This is an attempt to build a list of all hosts and associated services from the start, rather than waiting for the EB to send object information as things happen. To start with I'm using the filesystem status interface example in David Josephsen's book. I want to create all of the host directories and service files first, then update them with status when it becomes available. I've tried to use; extern host *host_list; extern service *service_list; extern char *config_file; read_all_object_data(config_file); Which will allow me to use; for (temp_host=host_list ; temp_host!=NULL ; temp_host=temp_host->next) { for (temp_service=service_list ; temp_service!=NULL ; temp_service=temp_service->next) { } } However once I do this it apparently messes up the data already read in by Nagios on start up so I get errors like; [1199996860] Done. [1199996860] Event broker module '/opt/nagios/bin/statechange_neb.o' initialized successfully. 
[1199996860] Error: Timeperiod '24x7' has already been defined [1199996860] Error: Could not register timeperiod (config file '/opt/nagios/etc/timeperiods.cfg' , starting on line 7) [1199996860] Bailing out due to one or more errors encountered in the configuration files. Run Nagios from the command line with the -v option to verify your config before restarting. (PID=32 539) [1199996860] Unloading STATECHANGE Module... [1199996860] Event broker module '/opt/nagios/bin/statechange_neb.o' deinitialized successfully. How can I get to the host_list and service_list structs without messing with the variables? Thanks, Don ------------------------------------------------------------------------- Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace From dermoth at aei.ca Sat Jan 12 06:05:56 2008 From: dermoth at aei.ca (Thomas Guyot-Sionnest) Date: Sat, 12 Jan 2008 00:05:56 -0500 Subject: Difference in CPU time with and without ePN In-Reply-To: <47879282.4030802@zango.com> References: <4783A5B3.8070504@zango.com> <47848750.2040200@op5.se> <4784C2A0.8020000@aei.ca> <4784CDCC.7050101@op5.se> <4784FC92.2070103@zango.com> <4785E224.2060803@op5.se> <47879282.4030802@zango.com> Message-ID: <47884AB4.7070702@aei.ca> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 11/01/08 11:00 AM, Thomas Guyot-Sionnest wrote: > Andreas Ericsson wrote: >> Thomas Guyot-Sionnest wrote: >>> 1. Kill and restart Nagios instead of HUP'ing it (Why don't nagios >>> execve itself on HUPs BTW?) >> I have no idea. It would definitely make it easier to write modules >> for it, as each new load is 100% certain to provide a clean slate. > > Eh, I just flashed back on this. I know why; it's simply because Nagios > opens the resource file (usually readable only by root) before dropping > privileges. 
The only way it can restart by itself without losing access
> to that file is what it does right now.

Oops, I thought it was that, but apparently that's not the case (at least
on Nagios 3); if the file is only readable by root, Nagios fails on HUPs.

Sorry for the spam (actually, for not verifying my claims...) :(

Thomas
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHiEq06dZ+Kt5BchYRAvVoAKDhFrNVEaPMheJYWrxuni61S2panACfV4w9
Lue30C6gfbhqViO+JEIFjWc=
=VWYM
-----END PGP SIGNATURE-----

From lars at linux-schulserver.de Sun Jan 13 17:15:02 2008
From: lars at linux-schulserver.de (Lars Vogdt)
Date: Sun, 13 Jan 2008 17:15:02 +0100
Subject: nrpe RPM spec file bug(s)
In-Reply-To:
References:
Message-ID:

Hi

Just to add my 2 cents to this thread:

On Fri, 11 Jan 2008 15:06:56 +0100, "SCHAER Frederic" wrote:
> I have to admit I tend to agree with you about portability... but then
> the specfiles are still buggy, since they already contain the -r option
> in some places ;)
>
>>Not really, no. The spec-file should be portable enough to work just
>>about everywhere.

What about using rpm-macros?

I know SUSE has the macro %suse_version which is set to the current SUSE
version like 1020 (openSUSE 10.2). With that, you can just make something
like this in your specfile:

%if 0%{?suse_version} > 1020
do_this
%else
do_that
%endif

If you don't need the "> 1020" just leave it out. If the macro is not
present, it will expand to "0" and so RPM jumps into the %else branch (if
you have one). I'm doing this already with some RPMs in the openSUSE
buildservice [1].
I don't know if other distributions like Fedora or Mandriva have these
macros defined on their real platforms (a look into /usr/lib/rpm/*macros*
should help), but with this you can do things like:

%if 0%{?mandriva_version}
...
%endif
%if 0%{?fedora_version}
...
%endif

This doesn't really make writing specfiles easier, but hopefully at some
point in the future we can just use one specfile for all... ;-)

Other projects have a directory "distribution" which contains specfiles
for each version and distribution, sorted into subdirectories.

Regards,
Lars

[1]: http://en.opensuse.org/Build_Service/cross_distribution_package_how_to

From dermoth at aei.ca Mon Jan 14 07:20:31 2008
From: dermoth at aei.ca (Thomas Guyot-Sionnest)
Date: Mon, 14 Jan 2008 01:20:31 -0500
Subject: Patch to properly poll() on the command pipe
Message-ID: <478AFF2F.2080804@aei.ca>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Ethan, hi list,

There's one thing that has bugged me about nagios since I came across
that code... In nagios v2, base/utils.c, there's the
service_result_worker_thread function that uses poll calls to read from
an IPC pipe, and there's the command_file_worker_thread (still present
in Nagios v3) that seems to be a cut and paste of the previous function
with polling replaced by a sleep timer because poll didn't work.

The reason poll didn't work is simply that to poll a named pipe, you
must open it RDWR, so this patch takes back the code from nagios v2
service_result_worker_thread to use polling on the command pipe.

I did it just for fun, but I'm wondering if you'd be interested in using
it for Nagios v3 (this is against CVS HEAD btw). It worked well on my
Ubuntu box with the following tests:

a.
Running Nagios with 1500 passive checks accross 100 hosts getting their result from NSCA every minute at the exact same time (1) b. Same as [a] while trying to keep the pipe full by cat'ing a file containing millions of check results for the same host/service (2) (1) A perl script (attached too) forking 100 childs syncronized with a semaphore. every 60 seconds each child opens a pipe to send_nsca and prints 15 service results. nsca was configured not to aggregate writes. (2) Running the following just before the semaphore fires off the childs (results.txt is big enough so that the command finishes after nagios is done processing the 1500 passive results): $ `cat results.txt >>var/rw/nagios.cmd` I guess this patch would help keeping the command pipe empty on huge distributed monitoring systems. As I said it was for fun, do whatever you want with it ;) Thomas -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFHiv8u6dZ+Kt5BchYRAkCGAKCA1muzYCQo31cA1ZI/RPQQyQBidwCgv6K6 2Ft31txzgmay0kCnF/EQ1so= =d5/z -----END PGP SIGNATURE----- -------------- next part -------------- A non-text attachment was scrubbed... Name: use_poll_on_cmd_pipe.patch Type: text/x-diff Size: 2796 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: nsca_test.pl Type: application/x-perl Size: 2254 bytes Desc: not available URL: -------------- next part -------------- ------------------------------------------------------------------------- Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. 
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
-------------- next part --------------
_______________________________________________
Nagios-devel mailing list
Nagios-devel at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-devel

From hs4233 at mail.mn-solutions.de Mon Jan 14 13:01:42 2008
From: hs4233 at mail.mn-solutions.de (Holger Schurig)
Date: Mon, 14 Jan 2008 13:01:42 +0100
Subject: is there a problem with Nagios Devel + Kernel 2.4?
Message-ID: <200801141301.42931.hs4233@mail.mn-solutions.de>

I'm running Nagios CVS HEAD on 2.4.24-1-386 (a Debian kernel). For now,
I have only enabled one service (ping) to the localhost. Unfortunately,
this check stays forever in the "pending" state.

So I turned on debugging and did some "ps" to dig around. What I found
so far is:

- files get created in /usr/src/nagios/dist/var/spool/checkresults.
  So far I have 6 files there.

- the normal nagios log says

  [1200311267] Warning: The check of service 'PING' on host 'lin01' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
  [1200311507] Warning: The check of host 'lin01' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the host...

- Something doesn't work with signals. The longer I run nagios, the
  more nagios processes hang in rt_sigsuspend. Right now I have these:

  # ps -eo pid,comm,wchan|grep nagios
  21307 nagios nanosleep
  21308 nagios poll
  21309 nagios select
  21310 nagios rt_sigsuspend
  21378 nagios rt_sigsuspend
  21411 nagios rt_sigsuspend
  21505 nagios rt_sigsuspend
  21574 nagios rt_sigsuspend
  21611 nagios rt_sigsuspend
  21649 nagios rt_sigsuspend

Right now, upgrading this server to Kernel 2.6.xxx isn't an option
(because of the other services running on this box, which I don't want
to interrupt).

Any ideas?

From td at Dunkel.de Mon Jan 14 15:38:07 2008
From: td at Dunkel.de (td at Dunkel.de)
Date: Mon, 14 Jan 2008 15:38:07 +0100
Subject: Bug in Custom Object Variables (Nagios 3.0RC1) (patch)
Message-ID: <897F59F90972B94BBD80EACDFD0ABB4402563DF3@internal.dunkel.de>

Hi,

I think I've found the bug.
With this patch it works fine.

Please check whether everything is OK with this patch. :-)

Best regards,
Thomas Dohl

-start---------------------------------------------------------------------------------------------------
--- common.old/macros.c 2007-12-14 18:44:31.000000000 +0100
+++ common/macros.c 2008-01-14 15:13:17.000000000 +0100
@@ -3169,12 +3169,12 @@
         set_macro_environment_var(temp_customvariablesmember->variable_name,clean_macro_chars(temp_customvariablesmember->variable_value,STRIP_ILLEGAL_MACRO_CHARS|ESCAPE_MACRO_CHARS),set);
     }
 
-    /***** CUSTOM HOST VARIABLES *****/
+    /***** CUSTOM SERVICE VARIABLES *****/
     /* generate variables and save them for later */
     if((temp_service=macro_service_ptr) && set==TRUE){
         for(temp_customvariablesmember=temp_service->custom_variables;temp_customvariablesmember!=NULL;temp_customvariablesmember=temp_customvariablesmember->next){
             asprintf(&customvarname,"_SERVICE%s",temp_customvariablesmember->variable_name);
-            add_custom_variable_to_object(&macro_custom_host_vars,customvarname,temp_customvariablesmember->variable_value);
+            add_custom_variable_to_object(&macro_custom_service_vars,customvarname,temp_customvariablesmember->variable_value);
             my_free(customvarname);
         }
     }
@@ -3187,7 +3187,7 @@
     if((temp_contact=macro_contact_ptr) && set==TRUE){
         for(temp_customvariablesmember=temp_contact->custom_variables;temp_customvariablesmember!=NULL;temp_customvariablesmember=temp_customvariablesmember->next){
             asprintf(&customvarname,"_CONTACT%s",temp_customvariablesmember->variable_name);
-            add_custom_variable_to_object(&macro_custom_host_vars,customvarname,temp_customvariablesmember->variable_value);
+            add_custom_variable_to_object(&macro_custom_contact_vars,customvarname,temp_customvariablesmember->variable_value);
             my_free(customvarname);
         }
     }
-END---------------------------------------------------------------------------------------------------

From step at tdc.dk Mon Jan 14 17:24:49 2008
From: step at tdc.dk (Steffen Poulsen)
Date: Mon, 14 Jan 2008 17:24:49 +0100
Subject: Notification slow on Solaris - 27+ seconds spent in "set_all_macro_environment_vars(TRUE);"?
References:
Message-ID:

Re: Notification slow on Solaris - 27+ seconds spent in "set_all_macro_environment_vars(TRUE);"?

Following up on the previous debugging, in rc1 we have inserted a few more debug statements in base/utils.c:

> log_debug_info(DEBUGL_FUNCTIONS,0,"setting env variables ...\n");

/* set environment variables */
set_all_macro_environment_vars(TRUE);

> log_debug_info(DEBUGL_FUNCTIONS,0,"closing command file ...\n");

And we got this:

[1200325554.988586] [001.0] [pid=11086] setting env variables ...
[1200325555.111295] [001.0] [pid=11086] process_macros()
[1200325555.111602] [001.0] [pid=11086] process_macros()
[1200325555.111661] [2048.1] [pid=11086] **** BEGIN MACRO PROCESSING ***********
[1200325555.111702] [2048.1] [pid=11086] Processing: 'xxx...'
[1200325555.111753] [2048.1] [pid=11086] Done. Final output: 'xxx...'
[1200325555.111800] [2048.1] [pid=11086] **** END MACRO PROCESSING ************* [1200325555.112067] [001.0] [pid=11086] process_macros() [1200325555.112192] [001.0] [pid=11086] process_macros() [1200325555.112301] [001.0] [pid=11086] process_macros() [1200325555.112410] [001.0] [pid=11086] process_macros() [1200325555.119389] [001.0] [pid=11086] process_macros() [1200325555.119520] [001.0] [pid=11086] process_macros() [1200325555.119629] [001.0] [pid=11086] process_macros() [1200325555.119737] [001.0] [pid=11086] process_macros() [1200325555.119847] [001.0] [pid=11086] process_macros() [1200325555.119956] [001.0] [pid=11086] process_macros() [1200325582.539891] [001.0] [pid=11086] closing command file ... What can we do to lower this pretty large amount of time spend on setting the macro evn vars? We're already running with use_large_installation_tweaks=1, but perhaps there are other helpful options touching the logic going on in set_all_macro_environment_vars? We're at 3.0rc1, Solaris/SPARC, T1000. Best regards, Steffen Poulsen -----Original Message----- From: nagios-users-bounces at lists.sourceforge.net on behalf of Steffen Poulsen Sent: Mon 14-01-2008 10:57 To: Nagios Users Mailinglist Subject: Re: [Nagios-users] Notification slow on Solaris? / Debug: Executiontime=27.553 sec? > There is no problems involved in running the plugins / > notifications at the command line, they execute in no time. > > The delay we can see is from nagios stating in the nagios.log > that is going to send the alert and to the bash script / > check is actually executed. > > So the delay probably has to do with the way nagios spawns > serialized commands (as we see it?). 
Exerpt from debug log with (-1 / all): [1200304088.239603] [001.0] [pid=4899] my_system() [1200304088.239642] [256.1] [pid=4899] Running command '/usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: RECOVERY\n\nService: check_latency\nHost: xxx.xxx.dk\nAddress: 87.48.144.101\nState: OK\n\nDate/Time: Mon Jan 14 10:48:08 MET 2008\n\nAdditional Info:\n\nOK - Service Latency: 0.23 sec" | /usr/bin/mailx -s "** RECOVERY Service Alert: nagsrv000.tele.dk/check_latency is OK **" xxx at xxx.dk' ... [1200304088.239793] [064.1] [pid=4899] Making callbacks (type 10)... [1200304088.320955] [001.0] [pid=5960] process_macros() [1200304088.321398] [001.0] [pid=5960] process_macros() [1200304088.321449] [2048.1] [pid=5960] **** BEGIN MACRO PROCESSING *********** [1200304088.321486] [2048.1] [pid=5960] Processing: 'xxxx...' [1200304088.321683] [2048.1] [pid=5960] Done. Final output: 'xxxx...' [1200304088.321732] [2048.1] [pid=5960] **** END MACRO PROCESSING ************* [1200304088.321824] [001.0] [pid=5960] process_macros() [1200304088.321918] [001.0] [pid=5960] process_macros() [1200304088.322010] [001.0] [pid=5960] process_macros() [1200304088.322101] [001.0] [pid=5960] process_macros() [1200304088.329595] [001.0] [pid=5960] process_macros() [1200304088.329713] [001.0] [pid=5960] process_macros() [1200304088.329821] [001.0] [pid=5960] process_macros() [1200304088.329927] [001.0] [pid=5960] process_macros() [1200304088.330035] [001.0] [pid=5960] process_macros() [1200304088.330143] [001.0] [pid=5960] process_macros() [1200304115.792892] [256.1] [pid=4899] Execution time=27.553 sec The script appears to be called at the very late end of the "execution time" stated, the script itself is executed in (much) less than 1 sec at the command line. Still: Solaris/SPARC, T1000, no noticable load (~1). // Steffen Poulsen ------------------------------------------------------------------------- Check out the new SourceForge.net Marketplace. 
It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace _______________________________________________ Nagios-users mailing list Nagios-users at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. ::: Messages without supporting info will risk being sent to /dev/null ------------------------------------------------------------------------- Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace _______________________________________________ Nagios-users mailing list Nagios-users at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. ::: Messages without supporting info will risk being sent to /dev/null From thomas at zango.com Mon Jan 14 22:00:11 2008 From: thomas at zango.com (Thomas Guyot-Sionnest) Date: Mon, 14 Jan 2008 16:00:11 -0500 Subject: Notification slow on Solaris - 27+ seconds spent in "set_all_macro_environment_vars(TRUE); "? In-Reply-To: References: Message-ID: <478BCD5B.3030808@zango.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Steffen Poulsen wrote: > Re: Notification slow on Solaris - 27+ seconds spent in "set_all_macro_environment_vars(TRUE);"? > > Following up on previous debug, in rc1 we have inserted a few more debug statements in base/utils.c: > >> log_debug_info(DEBUGL_FUNCTIONS,0,"setting env variables ...\n"); > > /* set environment variables */ > set_all_macro_environment_vars(TRUE); > >> log_debug_info(DEBUGL_FUNCTIONS,0,"closing command file ...\n"); > > And we got this: > > [1200325554.988586] [001.0] [pid=11086] setting env variables ... 
> [1200325555.111295] [001.0] [pid=11086] process_macros() > [1200325555.111602] [001.0] [pid=11086] process_macros() > [1200325555.111661] [2048.1] [pid=11086] **** BEGIN MACRO PROCESSING *********** > [1200325555.111702] [2048.1] [pid=11086] Processing: 'xxx...' > [1200325555.111753] [2048.1] [pid=11086] Done. Final output: 'xxx...' > [1200325555.111800] [2048.1] [pid=11086] **** END MACRO PROCESSING ************* > [1200325555.112067] [001.0] [pid=11086] process_macros() > [1200325555.112192] [001.0] [pid=11086] process_macros() > [1200325555.112301] [001.0] [pid=11086] process_macros() > [1200325555.112410] [001.0] [pid=11086] process_macros() > [1200325555.119389] [001.0] [pid=11086] process_macros() > [1200325555.119520] [001.0] [pid=11086] process_macros() > [1200325555.119629] [001.0] [pid=11086] process_macros() > [1200325555.119737] [001.0] [pid=11086] process_macros() > [1200325555.119847] [001.0] [pid=11086] process_macros() > [1200325555.119956] [001.0] [pid=11086] process_macros() > [1200325582.539891] [001.0] [pid=11086] closing command file ... > > What can we do to lower this pretty large amount of time spend on setting the macro evn vars? If you can avoid macros in environment variables (i.e. putting everything on the command line in your commands) try using: enable_environment_macros=0 in nagios.cfg This should be much faster. - -- Thomas -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFHi81b6dZ+Kt5BchYRAkr3AJ95j1KOHbEf5wuXpn0U9hbqzET13wCdF80B W5HQPN/CMChTi4X79tJidWo= =TxN/ -----END PGP SIGNATURE----- ------------------------------------------------------------------------- Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. 
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace From forums at emat.be Mon Jan 14 23:57:30 2008 From: forums at emat.be (js) Date: Mon, 14 Jan 2008 23:57:30 +0100 Subject: Service & Host dependencies: feature request? Message-ID: <478BE8DA.6070103@emat.be> An HTML attachment was scrubbed... URL: -------------- next part -------------- ------------------------------------------------------------------------- Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace -------------- next part -------------- _______________________________________________ Nagios-devel mailing list Nagios-devel at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-devel From step at tdc.dk Tue Jan 15 10:39:18 2008 From: step at tdc.dk (Steffen Poulsen) Date: Tue, 15 Jan 2008 10:39:18 +0100 Subject: Notification slow on Solaris - 27+ seconds spent in "set_all_macro_environment_vars(TRUE); "? In-Reply-To: <478BCD5B.3030808@zango.com> References: <478BCD5B.3030808@zango.com> Message-ID: > > What can we do to lower this pretty large amount of time > spend on setting the macro evn vars? > > If you can avoid macros in environment variables (i.e. > putting everything on the command line in your commands) try using: > enable_environment_macros=0 > in nagios.cfg > > This should be much faster. This made all difference in the world - we're at split seconds for notification now. Setup size is around ~8k services / ~1k hosts. - Thanks for helping us out :-) // Steffen ------------------------------------------------------------------------- Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. 
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace From igor-v-k at yandex.ru Tue Jan 15 10:47:55 2008 From: igor-v-k at yandex.ru (Igor Khristophorov) Date: Tue, 15 Jan 2008 12:47:55 +0300 Subject: Nagios Tactical Overview - need to alarm only on hard states Message-ID: <20080115094755.GC3936@libra.onm.mtt.ru> Hi, I need to have audible signaling and show red critical errors and warnings only on hard states in Tactical CGI. I am willing to make a patch for this. Could you please recommend me on how to achieve this (where and how to do checks for this)? I would like to have soft states displayed too, though with a lighter color and special notice (as separate line with text "in soft critical"), and to not produce audio alarms in the web interface. Suggestions are welcome. Regards, Igor Khristophorov ------------------------------------------------------------------------- Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace From nagios at nagios.org Tue Jan 15 19:10:19 2008 From: nagios at nagios.org (Ethan Galstad) Date: Tue, 15 Jan 2008 12:10:19 -0600 Subject: Bug in Custom Object Variables (Nagios 3.0RC1) (patch) In-Reply-To: <897F59F90972B94BBD80EACDFD0ABB4402563DF3@internal.dunkel.de> References: <897F59F90972B94BBD80EACDFD0ABB4402563DF3@internal.dunkel.de> Message-ID: <478CF70B.7030205@nagios.org> Good work Thomas! This will be in CVS shortly. td at Dunkel.de wrote: > Hi, > > I think I've found the bug. > With this patch it works fine. > > Please check, if all is ok with this patch. 
:-)
>
> Best regards,
> Thomas Dohl
>
> -start---------------------------------------------------------------------------------------------------
> --- common.old/macros.c 2007-12-14 18:44:31.000000000 +0100
> +++ common/macros.c 2008-01-14 15:13:17.000000000 +0100
> @@ -3169,12 +3169,12 @@
>          set_macro_environment_var(temp_customvariablesmember->variable_name,clean_macro_chars(temp_customvariablesmember->variable_value,STRIP_ILLEGAL_MACRO_CHARS|ESCAPE_MACRO_CHARS),set);
>      }
>
> -    /***** CUSTOM HOST VARIABLES *****/
> +    /***** CUSTOM SERVICE VARIABLES *****/
>      /* generate variables and save them for later */
>      if((temp_service=macro_service_ptr) && set==TRUE){
>          for(temp_customvariablesmember=temp_service->custom_variables;temp_customvariablesmember!=NULL;temp_customvariablesmember=temp_customvariablesmember->next){
>              asprintf(&customvarname,"_SERVICE%s",temp_customvariablesmember->variable_name);
> -            add_custom_variable_to_object(&macro_custom_host_vars,customvarname,temp_customvariablesmember->variable_value);
> +            add_custom_variable_to_object(&macro_custom_service_vars,customvarname,temp_customvariablesmember->variable_value);
>              my_free(customvarname);
>          }
>      }
> @@ -3187,7 +3187,7 @@
>      if((temp_contact=macro_contact_ptr) && set==TRUE){
>          for(temp_customvariablesmember=temp_contact->custom_variables;temp_customvariablesmember!=NULL;temp_customvariablesmember=temp_customvariablesmember->next){
>              asprintf(&customvarname,"_CONTACT%s",temp_customvariablesmember->variable_name);
> -            add_custom_variable_to_object(&macro_custom_host_vars,customvarname,temp_customvariablesmember->variable_value);
> +            add_custom_variable_to_object(&macro_custom_contact_vars,customvarname,temp_customvariablesmember->variable_value);
>              my_free(customvarname);
>          }
>      }
> -END---------------------------------------------------------------------------------------------------
>
>
> -------------------------------------------------------------------------
> Check out the new SourceForge.net Marketplace.
> It's the best place to buy or sell services for
> just about anything Open Source.
> http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
> _______________________________________________
> Nagios-devel mailing list
> Nagios-devel at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-devel

--
Ethan Galstad
Nagios Developer
___
Email: nagios at nagios.org
Web: www.nagios.org

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/

From thomas at zango.com Tue Jan 15 20:27:46 2008
From: thomas at zango.com (Thomas Guyot-Sionnest)
Date: Tue, 15 Jan 2008 14:27:46 -0500
Subject: Nagios v3 minor fixes
In-Reply-To: <4787E673.1070201@zango.com>
References: <4787E673.1070201@zango.com>
Message-ID: <478D0932.1080904@zango.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Thomas Guyot-Sionnest wrote:
> Hi,
>
> Today I decided to get my hands dirty and try my biggest Nagios server's
> config running on Nagios 3. So far the process went pretty smoothly and
> I'm quite impressed about the performance (All tweaks enabled), but I
> noticed the following things:
>
> [...]
>

Adding some more:

In http://nagios.sourceforge.net/docs/3_0/distributed.html, "Freshness
Checking" section, you should use check_dummy rather than some custom
check script. Not only will it be much faster, it's also easier as you
can set the arguments directly in Nagios:

define command{
    command_name service-is-stale
    command_line /usr/local/nagios/libexec/check_dummy 3 "UNKNOWN: Service results are stale"
}

Although I personally do "check_dummy $ARG1$ "$ARG2$"" and set the
arguments from the check command; that way I have control over what
should be returned (WARNING, CRITICAL or UNKNOWN).

V2 can be modified as well...
- -----

If I create a precache object file (nagios -pv) and compare it to the
objects.cache (same config, created with "nagios -d"), they match except
for the date in the commented header. However, if I activate it
(nagios -xud) the objects.cache gets all reversed and objects are shown
in reverse alphabetical order in the CGI. I won't have much use for that
yet, but I tried it and those are my results (I guess that's not
expected)...

- --
Thomas
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHjQky6dZ+Kt5BchYRAp09AKCgqCclvXOA5fzQm1F/S4kFjzZW0gCgqxSz
6UvsZv7dojDXXqEcL0aoIPw=
=8APq
-----END PGP SIGNATURE-----

From nagios at nagios.org Tue Jan 15 21:49:50 2008
From: nagios at nagios.org (Ethan Galstad)
Date: Tue, 15 Jan 2008 14:49:50 -0600
Subject: Patch to properly poll() on the command pipe
In-Reply-To: <478AFF2F.2080804@aei.ca>
References: <478AFF2F.2080804@aei.ca>
Message-ID: <478D1C6E.1090506@nagios.org>

Thanks Thomas - This patch will be in CVS shortly.

Thomas Guyot-Sionnest wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi Ethan, hi list,
>
> There's one thing that trickle me with nagios since I came across that
> code... In nagios v2, base/utils.c, there's the
> service_result_worker_thread function that uses poll calls to read from
> an ipc pipe, and there's the command_file_worker_thread (still present
> in Nagios v3) that seems to be a cut and paste of the previous function
> with polling replaced by a sleep timer because poll didn't work.
>
> The reason poll didn't work is simply because to poll a named pipe, you
> must open it RDWR, so this patch takes back the code from nagios v2
> service_result_worker_thread to use polling on the command pipe.
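
[Note: a minimal C sketch of the RDWR point, not the patch itself. The assumption is Linux behaviour as far as I know it: a FIFO opened read-only is reported as hung up once the last writer closes, so poll() returns immediately and the loop spins, whereas opening it O_RDWR keeps one writer (the reader itself) attached, so poll() simply waits for data. POSIX leaves O_RDWR on a FIFO undefined. The function names are made up for illustration.]

```c
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Open the FIFO read-write and non-blocking. Because this process then
 * holds a write end itself, external writers coming and going never
 * leave the pipe in a "no writers" state. */
static int open_command_fifo(const char *path)
{
    return open(path, O_RDWR | O_NONBLOCK);
}

/* Wait up to timeout_ms for data on fd.
 * Returns 1 if readable, 0 on timeout, -1 on error
 * (EINTR is treated as an error in this sketch). */
static int wait_for_command(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;
    if (rc == 0)
        return 0;
    return (pfd.revents & POLLIN) ? 1 : -1;
}
```

A worker loop would call wait_for_command() with a modest timeout, read and process whatever is available when it returns 1, and use the timeout case to check for shutdown, so no fixed sleep is needed.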
>
> I did it just for fun, but I'm wondering if you'd be interested in using
> is for Nagios v3 (this is against CVS HEAD btw). It worked well on my
> Ubuntu box with the following tests:
>
> a. Running Nagios with 1500 passive checks accross 100 hosts getting
> their result from NSCA every minute at the exact same time (1)
>
> b. Same as [a] while trying to keep the pipe full by cat'ing a file
> containing millions of check results for the same host/service (2)
>
> (1) A perl script (attached too) forking 100 childs syncronized with a
> semaphore. every 60 seconds each child opens a pipe to send_nsca and
> prints 15 service results. nsca was configured not to aggregate writes.
>
> (2) Running the following just before the semaphore fires off the childs
> (results.txt is big enough so that the command finishes after nagios is
> done processing the 1500 passive results):
> $ `cat results.txt >>var/rw/nagios.cmd`
>
> I guess this patch would help keeping the command pipe empty on huge
> distributed monitoring systems. As I said it was for fun, do whatever
> you want with it ;)
>
>
> Thomas

Ethan Galstad
Nagios Developer
___
Email: nagios at nagios.org
Web: www.nagios.org

From nagios at nagios.org Tue Jan 15 22:18:09 2008
From: nagios at nagios.org (Ethan Galstad)
Date: Tue, 15 Jan 2008 15:18:09 -0600
Subject: [PATCH] send_nsca segfault on timeout
In-Reply-To: <20080109195209.GN28313306@CIS.FU-Berlin.DE>
References: <20080109195209.GN28313306@CIS.FU-Berlin.DE>
Message-ID: <478D2311.707@nagios.org>

Thanks for the patch Holger - this will be in CVS shortly.

Holger Weiss wrote:
> If send_nsca runs into the timeout while it's in encrypt_init(), it can
> segfault because encrypt_cleanup() (which is called from the signal
> handler) calls mcrypt_generic_end() although mcrypt_generic_init()
> wasn't done yet. This happens every now and then for us, for some
> reason send_nsca sometimes timeouts while it's in mcrypt_generic_init().
> The attached patch checks whether mcrypt_generic_init() was called
> before calling mcrypt_generic_end().
>
> Holger
>
>

Ethan Galstad
Nagios Developer
___
Email: nagios at nagios.org
Web: www.nagios.org

From nagios at nagios.org Tue Jan 15 23:24:47 2008
From: nagios at nagios.org (Ethan Galstad)
Date: Tue, 15 Jan 2008 16:24:47 -0600
Subject: NSCA error with --single and aggregate writes enabled
In-Reply-To: <4783C573.5000701@zango.com>
References: <21E3E5FF-1C15-44C1-8BFA-01C286F3DC49@altinity.com> <4783C573.5000701@zango.com>
Message-ID: <478D32AF.4040008@nagios.org>

Thanks - patch will be in CVS soon.

Thomas Guyot-Sionnest wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Ton Voon wrote:
>> Hi Ethan,
>>
>> We've found a problem with NSCA when aggregate writes are enabled and
>> NSCA starts writing to the command file before it gets created.
>> Details here: http://altinity.blogs.com/dotorg/2008/01/nscas-aggregate.html
>
> Alternatively, Nagios could leave the pipe here and have nsca open the
> file in non-blocking read-write - it will then be able to fill up the
> pipe and Nagios will get the queued commands when it starts up. On write
> failures (pipe full) the dump file could be used. This would be the
> preferred method to avoid data loss.
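
[Note: a small C sketch of the "write non-blocking, fall back to the dump file when the pipe is full" idea. It is not NSCA code: send_or_spool and the dump path are hypothetical, and it assumes each command line is no longer than PIPE_BUF, so a non-blocking write to the pipe either succeeds completely or fails with EAGAIN.]

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Try to queue one command line on an already-open, non-blocking pipe.
 * If the pipe is full (EAGAIN/EWOULDBLOCK), append the line to a dump
 * file instead so it is not lost.
 * Returns 0 if written to the pipe, 1 if spooled, -1 on any other error. */
static int send_or_spool(int pipe_fd, const char *dump_path,
                         const char *line, size_t len)
{
    ssize_t n = write(pipe_fd, line, len);
    int dump_fd;

    if (n == (ssize_t)len)
        return 0;
    if (n >= 0 || (errno != EAGAIN && errno != EWOULDBLOCK))
        return -1;  /* short write or unexpected error: give up in this sketch */

    dump_fd = open(dump_path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (dump_fd < 0)
        return -1;
    n = write(dump_fd, line, len);
    close(dump_fd);
    return (n == (ssize_t)len) ? 1 : -1;
}
```

The spooled lines would still have to be replayed into the pipe later, in order, which this sketch leaves out.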
>
> That's pretty much the same as host/service_perfdata_file_mode=p in
> Nagios 3 with a daemon like OCP_Daemon to get the data, but the other
> way round.
>
> - --
> Thomas

Ethan Galstad
Nagios Developer
___
Email: nagios at nagios.org
Web: www.nagios.org

From nagios at nagios.org Tue Jan 15 23:39:16 2008
From: nagios at nagios.org (Ethan Galstad)
Date: Tue, 15 Jan 2008 16:39:16 -0600
Subject: Nagios 3.0rc1 segfaulting
In-Reply-To: <1199376104.2599.4.camel@homer.ob.libexec.de>
References: <1199376104.2599.4.camel@homer.ob.libexec.de>
Message-ID: <478D3614.6060808@nagios.org>

Hmmm... I took a look at the code and didn't see any obvious reason why
it would be segfaulting. I'm doubtful that readdir() is the real cause
of the segfault, as the backtrace suggests.

Tobias Scherbaum wrote:
> Hi,
>
> I'm experiencing a segfault w/ Nagios 3.0rc1 in
> process_check_result_queue(), it dies just a few seconds after starting
> up the Nagios daemon. Backtrace attached.
>
> tia,
> Tobias
>
>
> (gdb) r
> Starting program: /usr/sbin/nagios3 /etc/nagios3/nagios.cfg
> Failed to read a valid object file image from memory.
> [Thread debugging using libthread_db enabled]
> [New Thread 1077336608 (LWP 3529)]
>
> Nagios 3.0rc1
> Copyright (c) 1999-2007 Ethan Galstad (http://www.nagios.org)
> Last Modified: 12-17-2007
> License: GPL
>
> Nagios 3.0rc1 starting... (PID=3529)
> Local time is Thu Jan 03 16:01:09 CET 2008
> [New Thread 1081805744 (LWP 3533)]
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 1077336608 (LWP 3529)]
> 0x0808450c in process_check_result_queue (dirname=0x80d4078
> "/var/spool/nagios3/checkresults") at utils.c:2244
> 2244 utils.c: No such file or directory.
> in utils.c
> (gdb) bt
> #0 0x0808450c in process_check_result_queue (dirname=0x80d4078
> "/var/spool/nagios3/checkresults") at utils.c:2244
> #1 0x08062749 in reap_check_results () at checks.c:149
> #2 0x08070227 in handle_timed_event (event=0x80f1c20) at events.c:1235
> #3 0x080708a8 in event_execution_loop () at events.c:941
> #4 0x08058a29 in main (argc=Cannot access memory at address 0xf0
> ) at nagios.c:795
> (gdb) kill
> Kill the program being debugged? (y or n) y
> (gdb) q
>

Ethan Galstad
Nagios Developer
___
Email: nagios at nagios.org
Web: www.nagios.org

From webknowledge at gmail.com Wed Jan 16 00:10:13 2008
From: webknowledge at gmail.com (Marcel)
Date: Tue, 15 Jan 2008 21:10:13 -0200
Subject: Nagios Tactical Overview - need to alarm only on hard states
In-Reply-To: <20080115094755.GC3936@libra.onm.mtt.ru>
References: <20080115094755.GC3936@libra.onm.mtt.ru>
Message-ID: <2dfcbd1b0801151510l18ac5873o412ad75af4854bdc@mail.gmail.com>

We solved an issue like yours by writing a PHP script that connects to
NDO and reports back only the hard states, though without sound.

On Jan 15, 2008 7:47 AM, Igor Khristophorov wrote:

> Hi,
>
> I need to have audible signaling and show red critical errors and warnings only
> on hard states in Tactical CGI.
>
> I am willing to make a patch for this. Could you please recommend me on how to
> achieve this (where and how to do checks for this)?
>
> I would like to have soft states displayed too, though with a lighter color and
> special notice (as separate line with text "in soft critical"), and to not
> produce audio alarms in the web interface.
>
> Suggestions are welcome.
>
> Regards,
> Igor Khristophorov
>

From mark.eisenblaetter at gmail.com Wed Jan 16 09:00:37 2008
From: mark.eisenblaetter at gmail.com (Mark Eisenblaetter)
Date: Wed, 16 Jan 2008 09:00:37 +0100
Subject: Template befor Definition
In-Reply-To: <4762A66D.2040107@op5.se>
References: <288245630712140621u93d6113q643b51e4194cde39@mail.gmail.com> <4762A66D.2040107@op5.se>
Message-ID: <288245630801160000j3f40be4q7fd1e748ebba782d@mail.gmail.com>

Hi,

sorry for the late reply.

Then I will have too many templates, because I will need a new template
for every combination of args.

For example, for check_http I will need a template for every
website/service I want to check, or for every disk with different
warning and critical thresholds.
That's not very practical.

Mark

On Dec 14, 2007 4:51 PM, Andreas Ericsson wrote:

> Mark Eisenblaetter wrote:
> > Hi,
> >
> >
> > for my distributed monitoring it would be great if i where possible to
> > say:
> > Use the Template and not the option set in the definition.
> >
> > My problem is that when i have an active check on an distributed server it
> > is passiv on the master.
> > I control this beavior with the template system (diffrend templates on
> > maser/slave)
> >
> > But i if use check_freshness on the master it will use the check_command and
> > f.e pings the Host which could get me the wrong results
> > if for example the ip address on both location pingable but differend server
> > i will get the wron result.
> >
> > so it would be great if i can say in the template use the check_command
> > (check_dummy) defined in the template and not that defined in the Host.
> >
>
> That's what not setting anything in the object itself is for. If you don't
> have a check_command in the host object, it will use the one from the
> template.
>
> --
> Andreas Ericsson andreas.ericsson at op5.se
> OP5 AB www.op5.se
> Tel: +46 8-230225 Fax: +46 8-230231
>

--
Mark Eisenblätter
Geissendoerfer & Leschinsky GmbH
www.gl-sytemhaus.de

From ae at op5.se Wed Jan 16 10:07:23 2008
From: ae at op5.se (Andreas Ericsson)
Date: Wed, 16 Jan 2008 10:07:23 +0100
Subject: Template befor Definition
In-Reply-To: <288245630801160000j3f40be4q7fd1e748ebba782d@mail.gmail.com>
References: <288245630712140621u93d6113q643b51e4194cde39@mail.gmail.com> <4762A66D.2040107@op5.se> <288245630801160000j3f40be4q7fd1e748ebba782d@mail.gmail.com>
Message-ID: <478DC94B.6040808@op5.se>

Please don't top-post. It makes it hard to follow the discussion,
especially when you're top-posting to an answer that wasn't top-posted.
Anyways...

Mark Eisenblaetter wrote:
> Hi,
> sorry for the latye replay,
>
> Then I will have to much templates, because i will need for every
> kombination of args a new template.
>
> For example by check_http I will need a template vor every website/service i
> want to check or for every disk with diffrend warning and Critival
> threshosld.
>
> Thats not so practicable.
>

Originally you wrote:
>>> so it would be great if i can say in the template use the check_command
>>> (check_dummy) defined in the template and not that defined in the Host.
>>>

To which I replied:
>> That's what not setting anything in the object itself is for. If you don't
>> have a check_command in the host object, it will use the one from the
>> template.
>>

In other words, if you want a particular object to inherit the value from
the template, simply don't set that value in the object. There was no
other question-like statement in your original mail.

--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

From mark.eisenblaetter at gmail.com Wed Jan 16 10:38:41 2008
From: mark.eisenblaetter at gmail.com (Mark Eisenblaetter)
Date: Wed, 16 Jan 2008 10:38:41 +0100
Subject: Template befor Definition
In-Reply-To: <478DC94B.6040808@op5.se>
References: <288245630712140621u93d6113q643b51e4194cde39@mail.gmail.com> <4762A66D.2040107@op5.se> <288245630801160000j3f40be4q7fd1e748ebba782d@mail.gmail.com> <478DC94B.6040808@op5.se>
Message-ID: <288245630801160138y437578ddp23f16f6720efe6d1@mail.gmail.com>

Hi,

On Jan 16, 2008 10:07 AM, Andreas Ericsson wrote:

> Please don't top-post. It makes it hard to follow the discussion,
> especially when you're top-posting to an answer that wasn't top-
> posted. Anyways...
>
> Mark Eisenblaetter wrote:
> > Hi,
> > sorry for the latye replay,
> >
> > Then I will have to much templates, because i will need for every
> > kombination of args a new template.
> >
> > For example by check_http I will need a template vor every website/service i
> > want to check or for every disk with diffrend warning and Critival
> > threshosld.
> >
> > Thats not so practicable.
> >
>
> Originally you wrote:
> >>> so it would be great if i can say in the template use the check_command
> >>> (check_dummy) defined in the template and not that defined in the Host.
> >>>
>
> To which I replied:
> >> That's what not setting anything in the object itself is for. If you don't
> >> have a check_command in the host object, it will use the one from the
> >> template.
> >>
>
> In other words, if you want a particular object to inherit the value from
> the template, simply don't set that value in the object.
> There was no other question-like statement in your original mail.
>

OK, then my first mail was not as clear as I hoped. I was thinking of
that as a new feature, to minimize the template work.

--
Mark Eisenblätter
Geissendoerfer & Leschinsky GmbH
www.gl-sytemhaus.de

From s.a.johansen at usit.uio.no Wed Jan 16 11:26:41 2008
From: s.a.johansen at usit.uio.no (Ståle Askerød Johansen)
Date: Wed, 16 Jan 2008 11:26:41 +0100
Subject: Nagios Tactical Overview - need to alarm only on hard states
In-Reply-To: <20080115094755.GC3936@libra.onm.mtt.ru>
References: <20080115094755.GC3936@libra.onm.mtt.ru>
Message-ID: <478DDBE1.2020908@usit.uio.no>

Igor Khristophorov wrote:
> Hi,
>
> I need to have audible signaling and show red critical errors and warnings only
> on hard states in Tactical CGI.
>
> I am willing to make a patch for this. Could you please recommend me on how to
> achieve this (where and how to do checks for this)?
>
> I would like to have soft states displayed too, though with a lighter color and
> special notice (as separate line with text "in soft critical"), and to not
> produce audio alarms in the web interface.
>

We also would very much like to have hard states only in the webgui. It
seems it was possible in earlier versions, but not any longer. To have
more flexibility like you mention would be very useful for us.

--
Ståle Johansen, University of Oslo.

From tobias at scherbaum.info Wed Jan 16 18:09:25 2008
From: tobias at scherbaum.info (Tobias Scherbaum)
Date: Wed, 16 Jan 2008 18:09:25 +0100
Subject: Nagios 3.0rc1 segfaulting
In-Reply-To: <478D3614.6060808@nagios.org>
References: <1199376104.2599.4.camel@homer.ob.libexec.de> <478D3614.6060808@nagios.org>
Message-ID: <1200503365.2591.19.camel@homer.ob.libexec.de>

Ethan Galstad wrote:
> Hmmm... I took a look at the code and didn't see any obvious reason why
> it would be segfaulting. I'm doubtful that readdir() is the real cause
> of the segfault, as the backtrace suggests.

Ditto. For now I have switched back to Nagios 2; I'll try to set up a
similar test box and reproduce this problem within the next few weeks.

Thanks for your feedback!

Tobias

From step at tdc.dk Wed Jan 16 21:22:57 2008
From: step at tdc.dk (Steffen Poulsen)
Date: Wed, 16 Jan 2008 21:22:57 +0100
Subject: Nagios v3 minor fixes - fast startup / addressx / flapping / timeperiod templates
In-Reply-To: <478D0932.1080904@zango.com>
References: <478D0932.1080904@zango.com>
Message-ID:

> If I create a precache object file (nagios -pv) and compare
> it to the object.cache (same config, created with "nagios
> -d"), it match but the date in the commented header. However
> if I activate it (nagios -xud) the objects.cache gets all
> reversed and objects are shown in reverse alphabetical order
> in the cgi. I won't have much use for that yet, but I tried
> it that's my results (i guess that's not expected)...

We started running with fast startup recently as well and also noticed
this reverse-sorting behaviour (rc1). Glad to hear we are not the only
ones seeing this :-)

Another thing we have noticed with fast startup is that the nagios
service stops processing entirely when the first service starts flapping
(we have had to disable flap detection for now).

And a third thing while I'm remembering: setting up notifications (also
recently), we had to use the "standard" pager variable for the cell phone
number; nagios SIGSEGV'ed when using a custom contact variable (addressx)
for the purpose.

Lastly, we entered into some strange template land by following the
advice in http://nagios.sourceforge.net/docs/3_0/oncallrotation.html,
which suggests that one timeperiod can _use_ another (bob-on-call
definition). I don't believe this is true (any more?); this section of
the documentation should be removed.

Solaris/SPARC, 3.0rc1.

Best regards,
Steffen Poulsen

From william at leibzon.org Thu Jan 17 07:44:24 2008
From: william at leibzon.org (William Leibzon)
Date: Wed, 16 Jan 2008 22:44:24 -0800
Subject: Difference in CPU time with and without ePN
In-Reply-To: <47884AB4.7070702@aei.ca>
References: <4783A5B3.8070504@zango.com> <47848750.2040200@op5.se> <4784C2A0.8020000@aei.ca> <4784CDCC.7050101@op5.se> <4784FC92.2070103@zango.com> <4785E224.2060803@op5.se> <47879282.4030802@zango.com> <47884AB4.7070702@aei.ca>
Message-ID:

[Note: this is not a reply to a particular message but to the thread in
general]

I've used nagios with embedded perl for quite a long time with a fairly
heavy number of perl checks (in fact most of the plugins) and I've not
seen any serious memory leaks. I've also tested CPU usage both with and
without ePN, and with ePN it's 50%-75% less.

Now to be fair, the nagios installations I've set it up on do not run
forever and are usually set to restart the nagios server once a week. But
if there were serious memory leaks I'd have noticed, as I have a
specialized memory plugin specifically looking for such issues (the
nagios dnx code from June had a big issue there BTW, although the new
version I just downloaded seems to have a lot of it fixed, but I can
still see it leaking). As far as Perl goes, I suspect the issues are
specific to the perl modules & plugins you may be using rather than being
ePN issues in general (note that I'm not using Nagios::Plugin at all, for
example). If that is so, then try forcing nagios to reload/recache the
plugin by modifying it (I think just adding an extra line with '#' at the
end of the file should be enough).

In general it might actually be a good idea for nagios to have a
compiled-in setting for the maximum amount of time perl plugin code would
be cached; then whenever nagios checks whether plugin code has changed it
can also check when it was last cached, and if that was too long ago,
recache it even if the code is still the same. This should really apply
to both plugins and perl modules, which does present some extra
challenges.

At the same time, my understanding of the embedded perl architecture used
by nagios is still limited (I even tried once to do something similar
myself based on nagios code but could not understand some of what I saw
[I plan on trying it again when I have more time, which is always an
issue...], but ePN is not that simple and may well have a leak somewhere
that is not consistently showing for everyone).

On 1/11/08, Thomas Guyot-Sionnest wrote:
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 11/01/08 11:00 AM, Thomas Guyot-Sionnest wrote:
> > Andreas Ericsson wrote:
> >> Thomas Guyot-Sionnest wrote:
> >>> 1. Kill and restart Nagios instead of HUP'ing it (Why don't nagios
> >>> execve itself on HUPs BTW?)
> >> I have no idea.
> >> It would definitely make it easier to write modules
> >> for it, as each new load is 100% certain to provide a clean slate.
> >
> > Eh, I just flashed back on this. I know why; it's simply because Nagios
> > opens the resource file (usually readable only by root) before dropping
> > privileges. The only way it can restart by itself without loosing access
> > to that file is what it does right now.
>
> Oops I though It was that but apparently it's not the case (at least on
> Nagos 3); if the file is only readable by root Nagios fails on HUP's.
>
> Sorry for the spam (actually for not verifying my claims...) :(
>
> Thomas
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.6 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iD8DBQFHiEq06dZ+Kt5BchYRAvVoAKDhFrNVEaPMheJYWrxuni61S2panACfV4w9
> Lue30C6gfbhqViO+JEIFjWc=
> =VWYM
> -----END PGP SIGNATURE-----
>

From llow at telesphere.com Thu Jan 17 21:34:28 2008
From: llow at telesphere.com (Larry Low)
Date: Thu, 17 Jan 2008 13:34:28 -0700
Subject: Feature Request - Blocking Outage Macros
Message-ID: <060c01c85948$5b7598d0$1260ca70$@com>

It would be nice if, during notification, macros were populated with the
number of hosts (and maybe services) affected by "blocking". I'll look
through the code but wanted to throw this out there.

----
Larry Low
4150 N Drinkwater Blvd., 5th Floor
Scottsdale, AZ 85251
Office: 480.385.7045
E-mail: llow at telesphere.com'
Telesphere Networks, Inc

From manuel.hervo at gmail.com Fri Jan 18 03:32:55 2008
From: manuel.hervo at gmail.com (Manuel HERVO)
Date: Fri, 18 Jan 2008 03:32:55 +0100
Subject: I found a bug in Nagios, I think
Message-ID: <58fb07530801171832ue41ca67g9614c5c136833b32@mail.gmail.com>

Hello,

I recently started working in monitoring and I think Nagios is very well
done.

However, with version 3.0rc1 I found a problem with the
"check_mysql_query" plugin, and the problem is not in the plugin but in
Nagios.

As usual, I put a semicolon (;) at the end of my query, and Nagios does
not seem to accept it within the parameters of my command.

Define command (
Command_name check_query
Comand_line $USER1$/check_mysql_query-q $ARG1$ -d $ARG2$ -H $HOSTADDRESS$ -u $ARG3$ -p $ARG4$
)

(Define service
Servers use service
Host_name web-pre-production
Service_description Test Query Mysql
Check_command check_mysql_query! "SELECT id ForumMessages FROM WHERE id = 1;"! ! !
)

In the file /nagios/var/objects.cache:
....
Check_command check_mysql_query! "SELECT id ForumMessages FROM WHERE id = 1
....

The semicolon (;) cuts off the rest of the arguments passed to the
plugin.

I hope you can help.

Azema.

P.S.: Sorry for my English, but I speak French.

From mgoupil at gmail.com Fri Jan 18 03:42:20 2008
From: mgoupil at gmail.com (Matthias GOUPIL)
Date: Fri, 18 Jan 2008 03:42:20 +0100
Subject: I found a bug in Nagios, I think
In-Reply-To: <58fb07530801171832ue41ca67g9614c5c136833b32@mail.gmail.com>
References: <58fb07530801171832ue41ca67g9614c5c136833b32@mail.gmail.com>
Message-ID:

Didn't you forget the table after your FROM?!?!

--
Mensa
Sent from my iPhone

On 18 Jan 08 at 03:32, "Manuel HERVO" wrote:

> Hello,
>
> I began working in the supervision and I think Nagios very well done.
>
> However, for 3.0rc1 version, I found a problem with the plug
> "check_mysql_query." And the problem is not plug, but Nagios.
>
> As usual, I am still a semi-colon (;) at the end of my request and
> Nagios does not seem to want to accept it within the parameters of
> my command.
>
> Define command (
> Command_name check_query
> Comand_line $USER1$/check_mysql_query-q $ARG1$ -d $ARG2$ -H
> $HOSTADDRESS$ -u $ARG3$ -p $ARG4$
> )
>
> (Define service
> Servers use service
> Host_name web-pre-production
> Service_description Test Query Mysql
> Check_command check_mysql_query! "SELECT id ForumMessages
> FROM WHERE id = 1;"! ! !
> )
>
> In file / nagios / var / objects.cache:
> ....
> Check_command check_mysql_query! "SELECT id ForumMessages FROM WHERE
> id = 1
> ....
>
> The semi-colon(;) is blocking the wake of arguments passed to plugin.
>
> I hope you have help.
>
> Azema.
>
> P.S.: Sorry for my english, but I speak French.

From manuel.hervo at gmail.com Fri Jan 18 03:50:33 2008
From: manuel.hervo at gmail.com (Manuel HERVO)
Date: Fri, 18 Jan 2008 03:50:33 +0100
Subject: I found a bug in Nagios, I think
In-Reply-To:
References: <58fb07530801171832ue41ca67g9614c5c136833b32@mail.gmail.com>
Message-ID: <1200624633.27643.121.camel@azemaland>

On Friday 18 January 2008 at 03:42 +0100, Matthias GOUPIL wrote:
> Didn't you forget the table after your FROM?!?!
>

No, there was a small mix-up when I copied it over, but in the host's
config file I do have my table name after FROM. I tried it several ways
and spent 6 hours before understanding where the problem came from. Now
that I have removed the semicolon, it works correctly.

> --
> Mensa
> Sent from my iPhone
>
> On 18 Jan 08 at 03:32, "Manuel HERVO" wrote:
>
> > Hello,
> >
> > I began working in the supervision and I think Nagios very well done.
> >
> > However, for 3.0rc1 version, I found a problem with the plug
> > "check_mysql_query." And the problem is not plug, but Nagios.
> >
> > As usual, I am still a semi-colon (;) at the end of my request and
> > Nagios does not seem to want to accept it within the parameters of
> > my command.
> >
> > Define command (
> > Command_name check_query
> > Comand_line $USER1$/check_mysql_query-q $ARG1$ -d $ARG2$ -H
> > $HOSTADDRESS$ -u $ARG3$ -p $ARG4$
> > )
> >
> > (Define service
> > Servers use service
> > Host_name web-pre-production
> > Service_description Test Query Mysql
> > Check_command check_mysql_query! "SELECT id ForumMessages
> > FROM WHERE id = 1;"! ! !
> > )
> >
> > In file / nagios / var / objects.cache:
> > ....
> > Check_command check_mysql_query! "SELECT id ForumMessages FROM WHERE
> > id = 1
> > ....
> >
> > The semi-colon(;) is blocking the wake of arguments passed to plugin.
> >
> > I hope you have help.
> >
> > Azema.
> >
> > P.S.: Sorry for my english, but I speak French.

From ae at op5.se Fri Jan 18 09:11:12 2008
From: ae at op5.se (Andreas Ericsson)
Date: Fri, 18 Jan 2008 09:11:12 +0100
Subject: I found a bug in Nagios, I think
In-Reply-To: <58fb07530801171832ue41ca67g9614c5c136833b32@mail.gmail.com>
References: <58fb07530801171832ue41ca67g9614c5c136833b32@mail.gmail.com>
Message-ID: <47905F20.9050500@op5.se>

Manuel HERVO wrote:
> Hello,
>
> I began working in the supervision and I think Nagios very well done.
>
> However, for 3.0rc1 version, I found a problem with the plug
> "check_mysql_query." And the problem is not plug, but Nagios.
>
> As usual, I am still a semi-colon (;) at the end of my request and Nagios
> does not seem to want to accept it within the parameters of my command.
>
> Define command (
> Command_name check_query
> Comand_line $USER1$/check_mysql_query-q $ARG1$ -d $ARG2$ -H $HOSTADDRESS$
> -u $ARG3$ -p $ARG4$
> )
>
> (Define service
> Servers use service
> Host_name web-pre-production
> Service_description Test Query Mysql
> Check_command check_mysql_query! "SELECT id ForumMessages FROM
> WHERE id = 1;"! ! !
> )
>
> In file / nagios / var / objects.cache:
> ....
> Check_command check_mysql_query! "SELECT id ForumMessages FROM WHERE id = 1
> ....
>
> The semi-colon(;) is blocking the wake of arguments passed to plugin.
>

semi-colon is used to denote winged comments in Nagios.
You'll have to escape it if you want it to work properly, but you could just remove it instead, as the check_mysql_query plugin doesn't require the semi-colon anyway. -- Andreas Ericsson andreas.ericsson at op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From andurin at process-zero.de Fri Jan 18 09:22:40 2008 From: andurin at process-zero.de (=?UTF-8?B?SGVuZHJpayBCw6Rja2Vy?=) Date: Fri, 18 Jan 2008 09:22:40 +0100 Subject: I found a bug in Nagios, I think In-Reply-To: <1200624633.27643.121.camel@azemaland> References: <58fb07530801171832ue41ca67g9614c5c136833b32@mail.gmail.com> <1200624633.27643.121.camel@azemaland> Message-ID: <479061D0.3020507@process-zero.de> Hey guys, Manuel HERVO schrieb: > > Le vendredi 18 janvier 2008 ? 03:42 +0100, Matthias GOUPIL a ?crit : >> Tu n'aurais pas oubli? la table derri?re ton from?!?! >> > Non, il y a eu un petit cafouillage dans ma recopie, mais dans le > fichier de config de l'host, j'ai bien le nom de ma table apr?s FROM. > > J'ai essay? de plusieurs mani?res, j'ai pass? 6 heures avant de > comprendre d'o? venait le probl?me. > > Maintenant que j'ai enlev? le point virgule, ?a fonctionne correctement. I know french is melodic language, but can we please keep this list in english? - Hendrik ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. 
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ _______________________________________________ Nagios-devel mailing list Nagios-devel at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-devel From manuel.hervo at gmail.com Fri Jan 18 09:42:49 2008 From: manuel.hervo at gmail.com (Manuel HERVO) Date: Fri, 18 Jan 2008 09:42:49 +0100 Subject: I found a bug in Nagios, I think In-Reply-To: <479061D0.3020507@process-zero.de> References: <58fb07530801171832ue41ca67g9614c5c136833b32@mail.gmail.com> <1200624633.27643.121.camel@azemaland> <479061D0.3020507@process-zero.de> Message-ID: <1200645769.27643.131.camel@azemaland> Sorry, Thanks for a explication. Azema. Le vendredi 18 janvier 2008 ? 09:22 +0100, Hendrik B?cker a ?crit : > Hey guys, > > Manuel HERVO schrieb: > > > > Le vendredi 18 janvier 2008 ? 03:42 +0100, Matthias GOUPIL a ?crit : > >> Tu n'aurais pas oubli? la table derri?re ton from?!?! > >> > > Non, il y a eu un petit cafouillage dans ma recopie, mais dans le > > fichier de config de l'host, j'ai bien le nom de ma table apr?s FROM. > > > > J'ai essay? de plusieurs mani?res, j'ai pass? 6 heures avant de > > comprendre d'o? venait le probl?me. > > > > Maintenant que j'ai enlev? le point virgule, ?a fonctionne correctement. > > > > I know french is melodic language, but can we please keep this list in > english? > > - > Hendrik > > ------------------------------------------------------------------------- > This SF.net email is sponsored by: Microsoft > Defy all challenges. Microsoft(R) Visual Studio 2008. > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > _______________________________________________ > Nagios-devel mailing list > Nagios-devel at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/nagios-devel ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. 
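The behaviour discussed in this thread can be sketched in a few lines of Python. This is only an illustration of the idea, not the parser Nagios ships: the function name `strip_inline_comment` is made up, the query and the `db`/`user`/`pass` arguments are placeholders, and it assumes that a backslash is the escape character (the thread says the semicolon can be escaped but does not say how).

```python
def strip_inline_comment(line: str, escape: str = "\\") -> str:
    """Return `line` up to the first unescaped ';', with escapes removed.

    Illustrative only: this mimics the behaviour described in the thread,
    it is not the code Nagios actually uses.
    """
    out = []
    i = 0
    while i < len(line):
        ch = line[i]
        # An escaped semicolon is kept as a literal ';'.
        if ch == escape and i + 1 < len(line) and line[i + 1] == ";":
            out.append(";")
            i += 2
            continue
        # A bare semicolon starts a comment: drop the rest of the line.
        if ch == ";":
            break
        out.append(ch)
        i += 1
    return "".join(out).rstrip()


# Hypothetical check_command lines (placeholder query and arguments).
raw = 'check_command check_mysql_query!"SELECT 1;"!db!user!pass'
escaped = 'check_command check_mysql_query!"SELECT 1\\;"!db!user!pass'

print(strip_inline_comment(raw))      # everything after the bare ';' is lost
print(strip_inline_comment(escaped))  # the ';' survives inside the first argument
```

With the unescaped line, the closing quote and all later `!`-separated arguments disappear along with the semicolon, which matches the truncated entry seen in objects.cache.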

From dermoth at aei.ca Fri Jan 18 12:40:11 2008
From: dermoth at aei.ca (Thomas Guyot-Sionnest)
Date: Fri, 18 Jan 2008 06:40:11 -0500
Subject: Difference in CPU time with and without ePN
In-Reply-To:
References: <4783A5B3.8070504@zango.com> <47848750.2040200@op5.se> <4784C2A0.8020000@aei.ca> <4784CDCC.7050101@op5.se> <4784FC92.2070103@zango.com> <4785E224.2060803@op5.se> <47879282.4030802@zango.com> <47884AB4.7070702@aei.ca>
Message-ID: <4790901B.7000308@aei.ca>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 17/01/08 01:44 AM, William Leibzon wrote:
> [Note: this is not a reply to particular message but thread in general]
>
> I've used nagios with embedded perl for quite a long time with fairly
> heavy number of perl checks (in fact most of the plugins) and I've not
> seen any serious memory leaks. I've also tested CPU usage both with and
> without ePN and with ePN its 50%-75% less. Now to be fair nagios
> installations I've set it up on do not run forever and usually set to
> restart nagios server once week. But if there were serious memory leaks
> I'd have noticed as I have specialized memory plugin specifically
> looking for such issues (nagios dnx code from June had big issue there
> BTW although new version I just downloaded seems to have lot of it
> fixed but I can still see it leaking).
>
> As far as Perl I suspect the issues are specific to perl modules &
> plugins you may be using rather than being ePN issues in general (note
> that I'm not using Nagios::Plugin at all for example). If that is so,
> then try forcing nagios to reload/recache the plugin by modifying it (I
> think just adding extra line with '#' at the end of file should be
> enough) In general it might actually be a good idea for nagios to have a
> compiled setting on maximum amount of time perl plugin code would be
> cached and then whenever nagios checks if plugin code has changed it can
> also check when it was last cached and if its too long recache it even
> if the code is still the same; this should really apply to both plugins
> and perl modules which does present some extra challenges. At the same
> time my understanding of embedded perl architecture used by nagios is
> still limited (I've even tried once to do something similar myself based
> on nagios code but could not understand some of what I saw [I plan on
> trying it again when I have more time which is always an issue...] but
> ePN is not that simple and may well have leak somewhere that is not
> consistently showing for everyone).

I don't think clearing the cache will be enough; I believe the lost
memory is truly lost and cannot be recovered...

On the system from which the CPU graph was taken I closely monitored
Nagios memory usage and couldn't see any sign of a leak at all over
days. On a test box I set up with Nagios 3 (similar software) using the
same config, there seems to be a small memory leak (1-2MB/day maximum),
but I haven't monitored it long enough yet to confirm. A comparison is
also hard to make, as many plugins are timing out because I can't put
the server in the right VLAN right now... Maybe the timeouts have
something to do with it, since a timeout abruptly terminates the plugin.

I use Perl 5.8.8 on Slackware 11.
My plugins use some of these modules:

utils.pm qw($TIMEOUT %ERRORS) (from Nagios-plugins 1.4.11)
Nagios::Plugin
Getopt::Long
Net::LDAP
Net::DNS
Time::HiRes qw(gettimeofday tv_interval)
DBI (Drivers: DBD::Sybase, DBD::mysql)
Class::Date

Any modules not bundled with Perl are installed from CPAN, so they are
all fairly recent.

Thomas
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHkJAb6dZ+Kt5BchYRAp1fAKDF64mJKyRRCk7iNJ/gpPeDgx7/KgCfbTou
ubJSVpQ2B1sZhBIzrg1sBMM=
=OUc3
-----END PGP SIGNATURE-----

From dfulton at nuvox.com Fri Jan 18 17:22:02 2008
From: dfulton at nuvox.com (Fulton, David)
Date: Fri, 18 Jan 2008 11:22:02 -0500
Subject: NDOUtils Question
Message-ID: <6C0AFDE8AAE2FC4F9AA957B32F6A6B1F2BE735@EMAIL01.CORPORATE.VOX.NET>

I was looking at the schema that NDOUtils uses with MySQL and I was
attempting to extend it for my own purposes. However, there appears to
be only a hard-coded list of objecttype_id values that NDOUtils uses.

I am attempting to use the NDOUtils schema for mapping the physical
layout of our datacenters, but I want to treat each thing as an object
(i.e. cabinets, shelves and cable runs, among the many objects). The
reason is obvious: if the configuration is already in the database when
Nagios starts using the database for configuration (instead of flat
files), I will be ready and the whole thing will be seamless
(supposedly). It would also give me the advantages of using an RDBMS for
inventory and asset control in the same place as monitoring control.

Since the objecttype_id field is a smallint, I was wondering if we could
define a range of numbers set aside for "local" use.
Better yet, have a growing list of object types so that as Nagios itself
needs more (in the future) they will be recognized.

David Fulton
Systems Administrator
NuVox Communications
O:407-835-0470 C:321-246-2238
"Because Business is on the Line"

P.S. I posted the same message but forgot I had subscribed to the list
under a different email address. Please reply to this one.

Thanks,
David Fulton

From cacovas at gmail.com Fri Jan 18 19:04:49 2008
From: cacovas at gmail.com (Carlos Pereira)
Date: Fri, 18 Jan 2008 16:04:49 -0200
Subject: Nagios 2.10 + ndoutils-1.4b7
Message-ID:

Hi folks, could someone help with Nagios + NDOUtils?

I'm having a hard time getting Nagios to dump its information through
NDO to a MySQL database.

The scenario is as follows:

The Nagios installation is not a problem; I set it up and it works fine
with the default values (Nagios 2.10). I installed everything else as
well: MySQL 5.0 and client, Apache2 and PHP5.

When it comes to ndoutils-1.4b7, problems start to pop up.

Note: installing ndoutils-1.4b7 is not a problem, but writing to the
data sink is (THE PROBLEM).

1 - Output from ./configure (including the checks that report "no"):

checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking limits.h presence... yes
checking for limits.h... yes
checking ltdl.h usability... no
checking ltdl.h presence... no
checking for ltdl.h... no
checking math.h usability... yes
checking math.h presence... yes
checking signal.h presence... yes
checking for signal.h... yes
checking socket.h usability... no
checking socket.h presence... no
checking for socket.h... no
checking stdarg.h usability... yes
checking stdarg.h presence... yes
checking for main in -lnsl... yes
checking for socket in -lsocket... no
checking for main in -lwrap... yes
checking for strdup... yes
checking for strstr... yes
checking for compress in -lz... yes
checking for mysql_store_result in -lmysqlclient... yes
checking for mysql_connect in -lmysqlclient... no
checking mysql/mysql.h usability... yes
checking mysql/mysql.h presence... yes
checking for mysql/mysql.h... yes
MySQL library and include file(s) were found!
checking for PQconnectdb in -lpq... no

No problems while executing "make".

I followed the README inside ndoutils-1.4b7: I copied all the files to
the right places, created the database, created the user, and granted
the permissions and set the password.

I ran the command ./installdb (no problems).
I changed the relevant parameters in ndo2db.cfg (user and password).

I executed the command:

/usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg

Output from ls -la (inside the directory):

xxxxx:/usr/local/nagios/var# ls -la
total 224
drwxrwsr-x 5 nagios nagios 4096 2008-01-18 15:15 .
drwxrwsr-x 8 nagios nagios 4096 2008-01-17 17:34 ..
drwxrwsr-x 2 nagios nagios 4096 2008-01-18 00:00 archives
-rw-r--r-- 1 nagios nagios 6 2008-01-18 12:39 nagios.lock
-rw-rw-r-- 1 nagios nagios 148605 2008-01-18 15:15 nagios.log
srwxrwxrwx 1 nagios nagios 0 2008-01-18 12:38 ndo.sock
-rw-r--r-- 1 nagios nagios 12943 2008-01-18 12:39 objects.cache
-rw------- 1 nagios nagios 14010 2008-01-18 14:39 retention.dat
drwxrwsr-x 2 nagios nagios 4096 2008-01-18 12:39 rw
drwxrwsr-x 3 nagios nagios 4096 2008-01-17 17:34 spool
-rw-rw-r-- 1 nagios nagios 14061 2008-01-18 15:15 status.dat

Everything starts fine, BUT this is the output I get in nagios.log:

[1200676718] ndomod: Error writing to data sink! Some output may get lost...
[1200676734] ndomod: Successfully reconnected to data sink! 0 items lost, 87 queued items to flush.
[1200676734] ndomod: Successfully flushed 87 queued items to data sink.
[1200676734] ndomod: Error writing to data sink! Some output may get lost...
[1200676750] ndomod: Successfully reconnected to data sink! 0 items lost, 74 queued items to flush.
[1200676750] ndomod: Successfully flushed 74 queued items to data sink.
[1200676750] ndomod: Error writing to data sink! Some output may get lost...
[1200676766] ndomod: Successfully reconnected to data sink! 0 items lost, 78 queued items to flush.
[1200676766] ndomod: Successfully flushed 78 queued items to data sink.
[1200676766] ndomod: Error writing to data sink! Some output may get lost...
[1200676782] ndomod: Successfully reconnected to data sink! 0 items lost, 90 queued items to flush.
[1200676782] ndomod: Successfully flushed 90 queued items to data sink.
[1200676782] ndomod: Error writing to data sink! Some output may get lost...
[1200676798] ndomod: Successfully reconnected to data sink! 0 items lost, 70 queued items to flush.
[1200676798] ndomod: Successfully flushed 70 queued items to data sink.
[1200676798] ndomod: Error writing to data sink! Some output may get lost...

This is the output from ps -ef:

root 20562 1 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 20567 20562 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 20568 20562 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 20569 20562 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 20570 20562 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 20571 20562 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 20575 20562 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 20576 20562 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 20577 20562 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start
root 30899 2227 0 10:27 ? 00:00:08 sshd: root@pts/0
root 30901 30899 0 10:27 pts/0 00:00:05 -bash
root 31325 2227 0 10:49 ? 00:00:02 sshd: root@pts/1
root 31335 31325 0 10:49 pts/1 00:00:00 -bash
root 2822 1 0 11:10 ? 00:00:00 runsvdir -P /var/service log: ................................................
root 2850 2822 0 11:10 ? 00:00:00 runsv socklog-klog
root 2851 2822 0 11:10 ? 00:00:00 runsv socklog-unix
log 2852 2851 0 11:10 ? 00:00:00 svlogd main/main main/auth main/cron main/daemon main/debug main/ftp main/kern
log 2853 2850 0 11:10 ? 00:00:00 svlogd -tt main/main
root 2854 2850 0 11:10 ? 00:00:00 socklog ucspi
nobody 2855 2851 0 11:10 ? 00:00:00 socklog unix /dev/log
root 10421 30901 0 12:28 pts/0 00:00:00 mysql
root 10521 30901 0 12:37 pts/0 00:00:00 nano ndo2db.cfg
root 10591 1 0 12:38 pts/0 00:00:00 /bin/sh /usr/bin/mysqld_safe
mysql 10628 10591 0 12:38 pts/0 00:00:02 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --pid-fi
root 10629 10591 0 12:38 pts/0 00:00:00 logger -p daemon.err -t mysqld_safe -i -t mysqld
nagios 10693 1 0 12:38 ? 00:00:00 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
nagios 10736 1 0 12:39 ? 00:00:04 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg

So I have no idea how to solve the problem: no data is being written to
the database. Could somebody please help me or point me to the right
place?

Thanks in advance.

From dfulton at nuvox.com Fri Jan 18 20:04:45 2008
From: dfulton at nuvox.com (Fulton, David)
Date: Fri, 18 Jan 2008 14:04:45 -0500
Subject: Nagios 2.10 + ndoutils-1.4b7
In-Reply-To:
References:
Message-ID: <6C0AFDE8AAE2FC4F9AA957B32F6A6B1F2BE7CB@EMAIL01.CORPORATE.VOX.NET>

The first place I would look is to make sure that the username and
password you are using to connect to MySQL are correct. Try connecting
with the command-line client using them. If that works, check that
event brokering is enabled (broker everything).

David Fulton
Systems Administrator
NuVox Communications
O:407-835-0470 C:321-246-2238
"Because Business is on the Line"
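The "broker everything" check suggested above can be sketched as a small script that reads the text of nagios.cfg. This is a rough illustration rather than an official tool: the function names are invented here, the sample paths are examples and not taken from the thread, and it relies on the `event_broker_options` and `broker_module` directives of the main Nagios configuration file, where a value of -1 means that all events are brokered.

```python
def broker_settings(cfg_text: str) -> dict:
    """Collect event-broker directives from the text of a nagios.cfg file."""
    settings = {"event_broker_options": None, "broker_modules": []}
    for raw_line in cfg_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and '#' comment lines.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "event_broker_options":
            settings["event_broker_options"] = int(value)
        elif key == "broker_module":
            settings["broker_modules"].append(value)
    return settings


def brokers_everything(settings: dict) -> bool:
    """True when the options value is -1 and at least one module is loaded."""
    return settings["event_broker_options"] == -1 and bool(settings["broker_modules"])


# Hypothetical nagios.cfg excerpt; the paths are examples only.
sample = """
# event broker
event_broker_options=-1
broker_module=/usr/local/nagios/bin/ndomod.o config_file=/usr/local/nagios/etc/ndomod.cfg
"""

print(brokers_everything(broker_settings(sample)))
```

In practice the text would come from the real configuration file, for example the nagios.cfg passed to the daemon with `-d` in the ps output above.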
The scenerio is as follow: Nagios installation is not a problem, I set up and it works fine with the default value. ( Nagios 2.10). Install everything , Mysql 5.0 and client , Apache2 & PHP5. Now when comes down to install ndoutils-1.4b7, problems start to popup. Obs: installing its not a problem but, writing to data sink it is ( THE PROBLEM) 1 - Out put from ./configure ( Parts that have "no" while running command the problems ( I think ) ) : checking for C compiler default output file name... a.out checking whether the C compiler works... yes checking whether we are cross compiling... no checking for suffix of executables... checking for suffix of object files... o checking limits.h presence... yes checking for limits.h... yes checking ltdl.h usability... no checking ltdl.h presence... no checking for ltdl.h... no checking math.h usability... yes checking math.h presence... yes checking signal.h presence... yes checking for signal.h... yes checking socket.h usability... no checking socket.h presence... no checking for socket.h... no checking stdarg.h usability... yes checking stdarg.h presence... yes checking for main in -lnsl... yes checking for socket in -lsocket... no checking for main in -lwrap... yes checking for strdup... yes checking for strstr... yes checking for compress in -lz... yes checking for mysql_store_result in -lmysqlclient... yes checking for mysql_connect in -lmysqlclient... no checking mysql/mysql.h usability... yes checking mysql/mysql.h presence... yes checking for mysql/mysql.h... yes MySQL library and include file(s) were found! checking for PQconnectdb in -lpq... no No problems while executing " MAKE" Followed as the README doc recommends inside the ndoutils-1.4b7. Copied all the files to the right places as well as the database, create users grant permission and create the password. I ran the command ./installdb ( No problems). 
I changed the relevant parameters in ndo2db.cfg (user and password).

I executed the command:

/usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg

Output from ls -la (inside the directory):

xxxxx:/usr/local/nagios/var# ls -la
total 224
drwxrwsr-x 5 nagios nagios 4096 2008-01-18 15:15 .
drwxrwsr-x 8 nagios nagios 4096 2008-01-17 17:34 ..
drwxrwsr-x 2 nagios nagios 4096 2008-01-18 00:00 archives
-rw-r--r-- 1 nagios nagios 6 2008-01-18 12:39 nagios.lock
-rw-rw-r-- 1 nagios nagios 148605 2008-01-18 15:15 nagios.log
srwxrwxrwx 1 nagios nagios 0 2008-01-18 12:38 ndo.sock
-rw-r--r-- 1 nagios nagios 12943 2008-01-18 12:39 objects.cache
-rw------- 1 nagios nagios 14010 2008-01-18 14:39 retention.dat
drwxrwsr-x 2 nagios nagios 4096 2008-01-18 12:39 rw
drwxrwsr-x 3 nagios nagios 4096 2008-01-17 17:34 spool
-rw-rw-r-- 1 nagios nagios 14061 2008-01-18 15:15 status.dat

Everything starts fine, BUT...

This is the output I got from nagios.log:

[1200676718] ndomod: Error writing to data sink! Some output may get lost...
[1200676734] ndomod: Successfully reconnected to data sink! 0 items lost, 87 queued items to flush.
[1200676734] ndomod: Successfully flushed 87 queued items to data sink.
[1200676734] ndomod: Error writing to data sink! Some output may get lost...
[1200676750] ndomod: Successfully reconnected to data sink! 0 items lost, 74 queued items to flush.
[1200676750] ndomod: Successfully flushed 74 queued items to data sink.
[1200676750] ndomod: Error writing to data sink! Some output may get lost...
[1200676766] ndomod: Successfully reconnected to data sink! 0 items lost, 78 queued items to flush.
[1200676766] ndomod: Successfully flushed 78 queued items to data sink.
[1200676766] ndomod: Error writing to data sink! Some output may get lost...
[1200676782] ndomod: Successfully reconnected to data sink! 0 items lost, 90 queued items to flush.
[1200676782] ndomod: Successfully flushed 90 queued items to data sink.
[1200676782] ndomod: Error writing to data sink! Some output may get lost...
[1200676798] ndomod: Successfully reconnected to data sink! 0 items lost, 70 queued items to flush.
[1200676798] ndomod: Successfully flushed 70 queued items to data sink.
[1200676798] ndomod: Error writing to data sink! Some output may get lost...

This is the output from ps -ef:

root 20562 1 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 20567 20562 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 20568 20562 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 20569 20562 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 20570 20562 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 20571 20562 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 20575 20562 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 20576 20562 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 20577 20562 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start
root 30899 2227 0 10:27 ? 00:00:08 sshd: root@pts/0
root 30901 30899 0 10:27 pts/0 00:00:05 -bash
root 31325 2227 0 10:49 ? 00:00:02 sshd: root@pts/1
root 31335 31325 0 10:49 pts/1 00:00:00 -bash
root 2822 1 0 11:10 ? 00:00:00 runsvdir -P /var/service log: ................................................
root 2850 2822 0 11:10 ? 00:00:00 runsv socklog-klog
root 2851 2822 0 11:10 ? 00:00:00 runsv socklog-unix
log 2852 2851 0 11:10 ? 00:00:00 svlogd main/main main/auth main/cron main/daemon main/debug main/ftp main/kern
log 2853 2850 0 11:10 ? 00:00:00 svlogd -tt main/main
root 2854 2850 0 11:10 ? 00:00:00 socklog ucspi
nobody 2855 2851 0 11:10 ? 00:00:00 socklog unix /dev/log
root 10421 30901 0 12:28 pts/0 00:00:00 mysql
root 10521 30901 0 12:37 pts/0 00:00:00 nano ndo2db.cfg
root 10591 1 0 12:38 pts/0 00:00:00 /bin/sh /usr/bin/mysqld_safe
mysql 10628 10591 0 12:38 pts/0 00:00:02 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --pid-fi
root 10629 10591 0 12:38 pts/0 00:00:00 logger -p daemon.err -t mysqld_safe -i -t mysqld
nagios 10693 1 0 12:38 ?
00:00:00 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
nagios 10736 1 0 12:39 ? 00:00:04 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg

So I have no idea how to solve the problem: no data is being written to the database. Could somebody please point me in the right direction?

Thanks in advance.

_______________________________________________
Nagios-devel mailing list
Nagios-devel at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-devel

From john.calcote at gmail.com Fri Jan 18 20:09:54 2008
From: john.calcote at gmail.com (John Calcote)
Date: Fri, 18 Jan 2008 12:09:54 -0700
Subject: patch for nagios 3rc1 - move broker pre call down below macro processing
Message-ID: <4790F982.5040103@gmail.com>

Hi!

I've recently joined the Distributed Nagios eXecutor (DNX) team (Oct '07), and have just gotten around to porting DNX over to Nagios 3. I've been looking at nagios-3.0rc1, and I can see that a new type of NEBCALLBACK event has been added: NEBTYPE_SERVICECHECK_ASYNC_PRECHECK. I also note that the run-time semantics of NEBTYPE_SERVICECHECK_INITIATE have been modified between Nagios 2.x and 3.x. That is, the INITIATE check no longer honors the handler's return code.

I have a small wish in the form of a patch to nagios-3.0rc1 that I would like to suggest. I hope you'll find this change reasonable.

The patch simply moves the call to broker_service_check( NEBTYPE_SERVICECHECK_INITIATE, ... ) above the code that configures the temporary check output file and check_result_info structure.
This puts the broker call in a similar relative position in 3.x code as in 2.x code, except for the check_result_info structure initialization, which didn't really have to be done before the broker call anyway.

In addition, I moved the code that honors the CALLBACKOVERRIDE return value down to the INITIATE check, but left the code that honors the newer CALLBACKCANCEL in the ASYNC_PRECHECK check.

This grants two advantages that I can see:

1. It makes Nagios 2.x modules more portable, because Nagios 3.x would honor the same NEB module return values for the INITIATE check that Nagios 2.x did. In addition, it provides the useful ability for Nagios 3.x modules to cancel a check in the PRECHECK handler, where it makes sense to do very little preprocessing before canceling the check; canceling the check in the INITIATE handler is less efficient.

2. NEB modules that want to handle the check themselves with the CALLBACKOVERRIDE return value are going to have trouble processing macros the same way that Nagios does, unless Nagios provides some API routines to do it for them. It makes more sense (IMHO) to have the CALLBACKOVERRIDE honored by the INITIATE check, rather than the ASYNC_PRECHECK check.

I'm hoping (if you all approve of this change) that we can get it into the Nagios 3.0 code base before 3.0 ships. Would it be too late for that?

Regards,
John

PS. The patch also fixes a minor error-path memory leak, where the raw_command string was not being freed in case of an out-of-memory condition while allocating the processed_command buffer.

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: nagios-3.0rc1-dnx.patch
URL:
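
[Editor's sketch] The order of events proposed in the message above can be modeled in a few lines of Python. This is only an illustration of the described semantics, not Nagios code and not the contents of the patch: the names NebResult and run_service_check, and the step strings, are hypothetical, and the real NEB API is written in C. The model assumes only what the message states: CALLBACKCANCEL is honored at ASYNC_PRECHECK, macro processing happens before INITIATE, CALLBACKOVERRIDE is honored at INITIATE, and the temporary output file and check_result_info are set up afterwards.

```python
from enum import Enum


class NebResult(Enum):
    """Hypothetical stand-ins for NEB handler return values."""
    OK = "ok"
    CANCEL = "cancel"      # models CALLBACKCANCEL
    OVERRIDE = "override"  # models CALLBACKOVERRIDE


def run_service_check(precheck_handler, initiate_handler):
    """Return the list of steps taken for one service check (model only)."""
    steps = []

    # ASYNC_PRECHECK: fired before any expensive work; only CANCEL is honored.
    if precheck_handler() is NebResult.CANCEL:
        steps.append("cancelled in precheck")
        return steps

    # Macros are expanded before INITIATE, so a module that overrides the
    # check sees the fully processed command line.
    steps.append("macros processed")

    # INITIATE: OVERRIDE is honored here, as it was in Nagios 2.x.
    if initiate_handler() is NebResult.OVERRIDE:
        steps.append("module runs the check")
        return steps

    # Only now is the per-check bookkeeping set up.
    steps.append("temp output file and check_result_info set up")
    steps.append("nagios runs the check")
    return steps


if __name__ == "__main__":
    print(run_service_check(lambda: NebResult.OK, lambda: NebResult.OVERRIDE))
```

In this model a module such as DNX would return OVERRIDE from its INITIATE handler and dispatch the already-expanded command itself, while a module that only wants to veto checks would return CANCEL from the cheaper PRECHECK handler.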

From cacovas at gmail.com Fri Jan 18 20:18:14 2008
From: cacovas at gmail.com (Carlos Pereira)
Date: Fri, 18 Jan 2008 20:18:14 +0100 (CET)
Subject: Nagios 2.10 + ndoutils-1.4b7
In-Reply-To: <6C0AFDE8AAE2FC4F9AA957B32F6A6B1F2BE7CB@EMAIL01.CORPORATE.VOX.NET>
References: <6C0AFDE8AAE2FC4F9AA957B32F6A6B1F2BE7CB@EMAIL01.CORPORATE.VOX.NET>
Message-ID: <20080118191814.6EDA758008C@desire.netways.de>

Hi Fulton,

I have checked that, and I can access the database with mysql -u nagios -p (no problem). The event broker is also enabled with -1 (broker everything).

Thanks.

- Carlos Pereira (alex12)

-----------------------
This thread is located in the archive at this URL:
http://www.nagiosexchange.org/nagios-devel.33.0.html?&tx_maillisttofaq_pi1[showUid]=8483

> The first place I would look is to make sure that the username and password you
> are using to connect to mysql are correct. Try connecting with the command
> line client using them. If that works, check that your event brokering is
> enabled (broker everything).
>
> David Fulton
> Systems Administrator
> NuVox Communications
> O:407-835-0470
> C:321-246-2238
> "Because Business is on the Line"
>
> ________________________________
>
> From: nagios-devel-bounces at lists.sourceforge.net
> [mailto:nagios-devel-bounces at lists.sourceforge.net] On Behalf Of Carlos
> Pereira
> Sent: Friday, January 18, 2008 1:24 PM
> To: nagios-devel at lists.sourceforge.net
> Subject: [Nagios-devel] Nagios 2.10 + ndoutils-1.4b7
>
> Hi folks, could someone help with nagios + ndoutil. I'm having a hard time to
> sync nagios to dump its information through ndo to a mysql database.

From dfulton at nuvox.com Fri Jan 18 20:28:44 2008
From: dfulton at nuvox.com (Fulton, David)
Date: Fri, 18 Jan 2008 14:28:44 -0500
Subject: Nagios 2.10 + ndoutils-1.4b7
In-Reply-To: <20080118191814.6EDA758008C@desire.netways.de>
References: <6C0AFDE8AAE2FC4F9AA957B32F6A6B1F2BE7CB@EMAIL01.CORPORATE.VOX.NET> <20080118191814.6EDA758008C@desire.netways.de>
Message-ID: <6C0AFDE8AAE2FC4F9AA957B32F6A6B1F2BE7DF@EMAIL01.CORPORATE.VOX.NET>

Run truss/strace on the relevant processes. It has been a long time since I set any of it up, but those will usually give you an idea of what is happening. Also check your MySQL log files, and enable debugging in Nagios. How to do all that is in the documentation, so I won't cover it here. But pick it to pieces and see what is really happening.

David Fulton
Systems Administrator
NuVox Communications
O:407-835-0470
C:321-246-2238
"Because Business is on the Line"

> -----Original Message-----
> From: nagios-devel-bounces at lists.sourceforge.net
> [mailto:nagios-devel-bounces at lists.sourceforge.net] On Behalf
> Of Carlos Pereira
> Sent: Friday, January 18, 2008 2:18 PM
> To: nagios-devel at lists.sourceforge.net; cacovas at gmail.com
> Subject: Re: [Nagios-devel] Nagios 2.10 + ndoutils-1.4b7
>
> Hi Fulton,
>
> I have checked that and I can access the database : mysql -u
> nagios -p (no problem). Also the event broker is enabled with
> -1 (broker everything).
>
> Thanks.
>
> - Carlos Pereira (alex12)
>
> -----------------------
> This thread is located in the archive at this URL:
> http://www.nagiosexchange.org/nagios-devel.33.0.html?&tx_maillisttofaq_pi1[showUid]=8483
>
> > The first place I would look is to make sure that the username and
> > password you are using to connect to mysql are correct. Try connecting
> > with the command line client using them. If that works, check that
> > your event brokering is enabled (broker everything).

From andurin at process-zero.de Fri Jan 18 20:51:09 2008
From: andurin at process-zero.de (Hendrik Bäcker)
Date: Fri, 18 Jan 2008 20:51:09 +0100
Subject: Nagios 2.10 + ndoutils-1.4b7
In-Reply-To: <20080118191814.6EDA758008C@desire.netways.de>
References: <6C0AFDE8AAE2FC4F9AA957B32F6A6B1F2BE7CB@EMAIL01.CORPORATE.VOX.NET> <20080118191814.6EDA758008C@desire.netways.de>
Message-ID: <1200685870.6290.17.camel@baeckerh-laptop>

Hi,

did you notice the debug options in your ndo2db.cfg? Maybe they will tell you more.

- Hendrik

On Friday, 18.01.2008, at 20:18 +0100, Carlos Pereira wrote:
> Hi Fulton,
>
> I have checked that and I can access the database : mysql -u nagios -p (no problem). Also the event broker is enabled with -1 (broker everything).
>
> Thanks.
>
> - Carlos Pereira (alex12)
>
> -----------------------
> This thread is located in the archive at this URL:
> http://www.nagiosexchange.org/nagios-devel.33.0.html?&tx_maillisttofaq_pi1[showUid]=8483
>
> > The first place I would look is to make sure that the username and password you
> > are using to connect to mysql are correct. Try connecting with the command
> > line client using them. If that works, check that your event brokering is
> > enabled (broker everything).
> >
> > David Fulton
> > Systems Administrator
> > NuVox Communications
> > O:407-835-0470
> > C:321-246-2238
> > "Because Business is on the Line"
> >
> > ________________________________
> >
> > From: nagios-devel-bounces at lists.sourceforge.net
> > [mailto:nagios-devel-bounces at lists.sourceforge.net] On Behalf Of Carlos
> > Pereira
> > Sent: Friday, January 18, 2008 1:24 PM
> > To: nagios-devel at lists.sourceforge.net
> > Subject: [Nagios-devel] Nagios 2.10 + ndoutils-1.4b7
> >
> > Hi folks, could someone help with nagios + ndoutil. I'm having a hard time to
> > sync nagios to dump its information through ndo to a mysql database.
> > > > > > > > The scenerio is as follow: > > > > > > > > Nagios installation is not a problem, I set up and it works fine with the > > default value. ( Nagios 2.10). > > > > Install everything , Mysql 5.0 and client , Apache2 & PHP5. > > > > > > > > Now when comes down to install ndoutils-1.4b7, problems start to popup. > > > > Obs: installing its not a problem but, writing to data sink it is ( THE > > PROBLEM) > > > > > > > > 1 - Out put from ./configure ( Parts that have "no" while running command the > > problems ( I think ) ) : > > > > > > > > checking for C compiler default output file name... a.out > > > > checking whether the C compiler works... yes > > > > checking whether we are cross compiling... no > > > > checking for suffix of executables... > > > > checking for suffix of object files... o > > > > > > > > > > > > checking limits.h presence... yes > > > > checking for limits.h... yes > > > > checking ltdl.h usability... no > > > > checking ltdl.h presence... no > > > > checking for ltdl.h... no > > > > checking math.h usability... yes > > > > checking math.h presence... yes > > > > > > > > > > > > checking signal.h presence... yes > > > > checking for signal.h... yes > > > > checking socket.h usability... no > > > > checking socket.h presence... no > > > > checking for socket.h... no > > > > checking stdarg.h usability... yes > > > > checking stdarg.h presence... yes > > > > > > > > > > > > checking for main in -lnsl... yes > > > > checking for socket in -lsocket... no > > > > checking for main in -lwrap... yes > > > > checking for strdup... yes > > > > checking for strstr... yes > > > > > > > > checking for compress in -lz... yes > > > > checking for mysql_store_result in -lmysqlclient... yes > > > > checking for mysql_connect in -lmysqlclient... no > > > > checking mysql/mysql.h usability... yes > > > > checking mysql/mysql.h presence... yes > > > > checking for mysql/mysql.h... yes > > > > MySQL library and include file(s) were found! 
> > > > checking for PQconnectdb in -lpq... no > > > > > > > > > > > > > > > > No problems while executing " MAKE" > > > > > > > > Followed as the README doc recommends inside the ndoutils-1.4b7. > > > > Copied all the files to the right places as well as the database, create users > > grant permission and create the password. > > > > > > > > I ran the command ./installdb ( No problems). > > > > > > > > I change the proper parameter on ndo2db.cfg ( user and password) > > > > > > > > I execute the command : > > > > /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg > > > > > > > > > > > > Output from ls -la (inside the directory) > > > > > > > > xxxxx:/usr/local/nagios/var# ls -la > > > > total 224 > > > > drwxrwsr-x 5 nagios nagios 4096 2008-01-18 15:15 . > > > > drwxrwsr-x 8 nagios nagios 4096 2008-01-17 17:34 .. > > > > drwxrwsr-x 2 nagios nagios 4096 2008-01-18 00:00 archives > > > > -rw-r--r-- 1 nagios nagios 6 2008-01-18 12:39 nagios.lock > > > > -rw-rw-r-- 1 nagios nagios 148605 2008-01-18 15:15 nagios.log > > > > srwxrwxrwx 1 nagios nagios 0 2008-01-18 12:38 ndo.sock > > > > -rw-r--r-- 1 nagios nagios 12943 2008-01-18 12:39 objects.cache > > > > -rw------- 1 nagios nagios 14010 2008-01-18 14:39 retention.dat > > > > drwxrwsr-x 2 nagios nagios 4096 2008-01-18 12:39 rw > > > > drwxrwsr-x 3 nagios nagios 4096 2008-01-17 17:34 spool > > > > -rw-rw-r-- 1 nagios nagios 14061 2008-01-18 15:15 status.dat > > > > > > > > Every thing starts fine BUT......... > > > > > > > > That's the output, I got from nagios.log > > > > > > > > > > > > [1200676718] ndomod: Error writing to data sink! Some output may get lost... > > > > [1200676734] ndomod: Successfully reconnected to data sink! 0 items lost, 87 > > queued items to flush. > > > > [1200676734] ndomod: Successfully flushed 87 queued items to data sink. > > > > [1200676734] ndomod: Error writing to data sink! Some output may get lost... > > > > [1200676750] ndomod: Successfully reconnected to data sink! 
0 items lost, 74 > > queued items to flush. > > > > [1200676750] ndomod: Successfully flushed 74 queued items to data sink. > > > > [1200676750] ndomod: Error writing to data sink! Some output may get lost... > > > > [1200676766] ndomod: Successfully reconnected to data sink! 0 items lost, 78 > > queued items to flush. > > > > [1200676766] ndomod: Successfully flushed 78 queued items to data sink. > > > > [1200676766] ndomod: Error writing to data sink! Some output may get lost... > > > > [1200676782] ndomod: Successfully reconnected to data sink! 0 items lost, 90 > > queued items to flush. > > > > [1200676782] ndomod: Successfully flushed 90 queued items to data sink. > > > > [1200676782] ndomod: Error writing to data sink! Some output may get lost... > > > > [1200676798] ndomod: Successfully reconnected to data sink! 0 items lost, 70 > > queued items to flush. > > > > [1200676798] ndomod: Successfully flushed 70 queued items to data sink. > > > > [1200676798] ndomod: Error writing to data sink! Some output may get lost... > > > > > > > > That's the output from ps -ef > > > > > > > > > > > > root 20562 1 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start > > > > www-data 20567 20562 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start > > > > www-data 20568 20562 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start > > > > www-data 20569 20562 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start > > > > www-data 20570 20562 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start > > > > www-data 20571 20562 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start > > > > www-data 20575 20562 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start > > > > www-data 20576 20562 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start > > > > www-data 20577 20562 0 Jan17 ? 00:00:00 /usr/sbin/apache2 -k start > > > > root 30899 2227 0 10:27 ? 00:00:08 sshd: root at pts/0 > > > > root 30901 30899 0 10:27 pts/0 00:00:05 -bash > > > > root 31325 2227 0 10:49 ? 
00:00:02 sshd: root at pts/1 > > > > root 31335 31325 0 10:49 pts/1 00:00:00 -bash > > > > root 2822 1 0 11:10 ? 00:00:00 runsvdir -P /var/service log: > > ................................................ > > > > root 2850 2822 0 11:10 ? 00:00:00 runsv socklog-klog > > > > root 2851 2822 0 11:10 ? 00:00:00 runsv socklog-unix > > > > log 2852 2851 0 11:10 ? 00:00:00 svlogd main/main main/auth > > main/cron main/daemon main/debug main/ftp main/kern > > > > log 2853 2850 0 11:10 ? 00:00:00 svlogd -tt main/main > > > > root 2854 2850 0 11:10 ? 00:00:00 socklog ucspi > > > > nobody 2855 2851 0 11:10 ? 00:00:00 socklog unix /dev/log > > > > root 10421 30901 0 12:28 pts/0 00:00:00 mysql > > > > root 10521 30901 0 12:37 pts/0 00:00:00 nano ndo2db.cfg > > > > root 10591 1 0 12:38 pts/0 00:00:00 /bin/sh /usr/bin/mysqld_safe > > > > mysql 10628 10591 0 12:38 pts/0 00:00:02 /usr/sbin/mysqld > > --basedir=/usr --datadir=/var/lib/mysql --user=mysql --pid-fi > > > > root 10629 10591 0 12:38 pts/0 00:00:00 logger -p daemon.err -t > > mysqld_safe -i -t mysqld > > > > nagios 10693 1 0 12:38 ? 00:00:00 /usr/local/nagios/bin/ndo2db > > -c /usr/local/nagios/etc/ndo2db.cfg > > > > nagios 10736 1 0 12:39 ? 00:00:04 /usr/local/nagios/bin/nagios > > -d /usr/local/nagios/etc/nagios.cfg > > > > > > > > > > > > So, I have no idea how to solve the problem no data are being written to the > > database. Please somebody could help me guide to the right place. > > > > > > > > Thanks in advance. > > > > > > ------------------------------------------------------------------------- > > This SF.net email is sponsored by: Microsoft > > Defy all challenges. Microsoft(R) Visual Studio 2008. 
> > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > > _______________________________________________ > > Nagios-devel mailing list > > Nagios-devel at lists.sourceforge.net > > https://lists.sourceforge.net/lists/listinfo/nagios-devel > > > > ------------------------------------------------------------------------- > This SF.net email is sponsored by: Microsoft > Defy all challenges. Microsoft(R) Visual Studio 2008. > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > _______________________________________________ > Nagios-devel mailing list > Nagios-devel at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/nagios-devel ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ _______________________________________________ Nagios-devel mailing list Nagios-devel at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-devel From svalding at doverchem.com Fri Jan 18 21:58:40 2008 From: svalding at doverchem.com (Valdinger, Stephen (DOV, MSX)) Date: Fri, 18 Jan 2008 15:58:40 -0500 Subject: NDO Issue Message-ID: I'm running NDO 1.4b6 and Nagios 2.10. For the longest time things were running great, and I was actively using the data stored by the ndo2db daemon in the database to power NagVis. Today however, things started not working. Nothing has changed on the system, as I'm the only one who uses it! I've troubleshot, but to no avail. When I try to start up the ndo2db daemon with ndo2db -c /usr/local/Nagios/etc/ndo2db.cfg it creates TWO ndo2db daemons. I think this is the screw up, because nagvis reports Nagios as not running, when in fact ps -ef | grep Nagios proves it is! Any help out there for this issue? 
Stephen Valdinger
Dover Chemical Corporation
MIS Helpdesk Guru
330.365.3622
stephen.valdinger at doverchem.com

"Spiderpig, Spiderpig, does whatever a Spiderpig does! Can he swing, from a web? No he can't, he's a pig. Look out! He is a Spiderpig!!

From john.calcote at gmail.com Mon Jan 21 04:21:35 2008
From: john.calcote at gmail.com (John Calcote)
Date: Sun, 20 Jan 2008 20:21:35 -0700
Subject: patch for nagios 3rc1 - update
In-Reply-To: <4790F982.5040103@gmail.com>
References: <4790F982.5040103@gmail.com>
Message-ID: <47940FBF.7000106@gmail.com>

Hello again.

I'm sorry to do this, as it's rather embarrassing, but I've got a small update to the patch I submitted earlier. This version has two differences from the first patch I sent:

1. Most of the check_results_info structure initialization has been moved back above the broker INITIATE call. I found that I needed to use some of the fields in that global structure from within my INITIATE handler - mostly this is data that a proper handler needs to track for final results submission, but it's not passed any other way. I'll send another message later to discuss this in more detail.

2. I actually fixed the minor memory leaks that I mentioned in my previous note. ;) I guess I forgot to put those in before creating the patch.

Thanks for your patience,
John

John Calcote wrote:
> Hi!
>
> I've recently joined the Distributed Nagios eXecutor (DNX) team (Oct
> '07), and have just gotten around to porting DNX over to Nagios 3.
>
> I've been looking at nagios-3.0rc1, and I can see that a new type of
> NEBCALLBACK event has been added -
> NEBTYPE_SERVICECHECK_ASYNC_PRECHECK. I also note that the run-time
> semantics of NEBTYPE_SERVICECHECK_INITIATE have been modified between
> Nagios 2.x and 3.x. That is, the INITIATE check no longer honors the
> handler's return code.
>
> I have a small wish in the form of a patch to nagios-3.0rc1 that I
> would like to suggest. I hope you'll find this change reasonable. The
> patch simply moves the call to broker_service_check(
> NEBTYPE_SERVICECHECK_INITIATE, ... ) above the code that configures
> the temporary check output file and check_results_info structure. This
> puts the broker call in a similar relative position in 3.x code as in
> 2.x code, except for the check_result_info structure initialization,
> which didn't really have to be done before the broker call anyway. In
> addition, I moved the code that honors the CALLBACKOVERRIDE return
> value down to the INITIATE check, but left the code that honors the
> newer CALLBACKCANCEL in the ASYNC_PRECHECK check.
>
> This grants two advantages that I can see:
>
> 1. It makes Nagios 2.x modules more portable because Nagios 3.x would
> honor the same NEB module return values they did in Nagios 2.x code
> for the INITIATE check. In addition, it provides the useful ability for
> Nagios 3.x modules to cancel a check in the PRECHECK handler where it
> makes sense to do very little preprocessing before canceling the
> check; cancelling the check in the INITIATE handler is less efficient.
>
> 2. NEB modules that want to handle the check themselves with the
> CALLBACKOVERRIDE return value are going to have trouble processing
> macros the same way that Nagios does, unless Nagios provides some API
> routines to do it for them. It makes more sense (IMHO) to have the
> CALLBACKOVERRIDE honored by the INITIATE check, rather than the
> ASYNC_PRECHECK check.
>
> I'm hoping (if you all approve of this change), that we can get it
> into the Nagios 3.0 code base before 3.0 ships. Would it be too late
> for that?
>
> Regards,
> John
>
> PS. The patch also fixes a minor error path memory leak, where the
> raw_command string was not being freed in case of an out-of-memory
> condition while allocating the processed_command buffer.
>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: nagios-3.0rc1-dnx.patch
URL:

From igor-v-k at yandex.ru Tue Jan 22 07:52:08 2008
From: igor-v-k at yandex.ru (Igor Khristophorov)
Date: Tue, 22 Jan 2008 09:52:08 +0300
Subject: Nagios Tactical Overview - need to alarm only on hard states
In-Reply-To: <478DDBE1.2020908@usit.uio.no>
References: <20080115094755.GC3936@libra.onm.mtt.ru> <478DDBE1.2020908@usit.uio.no>
Message-ID: <20080122065208.GF3936@libra.onm.mtt.ru>

Hello,

Unfortunately, I don't have much time now to study the code to make the patch soon. If somebody has any ideas on which variables to check and in which places in the code, I could find some time to implement it sooner.

On Wed, Jan 16, 2008 at 11:26:41AM +0100, Ståle Askerød Johansen wrote:
>
> We also would very much like to have hard states only in the webgui. It
> seems it was possible in earlier versions, but not any longer. To have
> more flexibility like you mention would be very useful for us.
>

From jan.grant at bristol.ac.uk Tue Jan 22 17:03:54 2008
From: jan.grant at bristol.ac.uk (Jan Grant)
Date: Tue, 22 Jan 2008 16:03:54 +0000 (GMT)
Subject: Bug: cfg_dir doesn't work on solaris..?
Message-ID: <20080122160228.C7209@tribble.ilrt.bris.ac.uk>

Excuse me mailing the list with this: I can't find any public bug tracker and the mailing list archives appear a little light on details about this.

Nutshell version is, just built nagios 2.10 on solaris 10. cfg_dir doesn't appear to recurse into directories - as far as I can tell, because the cfg_dir code relies on d_type in struct dirent. I can prep a patch, but I've no idea whether this is a known issue or fixed or whatever.

Please let me know if this is known, fixed, patch requested, etc, and if I'm just being myopic or if there is indeed no bug tracker instance.

jan

PS. I appreciate your efforts!

-- 
jan grant, ISYS, University of Bristol. http://www.bris.ac.uk/
Tel +44 (0)117 3317661 http://ioctl.org/jan/
OORDBMSs make me feel old; I remember when this was all fields.

From lavalamp at spiritual-machines.org Tue Jan 22 18:08:19 2008
From: lavalamp at spiritual-machines.org (Brian A. Seklecki)
Date: Tue, 22 Jan 2008 12:08:19 -0500
Subject: Bug: cfg_dir doesn't work on solaris..?
In-Reply-To: <20080122160228.C7209@tribble.ilrt.bris.ac.uk>
References: <20080122160228.C7209@tribble.ilrt.bris.ac.uk>
Message-ID: <1201021699.26176.9.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com>

It doesn't work in FreeBSD either -- don't get your hopes up. We reported it over a month ago to no avail. W/O a tracker you're just pissing in the wind.

~BAS

On Tue, 2008-01-22 at 16:03 +0000, Jan Grant wrote:
> Excuse me mailing the list with this: I can't find any public bug
> tracker and the mailing list archives appear a little light on
> details about this.
>
> Nutshell version is, just built nagios 2.10 on solaris 10. cfg_dir
> doesn't appear to recurse into directories - as far as I can tell,
> because the cfg_dir code relies on d_type in struct dirent. I can prep a
> patch, but I've no idea whether this is a known issue or fixed or
> whatever.
>
> Please let me know if this is known, fixed, patch requested, etc, and if
> I'm just being myopic or if there is indeed no bug tracker instance.
>
> jan
>
> PS. I appreciate your efforts!
>
-- 
Brian A. Seklecki
Collaborative Fusion, Inc.

From jan.grant at bristol.ac.uk Tue Jan 22 19:16:11 2008
From: jan.grant at bristol.ac.uk (Jan Grant)
Date: Tue, 22 Jan 2008 18:16:11 +0000 (GMT)
Subject: Bug: cfg_dir doesn't work on solaris..?
In-Reply-To: <1201021699.26176.9.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com>
References: <20080122160228.C7209@tribble.ilrt.bris.ac.uk> <1201021699.26176.9.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com>
Message-ID: <20080122175918.F8384@tribble.ilrt.bris.ac.uk>

On Tue, 22 Jan 2008, Brian A. Seklecki wrote:
> It doesn't work in FreeBSD either -- don't get your hopes up. We
> reported it over a month ago to no avail. W/O a tracker you're just
> pissing in the wind.

That's rather odd; FreeBSD includes d_type (it's originally a BSDism I think) - perhaps the configure script doesn't detect it properly...

... it seems so: _DIRENT_HAVE_D_TYPE doesn't get defined, which is what the xodtemplate.c code is dependent on...

... ah, there it is, in a Linux /usr/include/dirent.h: if you look at a linux manpage for readdir, the NOTES says:

[[[
NOTES
Only the fields d_name and d_ino are specified in POSIX.1-2001. The remaining fields are available on many, but not all systems. Under glibc, programs can check for the availability of the fields not defined in POSIX.1 by testing whether the macros _DIRENT_HAVE_D_NAMLEN, _DIRENT_HAVE_D_RECLEN, _DIRENT_HAVE_D_OFF, or _DIRENT_HAVE_D_TYPE are defined.
]]]

... so this looks like a "glibcism".

To get this working under FreeBSD it may be simply sufficient to hack a

[[[
#define _DIRENT_HAVE_D_TYPE
]]]

into include/config.h before making...

...hm, having just tried it, that might just sort you out :-)

The situation on solaris is a little more harsh, since it has no d_type. I'll sort out a patch to do that and something a little more convincing for common.h that detects BSD systems.

Cheers,
jan

-- 
jan grant, ISYS, University of Bristol. http://www.bris.ac.uk/
Tel +44 (0)117 3317661 http://ioctl.org/jan/
ioctl(2): probably the coolest Unix system call in the world

From jan.grant at bristol.ac.uk Tue Jan 22 19:34:19 2008
From: jan.grant at bristol.ac.uk (Jan Grant)
Date: Tue, 22 Jan 2008 18:34:19 +0000 (GMT)
Subject: [patch] FreeBSD fix for cfg_dir Re: Bug: cfg_dir doesn't work on solaris..?
In-Reply-To: <20080122175918.F8384@tribble.ilrt.bris.ac.uk>
References: <20080122160228.C7209@tribble.ilrt.bris.ac.uk> <1201021699.26176.9.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com> <20080122175918.F8384@tribble.ilrt.bris.ac.uk>
Message-ID: <20080122183151.M8384@tribble.ilrt.bris.ac.uk>

On Tue, 22 Jan 2008, Jan Grant wrote:
> To get this working under FreeBSD it may be simply sufficient to hack a
>
> [[[
> #define _DIRENT_HAVE_D_TYPE
> ]]]
>
> into include/config.h before making...
>
> ...hm, having just tried it, that might just sort you out :-)

Attached is a patch against nagios-2.10 which detects the "missing" _DIRENT_HAVE_D_TYPE on BSD systems (the heuristic is to look for DT_UNKNOWN) and defines it in those cases.

It's a bit of a hack, but the attached diff makes nagios compile cleanly on FreeBSD and gets cfg_dir recursing properly.

I'll sort out something for Solaris over the next day or so.

Cheers,
jan

-- 
jan grant, ISYS, University of Bristol. http://www.bris.ac.uk/
Tel +44 (0)117 3317661 http://ioctl.org/jan/
Unfortunately, I have a very good idea how fast my keys are moving.
-------------- next part --------------
--- include/config.h.in.orig 2008-01-22 18:22:04.000000000 +0000
+++ include/config.h.in 2008-01-22 18:27:08.000000000 +0000
@@ -218,6 +218,12 @@
 #undef HAVE_DIRENT_H
 #ifdef HAVE_DIRENT_H
 #include <dirent.h>
+#ifndef _DIRENT_HAVE_D_TYPE
+#ifdef DT_UNKNOWN
+/* A good indication of d_type: set the following glibc flag */
+#define _DIRENT_HAVE_D_TYPE
+#endif
+#endif
 #endif
 
 #undef HAVE_PTHREAD_H
-------------- next part --------------
------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ -------------- next part -------------- _______________________________________________ Nagios-devel mailing list Nagios-devel at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-devel From john.calcote at gmail.com Tue Jan 22 23:35:40 2008 From: john.calcote at gmail.com (John Calcote) Date: Tue, 22 Jan 2008 15:35:40 -0700 Subject: Nagios event broker architecture comments Message-ID: <3ee91eb90801221435i60df0e1fh908474bd25092f5c@mail.gmail.com> To the Nagios developers, Well, I said in a previous message that I'd write more about external agents handling checks in Nagios. As a developer on the dnx project, I can say truthfully that dnx has a vested interest in the general NEB architecture. As I've worked on porting DNX to Nagios 3.x, I've found a few architectural issues that I'd like to bring up. Nagios makes it possible for a NEB module to override (using the CALLBACKOVERRIDE return value) a service check at a key event that is published by the event broker during the processing of a service check. It's reasonably clear from comments in the Nagios code, that Nagios developers intended for this return value to be used by NEB modules to stop the processing of a check by Nagios so that the NEB module could execute the check and post the results itself. Needless to say, it's fairly significant to Nagios how the NEB module proceeds, once it returns CALLBACKOVERRIDE. If Nagios truly expects a NEB module to handle the processing of a check, and the subsequent submission of check results, then Nagios needs to make proper results submission possible. I posted a (updated) patch yesterday that makes some minor changes to nagios-3.0rc1/base/check.c. While these changes make it POSSIBLE for DNX to handle a Nagios service check, and post proper results, I feel like there are better ways to do it. 
Even with the patch I submitted, the DNX NEB module must directly access various global variables in the Nagios process space. For one thing, DNX needs data from various fields of the global check_result_info structure that is populated just before the broker INITIATE event is published. These data items are essential to submitting proper results info during handling of a service check. They include the check_options, schedule and reschedule flags, and the latency value. While both latency and check_options values can be found in the service structure (which is made available to the event handler), these fields do not contain the values actually written to the results file during results submission, so the proper values must be accessible to the event handler so that it can store them for later results submission. Another issue that I feel is important to address is the global symbol space to which DNX and other NEB modules have access. DNX uses a couple of "helper" functions provided by Nagios (unintentionally, I'm sure). DNX uses escape_newlines and move_check_result_to_queue, during results posting. It saves DNX duplicating a LOT of Nagios code while submitting service check results. (These functions were really great finds!) However, since posting results to Nagios is a critical bit of functionality - both for DNX and for Nagios - it might be nice if Nagios provided an actual API call that was published as part of the NEB interface documentation. If a NEB module intends to use CALLBACKOVERRIDE, it should be able to publish the results of an overridden service check in exactly the same way that Nagios does. This can be easily facilitated by having Nagios use the same API function to publish results from checks that Nagios itself executes. Sorry for the long-winded message. I just felt that these architectural issues should be addressed. Comments would be very much appreciated. :) Regards, John Calcote Sr. Software Engineer LDS Church, ICS Dept. 
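
A rough sketch of the kind of shared submission call described above, under stated assumptions: every name here (demo_check_result, demo_format_check_result) is hypothetical and does not exist in Nagios, and the key=value layout is illustrative only, not the actual check result file format. The point is simply that check_options, the schedule and reschedule flags, and latency travel with the result, so a NEB module and the core could both hand results to the same function.

```c
#include <stdio.h>

/* Hypothetical result record for illustration; NOT a Nagios structure. */
struct demo_check_result {
    const char *host_name;
    const char *service_description;
    int check_options;     /* options the check was started with */
    int scheduled_check;   /* nonzero if this was a scheduled check */
    int reschedule_check;  /* nonzero if the check should be rescheduled */
    double latency;        /* seconds between scheduled and actual start */
    int return_code;       /* plugin exit status */
    const char *output;    /* plugin output text */
};

/*
 * Hypothetical single entry point (an assumption, not a Nagios API):
 * formats one result record into buf as key=value lines.
 * Returns the number of characters written, or -1 if buf is too small.
 */
int demo_format_check_result(const struct demo_check_result *r,
                             char *buf, size_t len)
{
    int n = snprintf(buf, len,
                     "host_name=%s\n"
                     "service_description=%s\n"
                     "check_options=%d\n"
                     "scheduled_check=%d\n"
                     "reschedule_check=%d\n"
                     "latency=%.3f\n"
                     "return_code=%d\n"
                     "output=%s\n",
                     r->host_name, r->service_description,
                     r->check_options, r->scheduled_check,
                     r->reschedule_check, r->latency,
                     r->return_code, r->output);

    /* snprintf reports the length it wanted; treat truncation as failure */
    if (n < 0 || (size_t)n >= len)
        return -1;
    return n;
}
```

With something along these lines, a module that returned CALLBACKOVERRIDE would only need to keep the record it was handed at INITIATE time and pass it back when the check finishes, rather than reading globals from the Nagios process.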
------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From ae at op5.se Wed Jan 23 09:41:40 2008 From: ae at op5.se (Andreas Ericsson) Date: Wed, 23 Jan 2008 09:41:40 +0100 Subject: [patch] FreeBSD fix for cfg_dir Re: Bug: cfg_dir doesn't work on solaris..? In-Reply-To: <20080122183151.M8384@tribble.ilrt.bris.ac.uk> References: <20080122160228.C7209@tribble.ilrt.bris.ac.uk> <1201021699.26176.9.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com> <20080122175918.F8384@tribble.ilrt.bris.ac.uk> <20080122183151.M8384@tribble.ilrt.bris.ac.uk> Message-ID: <4796FDC4.1040105@op5.se> Jan Grant wrote: > > Attached is a patch against nagios-2.10 which detects the "missing" > _DIRENT_HAVE_D_TYPE on BSD systems (the heuristic is to look for > DT_UNKNOWN) and defines it in those cases. > The patch looks decent, but it's the wrong fix. When d_type is present, Nagios will still stat() the dirent to see if it's a directory before recursing into it. A more proper fix would be to ignore d_type and just rely on S_ISDIR(st.st_mode) to check for directories. That'll make it work the same on every system too, which is most definitely a Good Thing(tm). I'll get to work on this when I have time, although I fear it won't be until I've completed my current project at work and then have looked into the config parsing code and the event-queue problems. A Nagios hackathon would be a neat idea, and small stuff like this could get sorted in no-time. Hmm, perhaps I shall see if I can get the boss to sponsor one :-) -- Andreas Ericsson andreas.ericsson at op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. 
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From jan.grant at bristol.ac.uk Wed Jan 23 13:09:37 2008 From: jan.grant at bristol.ac.uk (Jan Grant) Date: Wed, 23 Jan 2008 12:09:37 +0000 (GMT) Subject: another [patch] Solaris fix for cfg_dir Re: Bug: cfg_dir doesn't work on solaris..? In-Reply-To: <20080122183151.M8384@tribble.ilrt.bris.ac.uk> References: <20080122160228.C7209@tribble.ilrt.bris.ac.uk> <1201021699.26176.9.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com> <20080122175918.F8384@tribble.ilrt.bris.ac.uk> <20080122183151.M8384@tribble.ilrt.bris.ac.uk> Message-ID: <20080123120211.B45874@tribble.ilrt.bris.ac.uk> Attached is a patch against nagios-2.10 which carefully cuts out the uses of d_type in dirent, if that's missing, and just falls back to using stat (which follows symlinks). This makes recursive cfg_dir structures work on Solaris 10. Excuse the hackery: I've just carefully inserted additional checks for _DIRENT_HAVE_D_TYPE around block ends. Cheers, jan PS. I'm reasonably certain that what nagios tries to do in the cfg_dir handling is unnecessarily clever; since stat() follows symlinks anyway, just checking for file or directory types with a single stat() call should be sufficient. -- jan grant, ISYS, University of Bristol. http://www.bris.ac.uk/ Tel +44 (0)117 3317661 http://ioctl.org/jan/ Spreadsheet through network. Oh yeah. 
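
A minimal sketch of the single-stat() idea from the PS above, assuming a hypothetical helper named classify_entry() that is not part of Nagios. Because stat() follows symlinks, a link to a directory is reported as a directory and a link to a regular file as a regular file, so no separate symlink case is needed and d_type is never consulted.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical classification used only for this sketch. */
enum entry_kind {
    ENTRY_ERROR = -1, /* stat() failed: missing entry or dangling symlink */
    ENTRY_OTHER = 0,  /* socket, fifo, device, ... */
    ENTRY_FILE  = 1,  /* regular file, or symlink to one */
    ENTRY_DIR   = 2   /* directory, or symlink to one */
};

/* Classify a path with one stat() call, without using dirent's d_type. */
enum entry_kind classify_entry(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return ENTRY_ERROR;

    if (S_ISDIR(st.st_mode))
        return ENTRY_DIR;
    if (S_ISREG(st.st_mode))
        return ENTRY_FILE;
    return ENTRY_OTHER;
}
```

A directory walker would call this on the full path of each entry (the directory name joined with d_name), recursing on ENTRY_DIR and reading the file on ENTRY_FILE.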
-------------- next part -------------- --- xdata/xodtemplate.c Fri Jan 19 22:02:01 2007 +++ xdata/xodtemplate.c Wed Jan 23 11:57:29 2008 @@ -392,11 +392,13 @@ #ifdef _DIRENT_HAVE_D_TYPE /* only process normal files and symlinks */ if(dirfile->d_type==DT_UNKNOWN){ +#endif x=stat(file,&stat_buf); if(x==0){ if(!S_ISREG(stat_buf.st_mode) && !S_ISLNK(stat_buf.st_mode)) continue; } +#ifdef _DIRENT_HAVE_D_TYPE } else{ if(dirfile->d_type!=DT_REG && dirfile->d_type!=DT_LNK) @@ -417,6 +419,7 @@ if(dirfile->d_type==DT_UNKNOWN || dirfile->d_type==DT_DIR || dirfile->d_type==DT_LNK){ if(dirfile->d_type==DT_UNKNOWN){ +#endif x=stat(file,&stat_buf); if(x==0){ if(!S_ISDIR(stat_buf.st_mode) && !S_ISLNK(stat_buf.st_mode)) @@ -424,12 +427,15 @@ } else continue; +#ifdef _DIRENT_HAVE_D_TYPE } +#endif /* ignore current, parent and hidden directory entries */ if(dirfile->d_name[0]=='.') continue; +#ifdef _DIRENT_HAVE_D_TYPE /* check that a symlink points to a dir */ if(dirfile->d_type==DT_LNK || (dirfile->d_type==DT_UNKNOWN && S_ISLNK(stat_buf.st_mode))){ @@ -505,6 +511,7 @@ /* Otherwise, we may proceed! */ } +#endif /* process the config directory */ result=xodtemplate_process_config_dir(file,options); @@ -512,6 +519,7 @@ /* break out if we encountered an error */ if(result==ERROR) break; +#ifdef _DIRENT_HAVE_D_TYPE } #endif } -------------- next part -------------- ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. 
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ -------------- next part -------------- _______________________________________________ Nagios-devel mailing list Nagios-devel at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-devel From jan.grant at bristol.ac.uk Wed Jan 23 13:12:51 2008 From: jan.grant at bristol.ac.uk (Jan Grant) Date: Wed, 23 Jan 2008 12:12:51 +0000 (GMT) Subject: [patch] FreeBSD fix for cfg_dir Re: Bug: cfg_dir doesn't work on solaris..? In-Reply-To: <4796FDC4.1040105@op5.se> References: <20080122160228.C7209@tribble.ilrt.bris.ac.uk> <1201021699.26176.9.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com> <20080122175918.F8384@tribble.ilrt.bris.ac.uk> <20080122183151.M8384@tribble.ilrt.bris.ac.uk> <4796FDC4.1040105@op5.se> Message-ID: <20080123121037.R45874@tribble.ilrt.bris.ac.uk> On Wed, 23 Jan 2008, Andreas Ericsson wrote: > Jan Grant wrote: > > > > Attached is a patch against nagios-2.10 which detects the "missing" > > _DIRENT_HAVE_D_TYPE on BSD systems (the heuristic is to look for > > DT_UNKNOWN) and defines it in those cases. > > > > The patch looks decent, but it's the wrong fix. When d_type is present, > Nagios will still stat() the dirent to see if it's a directory before > recursing into it. A more proper fix would be to ignore d_type and > just rely on S_ISDIR(st.st_mode) to check for directories. > > That'll make it work the same on every system too, which is most > definitely a Good Thing(tm). > > I'll get to work on this when I have time, although I fear it won't > be until I've completed my current project at work and then have > looked into the config parsing code and the event-queue problems. > > A Nagios hackathon would be a neat idea, and small stuff like this > could get sorted in no-time. Hmm, perhaps I shall see if I can get > the boss to sponsor one :-) I've just supplied a second patch that works in the absence of _DIRENT_HAVE_D_TYPE (ie, on Solaris amongst others). 
By chopping out everything bracketed by those #ifdefs, the result should be what you're after.

FWIW I could probably find some time to attend a (European) hackathon.

Cheers,
jan

-- 
jan grant, ISYS, University of Bristol. http://www.bris.ac.uk/
Tel +44 (0)117 3317661 http://ioctl.org/jan/
Unfortunately, I have a very good idea how fast my keys are moving.

From ae at op5.se Wed Jan 23 14:04:43 2008
From: ae at op5.se (Andreas Ericsson)
Date: Wed, 23 Jan 2008 14:04:43 +0100
Subject: another [patch] Solaris fix for cfg_dir Re: Bug: cfg_dir doesn't work on solaris..?
In-Reply-To: <20080123120211.B45874@tribble.ilrt.bris.ac.uk>
References: <20080122160228.C7209@tribble.ilrt.bris.ac.uk> <1201021699.26176.9.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com> <20080122175918.F8384@tribble.ilrt.bris.ac.uk> <20080122183151.M8384@tribble.ilrt.bris.ac.uk> <20080123120211.B45874@tribble.ilrt.bris.ac.uk>
Message-ID: <47973B6B.6040802@op5.se>

Jan Grant wrote:
> Attached is a patch against nagios-2.10 which carefully cuts out the
> uses of d_type in dirent, if that's missing, and just falls back to
> using stat (which follows symlinks).
>
> This makes recursive cfg_dir structures work on Solaris 10.
>
> Excuse the hackery: I've just carefully inserted additional checks for
> _DIRENT_HAVE_D_TYPE around block ends.
>

Umm... This patch makes the code excessively hard to read, since block starts and ends are now inside #ifdef's. I was thinking something along the lines of

	while ((de = readdir(dirp))) {
		struct stat st;

		/* de->d_name is relative to the directory being read */
		if (stat(de->d_name, &st) < 0)
			return -1; /* error */

		switch (st.st_mode & S_IFMT) {
		case S_IFDIR:
			/* recurse */
			break;
		case S_IFREG:
			/* read */
			break;
		default:
			/* unhandled entry (socket, fifo, ... ) */
			break;
		}
	}

which doesn't have any references to d_type at all. Note that you don't need to check for S_IFLNK, as stat() will check the target of the link rather than the link itself.

>
> PS. I'm reasonably certain that what nagios tries to do in the cfg_dir
> handling is unnecessarily clever; since stat() follows symlinks anyway,
> just checking for file or directory types with a single stat() call
> should be sufficient.
>

Yes, that's exactly what I meant. :)

Care to resend?

-- 
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

From joe.six at mahaska.org Wed Jan 23 17:00:03 2008
From: joe.six at mahaska.org (Joe Six)
Date: Wed, 23 Jan 2008 10:00:03 -0600
Subject: NDOUtils 1.4b4 Question
Message-ID:

All,

I recently noticed that every time the Nagios process is restarted, the hostgroup_id values in the nagios_hostgroups table are changed. No hostgroups have been changed in the nagios configuration files. Is the update of the hostgroup_id value to be expected with each Nagios restart?

I am using NDOUtils version 1.4b4 and Nagios version 2.9.

Thanks
Joe
1/23/2008

From lavalamp at spiritual-machines.org Wed Jan 23 17:11:26 2008
From: lavalamp at spiritual-machines.org (Brian A. Seklecki)
Date: Wed, 23 Jan 2008 11:11:26 -0500
Subject: [patch] FreeBSD fix for cfg_dir Re: Bug: cfg_dir doesn't work on solaris..?
In-Reply-To: <4796FDC4.1040105@op5.se> References: <20080122160228.C7209@tribble.ilrt.bris.ac.uk> <1201021699.26176.9.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com> <20080122175918.F8384@tribble.ilrt.bris.ac.uk> <20080122183151.M8384@tribble.ilrt.bris.ac.uk> <4796FDC4.1040105@op5.se> Message-ID: <1201104686.26176.91.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com> > A Nagios hackathon would be a neat idea, and small stuff like this > could get sorted in no-time. Hmm, perhaps I shall see if I can get > the boss to sponsor one :-) I'll have it catered. ~BAS ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From nagios at nagios.org Wed Jan 23 17:28:28 2008 From: nagios at nagios.org (Ethan Galstad) Date: Wed, 23 Jan 2008 10:28:28 -0600 Subject: Nagios-plugins links on Nagios CVS page (www.nagios.org) In-Reply-To: <4787AE21.60201@zango.com> References: <4787AE21.60201@zango.com> Message-ID: <47976B2C.7050005@nagios.org> Thomas Guyot-Sionnest wrote: > The links and CVS info for Nagios-plugins are wrong on the following page: > > http://www.nagios.org/development/cvs.php > >> Browse The CVS Tree > > This is now a Subversion Tree. URL: > http://nagiosplug.svn.sourceforge.net/viewvc/nagiosplug/ > >> Daily CVS Snapshots > > Although the link is still good, this should be called a Subversion > snapshot. > >> Anonymous CVS Access > > Subversion access: > svn co https://nagiosplug.svn.sourceforge.net/svnroot/nagiosplug nagiosplug > >> Automatic Notification Of CVS Commits > > Still the same mailing list, but Subversion commits now. > > > For more details: > > http://sourceforge.net/svn/?group_id=29880 > Thanks Thomas - Links are now updated on the site. 
Ethan Galstad Nagios Developer ___ Email: nagios at nagios.org Web: www.nagios.org ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From jan.grant at bristol.ac.uk Wed Jan 23 20:03:57 2008 From: jan.grant at bristol.ac.uk (Jan Grant) Date: Wed, 23 Jan 2008 19:03:57 +0000 (GMT) Subject: another [patch] Solaris fix for cfg_dir Re: Bug: cfg_dir doesn't work on solaris..? In-Reply-To: <47973B6B.6040802@op5.se> References: <20080122160228.C7209@tribble.ilrt.bris.ac.uk> <1201021699.26176.9.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com> <20080122175918.F8384@tribble.ilrt.bris.ac.uk> <20080122183151.M8384@tribble.ilrt.bris.ac.uk> <20080123120211.B45874@tribble.ilrt.bris.ac.uk> <47973B6B.6040802@op5.se> Message-ID: <20080123185911.Q49258@tribble.ilrt.bris.ac.uk> On Wed, 23 Jan 2008, Andreas Ericsson wrote: > Care to resend? Attached; much cleaner. Seems ok on Solaris. Note, existing indentation style preserved. Cheers, jan -- jan grant, ISYS, University of Bristol. http://www.bris.ac.uk/ Tel +44 (0)117 3317661 http://ioctl.org/jan/ "I like oranges more than apples!?" - that's like comparing apples and oranges! -------------- next part -------------- --- xdata/xodtemplate.c.orig Wed Jan 23 14:56:37 2008 +++ xdata/xodtemplate.c Wed Jan 23 19:00:46 2008 @@ -369,7 +369,7 @@ /* open the directory for reading */ dirp=opendir(dirname); - if(dirp==NULL){ + if(dirp==NULL){ #ifdef NSCORE snprintf(temp_buffer,sizeof(temp_buffer)-1,"Error: Could not open config directory '%s' for reading.\n",dirname); temp_buffer[sizeof(temp_buffer)-1]='\x0'; @@ -381,139 +381,57 @@ /* process all files in the directory... 
*/ while((dirfile=readdir(dirp))!=NULL){ + /* Skip hidden files and directories, and current and parent dir */ + if(dirfile->d_name[0]=='.') + continue; + /* create /path/to/file */ snprintf(file,sizeof(file),"%s/%s",dirname,dirfile->d_name); file[sizeof(file)-1]='\x0'; /* process this if it's a non-hidden config file... */ - x=strlen(dirfile->d_name); - if(x>4 && dirfile->d_name[0]!='.' && !strcmp(dirfile->d_name+(x-4),".cfg")){ - -#ifdef _DIRENT_HAVE_D_TYPE - /* only process normal files and symlinks */ - if(dirfile->d_type==DT_UNKNOWN){ - x=stat(file,&stat_buf); - if(x==0){ - if(!S_ISREG(stat_buf.st_mode) && !S_ISLNK(stat_buf.st_mode)) - continue; - } - } - else{ - if(dirfile->d_type!=DT_REG && dirfile->d_type!=DT_LNK) - continue; - } + if (stat(file,&stat_buf) == -1) { + /* An error */ +#ifdef NSCORE + snprintf(temp_buffer,sizeof(temp_buffer)-1,"Error: Could not open config directory member '%s' for reading.\n",file); + temp_buffer[sizeof(temp_buffer)-1]='\x0'; + write_to_logs_and_console(temp_buffer,NSLOG_CONFIG_ERROR,TRUE); #endif + closedir(dirp); + return ERROR; + } - /* process the config file */ - result=xodtemplate_process_config_file(file,options); + switch(stat_buf.st_mode & S_IFMT){ - /* break out if we encountered an error */ - if(result==ERROR) - break; - } + case S_IFREG: + x=strlen(dirfile->d_name); + if(x<=4 || strcmp(dirfile->d_name+(x-4),".cfg")) + break; -#ifdef _DIRENT_HAVE_D_TYPE - /* recurse into subdirectories... 
*/ - if(dirfile->d_type==DT_UNKNOWN || dirfile->d_type==DT_DIR || dirfile->d_type==DT_LNK){ + /* process the config file */ + result=xodtemplate_process_config_file(file,options); - if(dirfile->d_type==DT_UNKNOWN){ - x=stat(file,&stat_buf); - if(x==0){ - if(!S_ISDIR(stat_buf.st_mode) && !S_ISLNK(stat_buf.st_mode)) - continue; - } - else - continue; - } - - /* ignore current, parent and hidden directory entries */ - if(dirfile->d_name[0]=='.') - continue; - - /* check that a symlink points to a dir */ - - if(dirfile->d_type==DT_LNK || (dirfile->d_type==DT_UNKNOWN && S_ISLNK(stat_buf.st_mode))){ - - readlink_count=readlink(file,link_buffer,MAX_FILENAME_LENGTH); - - /* Handle special case with maxxed out buffer */ - if(readlink_count==MAX_FILENAME_LENGTH){ -#ifdef NSCORE - snprintf(temp_buffer,sizeof(temp_buffer)-1,"Error: Cannot follow symlink '%s' - Too big!\n",file); - temp_buffer[sizeof(temp_buffer)-1]='\x0'; - write_to_logs_and_console(temp_buffer,NSLOG_CONFIG_ERROR,TRUE); -#endif + if(result==ERROR){ + closedir(dirp); return ERROR; - } + } - /* Check if reading symlink failed */ - if(readlink_count==-1){ -#ifdef NSCORE - snprintf(temp_buffer,sizeof(temp_buffer)-1,"Error: Cannot read symlink '%s': %s!\n",file, strerror(errno)); - temp_buffer[sizeof(temp_buffer)-1]='\x0'; - write_to_logs_and_console(temp_buffer,NSLOG_CONFIG_ERROR,TRUE); -#endif - return ERROR; - } + break; - /* terminate string */ - link_buffer[readlink_count]='\x0'; - - /* create new symlink buffer name */ - if(link_buffer[0]=='/'){ - /* full path */ - snprintf(linked_to_buffer,sizeof(linked_to_buffer)-1,"%s",link_buffer); - linked_to_buffer[sizeof(linked_to_buffer)-1]='\x0'; - } - else{ - /* relative path */ - snprintf(linked_to_buffer,sizeof(linked_to_buffer)-1,"%s/%s",dirname,link_buffer); - linked_to_buffer[sizeof(linked_to_buffer)-1]='\x0'; - } + case S_IFDIR: + /* recurse into subdirectories... 
*/ + result=xodtemplate_process_config_dir(file,options); - /* - * At this point, we know it's a symlink - - * now check for whether it points to a - * directory or not - */ - - x=stat(linked_to_buffer,&stat_buf); - if(x!=0){ - - /* non-existent symlink - bomb out */ - if(errno==ENOENT){ -#ifdef NSCORE - snprintf(temp_buffer,sizeof(temp_buffer)-1,"Error: symlink '%s' points to non-existent '%s'!\n",file,link_buffer); - temp_buffer[sizeof(temp_buffer)-1]='\x0'; - write_to_logs_and_console(temp_buffer,NSLOG_CONFIG_ERROR,TRUE); -#endif - return ERROR; - } - -#ifdef NSCORE - snprintf(temp_buffer,sizeof(temp_buffer)-1,"Error: Cannot stat symlinked from '%s' to %s!\n",file,link_buffer); - temp_buffer[sizeof(temp_buffer)-1]='\x0'; - write_to_logs_and_console(temp_buffer,NSLOG_CONFIG_ERROR,TRUE); -#endif + if(result==ERROR){ + closedir(dirp); return ERROR; - } + } - if(!S_ISDIR(stat_buf.st_mode)){ - /* Not a symlink to a dir - skip */ - continue; - } + break; - /* Otherwise, we may proceed! */ - } + /* Everything else we ignore */ + } - /* process the config directory */ - result=xodtemplate_process_config_dir(file,options); - - /* break out if we encountered an error */ - if(result==ERROR) - break; - } -#endif } closedir(dirp); @@ -523,7 +441,7 @@ #endif return result; - } + } /* process data in a specific config file */ -------------- next part -------------- ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. 
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ -------------- next part -------------- _______________________________________________ Nagios-devel mailing list Nagios-devel at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-devel From nagios at nagios.org Thu Jan 24 03:00:32 2008 From: nagios at nagios.org (Ethan Galstad) Date: Wed, 23 Jan 2008 20:00:32 -0600 Subject: another [patch] Solaris fix for cfg_dir Re: Bug: cfg_dir doesn't work on solaris..? In-Reply-To: <20080123185911.Q49258@tribble.ilrt.bris.ac.uk> References: <20080122160228.C7209@tribble.ilrt.bris.ac.uk> <1201021699.26176.9.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com> <20080122175918.F8384@tribble.ilrt.bris.ac.uk> <20080122183151.M8384@tribble.ilrt.bris.ac.uk> <20080123120211.B45874@tribble.ilrt.bris.ac.uk> <47973B6B.6040802@op5.se> <20080123185911.Q49258@tribble.ilrt.bris.ac.uk> Message-ID: <4797F140.3080908@nagios.org> Jan Grant wrote: > On Wed, 23 Jan 2008, Andreas Ericsson wrote: > >> Care to resend? > > Attached; much cleaner. Seems ok on Solaris. Note, existing indentation > style preserved. > > Cheers, > jan > > Thanks Jan! The patch looks great - I'll get this applied to both the 2.x and 3.x code branches. Ethan Galstad Nagios Developer ___ Email: nagios at nagios.org Web: www.nagios.org ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. 
From nagios at nagios.org Thu Jan 24 03:50:48 2008
From: nagios at nagios.org (Ethan Galstad)
Date: Wed, 23 Jan 2008 20:50:48 -0600
Subject: Nagios 3.0rc1 extinfo.cgi Segmentation fault
In-Reply-To: <200801042154.39453.pitchfork@ederdrom.de>
References: <200801042154.39453.pitchfork@ederdrom.de>
Message-ID: <4797FD08.4000409@nagios.org>

Joerg Linge wrote:
> Hi List,
> calling extinfo.cgi with non-existent host or service values via QUERY_STRING results in a segmentation fault.
>
> Now some debug info ...
>
> 0 - [nagios at kassandra /usr/local/src/nagios-3.0rc1/cgi]$ export REMOTE_USER=linge
> 0 - [nagios at kassandra /usr/local/src/nagios-3.0rc1/cgi]$ export REQUEST_METHOD=GET
> 0 - [nagios at kassandra /usr/local/src/nagios-3.0rc1/cgi]$ export QUERY_STRING="type=2&host=wrong_host&service=wrong_service"
>
>
> 1 - [nagios at kassandra /usr/local/src/nagios-3.0rc1/cgi]$ ./extinfo.cgi
> Cache-Control: no-store
> Pragma: no-cache
> Refresh: 90
> Last-Modified: Fri, 04 Jan 2008 20:46:55 GMT
> Expires: Thu, 01 Jan 1970 00:00:00 GMT
> Content-type: text/html
>
> [... stripped content ...]
>
> Segmentation fault
> 139 - [nagios at kassandra /usr/local/src/nagios-3.0rc1/cgi]$
>
> gdb returns the following info:
>
> [.... stripped content ..]
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x08050005 in main () at extinfo.c:452
> 452 if(temp_service->action_url!=NULL && strcmp(temp_service->action_url,"")){
> (gdb)
>
> So now it's your turn ;-)
>
> Kind regards
> Jörg
>

Thanks Joerg! Patch will be in CVS shortly...

Ethan Galstad
Nagios Developer
___
Email: nagios at nagios.org
Web: www.nagios.org

From nagios at nagios.org Thu Jan 24 03:50:57 2008
From: nagios at nagios.org (Ethan Galstad)
Date: Wed, 23 Jan 2008 20:50:57 -0600
Subject: [patch] summary.cgi Nagios 3.0rc1 broken
In-Reply-To: <200801062008.55968.pitchfork@ederdrom.de>
References: <200801062008.55968.pitchfork@ederdrom.de>
Message-ID: <4797FD11.5050403@nagios.org>

Joerg Linge wrote:
> Hi Ethan,
> summary.cgi does not recognize the host query string. So here is a small patch against summary.c Revision 1.25 ( 3.0rc1 )
>
> Kind regards,
> Jörg
>
> diff -u summary.c.orig summary.c
> --- summary.c.orig 2008-01-06 19:34:00.000000000 +0100
> +++ summary.c 2008-01-06 19:56:17.000000000 +0100
> @@ -1170,7 +1170,7 @@
>  break;
>  }
>
> - if((target_host_name=(char *)strdup(target_host_name))==NULL)
> + if((target_host_name=(char *)strdup(variables[x]))==NULL)
>  target_host_name="";
>  strip_html_brackets(target_host_name);
>

Thanks Joerg! Patch will be in CVS shortly...

Ethan Galstad
Nagios Developer
___
Email: nagios at nagios.org
Web: www.nagios.org

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From nagios at nagios.org Thu Jan 24 03:57:15 2008 From: nagios at nagios.org (Ethan Galstad) Date: Wed, 23 Jan 2008 20:57:15 -0600 Subject: [PATCH] Fix "permission denied" on rename() under Cygwin In-Reply-To: <004f01c8495b$d6e0afc0$0116a8c0@iat.unileipzig.de> References: <004f01c8495b$d6e0afc0$0116a8c0@iat.unileipzig.de> Message-ID: <4797FE8B.9020801@nagios.org> Michael Bunk wrote: > Hello, > > Nagios 3.0RC1 doesn't work on Cygwin. rename() with a new filename, which is > already held open, results in a "Permission Denied". This situation appears > with the current code while reaping check results. > > Attached patch simply closes the new file just opened with mkstemp() - before > renaming instead of afterwards. Securitywise this is a bad solution, because > it opens a race condition, but at least Nagios works. > > Please incorporate the bugfix included in the attached patch in a secure form, > so that Nagios will be usable under Cygwin (if other Cygwin issues are > resolved as well). > > Best regards, > Michael Bunk > Aha! So there are actually people running Nagios under Cygwin. I was wondering about that... :-) Not the best security-wise, but on Windows there are worse things. I'll get this patch into CVS shortly... Ethan Galstad Nagios Developer ___ Email: nagios at nagios.org Web: www.nagios.org ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From nagios at nagios.org Thu Jan 24 04:28:04 2008 From: nagios at nagios.org (Ethan Galstad) Date: Wed, 23 Jan 2008 21:28:04 -0600 Subject: [PATCH] module/helloworld.c vs. 
"--disable-event-broker" In-Reply-To: <20071227204443.GL28313306@CIS.FU-Berlin.DE> References: <20071227204443.GL28313306@CIS.FU-Berlin.DE> Message-ID: <479805C4.7080702@nagios.org> Holger Weiss wrote: > "./configure --disable-event-broker; make all" currently fails due to > the neb_set_module_info() calls added to module/helloworld.c (r1.6). I > guess "make all" should simply omit building module/helloworld.o if > "--disable-event-broker" was specified, as e.g. with the attached patch? > > Holger > Excellent idea Holger! I'll get this into CVS shortly... Ethan Galstad Nagios Developer ___ Email: nagios at nagios.org Web: www.nagios.org ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From ton.voon at altinity.com Thu Jan 24 12:07:16 2008 From: ton.voon at altinity.com (Ton Voon) Date: Thu, 24 Jan 2008 11:07:16 +0000 Subject: NSCA error with --single and aggregate writes enabled In-Reply-To: <478D32AF.4040008@nagios.org> References: <21E3E5FF-1C15-44C1-8BFA-01C286F3DC49@altinity.com> <4783C573.5000701@zango.com> <478D32AF.4040008@nagios.org> Message-ID: <141BD9DE-5F43-4A09-B083-CE54A21A6DAA@altinity.com> On 15 Jan 2008, at 22:24, Ethan Galstad wrote: > Thanks - patch will be in CVS soon. Thanks Ethan. I've noticed that you haven't included the test case changes. Could those be added? If tests are kept up to date, that encourages more use of it! Ton http://www.altinity.com UK: +44 (0)870 787 9243 US: +1 866 879 9184 Fax: +44 (0)845 280 1725 Skype: tonvoon ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. 
From einar.indrida at gmail.com Thu Jan 24 12:40:09 2008
From: einar.indrida at gmail.com (Einar Indridason)
Date: Thu, 24 Jan 2008 11:40:09 +0000
Subject: thoughts about a SQL backend, and an issue tracker...
Message-ID: <32603e510801240340k732670ddhd79924c08be670a0@mail.gmail.com>

Good day.

I have had some idle thoughts lately about Nagios. I don't know if they
make sense, or whether they have already been rejected.

1) Does Nagios have a bug-tracker of some kind, so if a bug is submitted, it
can be checked if it exists already, and/or has been fixed? (Bugzilla, or
Trac, or something like that?)

2) Would it make sense to have a (another) nagios-daemon, which clients
would connect to and ask for info, instead of every client having to read
and parse "status.dat" (and other *.dat) files to get some information?
For example:
echo "GET hosts -down" | telnet nagios.host.example.com 5665, or
echo "GET services -flapping -down" | telnet nagios.host.example.com 5665
(a possible issue here would be authentication / authorization)

3) Would it make sense to switch the data files and/or config files from
being flat text files into SQL tables? (For example, by using the
embeddable SQLite?) (If yes, having the possibility of using different SQL
databases would be a bonus. SQLite on a ramdisk, for example, versus
PostgreSQL versus MySQL (versus DB2) (versus Oracle).)

4) Would it make sense to change both services and hosts into "containers"?
As in: "Our network has the following containers: (list of routers,
switches, hosts, hostgroups). Some of those containers have other
containers: (The router for building B is a "master" for building B. It also
connects some switches in building B with some hosts in building B). Some of
those sub-containers have other containers: (the switch in building B is
connecting some hosts - those hosts have some sub-systems running). Some of
those sub-systems consist of several services, like: webserver-1,
webserver-2, database-1, database-2, firewall accesslist, etc...."

Two thumbs up for Nagios :-)

Cheers,
-- 
EinarI

From nagios at nagios.org Thu Jan 24 13:19:45 2008
From: nagios at nagios.org (Ethan Galstad)
Date: Thu, 24 Jan 2008 06:19:45 -0600
Subject: Line continuations not working as expected
In-Reply-To:
References: <8u7in3p57ccufhr2uga4a548h8c7bkjlc2@m27.itconsult.net>
Message-ID: <47988261.1060602@nagios.org>

Matthew Richardson wrote:
> Sorry folks, I thought I had checked this before sending the message, but
> had done it wrong.
>
> The issue is the difference between Unix & Windows line ends in the config
> files.
>
> Using a "\" to end a line terminated with Unix line end, Nagios correctly
> treats the subsequent line as a continuation.
>
> Unfortunately if a Windows line end (CRLF) follows the "\", Nagios does not
> treat the line as a continuation.
>
> Because of the way I use a Windows PC, almost all of my Nagios config files
> have Windows not Unix line ends. Apart from this issue, I have had no
> problems.
>
> Would it be possible to adjust the code so as to allow continuations with
> either type of line end?
>
> Best wishes,
> Matthew
[snip]

Thanks for the note Matthew - a patch will be in CVS momentarily to
handle the CR/LF scenario...
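For illustration, a minimal sketch of what accepting either line ending involves. This is an assumption about the general technique, not the change that went into CVS, and strip_eol_and_check_continuation is a hypothetical name: drop the LF, drop a CR if one is left over, and only then look for the trailing backslash.

```c
#include <string.h>

/* Hypothetical helper (not from the Nagios source): strips a trailing
 * LF or CRLF from buf, then reports whether the remaining text ends in
 * a backslash, i.e. whether the next line continues this one.
 * On a continuation the backslash itself is removed as well. */
static int strip_eol_and_check_continuation(char *buf)
{
	size_t len = strlen(buf);

	/* drop "\n", then an optional "\r" left by a Windows line end */
	if (len > 0 && buf[len - 1] == '\n')
		buf[--len] = '\0';
	if (len > 0 && buf[len - 1] == '\r')
		buf[--len] = '\0';

	if (len > 0 && buf[len - 1] == '\\') {
		buf[--len] = '\0';
		return 1;	/* caller should append the next line */
	}
	return 0;
}
```

Checking for the backslash before stripping the CR is what makes a CRLF-terminated line look like it ends in "\r" rather than "\", which matches the behaviour described in the report.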

Ethan Galstad
Nagios Developer
___
Email: nagios at nagios.org
Web: www.nagios.org

From nagios at nagios.org Thu Jan 24 13:31:12 2008
From: nagios at nagios.org (Ethan Galstad)
Date: Thu, 24 Jan 2008 06:31:12 -0600
Subject: {enable, disable}_notifications and file name expansion!
In-Reply-To: <47828384.2050608@zango.com>
References: <47828384.2050608@zango.com>
Message-ID: <47988510.8060704@nagios.org>

Thomas Guyot-Sionnest wrote:
> Hi,
>
> I just noticed I recently started to have problems using the
> enable/disable notification commands when run from cron. After some
> investigation it turns out the culprit is... file name expansion!
>
> In those scripts the command line is built like this:
>
> cmdline="[$datetime] COMMAND_NAME;$datetime"
>
> and then the following command is run
>
> `$echocmd $cmdline >> $CommandFile`
>
> this results in the following being run (given current timestamp):
>
> `/bin/echo [1199733486] COMMAND_NAME;1199733486 >> $CommandFile`
>
> If you happen to have any of the digits in $datetime as a file name in
> your current folder, it will be expanded by bash. In my case I had a
> file named "1" in /root, and since it's being run by the root crontab it
> turned into this:
>
> `/bin/echo 1 COMMAND_NAME;1199733486 >> $CommandFile`
>
> which obviously won't work.
>
> There are a few ways to fix this problem:
>
> 1. Quoting the $cmdline in the echocmd arguments:
>
> --- enable_notifications 2008-01-07 10:48:17.000000000 -0800
> +++ enable_notifications 2008-01-07 11:42:01.000000000 -0800
> @@ -23,7 +23,7 @@
>  cmdline="[$datetime] ENABLE_NOTIFICATIONS;$datetime"
>
>  # append the command to the end of the command file
> -`$echocmd $cmdline >> $CommandFile`
> +`$echocmd "$cmdline" >> $CommandFile`
>
>
> 2. Backslash-escaping the brackets in $cmdline:
>
> --- enable_notifications 2008-01-07 10:48:17.000000000 -0800
> +++ enable_notifications 2008-01-07 11:44:42.000000000 -0800
> @@ -20,7 +20,7 @@
>  datetime=`date +%s`
>
>  # create the command line to add to the command file
> -cmdline="[$datetime] ENABLE_NOTIFICATIONS;$datetime"
> +cmdline="\[$datetime\] ENABLE_NOTIFICATIONS;$datetime"
>
>  # append the command to the end of the command file
>  `$echocmd $cmdline >> $CommandFile`
>
> 3. Using printf:
>
> --- enable_notifications 2008-01-07 10:48:17.000000000 -0800
> +++ enable_notifications 2008-01-07 11:49:35.000000000 -0800
> @@ -12,18 +12,15 @@
>  # the check_external_commands option in the main
>  # configuration file.
>
> -echocmd="/bin/echo"
> +printfcmd="/bin/printf"
>
>  CommandFile="/usr/local/nagios/var/rw/nagios.cmd"
>
>  # get the current date/time in seconds since UNIX epoch
>  datetime=`date +%s`
>
> -# create the command line to add to the command file
> -cmdline="[$datetime] ENABLE_NOTIFICATIONS;$datetime"
> -
>  # append the command to the end of the command file
> -`$echocmd $cmdline >> $CommandFile`
> +`$printfcmd "[%i] ENABLE_NOTIFICATIONS;%i\n" $datetime $datetime >> $CommandFile`
>
>
> This should be fixed in both the disable_notifications and
> enable_notifications files in contrib/eventhandlers/
>
> Thanks
>

Thanks Thomas - fix will be in CVS shortly.

Ethan Galstad
Nagios Developer
___
Email: nagios at nagios.org
Web: www.nagios.org

From nagios at nagios.org Thu Jan 24 13:42:02 2008
From: nagios at nagios.org (Ethan Galstad)
Date: Thu, 24 Jan 2008 06:42:02 -0600
Subject: Fix garbage in command output
In-Reply-To:
References:
Message-ID: <4798879A.1080807@nagios.org>

Krzysztof Oledzki wrote:
> Hello,
>
> This patch properly terminates the "output" string so both returned and
> logged (syslog) values are correct.
>
> The bug was introduced in nrpe-2.8 (multiline plugin output) -
> previously the child process was able to return a properly terminated
> c-string but now, when fread() is used instead of fgets(), the received
> string (output) is not terminated and my_system() returns and logs some
> kind of garbage.
>
> I also decided to remove two unused variables.
>
> Best regards,
>
> Krzysztof Olędzki
>

Thanks Krzysztof! Patch will be in CVS momentarily...

Ethan Galstad
Nagios Developer
___
Email: nagios at nagios.org
Web: www.nagios.org

From nagios at nagios.org Thu Jan 24 15:14:07 2008
From: nagios at nagios.org (Ethan Galstad)
Date: Thu, 24 Jan 2008 08:14:07 -0600
Subject: strange behaviour when plugin output is containing the string "\n" in Nagios 3.0rc1
In-Reply-To: <47737FB2.1040100@sap.com>
References: <47737FB2.1040100@sap.com>
Message-ID: <47989D2F.10204@nagios.org>

Marcus Hildenbrand wrote:
> Hi,
>
> if a plugin's output contains the string "\n", I find some strange
> output in status_file. For example, if the output of the plugin looks like this:
>
> # some_plugin
> test\ntmon\test\abc
> # echo $?
> 0
> #
>
> the status_file for this service contains lines like this:
>
> servicestatus {
> ...
> ...
> plugin_output=test\
> long_plugin_output=tmon\\\\test\\\\abc\n
> ...
> ...
> }
>
> Also the trailing \ in the plugin_output line sometimes seems to confuse the CGIs
> to the point that they dump core:
>
> # setenv REQUEST_METHOD POST
> # setenv CONTENT_LENGTH 4
> # echo all | ./tac.cgi
> Segmentation fault (core dumped)
>
> Is this a bug in Nagios or is it not allowed that the plugin output contains \n?
>
> Thanks and regards
> Marcus
>

Thanks for the report Marcus! There was a bug in the newline/backslash
escape logic. I'm posting a fix to CVS in a few moments...

Ethan Galstad
Nagios Developer
___
Email: nagios at nagios.org
Web: www.nagios.org

From mark.eisenblaetter at gmail.com Thu Jan 24 15:25:16 2008
From: mark.eisenblaetter at gmail.com (Mark Eisenblaetter)
Date: Thu, 24 Jan 2008 15:25:16 +0100
Subject: Template befor Definition
In-Reply-To: <288245630801160138y437578ddp23f16f6720efe6d1@mail.gmail.com>
References: <288245630712140621u93d6113q643b51e4194cde39@mail.gmail.com>
	<4762A66D.2040107@op5.se>
	<288245630801160000j3f40be4q7fd1e748ebba782d@mail.gmail.com>
	<478DC94B.6040808@op5.se>
	<288245630801160138y437578ddp23f16f6720efe6d1@mail.gmail.com>
Message-ID: <288245630801240625u2a1ec5ctbc4ed7654e68079f@mail.gmail.com>

On Jan 16, 2008 10:38 AM, Mark Eisenblaetter wrote:
> Hi,
>
> On Jan 16, 2008 10:07 AM, Andreas Ericsson wrote:
>
> > Please don't top-post. It makes it hard to follow the discussion,
> > especially when you're top-posting to an answer that wasn't
> > top-posted. Anyways...
> >
> > Mark Eisenblaetter wrote:
> > > Hi,
> > > sorry for the late reply,
> > >
> > > Then I will have too many templates, because I will need a new
> > > template for every combination of args.
> > >
> > > For example, with check_http I will need a template for every
> > > website/service I want to check, or for every disk with different
> > > warning and critical thresholds.
> > >
> > > That's not so practicable.
> > >
> >
> > Originally you wrote:
> > >>> so it would be great if i can say in the template use the check_command
> > >>> (check_dummy) defined in the template and not that defined in the Host.
> > >>>
> >
> > To which I replied:
> > >> That's what not setting anything in the object itself is for. If you don't
> > >> have a check_command in the host object, it will use the one from the
> > >> template.
> > >>
> >
> > In other words, if you want a particular object to inherit the value from
> > the template, simply don't set that value in the object. There was no
> > other question-like statement in your original mail.
> >
>
> Ok, then my first mail was not as clear as I hoped.
>
> I was thinking of that as a new feature, to minimize the template work.
>

Ok, it seems that I am the only one who would like this feature. Or am I
missing a way to handle that situation without having a template for
nearly every check?

Mark

From michael_luebben at web.de Thu Jan 24 15:31:04 2008
From: michael_luebben at web.de (Michael Lübben)
Date: Thu, 24 Jan 2008 15:31:04 +0100
Subject: New color for acknowledge services and hosts?
Message-ID: <1446129506@web.de>

Hi list, hi Ethan,

what do you think about a new color (blue) for acknowledged services and
hosts? I think that would make it easier to see in the interface which
services or hosts are acknowledged.

Bye
Michael

From nagios at nagios.org Thu Jan 24 17:03:11 2008
From: nagios at nagios.org (Ethan Galstad)
Date: Thu, 24 Jan 2008 10:03:11 -0600
Subject: Reproducible bug scheduling second host check following failure
In-Reply-To: <7o1dn3p78bt4ngg5hit25fh730d3nsv5te@m27.itconsult.net>
References: <7o1dn3p78bt4ngg5hit25fh730d3nsv5te@m27.itconsult.net>
Message-ID: <4798B6BF.3070702@nagios.org>

Matthew Richardson wrote:
> I have, for some time, been suspicious that Nagios 3 appeared a bit slow
> alerting on host failures. In order to diagnose this, I set up a minimal
> config to try to show the problem using 3.0rc1 with a default nagios.cfg.
> It consists of one host and one service (both checked by ping), with the
> retries and timers set as below (the config file is attached in full):-
>
> |define host{
> | host_name hostname
> | check_interval 10
> | retry_interval 1
> | max_check_attempts 5
> |}
> |define service{
> | max_check_attempts 5
> | normal_check_interval 5
> | retry_check_interval 5
> | host_name hostname
> | service_description ping
> |}
>
[snip]

Thanks for the debug info Matthew. I believe I've tracked down and fixed
the problem. Patch will be in CVS in a few moments...
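For reference, the timing the report expects can be modelled roughly as below. This is a simplified sketch based on how the documentation describes check_interval and retry_interval, not the scheduling code and not the fix; the struct and function names are hypothetical, and interval_length is assumed to be 60 seconds as in the default nagios.cfg.

```c
#include <time.h>

/* Hypothetical, simplified model of the host timing directives. */
struct host_timing {
	double check_interval;		/* "check_interval", in time units */
	double retry_interval;		/* "retry_interval", in time units */
	int max_check_attempts;		/* "max_check_attempts" */
};

/* Returns when the next active host check would be expected.
 * interval_length is the number of seconds per time unit. */
static time_t expected_next_host_check(const struct host_timing *h,
				       time_t last_check, int host_is_up,
				       int current_attempt,
				       int interval_length)
{
	/* A non-UP host that has not used up its attempts is in a soft
	 * state and is rechecked at the retry interval. */
	int soft_problem = !host_is_up &&
			   current_attempt < h->max_check_attempts;
	double units = soft_problem ? h->retry_interval : h->check_interval;

	return last_check + (time_t)(units * interval_length);
}
```

With the host definition quoted above (check_interval 10, retry_interval 1, max_check_attempts 5), the second check after a first failure would be expected about one minute later rather than ten.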
Ethan Galstad Nagios Developer ___ Email: nagios at nagios.org Web: www.nagios.org ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From throck at duke.edu Thu Jan 24 21:17:33 2008 From: throck at duke.edu (Tom Throckmorton) Date: Thu, 24 Jan 2008 15:17:33 -0500 Subject: flap detection + state retention In-Reply-To: <20080123221614.GA20554@duke.edu> References: <20080123221614.GA20554@duke.edu> Message-ID: <20080124201733.GA23455@duke.edu> *delurks* Hello all, Can someone here please verify that in Nagios 2.x the state of flap_detection should or shouldn't be persistent across a reload / restart? See my post below, and the brief thread on nagios-users: http://thread.gmane.org/gmane.network.nagios.user/52044 Tests on 2.x seem to show that the retained values for global/host/service flap detection aren't being honored on a reload. This seems to contradict the docs, so I'm thinking it's a bug. I haven't yet tried 3.x, so I don't know if it's still an issue there, but I can confirm that this did work in 1.x. Steps to repeat: - find a host/service for which flapping is enabled - disable flap detection for that host/service; wait until that state is reflected in the extended info. Optionally check the status / retention files for appropriate values. - reload / restart - check the state of flap detection - has it reverted to enabled? You can also see this behavior by toggling global flapping and then reloading. Any insight is appreciated, -tt On Jan 23 17:16, Tom Throckmorton wrote: > Nagios 2.10, CentOS 5 x86_64 / i386, NDOUtils 1.4b6 > > Hi all, > > I've noticed something odd wrt flapping, and am wondering if I'm overlooking > something simple, or just misunderstanding the way flapping and state retention > are supposed to work. 
> > First, just to verify, my main nagios config has: > > enable_flap_detection=1 > retain_state_information=1 > state_retention_file=/var/log/nagios/retention.dat > retention_update_interval=<1,5,60 or even 0, doesn't matter> > use_retained_program_state=1 > use_retained_scheduling_info=1 > > In my global host and service templates that all hosts use, I have: > > flap_detection_enabled 1 > retain_status_information 1 > retain_nonstatus_information 1 > > I've also disabled flap detection explicitly for a few hosts, and it all works > as advertised. > > However, for any host/service, if I manually disable/enable flap detection via > an external command / cgi, and then reload or restart, the option reverts to > whatever is in the config for that given host/service. This happens regardless > of whether 1) the option is being set in retention.dat, which it is, 2) the > NDO-fed database thinks flapping for this host/service is disabled, or 3) w/out > the broker_module enabled. Other state options are being preserved (such as > active_checks_enabled) both at retention_update_interval and on a reload. > > So, what I think I'm seeing then is that during a reload, the > flap_detection_enabled option isn't getting read from the retention file at > program start, though I'd expect it to be persistent just as the other options > are. FWIW, same behavior on earlier Nagios 2.x releases, but 1.x seems to do > the right thing. > > Before I dig further, can anyone verify that the state of host or service flap > detection (enabled/disabled) should or shouldn't be persistent across a > reload/restart? I've combed the docs and archives, and am coming up dry. > > Thanks, > > -tt > > > -- > Tom Throckmorton > OIT - CSI > Duke University > > ------------------------------------------------------------------------- > This SF.net email is sponsored by: Microsoft > Defy all challenges. Microsoft(R) Visual Studio 2008. 
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > _______________________________________________ > Nagios-users mailing list > Nagios-users at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/nagios-users > ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. > ::: Messages without supporting info will risk being sent to /dev/null -- Tom Throckmorton OIT - CSI Duke University ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From throck at duke.edu Thu Jan 24 21:30:57 2008 From: throck at duke.edu (Tom Throckmorton) Date: Thu, 24 Jan 2008 15:30:57 -0500 Subject: complete list archive? Message-ID: <20080124203057.GA24316@duke.edu> Silly question... Does anyone know where I can find a complete archive of the list for download, a la pipermail yyyy-mm.txt.gz format or similar? Yes, I know it's available on gmane, sourceforge, etc., but I'm looking for something quicker/more portable/suitable for offline reading. -tt -- Tom Throckmorton OIT - CSI Duke University ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From lists at jssjr.com Thu Jan 24 22:58:24 2008 From: lists at jssjr.com (Scott Sanders) Date: Thu, 24 Jan 2008 16:58:24 -0500 Subject: Tracking commands as they move through nagios Message-ID: I am posting to devel because my initial thread on the users mailing list now concerns developers more than it does the average user. Some background first. I am developing an API into Nagios so that I can safely expose Nagios internals to scripts and applications that don't reside on the same system as the Nagios daemon. 
Using NFS to share the var directory was never really an option in my environment, so I put together a little API over http that uses the same Apache basic auth mechanism as the CGI's to restrict access. A query is made by forming a URL and an XML object is returned containing the query results. For example, a query like https://nagios.domain.com/=/find/host/address/192.168.1.1 would return an XML object identical to the one found in the objects.cache for the host with that address, but in XML instead of a tab-delimited key/value pair inside a curly brackets block. Now that this works, I am working on adding functionality that will let me submit commands to the nagios.cmd pipe through my API. My problem is that if i simply submit a command to the API and it is placed in the FIFO pipe, I have no meaningful information to pass back to the client. This creates a problem, as the client may wish to sit in a loop waiting for the command to be processed before continuing. (For example, scheduling a service downtime before stopping a database to take a backup snapshot.) I need to find some way to determine the success or failure of an external command and then return that to the client within the http timeout window. After looking at commands.c I am starting to believe this isn't possible with the current external commands implementation. I would really like to continue designing my API as an add-on to Nagios and not something that requires patching (or replacing) parts of the base system. Any suggestions on how I would accomplish this? (More background on this at http://thread.gmane.org/gmane.network.nagios.user/51971) -Scott Sanders -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. 
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ -------------- next part -------------- _______________________________________________ Nagios-devel mailing list Nagios-devel at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-devel From matthew-ln at itconsult.co.uk Thu Jan 24 23:15:37 2008 From: matthew-ln at itconsult.co.uk (Matthew Richardson) Date: Thu, 24 Jan 2008 22:15:37 +0000 Subject: flap detection + state retention In-Reply-To: <20080124201733.GA23455@duke.edu> References: <20080123221614.GA20554@duke.edu> <20080124201733.GA23455@duke.edu> Message-ID: >From: Tom Throckmorton >To: nagios-devel at lists.sourceforge.net >Date: Thu, 24 Jan 2008 15:17:33 -0500 >Subject: [Nagios-devel] flap detection + state retention >Tests on 2.x seem to show that the retained values for global/host/service flap >detection aren't being honored on a reload. This seems to contradict the docs, >so I'm thinking it's a bug. > >I haven't yet tried 3.x, so I don't know if it's still an issue there, but I >can confirm that this did work in 1.x. >From some unrelated tests this evening, service flap detection status is being correctly retained across restarts in 3.0rc1. Best wishes, Matthew ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. 
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From ae at op5.se Fri Jan 25 00:06:36 2008 From: ae at op5.se (Andreas Ericsson) Date: Fri, 25 Jan 2008 00:06:36 +0100 Subject: Template befor Definition In-Reply-To: <288245630801240625u2a1ec5ctbc4ed7654e68079f@mail.gmail.com> References: <288245630712140621u93d6113q643b51e4194cde39@mail.gmail.com> <4762A66D.2040107@op5.se> <288245630801160000j3f40be4q7fd1e748ebba782d@mail.gmail.com> <478DC94B.6040808@op5.se> <288245630801160138y437578ddp23f16f6720efe6d1@mail.gmail.com> <288245630801240625u2a1ec5ctbc4ed7654e68079f@mail.gmail.com> Message-ID: <479919FC.9020704@op5.se> Mark Eisenblaetter wrote: > On Jan 16, 2008 10:38 AM, Mark Eisenblaetter > wrote: > >> Hi, >> >> On Jan 16, 2008 10:07 AM, Andreas Ericsson wrote: >> >>> Please don't top-post. It makes it hard to follow the discussion, >>> especially when you're top-posting to an answer that wasn't top- >>> posted. Anyways... >>> >>> Mark Eisenblaetter wrote: >>>> Hi, >>>> sorry for the latye replay, >>>> >>>> Then I will have to much templates, because i will need for every >>>> kombination of args a new template. >>>> >>>> For example by check_http I will need a template vor every >>> website/service i >>>> want to check or for every disk with diffrend warning and Critival >>>> threshosld. >>>> >>>> Thats not so practicable. >>>> >>> Originally you wrote: >>>>>> so it would be great if i can say in the template use the >>> check_command >>>>>> (check_dummy) defined in the template and not that defined in the >>> Host. >>> To which I replied: >>>>> That's what not setting anything in the object itself is for. If you >>> don't >>>>> have a check_command in the host object, it will use the one from the >>>>> template. >>>>> >>> In other words, if you want a particular object to inherit the value >>> from >>> the template, simply don't set that value in the object. There was no >>> other question-like statement in your original mail. 
>>> >> Ok, then my first mail was not so clear I hoped. >> >> I was thinking of that as a new feature, to minimize the templatework. >> >> > Ok it seems that i am the only one that would like this feature. > > Or do i miss one way to handle that situation without having a template for > nearly every check? > You still have not clarified your original email. Only discovered that it's not clear enough. If you can manage that, I'm sure there's already a solution in Nagios for you. -- Andreas Ericsson andreas.ericsson at op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From ae at op5.se Fri Jan 25 00:19:16 2008 From: ae at op5.se (Andreas Ericsson) Date: Fri, 25 Jan 2008 00:19:16 +0100 Subject: thoughts about a SQL backend, and an issue tracker... In-Reply-To: <32603e510801240340k732670ddhd79924c08be670a0@mail.gmail.com> References: <32603e510801240340k732670ddhd79924c08be670a0@mail.gmail.com> Message-ID: <47991CF4.3040509@op5.se> Einar Indridason wrote: > Good day. I have had some idle thoughts lately about Nagios. I don't know > if they make sense, or whether they are have already been rejected. > > 1) Does Nagios have a bug-tracker of some kind, so if a bug is submitted, it > can be checked if it exists already, and/or has been fixed? (Bugzilla, or > Trac, or something like that?) > No, it doesn't. It has this mailing list as its primary "gather-up-patches" place, where they are reviewed and suchlike by some few folks who have been doing that sort of thing for a while. When Ethan has some spare moments, he browses the list and usually picks the final version of all patches and then bulk-applies them to nagios' cvs repo. 
> 2) Would it make sense to have a (another) nagios-daemon, which clients > would connect to, and ask for info, instead of having to read and parse " > status.dat" (and other *.dat) files for every client that wants to get some > infos. For example: echo "GET hosts -down" | telnet > nagios.host.example.com 5665, or echo "GET services -flapping -down" | > telnet nagios.host.example.com 5665 > (possible issue here would be authentication / authorization) > No, that wouldn't make sense. One would have to a) Invent a syntax for the queries to use. b) Handle indexing of important data. c) Take care of caching common queries. d) Open up the can of worms that is secure network-authentication. Several applications do that already, and have been doing it long enough that they can be considered true and tested. They're called databases. Nagios has support for writing its status data to databases, which is a much nicer solution. Should the format of that database not fit you, you are free to modify the existing code or write completely new one, both of which will be a lot simpler than re-inventing a database engine with a query language to match it. > 3) Would it make sense to switch the data files and/or config files from > being flat text files, into SQL tables? (For example, by using the > embeddable SQLite?) (If yes, having the possibility of using different SQL > databases would be a bonus. SQLite on a ramdisk, for example, versus > PostGreSQL versus MySQL (versus DB2) (versus Oracle) > Yes. Nagios 2 (and onwards) already has a ready-written module for this, called ndoutils. Google should provide ample amounts of information about that particular project. > 4) Would it make sense to change both services and hosts into "containers"? > As in: "Our network has those following containers: (list of routers, > switches, hosts, hostgroups) Some of those containers, have other > containers: (The router for building B is a "master" for building B. 
It > also connects some switches in building B with some hosts in building B). > Some of those sub-containers have other containers: (the switch in building > B is connecting some hosts - those hosts have some sub-systems running). > Some of those sub-systems consists of several services, like: webserver-1, > webserver-2, database-1, database-2, firewall accesslist, etc.... > No, not really, because one container could reside inside another container, which in turn resides inside the first one (fully meshed networks). It's better, and programmatically a lot easier, to consider each host a separate object rather than inventing meta-objects in which to put them. -- Andreas Ericsson andreas.ericsson at op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From ae at op5.se Fri Jan 25 00:20:00 2008 From: ae at op5.se (Andreas Ericsson) Date: Fri, 25 Jan 2008 00:20:00 +0100 Subject: New color for acknowledge services and hosts? In-Reply-To: <1446129506@web.de> References: <1446129506@web.de> Message-ID: <47991D20.3010706@op5.se> Michael L?bben wrote: > Hi list, hi Ethan, > > what think you about a new color (blue) for acknowledge services and hosts? I think that makes it easier to see which service or host ist acknowlegde in the interface! > > What do you think about that? > I think opening one of the CSS-files in a text-editor would let you choose whatever color you want. -- Andreas Ericsson andreas.ericsson at op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. 
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From throck at duke.edu Fri Jan 25 06:54:06 2008 From: throck at duke.edu (Tom Throckmorton) Date: Fri, 25 Jan 2008 00:54:06 -0500 Subject: flap detection + state retention In-Reply-To: References: <20080123221614.GA20554@duke.edu> <20080124201733.GA23455@duke.edu> Message-ID: <20080125055406.GA31325@duke.edu> On Jan 24 22:15, Matthew Richardson wrote: > >From: Tom Throckmorton > >To: nagios-devel at lists.sourceforge.net > >Date: Thu, 24 Jan 2008 15:17:33 -0500 > >Subject: [Nagios-devel] flap detection + state retention > > >Tests on 2.x seem to show that the retained values for global/host/service flap > >detection aren't being honored on a reload. This seems to contradict the docs, > >so I'm thinking it's a bug. > > > >I haven't yet tried 3.x, so I don't know if it's still an issue there, but I > >can confirm that this did work in 1.x. > > >From some unrelated tests this evening, service flap detection status is > being correctly retained across restarts in 3.0rc1. Thanks for checking. I've got a working patch for 2.x that I'll submit shortly after some cleanup. I just peeked at the 3.x code, and can see that it has the pieces that were missing from 2.x. -tt -- Tom Throckmorton OIT - CSI Duke University ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. 
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From mark.eisenblaetter at gmail.com Fri Jan 25 10:02:56 2008 From: mark.eisenblaetter at gmail.com (Mark Eisenblaetter) Date: Fri, 25 Jan 2008 10:02:56 +0100 Subject: Template befor Definition In-Reply-To: <479919FC.9020704@op5.se> References: <288245630712140621u93d6113q643b51e4194cde39@mail.gmail.com> <4762A66D.2040107@op5.se> <288245630801160000j3f40be4q7fd1e748ebba782d@mail.gmail.com> <478DC94B.6040808@op5.se> <288245630801160138y437578ddp23f16f6720efe6d1@mail.gmail.com> <288245630801240625u2a1ec5ctbc4ed7654e68079f@mail.gmail.com> <479919FC.9020704@op5.se> Message-ID: <288245630801250102v4009829coe1711e71c042dcef@mail.gmail.com> On Jan 25, 2008 12:06 AM, Andreas Ericsson wrote: > Mark Eisenblaetter wrote: > > On Jan 16, 2008 10:38 AM, Mark Eisenblaetter < > mark.eisenblaetter at gmail.com> > > wrote: > > > >> Hi, > >> > >> On Jan 16, 2008 10:07 AM, Andreas Ericsson wrote: > >> > >>> Please don't top-post. It makes it hard to follow the discussion, > >>> especially when you're top-posting to an answer that wasn't top- > >>> posted. Anyways... > >>> > >>> Mark Eisenblaetter wrote: > >>>> Hi, > >>>> sorry for the latye replay, > >>>> > >>>> Then I will have to much templates, because i will need for every > >>>> kombination of args a new template. > >>>> > >>>> For example by check_http I will need a template vor every > >>> website/service i > >>>> want to check or for every disk with diffrend warning and Critival > >>>> threshosld. > >>>> > >>>> Thats not so practicable. > >>>> > >>> Originally you wrote: > >>>>>> so it would be great if i can say in the template use the > >>> check_command > >>>>>> (check_dummy) defined in the template and not that defined in the > >>> Host. > >>> To which I replied: > >>>>> That's what not setting anything in the object itself is for. If you > >>> don't > >>>>> have a check_command in the host object, it will use the one from > the > >>>>> template. 
> >>>>> > >>> In other words, if you want a particular object to inherit the value > >>> from > >>> the template, simply don't set that value in the object. There was no > >>> other question-like statement in your original mail. > >>> > >> Ok, then my first mail was not so clear I hoped. > >> > >> I was thinking of that as a new feature, to minimize the templatework. > >> > >> > > Ok it seems that i am the only one that would like this feature. > > > > Or do i miss one way to handle that situation without having a template > for > > nearly every check? > > > > You still have not clarified your original email. Only discovered that > it's > not clear enough. If you can manage that, I'm sure there's already a > solution > in Nagios for you. > > -- > Andreas Ericsson andreas.ericsson at op5.se > OP5 AB www.op5.se > Tel: +46 8-230225 Fax: +46 8-230231 > > ------------------------------------------------------------------------- > This SF.net email is sponsored by: Microsoft > Defy all challenges. Microsoft(R) Visual Studio 2008. > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > _______________________________________________ > Nagios-devel mailing list > Nagios-devel at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/nagios-devel > Hi Andreas, i know that if i don't define an command in the f.e. service definition that nagios will use the on in the template. I started to generate a service definition und two templates one for the master and one for the slave. 
Here is an example:

define service{
        use                     freschness-20min,GL-Tcoder-Active-Service,pnp
        host_name               gl-tcoder-sw
        servicegroups           interface
        service_description     Port1
        check_command           check_mb_snmp_int!1
        }

Master template. The pnp and freschness templates are only there to set the freshness checking and extinfo on the master server; on the slave they are empty.

define service{
        register                0
        name                    GL-Tcoder-Active-Service
        active_checks_enabled   0
        check_period            24x7
        max_check_attempts      3
        normal_check_interval   5
        retry_check_interval    1
        passive_checks_enabled  1
        obsess_over_service     0
        check_freshness         0
        event_handler_enabled   0
        notifications_enabled   1
        notification_interval   120
        notification_period     24x7
        notification_options    w,u,c,r,f
        contact_groups          rss,streaming
        }

Slave template:

define service{
        register                0
        name                    GL-Tcoder-Active-Service
        active_checks_enabled   1
        check_period            24x7
        max_check_attempts      3
        normal_check_interval   5
        retry_check_interval    1
        passive_checks_enabled  0
        obsess_over_service     1
        check_freshness         0
        event_handler_enabled   0
        notifications_enabled   0
        notification_interval   120
        notification_period     24x7
        notification_options    w,u,c,r,f
        contact_groups          rss,streaming,bereitschaft
        }

As you can see, I only change the definition from active on the slave to passive on the master. The problem is on the master: when it performs a freshness check, it will try to run the interface check. If I want to use check_dummy on the master server, I have to move the check_command into the template definition, so that I can define the SNMP check in the slave template and check_dummy in the master template.

If I do it that way, I have to create one template for every interface I want to check. I think in my case it would be nearly 150 templates for different plugins and arguments.

My intention is now to write it that way, but in the master template I will define a check_command with check_dummy so that I don't need the additional templates. In addition, I want to try to keep the definitions that are unique to the slaves to a minimum.
In my setup that is only the templates and the host and service groups; the rest I can use one-to-one from the master, which makes it easier to sync the config.

In the meantime I will try to go through the source code and see whether I'm able to make a patch for this behavior, but my C is a lot worse than my English.

I hope this helps, because I don't know a way to describe it better.

Mark

From pozda at mfservis.cz Fri Jan 25 14:51:13 2008
From: pozda at mfservis.cz (Martin)
Date: Fri, 25 Jan 2008 14:51:13 +0100
Subject: SERVICEOUTPUT macro
Message-ID: <4799E951.9080006@mfservis.cz>

Hi

It seems that the SERVICEOUTPUT macro is not working in Nagios 3 RC1.

I defined a simple command for saving the notification (serviceoutput) to a file. This definition does not write to the file.

define command{
        command_name    notify-service-by-email
        command_line    /usr/bin/printf "$HOSTALIAS$ $SERVICEOUTPUT" > /tmp/a.txt
        }

The command definition without $SERVICEOUTPUT$ works correctly.

define command{
        command_name    notify-service-by-email
        command_line    /usr/bin/printf "$HOSTALIAS$ " > /tmp/a.txt
        }

Martin Pozdilek

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From J.F.Wheeler at rl.ac.uk Fri Jan 25 14:54:50 2008 From: J.F.Wheeler at rl.ac.uk (Wheeler, JF (Jonathan)) Date: Fri, 25 Jan 2008 13:54:50 -0000 Subject: SERVICEOUTPUT macro In-Reply-To: <4799E951.9080006@mfservis.cz> References: <4799E951.9080006@mfservis.cz> Message-ID: You might mean $SERVICEOUTPUT$ in your command definition Jonathan Wheeler e-Science Centre Rutherford Appleton Laboratory > -----Original Message----- > From: nagios-devel-bounces at lists.sourceforge.net > [mailto:nagios-devel-bounces at lists.sourceforge.net] On Behalf > Of Martin > Sent: 25 January 2008 13:51 > To: Nagios Developers List > Subject: [Nagios-devel] SERVICEOUTPUT macro > > > Hi > > It seems that the SERVICEOUTPUT macro is not working in Nagios 3 RC1. > > I defined simple command for saving notification (serviceoutput) to > file. This definition does not write to file. > > define command{ > command_name notify-service-by-email > command_line /usr/bin/printf "$HOSTALIAS$ > $SERVICEOUTPUT" > > /tmp/a.txt > } > > > Command definition without $SERVICEOUTPUT$ works correctly. > > define command{ > command_name notify-service-by-email > command_line /usr/bin/printf "$HOSTALIAS$ " > /tmp/a.txt > } > > > Martin Pozdilek > > -------------------------------------------------------------- > ----------- > This SF.net email is sponsored by: Microsoft > Defy all challenges. Microsoft(R) Visual Studio 2008. > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > _______________________________________________ > Nagios-devel mailing list > Nagios-devel at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/nagios-devel > ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. 
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/

From pozda at mfservis.cz Fri Jan 25 15:03:52 2008
From: pozda at mfservis.cz (Martin)
Date: Fri, 25 Jan 2008 15:03:52 +0100
Subject: SERVICEOUTPUT macro
In-Reply-To: 
References: 
Message-ID: <4799EC48.6000700@mfservis.cz>

Yes. My fault. Of course I use $SERVICEOUTPUT$ in my config files.

I'm playing with notifications. Notify-service-by-email was working before I upgraded to RC1.

Martin

Wheeler, JF (Jonathan) wrote:
> You might mean $SERVICEOUTPUT$ in your command definition
>
> Jonathan Wheeler
> e-Science Centre
> Rutherford Appleton Laboratory
>
>
>> -----Original Message-----
>> From: nagios-devel-bounces at lists.sourceforge.net
>> [mailto:nagios-devel-bounces at lists.sourceforge.net] On Behalf
>> Of Martin
>> Sent: 25 January 2008 13:51
>> To: Nagios Developers List
>> Subject: [Nagios-devel] SERVICEOUTPUT macro
>>
>>
>> Hi
>>
>> It seems that the SERVICEOUTPUT macro is not working in Nagios 3 RC1.
>>
>> I defined simple command for saving notification (serviceoutput) to
>> file. This definition does not write to file.
>>
>> define command{
>> command_name notify-service-by-email
>> command_line /usr/bin/printf "$HOSTALIAS$
>> $SERVICEOUTPUT" >
>> /tmp/a.txt
>> }
>>
>>
>> Command definition without $SERVICEOUTPUT$ works correctly.
>>
>> define command{
>> command_name notify-service-by-email
>> command_line /usr/bin/printf "$HOSTALIAS$ " > /tmp/a.txt
>> }
>>
>>
>> Martin Pozdilek
>>
>> --------------------------------------------------------------
>> -----------
>> This SF.net email is sponsored by: Microsoft
>> Defy all challenges. Microsoft(R) Visual Studio 2008.
>> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/

From step at tdc.dk Fri Jan 25 15:16:55 2008
From: step at tdc.dk (Steffen Poulsen)
Date: Fri, 25 Jan 2008 15:16:55 +0100
Subject: Error: Function nebmodule_init() in module'/usr/local/nagios/bin/ndomod.o' returned an error.
In-Reply-To: 
References: 
Message-ID: 

Hi,

Thanks to Denis, who wrote us this excellent little tip, we now have NDO working (yay!):

------

I saw your post on NDOUtils,
http://archive.netbsd.se/?ml=nagios-devel&a=2007-11&m=5574880

And I just had a similar problem. To debug I added checkpoint to nebmodule_init() i.e.

        /* process arguments */
        if(ndomod_process_module_args(args)==NDO_ERROR) {
                ndomod_write_to_logs("arguments error\x0",NSLOG_INFO_MESSAGE);
                return -1;
                }

and it was an args error, which turned out to be the fact that in nagios.cfg I had two spaces before 'config_file' (I suspect because I copy-pasted the line from a how-to):

broker_module=/gw/nagios/bin/ndomod.o config_file=/gw/nagios/etc/ndomod.cfg

When I deleted one, it started working fine.
-------

So, beware of the two-spaces-cut-and-paste error out there :-)

Best regards,
Steffen

> -----Original Message-----
> From: nagios-devel-bounces at lists.sourceforge.net
> [mailto:nagios-devel-bounces at lists.sourceforge.net] On Behalf Of
> Steffen Poulsen
> Sent: 1 November 2007 13:55
> To: Nagios Developers List
> Subject: [Nagios-devel] Error: Function nebmodule_init() in
> module'/usr/local/nagios/bin/ndomod.o' returned an error.
>
> Hi,
>
> We have just tried upgrading to 3.0b6, and ndo 1.4b7.
>
> While nagios itself definitely seems more stable in this
> version (thanks!), ndo is not able to load properly:
>
> [1193912686] Caught SIGHUP, restarting...
> [1193912688] Nagios 3.0b6 starting... (PID=1125)
> [1193912688] LOG VERSION: 2.0
> [1193912688] ndomod: NDOMOD 1.4b7 (10-31-2007) Copyright (c) 2005-2007 Ethan Galstad (nagios at nagios.org)
> [1193912688] Error: Function nebmodule_init() in module '/usr/local/nagios/bin/ndomod.o' returned an error. Module will be unloaded.
> [1193912688] Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
>
> Unfortunately it is not explicit what the error is, and we
> are unsure how to debug.
>
> Any ideas?
>
> Best regards,
> Steffen

From step at tdc.dk Fri Jan 25 15:34:17 2008
From: step at tdc.dk (Steffen Poulsen)
Date: Fri, 25 Jan 2008 15:34:17 +0100
Subject: What GUI has the fastest reponse times for Nagios tactical overview?
Message-ID: 

Hi,

With a just-now working NDO installation, we are now curious whether we have options to brighten the user experience when browsing the Nagios CGIs.

We are currently planning to do a three-level maps-in-maps implementation in NagVis, to mirror the grouping of hostgroups in Nagios.
So, the first-level map would be a map of host groups, the next-level map is the map of hosts in each host group, and the last-level map is the map of services on each host. Similar for service groups.

But we were wondering: is there any software package that already does this kind of pre-rendered / cached view processing?

Preferably, we would like to bring response times down from 10-30 seconds to 1-2 seconds per view.

We are aware that the NagVis implementation doesn't solve the problem that as soon as you want to touch the configuration (perhaps set scheduled maintenance, disable notifications or the like), you are back to the slow response times, but we hope it can help in providing the "tactical overview" faster.

Nagios is replacing and extending a Big Brother installation here which used pre-rendered HTML for its pages, so that is pretty hard to beat. But how close can we get / what do you guys do?

We are monitoring 8000+ services / 1000+ hosts in a distributed setup; our installation is running Solaris.

Best regards,
Steffen Poulsen

--
Kind regards
Steffen Poulsen
System Manager
TDC A/S
FNUHI
Sletvej 30, A-094
DK-8310 Tranbjerg J
Denmark
+45 66 67 61 66
+45 29 25 66 68 (Mobile)
E-mail: step at tdc.dk
www: tdc.dk

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From throck at duke.edu Fri Jan 25 16:45:20 2008 From: throck at duke.edu (Tom Throckmorton) Date: Fri, 25 Jan 2008 10:45:20 -0500 Subject: [patch] Re: flap detection + state retention In-Reply-To: <20080124201733.GA23455@duke.edu> References: <20080123221614.GA20554@duke.edu> <20080124201733.GA23455@duke.edu> Message-ID: <20080125154520.GA16863@duke.edu> On Jan 24 15:17, Tom Throckmorton wrote: > *delurks* > > Hello all, > > Can someone here please verify that in Nagios 2.x the state of flap_detection > should or shouldn't be persistent across a reload / restart? > > See my post below, and the brief thread on nagios-users: > > http://thread.gmane.org/gmane.network.nagios.user/52044 > > Tests on 2.x seem to show that the retained values for global/host/service flap > detection aren't being honored on a reload. This seems to contradict the docs, > so I'm thinking it's a bug. So, it looks as if the calls for modified_attributes were completely missing from the flapping routines, and appear to have been so since the inception of 2.x. As i'm reading the code, these calls flag the attributes as modified, which means they'll be considered later in the state retention code. Without the calls, the state retention code ignores these attributes, which leads to the problem. Please consider the attached patch against base/flapping.c - this is against 2.10, which happens to be the same as CVS. I'd appreciate it if someone with a better handle on the code (and better programming skills) would have a look to make sure I'm not doing something dumb. I've tested this on 2.10 running on CentOS 5, and it appears to do the right thing regarding retaining state across reloads. 
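The mechanism described above can be sketched roughly as follows. This is only an illustration of the reading given in this message, not the actual Nagios code: the struct, the bit value and the sketch_* function names are made up for the example, and only the name MODATTR_FLAP_DETECTION_ENABLED is taken from the patch. The point is that a runtime change which does not set the flag is lost on reload, because the retention step falls back to the config-file value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit value; the real MODATTR_* constants live in the Nagios headers. */
#define MODATTR_FLAP_DETECTION_ENABLED 16UL

/* Minimal stand-in for a host; the real struct has many more fields. */
typedef struct {
    int flap_detection_enabled;         /* current runtime value */
    unsigned long modified_attributes;  /* bits for attributes changed at runtime */
} sketch_host;

/* Runtime change that also flags the attribute, as the patch does. */
void sketch_disable_flap_detection(sketch_host *hst)
{
    assert(hst != NULL);
    hst->modified_attributes |= MODATTR_FLAP_DETECTION_ENABLED;
    hst->flap_detection_enabled = 0;
}

/* Runtime change without the flag, mimicking the unpatched behaviour. */
void sketch_disable_flap_detection_unflagged(sketch_host *hst)
{
    assert(hst != NULL);
    hst->flap_detection_enabled = 0;
}

/* On reload: keep the retained value only if the attribute was flagged
 * as modified, otherwise fall back to the config-file value. */
int sketch_value_after_reload(const sketch_host *before, int config_value)
{
    assert(before != NULL);
    if (before->modified_attributes & MODATTR_FLAP_DETECTION_ENABLED)
        return before->flap_detection_enabled;
    return config_value;
}
```

With the flag set, a host disabled at runtime stays disabled after the simulated reload; without it, the config-file value wins, which matches the symptom reported in this thread.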
Cheers, -tt -- Tom Throckmorton OIT - CSI Duke University -------------- next part --------------
--- base/flapping.c.dist	2006-05-19 10:25:03.000000000 -0400
+++ base/flapping.c	2008-01-25 09:15:41.000000000 -0500
@@ -41,6 +41,8 @@
 extern double low_host_flap_threshold;
 extern double high_host_flap_threshold;
 
+extern unsigned long modified_host_process_attributes;
+extern unsigned long modified_service_process_attributes;
 
 /******************************************************************/
 /******************** FLAP DETECTION FUNCTIONS ********************/
@@ -457,6 +459,10 @@
 	printf("enable_flap_detection() start\n");
 #endif
 
+	/* set the attribute modified flag */
+	modified_host_process_attributes|=MODATTR_FLAP_DETECTION_ENABLED;
+	modified_service_process_attributes|=MODATTR_FLAP_DETECTION_ENABLED;
+
 	/* set flap detection flag */
 	enable_flap_detection=TRUE;
 
@@ -479,6 +485,11 @@
 	printf("disable_flap_detection() start\n");
 #endif
 
+	/* set the attribute modified flag */
+	modified_host_process_attributes|=MODATTR_FLAP_DETECTION_ENABLED;
+	modified_service_process_attributes|=MODATTR_FLAP_DETECTION_ENABLED;
+
+
 	/* set flap detection flag */
 	enable_flap_detection=FALSE;
 
@@ -502,6 +513,9 @@
 	printf("enable_host_flap_detection() start\n");
 #endif
 
+	/* set the attribute modified flag */
+	hst->modified_attributes|=MODATTR_FLAP_DETECTION_ENABLED;
+
 	/* nothing to do... */
 	if(hst->flap_detection_enabled==TRUE)
 		return;
@@ -536,6 +550,9 @@
 	printf("disable_host_flap_detection() start\n");
 #endif
 
+	/* set the attribute modified flag */
+	hst->modified_attributes|=MODATTR_FLAP_DETECTION_ENABLED;
+
 	/* nothing to do... */
 	if(hst->flap_detection_enabled==FALSE)
 		return;
@@ -586,6 +603,9 @@
 	printf("enable_service_flap_detection() start\n");
 #endif
 
+	/* set the attribute modified flag */
+	svc->modified_attributes|=MODATTR_FLAP_DETECTION_ENABLED;
+
 	/* nothing to do... */
 	if(svc->flap_detection_enabled==TRUE)
 		return;
@@ -620,6 +640,9 @@
 	printf("disable_service_flap_detection() start\n");
 #endif
 
+	/* set the attribute modified flag */
+	svc->modified_attributes|=MODATTR_FLAP_DETECTION_ENABLED;
+
 	/* nothing to do... */
 	if(svc->flap_detection_enabled==FALSE)
 		return;
@@ -660,9 +683,3 @@
 
 	return;
 	}
-
-
-
-
-
-
-------------- next part -------------- _______________________________________________ Nagios-devel mailing list Nagios-devel at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-devel From pitchfork at ederdrom.de Fri Jan 25 18:02:17 2008 From: pitchfork at ederdrom.de (Joerg Linge) Date: Fri, 25 Jan 2008 18:02:17 +0100 Subject: [Nagios-users] What GUI has the fastest reponse times for Nagios tactical overview? In-Reply-To: References: Message-ID: <479A1619.2040707@ederdrom.de> Steffen Poulsen wrote: > Hi, > > With a just-now working NDO installation we are now curious if we have > options to brighten the user experience in browsing the Nagios cgis. > > We are currently planning to do a three-level maps-in-maps > implementation in NagVis, to mirror the grouping of hostgroups in > Nagios. So, first level map would be a map of host groups, next level > map is the map of hosts in the host groups, last level map is the map of > services at the host. Similar for service groups. [snip] Hi Steffen, please ask Lars Michelsen (NagVis main developer) about the new automap functions designed for NagVis 1.3. I think most of the work is done. Kind regards, Jörg ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From Thomas at zango.com Fri Jan 25 19:03:52 2008 From: Thomas at zango.com (Thomas Guyot-Sionnest) Date: Fri, 25 Jan 2008 10:03:52 -0800 Subject: SERVICEOUTPUT macro In-Reply-To: <4799E951.9080006@mfservis.cz> References: <4799E951.9080006@mfservis.cz> Message-ID: <804160344192334BB21922E8082EA6C0010FF492@seaex01.180solutions.com> > -----Original Message----- > From: nagios-devel-bounces at lists.sourceforge.net [mailto:nagios-devel- > bounces at lists.sourceforge.net] On Behalf Of Martin > Sent: January 25, 2008 8:51 > To: Nagios Developers List > Subject: [Nagios-devel] SERVICEOUTPUT macro > > Hi > > It seems that the SERVICEOUTPUT macro is not working in Nagios 3 RC1. > > I defined simple command for saving notification (serviceoutput) to > file. This definition does not write to file. > > define command{ > command_name notify-service-by-email > command_line /usr/bin/printf "$HOSTALIAS$ $SERVICEOUTPUT" > > /tmp/a.txt > } Missing $ in SERVICEOUTPUT Thomas ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. 
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From Thomas at zango.com Fri Jan 25 19:08:04 2008 From: Thomas at zango.com (Thomas Guyot-Sionnest) Date: Fri, 25 Jan 2008 10:08:04 -0800 Subject: SERVICEOUTPUT macro In-Reply-To: <804160344192334BB21922E8082EA6C0010FF492@seaex01.180solutions.com> References: <4799E951.9080006@mfservis.cz> <804160344192334BB21922E8082EA6C0010FF492@seaex01.180solutions.com> Message-ID: <804160344192334BB21922E8082EA6C0010FF497@seaex01.180solutions.com> > -----Original Message----- > From: nagios-devel-bounces at lists.sourceforge.net [mailto:nagios-devel- > bounces at lists.sourceforge.net] On Behalf Of Thomas Guyot-Sionnest > Sent: January 25, 2008 13:04 > To: Nagios Developers List > Subject: Re: [Nagios-devel] SERVICEOUTPUT macro > > Missing $ in SERVICEOUTPUT Oops, nevermind, I was missing all newer messages (it looked like the last post on the list). Sorry. Thomas ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From step at tdc.dk Sat Jan 26 14:11:19 2008 From: step at tdc.dk (Steffen Poulsen) Date: Sat, 26 Jan 2008 14:11:19 +0100 Subject: Memory leak in NDOUtils 1.4b7? Message-ID: Hi, After yesterdays luck installing NDO, unfortunately we experienced a problem with our nagios process tonight - apparently related to a memory leak in NDO. We have put up a memory usage graph at: http://www.forskernet.dk/nagios/ndo_memory_leak.png As can be seen in the graph, after NDO installation memory usage for the nagios process starts growing (from ~30mb to ~120mb / straight line). At this point a restart of the nagios process is triggered, but afterwards processing just halts. 
We tried to wake it up, restarting it a few times - but the flow of checks had stopped and we had no luck restarting it (external commands appeared to be processed OK, but no passive checks showed up in performance info). After removing NDO, things are back to normal. A regular performance view for the process is like this (40k+ passive checks/5 min): Check Statistics: Type Last 1 Min Last 5 Min Last 15 Min Active Scheduled Host Checks 0 0 0 Active On-Demand Host Checks 0 0 0 Parallel Host Checks 0 0 0 Serial Host Checks 0 0 0 Cached Host Checks 0 0 0 Passive Host Checks 0 0 0 Active Scheduled Service Checks 0 1 3 Active On-Demand Service Checks 0 0 0 Cached Service Checks 0 0 0 Passive Service Checks 8747 43064 129047 External Commands 8984 43301 129287 Buffer Usage: Type In Use Max Used Total Available External Commands 0 2476 81920 We are running Solaris SPARC, Nagios v3.0rc1 and NDOUtils v1.4b7. I haven't seen other reports mentioning this, but if this is a known error, excuses in advance. Actually, also as I'm writing this now I am getting aware that NDOUtils is mentioned as compatible with 3.0b6 only (at the Nagios download page) - I hope it should be OK to run NDOUtils with rc1(?). Best regards, Steffen Poulsen -- Venlig hilsen Steffen Poulsen Systemmanager TDC A/S FNUHI Sletvej 30, A-094 DK-8310 Tranbjerg J Denmark +45 66 67 61 66 +45 29 25 66 68 (Mobil) E-mail: step at tdc.dk www: tdc.dk ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From agrassic at gmail.com Sat Jan 26 23:57:56 2008 From: agrassic at gmail.com (Antonio Grassi) Date: Sat, 26 Jan 2008 20:57:56 -0200 Subject: when is a notification command run? Message-ID: <479BBAF4.3000606@gmail.com> Hi list. 
Browsing through the code of Nagios 3.0rc1, I see that in base/notifications.c, in the function notify_contact_of_service, the service notification command associated with a contact is executed only if notification logging is enabled. Does this mean that notifications are sent only if the system is configured to log notifications, or am I misunderstanding the code? Thanks, Antonio From andurin at process-zero.de Sun Jan 27 10:17:35 2008 From: andurin at process-zero.de (Hendrik Bäcker) Date: Sun, 27 Jan 2008 10:17:35 +0100 Subject: [PATCH] - Typo in freshness documentation Message-ID: <479C4C2F.7060901@process-zero.de> Hi Ethan, hi list, Wolfgang from the German Nagios-Portal noticed a missing leading "/" in the just updated freshness doc on line 122. Patch attached. Regards Hendrik -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: nagios-freshness-html-typo.patch URL: -------------- next part -------------- From andurin at process-zero.de Sun Jan 27 14:52:45 2008 From: andurin at process-zero.de (Hendrik Bäcker) Date: Sun, 27 Jan 2008 14:52:45 +0100 Subject: when is a notification command run?
In-Reply-To: <479BBAF4.3000606@gmail.com> References: <479BBAF4.3000606@gmail.com> Message-ID: <479C8CAD.7080804@process-zero.de> Hi Antonio, Antonio Grassi schrieb: > > Does this mean that notifications are sent only if the system is > configured to log notifications, or am I missunderstanding the code? > I assume you were right. Regarding the sources the effectiv "my_system" call to execute the command is only available if "log_notifications" is true. I guess just moving the end of the if clause a few lines earlier should do the trick. I've attached a unified diff against the latest CVS code. Regards, Hendrik -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: notification.c.patch URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/x-pkcs7-signature Size: 2185 bytes Desc: S/MIME Cryptographic Signature URL: -------------- next part -------------- ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ -------------- next part -------------- _______________________________________________ Nagios-devel mailing list Nagios-devel at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-devel From nagios at nagios.org Mon Jan 28 15:52:04 2008 From: nagios at nagios.org (Ethan Galstad) Date: Mon, 28 Jan 2008 08:52:04 -0600 Subject: when is a notification command run? In-Reply-To: <479C8CAD.7080804@process-zero.de> References: <479BBAF4.3000606@gmail.com> <479C8CAD.7080804@process-zero.de> Message-ID: <479DEC14.5030901@nagios.org> Hendrik B?cker wrote: > Hi Antonio, > > Antonio Grassi schrieb: >> >> Does this mean that notifications are sent only if the system is >> configured to log notifications, or am I missunderstanding the code? >> > > I assume you were right. 
Regarding the sources the effectiv "my_system" > call to execute the command is only available if "log_notifications" is > true. > > I guess just moving the end of the if clause a few lines earlier should > do the trick. > > I've attached a unified diff against the latest CVS code. > > Regards, > Hendrik > Indeed - that's a pretty major bug. I'll get the patch applied and do the same for the corresponding host notification function, which also contains the bug. Ethan Galstad Nagios Developer ___ Email: nagios at nagios.org Web: www.nagios.org ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From nagios at nagios.org Mon Jan 28 15:53:59 2008 From: nagios at nagios.org (Ethan Galstad) Date: Mon, 28 Jan 2008 08:53:59 -0600 Subject: [PATCH] - Typo in freshness documentation In-Reply-To: <479C4C2F.7060901@process-zero.de> References: <479C4C2F.7060901@process-zero.de> Message-ID: <479DEC87.9020407@nagios.org> Hendrik B?cker wrote: > Hi Ethan, > hi list, > > Wolfgang from german Nagios-Portal recognized a missing beginning "/" in > the just updated freshness doc on line 122. > > Patch attached. > > Regards > Hendrik > Thanks! Ethan Galstad Nagios Developer ___ Email: nagios at nagios.org Web: www.nagios.org ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. 
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From nagios at nagios.org Mon Jan 28 16:05:36 2008 From: nagios at nagios.org (Ethan Galstad) Date: Mon, 28 Jan 2008 09:05:36 -0600 Subject: [patch] Re: flap detection + state retention In-Reply-To: <20080125154520.GA16863@duke.edu> References: <20080123221614.GA20554@duke.edu> <20080124201733.GA23455@duke.edu> <20080125154520.GA16863@duke.edu> Message-ID: <479DEF40.1080204@nagios.org> Tom Throckmorton wrote: > On Jan 24 15:17, Tom Throckmorton wrote: >> *delurks* >> >> Hello all, >> >> Can someone here please verify that in Nagios 2.x the state of flap_detection >> should or shouldn't be persistent across a reload / restart? >> >> See my post below, and the brief thread on nagios-users: >> >> http://thread.gmane.org/gmane.network.nagios.user/52044 >> >> Tests on 2.x seem to show that the retained values for global/host/service flap >> detection aren't being honored on a reload. This seems to contradict the docs, >> so I'm thinking it's a bug. > > So, it looks as if the calls for modified_attributes were completely missing > from the flapping routines, and appear to have been so since the inception of > 2.x. As i'm reading the code, these calls flag the attributes as modified, > which means they'll be considered later in the state retention code. Without > the calls, the state retention code ignores these attributes, which leads to > the problem. > > Please consider the attached patch against base/flapping.c - this is against > 2.10, which happens to be the same as CVS. I'd appreciate it if someone with a > better handle on the code (and better programming skills) would have a look to > make sure I'm not doing something dumb. > > I've tested this on 2.10 running on CentOS 5, and it appears to do the right > thing regarding retaining state across reloads. > > Cheers, > > > -tt > Thanks Tom - excellent patch. I'll get this into CVS shortly. 
Ethan Galstad Nagios Developer ___ Email: nagios at nagios.org Web: www.nagios.org ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From david.schmidt at univie.ac.at Mon Jan 28 16:27:41 2008 From: david.schmidt at univie.ac.at (David Schmidt) Date: Mon, 28 Jan 2008 16:27:41 +0100 (CET) Subject: ndo2db oracle development In-Reply-To: <474559B9.4040201@op5.se> References: <474559B9.4040201@op5.se> Message-ID: <20080128152741.5836C580051@desire.netways.de> For everyones information: I do have a working (compiles + doesnt crash + writes into oracle database) version ready. Details can be found here > http://www.nagios-portal.de/wbb/index.php?page=Thread&threadID=7140 Its in german only but I can translate the important stuff if there are some people interested. One of the next steps would be to integrate it into the nod2db project. Help is really appreciated here since I think some design changes should be made (right now there are many and huge #ifdef ORACLE switches and this doesnt seem like a good idea to me) - David Schmidt (davewood) ----------------------- This thread is located in the archive at this URL: http://www.nagiosexchange.org/nagios-devel.33.0.html?&tx_maillisttofaq_pi1[showUid]=6793 ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. 
From john.calcote at gmail.com Mon Jan 28 16:54:18 2008 From: john.calcote at gmail.com (John Calcote) Date: Mon, 28 Jan 2008 08:54:18 -0700 Subject: patch for nagios 3rc1 - update In-Reply-To: <47940FBF.7000106@gmail.com> References: <4790F982.5040103@gmail.com> <47940FBF.7000106@gmail.com> Message-ID: <3ee91eb90801280754y2cd4dc9dmf9df22e20223e72f@mail.gmail.com> Hi Ethan, I was just looking over the archives for the nagios-devel list on sf.net and I noticed that the patch I sent as an attachment was referenced by the archived message. But when I clicked on the link, I got a message indicating that the attachment was "not available on the server". This makes me wonder if I even attached it properly to begin with. Since I haven't seen any responses to my messages on this thread, I wondered if anyone actually got my patch. To be sure, I'm resending it as in-line text, rather than an attachment. Thanks in advance, John
diff -Naur nagios-3.0rc1/base/checks.c nagios-3.0rc1-patched/base/checks.c
--- nagios-3.0rc1/base/checks.c	2007-12-12 11:39:46.000000000 -0700
+++ nagios-3.0rc1-patched/base/checks.c	2008-01-20 20:10:06.000000000 -0700
@@ -380,11 +380,6 @@
 			*preferred_time+=(svc->check_interval*interval_length);
 		return ERROR;
 		}
-
-	/* neb module wants to override the service check - perhaps it will check the service itself */
-	/* NOTE: if a module does this, it has to do a lot of the stuff found below to make sure things don't get whacked out of shape! */
-	if(neb_result==NEBERROR_CALLBACKOVERRIDE)
-		return OK;
 #endif
 
 
@@ -410,6 +405,7 @@
 		log_debug_info(DEBUGL_CHECKS,0,"Raw check command for service '%s' on host '%s' was NULL - aborting.\n",svc->description,svc->host_name);
 		if(preferred_time)
 			*preferred_time+=(svc->check_interval*interval_length);
+		svc->latency=old_latency;
 		return ERROR;
 		}
 
@@ -419,6 +415,8 @@
 		log_debug_info(DEBUGL_CHECKS,0,"Processed check command for service '%s' on host '%s' was NULL - aborting.\n",svc->description,svc->host_name);
 		if(preferred_time)
 			*preferred_time+=(svc->check_interval*interval_length);
+		svc->latency=old_latency;
+		my_free(raw_command);
 		return ERROR;
 		}
 
@@ -431,6 +429,32 @@
 	/* set the execution flag */
 	svc->is_executing=TRUE;
 
+	/* start save check info */
+	check_result_info.object_check_type=SERVICE_CHECK;
+	check_result_info.check_type=SERVICE_CHECK_ACTIVE;
+	check_result_info.check_options=check_options;
+	check_result_info.scheduled_check=scheduled_check;
+	check_result_info.reschedule_check=reschedule_check;
+	check_result_info.start_time=start_time;
+	check_result_info.finish_time=start_time;
+	check_result_info.early_timeout=FALSE;
+	check_result_info.exited_ok=TRUE;
+	check_result_info.return_code=STATE_OK;
+	check_result_info.output=NULL;
+
+#ifdef USE_EVENT_BROKER
+	/* send data to event broker */
+	neb_result=broker_service_check(NEBTYPE_SERVICECHECK_INITIATE,NEBFLAG_NONE,NEBATTR_NONE,svc,SERVICE_CHECK_ACTIVE,start_time,end_time,svc->service_check_command,svc->latency,0.0,service_check_timeout,FALSE,0,processed_command,NULL);
+
+	/* neb module wants to override the service check - perhaps it will check the service itself */
+	if(neb_result==NEBERROR_CALLBACKOVERRIDE){
+		svc->latency=old_latency;
+		my_free(processed_command);
+		my_free(raw_command);
+		return OK;
+		}
+#endif
+
 	/* open a temp file for storing check output */
 	old_umask=umask(new_umask);
 	asprintf(&output_file,"%s/checkXXXXXX",temp_path);
@@ -446,21 +470,10 @@
 
 	log_debug_info(DEBUGL_CHECKS|DEBUGL_IPC,1,"Check result output will be written to '%s' (fd=%d)\n",output_file,check_result_info.output_file_fd);
 
-	/* save check info */
-	check_result_info.object_check_type=SERVICE_CHECK;
+	/* finish save check info */
 	check_result_info.host_name=(char *)strdup(svc->host_name);
 	check_result_info.service_description=(char *)strdup(svc->description);
-	check_result_info.check_type=SERVICE_CHECK_ACTIVE;
-	check_result_info.check_options=check_options;
-	check_result_info.scheduled_check=scheduled_check;
-	check_result_info.reschedule_check=reschedule_check;
 	check_result_info.output_file=(check_result_info.output_file_fd<0 || output_file==NULL)?NULL:strdup(output_file);
-	check_result_info.start_time=start_time;
-	check_result_info.finish_time=start_time;
-	check_result_info.early_timeout=FALSE;
-	check_result_info.exited_ok=TRUE;
-	check_result_info.return_code=STATE_OK;
-	check_result_info.output=NULL;
 
 	/* free memory */
 	my_free(output_file);
@@ -492,11 +505,6 @@
 
 	dbuf_init(&checkresult_dbuf,dbuf_chunk);
 
-#ifdef USE_EVENT_BROKER
-	/* send data to event broker */
-	broker_service_check(NEBTYPE_SERVICECHECK_INITIATE,NEBFLAG_NONE,NEBATTR_NONE,svc,SERVICE_CHECK_ACTIVE,start_time,end_time,svc->service_check_command,svc->latency,0.0,service_check_timeout,FALSE,0,processed_command,NULL);
-#endif
-
 	/* reset latency (permanent value will be set later) */
 	svc->latency=old_latency;
 
------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From nagios at nagios.org Mon Jan 28 17:02:34 2008 From: nagios at nagios.org (Ethan Galstad) Date: Mon, 28 Jan 2008 10:02:34 -0600 Subject: patch for nagios 3rc1 - update In-Reply-To: <3ee91eb90801280754y2cd4dc9dmf9df22e20223e72f@mail.gmail.com> References: <4790F982.5040103@gmail.com> <47940FBF.7000106@gmail.com> <3ee91eb90801280754y2cd4dc9dmf9df22e20223e72f@mail.gmail.com> Message-ID: <479DFC9A.2090808@nagios.org> John Calcote wrote: > Hi Ethan, > > I was just looking over the archives for the nagios-devel list on > sf.net and I noticed that the patch I sent as an attachment was > referenced by the archived message. But when I clicked on the link, I > got a message indicating that the attachment was "not available on the > server". This makes me wonder if I even attached it properly to begin > with. > > Since I haven't seen any responses to my messages on this thread, I > wondered if anyone actually got my patch. > > To be sure, I'm resending it as in-line text, rather than an attachment. > > Thanks in advance, > John > [snip] Hi John - The attachments came through okay. I'm actually reviewing them now and will have some comments later today. Ethan Galstad Nagios Developer ___ Email: nagios at nagios.org Web: www.nagios.org ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. 
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From john.calcote at gmail.com Mon Jan 28 17:06:46 2008 From: john.calcote at gmail.com (John Calcote) Date: Mon, 28 Jan 2008 09:06:46 -0700 Subject: patch for nagios 3rc1 - update In-Reply-To: <479DFC9A.2090808@nagios.org> References: <4790F982.5040103@gmail.com> <47940FBF.7000106@gmail.com> <3ee91eb90801280754y2cd4dc9dmf9df22e20223e72f@mail.gmail.com> <479DFC9A.2090808@nagios.org> Message-ID: <3ee91eb90801280806y157dcb23h2b6edb67d1592cf3@mail.gmail.com> Thanks - please ignore the last one I tried to send as inline text - I tried to apply it, and if failed miserably. I'm beginning to wonder if I really care for "Google's way of managing email"... :-} John On Jan 28, 2008 9:02 AM, Ethan Galstad wrote: > John Calcote wrote: > > Hi Ethan, > > > > I was just looking over the archives for the nagios-devel list on > > sf.net and I noticed that the patch I sent as an attachment was > > referenced by the archived message. But when I clicked on the link, I > > got a message indicating that the attachment was "not available on the > > server". This makes me wonder if I even attached it properly to begin > > with. > > > > Since I haven't seen any responses to my messages on this thread, I > > wondered if anyone actually got my patch. > > > > To be sure, I'm resending it as in-line text, rather than an attachment. > > > > Thanks in advance, > > John > > > [snip] > > Hi John - The attachments came through okay. I'm actually reviewing > them now and will have some comments later today. > > > Ethan Galstad > Nagios Developer > ___ > Email: nagios at nagios.org > Web: www.nagios.org > > ------------------------------------------------------------------------- > This SF.net email is sponsored by: Microsoft > Defy all challenges. Microsoft(R) Visual Studio 2008. 
From j.gabes at lectra.com Tue Jan 29 10:05:44 2008 From: j.gabes at lectra.com (Gabes Jean) Date: Tue, 29 Jan 2008 10:05:44 +0100 Subject: Patch for Circular Paths (new algo) 70s->0.007s :) In-Reply-To: <3ee91eb90801280806y157dcb23h2b6edb67d1592cf3@mail.gmail.com> References: <4790F982.5040103@gmail.com> <47940FBF.7000106@gmail.com><3ee91eb90801280754y2cd4dc9dmf9df22e20223e72f@mail.gmail.com><479DFC9A.2090808@nagios.org> <3ee91eb90801280806y157dcb23h2b6edb67d1592cf3@mail.gmail.com> Message-ID: <73BD1CEC958A564DB6771C5E97339F2806143792@SMAIL.eu.lectra.com> Hi Ethan, Hi Everyone, I saw this in the documentation: "That means all you CompSci graduate students who have been emailing me about doing your thesis on Nagios can contribute some code back. :-)" I don't know who they are, but I am trying to do the job: change the algorithm of the circular path check so that it has O(n) complexity. My algorithm optimises the circular check (for the moment only the host part, but I'm working on the services part too). I use a personally modified version of depth-first search (DFS, http://en.wikipedia.org/wiki/Depth-first_search). The hard work is already done: we already have the parent->child links (the child->parent links are not used). We "just" have to follow these links. It's a recursive algorithm. All nodes start out unchecked (value=0); when we begin to check a node, it becomes temporary_checked (value=1). We then check all of its children that are not already checked (here is the recursive part :) ).
If a child is in state temporary_checked, there is a loop: we do not follow the link, we set our own status to loop_inside (value=3), and we return it. If the child is in the OK state (value=2), there is no problem with that child. If the child is LOOP_DETECTED (value=3), we do not follow the link, and we return our status=loop_inside. If all children are OK, or if we have no children (no children, no loop :) ), we are OK and return. The algorithm works for code built with #ifdef NSCORE (I need the host links). Without NSCORE we can't use it. But on most systems that's OK, isn't it? The modifications are in these files: *objects.h (adds the DFS status macros and adds dfsCheckedStatus to the host structure) *objects.c (sets host->dfsCheckedStatus to DFS_UNCHECKED at initialisation) *config.c (adds 2 functions: dfsStatus to get the status of a host, and dfs to run the DFS algorithm on a host (and its children); and modifies pre_flight_circular_check to call dfs on every node that needs it (not already checked)). I know that the 2 functions belong in objects.c; I'll move them in the final patch. I generated a "big" configuration to test it: 10000 hosts (milleservers.cfg) that depend on 100 parents (parents.cfg). The 10000 contain no loop. The loop is in the test.cfg file: 5 hosts arranged in a circle. You can find the files (objects.c, objects.h and config.c) and the samples at: http://zegabes.free.fr/nagios/ I have put the code files in this mail. The original files are from 2 hours ago. Can you tell me how to make a "real" patch? With diff? But how can I have the modified version and the original version in the same directory? Thanks. I hope this patch can help you, like Nagios helps me every day :) I'm working on doing the same for services, then I'll clean everything up (in the right files...) and apply the Nagios coding style (for now it's the emacs one).
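The depth-first loop check described earlier in this message can be sketched along these lines. It is a minimal, self-contained illustration and not the code from the posted files: the sketch_host type, the fixed-size child array and the function names are assumptions for the example, while the four state values follow the numbers given in the message. Unlike the description, this sketch looks at every child before deciding its own status instead of returning at the first loop it meets; each host is still expanded only once.

```c
#include <assert.h>
#include <stddef.h>

/* State values follow the message; the identifiers are assumptions. */
enum dfs_status {
    DFS_UNCHECKED = 0,          /* not visited yet */
    DFS_TEMPORARY_CHECKED = 1,  /* visit in progress (on the current path) */
    DFS_OK = 2,                 /* fully explored, no loop below */
    DFS_LOOP_INSIDE = 3         /* on a loop, or leading into one */
};

#define SKETCH_MAX_CHILDREN 4

/* Hypothetical stand-in for the host structure and its parent->child links. */
typedef struct sketch_host {
    const char *name;
    struct sketch_host *children[SKETCH_MAX_CHILDREN];
    size_t child_count;
    enum dfs_status status;
} sketch_host;

void sketch_add_child(sketch_host *parent, sketch_host *child)
{
    assert(parent != NULL && child != NULL);
    assert(parent->child_count < SKETCH_MAX_CHILDREN);
    parent->children[parent->child_count++] = child;
}

/* Depth-first visit; a host that already has a status is never expanded again. */
enum dfs_status sketch_dfs(sketch_host *h)
{
    size_t i;
    int loop = 0;

    assert(h != NULL);
    if (h->status != DFS_UNCHECKED)
        return h->status;

    h->status = DFS_TEMPORARY_CHECKED;
    for (i = 0; i < h->child_count; i++) {
        sketch_host *c = h->children[i];
        if (c->status == DFS_UNCHECKED)
            sketch_dfs(c);
        /* A child still in progress closes a loop; a child already
         * marked as looping passes the mark up to its parent. */
        if (c->status == DFS_TEMPORARY_CHECKED || c->status == DFS_LOOP_INSIDE)
            loop = 1;
    }
    h->status = loop ? DFS_LOOP_INSIDE : DFS_OK;
    return h->status;
}
```

A caller would run sketch_dfs on every host that is still DFS_UNCHECKED and report those that end up as DFS_LOOP_INSIDE. Note that the recursion depth grows with the longest parent/child chain.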
To find the modifications, search for jean or Jean in the code ;)
*objects.h: line 382, and the beginning of the file for the DFS defines
*objects.c: line 922, for the default value for a host
*config.c: the very end, before pre_flight_circular_check
Here is a run of nagios -v with milleservers.cfg, parents.cfg and test.cfg:
My algo begin
Big problem on child :srv0 with root:srv4
Part of the loop problem on child :srv4 with root:srv3
Part of the loop problem on child :srv3 with root:srv2
Part of the loop problem on child :srv2 with root:srv1
Part of the loop problem on child :srv1 with root:srv0
My algo end
My Circular Paths: 0.006864 sec *
Error: There is a circular parent/child path that exists for host 'srv0'!
Error: There is a circular parent/child path that exists for host 'srv1'!
Error: There is a circular parent/child path that exists for host 'srv2'!
Error: There is a circular parent/child path that exists for host 'srv3'!
Error: There is a circular parent/child path that exists for host 'srv4'!
Timing information on configuration verification is listed below.
CONFIG VERIFICATION TIMES (* = Potential for speedup with -x option)
----------------------------------
Object Relationships: 0.972514 sec
Circular Paths: 70.060319 sec *
Misc: 0.004837 sec
============
TOTAL: 71.037670 sec * = 70.060319 sec (98.6%) estimated savings
***> One or more problems was encountered while running the pre-flight check...
Great, isn't it? Original way: 70.060319 sec (Circular Paths); new way: 0.006864 sec (My Circular Paths). Let me know if you want more information. I'll tell you when I have finished the service dependencies :) Jean Gabes (alias: naparuba) PS: sorry for my English, I'm French, but I really try. ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008.

From j.gabes at lectra.com Tue Jan 29 10:15:16 2008
From: j.gabes at lectra.com (Gabes Jean)
Date: Tue, 29 Jan 2008 10:15:16 +0100
Subject: Patch for Circular Paths (new algo) 70s->0.007s :)
In-Reply-To: <73BD1CEC958A564DB6771C5E97339F2806143792@SMAIL.eu.lectra.com>
References: <4790F982.5040103@gmail.com><47940FBF.7000106@gmail.com><3ee91eb90801280754y2cd4dc9dmf9df22e20223e72f@mail.gmail.com><479DFC9A.2090808@nagios.org><3ee91eb90801280806y157dcb23h2b6edb67d1592cf3@mail.gmail.com> <73BD1CEC958A564DB6771C5E97339F2806143792@SMAIL.eu.lectra.com>
Message-ID: <73BD1CEC958A564DB6771C5E97339F280614381F@SMAIL.eu.lectra.com>

Oops, I forgot the files :p

Jean

-------------- next part --------------
A non-text attachment was scrubbed...
Name: config.c
Type: application/octet-stream
Size: 93706 bytes
Desc: config.c
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: objects.c
Type: application/octet-stream
Size: 141746 bytes
Desc: objects.c
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: objects.h
Type: application/octet-stream
Size: 29558 bytes
Desc: objects.h
URL:

From j.gabes at lectra.com Tue Jan 29 10:32:02 2008
From: j.gabes at lectra.com (Gabes Jean)
Date: Tue, 29 Jan 2008 10:32:02 +0100
Subject: Patch for Circular Paths (new algo) 70s->0.007s :)
In-Reply-To: <73BD1CEC958A564DB6771C5E97339F2806143792@SMAIL.eu.lectra.com>
References: <4790F982.5040103@gmail.com><47940FBF.7000106@gmail.com><3ee91eb90801280754y2cd4dc9dmf9df22e20223e72f@mail.gmail.com><479DFC9A.2090808@nagios.org><3ee91eb90801280806y157dcb23h2b6edb67d1592cf3@mail.gmail.com> <73BD1CEC958A564DB6771C5E97339F2806143792@SMAIL.eu.lectra.com>
Message-ID: <73BD1CEC958A564DB6771C5E97339F2806143908@SMAIL.eu.lectra.com>

I tried to make diffs between modifiedversion.c and originalversion.

Jean

-------------- next part --------------
A non-text attachment was scrubbed...
Name: config.c.patch
Type: application/octet-stream
Size: 4026 bytes
Desc: config.c.patch
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: objects.c.patch
Type: application/octet-stream
Size: 150 bytes
Desc: objects.c.patch
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: objects.h.patch
Type: application/octet-stream
Size: 459 bytes
Desc: objects.h.patch
URL:

From matthias at tuxlife.de Tue Jan 29 18:57:35 2008
From: matthias at tuxlife.de (Matthias Kerk)
Date: Tue, 29 Jan 2008 18:57:35 +0100
Subject: illegal_macro_output_chars and $SERVICEOUTPUT$
Message-ID: <479F690F.6000409@tuxlife.de>

Hi,

Nagios does not filter illegal characters from $SERVICEOUTPUT$ and similar macros.

[1201267690.536801] [032.0] [pid=28051] ** Service Notification Attempt ** Host: 'matthiastest', Service: 'TEST', Type: 0, Options: 0, Current State: 1, Last Notification: Thu Jan 1 01:00:00 1970
[1201267690.536857] [032.0] [pid=28051] Notification viability test passed.
[1201267690.536867] [032.1] [pid=28051] Current notification number: 16 (incremented)
[1201267690.536875] [032.2] [pid=28051] Creating list of contacts to be notified.
[1201267690.537221] [032.1] [pid=28051] Service notification will NOT be escalated.
[1201267690.537236] [032.1] [pid=28051] Adding normal contacts for service to notification list.
[1201267690.537245] [032.2] [pid=28051] Adding members of contact group 'matthias' for service to notification list.
[1201267690.537253] [032.2] [pid=28051] Adding contact 'matthias' to notification list.
[1201267690.537302] [032.2] [pid=28051] ** Attempting to notifying contact 'matthias'...
[1201267690.537312] [032.2] [pid=28051] ** Checking service notification viability for contact 'matthias'...
[1201267690.537327] [032.2] [pid=28051] ** Service notification viability for contact 'matthias' PASSED.
[1201267690.537335] [032.2] [pid=28051] ** Notifying contact 'matthias'
[1201267690.537413] [032.2] [pid=28051] Raw notification command: /usr/bin/printf "%b" "***** Nagios 3.0rc1 *****\n\nNotification Type:
[1201267690.537422] [032.2] [pid=28051] Processed notification command: /usr/bin/printf "%b" "***** Nagios 3.0rc1 *****\n\nNotification Type: PROBLEM\n\nService: TEST\nHost: test.example.com\nAddress: 127.0.0.1\nState: WARNING\n\nDate/Time: Fri Jan 25 14:28:10 CET 2008\n\nAdditional Info:\n\nSNMP CRITICAL - 1 "md3 : active 'test' raid1 hdc5[2](F) hda5[0]"\n\nAttempt: 16" | /bin/mail -s "** PROBLEM alert - test.example.com/TEST is WARNING **" matthias-mail at example.com
[1201267690.559551] [032.2] [pid=28051] Raw notification command: /usr/bin/printf "%b" "Service:
[1201267690.559604] [032.2] [pid=28051] Processed notification command: /usr/bin/printf "%b" "Service: TEST\nHost: test.example.com\nAddress: 81.169.135.107\nState: WARNING\nInfo: SNMP CRITICAL - 1 "md3 : active 'test' raid1 hdc5[2](F) hda5[0]"\nDate: Fri Jan 25 14:28:10 CET 2008" | /bin/mail -s "PROBLEM: test.example.com/TEST is WARNING" matthias-sms at example.com
[1201267690.581757] [032.2] [pid=28051] Calculating next valid notification time...
[1201267690.581811] [032.2] [pid=28051] Default interval: 120.000000
[1201267690.582192] [032.2] [pid=28051] Interval used for calculating next valid notification time: 120.000000
[1201267690.582222] [032.0] [pid=28051] No contacts were notified. Next possible notification time: Fri Jan 25 16:28:10 2008

I believe the problem is in common/macros.c:

 90 int clean_options=0;
...
153 /* grab the macro value */
154 result=grab_macro_value(temp_buffer,&selected_macro,&clean_options,&free_macro);
...
196 /* include any cleaning options passed back to us */
197 options&=clean_options;
...
393 int grab_macro_value(char *macro_buffer, char **output, int *clean_options, int *free_macro){
...
584 if(result==OK)
585 *clean_options&=(STRIP_ILLEGAL_MACRO_CHARS|ESCAPE_MACRO_CHARS);

Since clean_options starts at 0, the AND on line 585 can never set a bit:

clean_options & (STRIP_ILLEGAL_MACRO_CHARS|ESCAPE_MACRO_CHARS) = 0
0 & (1|2) = 0
0 & 3 = 0

Best regards,
Matthias

From matthias at tuxlife.de Tue Jan 29 19:34:41 2008
From: matthias at tuxlife.de (Matthias Kerk)
Date: Tue, 29 Jan 2008 19:34:41 +0100
Subject: illegal_macro_output_chars and $SERVICEOUTPUT$
In-Reply-To: <479F690F.6000409@tuxlife.de>
References: <479F690F.6000409@tuxlife.de>
Message-ID: <479F71C1.7090205@tuxlife.de>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

- --- ../orig/nagios-3.0rc2/common/macros.c 2008-01-15 21:35:52.000000000 +0100
+++ common/macros.c 2008-01-29 19:27:18.000000000 +0100
@@ -194,8 +194,8 @@
 if(selected_macro!=NULL){
 /* include any cleaning options passed back to us */
- - options&=clean_options;
- -
+ //options&=clean_options;
+ options=(STRIP_ILLEGAL_MACRO_CHARS|ESCAPE_MACRO_CHARS);
 /* URL encode the macro if requested - this allocates new memory */
 if(options & URL_ENCODE_MACRO_CHARS){
 original_macro=selected_macro;

This is not nice, but it fixes my problem. Does anyone have a better/cleaner solution?

Thanks,
Matthias
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org

iD8DBQFHn3HBTG9/zWWjsBsRAprzAJ9X6zWtI0xS5D6j9nnKpMyads/TtwCaA8zA
sbhIiuUO/FrfWK7Exny/GNM=
=yBz0
-----END PGP SIGNATURE-----
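The bit arithmetic in the report above can be checked in isolation. The sketch below is only an illustration: the flag values 1 and 2 are taken from the arithmetic quoted in the message, merge_with_and and merge_with_or are hypothetical helper names, and using OR is shown purely as a contrast. Whether that is the right fix for macros.c is an assumption that the thread does not settle.

```c
/* Flag values taken from the arithmetic in the message (1 and 2). */
#define STRIP_ILLEGAL_MACRO_CHARS 1
#define ESCAPE_MACRO_CHARS        2

/* Mirrors the quoted line 585: AND-ing the flags into the accumulator.
   Starting from 0, no bit can ever be set this way. */
int merge_with_and(int clean_options)
{
    clean_options &= (STRIP_ILLEGAL_MACRO_CHARS | ESCAPE_MACRO_CHARS);
    return clean_options;
}

/* Hypothetical contrast: OR-ing sets both flags whatever the start value. */
int merge_with_or(int clean_options)
{
    clean_options |= (STRIP_ILLEGAL_MACRO_CHARS | ESCAPE_MACRO_CHARS);
    return clean_options;
}
```

With a zero-initialised accumulator, merge_with_and(0) stays 0, so a later options &= clean_options clears every option bit, while merge_with_or(0) yields 3 (both flags set).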

From john.calcote at gmail.com Tue Jan 29 21:25:46 2008
From: john.calcote at gmail.com (John Calcote)
Date: Tue, 29 Jan 2008 13:25:46 -0700
Subject: DNX patch for Nagios 3.0 rc2
Message-ID: <479F8BCA.1030100@gmail.com>

Hi Ethan,

I realize you're probably still looking over my changes to 3.0 rc1. But
I had to post a patch for rc2 on the dnx-devel list so that people
wanting to play with dnx on nagios 3 could move ahead with rc2.

Since you didn't take the critical portions of my rc1 patch, I assume
you have some issues with it. No doubt, I broke something.

I've attached the rc2 version of my patch (same design). I'll continue
to wait for your comments.

Thanks!
John
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: nagios-3.0rc2-dnx.patch
URL:

From astinus at gentoo.org Tue Jan 29 21:47:46 2008
From: astinus at gentoo.org (Alex Howells)
Date: Tue, 29 Jan 2008 20:47:46 +0000
Subject: DNX patch for Nagios 3.0 rc2
In-Reply-To: <479F8BCA.1030100@gmail.com>
References: <479F8BCA.1030100@gmail.com>
Message-ID:

On 1/29/08, John Calcote wrote:
> I realize you're probably still looking over my changes to 3.0 rc1. But
> I had to post a patch for rc2 on the dnx-devel list so that people
> wanting to play with dnx on nagios 3 could move ahead with rc2.
>
> Since you didn't take the critical portions of my rc1 patch, I assume
> you have some issues with it. No doubt, I broke something.
>
> I've attached the rc2 version of my patch (same design). I'll continue
> to wait for your comments.

I'm quite a fan of the DNX stuff, conceptually, although I've not had
a chance to play with it in production / testing yet. It certainly
seems to scale better than the existing architecture.

+1 for the hope this eventually gets merged properly.

Don't suppose you guys have a decent interface up your sleeves too? :P
Scaling the stock CGI interface to multiple users/groups each seems
*very* difficult; all the links are static and it's just begging
someone to write it in PHP, Ruby on Rails, or something.

From michael_luebben at web.de Wed Jan 30 07:50:37 2008
From: michael_luebben at web.de (Michael Lübben)
Date: Wed, 30 Jan 2008 07:50:37 +0100
Subject:
Message-ID: <1455310730@web.de>

From michael_luebben at web.de Wed Jan 30 07:52:00 2008
From: michael_luebben at web.de (Michael Lübben)
Date: Wed, 30 Jan 2008 07:52:00 +0100
Subject: Bug in nagiostat (passive service check latency)
Message-ID: <1455311005@web.de>

Hi Ethan, hi list,

I have found a bug in nagiostats in Nagios version 3.0rc1. I get no value for the option AVGPSVSVCLAT!
In the file nagiostats.c I found the following code:

nagiostats.c:

/* passive service check latency */
else if(!strcmp(temp_ptr,"PSVACTSVCLAT"))
	printf("%d%s",(int)(min_passive_service_latency*1000),mrtg_delimiter);
else if(!strcmp(temp_ptr,"PSVACTSVCLAT"))
	printf("%d%s",(int)(max_passive_service_latency*1000),mrtg_delimiter);
else if(!strcmp(temp_ptr,"PSVACTSVCLAT"))
	printf("%d%s",(int)(average_passive_service_latency*1000),mrtg_delimiter);

All three branches test the same key, so MINPSVSVCLAT, MAXPSVSVCLAT and AVGPSVSVCLAT are never matched.

My solution:

/* passive service check latency */
else if(!strcmp(temp_ptr,"MINPSVSVCLAT"))
	printf("%d%s",(int)(min_passive_service_latency*1000),mrtg_delimiter);
else if(!strcmp(temp_ptr,"MAXPSVSVCLAT"))
	printf("%d%s",(int)(max_passive_service_latency*1000),mrtg_delimiter);
else if(!strcmp(temp_ptr,"AVGPSVSVCLAT"))
	printf("%d%s",(int)(average_passive_service_latency*1000),mrtg_delimiter);

I am not very good at C and I hope it's right.

Bye
Michael

From ojan at gfi.fr Wed Jan 30 09:57:53 2008
From: ojan at gfi.fr (Olivier JAN)
Date: Wed, 30 Jan 2008 09:57:53 +0100
Subject: multi-line configuration bug in rc2 ?
In-Reply-To: <20080123121037.R45874@tribble.ilrt.bris.ac.uk>
References: <20080122160228.C7209@tribble.ilrt.bris.ac.uk> <1201021699.26176.9.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com> <20080122175918.F8384@tribble.ilrt.bris.ac.uk> <20080122183151.M8384@tribble.ilrt.bris.ac.uk> <4796FDC4.1040105@op5.se> <20080123121037.R45874@tribble.ilrt.bris.ac.uk>
Message-ID: <20080130095753.349gzrnjq8o04ko8@intra.expertise-online.net>

Hi all,

This serviceescalation object worked in Nagios 3.0rc1 and now gives an error with 3.0rc2. It seems the backslash is no longer allowed.

define serviceescalation{
    hostgroup_name          ALL
    service_description     DISK_C,DISK_CIO,DISK_D,DISK_DIO,DISK_F,DISK_FIO,HARD_SYSTEM,HARD_TEMP,\
                            LOAD_CPU,LOAD_RAM,MSSQL_DISK,PING_CAISSE,PRINT_ERROR
    first_notification      3
    last_notification       100
    notification_interval   60
    contact_groups          help_n1,help_n2
    escalation_period       workhours
    escalation_options      c,u,w,r
    }

Olivier JAN

From gmueller at netways.de Wed Jan 30 10:17:06 2008
From: gmueller at netways.de (Gerd Mueller)
Date: Wed, 30 Jan 2008 10:17:06 +0100
Subject: specified hostgroup has no members Nagios 3.0rc2 Bug?
Message-ID: <1201684626.11813.14.camel@netl-gm-01.int.netways.de>

Hi developers,

I am getting the following warnings during the "Reading configuration data..." step:

Warning: Specified hostgroup 'preita.abc' has no members (config file '/etc/nagios/config/abc/hostgroups.cfg', starting on line 1)
Warning: Specified hostgroup 'def' has no members (config file '/etc/nagios/config/hostgroup.cfg', starting on line 1)
Warning: Specified hostgroup 'postit.def' has no members (config file '/etc/nagios/config/def/hostgroups.cfg', starting on line 1)

But Nagios starts without any errors. And Nagios knows the groups and their members.

objects.cache: e.g. one group:

define hostgroup {
    hostgroup_name  preita.abc
    alias           preita.abc
    members         batch2.preita.abc,batch1.preita.abc,online2.preita.abc,online1.preita.abc
    }

So I am pretty sure these warning messages are wrong!

BTW, 2 wishes about hostgroups:

1. empty hostgroups
If hostgroups without any members exist, Nagios won't start. I would really
appreciate it if Nagios could throw warnings instead.

2. hide hostgroups from the cgi interface
With earlier Nagios versions it was possible to define groups as
templates (register 0). These groups weren't shown on the web interface.
I would really appreciate a switch like "hide_from_cgis 0/1" for the
hostgroups.

Cheers,

Gerd

--
Gerd Mueller
Senior Consultant

NETWAYS GmbH | Deutschherrnstr. 47a | D-90429 Nürnberg
Tel: +49 911 92885-0 | Fax: +49 911 92885-33
GF: Julian Hein | AG Nürnberg HRB18461

http://www.netways.de | gmueller at netways.de

From ojan at gfi.fr Wed Jan 30 10:32:30 2008
From: ojan at gfi.fr (Olivier JAN)
Date: Wed, 30 Jan 2008 10:32:30 +0100
Subject: specified hostgroup has no members Nagios 3.0rc2 Bug?
In-Reply-To: <1201684626.11813.14.camel@netl-gm-01.int.netways.de>
References: <1201684626.11813.14.camel@netl-gm-01.int.netways.de>
Message-ID: <20080130103230.x3gm1e4ls8kwg4cc@intra.expertise-online.net>

I would also appreciate those two possibilities

Olivier JAN

Gerd Mueller wrote:
>
> BTW 2 wishes about hostgroups:
>
> 1. empty hostgroups
> if hostgroups without any members exist Nagios won't start. I really
> would appreciate if Nagios could throw warnings instead.
>
> 2. hide hostgroups from the cgi interface
> With earlier Nagios Versions it was possible to define groups as
> templates (register 0). These groups weren't shown on the webinterface.
> I would really appreciate a switch like "hide_from_cgis 0/1" for the
> hostgroups.
>
> Cheers,
>
> Gerd

From naparuba at gmail.com Wed Jan 30 11:04:41 2008
From: naparuba at gmail.com (nap)
Date: Wed, 30 Jan 2008 11:04:41 +0100
Subject: Patch for Circular Paths (new algo) 70s->0.007s :)
In-Reply-To: <6f8615170801300201p3eac818ch88f5e888f9e0f99c@mail.gmail.com>
References: <73BD1CEC958A564DB6771C5E97339F2806166AF0@SMAIL.eu.lectra.com> <6f8615170801300201p3eac818ch88f5e888f9e0f99c@mail.gmail.com>
Message-ID: <6f8615170801300204s271d231agd89cd8f8f411d271@mail.gmail.com>

Hi Ethan,
Hi Everyone,

I already sent this message to the list from another address, but it failed (Exchange server...), so I am resending it.

I saw in the documentation:
"That means all you CompSci graduate students who have been emailing me about doing your thesis on Nagios can contribute some code back. :-)"
I don't know who they are, but I am trying to do the job: changing the algorithm of the circular check so that it has O(n) complexity.
My algo can optimise the circular check (for the moment, only the host part, but I'm working on the services part too). I use a personal modified version of the Deep First Search (http://en.wikipedia.org/wiki/Depth-first_search) (DFS) The hard work is already done: we've got link between parents->childs (the childs->parents link are not used). We "just" have to follow theses links. It's a recursive algo. All node are in the beginning unchecked (value =0), when we began to check them, it's temporary_checked (value=1). We check all childs if they are not already checked (here is the recursive part :) ). If a child is in state temporary_checked, there is a loop, not follow the link, make our status loop_inside (value=3),and return it. If the child is a ok state (value=2), ok no problem with the child. If the child is LOOP_DETECTED (value=3), do not follow the link, and we return our status=loop_inside. If all childs are OK or if we don't have child (no childs, no loop :) ), we are OK an return. The algo is ok for code with #ifdef NSCORE (I need the host link). If we does not have NSCORE, we can't use it. But in the major system it's ok isn't it? The modifications are in the files: *objects.h (to add macro DFS STATUS and add dfsCheckedStatus to host structure) *objects.c (to add DFS_UNCHECKED at the initialisation of host-> dfsCheckedStatus) *config.c (add 2 fonctions: dfsStatus to get the status of a host, dfs for make the dfs algo for a host (and it's childs), and modifie the pre_flight_circular_check for call dfs on all node that need it (not already checked)). I know that the 2 function need to be in the objects.c, but I'll move them in the final patch. I generated a "big" conf to test it: 10000 hosts (milleservers.cfg), dependant of 100 parents (parents.cfg). The 10000 are not all looped. The loop are in the test.cfg file: 5 hosts, in a circular way. 
You can find the files (objects.c,h and config.c), the patch and the samples at: http://zegabes.free.fr/nagios/ I put in this mail the code files and the patchs. The original files are from 2 hours ago. I try to make diff, but I don't know if they are ok with "patch" I hope this patch can help you, like Nagios help me every days :) I'm working for make this algo for services, them clean all (in the good files...) and apply the code typing of Nagios (here it's emacs one). To find modification: search jean or Jean on the code ;) *objects.h: L382 and the begining for DEFINE DFS* *objects.c : L922 for default value for host *config.c: the very end, before the pre_flight_circular_check Here is a launch of Nagios -v with milleservers, parents and test.cfg: My algo begin Big problem on child :srv0 with root:srv4 Part of the loop problem on child :srv4 with root:srv3 Part of the loop problem on child :srv3 with root:srv2 Part of the loop problem on child :srv2 with root:srv1 Part of the loop problem on child :srv1 with root:srv0 My algo end My Circular Paths: 0.006864 sec * Error: There is a circular parent/child path that exists for host 'srv0'! Error: There is a circular parent/child path that exists for host 'srv1'! Error: There is a circular parent/child path that exists for host 'srv2'! Error: There is a circular parent/child path that exists for host 'srv3'! Error: There is a circular parent/child path that exists for host 'srv4'! Timing information on configuration verification is listed below. CONFIG VERIFICATION TIMES (* = Potential for speedup with -x option) ---------------------------------- Object Relationships: 0.972514 sec Circular Paths: 70.060319 sec * Misc: 0.004837 sec ============ TOTAL: 71.037670 sec * = 70.060319 sec (98.6%) estimated savings ***> One or more problems was encountered while running the pre-flight check... Great isn't it? original way:71.037670 new way: 0.006864 sec (My Circular Paths). 
Let me know if you want more information. I'll tell you when I finish the services dependencies :)

Jean Gabes (alias: naparuba)

PS: sorry for my English, I'm French, but I really try.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: config.c
Type: text/x-c
Size: 93706 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: config.c.patch
Type: application/octet-stream
Size: 4026 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: objects.c
Type: text/x-c
Size: 141746 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: objects.c.patch
Type: application/octet-stream
Size: 150 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: objects.h
Type: application/octet-stream
Size: 29558 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: objects.h.patch
Type: application/octet-stream
Size: 459 bytes
Desc: not available
URL:
-------------- next part --------------
-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ -------------- next part -------------- _______________________________________________ Nagios-devel mailing list Nagios-devel at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-devel From ae at op5.se Wed Jan 30 15:04:55 2008 From: ae at op5.se (Andreas Ericsson) Date: Wed, 30 Jan 2008 15:04:55 +0100 Subject: Patch for Circular Paths (new algo) 70s->0.007s :) In-Reply-To: <6f8615170801300204s271d231agd89cd8f8f411d271@mail.gmail.com> References: <73BD1CEC958A564DB6771C5E97339F2806166AF0@SMAIL.eu.lectra.com> <6f8615170801300201p3eac818ch88f5e888f9e0f99c@mail.gmail.com> <6f8615170801300204s271d231agd89cd8f8f411d271@mail.gmail.com> Message-ID: <47A08407.6020800@op5.se> nap wrote: > Hi Ethan, > Hi Everyone, > > I already send this message to the list by another adresse, but it > failed (exchange server...), so I resend it. > > > I saw on the documentation: > "That means all you CompSci graduate students who have been emailing > me about doing your thesis on Nagios can contribute some code back. > :-)" > I don't know who they are, but I try to do the job: change the > algorithm of the circular check in order to have a O(n) complexity. > > My algo can optimise the circular check (for the moment, only the host > part, but I'm working on the services part too). I use a personal > modified version of the Deep First Search > (http://en.wikipedia.org/wiki/Depth-first_search) (DFS) > > The hard work is already done: we've got link between parents->childs > (the childs->parents link are not used). We "just" have to follow > theses links. > > It's a recursive algo. > > All node are in the beginning unchecked (value =0), when we began to > check them, it's temporary_checked (value=1). We check all childs if > they are not already checked (here is the recursive part :) ). 
> If a child is in state temporary_checked, there is a loop, not follow > the link, make our status loop_inside (value=3),and return it. > If the child is a ok state (value=2), ok no problem with the child. > If the child is LOOP_DETECTED (value=3), do not follow the link, and > we return our status=loop_inside. > If all childs are OK or if we don't have child (no childs, no loop :) > ), we are OK an return. > > The algo is ok for code with #ifdef NSCORE (I need the host link). > If we does not have NSCORE, we can't use it. But in the major system > it's ok isn't it? > Yes, the CGI's shouldn't ever read the un-expanded objects anyway, and Nagios should never write circular parent chains to the objects.cache- file, so whatever code there should be simply never kicks in. > The modifications are in the files: > *objects.h (to add macro DFS STATUS and add dfsCheckedStatus to host structure) > *objects.c (to add DFS_UNCHECKED at the initialisation of host-> > dfsCheckedStatus) > *config.c (add 2 fonctions: dfsStatus to get the status of a host, dfs > for make the dfs algo for a host (and it's childs), and modifie the > pre_flight_circular_check for call dfs on all node that need it (not > already checked)). > > I know that the 2 function need to be in the objects.c, but I'll move > them in the final patch. > > I generated a "big" conf to test it: 10000 hosts (milleservers.cfg), > dependant of 100 parents (parents.cfg). The 10000 are not all looped. > The loop are in the test.cfg file: 5 hosts, in a circular way. > > You can find the files (objects.c,h and config.c), the patch and the samples at: > http://zegabes.free.fr/nagios/ > > I put in this mail the code files and the patchs. The original files > are from 2 hours ago. > > I try to make diff, but I don't know if they are ok with "patch" > They are, but such diffs are really hard to read. If you could redo the patches like this: cp -a nagios nagios.orig cd nagios # (hack, hack, hack) diff -urN ../nagios.orig . 
> circular-parents.patch

and then send the file circular-parents.patch to this list, it'll be
a lot easier to review and apply.

Judging from what I've seen so far though, I have a few issues with
the code. It's only style so far. My head hurts when trying to review
diffs in non-unified format.

* Nagios code doesn't use CamelCase. Stick to snake_case for variables
and suchlike (it's actually proven that CamelCase is harder to read for
a majority of people and is much more often misspelled or misread).

* Indentation doesn't follow that which already exists in Nagios.

>
> CONFIG VERIFICATION TIMES (* = Potential for speedup with -x option)
> ----------------------------------
> Object Relationships: 0.972514 sec
> Circular Paths: 70.060319 sec *
> Misc: 0.004837 sec
> ============
> TOTAL: 71.037670 sec * = 70.060319 sec (98.6%) estimated savings
>

That's not entirely true though, as you aren't removing the circular paths
check entirely, just optimizing it. Anyways, this looks really good. Barring
any breakage in currently grok'ed config syntax, I think this is a really
nice optimization.

--
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ From naparuba at gmail.com Wed Jan 30 15:21:16 2008 From: naparuba at gmail.com (nap) Date: Wed, 30 Jan 2008 15:21:16 +0100 Subject: Patch for Circular Paths (new algo) 70s->0.007s :) In-Reply-To: <47A08407.6020800@op5.se> References: <73BD1CEC958A564DB6771C5E97339F2806166AF0@SMAIL.eu.lectra.com> <6f8615170801300201p3eac818ch88f5e888f9e0f99c@mail.gmail.com> <6f8615170801300204s271d231agd89cd8f8f411d271@mail.gmail.com> <47A08407.6020800@op5.se> Message-ID: <6f8615170801300621r35c267fbvf7563de31f5c5ec1@mail.gmail.com> On Jan 30, 2008 3:04 PM, Andreas Ericsson wrote: > > nap wrote: > > Hi Ethan, > > Hi Everyone, > > > > I already send this message to the list by another adresse, but it > > failed (exchange server...), so I resend it. > > > > > > I saw on the documentation: > > "That means all you CompSci graduate students who have been emailing > > me about doing your thesis on Nagios can contribute some code back. > > :-)" > > I don't know who they are, but I try to do the job: change the > > algorithm of the circular check in order to have a O(n) complexity. > > > > My algo can optimise the circular check (for the moment, only the host > > part, but I'm working on the services part too). I use a personal > > modified version of the Deep First Search > > (http://en.wikipedia.org/wiki/Depth-first_search) (DFS) > > > > The hard work is already done: we've got link between parents->childs > > (the childs->parents link are not used). We "just" have to follow > > theses links. > > > > It's a recursive algo. > > > > All node are in the beginning unchecked (value =0), when we began to > > check them, it's temporary_checked (value=1). We check all childs if > > they are not already checked (here is the recursive part :) ). > > If a child is in state temporary_checked, there is a loop, not follow > > the link, make our status loop_inside (value=3),and return it. 
> > If the child is a ok state (value=2), ok no problem with the child. > > If the child is LOOP_DETECTED (value=3), do not follow the link, and > > we return our status=loop_inside. > > If all childs are OK or if we don't have child (no childs, no loop :) > > ), we are OK an return. > > > > The algo is ok for code with #ifdef NSCORE (I need the host link). > > If we does not have NSCORE, we can't use it. But in the major system > > it's ok isn't it? > > > > Yes, the CGI's shouldn't ever read the un-expanded objects anyway, and > Nagios should never write circular parent chains to the objects.cache- > file, so whatever code there should be simply never kicks in. Ok > > > The modifications are in the files: > > *objects.h (to add macro DFS STATUS and add dfsCheckedStatus to host structure) > > *objects.c (to add DFS_UNCHECKED at the initialisation of host-> > > dfsCheckedStatus) > > *config.c (add 2 fonctions: dfsStatus to get the status of a host, dfs > > for make the dfs algo for a host (and it's childs), and modifie the > > pre_flight_circular_check for call dfs on all node that need it (not > > already checked)). > > > > I know that the 2 function need to be in the objects.c, but I'll move > > them in the final patch. > > > > I generated a "big" conf to test it: 10000 hosts (milleservers.cfg), > > dependant of 100 parents (parents.cfg). The 10000 are not all looped. > > The loop are in the test.cfg file: 5 hosts, in a circular way. > > > > You can find the files (objects.c,h and config.c), the patch and the samples at: > > http://zegabes.free.fr/nagios/ > > > > I put in this mail the code files and the patchs. The original files > > are from 2 hours ago. > > > > I try to make diff, but I don't know if they are ok with "patch" > > > > They are, but such diffs are really hard to read. If you could redo the > patches like this: > > cp -a nagios nagios.orig > cd nagios > # (hack, hack, hack) > diff -urN ../nagios.orig . 
> circular-parents.patch Thanks, I'll do like this for the final patch. > > and then send the file circular-parents.patch to this list, it'll be > a lot easier to review and apply. > > Judging from what I've seen so far though, I have a few issues with > the code. It's only style so far. My head hurts when trying to review > diffs in non-unified format. > > * Nagios code doesn't use CamelCase. Stick to snake_case for variables > and suchlike (it's actually proven that CamelCase is harder to read for > a majority of people and is much more often misspelled or misread). Ok, I'll change this. > > * Indentation doesn't follow that which already exists in Nagios. Ok, I'll change this too. > > > > > CONFIG VERIFICATION TIMES (* = Potential for speedup with -x option) > > ---------------------------------- > > Object Relationships: 0.972514 sec > > Circular Paths: 70.060319 sec * > > Misc: 0.004837 sec > > ============ > > TOTAL: 71.037670 sec * = 70.060319 sec (98.6%) estimated savings > > > > That's not entirely true though, as you aren't removing the circular paths > check entirely, just optimizing it. Anyways, this looks really good. Barring > any breakage in currently grok'ed config syntax, I think this is a really > nice optimization. Thanks. Yes, I just optimize it. And it's just for host path for now. I'm working for host, services and notifications dependencies. Jean > > -- > Andreas Ericsson andreas.ericsson at op5.se > OP5 AB www.op5.se > Tel: +46 8-230225 Fax: +46 8-230231 > > ------------------------------------------------------------------------- > This SF.net email is sponsored by: Microsoft > Defy all challenges. Microsoft(R) Visual Studio 2008. 
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> _______________________________________________
> Nagios-devel mailing list
> Nagios-devel at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-devel
>

From badri at diglinks.com Wed Jan 30 15:43:11 2008
From: badri at diglinks.com (Badri Pillai)
Date: Wed, 30 Jan 2008 15:43:11 +0100
Subject: Release of n2rrd-1.3.2RC1 ...
Message-ID: <47A08CFF.5020004@diglinks.com>

Hi all,

FYI, n2rrd 1.3.2RC1 was released today. From the changelog:

1.3.1 -> 1.3.2RC1
--------------
1) n2rrd.pl patches from wish list

1.1) Service Maps is checked first; this way you can simulate a rewrite or really map a service name to some other string. This should not affect your previous installations.

1.2) Introduction of a new variable named "TEMPLATE_NAME_AT_BEGINNING". If set to "1", the template name is the first word in the service description (field separator '_'), e.g.:

STATUS            : service description = template name to look for
------------------------------------------------------
disabled(default) : xyz_abcd_template   = template.t
enabled           : xyz_abcd_template   = xyz.t

1.3) Some minor patches

NOTE: nothing has changed in the template search order.

2) rrd2graph.cgi

When you have more than one value stored in an RRD, e.g. the check_icmp values:
rta = RoundTripAverage
pl = PacketLoss
at times you may want to see only the packet loss values and/or zoom into them. This is now possible with this version.
HOW:

2.1) Using rrdinfo, it checks for ds[.*].type values and stores them.

2.2) If 2 or more DS are found, an additional link "Select DS" is created below the graph. Clicking on it opens a menu with a list of DS name links PLUS, on top of the list, SERVICENAME HOSTNAME | All. This is a special link to go back to the multi-DS display OR to display a summary of all systems having SERVICENAME.

2.3) You may like to have a separate template for each DS name. The following steps are followed for a template search:
2.3.1) look in TEMPLATES_DIR/graph/HOSTNAME_SERVICENAME_DSNAME.t
2.3.2) TEMPLATES_DIR/graph/SERVICENAME_DSNAME.t
2.3.3) TEMPLATES_DIR/graph/default_ds_name.t (copy dist-default_ds_name.t as default_ds_name.t), which will be used for all single DS values

If still confused, have a look at sysnetmon.diglinks.com (user=guest, password=guest) and try the icmp or netstat graphs.

Regards,
Badri

From tobias.mucke at googlemail.com Wed Jan 30 21:26:27 2008
From: tobias.mucke at googlemail.com (Tobias Mucke)
Date: Wed, 30 Jan 2008 21:26:27 +0100
Subject: Fwd: Patch to get Nagios header files also installed
In-Reply-To: <716dd5480712171203m143e6b07k4d6d69883139478c@mail.gmail.com>
References: <716dd5480712171203m143e6b07k4d6d69883139478c@mail.gmail.com>
Message-ID: <716dd5480801301226r67d930c5jdda3331a274b56b5@mail.gmail.com>

Hi Nagios developers,

some days ago I sent a patch for the Nagios build system so that the header files get installed as well. The idea behind it was to ease the build of NEB modules. There was no response on the mailing list, so I'm giving it another try.

Anybody?

Thanks.
Tobias

---------- Forwarded message ----------
From: Tobias Mucke
Date: 17.12.2007 21:03
Subject: Patch to get Nagios header files also installed
To: nagios-devel at lists.sourceforge.net

Hi,

it would be easier for developers to distribute, and for users to build, NEB modules if Nagios were installed with its header files. I have written a patch for configure.in, Makefile.in and include/Makefile.in to support this. The latter is completely new since it didn't exist yet. I used the existing include/Makefile as a starting point.

###########################################
diff -up nagios-3.0b7/configure.in nagios-3.0b7-new/configure.in
--- nagios-3.0b7/configure.in 2007-11-23 19:24:50.000000000 +0100
+++ nagios-3.0b7-new/configure.in 2007-12-17 23:49:25.000000000 +0100
@@ -5,6 +5,7 @@ define([AC_CACHE_LOAD],)
 define([AC_CACHE_SAVE],)
 AC_INIT(base/nagios.c)
+dnl Files given in AC_CONFIG_HEADER don't have to get installed by make, see include/Makefile.in
 AC_CONFIG_HEADER(include/config.h include/snprintf.h include/cgiutils.h)
 AC_PREFIX_DEFAULT(/usr/local/nagios)
@@ -758,7 +759,7 @@ AC_SUBST(INITDIR)
 AC_SUBST(INSTALLPERLSTUFF)
 AC_PATH_PROG(PERL,perl)
-AC_OUTPUT(Makefile subst pkginfo base/Makefile common/Makefile contrib/Makefile cgi/Makefile html/Makefile module/Makefile xdata/Makefile daemon-init html/index.html html/side.html)
+AC_OUTPUT(Makefile subst pkginfo base/Makefile common/Makefile contrib/Makefile cgi/Makefile html/Makefile include/Makefile module/Makefile xdata/Makefile daemon-init html/index.html html/side.html)
 perl subst include/locations.h
###########################################

###########################################
diff -up nagios-3.0b7/Makefile.in nagios-3.0b7-new/Makefile.in
--- nagios-3.0b7/Makefile.in 2007-11-10 23:54:35.000000000 +0100
+++ nagios-3.0b7-new/Makefile.in 2007-12-17 23:36:36.000000000 +0100
@@ -1,7 +1,7 @@
 ###############################
 # Makefile for Nagios
 #
-# Last Modified: 11-10-2007
+# Last Modified: 12-17-2007
 ###############################
@@ -28,6 +28,7 @@ BINDIR=@bindir@
 CGIDIR=@sbindir@
 LIBEXECDIR=@libexecdir@
 HTMLDIR=@datadir@
+INCLUDEDIR=@includedir@
 INSTALL=@INSTALL@
 INSTALL_OPTS=@INSTALL_OPTS@
 COMMAND_OPTS=@COMMAND_OPTS@
@@ -175,6 +176,7 @@ install-cgis:
 install:
 	cd $(SRC_BASE) && $(MAKE) $@
 	cd $(SRC_CGI) && $(MAKE) $@
+	cd $(SRC_INCLUDE) && $(MAKE) $@
 	cd $(SRC_HTM) && $(MAKE) $@
 	$(MAKE) install-basic
@@ -185,6 +187,7 @@ install-unstripped:
 	$(MAKE) install-basic
 install-basic:
+	$(INSTALL) -m 755 $(INSTALL_OPTS) -d $(DESTDIR)$(INCLUDEDIR)
 	$(INSTALL) -m 775 $(INSTALL_OPTS) -d $(DESTDIR)$(LIBEXECDIR)
 	$(INSTALL) -m 775 $(INSTALL_OPTS) -d $(DESTDIR)$(LOGDIR)
 	$(INSTALL) -m 775 $(INSTALL_OPTS) -d $(DESTDIR)$(LOGDIR)/archives
###########################################

###########################################
--- nagios-3.0b7/include/Makefile 2007-10-18 14:57:22.000000000 +0200
+++ nagios-3.0b7-new/include/Makefile.in 2007-12-18 00:12:30.000000000 +0100
@@ -1,9 +1,14 @@
 ###############################
 # Makefile for Include Files
 #
-# Last Modified: 10-18-2007
+# Last Modified: 12-17-2007
 ###############################
+prefix=@prefix@
+INCLUDEDIR=@includedir@
+INSTALL=@INSTALL@
+INSTALL_OPTS=@INSTALL_OPTS@
+
 clean:
 	rm -f *~
@@ -11,3 +16,14 @@ distclean: clean
 	rm -f cgiutils.h config.h locations.h snprintf.h
 devclean: distclean
+
+install:
+	$(MAKE) install-basic
+
+install-basic:
+	$(INSTALL) -m 755 $(INSTALL_OPTS) -d $(DESTDIR)$(INCLUDEDIR)
+	for file in *.h; do \
+		if test "x$$file" != "xconfig.h" -a "x$$file" != "xsnprintf.h" -a "x$$file" != "xcgiutils.h" ; then \
+			$(INSTALL) -m 644 $(INSTALL_OPTS) $$file $(DESTDIR)$(INCLUDEDIR); \
+		fi; \
+	done
###########################################

I have tested these changes on my Linux box. Since the shell code is copied from the configure script, it should already be portable to other platforms. Please use your version of autoconf to update your configure script.

It would be nice to see this patch in Nagios 3.0.

Thanks.
Tobias

From teng at dataway.com Wed Jan 30 23:48:00 2008
From: teng at dataway.com (Tedman Eng)
Date: Wed, 30 Jan 2008 14:48:00 -0800
Subject: 3.0 macro documentation inconsistency
Message-ID:

In the 3.0 documentation section describing Host and Service Macros:

There's some macros , and there's some :

Example:
$SERVICESTATE$ A string indicating the CURRENT state of the service...
$LASTSERVICESTATE$ A string indicating the LAST state of the service...

However, these macro descriptions have the text "LAST", which would imply :
$HOSTOUTPUT$ The first line of text output from the LAST host check...
$LONGHOSTOUTPUT$ The full text output (aside from the first line) from the LAST host check....
$HOSTPERFDATA$ This macro contains any performance data that may have been returned by the LAST host check.
$SERVICEOUTPUT$ The first line of text output from the LAST service check...
$LONGSERVICEOUTPUT$ The full text output (aside from the first line) from the LAST service check...
$SERVICEPERFDATA$ This macro contains any performance data that may have been returned by the LAST service check.

To be consistent, I think the documentation for these macros should read:
$HOSTOUTPUT$ The first line of text output from the CURRENT host check...
$LONGHOSTOUTPUT$ The full text output (aside from the first line) from the CURRENT host check...
$HOSTPERFDATA$ This macro contains any performance data that may have been returned by the CURRENT host check.
$SERVICEOUTPUT$ The first line of text output from the CURRENT service check...
$LONGSERVICEOUTPUT$ The full text output (aside from the first line) from the CURRENT service check...
$SERVICEPERFDATA$ This macro contains any performance data that may have been returned by the CURRENT service check.
It would definitely be nice if $LASTSERVICEOUTPUT$ did exist as a macro though! LAST OUTPUT is useful for calculating deltas; currently this is only possible by recording such information externally.

(Sorry for double posting, I'm not subscribed to nagios-devel so I'm not sure if this message will reach that list.)

_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null

From patrick at nicsys.net Thu Jan 31 03:44:28 2008
From: patrick at nicsys.net (Patrick Kremer)
Date: Wed, 30 Jan 2008 20:44:28 -0600
Subject: Bug in notify-host-by-email variables
Message-ID: <004001c863b3$338fb150$800101df@KREMERHOME>

I am running 3.0rc2. I am trying to get the variable $HOSTNOTES$ to appear in notify-host-by-email.

This is the notify-host-by-email command definition that I changed. I added "Notes: $HOSTNOTES$\n\n" in the spot shown below:

# 'notify-host-by-email' command definition
define command{
        command_name    notify-host-by-email
        command_line    /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nNotes: $HOSTNOTES$\n\nDate/Time: $LONGDATETIME$" | /usr/bin/mailx -s "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$
        }

The host I'm testing with contains only the word: TEST in its notes field.
define host{
        use                     generic-router-intrust
        host_name               intrust-bob-ds1
        alias                   Intrust BOB T1
        address                 66.243.142.155
        notes                   TEST
        icon_image              switch.gif
        statusmap_image         switch.gd2
        hostgroups              intrust-office
        }

This is the debug log output:

[1201644173.199143] [032.2] [pid=30977] Raw Command: /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nNotes: $HOSTNOTES$\n\nDate/Time: $LONGDATETIME$" | /usr/bin/mailx -s "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$
[1201644173.199170] [032.2] [pid=30977] Processed Command: /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: PROBLEM\nHost: intrust-bob-ds1\nState: DOWN\nAddress: 66.243.142.155\nInfo: (Host Check Timed Out)\n\nNotes: TEST

The problem is that the processed command gets truncated and a notification never gets sent.

If I change the notify-host-by-email definition and replace $HOSTNOTES$ with another variable, $HOSTOUTPUT$ for instance, the debug log shows this:

[1201644289.198269] [032.2] [pid=30977] Raw Command: /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nNotes: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$" | /usr/bin/mailx -s "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$
[1201644289.198292] [032.2] [pid=30977] Processed Command: /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: PROBLEM\nHost: intrust-bob-ds1\nState: DOWN\nAddress: 66.243.142.155\nInfo: (Host Check Timed Out)\n\nNotes: (Host Check Timed Out)\n\nDate/Time: Tue Jan 29 16:04:49 CST 2008" | /usr/bin/mailx -s "** PROBLEM Host Alert: intrust-bob-ds1 is DOWN **" pat at nicsys.net

The notification gets sent successfully because the processed command is not truncated.
I know I have the syntax correct because I have it working with other variables. It just won't work when I plug in the variable that I want: $HOSTNOTES$. It is listed as allowed in host notifications in the docs.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From naparuba at gmail.com Thu Jan 31 19:33:01 2008
From: naparuba at gmail.com (nap)
Date: Thu, 31 Jan 2008 19:33:01 +0100
Subject: Patch for Circular Paths (new algo) 70s->0.007s :)
In-Reply-To: <6f8615170801300621r35c267fbvf7563de31f5c5ec1@mail.gmail.com>
References: <73BD1CEC958A564DB6771C5E97339F2806166AF0@SMAIL.eu.lectra.com> <6f8615170801300201p3eac818ch88f5e888f9e0f99c@mail.gmail.com> <6f8615170801300204s271d231agd89cd8f8f411d271@mail.gmail.com> <47A08407.6020800@op5.se> <6f8615170801300621r35c267fbvf7563de31f5c5ec1@mail.gmail.com>
Message-ID: <6f8615170801311033i7d84f991g90253d072360aa13@mail.gmail.com>

Hi list,

I have finished my patch for the host path part. I tried to follow the indentation and the coding style of Nagios. You can find test files with a lot of parents/children at http://zegabes.free.fr/nagios/ .

I also generated a big configuration for service dependencies (4000 dependencies), but that check is already fast (0.2s). So I don't know whether I need to optimise it if it's already OK. Has anyone had a problem with this check, or was it just the host path check?

Ask me if you have a question about my patch, or if it needs more work before it can be included in an official version of Nagios.
Thanks,
Gabes Jean

-------------- next part --------------
A non-text attachment was scrubbed...
Name: circular-parents.patch
Type: text/x-patch
Size: 6441 bytes
Desc: not available
URL: