Nagios-users digest, Vol 1 #3093 - 22 msgs

Reddy, Venugopal (GE Infrastructure, non-ge) venugopal.reddy1 at ge.com
Sat Mar 18 11:55:43 CET 2006
Hi All,

Can somebody please help me with the procedure for uninstalling Nagios 2.x
completely?

Thanks
Venu
-----Original Message-----
From: nagios-users-admin at lists.sourceforge.net
[mailto:nagios-users-admin at lists.sourceforge.net] On Behalf Of
nagios-users-request at lists.sourceforge.net
Sent: Saturday, March 18, 2006 1:31 AM
To: nagios-users at lists.sourceforge.net
Subject: Nagios-users digest, Vol 1 #3093 - 22 msgs

Send Nagios-users mailing list submissions to
	nagios-users at lists.sourceforge.net

To subscribe or unsubscribe via the World Wide Web, visit
	https://lists.sourceforge.net/lists/listinfo/nagios-users
or, via email, send a message with subject or body 'help' to
	nagios-users-request at lists.sourceforge.net

You can reach the person managing the list at
	nagios-users-admin at lists.sourceforge.net

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Nagios-users digest..."


Today's Topics:

   1. Nagios-Report-0.002 on CPAN/NagiosExchange. (Stanley.Hopcroft at Dest.gov.au)
   2. Re: where to get help with ext. cmd CHANGE_NORMAL_SVC_CHECK_INTERVAL ? (John P. Rouillard)
   3. Re: where to get help with ext. cmd CHANGE_NORMAL_SVC_CHECK_INTERVAL ? (prosolutions at gmx.net)
   4. R: [Nagios-users] Strange situation.. (Marco Borsani)
   5. Empty hostgroups (James Fidell)
   6. Re: Empty hostgroups (joseph.petrucci at wachovia.com)
   7. RE: Empty hostgroups (Deborah Martin)
   8. RE: Empty hostgroups (joseph.petrucci at wachovia.com)
   9. Re: check_if_by_snmp output (Robert Story)
  10. RE: Local host is Down, Couldnotparse arguements (Marc Powell)
  11. e-mail acknowledgements (David Schlecht)
  12. Re: e-mail acknowledgements (David Schlecht)
  13. RE: e-mail acknowledgements (Marc Powell)
  14. Re: phantom host up messages (Mike Linden)
  15. RE: phantom host up messages (Marc Powell)
  16. Re: 2.0 stable stops checking (Terry)
  17. Re: 2.0 stable stops checking (Eli Stair)
  18. Re: 2.0 stable stops checking (Terry)
  19. Re: 2.0 stable stops checking (Eli Stair)
  20. "Host assumed to be up" message? (Mark Hennessy)
  21. Re: 2.0 stable stops checking (Terry)
  22. Re: 2.0 stable stops checking (Eli Stair)

--__--__--

Message: 1
Date: Fri, 17 Mar 2006 15:33:54 +1100
From: <Stanley.Hopcroft at Dest.gov.au>
To: <nagios-users at lists.sourceforge.net>
Subject: [Nagios-users] Nagios-Report-0.002 on CPAN/NagiosExchange.

Dear Folks,

I am writing to say that Nagios::Report 0.002 has been 'released' and is
available at the usual places.

This release fixes a bug, and adds limited charting capability and a
weaker alternate interface (provided by the Perl DBD::AnyData module)
that allows client code to select the report data with SQL (the small
subset that AnyData accepts).

0.002 Fri Mar 17 14:44:36 EST 2006
- fix bug in mkreport() processing of MUNGE_CALLBACK (would not change
report values).

*** This entailed a _non_ backward compatible change in the
*** MUNGE_CALLBACK interface.
*** Client code that calls the alter->() callback _requires_ changing.
*** The alter callback is now called with one parm, a ref to a hash of
*** the field values indexed by field name. See examples/ for scripts
*** that have been changed.

- added to_dbh() method to allow DBD::AnyData provided use of SQL
(simple) on report data
- added primitive support for chart templates to excel_dump. The
workbook written by Spreadsheet::WriteExcel can contain _one_ (1) chart
of the availability data.

This project does not scale very well. It provides a limited capability
to act as a data source for processing by reporting tools such as
Excel.

This module has probably reached the end of development (some may say it
would better have not started) apart from bug fixes.

If you are serious about reporting, look at the DB NEB modules or Steve
Shipway's stuff on NagiosExchange. This module does, however, provide a
limited capacity to produce reports in the format beloved by PHBs.

Yours sincerely.


--__--__--

Message: 2
To: nagios-users at lists.sourceforge.net
Subject: Re: [Nagios-users] where to get help with ext. cmd
CHANGE_NORMAL_SVC_CHECK_INTERVAL ? 
Reply-To: rouilj at ieee.org
Date: Thu, 16 Mar 2006 23:51:22 -0500
From: "John P. Rouillard" <rouilj at cs.umb.edu>


In message <20060317030214.GA10825 at mini.alaya.net>,
prosolutions writes:

>no.  sorry i wasn't clear on this.  that is what i see in the log file.
>the command that is run is that which is in the script, namely:
>
>printf "[%lu] CHANGE_NORMAL_SVC_CHECK_INTERVAL;$4;$5;15\n" $now > $commandfile
>
>this is taken directly from the canonical definition of this command
> [...]
>i see that this differs from what you claim to have successfully run
>(see below)
>> >this should set the check_interval to 15 seconds instead
>> >of the default 90 seconds.  However, watching the log I see
>> >that the checks on the service revert to 90 s.
>> 
>> What are you seeing in the log that makes you think it's
>> getting reset? If I use:
>
>i am watching the log and timing it and watching how frequently the
>service checks get run.  they run at retry_check_interval up until
>max_check_attempts gets reached, then, even though the event handler
>runs the script to execute the CHANGE_NORMAL_SVC_CHECK_INTERVAL command,
>setting the interval to 15 seconds (same as retry_check_interval), it
>does not set (i.e. it reverts back to check_interval (90s))

Hmm, I would expect a timeline something like the following:

 id  time   state
 1    0     poll fails in state soft failure try number 1 (aka soft 1)
 2    15s   soft alert 2
 3    30s   soft alert 3
 4    45s   hard alert  (scheduled with 90 seconds because change of
                        check interval hasn't occurred yet)
 5    45s+  event handler called generates CHANGE_NORMAL_SVC_CHECK_INTERVAL
 6    135s  still hard (but now it schedules using the 15 second timeperiod)
 7    150s  hard
 8    165s  hard
 9          stays with 15 second interval.

If you wanted to have the interval between 4 and 6 be something other
than 90 seconds you have to generate a SCHEDULE_FORCED_SVC_CHECK for
the date "now + 15 seconds" at line 5 in addition to the change in
check interval.
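
The two commands described above could be generated together along these
lines. This is only a sketch: the function name interval_commands, its
arguments and the example path in the last comment are made up for
illustration, and the unit Nagios applies to the interval value depends on
its interval_length setting.

```shell
#!/bin/sh
# Sketch only: print the two external commands an event handler might send.
# The command names come from this thread; everything else is a placeholder.

# Usage: interval_commands HOST SERVICE INTERVAL DELAY [NOW]
# INTERVAL is passed to Nagios unchanged; DELAY is in seconds.
interval_commands() {
    host=$1
    service=$2
    interval=$3
    delay=$4
    now=${5:-$(date +%s)}        # epoch seconds, as produced by `date +%s`
    check_at=$((now + delay))    # time at which the forced check should run

    # 1. Change the normal check interval for the service.
    printf '[%s] CHANGE_NORMAL_SVC_CHECK_INTERVAL;%s;%s;%s\n' \
        "$now" "$host" "$service" "$interval"
    # 2. Force the next check so the wait is not the old interval.
    printf '[%s] SCHEDULE_FORCED_SVC_CHECK;%s;%s;%s\n' \
        "$now" "$host" "$service" "$check_at"
}

# In an event handler the output would be redirected to the command file, e.g.
#   interval_commands "$4" "$5" 15 15 > /usr/local/nagios/var/rw/nagios.cmd
```

Printing to standard output keeps the function easy to try by hand before
pointing it at the real command file.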

>>  echo "[`date +%s`] CHANGE_NORMAL_SVC_CHECK_INTERVAL;tigris;DiskBackupMountCheck;3"

Printf is POSIX standard, but both my command and the printf produce
the same output. Try both and see. If you don't understand why they
produce the same output, read the man pages and get a book on shell
programming.

>okay this looks substantially different than the canonical example
>above.  first off instead of [%lu] following the echo/print, you have
>[`date +%s`]   also, i don't see 
>
>\n `date +%s` > $commandfile
>
>at the end of your script.  is the date command supposed to go before
>and after the command?

No. I am just using shell substitution in the quoted string.
Functionally they are equivalent. RTFM for bash, echo, printf, date etc.
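
As a quick check of that equivalence, the following sketch builds the same
line both ways and compares the results. The timestamp is fixed so the two
strings can be compared, and host1 and service1 are placeholder names. env
is used here to run the external printf utility, on the assumption that
some shell builtins may not accept %lu.

```shell
#!/bin/sh
# Sketch: the printf form and the echo form of the same command line.
# A fixed timestamp is used instead of `date +%s` so both runs agree.
now=1142543941

# printf replaces the %lu placeholder with the argument after the format.
with_printf=$(env printf "[%lu] CHANGE_NORMAL_SVC_CHECK_INTERVAL;host1;service1;15\n" "$now")

# echo gets the timestamp by substitution inside the quoted string instead.
with_echo=$(echo "[$now] CHANGE_NORMAL_SVC_CHECK_INTERVAL;host1;service1;15")

if [ "$with_printf" = "$with_echo" ]; then
    echo "identical: $with_printf"
else
    echo "different"
fi
```

So the timestamp appears only once in the output; the trailing $now in the
printf form is the value that fills the [%lu] at the start.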

>> to change the check interval to three minutes, and direct it
>> to the command pipe, I see it take effect and stay in
>> effect. Looking at the Event Log in the web interface shows:
>> 
>> [03-16-2006 22:54:25] EXTERNAL COMMAND: CHANGE_NORMAL_SVC_CHECK_INTERVAL;tigris;DiskBackupMountCheck;3
>
>yep i see an entry just like this:
>
>[1142543941] EXTERNAL COMMAND: CHANGE_NORMAL_SVC_CHECK_INTERVAL;test_host;test.html;15
>
>but alas objects.cache shows check_interval to still be 90s

Hmm, maybe a bug in 2.0b3?

>> When you change the check interval, it doesn't force a
>> reschedule of the service with the new interval. 
>
>i'm not sure what you mean here.  you mean, even though check_interval
>gets changed, that it must somehow be rescheduled before actually
>taking effect?

Right. The new schedule won't take effect until the currently
scheduled poll is run at the original scheduling interval AFAICT.  If
you want the new interval to take effect immediately you need to force
it with a SCHEDULE_FORCED_SVC_CHECK command.

>> Also I am using nagios 2.0rc1, so YMMV.
>nagios 2.0b3 here. 
>
>thanks much for your help.  i will mess around with the command a bit
>and see if i can get it to work.

Good luck. I'd be interested in seeing what finally works. Also you
should check the nagios 2.0 release notes between 2.0b3 and 2.0rc1 and
see if there is a reference to this problem.

				-- rouilj
John Rouillard
===========================================================================
My employers don't acknowledge my existence much less my opinions.


--__--__--

Message: 3
Date: Thu, 16 Mar 2006 23:04:29 -0800
From: prosolutions at gmx.net
To: rouilj at ieee.org
Cc: nagios-users at lists.sourceforge.net
Subject: Re: [Nagios-users] where to get help with ext. cmd
CHANGE_NORMAL_SVC_CHECK_INTERVAL ?
Reply-To: prosolutions at gmx.net

> 
> >>  echo "[`date +%s`] CHANGE_NORMAL_SVC_CHECK_INTERVAL;tigris;DiskBackupMountCheck;3"
> 
> Printf is POSIX standard, but both my command and the printf produce
> the same output. Try both and see. If you don't understand why they
> produce the same output, read the man pages and get a book on shell
> programming.


the issue isn't with the use of printf or echo, but rather the strings
which they are printing.  also, i know what shell substitution is but am
not familiar with the expression [%lu] and had no idea that was
equivalent to the output of `date +%s`.  what is confusing about your
example is that you seem to omit the trailing `date +%s` at the end of
your command, as opposed to the provided example:


now=`date +%s`
commandfile='/usr/local/nagios/var/rw/nagios.cmd'
/bin/printf "[%lu] CHANGE_NORMAL_SVC_CHECK_INTERVAL;host1;service1;15\n" $now > $commandfile

the $now preceding the redirection to the command file being equivalent
to the string returned by `date +%s`

this canonical example is a bit confusing then: why does it have the
date at the beginning, i.e. the first [%lu], and then again at the end
with $now  ?  you seem to omit the second date from your command,
although in your example you also don't echo a newline or show the
redirect, a part which i at least assume is there.

> 
> Right. The new schedule won't take effect until the currently
> scheduled poll is run at the original scheduling interval AFAICT.  If
> you want the new interval to take effect immediately you need to force
> it with a SCHEDULE_FORCED_SVC_CHECK command.
> 


thank you for the clarification regarding this.  but i am still not
seeing the CHANGE_NORMAL_SVC_CHECK_INTERVAL register in objects.cache.
i will continue delving into that and then eventually implement
SCHEDULE_FORCED_SVC_CHECK as per your suggestion.  thanks again for your
illuminating help!



--__--__--

Message: 4
From: "Marco Borsani" <m.borsani at it.net>
To: "'NAGIOS'" <nagios-users at lists.sourceforge.net>
Subject: R: [Nagios-users] Strange situation..
Date: Fri, 17 Mar 2006 08:16:02 +0100
Organization: ITnet

I found the problem.

Nagios does not keep the nagios user's environment (including the PATH
variable), so it did not find commands like "grep"!

I can't believe it.... ;-)
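
One way to make a check independent of the environment it inherits is to
set PATH inside the script. The following is a rough sketch of that idea in
POSIX sh rather than the csh of the original check_fax. The directories
/bin and /usr/bin are assumptions about where grep lives, the status
messages are English renderings of the originals, and the copy of the
status line written to Rice.log is left out for brevity.

```shell
#!/bin/sh
# Sketch: a check that does not depend on the caller's PATH.
# Assumption: grep is installed in /bin or /usr/bin on the monitoring host.
PATH=/bin:/usr/bin${PATH:+:$PATH}
export PATH

# Usage: check_fax LOGFILE
# Prints one status line; returns 0 (OK) or 2 (CRITICAL) like the original.
check_fax() {
    logfile=$1
    # Count lines containing OK; an unreadable file yields an empty count.
    count=$(grep -c OK "$logfile" 2>/dev/null)
    if [ "$count" = "1" ]; then
        echo "FAX OK - test send and receive succeeded"
        return 0
    fi
    echo "FAX CRITICAL - test send and receive did NOT succeed"
    return 2
}

# As a plugin, the script would end with something like:
#   check_fax /tmp/Invio_Rice.log; exit $?
```

Calling grep by its full path inside the script would be another way to get
the same effect.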

Regards
Marco

-}-----Original Message-----
-}From: Masopust, Christian [mailto:christian.masopust at siemens.com]
-}Sent: Thursday, 16 March 2006 10:58
-}To: Marco Borsani; NAGIOS
-}Subject: RE: [Nagios-users] Strange situation..
-}
-}
-}Does your nagios user have read permissions to your Invio_Rice.log?
-}
-}
-}
-}> -----Original Message-----
-}> From: nagios-users-admin at lists.sourceforge.net
-}> [mailto:nagios-users-admin at lists.sourceforge.net] On Behalf Of Marco
-}> Borsani
-}> Sent: Thursday, March 16, 2006 10:30 AM
-}> To: 'NAGIOS'
-}> Subject: [Nagios-users] Strange situation..
-}> Importance: High
-}>
-}> Hi all!
-}>
-}> I prepared a very simple check (called check_fax) that reads a text
-}> file.
-}>
-}> If check_fax finds an "OK" it returns status 0; if check_fax finds
-}> a "CRITICAL" it returns status 2.
-}>
-}> Well, when I run check_fax manually (as the nagios user or as root)
-}> it works correctly (returns 0), but when Nagios runs check_fax, it
-}> returns status 2.
-}>
-}> I tried changing the check_fax ownership and permissions, like:
-}> -rwxr-xr-x   1 root       sys            654 Mar 16 09:42 check_fax
-}> -rwxr-xr-x   1 nagios     nagios         654 Mar 16 09:42 check_fax
-}> But nothing changed.
-}>
-}> Here is check_fax:
-}>
-}----------------------------------------------------------------------
-}> #!/bin/csh
-}> #
-}> set dir2 = /tmp
-}>
-}> set STATUS = `cat $dir2/Invio_Rice.log | grep -c OK`
-}>
-}> if ( "$STATUS" == "1" ) then
-}>         echo "FAX OK - Invio e ricezione di prova avvenuto" > $dir2/Rice.log
-}>         echo "FAX OK - Invio e ricezione di prova avvenuto"
-}> else
-}>         echo "FAX CRITICAL - Invio e ricezione di prova NON avvenuto" > $dir2/Rice.log
-}>         echo "FAX CRITICAL - Invio e ricezione di prova NON avvenuto"
-}>         exit 2
-}> endif
-}>
-}> exit 0
-}> ----------------------------------------------------------------------
-}>
-}> Any idea?
-}> Regards
-}>
-}> Marco Borsani
-}> Unix & Monitoring System Administrator Technical Operation
-}> Tel.    +39 010 4310115
-}> Fax     +39 010 4327454
-}> E-mail: m.borsani at IT.net
-}>
-}> ITnet S.r.l. - under the direction and coordination of WIND
-}> Telecomunicazioni S.p.A.
-}> Internet Service Provider
-}> Registered office:               Via C.G.Viola, 48 - 00148 Roma
-}> Head office and administration:  Via Pacinotti, 39
-}>                                  16151 Genova (Italy)
-}>
-}> http://www.it.net
-}> mailto:info at IT.net
-}> _______________________________________________________________
-}> Other ITnet offices:
-}> MILANO tel.: +39 02 30114900    info-milano at IT.net
-}> ROMA    tel.: +39 06 83116707    info-roma at IT.net
-}> _______________________________________________________________
-}> ITnet is associated to CIX (Commercial IP eXchange) and RIPE
-}> ITnet is associated to AIIP (Associazione Italiana Internet Providers)
-}>
-}> -------------------------------------------------------
-}> This SF.Net email is sponsored by xPML, a groundbreaking scripting
-}> language that extends applications into web and mobile media. Attend
-}> the live webcast and join the prime developer group breaking into this
-}> new coding territory!
-}> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642
-}> _______________________________________________
-}> Nagios-users mailing list
-}> Nagios-users at lists.sourceforge.net
-}> https://lists.sourceforge.net/lists/listinfo/nagios-users
-}> ::: Please include Nagios version, plugin version (-v) and OS when
-}> reporting any issue.
-}> ::: Messages without supporting info will risk being sent to /dev/null
-}>
-}



--__--__--

Message: 5
Date: Fri, 17 Mar 2006 13:43:42 +0000
From: James Fidell <james at cloud9.co.uk>
To:  nagios-users at lists.sourceforge.net
Subject: [Nagios-users] Empty hostgroups

Is there an easy way to make nagios v2 not care if a hostgroup
is empty?

I'm migrating a live configuration from v1 to v2 at the same time
as correcting some errors and adding functionality, and it would
be useful to have all the hostgroups in the configuration files before
adding all the hosts.

Is it liable to cause pain if I change the code to warn about empty
hostgroups rather than fail with an error?

James


--__--__--

Message: 6
To: James Fidell <james at cloud9.co.uk>
Cc: nagios-users at lists.sourceforge.net,
        nagios-users-admin at lists.sourceforge.net
Subject: Re: [Nagios-users] Empty hostgroups
From: joseph.petrucci at wachovia.com
Date: Fri, 17 Mar 2006 08:51:37 -0500


The only thing I can think of is that the same host can be in multiple
hostgroups, so pick one host that is already defined and place it in all
the empty hostgroups as a placeholder. Then, when you put a host that
actually belongs in the hostgroup, remove that placeholder.
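
A small sketch of that workaround, written as a shell function that prints
hostgroup definitions sharing a single member. The function name and the
names placeholder-host, web-servers and db-servers are hypothetical, and
the placeholder host is assumed to be defined elsewhere in the
configuration already.

```shell
#!/bin/sh
# Sketch: print one hostgroup definition per group name, each with the
# same already-defined host as its only member. All names are examples.

# Usage: placeholder_hostgroups PLACEHOLDER_HOST GROUP...
placeholder_hostgroups() {
    placeholder=$1
    shift
    for group in "$@"; do
        printf 'define hostgroup{\n'
        printf '\thostgroup_name\t%s\n' "$group"
        printf '\talias\t\t%s\n' "$group"
        printf '\tmembers\t\t%s\n' "$placeholder"
        printf '\t}\n\n'
    done
}

placeholder_hostgroups placeholder-host web-servers db-servers
```

The output could be redirected into a configuration file and the
placeholder removed from each members line once real hosts are added.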

Joe Petrucci 
Office: 704-383-6089
Cell : 724-462-0443




James Fidell <james at cloud9.co.uk> 
Sent by: nagios-users-admin at lists.sourceforge.net
03/17/2006 08:43 AM


To
nagios-users at lists.sourceforge.net
cc

Subject
[Nagios-users] Empty hostgroups






Is there an easy way to make nagios v2 not care if a hostgroup
is empty?

I'm migrating a live configuration from v1 to v2 at the same time
as correcting some errors and adding functionality, and it would
be useful to have all the hostgroups in the configuration files before
adding all the hosts.

Is it liable to cause pain if I change the code to warn about empty
hostgroups rather than fail with an error?

James





--__--__--

Message: 7
From: Deborah Martin <Deborah.Martin at Kognitio.com>
To: "'joseph.petrucci at wachovia.com'" <joseph.petrucci at wachovia.com>,
    James Fidell <james at cloud9.co.uk>
Cc: nagios-users at lists.sourceforge.net,
    nagios-users-admin at lists.sourceforge.net
Subject: RE: [Nagios-users] Empty hostgroups
Date: Fri, 17 Mar 2006 13:58:13 -0000


You can also use a wildcard.

-----Original Message-----
From: joseph.petrucci at wachovia.com [mailto:joseph.petrucci at wachovia.com]
Sent: 17 March 2006 13:52
To: James Fidell
Cc: nagios-users at lists.sourceforge.net;
nagios-users-admin at lists.sourceforge.net
Subject: Re: [Nagios-users] Empty hostgroups



The only thing I can think of is that the same host can be in multiple
hostgroups, so pick one host that is already defined and place it in all
the empty hostgroups as a placeholder. Then, when you put a host that
actually belongs in the hostgroup, remove that placeholder.

Joe Petrucci 
Office: 704-383-6089
Cell : 724-462-0443




James Fidell <james at cloud9.co.uk> 
Sent by: nagios-users-admin at lists.sourceforge.net 


03/17/2006 08:43 AM 



To
nagios-users at lists.sourceforge.net 

cc

Subject
[Nagios-users] Empty hostgroups

	




Is there an easy way to make nagios v2 not care if a hostgroup
is empty?

I'm migrating a live configuration from v1 to v2 at the same time
as correcting some errors and adding functionality, and it would
be useful to have all the hostgroups in the configuration files before
adding all the hosts.

Is it liable to cause pain if I change the code to warn about empty
hostgroups rather than fail with an error?

James





************************************************************************
This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom they
are addressed. Any unauthorised distribution or copying is strictly 
prohibited.

Whilst Kognitio Limited takes steps to prevent the transmission of 
viruses via e-mail, we can not guarantee that any email or 
attachment is free from computer viruses and you are strongly
advised to undertake your own anti-virus precautions.

Kognitio grants no warranties regarding performance,
use or quality of any e-mail or attachment and undertakes no 
liability for loss or damage, howsoever caused.
***********************************************************************




--__--__--

Message: 8
To: Deborah Martin <Deborah.Martin at kognitio.com>
Cc: James Fidell <james at cloud9.co.uk>,
        "'joseph.petrucci at wachovia.com'" <joseph.petrucci at wachovia.com>,
        nagios-users at lists.sourceforge.net,
        nagios-users-admin at lists.sourceforge.net
Subject: RE: [Nagios-users] Empty hostgroups
From: joseph.petrucci at wachovia.com
Date: Fri, 17 Mar 2006 09:30:50 -0500


I missed that you can use wildcards in the hostname field. Thanks, that
may save me some work in the future.

Joe Petrucci 
Office: 704-383-6089
Cell : 724-462-0443




Deborah Martin <Deborah.Martin at kognitio.com> 
03/17/2006 08:58 AM


To
"'joseph.petrucci at wachovia.com'" <joseph.petrucci at wachovia.com>, James 
Fidell <james at cloud9.co.uk>
cc
nagios-users at lists.sourceforge.net, 
nagios-users-admin at lists.sourceforge.net
Subject
RE: [Nagios-users] Empty hostgroups






You can also use a wildcard as well. 
-----Original Message-----
From: joseph.petrucci at wachovia.com [mailto:joseph.petrucci at wachovia.com]
Sent: 17 March 2006 13:52
To: James Fidell
Cc: nagios-users at lists.sourceforge.net; 
nagios-users-admin at lists.sourceforge.net
Subject: Re: [Nagios-users] Empty hostgroups


The only thing I can think of is that the same host can be in multiple
hostgroups, so pick one host that is already defined and place it in all
the empty hostgroups as a placeholder. Then, when you put a host that
actually belongs into a hostgroup, remove the placeholder.
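
A minimal sketch of the placeholder approach, assuming a host called
placeholder-host is already defined. The host and hostgroup names are
hypothetical, and the script only prints hostgroup definitions; where the
output goes depends on how your configuration files are laid out.

```shell
#!/bin/sh
# Print a hostgroup definition whose only member is a placeholder host.
# $1 = hostgroup name, $2 = name of an already-defined host (hypothetical here).
emit_placeholder_hostgroup() {
    printf 'define hostgroup{\n'
    printf '\thostgroup_name\t%s\n' "$1"
    printf '\talias\t\t%s\n' "$1"
    printf '\tmembers\t\t%s\n' "$2"
    printf '\t}\n'
}

# Example: emit definitions for two hostgroups that are still empty.
for group in web-servers db-servers; do
    emit_placeholder_hostgroup "$group" placeholder-host
done
```

Once a real host has been added to a hostgroup, the placeholder entry can be
dropped from its members line.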

Joe Petrucci 
Office: 704-383-6089
Cell : 724-462-0443



From: James Fidell <james at cloud9.co.uk>
Sent by: nagios-users-admin at lists.sourceforge.net
Date: 03/17/2006 08:43 AM
To: nagios-users at lists.sourceforge.net
Subject: [Nagios-users] Empty hostgroups

Is there an easy way to make nagios v2 not care if a hostgroup
is empty?

I'm migrating a live configuration from v1 to v2 at the same time
as correcting some errors and adding functionality, and it would
be useful to have all the hostgroups in the configuration files before
adding all the hosts.

Is it liable to cause pain if I change the code to warn about empty
hostgroups rather than fail with an error?

James


-------------------------------------------------------
This SF.Net email is sponsored by xPML, a groundbreaking scripting language
that extends applications into web and mobile media. Attend the live webcast
and join the prime developer group breaking into this new coding territory!
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null


--=_alternative 004FBA5E85257134_=--


--__--__--

Message: 9
Date: Fri, 17 Mar 2006 09:59:51 -0500
From: Robert Story <rstory-l at 2006.revelstone.com>
To: "Mrutyunjaya Dash" <mdash at juniper.net>
Cc: <nagios-users at lists.sourceforge.net>
Subject: Re: [Nagios-users] check_if_by_snmp output

On Thu, 16 Mar 2006 13:05:04 +0530 Mrutyunjaya wrote:
MD> I tried executing the command but it seems the command is not there in the
MD> system. I have installed net-snmp on the system. Can you please let me
MD> know whether snmpwalk comes along with net-snmp package or I need to
MD> install some other package separately to get the snmpwalk?

If you installed from source, it should be there. If you are using a
RedHat-based system and only installed net-snmp, you just have the daemons.
The net-snmp-utils package has the user utilities.
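
A quick way to confirm whether the utility is present is sketched below. The
helper name have_cmd is made up for illustration, and the hint it prints
simply repeats the package name given above.

```shell
#!/bin/sh
# Succeed if the named command can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd snmpwalk; then
    echo "snmpwalk found at $(command -v snmpwalk)"
else
    echo "snmpwalk not found; on RedHat-based systems install net-snmp-utils"
fi
```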


--__--__--

Message: 10
Subject: RE: [Nagios-users] Local host is Down, Couldnotparse arguements
Date: Fri, 17 Mar 2006 09:34:53 -0600
From: "Marc Powell" <marc at ena.com>
To: <Nagios-users at lists.sourceforge.net>



> -----Original Message-----
> From: nagios-users-admin at lists.sourceforge.net
> [mailto:nagios-users-admin at lists.sourceforge.net] On Behalf Of Jody Noscov
> Sent: Thursday, March 16, 2006 7:30 PM
> To: Nagios-users at lists.sourceforge.net
> Subject: [Nagios-users] Local host is Down, Couldnotparse arguements
>
> Usage: check_ping -H <host_address> -w <wrta>,<wpl>% -c <crta>,<cpl>%
> [-p packets] [-t timeout] [-L] [-4|-6]
>
> I am not sure what I need to set the following to:
>
> -w <wrta>,<wpl>% -c <crta>,<cpl>%
> [-p packets] [-t timeout] [-L] [-4|-6]
>
> Is there a document in the help files someone could point me to?

Use 'check_ping --help'. That'll return more verbose help.

THRESHOLD is <rta>,<pl>% where <rta> is the round trip average travel
time (ms) which triggers a WARNING or CRITICAL state, and <pl> is the
percentage of packet loss to trigger an alarm state.

The example from Demetri Mouratis should clarify these further.
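
As a rough illustration of the THRESHOLD format, the sketch below builds a
warning and a critical threshold and prints the resulting command line. The
plugin path, the address 192.0.2.10 and the packet count are assumptions;
the threshold values are only examples.

```shell
#!/bin/sh
# Build a check_ping threshold of the form <rta>,<pl>%
# $1 = round trip average in ms, $2 = packet loss percentage
ping_threshold() {
    printf '%s,%s%%\n' "$1" "$2"
}

warn=$(ping_threshold 100.0 20)   # WARNING at 100 ms RTA or 20% loss
crit=$(ping_threshold 500.0 60)   # CRITICAL at 500 ms RTA or 60% loss

# Assumed plugin location; adjust to your installation.
CHECK_PING=${CHECK_PING:-/usr/local/nagios/libexec/check_ping}

# Print the command line instead of running it; remove "echo" to execute.
echo "$CHECK_PING" -H 192.0.2.10 -w "$warn" -c "$crit" -p 5
```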

--
Marc


--__--__--

Message: 11
To: nagios-users at lists.sourceforge.net
From: David Schlecht <dschlecht at doit.nv.gov>
Date: Fri, 17 Mar 2006 16:36:23 +0100 (CET)
Subject: [Nagios-users] e-mail acknowledgements

Hi list

I would like to allow our users to send e-mail to the Nagios
server to initiate events such as acknowledge notifications,
enable/disable notifications, and such.

A procmail interface seems like a good solution. I noticed a
post from long ago regarding this.

Has anyone had any luck implementing such an interface?

- David Schlecht (dschl)

-----------------------
The mailing list archive is found here:
http://www.nagiosexchange.org/nagios-users.34.0.html


--__--__--

Message: 12
To: nagios-users at lists.sourceforge.net
Subject: Re: [Nagios-users] e-mail acknowledgements
From: David Schlecht <dschlecht at doit.nv.gov>
Date: Fri, 17 Mar 2006 16:50:20 +0100 (CET)

My thanks to Terry and Dany for their input
regarding the acknowledgments part of this.

- David Schlecht (dschl)

-----------------------
This thread is located in the archive at this URL:
http://www.nagiosexchange.org/nagios-users.34.0.html?&tx_maillisttofaq_pi1[showUid]=16416


--__--__--

Message: 13
Subject: RE: [Nagios-users] e-mail acknowledgements
Date: Fri, 17 Mar 2006 09:51:45 -0600
From: "Marc Powell" <marc at ena.com>
To: <nagios-users at lists.sourceforge.net>



> -----Original Message-----
> From: nagios-users-admin at lists.sourceforge.net
> [mailto:nagios-users-admin at lists.sourceforge.net] On Behalf Of David Schlecht
> Sent: Friday, March 17, 2006 9:36 AM
> To: nagios-users at lists.sourceforge.net
> Subject: [Nagios-users] e-mail acknowledgements
>
> Hi list
>
> I would like to allow our users to send e-mail to the Nagios
> server to initiate events such as acknowledge notifications,
> enable/disable notifications, and such.
>
> A procmail interface seems like a good solution. I noticed a
> post from long ago regarding this.
>
> Has anyone had any luck implementing such an interface?

There was a post just 2 days ago from Terry with a Subject of "Re:
[Nagios-users] Acknowledge issues via e-mail" that included a procmail
recipe and a script.
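
Terry's recipe and script are not reproduced in this digest. Purely as an
illustration of what the script end of such an interface might do, the sketch
below formats an ACKNOWLEDGE_SVC_PROBLEM line for the Nagios external command
file. The field order is what I understand the Nagios 2.x external command
format to be, the three flag values are placeholders, and the command file
path and the host, service and author names are assumptions; check all of
them against your own installation and documentation.

```shell
#!/bin/sh
# Format an ACKNOWLEDGE_SVC_PROBLEM external command line.
# $1 = Unix timestamp, $2 = host, $3 = service, $4 = author, $5 = comment
# Assumed field order: host;service;sticky;notify;persistent;author;comment
# The three flags are hard-coded to 1 here as placeholder values.
ack_svc_command() {
    printf '[%s] ACKNOWLEDGE_SVC_PROBLEM;%s;%s;1;1;1;%s;%s\n' \
        "$1" "$2" "$3" "$4" "$5"
}

# Assumed location of the external command file.
COMMAND_FILE=${COMMAND_FILE:-/usr/local/nagios/var/rw/nagios.cmd}

# Example with hypothetical names; uncomment to submit the command
# ("date +%s" is available in GNU and BSD date):
# ack_svc_command "$(date +%s)" web01 HTTP dschl "acknowledged by e-mail" > "$COMMAND_FILE"
```

A procmail recipe would then only need to pipe matching messages to a script
that extracts the host and service names and calls a function like this one.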

--
Marc


--__--__--

Message: 14
Date: Fri, 17 Mar 2006 09:36:37 -0500
From: "Mike Linden" <stanglinden at gmail.com>
To: "Marc Powell" <marc at ena.com>
Subject: Re: [Nagios-users] phantom host up messages
Cc: nagios-users at lists.sourceforge.net

------=_Part_501_2749469.1142606197267
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

On 3/16/06, Marc Powell <marc at ena.com> wrote:
>
> > -----Original Message-----
> > From: nagios-users-admin at lists.sourceforge.net
> > [mailto:nagios-users-admin at lists.sourceforge.net] On Behalf Of Mike Linden
> > Sent: Thursday, March 16, 2006 8:08 AM
> > To: nagios-users at lists.sourceforge.net
> > Subject: [Nagios-users] phantom host up messages
> >
> > Hi,
> > I have a couple systems that have been unplugged and removed from the
> > rack, yet Nagios still reports a Host UP alert for them from the
> > check_ping process.
> > How is this possible?
> >
> > ***** Nagios  *****
> >
> > Notification Type: RECOVERY
> > Host: csep1039
> > State: UP
> > Address: csep1039
> > Info: (No output!)
> >
> > Date/Time: Thu Mar 16 08:56:42 EST 2006
>
> The plugin/command you are using to check the host is providing no
> command line output but is still exiting with a code of 0 (successful).
> Try running the check_command, as the nagios user, exactly as it's
> defined, from the command line and then 'echo $?'. The results of the
> echo for a down host should be '2'. Chances are you're going to see some
> additional error output from ping/check_ping.
>
> There was also a suggestion earlier for another issue to add '2>&1' to
> the end of the check_command definition to redirect error output back to
> Nagios. If that works as advertised, and I believe it will, that could
> be informative as well.
>
> --
> Marc

Marc,
indeed there are other issues here.
check_ping is segfaulting and exiting with a status of 139:

./check_ping -H csep0602 -w 100.0,20% -c 500.0,60% ; echo $?
Segmentation fault
139

This is resulting in a host UP message.
Do you have any idea why this would occur on a seemingly random basis? I
also manually ran the command for other servers, which resulted in the
same output.
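
For context, a status of 139 follows the usual shell convention of 128 plus
the signal number, and SIGSEGV is signal 11 on Linux. One possible stopgap,
sketched here as an assumption rather than a known fix, is a small wrapper
that captures stderr as well as stdout and reports any status outside the
0-3 plugin range as UNKNOWN (3). The function name run_check is hypothetical.

```shell
#!/bin/sh
# Run a plugin, capture stdout and stderr together, and map any exit
# status outside the 0-3 plugin range to 3 (UNKNOWN).
run_check() {
    output=$("$@" 2>&1)
    status=$?
    if [ "$status" -gt 3 ]; then
        echo "UNKNOWN: plugin exited with status $status"
        return 3
    fi
    echo "$output"
    return "$status"
}

# Example, using the command from the message above:
# run_check ./check_ping -H csep0602 -w 100.0,20% -c 500.0,60%
```

This does not explain the segfault itself; it only keeps a crashed plugin
from being passed through with an out-of-range status and no output.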

Thanks
Mike

--
Mike Linden
http://linden.linuxps.com

------=_Part_501_2749469.1142606197267--


--__--__--

Message: 15
Subject: RE: [Nagios-users] phantom host up messages
Date: Fri, 17 Mar 2006 10:29:28 -0600
From: "Marc Powell" <marc at ena.com>
To: <nagios-users at lists.sourceforge.net>



> -----Original Message-----
> From: Mike Linden [mailto:stanglinden at gmail.com]
> Sent: Friday, March 17, 2006 8:37 AM
> To: Marc Powell
> Cc: nagios-users at lists.sourceforge.net
> Subject: Re: [Nagios-users] phantom host up messages
>
> On 3/16/06, Marc Powell <marc at ena.com> wrote:
>
> > > -----Original Message-----
> > > From: nagios-users-admin at lists.sourceforge.net
> > > [mailto:nagios-users-admin at lists.sourceforge.net] On Behalf Of Mike Linden
> > > Sent: Thursday, March 16, 2006 8:08 AM
> > > To: nagios-users at lists.sourceforge.net
> > > Subject: [Nagios-users] phantom host up messages
> > >
> > > Hi,
> > > I have a couple systems that have been unplugged and removed from
> > > the rack, yet Nagios still reports a Host UP alert for them from the
> > > check_ping process.
> > > How is this possible?
> >
> > The plugin/command you are using to check the host is providing no
> > command line output but is still exiting with a code of 0 (successful).
> > Try running the check_command, as the nagios user, exactly as it's
> > defined, from the command line and then 'echo $?'. The results of the
> > echo for a down host should be '2'. Chances are you're going to see
> > some additional error output from ping/check_ping.
>
> Marc,
> indeed there are other issues here.
> The check_ping is seg faulting with an error code of 139
>
> ./check_ping -H csep0602 -w 100.0,20% -c 500.0,60% ; echo $?
> Segmentation fault
> 139
>
> Which is resulting in a host up message.
> Do you have any idea why this would occur on a seemingly random basis? I
> also manually ran the command for other servers, which also resulted in
> the same output.

I sure don't. Are you running the latest version of the plugins? What OS
are you running on? Maybe that'll trigger recall for someone ;)

--
Marc


--__--__--

Message: 16
Date: Fri, 17 Mar 2006 10:29:37 -0600
From: Terry <td3201 at gmail.com>
To: "Matthias Eble"
<matthias.eble at mailing.kaufland-informationssysteme.com>
Subject: Re: [Nagios-users] 2.0 stable stops checking
Cc: nagios-users at lists.sourceforge.net

I am seeing this as well.  I have services that do not get checked
when they are scheduled:

Last Check Type:              ACTIVE
Last Check Time:              03-17-2006 08:50:47
Status Data Age:              0d 1h 37m 51s
Next Scheduled Active Check:  03-17-2006 10:09:01
Latency:                      342.408 seconds
Check Duration:               10.015 seconds
Last State Change:            03-16-2006 11:55:02
Current State Duration:       0d 22h 33m 36s

It is currently 10:29 and it still hasn't been checked.  This is one of
many examples.

On 3/15/06, Matthias Eble
<matthias.eble at mailing.kaufland-informationssysteme.com> wrote:
> hi all!
>
> we are experiencing occasional problems with nagios 2.0 stable. The
> main process was reloaded due to configuration changes yesterday (Mar
> 14th). since then ps -ef looks like this:
>
> nagios    1078     1 12 Mar09 ?        16:49:43 /opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg
> nagios    9431  1078  0 Mar14 ?        00:00:00 [nagios] <defunct>
>
> and nagios stopped checking. Has anyone an idea what could have happened?
> The nagios.log and status.dat files have not been updated since then.
>
> thanks
> matthias


--__--__--

Message: 17
Date: Fri, 17 Mar 2006 10:56:45 -0800
From: Eli Stair <estair at ilm.com>
To: Terry <td3201 at gmail.com>
CC: Matthias Eble
<matthias.eble at mailing.kaufland-informationssysteme.com>,
  nagios-users at lists.sourceforge.net
Subject: Re: [Nagios-users] 2.0 stable stops checking


I've been seeing this continuously in 2.0beta/rc/releases.  For details
on my situation/posts check the devel/users archives, I'm curious if any
similarities exist.  I haven't gotten acknowledgement/resolution on this
either, the only thing I've determined is that (in my case) stopping
nagios and restarting with the retention file zeroed resolves the issue
100%.
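
The stop, zero, restart sequence could look roughly like the sketch below.
The retention file path and init script location are assumptions, and the
backup copy is an addition of this sketch rather than part of the procedure
described above. Note that truncating the retention file discards whatever
state Nagios had saved in it.

```shell
#!/bin/sh
# Keep a backup copy of the retention file, then truncate it to zero length.
zero_retention_file() {
    [ -f "$1" ] || { echo "no such file: $1" >&2; return 1; }
    cp -p "$1" "$1.bak" && : > "$1"
}

# Assumed locations; adjust to your installation.
RETENTION_FILE=${RETENTION_FILE:-/usr/local/nagios/var/retention.dat}
INIT_SCRIPT=${INIT_SCRIPT:-/etc/init.d/nagios}

# Sequence described above; uncomment to run with sufficient privileges:
# "$INIT_SCRIPT" stop && zero_retention_file "$RETENTION_FILE" && "$INIT_SCRIPT" start
```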

Having an extra nagios process running can definitely cause this and
other issues.  In my case that has never been present, so it is not
the cause...

/eli

Terry wrote:
> I am seeing this as well.  I have services that do not get checked
> when they are scheduled:
> 
> Last Check Type:	ACTIVE
> Last Check Time:	03-17-2006 08:50:47
> Status Data Age:	0d 1h 37m 51s
> Next Scheduled Active Check:  	03-17-2006 10:09:01
> Latency:	342.408 seconds
> Check Duration:	10.015 seconds
> Last State Change:	03-16-2006 11:55:02
> Current State Duration:	0d 22h 33m 36s
> 
> It is currently 10:29 and it still hasn't been checked.  This is one of
> many examples.
> 



--__--__--

Message: 18
Date: Fri, 17 Mar 2006 13:22:30 -0600
From: Terry <td3201 at gmail.com>
To: nagios-users at lists.sourceforge.net
Subject: Re: [Nagios-users] 2.0 stable stops checking

In just looking at the logs, the status.log is being continuously
updated as normal but when checks stop, the nagios.log stops gathering
entries as well.
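
One way to notice this condition is to check how recently nagios.log was
written. The sketch below is only an illustration: the log path and the
10-minute limit are assumptions, and it relies on the -mmin test, which GNU
and BSD find provide but POSIX does not.

```shell
#!/bin/sh
# Succeed if the file was last modified more than $2 minutes ago.
# Uses find's -mmin test (GNU and BSD find; not specified by POSIX).
is_stale() {
    [ -n "$(find "$1" -mmin +"$2" 2>/dev/null)" ]
}

# Assumed log location and an arbitrary 10-minute limit.
NAGIOS_LOG=${NAGIOS_LOG:-/usr/local/nagios/var/nagios.log}

if is_stale "$NAGIOS_LOG" 10; then
    echo "WARNING: $NAGIOS_LOG has not been updated for over 10 minutes"
fi
```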

On 3/17/06, Eli Stair <estair at ilm.com> wrote:
>
> I've been seeing this continuously in 2.0beta/rc/releases.  For details
> on my situation/posts check the devel/users archives, I'm curious if any
> similarities exist.  I haven't gotten acknowledgement/resolution on this
> either, the only thing I've determined is that (in my case) stopping
> nagios and restarting with the retention file zeroed resolves the issue
> 100%.
>
> In the case of having an extra nagios process running that can
> definitely cause this and other issues.  In my case that's never been
> present and thus not the cause...
>
> /eli
>


--__--__--

Message: 19
Date: Fri, 17 Mar 2006 11:34:21 -0800
From: Eli Stair <estair at ilm.com>
To: Terry <td3201 at gmail.com>
CC:  nagios-users at lists.sourceforge.net
Subject: Re: [Nagios-users] 2.0 stable stops checking


So you're seeing the scenario where nagios stops _all_ checks
altogether?  I've had this happen when the nagios parent process dies
and logs to nagios.log to this effect: "[1139362901] Caught SIGSEGV,
shutting down... ".  I was getting these very frequently when I went
above some apparent host/service threshold (they went away when I removed
about 128 nodes at one point recently).  In these cases the CGIs still
respond for some reason, which seemed inappropriate...
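
To see whether a given installation has hit this, the log can be searched
for that message. A minimal sketch, assuming the default-looking log path
shown here; the function names are made up for illustration.

```shell
#!/bin/sh
# Print how many "Caught SIGSEGV" entries a log file contains.
count_segv() {
    grep -c 'Caught SIGSEGV' "$1" || true
}

# Print the most recent such entry, if any.
last_segv() {
    grep 'Caught SIGSEGV' "$1" | tail -n 1
}

# Assumed log location; adjust to your installation.
NAGIOS_LOG=${NAGIOS_LOG:-/usr/local/nagios/var/nagios.log}

if [ -r "$NAGIOS_LOG" ]; then
    echo "SIGSEGV entries in $NAGIOS_LOG: $(count_segv "$NAGIOS_LOG")"
fi
```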

I've also seen the same symptom, but without a well-advertised nagios 
failure, where the process is still present in memory but checks aren't 
executed and the CGI's are functional.

The third related (and my current bane...) issue is where MOST all 
checks occur, but some (sometimes large) groups of unrelated actions no 
longer occur.  Host/service checks as a whole seem to be working, but 
I'll notice that I haven't gotten an alert for something that failed, 
and then see that whole class of service checks on one hostgroup aren't 
running anymore... and then start to see the same issue with other 
checks/actions as well.

I'd sure love to just have nagios start working again, as I'm strongly 
against having to write an external framework that checks various parts 
of Nagios and alerts me when it's broken!  Alternately, I've always kept 
up to date on other OS monitor/alert frameworks and still nothing is as 
extensible as Nagios is (yet).
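
A minimal sketch of the kind of external watchdog mentioned above, assuming
that a stalled Nagios shows up as files that stop being updated (nagios.log
and status.dat are the files named in this thread). The paths, the 600-second
threshold and the function names are assumptions for illustration only; none
of this is part of Nagios itself.

```python
import os
import time


def seconds_since_update(path, now=None):
    """Return how many seconds ago `path` was last modified."""
    if now is None:
        now = time.time()
    return now - os.path.getmtime(path)


def stale_files(paths, max_age_seconds, now=None):
    """Return the paths that are missing or older than `max_age_seconds`."""
    stale = []
    for path in paths:
        try:
            age = seconds_since_update(path, now)
        except OSError:
            # A missing or unreadable file is treated as stale.
            stale.append(path)
            continue
        if age > max_age_seconds:
            stale.append(path)
    return stale


if __name__ == "__main__":
    # Hypothetical locations; adjust them to the local install.
    watched = [
        "/usr/local/nagios/var/nagios.log",
        "/usr/local/nagios/var/status.dat",
    ]
    for path in stale_files(watched, max_age_seconds=600):
        print("STALE: %s not updated in over 600 seconds" % path)
```

Run from cron, a script like this would at least flag the case where the
log stops gathering entries, though it cannot tell which checks have stalled.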

/eli


Terry wrote:
> In just looking at the logs, the status.log is being continuously
> updated as normal but when checks stop, the nagios.log stops gathering
> entries as well.
> 
> On 3/17/06, Eli Stair <estair at ilm.com> wrote:
> 
>>I've been seeing this continuously in 2.0beta/rc/releases.  For details
>>on my situation/posts check the devel/users archives, I'm curious if any
>>similarities exist.  I haven't gotten acknowledgement/resolution on this
>>either, the only thing I've determined is that (in my case) stopping
>>nagios and restarting with the retention file zeroed resolves the issue
>>100%.
>>
>>In the case of having an extra nagios process running that can
>>definitely cause this and other issues.  In my case that's never been
>>present and thus not the cause...
>>
>>/eli
>>
>>Terry wrote:
>>
>>>I am seeing this as well.  I have services that do not get checked
>>>when they are scheduled:
>>>
>>>Last Check Type:      ACTIVE
>>>Last Check Time:      03-17-2006 08:50:47
>>>Status Data Age:      0d 1h 37m 51s
>>>Next Scheduled Active Check:          03-17-2006 10:09:01
>>>Latency:      342.408 seconds
>>>Check Duration:       10.015 seconds
>>>Last State Change:    03-16-2006 11:55:02
>>>Current State Duration:       0d 22h 33m 36s
>>>
>>>It is currently 10:29 and it still hasn't been checked.  This is one of
>>>many examples.
>>>
>>>On 3/15/06, Matthias Eble
>>><matthias.eble at mailing.kaufland-informationssysteme.com> wrote:
>>>
>>>
>>>>hi all!
>>>>
>>>>we are experiencing occasional problems with nagios 2.0 stable. The
>>>>main process was reloaded due to configuration changes yesterday (Mar
>>>>14th). since then ps -ef looks like this:
>>>>
>>>>nagios    1078     1 12 Mar09 ?        16:49:43 /opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg
>>>>nagios    9431  1078  0 Mar14 ?        00:00:00 [nagios] <defunct>
>>>>
>>>>and nagios stopped checking. Does anyone have an idea what could have happened?
>>>>The nagios.log and status.dat files have not been updated since then.
>>>>
>>>>thanks
>>>>matthias



--__--__--

Message: 20
Date: Fri, 17 Mar 2006 14:37:39 -0500
From: "Mark Hennessy" <mhennessy at cloud9.net>
To: <nagios-users at lists.sourceforge.net>
Subject: [Nagios-users] "Host assumed to be up" message?

I have noticed with Nagios 1.3 and earlier, in "Host Detail" if a host is not
checked because its services are all up, "Host assumed to be up" is shown.
Now with Nagios 2.0, it shows the last check, and lots of those check dates
can of course get to be very old.

My Nagios users who watch this page are nervous at seeing such an old last
check.  I need to do one of the following:

1. Restore the old behavior of masking that date if the host is still up and
we know this by virtue of the service checks being successful.  This could
perhaps be done by putting the last successful service check date in that field?

2. Make Nagios check the hosts anyway even if all of the services are working
so that date increments.

Any ideas?

--
 Mark Hennessy


--__--__--

Message: 21
Date: Fri, 17 Mar 2006 13:39:59 -0600
From: Terry <td3201 at gmail.com>
To: "Eli Stair" <estair at ilm.com>
Subject: Re: [Nagios-users] 2.0 stable stops checking
Cc: nagios-users at lists.sourceforge.net

No, not all checks.  I see check_ping processes still firing up:

[root@plaut08 etc]# ps xauwwww -H| grep nagios  | grep -v grep
nagios   26676 11.0  0.1 28620 3852 ?        Ssl  13:35   0:11   /usr/bin/nagios -d /etc/nagios/nagios.cfg
nagios   26814  0.0  0.1 28624 3852 ?        S    13:36   0:00     /usr/bin/nagios -d /etc/nagios/nagios.cfg
nagios   26815  0.0  0.0  4684  640 ?        S    13:36   0:00       /usr/lib/nagios/plugins/check_ping -H 172.28.7.59 -w 3000.0,80%% -c 5000.0,100%% -p 15 -t 30
nagios   26816  0.0  0.0  2580  528 ?        S    13:36   0:00         /bin/ping -n -U -w 90 -c 15 172.28.7.59


I am seeing the same thing as you: only certain hosts/hostgroups are
being checked, and then all of a sudden everything stops except the
pings shown above, and even those checks are not showing up in
nagios.log.  Very weird.



--__--__--

Message: 22
Date: Fri, 17 Mar 2006 12:00:28 -0800
From: Eli Stair <estair at ilm.com>
To: Terry <td3201 at gmail.com>
CC:  nagios-users at lists.sourceforge.net
Subject: Re: [Nagios-users] 2.0 stable stops checking


Are you in a position to stop services for a minute and try starting 
up again with the retention.dat file moved out of the way?  If you're 
hesitant you may want to start up another instance of Nagios in parallel 
for testing it and such.  That's sane, but I've proven to myself enough 
that this is always the case (in _my_ _current_ instance) and just have 
to do it on the production system when I catch it.
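
A rough sketch of the "move retention.dat out of the way" step, assuming
Nagios has already been stopped by whatever init script the install uses.
The helper name move_aside and the backup naming scheme are made up for this
example, and the retention file location varies by install, so check the
state_retention_file setting in nagios.cfg rather than trusting any path
used here.

```python
import os
import shutil
import time


def move_aside(path, now=None):
    """Rename `path` to `path.<timestamp>.bak` and return the new name.

    Returns None if `path` does not exist. Intended to be run only
    while the Nagios daemon is stopped.
    """
    if not os.path.exists(path):
        return None
    # Timestamp the backup so repeated runs never overwrite each other.
    stamp = time.strftime("%Y%m%d-%H%M%S", time.localtime(now))
    backup = "%s.%s.bak" % (path, stamp)
    shutil.move(path, backup)
    return backup
```

For example, move_aside("/path/to/retention.dat") followed by a normal start
of Nagios. Keeping the old file around rather than deleting it means it can
still be inspected later for whatever entry triggers the stall.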

I'm real curious to find out if this is the same exact issue/resolution 
that works for you as well.

/eli





--__--__--

_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users


End of Nagios-users Digest



