Nagios-users digest, Vol 1 #3188 - 5 msgs

Bryce, Jennifer JBryce at mtm.com
Wed May 24 15:58:22 CEST 2006


Good ole Georgie

Jennifer Bryce
MTM Technologies, Inc (formerly NEXL, Inc.)
Technical Consultant
978.538.3000 x543

-----Original Message-----
From: nagios-users-admin at lists.sourceforge.net
[mailto:nagios-users-admin at lists.sourceforge.net] On Behalf Of
nagios-users-request at lists.sourceforge.net
Sent: Tuesday, May 23, 2006 11:10 PM
To: nagios-users at lists.sourceforge.net
Subject: Nagios-users digest, Vol 1 #3188 - 5 msgs

Today's Topics:

   1. Re: How to reduce a very high latency number (Jacob Ritorto)
   2. RE: Check that a process is running! (Marc Powell)
   3. 2.x: nagios_check_command never called from cgi pages (Marc Martinez)
   4. RE: Undefined subroutine &Embed::Persistent::eval_file called.  in check_disk_smb (Stanley.Hopcroft at Dest.gov.au)
   5. CGI scripts not running (Eric X. Holzapfel)

--__--__--

Message: 1
Date: Tue, 23 May 2006 09:48:19 -0400
From: "Jacob Ritorto" <jritorto at nut.net>
To: nagios-users at lists.sourceforge.net
Subject: [Nagios-users] Re: How to reduce a very high latency number

Greetings,
       A colleague of mine (poctum) and I ran into something like
this while using nsca and have crafted a similar solution.  We
observed that send_nsca was sending only one result to the central
Nagios server per connection, although testing revealed that it is
capable of handling thousands of results per connection.  Sending
only one at a time was resulting in lots of dropped data, because
nominally about 5 results were being generated per second.  We enabled
aggregate_status_updates in the nagios.cfg file, but this yielded no
improvement in the result submissions.  BTW, this is Nagios-2.2 and
nsca-2.6 on Solaris 10.  Our workaround is quick and dirty but
efficient.  It may not be as refined as trask's, and it relies on
nuances of Unix file handling to get the job done.  That said, it's
working perfectly for us.  As it seems to work well but may violate
Ethan's design intentions, your feedback/input is requested.  Deploy
at your own risk.

Jacob Ritorto, Lead UNIX Server Operations Engineer
InnovationsTech

Here's our solution:

1) Altered the last line in
/opt/nagios/libexec/eventhandlers/submit_check_result as follows.  It
now appends each check result to a temp file instead of calling
send_nsca directly.

# Original last line (now commented out):
#/bin/printf "%s\t%s\t%s\t%s\n" "$1" "$2" "$return_code" "$4" | \
#    /opt/nagios/bin/send_nsca 172.16.x.x -c /opt/nagios/etc/send_nsca.cfg

# Replacement: append the result to the spool file.
/bin/printf "%s\t%s\t%s\t%s\n" "$1" "$2" "$return_code" "$4" >> \
    /opt/nagios/var/results.waiting


2) Created a daemon process called reap (managed by smf, but it has
been up for a month so far, so may be ok as an init.d script) to pull
aside the aforementioned temp file (results.waiting) every five
seconds and send the bits off to the central Nagios server (note that
original file is re-created immediately via step 1 above).  This
probably only works perfectly on unix & unix-like systems due to the
nature of files hanging around intact until the last program
referencing them has exited.  It's been some time, but the last I
checked, DOS/WINxxxx doesn't treat files this way.  Here's the simple
little reap daemon:

# cat /opt/nagios/bin/reap
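#!/usr/bin/tcsh
# Every 5 seconds, move the spool file aside and send its contents
# to the central server over a single send_nsca connection.
while (1)
 sleep 5
 # Skip the cycle if no results arrived; otherwise the mv would fail
 # and the previous results.sending would be sent a second time.
 if (-e /opt/nagios/var/results.waiting) then
  mv /opt/nagios/var/results.waiting /opt/nagios/var/results.sending
  /opt/nagios/bin/send_nsca 172.16.x.x -c /opt/nagios/etc/send_nsca.cfg \
   < /opt/nagios/var/results.sending >/dev/null
 endif
end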


Summary:  Slave Nagios servers now store up check results in the temp
file for 5 seconds, then they get shipped off to nsca on the central
Nagios machine in one swoop instead of one-at-a-time.
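
The same batching idea can be sketched in plain POSIX sh.  This is only
a minimal sketch under stated assumptions, not the script from the post:
the function name reap_once and its arguments are hypothetical, and the
sender is passed in as a command so the sketch can be tried without
send_nsca installed.

```shell
#!/bin/sh
# reap_once SPOOL SENDER [ARGS...]
# Moves SPOOL aside and feeds its contents to SENDER on stdin in one go.
# Returns 0 without doing anything when SPOOL is missing or empty.
reap_once() {
    spool=$1
    shift
    [ -s "$spool" ] || return 0
    batch="$spool.sending"
    # The rename is atomic on one filesystem, so writers that open the
    # spool after this point start a fresh file.
    mv "$spool" "$batch" || return 1
    # Remove the batch only if the sender reported success.
    "$@" < "$batch" && rm -f "$batch"
}

# Hypothetical usage with the paths from the post:
# while sleep 5; do
#     reap_once /opt/nagios/var/results.waiting \
#         /opt/nagios/bin/send_nsca 172.16.x.x -c /opt/nagios/etc/send_nsca.cfg
# done
```

Note that in this sketch a batch whose send failed is left on disk but
is overwritten by the next cycle, so it is not a retry queue.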


*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~



From: Trask <trasko at gm...>
Re: How to reduce a very high latency number
2006-05-23 03:50

On 5/22/06, srunschke at abit.de <srunschke at abit.de> wrote:
> nagios-users-admin at lists.sourceforge.net wrote on 17.05.2006 20:09:16:
>
> To me this is obviously a performance issue related to hardware.
> Your machines have way too little RAM. It is simply not possible to
> run 1800 checks on a 512MB machine in a timely manner.
>

I figured this out this past Saturday.  It is not any lack of
hardware.  I was seeing negligible load and no excessive use of
memory.  No configuration change I made seemed to have any appreciable
effect on the latency times I was getting.  I ended up running "top"
at 1-second intervals and just watching it for a while.  I noticed
that sometimes there would be a good number of nagios processes,
20-30-40 or so, but the majority of the time there were only 2, 3 or 4
processes.  Although I do not know exactly *why* this was happening,
it turns out that during the times when there were only 2-4 processes
running, 2 of them were always the "submit_passive_check" script and
"send_nsca".  It appears that this is being done serially (i.e. not in
parallel) and ends up blocking subsequent checks until they are done.
I would see these 2 processes running (with steadily increasing PIDs)
for up to a minute and then a short-lived (4-5 seconds) "explosion" of
nagios processes (service/host checks).  After this flurry of
activity, it would be another 60 seconds or so of just 2-4 processes.

I resolved this problem by changing my "submit_passive_check" script.
Below are some sample scripts, both old and new.  The short of it is
this:  Previously, the "submit_passive_check" script did a printf
of the data in the appropriate format and piped it to the "send_nsca"
command (in a shell script).  I have eliminated this bottleneck by
having "submit_passive_check" redirect its output to a named pipe and
then having another script feed "send_nsca" with that data as it comes
in to the named pipe.
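
The named-pipe arrangement described above can be sketched roughly as
follows.  This is an illustration only and not trask's actual scripts:
the function names feed_from_fifo and submit_result and their arguments
are hypothetical, and the sender is passed in as a command so that
send_nsca can be swapped for something else while experimenting.

```shell
#!/bin/sh
# feed_from_fifo FIFO SENDER [ARGS...]
# Creates FIFO if needed, then streams everything written to it into a
# single SENDER process until all writers have closed the pipe.
feed_from_fifo() {
    fifo=$1
    shift
    [ -p "$fifo" ] || mkfifo "$fifo" || return 1
    "$@" < "$fifo"
}

# submit_result FIFO HOST SERVICE RETURN_CODE OUTPUT
# Writes one tab-separated passive check result into the FIFO.
submit_result() {
    printf '%s\t%s\t%s\t%s\n' "$2" "$3" "$4" "$5" > "$1"
}

# Hypothetical usage: keep a feeder running in the background, e.g.
# while :; do
#     feed_from_fifo /opt/nagios/var/nsca.fifo \
#         /opt/nagios/bin/send_nsca 172.16.x.x -c /opt/nagios/etc/send_nsca.cfg
# done &
```

The FIFO should exist before any writer runs; otherwise the redirect in
submit_result would create a regular file at that path.  The FIFO path
in the usage comment is an assumption.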

Latency times have dropped from 600-700 seconds to 0.2 seconds on
the worst server and from 45-55 seconds to 0.06 seconds on the 2nd worst.
That's more like it!

Below are a few scripts w/ notes as to what each one is.  Thanks to
everyone who offered help.

~trask


--__--__--

Message: 2
Subject: RE: [Nagios-users] Check that a process is running!
Date: Tue, 23 May 2006 09:32:30 -0500
From: "Marc Powell" <marc at ena.com>
To: <nagios-users at lists.sourceforge.net>



> -----Original Message-----
> From: nagios-users-admin at lists.sourceforge.net
> [mailto:nagios-users-admin at lists.sourceforge.net]
> On Behalf Of lennart.kvam at softronic.se
> Sent: Tuesday, May 23, 2006 5:44 AM
> To: nagios-users at lists.sourceforge.net
> Subject: [Nagios-users] Check that a process is running!
>
> Hello all!
>
> I want to monitor that a process is running on a solaris8 box. I want
> nagios to notify me if the process dies or if more than one is running.
> Anyone know what plugin to use?

check_procs

>
> I've tried check_procs but can't get it to work!

What did you use? What happens?

> Please help:-)

You need just two arguments to check_procs: -c and -C. One of the
examples given in --help is almost exactly what you need.
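
To illustrate the two-argument form: -C selects processes by command
name, and -c gives the critical range for how many of them may be
running, so a range of 1:1 means "exactly one".  A minimal sketch
follows; the plugin path and the helper name check_single_proc are
assumptions, so adjust them to your install.

```shell
#!/bin/sh
# Path to the plugin; an assumption, override it to match your install.
CHECK_PROCS=${CHECK_PROCS:-/usr/local/nagios/libexec/check_procs}

# check_single_proc NAME
# CRITICAL unless exactly one process with command name NAME is running.
check_single_proc() {
    "$CHECK_PROCS" -c 1:1 -C "$1"
}

# Hypothetical usage:
# check_single_proc sshd; echo "exit code: $?"
```

The exit code follows the usual plugin convention (0 for OK, 2 for
CRITICAL), which is what Nagios acts on when deciding to notify.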

--
Marc


--__--__--

Message: 3
Date: Tue, 23 May 2006 10:31:22 -0700
From: "Marc Martinez" <lastxit at gmail.com>
To: nagios-users at lists.sourceforge.net
Subject: [Nagios-users] 2.x: nagios_check_command never called from cgi
pages

I've been unable to find a mention of why the nagios_check_command
handling has been removed from the sources.

I do notice that the option is no longer in the documentation for
cgi.cfg but an example entry still exists in the sample-config version
of the file.

Was this ripped out for performance considerations, or just shelved in
a re-org and never brought back?

Thanks for any insight.

Marc


--__--__--

Message: 4
Subject: RE: [Nagios-users] Undefined subroutine
&Embed::Persistent::eval_file called.  in check_disk_smb
Date: Wed, 24 May 2006 09:22:40 +1000
From: <Stanley.Hopcroft at Dest.gov.au>
To: <nagios-users at lists.sourceforge.net>

Dear Sir,

I am writing to thank you for your letter and say,

>-----Original Message-----

>Message: 15
>Date: Tue, 23 May 2006 14:10:03 +0200 (CEST)
>From: jacobo garcía <ies_med at yahoo.es>
>To: nagios-users at lists.sourceforge.net
>Subject: [Nagios-users] Undefined subroutine
>&Embed::Persistent::eval_file called.  in check_disk_smb
>
>Under Nagios i had this problem when running
>check_disk_smb
>
>"Undefined subroutine &Embed::Persistent::eval_file
>called. "
>

It sounds like you have a Nagios built with embedded Perl.

(You can check by running

strings </Path/to/>nagios | grep -i perl | head -5

If you see
libperl.so
Perl_croak
Perl_markstack_grow
Perl_croak_nocontext
Perl_save_int

then you have a Perl interpreter built into Nagios.)

If this is not the case, I can't help.

Unfortunately if your Nagios does have Perl in it, the Perl driver
(p1.pl) seems not to be where Nagios expects it.

Look at your nagios.cfg

(e.g.
[sh1517 at acisf011 Dhcp]$ grep -i P1 /etc/nagios/nagios.cfg
# P1.PL FILE LOCATION
# This value determines where the p1.pl perl script (used by the
p1_file=/usr/bin/p1.pl
[sh1517 at acisf011 Dhcp]$
)

and check if p1.pl is in the location nagios.cfg says it should be.

If it is, I don't know what is happening.

If not, try to locate it - you can get it from the Nagios CVS or from
the corresponding dist tarball or the RPM/package - and put it there.
You could also check that it's not hiding somewhere else in the file
system.

If you can get a copy of p1.pl then either relocate it or put it in the
path specified by nagios.cfg (p1.pl is used only by Nagios so moving it
won't break anything). You may have to restart Nagios for the change to
take effect.
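
The check described above can be scripted.  The following is a minimal
sketch, assuming a nagios.cfg with a plain "p1_file=<path>" line as in
the grep output; the function names p1_file_path and check_p1 are
hypothetical helpers, not part of Nagios.

```shell
#!/bin/sh
# p1_file_path CFG
# Prints the value of the p1_file directive from a nagios.cfg-style file
# (the last uncommented occurrence wins in this sketch).
p1_file_path() {
    sed -n 's/^[[:space:]]*p1_file[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1
}

# check_p1 CFG
# Reports whether the file named by p1_file exists and is readable.
check_p1() {
    path=$(p1_file_path "$1")
    if [ -z "$path" ]; then
        echo "no p1_file directive in $1"; return 2
    elif [ -r "$path" ]; then
        echo "found $path"; return 0
    else
        echo "missing $path"; return 1
    fi
}

# Hypothetical usage:
# check_p1 /etc/nagios/nagios.cfg
```

Run it as the user Nagios runs as, since the readability test depends
on who is asking.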

The file is plain text, pure Perl. p1.pl defines the subroutine that
Nagios is trying to have Perl call.

Finally, if you don't have a good reason to use embedded Perl, you
probably shouldn't be using it. If you don't want to use embedded Perl,
you can't turn it off at run time. The only option is replacing the
Nagios binary with one compiled without embedded Perl. The Dag Wieers
Redhat RPMs for Nagios all build with embedded Perl so if you use RPMs,
you would need to choose another RPM or hack the Dag SPEC file to not
build embedded Perl.


>
>when i run it from the command line i get a response on 2 lines,
>but it seems to be ok.
>
>i don't know what to do.
>
>

Yours sincerely.


--__--__--

Message: 5
Date: Tue, 23 May 2006 16:35:26 -0700
From: "Eric X. Holzapfel" <ewh at groupoliver.com>
To: <nagios-users at lists.sourceforge.net>
Subject: [Nagios-users] CGI scripts not running

Hello Nagios Users,

I have downloaded and built nagios 2.3.1 (let's assume for the moment
that I did it correctly).

When I click on any of the links (like Tactical Information, etc.) all
that I get is a blank page.

I have the htpasswd set, etc., and can log in with the default user
(nagios), and have copied the cgi files (just in case) to the
/var/www/cgi-bin directory as well as leaving them in their normal place,
/usr/local/nagios/sbin.

The permissions seem to be set IAW the nagios install directions, etc.

I am running Fedora Core 5, and apache 2.0.

Any suggestions here???

Thanks,

eric




--__--__--

_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users


End of Nagios-users Digest

