RFC embedded Perl Nagios changes: usability and performance.

Ethan Galstad nagios at nagios.org
Mon Jan 5 02:09:49 CET 2004


Hi Stanley -

Your suggestions certainly sound worth pursuing.  I try to stay as 
far away as possible from the epn code, but if you submit patches 
against CVS HEAD (2.x), I'll include them in 2.0.  The one change I 
did make in 2.0 was to add a config file option to unload/reload the epn 
after a certain number of uses to possibly help cut down on memory 
leaks (not sure if it will help though).


On 1 Jan 2004 at 13:54, Stanley Hopcroft wrote:

> Dear Ladies and Gentlemen,
> 
> I am writing to request your comments on proposed changes to the
> embedded Perl interpreter support in Nagios (ePN).
> 
> This letter has two sections: part A is general discussion and the
> proposals; B concerns testing.
> 
> Part A Discussion and Proposals.
> 
> The ePN support contributed by Mr Stephen Davies embeds a Perl
> interpreter into the Nagios binary and (with much cleverness)  allows
> Perl plugins to be compiled only once before they are run (as well as
> letting the Plugin call the Perl exit without trashing the
> interpreter; saving the output the plugin writes to STDOUT and other
> good things).
> 
> ePN provides these benefits to Perl plugins and Nagios
> 
> 1 Perl plugins are not subject to the OS forking the Nagios process;
> Perl plugins are called as Nagios functions
> 
> While Nagios forks to execute _each_ plugin, an ePN Nagios does not
> request another fork (in the popen() library call) to run the plugin.
> 
> 2 Perl plugins are compiled only once, saving both an exec of the Perl
> interpreter and the Perl compilation phase (to the Perl op-code parse
> tree) each time a Perl plugin is run. Since the compilation phase may
> include loading Perl modules required by the plugin (some of which are
> huge), there is quite a saving of work if not execution time (since
> Perl is pretty fast) in single compilation.
> 
> There are also ePN tradeoffs such as markedly increased memory
> consumption (the Perl parse trees remain in memory).
> 
> The ePN implementation has performed well in both Netsaint 0.0.[4-7]
> and Nagios 1.[0-1], and it seems to remain unchanged in the HEAD CVS
> branch (the Perl calls in the head checks.c seem to be the same as
> those in 1.x, and p1.pl appears unchanged). It could be enhanced in
> these areas
> 
> 1 Performance
> 
> Three observations concerned with performance are that
> 
> 1.1 The plugin output is returned to Nagios through the file system
> (rather than as an extra element of a list returned by
> Embed::Persistent::run_plugin).
> 
> Instead of STDOUT being tied to the file system it could be tied to an
> in memory data structure (probably scalar) and the value the plugin
> 'writes' as output returned as an extra element on the Perl stack.
> 
> This I think is a straightforward enhancement that saves Nagios
> system calls to generate a temporary file name, open the file, read
> the line of plugin output and unlink the file.
> 
> Unfortunately, I don't know why STDOUT was tied to a file: it doesn't
> seem to have any debugging advantages because the contents are either
> logged by Nag (and the file unlinked) or there are _no_ contents. 
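
A rough illustration of the idea in 1.1, not the actual p1.pl code: a tied handle can collect whatever a plugin prints into a scalar instead of a temporary file. The class name OutputTrap and the helper capture_stdout are made up for this sketch. A tie is used here rather than opening a handle on a scalar reference, since the latter needs Perl 5.8 or later.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper class (not part of p1.pl): a tied handle that
# appends everything printed to it onto an in-memory scalar.
package OutputTrap;

sub TIEHANDLE {
    my ($class) = @_;
    my $buffer = '';
    return bless \$buffer, $class;
}

sub PRINT {
    my $self = shift;
    $$self .= join('', @_);
    return 1;
}

sub PRINTF {
    my $self   = shift;
    my $format = shift;
    $$self .= sprintf($format, @_);
    return 1;
}

sub contents {
    my $self = shift;
    return $$self;
}

package main;

# Run a code reference with STDOUT tied to memory and return what it printed.
sub capture_stdout {
    my ($code) = @_;
    my $trap = tie *STDOUT, 'OutputTrap';
    eval { $code->() };
    my $error  = $@;
    my $output = $trap->contents;
    undef $trap;       # drop our reference before untying
    untie *STDOUT;
    die $error if $error;
    return $output;
}

my $text = capture_stdout(sub { print "OK - ", 3, " checks passed\n" });
print "captured: $text";
```

In an embedded interpreter the captured string would be handed back to the caller as an extra return value, so no temporary file has to be named, opened, read and unlinked.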
> 
> 1.2 The ePN author comments specifically on the separation of the
> parse and execution phases
> 
> '
> # Only major changes are to separate the compiling and cacheing from
> # the execution so that the cache can be kept in "non-volatile" parent
> # process while the execution is done from "volatile" child processes
> '
> 
> Unfortunately, I don't understand this. Presumably the 'processes' are
> the eval_file and run_package subroutines in the Embed::Persistent
> package. However, while eval_file is only concerned with whether to
> parse the plugin, the data structure that it uses to avoid reparsing
> unchanged plugins that have already been compiled (%Cache) is an entry
> in the package symbol table: it is visible to both subroutines.
> 
> It _may_ be better to replace the two Nagios calls to Perl (one for
> eval_file and the other for run_package) by one call to a new Perl
> subroutine that optionally compiles and runs the plugin.
> 
> This change is extensive; I have no plans to do so immediately if at
> all.
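
A minimal sketch of what a single "compile if needed, then run" entry point as in 1.2 might look like. Everything here is an assumption for illustration: the name compile_and_run, the layout of %Cache, and the use of an anonymous sub (p1.pl wraps each plugin in its own package instead). The counter $compile_count exists only to show that an unchanged plugin is not reparsed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Cache of compiled plugins: file name => { mtime => ..., code => coderef }.
# The name %Cache follows the post; this layout is an assumption.
my %Cache;
my $compile_count = 0;    # demonstration only

# Hypothetical single entry point: compile the plugin if it is new or has
# changed on disk, then run it with the given arguments.
sub compile_and_run {
    my ($file, @args) = @_;
    my $mtime = (stat $file)[9];
    die "cannot stat $file: $!" unless defined $mtime;

    my $entry = $Cache{$file};
    if (!$entry || $entry->{mtime} != $mtime) {
        open(my $fh, '<', $file) or die "cannot open $file: $!";
        my $source = do { local $/; <$fh> };
        close($fh);

        # Wrap the plugin text in an anonymous sub so it is parsed only once.
        my $code = eval "sub { local \@ARGV = \@_;\n$source\n}";
        die "syntax error in $file: $@" unless $code;

        $entry = $Cache{$file} = { mtime => $mtime, code => $code };
        $compile_count++;
    }
    return $entry->{code}->(@args);
}

# Demonstration with a throwaway one-line "plugin".
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} 'return "args: @ARGV";', "\n";
close($tmp_fh);

print compile_and_run($tmp_name, 'a', 'b'), "\n";
print compile_and_run($tmp_name, 'c'), "\n";
print "compiled $compile_count time(s)\n";
```

The second call finds the entry in %Cache with the same modification time and skips straight to execution, which is the saving the two-call arrangement is also after.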
> 
> 1.3 p1.pl uses IO::File to open the file to which plugin output is
> sent. IO::File is a big module (according to Lincoln Stein) and in
> this case where it is only being used to return a file handle glob, it
> seems overkill.
> 
> Unfortunately, replacing it by a normal two argument Perl open
> produces a weird failure in _all_ the Perl plugins.
> 
> In view of 1.1 this doesn't seem worth worrying about.
> 
> 2 Usability
> 
> Coding plugins to succeed under ePN requires more Perl nous and
> experience than without ePN: the same plugin can run from the command
> line but fail with ePN.
> 
> The ePN support can be enhanced to provide information useful to the
> plugin developer and to avoid spurious CRITICAL states (caused by
> plugin mistakes) by 
> 
> 2.1 Logging Perl warnings and compile time errors
> 
> 2.2 Logging (in the Nagios log file) a clear indication that the
> plugin has failed and at the same time returning UNKNOWN instead of
> CRITICAL when a plugin cannot be executed.
> 
> Here is an extract from a test Nagios nagios.log running with ePN
> support that provides such logging
> 
> [1072870185] SERVICE ALERT: oradev;AUB;UNKNOWN;SOFT;1;**ePN plugin
> runtime error: Can't locate object method "new" via package
> "Nagios::WebTransact" at (eval 1) line 79 in plugin 'check_aub'.
> 
> [1072870245] SERVICE ALERT: oradev;AUB;UNKNOWN;SOFT;2;**ePN plugin
> runtime error: Can't locate object method "new" via package
> "Nagios::WebTransact" at (eval 1) line 79 in plugin 'check_aub'.
> 
> [1072870275] SERVICE ALERT: oradev;bad_plugin;UNKNOWN;SOFT;1;**ePN
> plugin 'ap5' has syntax errors. Check ePN log.
> 
> [1072870305] SERVICE ALERT: oradev;AUB;UNKNOWN;HARD;3;**ePN plugin
> runtime error: Can't locate object method "new" via package
> "Nagios::WebTransact" at (eval 1) line 79 in plugin 'check_aub'.
> 
> [1072870335] SERVICE ALERT: oradev;bad_plugin;UNKNOWN;SOFT;2;**ePN
> plugin 'ap5' has syntax errors. Check ePN log.
> 
> [1072870395] SERVICE ALERT: oradev;bad_plugin;UNKNOWN;HARD;3;**ePN
> plugin 'ap5' has syntax errors. Check ePN log.
> 
> These errors signify that
> 
> 1 the plugin named check_aub would compile but had a fatal run-time
> error 
> 
> (
> Hardly surprising since it was hacked for this purpose
> 
> tsitc> diff -c ../libexec/check_aub
> /usr/local/nagios/libexec/check_aub
> *** ../libexec/check_aub        Wed Dec 31 12:20:33 2003
> --- /usr/local/nagios/libexec/check_aub Sat Jun 21 11:15:51 2003
> ***************
> *** 30,36 ****
> 
>   use Getopt::Long;
> 
> ! # use Nagios::WebTransact ;
>   use utils qw($TIMEOUT %ERRORS &print_revision &support);
> 
>   my $PROGNAME = 'check_aub' ;
> --- 30,36 ----
> 
>   use Getopt::Long;
> 
> ! use Nagios::WebTransact ;
>   use utils qw($TIMEOUT %ERRORS &print_revision &support);
> 
>   my $PROGNAME = 'check_aub' ;
> tsitc> 
> )
> 
> 2 The plugin named 'ap5' failed to compile under the ePN
> 
> Here is the corresponding entry in the new ePN log of plugin syntax
> errors.
> 
> tsitc> tail -30 epn.log 
> 
> **ePN plugin syntax error: Global symbol "$i" requires explicit
> package name at (eval 3) line 5.
>  in package Embed::Persistent file
> /home/anwsmh/nagios-1.0_test-debug/bin/p1.pl at line 144 in text
> "
>                 package main;
>                 use subs 'CORE::GLOBAL::exit';
>                 sub CORE::GLOBAL::exit { die "ExitTrap: $_[0]
> (Embed::ap5)"; }
>                 package Embed::ap5; sub hndlr { shift(@_);
> @ARGV=@_;
> #!/usr/bin/perl -w
> 
> use strict ;
> 
> $i = 0 ;
> 
> while ($_ = shift @ARGV) {
>   print "\$ARGV\[$i\]: $_ " ;           # NB embedded Perl only reads
> __1__ (one) line of output !
>   $i++ ;
> }
> ; }
> 
> ;".
> tsitc> 
> 
> This shows the complete plugin listing _as it is executed by ePN_ (the
> original plugin text is wrapped as a subroutine with the exit function
> overridden); the line number reported as the line containing the error
> (5) is relative to the _original_ plugin text.
> 
> The usability changes are
> 
> 1 comprised of patches to p1.pl solely
> 
> 2 change the exit status of a plugin with a run time error from
> CRITICAL to UNKNOWN (this was obviously a minor mistake in the
> original p1.pl)
> 
> 3 change the message logged for a plugin with a run time error from 
> 
> (No output!)
> 
> to
> 
> **ePN plugin runtime error: Can't locate object method "new" via
> package "Nagios::WebTransact" at (eval 1) line 79 in plugin
> 'check_aub'.
> 
> and 
> 
> 4 change the message logged for a plugin with a syntax error from
> 
> (No output!)
> 
> to
> 
> **ePN plugin 'ap5' has syntax errors. Check ePN log.
> 
> 5 add an ad-hoc log to p1.pl to record plugin syntax errors. At the
> moment this log file is specified by a hard coded string in p1.pl that
> re-opens STDERR in append mode to that file. The CPAN Sys::Syslog
> module could be used to allow syslogd to rotate and archive but I
> think this an unacceptable memory tradeoff.
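
Items 2 and 3 above could look roughly like the following. This is a sketch under assumptions, not the patched p1.pl: run_plugin and $TrapExit are invented names, the plugin is passed as a code reference, and only the exit status and error message are handled (no output capture). A trapped exit() yields the plugin's own status, while any other die is reported as UNKNOWN with a message in the style shown earlier.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Nagios plugin return codes.
my %ERRORS = (OK => 0, WARNING => 1, CRITICAL => 2, UNKNOWN => 3);

# True only while a plugin is running, so exit() elsewhere behaves normally.
our $TrapExit = 0;

# Replace the built-in exit for code compiled after this point. While a
# plugin runs, exit() raises an exception instead of ending the process.
BEGIN {
    no warnings 'once';
    *CORE::GLOBAL::exit = sub {
        my $status = @_ ? $_[0] : 0;
        die "ExitTrap: $status\n" if $TrapExit;
        CORE::exit($status);
    };
}

# Hypothetical runner: returns (status, message) for a plugin code reference.
sub run_plugin {
    my ($name, $code) = @_;
    local $TrapExit = 1;
    my $ok = eval { $code->(); 1 };
    return ($ERRORS{OK}, '') if $ok;

    my $error = $@;
    if ($error =~ /^ExitTrap: (-?\d+)/) {
        return ($1 + 0, '');    # the plugin called exit() itself
    }
    chomp $error;
    return ($ERRORS{UNKNOWN},
            "**ePN plugin runtime error: $error in plugin '$name'.");
}

my ($status, $message) = run_plugin('check_demo', sub { exit(2) });
print "status=$status\n";

($status, $message) = run_plugin('check_demo', sub { die "no such method\n" });
print "status=$status message=$message\n";
```

Keeping the trap behind a flag means a stray exit() in the surrounding code still terminates the process as usual; only code run through run_plugin has its exit converted into a status.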
> 
> B Testing
> 
> I have patched p1.pl (the version I think is 1.2 from both 1.x and 2.x
> CVS branches) to implement the usability changes above (and will also
> do so for the first of the performance changes [don't use file system
> to return plugin output]) and used it successfully on _one_ FreeBSD
> system (system Perl 5.005_03) for
> 
> 1 My production Nag (200 hosts/400 services/"if it's not pinged it's
> Perl'd"), only for about 24 hours at this stage.
> 
> 2 A test Nag (same host/Perl - tiny config, hacked plugins, different
> paths) for adhoc testing.
> 
> 3 ePN simulator (same host/Perl)
> 
> I will try on at least a Linux/threaded Perl system before asking for
> testers.
> 
> Since these changes have the potential to 
> 
> . creep into the Nagios C code (almost certainly will. NB I think the
> changes are confined to checks.c but any part of the code that
> contains #ifdef EMBEDDEDPERL potentially needs changing)
> 
> . create havoc (see 1.3 under Performance)
> 
> I welcome any comments and particularly those about testing.
> 
> At this stage my plan is to
> 
> . allow the usability changes more time to misbehave (as well as
> trying them on a Linux test system - mini config)
> 
> . try the more extensive performance change (1.1) as above - since
> this is the only one of the changes that really is helpful to Nagios
> ePN installations.
> 
> . Invite testers - preferably from experienced sysadmins/entropy
> removalists
> 
> . submit patches.
> 
> Yours sincerely. 
> 
> 
> -- 
> ----------------------------------------------------------------------
> -- Stanley Hopcroft
> ----------------------------------------------------------------------
> --
> 
> '...No man is an island, entire of itself; every man is a piece of the
> continent, a part of the main. If a clod be washed away by the sea,
> Europe is the less, as well as if a promontory were, as well as if a
> manor of thy friend's or of thine own were. Any man's death diminishes
> me, because I am involved in mankind; and therefore never send to know
> for whom the bell tolls; it tolls for thee...'
> 
> from Meditation 17, J Donne.
> 
> 
> -------------------------------------------------------
> This SF.net email is sponsored by: IBM Linux Tutorials.
> Become an expert in LINUX or just sharpen your skills.  Sign up for
> IBM's Free Linux Tutorials.  Learn everything from the bash shell to
> sys admin. Click now!
> http://ads.osdn.com/?ad_id=1278&alloc_id=3371&op=click
> _______________________________________________ Nagios-devel mailing
> list Nagios-devel at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-devel
> 



Ethan Galstad,
Nagios Developer
---
Email: nagios at nagios.org
Website: http://www.nagios.org


