Nagios stress-test

Andreas Ericsson ae at op5.se
Tue Apr 5 09:37:33 CEST 2005


Ahoy all.

This mail is intended for those of you who are interested in contributing 
to Nagios but aren't very comfortable with thread-safe C programming. 
Others might want to skip this mail.

I've created a small but significantly weird plugin called check_rand, 
available for download at http://oss.op5.se/nagios and 
https://devel.op5.se/oss

check_rand will:
* exit with a properly pseudo-random exit code between 0 and 3. Each 
code has been tested to be equally likely.
* print a random message 50% of the time, and otherwise the message 
"Life, loathe it or ignore it, you can't like it" (which happened to be 
the first message that the fortune program spit out).
* print perfdata 50% of the time.
* print empty perfdata 25% of the time (half the time it prints perfdata).
* *possibly* time out 12.5% of the time it's run (the value passed to 
sleep is random, so it might not time out after all, but sleep only has 
a 12.5% chance of being called).
* sleep up to 12 seconds prior to exiting, after having printed output. 
This will let it sometimes time out AFTER having printed output, which 
might be a source of crashes.
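
For those who want a feel for the first few points without reading the 
source, here is a rough shell sketch. It is NOT the actual plugin (which 
is written in C) and it leaves out the sleeping and time-out behaviour 
entirely. The function names and the perfdata label "value" are made up 
for illustration; only the use of /dev/urandom and the fixed message come 
from the description above.

```shell
# Hypothetical stand-in for parts of check_rand, not the real C plugin.

# Read one byte from /dev/urandom and print it as a decimal number (0-255).
rand_byte() {
    od -An -N1 -tu1 /dev/urandom | tr -d ' \n'
}

# Map a random byte onto the four plugin exit codes 0..3.
# 256 is divisible by 4, so each code is equally likely.
rand_state() {
    echo $(( $(rand_byte) % 4 ))
}

# Print a plugin-style line; about half of the time append perfdata
# after a '|' separator. The "value" label is an assumption.
rand_output() {
    msg="Life, loathe it or ignore it, you can't like it"
    if [ $(( $(rand_byte) % 2 )) -eq 0 ]; then
        echo "$msg|value=$(rand_byte)"
    else
        echo "$msg"
    fi
}
```

A script built on this would end with something like 
rand_output; exit "$(rand_state)" so that Nagios sees both the text and 
a random state.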

To use it in place of every check available, as well as every 
notification command (you won't want the barrage of notifications this 
script will generate), you should run something like this:

cp misccommands.cfg misccommands.cfg.bak
cp checkcommands.cfg checkcommands.cfg.bak
sed 's,\(command_line[^/]*\)[^ ]*\(.*\),\1/check_rand\2,' \
    checkcommands.cfg.bak > checkcommands.cfg
sed 's,\(command_line[^/]*\)[^ ]*\(.*\),\1/check_rand\2,' \
    misccommands.cfg.bak > misccommands.cfg

(Make sure you get those sed lines right. Cut'n'paste is your friend.)
Or you can manually change the command actually run to check_rand (or 
symlink every plugin you have to check_rand, or something else that will 
ensure that check_rand is run instead of the actual plugin). This kind 
of stress-testing is fairly important if we want 2.x to go stable 
sometime soon, so don't be afraid to ask if you're having trouble.
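
If you want to check what the sed expression above does before letting 
it loose on your config, you can feed it a single line. The sample line 
below is made up in the style of checkcommands.cfg, and the function 
name is only for illustration.

```shell
# Wrap the substitution from the commands above so it can be tried on stdin.
rewrite_command_line() {
    sed 's,\(command_line[^/]*\)[^ ]*\(.*\),\1/check_rand\2,'
}

# Hypothetical sample definition line; single quotes keep the $MACROS$ literal.
sample='command_line $USER1$/check_ping -H $HOSTADDRESS$'

printf '%s\n' "$sample" | rewrite_command_line
# prints: command_line $USER1$/check_rand -H $HOSTADDRESS$
```

Everything up to the first slash is kept, the plugin name is swapped for 
check_rand, and the arguments are left alone. Note that a line whose 
command is an absolute path (say /usr/bin/printf) comes out as 
/check_rand, so have a look at the result before restarting Nagios.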


Those easily offended should take heed, as I included fortune's 
offensive database. I just needed a lot of C-style strings pronto and 
took the ones that were readily available.

If you're serious about helping to debug Nagios, you should run an 
un-stripped binary (file /usr/local/nagios/bin/nagios will tell you) so 
that core dumps are useful, and set daemon_dumps_core to 1 in your 
nagios.cfg. It's very important that you keep the core files and the 
Nagios .log and .sav files that were generated during the (possible) 
crash, as debugging without them is simply hell.
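
A quick way to check both of those settings is sketched below. The 
function names are made up, and the nagios.cfg location in the usage 
note is only an assumption about a default source install, so adjust the 
paths to your setup.

```shell
# Succeeds if file(1) describes the given binary as "not stripped".
is_unstripped() {
    file "$1" | grep -q 'not stripped'
}

# Succeeds if the given config file sets daemon_dumps_core to 1.
dumps_core_enabled() {
    grep -Eq '^[[:space:]]*daemon_dumps_core[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$1"
}
```

For example: is_unstripped /usr/local/nagios/bin/nagios && 
dumps_core_enabled /usr/local/nagios/etc/nagios.cfg && echo "ready"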

If you don't like reporting things to the nagios-devel mailing list you 
can send bug reports privately to me and I'll collect and forward them. 
It's appreciated all the same, and results should be visible in the form 
of commits to CVS and more stable code.

The check_rand plugin is written in ANSI C for portability but uses 
/dev/urandom as its source for randomness. If this is a showstopper, 
then let me know and I'll work around it.


Cheers, and thanks for listening and (possibly) contributing.

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Lead Developer

