Problems with nrpe2 signals and plugin cleanup

Bill Moran wmoran at collaborativefusion.com
Mon Feb 25 22:17:46 CET 2008


I'm writing a custom plugin for our application that runs under nrpe2.

This bugger deals with a lot of data (potentially several GB), so nrpe2
is configured with a large timeout (300s).  It's impractical to keep
all the data in RAM, so I'm using temp files.

My problem is that sometimes network problems cause the script to take
longer than 300 seconds to run.  In this case, I want to receive an
alert, so all is well.  The problem here is that nrpe2 terminates the
script so the temp files are left lying around.

In looking for a more elegant solution than having admins clean up
temp files manually, or having a cron job clean them up, I tried
installing a signal handler in the plugin to guarantee cleanup
of the temp files, but it didn't work, so I delved into nrpe2's
source a bit to figure out why.  I found that on timeout, nrpe2
issues a SIGTERM immediately followed by a SIGKILL.  Since SIGKILL
is not catchable, my theory is that the SIGKILL signal arrives
before my script has had a chance to run the signal handler for
the SIGTERM, thus the cleanup is never done.

So ... I've two questions:

First, does anyone have a suggestion on how to handle this better
in the script?

Second, I'm curious about the rapid issuance of the TERM/KILL
signals.  Is there anything preventing nrpe2 from simply sleep()ing
a few seconds between the two signals?  I mean, if I'm willing to
wait 300s for success, I'm willing to wait 305s for a clean failure.
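
The behaviour asked about here can be sketched as follows. This is not
nrpe2's code (nrpe2 is written in C); it is a Python illustration of the
"TERM, short grace period, then KILL" idea, and the function name
terminate_with_grace and the 5 second default are assumptions made for
the example.

```python
import signal
import subprocess


def terminate_with_grace(proc, grace_seconds=5.0):
    """Send SIGTERM, wait up to grace_seconds, then SIGKILL if still alive.

    `proc` is a subprocess.Popen object. Returns "terminated" if the child
    exited during the grace period, "killed" if SIGKILL was needed.
    """
    proc.send_signal(signal.SIGTERM)
    try:
        # Give the child a chance to run its SIGTERM handler and clean up.
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        # The child did not exit in time; force it.
        proc.kill()
        proc.wait()
        return "killed"
```

A child that exits promptly on SIGTERM costs no extra time, since wait()
returns as soon as it exits; only a child that ignores or stalls on
SIGTERM pays the full grace period.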

-- 
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran at collaborativefusion.com
Phone: 412-422-3463x4023

