DNX Version 0.13 Released!

Adam Augustine augustineas at gmail.com
Thu Oct 4 01:11:25 CEST 2007


Sorry for the long time between releases. Bob has been working on
it quite a bit (I'm the one who's been slacking), so this should catch
us up a bit. This release should resolve all the outstanding bugs and
address a few of the features.

Once we found the short-term stability bug, worker nodes were able to
run for a couple of weeks under our normal load before the
communications channel bug got us. But Bob worked his magic and found
the problem in record time.

If you can, please test and let us know how it works for you.

Version 0.13:
=============
- Added out-of-memory condition checking for all strdup(3) calls.
- Fixed DNX communications channel exhaustion bug. This bug occurred when
a Client worker thread exited: Although the dispatch and collector channels
were properly closed, they weren't released from the DNX Channel Map pool.
Since this pool has a finite number of slots, we ran out of slots eventually.
Running out of slots then prevented the creation of any additional worker
threads.
- Fixed memory leak in the Client, related to the above problem.
- Fixed the same DNX communications channel exhaustion bug in the NEB Server
module as well, although it was not likely to occur very often there.
- Added some additional error and debug logging.
- Added some graceful handling of NULL strings in the XML protocol messaging.
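
The channel exhaustion bug described above can be illustrated with a small
sketch. This is not the DNX source: the names ChannelPool, pool_acquire,
channel_close, channel_close_and_release, simulate_workers and the pool size
MAX_SLOTS are all hypothetical, and each worker is simplified to a single
channel. It only shows why closing a channel without releasing its slot
eventually starves a fixed-size pool.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SLOTS 4  /* hypothetical pool size, not the real DNX value */

typedef struct {
    int in_use;   /* slot is reserved in the pool */
    int is_open;  /* underlying channel is open */
} ChannelSlot;

typedef struct {
    ChannelSlot slots[MAX_SLOTS];
} ChannelPool;

/* Reserve and open the first free slot; returns its index, or -1 if the
 * pool has no free slots left. */
int pool_acquire(ChannelPool *pool)
{
    assert(pool != NULL);
    for (int i = 0; i < MAX_SLOTS; i++) {
        if (!pool->slots[i].in_use) {
            pool->slots[i].in_use = 1;
            pool->slots[i].is_open = 1;
            return i;
        }
    }
    return -1;
}

/* Close only: the slot stays reserved (the leak pattern). */
void channel_close(ChannelPool *pool, int idx)
{
    assert(pool != NULL && idx >= 0 && idx < MAX_SLOTS);
    pool->slots[idx].is_open = 0;
}

/* Close the channel and hand the slot back to the pool. */
void channel_close_and_release(ChannelPool *pool, int idx)
{
    channel_close(pool, idx);
    pool->slots[idx].in_use = 0;
}

/* Simulate worker threads starting and exiting one after another.
 * Returns how many of them managed to obtain a channel. */
int simulate_workers(int lifetimes, int release_on_exit)
{
    ChannelPool pool;
    int started = 0;

    memset(&pool, 0, sizeof pool);
    for (int n = 0; n < lifetimes; n++) {
        int idx = pool_acquire(&pool);
        if (idx < 0)
            break;  /* pool exhausted: no further workers can start */
        started++;
        if (release_on_exit)
            channel_close_and_release(&pool, idx);
        else
            channel_close(&pool, idx);
    }
    return started;
}
```

With release_on_exit set to 0, simulate_workers stops after MAX_SLOTS workers
no matter how many lifetimes are requested; with it set to 1, every worker
gets a channel.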


Version 0.12:
=============
- Implemented the auditWorkerJobs directive in the server's configuration
file. This feature allows you to track which worker nodes are executing
which service checks.
- Fixed negative job counter issue in client.
- Added debugging level support for the server. Setting the debug flag in
the server config file to any positive integer enables debugging. The
higher the integer, the more verbose the debugging output.
- The server module no longer writes messages to nagios.log. All server
module messages are now written to syslog.
- Cleaned up memory leaks in both server and client.
- Fixed a nasty corner case where a job might be expired and collected at
the same time, causing heap corruption because the job structure memory
was freed twice. Even though this race condition was properly protected by
a semaphore, the expiration thread didn't mark the expired job as removed
from the global job queue. Hence, the collection thread might acquire the
semaphore right after the expiration thread released it, and therefore
still see the expired job as active. The job would then be "collected",
even though it was already "expired".
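
To make the expire/collect problem above concrete, here is a minimal
single-threaded model of it. It is a sketch, not the DNX code: Job, JobState,
expire_job, collect_job and frees_after_expire_then_collect are hypothetical
names, the semaphore is only indicated by comments, and a counter stands in
for the actual free() call so the double free can be observed safely.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { JOB_ACTIVE, JOB_REMOVED } JobState;

typedef struct {
    JobState state;   /* what other threads see in the job queue */
    int free_count;   /* stand-in for free(): counts releases of the job */
} Job;

/* Expiration path. With mark_removed == 0 this reproduces the bug:
 * the job is released but still looks active to the collector. */
void expire_job(Job *job, int mark_removed)
{
    assert(job != NULL);
    /* (queue semaphore would be held from here ...) */
    if (job->state == JOB_ACTIVE) {
        if (mark_removed)
            job->state = JOB_REMOVED;
        job->free_count++;
    }
    /* (... until here) */
}

/* Collection path. Returns 1 if the job was collected, 0 if it was
 * already gone. */
int collect_job(Job *job)
{
    assert(job != NULL);
    /* (queue semaphore would be held from here ...) */
    if (job->state != JOB_ACTIVE)
        return 0;
    job->state = JOB_REMOVED;
    job->free_count++;
    /* (... until here) */
    return 1;
}

/* Run the unlucky ordering: expiration first, collection right after.
 * Returns how many times the job was released. */
int frees_after_expire_then_collect(int mark_removed)
{
    Job job = { JOB_ACTIVE, 0 };

    expire_job(&job, mark_removed);
    collect_job(&job);
    return job.free_count;
}
```

The locking is correct in both variants; what differs is whether the state
change is recorded while the lock is held. Without the mark, the job is
released twice; with it, once.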


Version 0.11:
=============
- Implemented the localCheckPattern directive in the server's configuration
file. This permits you to specify an extended regular expression string which
will be used to decide whether a check command job should execute locally
(instead of being sent to a DNX client).
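
As an illustration of this kind of matching, the sketch below uses the POSIX
regcomp(3)/regexec(3) calls with REG_EXTENDED. The function name
should_run_locally is hypothetical, and whether DNX matches the pattern
against the full command line or only part of it is an assumption here.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Returns 1 if command matches the extended regular expression in pattern
 * (run locally), 0 if it does not (send to a client), and -1 if the
 * pattern fails to compile. */
int should_run_locally(const char *pattern, const char *command)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && command != NULL);

    /* REG_NOSUB: only a yes/no answer is needed, no match offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&re, command, 0, NULL, 0);
    regfree(&re);

    return rc == 0 ? 1 : 0;
}
```

For example, with the made-up pattern "^(check_dummy|check_local_.*)", a
command starting with check_local_disk would stay on the server, while
check_ping would be dispatched to a client.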


Version 0.10:
=============
- Fixed improper XML parsing of command or response values, where the
command/response contains embedded XML tags (or even just embedded
angle-brackets). This fix affects both the server and the client,
since they share the common XML parsing routines contained in dnxXml.c.
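
The changelog does not say how dnxXml.c handles embedded angle brackets. One
common way to keep such values from confusing a simple tag scanner is to
escape them before they are placed inside an element. The sketch below shows
that technique only; xml_escape is a hypothetical helper and not a function
from the DNX source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst, replacing &, < and > with XML entities.
 * Returns 0 on success, or -1 if dst (of size dstsz) is too small. */
int xml_escape(const char *src, char *dst, size_t dstsz)
{
    size_t used = 0;

    assert(src != NULL && dst != NULL);

    for (; *src != '\0'; src++) {
        char one[2] = { *src, '\0' };
        const char *rep;
        size_t len;

        switch (*src) {
        case '&': rep = "&amp;"; break;
        case '<': rep = "&lt;";  break;
        case '>': rep = "&gt;";  break;
        default:  rep = one;     break;
        }

        len = strlen(rep);
        if (used + len + 1 > dstsz)
            return -1;  /* no room for this piece plus the terminator */
        memcpy(dst + used, rep, len);
        used += len;
    }

    if (dstsz == 0)
        return -1;
    dst[used] = '\0';
    return 0;
}
```

The receiving side would reverse the substitution after extracting the
element text, so a command such as one containing '<5' survives the round
trip unchanged.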
