NDOUtils-1.4b5 not working, NDOUtils-1.4b2 is fine

Alex Burger alex_b at users.sourceforge.net
Fri Sep 21 22:52:37 CEST 2007


With Nagios 2.8 and NDOUtils-1.4b5 configured with either a unix socket 
or TCP, I always get the following error when it attempts to log a new 
event to the database:

Sep 20 22:05:42 server1 nagios: ndomod: Error writing to data sink! Some output may get lost...

After about 15 seconds, I will then get the following (numbers change 
obviously):

Sep 20 22:05:58 server1 nagios: ndomod: Successfully reconnected to data sink!  0 items lost, 193 queued items to flush.
Sep 20 22:05:58 server1 nagios: ndomod: Successfully flushed 193 queued items to data sink.
Sep 20 22:05:58 server1 ndo2db: Successfully connected to MySQL database

The event that it attempted to log is not written to the database at 
that point.  The database usually gets updated with the information a 
few minutes later.  I thought at first that the change was being added 
15 seconds later, after the reconnect, but it usually takes a few 
minutes even if Nagios triggers more events in the meantime.

BTW, I am checking the host status in the database with:

select current_state, output from nagios_servicestatus where 
service_object_id IN
(select object_id from nagios_objects where name1 = 'testserver');

I switched back to 1.4b2 and everything works fine.  When a host changes 
state, the database is updated within seconds and there are no errors in 
the log.  I started with a fresh database for both versions.

I have confirmed that all of my 'Error writing to data sink!' errors are 
coming from line 776 of ndomod.c in ndomod_write_to_sink().  From what I 
can see it is never able to write the data to the socket after it's 
received.  It always has to put it in the buffer for later processing.
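
To make the behaviour I am describing concrete, here is a minimal 
standalone sketch of a "write now, or queue and flush after reconnect" 
pattern.  It is NOT the actual ndomod.c code: sink_t, sink_raw_write(), 
sink_write(), sink_flush() and QUEUE_CAPACITY are hypothetical names made 
up for illustration, and the real socket write is replaced by a stub.

```c
#include <stdlib.h>
#include <string.h>

#define QUEUE_CAPACITY 8  /* hypothetical; kept small for illustration */

/* Hypothetical sink state, not the real ndomod data structures. */
typedef struct {
    int connected;                /* 1 if the sink currently accepts writes */
    char *queue[QUEUE_CAPACITY];  /* copies of items held while the sink is down */
    size_t queued;                /* number of items waiting in the queue */
    size_t lost;                  /* items dropped because the queue was full */
    size_t written;               /* items successfully written to the sink */
} sink_t;

/* Stub standing in for the real socket write: fails while disconnected. */
static int sink_raw_write(sink_t *s, const char *item) {
    (void)item;
    if (!s->connected)
        return -1;
    s->written++;
    return 0;
}

/* Try to write immediately; on failure, queue a copy for later.
 * Returns 0 if written, 1 if queued, -1 if the item was lost. */
int sink_write(sink_t *s, const char *item) {
    char *copy;

    if (sink_raw_write(s, item) == 0)
        return 0;
    if (s->queued == QUEUE_CAPACITY) {
        s->lost++;
        return -1;
    }
    copy = malloc(strlen(item) + 1);
    if (copy == NULL) {
        s->lost++;
        return -1;
    }
    strcpy(copy, item);
    s->queue[s->queued++] = copy;
    return 1;
}

/* After a reconnect, flush queued items in order.
 * Returns the number of items flushed. */
size_t sink_flush(sink_t *s) {
    size_t flushed = 0;

    while (flushed < s->queued &&
           sink_raw_write(s, s->queue[flushed]) == 0) {
        free(s->queue[flushed]);
        s->queue[flushed] = NULL;
        flushed++;
    }
    /* Move any items that could not be flushed to the front of the queue. */
    memmove(s->queue, s->queue + flushed,
            (s->queued - flushed) * sizeof(char *));
    s->queued -= flushed;
    return flushed;
}
```

In terms of this sketch, what I see with 1.4b5 is that the direct write 
always fails, so every item takes the queue path and only goes out when 
the flush runs after the reconnect.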

I attempted to troubleshoot it by adding some additional log entries 
using ndomod_write_to_logs(), but most of the time I ended up with a 
segfault in Nagios.  For example, adding the following to the top of 
ndomod_write_to_sink() will result in a segfault in both 1.4b2 and 1.4b5:

  asprintf(&temp_buffer,"ndomod: Hello!");
  ndomod_write_to_logs(temp_buffer,NSLOG_INFO_MESSAGE);
  free(temp_buffer);
  temp_buffer=NULL;

What am I missing?

I am running Red Hat EL 4 x86_64:
Linux server1 2.6.9-42.0.10.ELsmp #1 SMP Fri Feb 16 17:13:42 EST 2007 x86_64 x86_64 x86_64 GNU/Linux

A user on the nagios-users list is having the same problem on both 
64-bit and 32-bit Linux running Nagios 2.9.

Alex

