Nagios 2.0a1 SIGSEGV problem solved - bug still present?

Tom DE BLENDE (GCC) Tom.DeBlende at dhl.com
Mon Sep 6 12:13:52 CEST 2004


Dear all,

Thanks to the help I received from Mr. Hopcroft, I managed to solve the 
SIGSEGV problems I had with Nagios 2.0a1. Here is the output I had from gdb:

[root@gcclo77 etc]# gdb ../bin/nagios
GNU gdb Red Hat Linux (6.0post-0.20040223.20rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain 
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host 
libthread_db library "/lib/tls/libthread_db.so.1".

(gdb) set args /usr/local/nagios/etc/nagios.cfg
(gdb) r
Starting program: /usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg
[Thread debugging using libthread_db enabled]
[New Thread -1220216704 (LWP 22830)]

Nagios 2.0a1
Copyright (c) 1999-2004 Ethan Galstad (nagios at nagios.org)
Last Modified: 11-18-2003
License: GPL

Nagios 2.0a1 starting... (PID=22830)
[New Thread -1220326480 (LWP 22839)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1220216704 (LWP 22830)]
0xb747ea6a in __strtol_internal () from /lib/tls/libc.so.6
(gdb) list
231                     {"version",no_argument,0,'V'},
232                     {"license",no_argument,0,'V'},
233                     {"verify",no_argument,0,'v'},
234                     {"daemon",no_argument,0,'d'},
235                     {0,0,0,0}
236             };
237     #endif
238
239             /* make sure we have the correct number of command line arguments */
240             if(argc<2)
(gdb) info stack
#0  0xb747ea6a in __strtol_internal () from /lib/tls/libc.so.6
#1  0x0806c49a in xrddefault_read_state_information (main_config_file=0x809b008 "/usr/local/nagios/etc/nagios.cfg") at stdlib.h:317
#2  0x08069d37 in read_initial_state_information (main_config_file=0x809b008 "/usr/local/nagios/etc/nagios.cfg") at sretention.c:99
#3  0x08052413 in main (argc=134852616, argv=0x809b008) at nagios.c:614
(gdb) bt full
#0  0xb747ea6a in __strtol_internal () from /lib/tls/libc.so.6
No symbol table info available.
#1  0x0806c49a in xrddefault_read_state_information (main_config_file=0x809b008 "/usr/local/nagios/etc/nagios.cfg") at stdlib.h:317
       temp_buffer = 
"state_history\0000\0000\0000\0000\0000\0000\0000\0000\0000\0000\0000\0000\0000\0000\0000\0000\0000\0000\0000\000\000\0000\000\000\000.170000 
ms)\000\000\000he published applications \"Microsoft Word 2000,Microsoft 
Excel 2000,Microsoft PowerPoint 2000,Microsoft Access 2000,Desktop,Inte"...
       temp_buffer2 = "\b°\t\b", '\0' <repeats 16 times>, 
"\001\000\000\000[F_·>º^·\027AF·_sta_ at F·P\000\000\000Lº^·t·^·\004³^·ì\001`·\t\000\000\000¨\237]·\020\005`·|\233]·þ)F·øÄÿ¿æ5_·\000\000\000\000[F_·>º^·,\017R·\000\000\000\000\000\000\000\000\230xX·\230xX·\000\000\000\000¤Êÿ¿«sI·\230Æÿ¿\000\000\000\000\001", 
'\0' <repeats 23 times>, 
"\220Êÿ¿Ü­I·XÆÿ¿´\201\000\000\001\000\000\000\000\000\000\000HÅÿ¿\002\000\000\000\001\000\000\000ÌÊÿ¿xÉÿ¿\000\000\000\000m\t\t"...
       temp_ptr = 0x0
       fp = (FILE *) 0x80a4358
       host_name = 0x809b290 "frangocopy"
       service_description = 0x809c538 "Backup"
       data_type = 4
       x = 19
       temp_host = (host *) 0x0
       temp_service = (service *) 0x82da188
       temp_command = (command *) 0xbfffc800
       var = 0xbfffc800 "state_history"
       val = 0xbfffc80e "0"
       current_time = 1094462915
       scheduling_info_is_ok = 0
#2  0x08069d37 in read_initial_state_information (main_config_file=0x809b008 "/usr/local/nagios/etc/nagios.cfg") at sretention.c:99
No locals.
#3  0x08052413 in main (argc=134852616, argv=0x809b008) at nagios.c:614
       result = 134852616
       error = 0
       buffer = "Nagios 2.0a1 starting... 
(PID=22830)\000\000\000\000Ñ\233_·`øD·\000\000\000\000 
\000\000\000\002\000\000\000ðõD·ÔüD·u._·\000 
]·MÐ\000\000\020\005`·|Îÿ¿\017Ì^·ì\001`·\224\b`·\000\000\000\000\000\000\000\000)Ú^·\000 
^·\021", '\0' <repeats 11 times>, "p\r`·", '\0' <repeats 32 times>, 
"àÍÿ¿\000\000\000\000\000\000\000\000OÚ_·<Ù_·>Ù_·ø\006`·\000\000\000\000ì\001`·¦\022Q\001=\v\003", 
'\0' <repeats 25 times>, "ø\006`·", '\0' <repeats 44 times>...
       display_license = 0
       display_help = 0
       c = -1219031296
       option_index = 0
       long_options = {{name = 0x8087a00 "help", has_arg = 0, flag = 
0x0, val = 104}, {name = 0x8088a90 "version", has_arg = 0,
   flag = 0x0, val = 86}, {name = 0x8087a05 "license", has_arg = 0, flag 
= 0x0, val = 86}, {name = 0x8087a0d "verify",
   has_arg = 0, flag = 0x0, val = 118}, {name = 0x8087a14 "daemon", 
has_arg = 0, flag = 0x0, val = 100}, {name = 0x0,
   has_arg = 0, flag = 0x0, val = 0}}
(gdb)


This led Mr. Hopcroft to believe that there might be a problem with the
retained state of the Backup service on frangocopy. I looked into the
retention.dat file and noticed that the entry for that service wasn't
properly formatted: the trailing } was missing. I tried adding it
manually, but that didn't work. Only after removing the entry entirely
from the file could I start Nagios again.
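
A small checker along the following lines could flag an entry like that
before Nagios reads the file. It is only a sketch: find_unclosed_block()
is a made-up helper, not part of Nagios, and it assumes that a block
opens on a line ending in "{" and closes on a line containing only "}".
A value line that happens to end in "{" would be reported wrongly, and
lines longer than the buffer are not handled.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of Nagios.
 * Returns 0 if every block is closed; otherwise returns the 1-based
 * line number on which the unterminated block was opened.
 * Assumes each line fits in the buffer. */
long find_unclosed_block(FILE *fp)
{
    char line[4096];
    long lineno = 0;
    long open_line = 0;              /* 0 means "not inside a block" */

    while (fgets(line, sizeof line, fp) != NULL) {
        size_t len;

        lineno++;

        /* cut the line at the first CR or LF */
        len = strcspn(line, "\r\n");
        line[len] = '\0';

        /* trim trailing spaces and tabs */
        while (len > 0 && (line[len - 1] == ' ' || line[len - 1] == '\t'))
            line[--len] = '\0';

        if (len > 0 && line[len - 1] == '{') {
            /* a new block starts while the previous one is still open */
            if (open_line != 0)
                return open_line;
            open_line = lineno;
        } else {
            /* a line holding only "}" closes the current block */
            const char *p = line + strspn(line, " \t");
            if (strcmp(p, "}") == 0)
                open_line = 0;
        }
    }

    /* non-zero here means the last block ran into end of file */
    return open_line;
}
```

Called with the retention file opened via fopen(), a non-zero result
gives the line where the damaged entry starts. It would only catch the
missing brace; since adding the brace back did not help here, the entry
was evidently damaged in other ways as well.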

I don't know how the retention file got corrupted like that, but a
malformed entry probably shouldn't cause a segfault, which is why I'm
posting this. People more skilled than I am can now look into it and
change the code where appropriate.
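
Frame #1 of the backtrace shows temp_ptr = 0x0 with var = "state_history",
and the crash is inside __strtol_internal. One reading, not checked
against the Nagios source, is that a tokenizer ran out of values on the
state_history line and the resulting NULL was handed to a strtol-based
conversion. The sketch below shows the kind of guard that would turn
that into a parse error instead of a crash. Both function names are
made up for illustration.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert one token to int without ever passing
 * NULL to strtol(). Returns 0 on success, -1 if the token is missing,
 * empty, out of range, or not a clean decimal integer. */
int parse_int_token(const char *tok, int *out)
{
    char *end = NULL;
    long v;

    if (tok == NULL || *tok == '\0')
        return -1;

    errno = 0;
    v = strtol(tok, &end, 10);
    if (errno != 0 || end == tok || *end != '\0' || v < INT_MIN || v > INT_MAX)
        return -1;

    *out = (int)v;
    return 0;
}

/* Hypothetical helper: parse up to max comma-separated integers from
 * val, which strtok() modifies in place. Returns the number of values
 * stored, or -1 if val is NULL or any token is malformed. The caller
 * decides what to do when fewer values than expected come back. */
int parse_history(char *val, int *dest, int max)
{
    int n = 0;
    char *tok;

    if (val == NULL)
        return -1;

    for (tok = strtok(val, ",\r\n"); tok != NULL && n < max;
         tok = strtok(NULL, ",\r\n")) {
        if (parse_int_token(tok, &dest[n]) != 0)
            return -1;
        n++;
    }
    return n;
}
```

With a guard like this, a truncated line simply yields fewer values than
the caller asked for, and the caller can skip or reset that entry rather
than dereference a NULL pointer.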

Kind regards,
Tom

-- 
Tom De Blende
Senior Infrastructure Analyst
DHL European Coordination Center - IT Department
Tel +32 2 713 42 62        
Fax +32 2 713 52 00


