Nagios 3.2.0 SIGSEGV

Jelle Smet nagios at smetj.net
Fri Apr 30 23:36:25 CEST 2010



Hi List,

I'm running Nagios 3.2.0, compiled from source, on SLES 10 SP2. The installation is pretty vanilla, except that we make use of MK Livestatus.

We experience a Nagios crash quite regularly. At first glance it looks like Nagios works as it should: the CGIs respond, but no more jobs are scheduled and the number of hosts and services is at zero.

I was wondering whether anyone has had the same experience? Any tips and feedback would be appreciated.

This is what our log file looks like (I have replaced hostnames with XXXXXXXXXX):
...snip....
[1272660915] livestatus: Query: Filter: name = be-gen-wap-02
[1272660915] livestatus: Query: OutputFormat:json
[1272660915] livestatus: Query: KeepAlive: on
[1272660915] livestatus: Query: ResponseHeader: fixed16
[1272660915] livestatus: Time to process request: 81 us. Size of answer: 257 bytes
[1272660915] livestatus: Query: GET services
[1272660915] livestatus: Query: Filter: host_name = XXXXXXXXXX)
[1272660915] livestatus: Query: Columns: description display_name state host_alias host_address plugin_output notes last_check next_check state_type current_attempt max_check_attempts last_state_change last_hard_state_change perf_data scheduled_downtime_depth acknowledged host_acknowledged host_scheduled_downtime_depth has_been_checked
[1272660915] livestatus: Query: OutputFormat:json
[1272660915] livestatus: Query: KeepAlive: on
[1272660915] livestatus: Query: ResponseHeader: fixed16
[1272660915] livestatus: Time to process request: 10 us. Size of answer: 312 bytes
[1272660915] livestatus: Query: GET hosts
[1272660915] livestatus: Query: Columns: state plugin_output alias display_name address notes last_check next_check state_type current_attempt max_check_attempts last_state_change last_hard_state_change statusmap_image perf_data acknowledged scheduled_downtime_depth has_been_checked state
[1272660915] livestatus: Query: Filter: name = be-gen-wap-01
[1272660915] livestatus: Query: OutputFormat:json
[1272660915] livestatus: Query: KeepAlive: on
[1272660915] livestatus: Query: ResponseHeader: fixed16
[1272660915] livestatus: Time to process request: 81 us. Size of answer: 257 bytes
[1272660915] livestatus: Query: GET services
[1272660915] livestatus: Query: Filter: host_name = XXXXXXXXXX)
[1272660915] livestatus: Query: Columns: description display_name state host_alias host_address plugin_output notes last_check next_check state_type current_attempt max_check_attempts last_state_change last_hard_state_change perf_data scheduled_downtime_depth acknowledged host_acknowledged host_scheduled_downtime_depth has_been_checked
[1272660915] livestatus: Query: OutputFormat:json
[1272660915] livestatus: Query: KeepAlive: on
[1272660915] livestatus: Query: ResponseHeader: fixed16
[1272660915] livestatus: Time to process request: 9 us. Size of answer: 312 bytes
[1272660939] SERVICE ALERT: XXXXXXXXXX;Procs - Default;UNKNOWN;SOFT;1;CHECK_NRPE: Socket timeout after 30 seconds.
[1272660959] SERVICE ALERT: XXXXXXXXXX;Procs - Default;OK;SOFT;2;CHECK_PROCS_MULTI OK - all processes OK
[1272660969] SERVICE ALERT: XXXXXXXXXX;CPU - Usage;OK;SOFT;2;OK - CPU0 'Load Percentage' = 4: OK - _Total 'Load Percentage' = 4:
[1272660979] SERVICE ALERT: XXXXXXXXXX;CPU - Usage;OK;SOFT;3;OK - CPU0 'Load Percentage' = 30: OK - CPU1 'Load Percentage' = 2: OK - _Total 'Load Percentage' = 16:
[1272660999] SERVICE ALERT: XXXXXXXXXX;CPU - Usage;UNKNOWN;SOFT;2;Unknown - LoadPercentage cannot be determined
[1272661034] livestatus: Query: GET hosts
[1272661034] livestatus: Query: Columns: childs
[1272661034] Caught SIGSEGV, shutting down...
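
For anyone who wants to check their own logs for the same pattern, here is a minimal Python sketch that reports which Livestatus query was still unanswered when the crash marker was written. It assumes that each log entry sits on a single line in the "[epoch] message" layout seen above, and that a finished query is always followed by a "Time to process request" line. The function name and the SAMPLE list are made up for illustration; they are not part of Nagios or Livestatus.

```python
import re

# Matches the "[epoch] message" layout of the log entries shown above.
ENTRY_RE = re.compile(r"^\[(\d+)\]\s+(.*)$")
QUERY_PREFIX = "livestatus: Query: "


def query_in_flight_at_crash(lines, marker="Caught SIGSEGV"):
    """Return the Livestatus query lines logged before `marker` that were
    never followed by a "Time to process request" line, or None if the
    marker does not occur at all."""
    pending = []
    for raw in lines:
        match = ENTRY_RE.match(raw.strip())
        if match is None:
            continue  # not a timestamped log entry
        message = match.group(2)
        if marker in message:
            return pending
        if message.startswith(QUERY_PREFIX):
            header = message[len(QUERY_PREFIX):]
            if header.startswith("GET "):
                pending = []  # a new query begins
            pending.append(header)
        elif message.startswith("livestatus: Time to process request"):
            pending = []  # the previous query was answered
    return None


# Hypothetical excerpt, shortened from the log above.
SAMPLE = [
    "[1272660915] livestatus: Query: GET services",
    "[1272660915] livestatus: Query: KeepAlive: on",
    "[1272660915] livestatus: Time to process request: 9 us. Size of answer: 312 bytes",
    "[1272660999] SERVICE ALERT: XXXXXXXXXX;CPU - Usage;UNKNOWN;SOFT;2;Unknown - LoadPercentage cannot be determined",
    "[1272661034] livestatus: Query: GET hosts",
    "[1272661034] livestatus: Query: Columns: childs",
    "[1272661034] Caught SIGSEGV, shutting down...",
]

print(query_in_flight_at_crash(SAMPLE))
```

On the excerpt this prints the "GET hosts" / "Columns: childs" pair, which is the last query logged before the SIGSEGV. That is only a correlation in the log, not proof that this query causes the crash.
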

The debug file looks like this:
[1272661033.237415] [064.1] [pid=9031] Making callbacks (type 13)...
[1272661033.237436] [064.2] [pid=9031] Callback #1 (type 13) return code = 0
[1272661033.239257] [064.1] [pid=9031] Making callbacks (type 8)...
[1272661033.239289] [064.1] [pid=9031] Making callbacks (type 13)...
[1272661033.239297] [064.2] [pid=9031] Callback #1 (type 13) return code = 0
[1272661033.239356] [064.1] [pid=9031] Making callbacks (type 13)...
[1272661033.239362] [064.2] [pid=9031] Callback #1 (type 13) return code = 0
[1272661033.241357] [064.1] [pid=9031] Making callbacks (type 8)...
[1272661033.241390] [064.1] [pid=9031] Making callbacks (type 13)...
[1272661033.241399] [064.2] [pid=9031] Callback #1 (type 13) return code = 0
[1272661033.241433] [064.1] [pid=9031] Making callbacks (type 13)...
[1272661033.241438] [064.2] [pid=9031] Callback #1 (type 13) return code = 0
[1272661033.243672] [064.1] [pid=9031] Making callbacks (type 8)...
[1272661033.243719] [064.1] [pid=9031] Making callbacks (type 13)...
[1272661033.243730] [064.2] [pid=9031] Callback #1 (type 13) return code = 0
[1272661033.243778] [064.1] [pid=9031] Making callbacks (type 13)...
[1272661033.243785] [064.2] [pid=9031] Callback #1 (type 13) return code = 0
[1272661033.245368] [064.1] [pid=9031] Making callbacks (type 8)...
[1272661033.245401] [064.1] [pid=9031] Making callbacks (type 13)...
[1272661033.245410] [064.2] [pid=9031] Callback #1 (type 13) return code = 0
[1272661033.245445] [064.1] [pid=9031] Making callbacks (type 13)...
[1272661033.245449] [064.2] [pid=9031] Callback #1 (type 13) return code = 0
[1272661033.248076] [064.1] [pid=9031] Making callbacks (type 8)...
[1272661033.248111] [064.1] [pid=9031] Making callbacks (type 13)...
[1272661033.248120] [064.2] [pid=9031] Callback #1 (type 13) return code = 0
[1272661033.248182] [064.1] [pid=9031] Making callbacks (type 13)...
[1272661033.248187] [064.2] [pid=9031] Callback #1 (type 13) return code = 0
[1272661033.249484] [064.1] [pid=9031] Making callbacks (type 8)...
[1272661033.249514] [064.1] [pid=9031] Making callbacks (type 13)...
[1272661033.249523] [064.2] [pid=9031] Callback #1 (type 13) return code = 0
[1272661033.249566] [064.1] [pid=9031] Making callbacks (type 13)...
[1272661033.249571] [064.2] [pid=9031] Callback #1 (type 13) return code = 0
[1272661033.251416] [064.1] [pid=9031] Making callbacks (type 8)...
[1272661033.505428] [064.1] [pid=9031] Making callbacks (type 8)...
[1272661033.757384] [064.1] [pid=9031] Making callbacks (type 8)...
[1272661034.009381] [064.1] [pid=9031] Making callbacks (type 8)...
[1272661034.009432] [064.1] [pid=9031] Making callbacks (type 13)...
[1272661034.009442] [064.2] [pid=9031] Callback #1 (type 13) return code = 0
[1272661034.009510] [064.1] [pid=9031] Making callbacks (type 13)...
[1272661034.009515] [064.2] [pid=9031] Callback #1 (type 13) return code = 0
[1272661034.011394] [064.1] [pid=9031] Making callbacks (type 8)...
[1272661034.011426] [064.1] [pid=9031] Making callbacks (type 13)...
[1272661034.011435] [064.2] [pid=9031] Callback #1 (type 13) return code = 0
[1272661034.011482] [064.1] [pid=9031] Making callbacks (type 13)...
[1272661034.011487] [064.2] [pid=9031] Callback #1 (type 13) return code = 0
[1272661034.013339] [064.1] [pid=9031] Making callbacks (type 8)...
[1272661034.013376] [064.1] [pid=9031] Making callbacks (type 13)...
[1272661034.013385] [064.2] [pid=9031] Callback #1 (type 13) return code = 0
[1272661034.013454] [064.1] [pid=9031] Making callbacks (type 13)...
[1272661034.013459] [064.2] [pid=9031] Callback #1 (type 13) return code = 0
[1272661034.014324] [064.1] [pid=9031] Making callbacks (type 8)...
[1272661034.014352] [064.1] [pid=9031] Making callbacks (type 13)...
[1272661034.014361] [064.2] [pid=9031] Callback #1 (type 13) return code = 0
[1272661034.014409] [064.1] [pid=9031] Making callbacks (type 13)...
[1272661034.014414] [064.2] [pid=9031] Callback #1 (type 13) return code = 0
[1272661034.017539] [064.1] [pid=9031] Making callbacks (type 8)...
[1272661034.017579] [064.1] [pid=9031] Making callbacks (type 13)...
[1272661034.017589] [064.2] [pid=9031] Callback #1 (type 13) return code = 0
[1272661034.017679] [064.1] [pid=9031] Making callbacks (type 13)...
[1272661034.017687] [064.2] [pid=9031] Callback #1 (type 13) return code = 0
[1272661034.019850] [064.1] [pid=9031] Making callbacks (type 8)...
[1272661034.019903] [064.1] [pid=9031] Making callbacks (type 13)...
[1272661034.019916] [064.2] [pid=9031] Callback #1 (type 13) return code = 0
[1272661034.019951] [064.1] [pid=9031] Making callbacks (type 9)...
[1272661034.019967] [064.2] [pid=9031] Callback #1 (type 9) return code = 0
[1272661034.020002] [064.1] [pid=9031] Making callbacks (type 13)..

--


Jelle Smet
http://www.smetj.net