A second 2.0b3 SEGV after exiting scheduled downtime.

Stanley Hopcroft Stanley.Hopcroft at IPAustralia.Gov.AU
Sat Apr 30 12:19:01 CEST 2005


Dear Folks,

I am writing to report another SEGV in 2.0b3 after exiting from a period 
of scheduled downtime:

[1114822965] SERVICE DOWNTIME ALERT: wins;Secondary DNS - Internet 
names;STOPPED; Service has exited from a period of scheduled downtime
[1114822965] SERVICE DOWNTIME ALERT: tsitc;Secondary DNS - Internet 
names;STOPPED; Service has exited from a period of scheduled downtime
[1114822965] SERVICE DOWNTIME ALERT: tsitc2;Secondary DNS - Internet 
names;STOPPED; Service has exited from a period of scheduled downtime
[1114822965] SERVICE DOWNTIME ALERT: networks2;Primary DNS - Internet 
names;STOPPED; Service has exited from a period of scheduled downtime
[1114822965] SERVICE DOWNTIME ALERT: VirtualDNS;Corporate DNS - Internet 
names;STOPPED; Service has exited from a period of scheduled downtime
[1114822965] SERVICE DOWNTIME ALERT: xeno;Isys text search of TM Goods 
and Services;STOPPED; Service has exited from a period of scheduled 
downtime
[1114822965] SERVICE DOWNTIME ALERT: pericles;ATMOSS via the firewall 
provider infrastructure;STOPPED; Service has exited from a period of 
scheduled downtime
[1114822965] SERVICE DOWNTIME ALERT: pericles;AUB via the firewall 
provider infrastructure;STOPPED; Service has exited from a period of 
scheduled downtime
[1114822965] SERVICE DOWNTIME ALERT: pericles;Online Services via the 
firewall provider infrastructure;STOPPED; Service has exited from a 
period of scheduled downtime
[1114822965] SERVICE DOWNTIME ALERT: pericles;www.IPAustralia.Gov.AU via 
the firewall provider infrastructure;STOPPED; Service has exited from a 
period of scheduled downtime
[1114822965] SERVICE DOWNTIME ALERT: squid;Secondary DNS - Internet 
names;STOPPED; Service has exited from a period of scheduled downtime
[1114822965] SERVICE DOWNTIME ALERT: cbradc01;AD Domain controller name 
resolution;STOPPED; Service has exited from a period of scheduled 
downtime
[1114822965] SERVICE DOWNTIME ALERT: cbradc02;AD Domain controller name 
resolution;STOPPED; Service has exited from a period of scheduled 
downtime

# after noticing that Nagios had stopped and taking action

[1114825380] Nagios 2.0b3 starting... (PID=70165)
[1114825380] LOG VERSION: 2.0
[1114825381] Finished daemonizing... (New PID=70166)

Apr 30 11:02:45 tsitc /kernel: pid 21714 (nagios), uid 1000: exited on 
signal 11

Since 1114822965 is 11:02:45, the SEGV coincides with the last 'exited 
from a period of scheduled downtime' message.

(tsitc> perl -e 'print localtime(1114822965)."\n"'
Sat Apr 30 11:02:45 2005

signal 11 on FreeBSD 4.11 RELEASE is #define SIGSEGV         11      
/* segmentation violation */ )
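
The same check can be repeated away from the Nagios host. This is a minimal sketch, not part of the original report: it assumes the host's local time was UTC+10, which is inferred from the perl output above rather than stated anywhere, and the helper name local_stamp is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the Nagios host ran on UTC+10. This is inferred from the perl
# output above (1114822965 -> 11:02:45), not taken from any configuration.
LOCAL = timezone(timedelta(hours=10))


def local_stamp(epoch: int) -> str:
    """Format a Nagios log timestamp the way perl's localtime() prints it."""
    return datetime.fromtimestamp(epoch, LOCAL).strftime("%a %b %d %H:%M:%S %Y")


last_exit = local_stamp(1114822965)  # last 'exited from' message
restart = local_stamp(1114825380)    # 'Nagios 2.0b3 starting...' after the crash

print(last_exit)
print(restart)
```

Under that assumption the first value matches the kernel's "Apr 30 11:02:45" line, and the second gives the wall-clock time of the manual restart.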


Here are the scheduled downtime log messages for some of the 'exited from'
messages.

[1114817073] EXTERNAL COMMAND: SCHEDULE_SVC_DOWNTIME;cbradc01;AD Domain 
controller name resolution;1114817061;1114824261;1;0;86400;Stanley 
Hopcroft;ICON maintenance. No access to cache, email, DNS etc
[1114817073] SERVICE DOWNTIME ALERT: cbradc01;AD Domain controller name 
resolution;STARTED; Service has entered a period of scheduled downtime
[1114817080] EXTERNAL COMMAND: SCHEDULE_SVC_DOWNTIME;cbradc02;AD Domain 
controller name resolution;1114817061;1114824261;1;0;86400;Stanley 
Hopcroft;ICON maintenance. No access to cache, email, DNS etc
[1114817080] SERVICE DOWNTIME ALERT: cbradc02;AD Domain controller name 
resolution;STARTED; Service has entered a period of scheduled downtime

I may have filled out the form incorrectly. The intent was to 
schedule 24 hours of downtime, but I failed to enter the end date 
corresponding to start + 24 hours; the SCHEDULE_SVC_DOWNTIME messages 
show the default end time of now + 2 hours.
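
The mismatch is easy to see once the command is split into fields. The following is a minimal sketch, assuming the documented field order for this external command (host, service, start, end, fixed, trigger id, duration, author, comment). The SvcDowntime type and parse_svc_downtime helper are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class SvcDowntime(NamedTuple):
    host: str
    service: str
    start: int       # epoch seconds
    end: int         # epoch seconds
    fixed: bool      # True = fixed downtime, False = flexible
    trigger_id: int
    duration: int    # seconds
    author: str
    comment: str


def parse_svc_downtime(cmd: str) -> SvcDowntime:
    """Split a SCHEDULE_SVC_DOWNTIME command (hypothetical helper)."""
    # maxsplit=9 keeps any ';' inside the free-text comment intact.
    parts = cmd.split(";", 9)
    if len(parts) != 10 or parts[0] != "SCHEDULE_SVC_DOWNTIME":
        raise ValueError("not a complete SCHEDULE_SVC_DOWNTIME command")
    _, host, service, start, end, fixed, trigger, duration, author, comment = parts
    return SvcDowntime(host, service, int(start), int(end), fixed == "1",
                       int(trigger), int(duration), author, comment)


# Command text copied from the log excerpt above (line wrapping removed).
first = parse_svc_downtime(
    "SCHEDULE_SVC_DOWNTIME;cbradc01;AD Domain controller name resolution;"
    "1114817061;1114824261;1;0;86400;Stanley Hopcroft;"
    "ICON maintenance. No access to cache, email, DNS etc"
)

window_hours = (first.end - first.start) / 3600
duration_hours = first.duration / 3600
print(first.fixed, window_hours, duration_hours)
```

For this command the start-to-end window is 2 hours and the duration field is 24 hours, with the fixed flag set: the 24 hours went into the duration field while the end time stayed at the form default.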

This is the second time a SEGV has occurred under the same circumstances: 
quickly scheduling a large number of downtimes (in this case for 
multiple services in different hostgroups, with no service groups, that 
are all affected by infrastructure maintenance).

Since I had to reschedule the downtime (after the restart), it 
should be interesting to see what happens when they exit scheduled 
downtime in about 12 hours' time. These downtimes were scheduled by:

Sat Apr 30 11:53:35 EXTERNAL COMMAND: SCHEDULE_SVC_DOWNTIME;cbradc02;AD 
Domain controller name resolution;1114825998;1114919598;0;0;7200;Stanley 
Hopcroft;ICON maintenance. No access to cache, email, DNS etc

Sat Apr 30 11:53:45 EXTERNAL COMMAND: SCHEDULE_SVC_DOWNTIME;cbradc01;AD 
Domain controller name resolution;1114825998;1114919598;0;0;7200;Stanley 
Hopcroft;ICON maintenance. No access to cache, email, DNS etc

These are pretty lame efforts too: end == start + 24 hours (at least), 
but entered as 'flexible' downtime (?). I think that was a mistake, since I 
wanted the downtime period to be 24 hours.
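
To compare the two attempts, here is a minimal sketch of how the fixed flag changes the expected length of a downtime. It assumes the behaviour described in the Nagios documentation: a fixed downtime runs from start to end, while a flexible one begins when the service goes non-OK inside the start/end window and then lasts for the duration field. The function effective_downtime is a hypothetical helper, not a Nagios API.

```python
def effective_downtime(start: int, end: int, fixed: bool, duration: int):
    """Return (kind, seconds the service is expected to spend in downtime).

    Assumption: fixed downtime covers the whole start..end window; flexible
    downtime lasts `duration` seconds once it has been triggered.
    """
    if fixed:
        return ("fixed", end - start)
    return ("flexible", duration)


# Values copied from the SCHEDULE_SVC_DOWNTIME commands quoted in this message.
original = effective_downtime(1114817061, 1114824261, True, 86400)
rescheduled = effective_downtime(1114825998, 1114919598, False, 7200)
rescheduled_window_hours = (1114919598 - 1114825998) / 3600

print(original, rescheduled, rescheduled_window_hours)
```

Under that assumption both attempts give 2 hours of downtime: the first because the end time was only 2 hours after the start, the second because the downtime is flexible with a 7200 second duration, even though its window spans 26 hours.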

Yours sincerely,

-- 
Stanley Hopcroft
