Event Handlers are not running or logging. (on WARNING or CRITICAL)

Bruce bruce at webfarm.co.nz
Thu Sep 2 00:26:39 CEST 2004


Hi,

I think my email is not working correctly because I'm not getting 
responses to my questions until I post a follow-up (very weird).

Has anyone had any thoughts on my findings below?

Just to refresh the issue:
Originally I thought event handlers were not running. However, I have 
since found that they are running, but only when a service check returns 
OK after having been in another state. This is not very useful, since an 
event handler should be fixing problems as they occur, not trying to fix 
them after they have been fixed manually. I've included a log file from 
one host/service that shows the problem (quoted below) so that people can 
see what I mean.

Any thoughts would be appreciated,

-- 
+------------------------------------------+      \|||/
| Bruce at WebFarm.co.nz       +64 06 7572881 |      (o o)
| Systems Technician                       +---ooO-(_)-Ooo---+
|                                                            |
| WebFarm                           http://www.webfarm.co.nz |
| FreeParking                   http://www.freeparking.co.nz |
+------------------------------------------------------------+

... FreeParking - NZ's best value Domain, WebHosting and email accounts - bar none 
... WebFarm - NZ's eCommerce specialists since 1997 




bruce wrote:

>Hi,
>
>I've done a little more testing and it appears the event handlers ARE
>running, but only when the state changes to OK, which of course is no use
>for fixing the problem.
>
>Below is the nagios.log file from one of the live systems (well, the result
>of: egrep 'creeper.*Defun' var/nagios.log). freshclam seems
>to be running on all the servers, but the Defunct processes check does get
>some results. The nagios configs are exactly the same for these as well (the
>command sends fixdefuncts.sh instead of restartFreshClam.sh and that's the
>only difference).
>
>-- 8<-- nagios.log
>[1093669850] SERVICE ALERT: creeper;Defuncts;OK;HARD;1;OK - 5 processes running with STATE = Z
>[1093670146] SERVICE ALERT: creeper;Defuncts;WARNING;HARD;1;WARNING - 6 processes running with STATE = Z
>[1093673451] SERVICE ALERT: creeper;Defuncts;WARNING;HARD;1;WARNING - 7 processes running with STATE = Z
>[1093677052] SERVICE ALERT: creeper;Defuncts;WARNING;HARD;1;WARNING - 8 processes running with STATE = Z
>[1093680652] SERVICE ALERT: creeper;Defuncts;WARNING;HARD;1;WARNING - 10 processes running with STATE = Z
>[1093684251] SERVICE ALERT: creeper;Defuncts;WARNING;HARD;1;WARNING - 10 processes running with STATE = Z
>[1093685900] SERVICE ALERT: creeper;Defuncts;CRITICAL;HARD;1;CRITICAL - 11 processes running with STATE = Z
>[1093687852] SERVICE ALERT: creeper;Defuncts;CRITICAL;HARD;1;CRITICAL - 11 processes running with STATE = Z
>[1093691451] SERVICE ALERT: creeper;Defuncts;CRITICAL;HARD;1;CRITICAL - 13 processes running with STATE = Z
>[1093695059] SERVICE ALERT: creeper;Defuncts;CRITICAL;HARD;1;CRITICAL - 15 processes running with STATE = Z
>[1093696438] SERVICE ALERT: creeper;Defuncts;OK;HARD;1;OK - 0 processes running with STATE = Z
>[1093696438] SERVICE EVENT HANDLER: creeper;Defuncts;OK;HARD;1;allserver_defunct_fix
>[1093696516] SERVICE ALERT: creeper;Defuncts;OK;HARD;1;OK - 0 processes running with STATE = Z
>[1093696624] SERVICE ALERT: creeper;Defuncts;OK;HARD;1;OK - 0 processes running with STATE = Z
>[1093696673] SERVICE ALERT: creeper;Defuncts;OK;HARD;1;OK - 0 processes running with STATE = Z
>[1093697080] SERVICE ALERT: creeper;Defuncts;OK;HARD;1;OK - 1 processes running with STATE = Z
>-- 8<-- End nagios.log
>
>As you can see, it goes through the motions: OK => WARNING => CRITICAL =>
>OK (when we manually restart the offending process on the server; yes,
>the better fix would be to fix the process, but we are still investigating
>why it happens :( very weird, but a different issue).
>
>When changing from OK => WARNING it doesn't run the event handler; it only
>runs when the state goes back to OK.
>
>If I change the event handler's args to be a static CRITICAL, the handler
>logs in and does the restart, so everything is fine there.
>
>Here are the related config sections just for reference of this command
>and service:
>
>define service {
>        use                            hosted
>        service_description            Defuncts
>        check_command                  serv_check_zombie_procs
>
>        event_handler                  allserver_defunct_fix
>        event_handler_enabled          1
>        hostgroup_name                 shared
>}
>define command {
>        command_name                   allserver_defunct_fix
>        command_line                   $USER1$/fix-w-allserver.sh $HOSTADDRESS$ $SERVICESTATE$ $SERVICEATTEMPT$ defunctFix.sh
>}
>
>
>Any thoughts or suggestions?
>
>Cheers,
>  
>
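
For reference, here is a minimal sketch of how a wrapper like 
fix-w-allserver.sh might dispatch on the arguments passed by the 
command_line quoted above. The real script is not shown in this thread, 
so the function name decide_action, the messages it prints, and the 
policy of acting only on CRITICAL are all assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical sketch only: the real fix-w-allserver.sh is not shown here.
# Argument order follows the command_line in the quoted config:
#   $1 = $HOSTADDRESS$  $2 = $SERVICESTATE$  $3 = $SERVICEATTEMPT$  $4 = fix script

# Print the action the handler would take for a given state (assumed policy).
decide_action() {
    host="$1"; state="$2"; attempt="$3"; fix="$4"
    case "$state" in
        OK)
            echo "noop: $host is OK, nothing to fix"
            ;;
        WARNING)
            echo "noop: $host is WARNING (attempt $attempt), waiting"
            ;;
        CRITICAL)
            echo "run: $fix on $host"
            ;;
        *)
            echo "noop: unhandled state '$state' for $host"
            ;;
    esac
}

# Only act when called with the full argument list, e.g. by Nagios.
if [ "$#" -ge 4 ]; then
    decide_action "$@"
fi
```

Running it by hand with a static state, for example 
"sh handler.sh 10.0.0.1 CRITICAL 1 defunctFix.sh" (file name and address 
made up), exercises the same path as the static CRITICAL test described 
above, which helps separate a problem in the script from Nagios never 
calling it on WARNING or CRITICAL.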



_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
::: Messages without supporting info will risk being sent to /dev/null




