[naemon-users] Weird issue with notification_interval (possible bug ?)

VERVAET Frederik (ITS/IIS) frederik.vervaet at proximus.com
Tue Sep 22 18:46:28 CEST 2015


Hi,

I think I have found the issue. A forgotten "notifies = no" (how that happened is a mystery) in one poller's configuration caused that poller to notify before the master, which of course fails because the notification scripts are not present on the poller. As such, Merlin behaved properly, which caused the master not to renotify.
I corrected it, and so far it seems to renotify properly. Since only one poller was affected, that would also explain why the issue came and went (depending on which poller executed the check).

Frederik Vervaet
UNIX Monitoring (HES) | ITS/IIS
http://mon.bc

[Proximus]<http://www.proximus.be/>

Connect with us on:

[Proximus Facebook]<https://www.facebook.com/proximusBe>   [Proximus Twitter] <https://twitter.com/proximus>    [Proximus YouTube] <https://www.youtube.com/proximus>    [Proximus LinkedIn] <https://www.linkedin.com/company/proximus>

From: Naemon-users [mailto:naemon-users-bounces+frederik.vervaet=proximus.com at monitoring-lists.org] On Behalf Of VERVAET Frederik (ITS/IIS)
Sent: Thursday 17 September 2015 23:23
To: naemon-users at monitoring-lists.org
Subject: [naemon-users] Weird issue with notification_interval (possible bug ?)

Hi,

I noticed the following oddity the other day:

When I set the notification interval to 5 minutes, the Ninja GUI (I use naemon+merlin) lists the notification as sent, but NO actual notification is sent out.

When I enable debug logs in naemon I see:

[1442523762.409221] [032.0] [pid=28500] ** Service Notification Attempt ** Host: 'machineX', Service: 'Service X', Type: NORMAL, Options: 0, Current State: 2, Last Notification: Thu Sep 17 23:02:42 2015
[1442523762.409273] [032.0] [pid=28500] SERVICE NOTIFICATION SUPPRESSED: machineX;Service X;Re-notification blocked for this problem because not enough time has passed since last notification.[1442523762.409287] [032.1] [pid=28500] Next valid notification time: Thu Sep 17 23:07:42 2015
1442523762 = Thu, 17 Sep 2015 21:02:42 GMT = Thu, 17 Sep 2015 23:02:42 (Local time)

(Note that the third log entry does not start on a new line; this is how it appears in the logs.)

As you can see, the last notification time in the first entry equals the time of the current notification attempt.
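To make that comparison concrete, here is a minimal Python sketch that parses the first debug line quoted above, converts its epoch timestamp, and compares it with the "Last Notification" field. The helper name attempt_and_last and the fixed UTC+2 offset (CEST, taken from the conversion in this message) are assumptions for illustration only; nothing here is part of Naemon or Merlin.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: local time is CEST (UTC+2), as the conversion in the message implies.
CEST = timezone(timedelta(hours=2))

# First debug line, copied from the log excerpt in this message.
LINE = (
    "[1442523762.409221] [032.0] [pid=28500] ** Service Notification Attempt ** "
    "Host: 'machineX', Service: 'Service X', Type: NORMAL, Options: 0, "
    "Current State: 2, Last Notification: Thu Sep 17 23:02:42 2015"
)


def attempt_and_last(line, tz=CEST):
    """Return (attempt time, last notification time), both truncated to whole seconds."""
    m = re.match(r"\[(\d+)\.\d+\].*Last Notification: (.+)$", line)
    if m is None:
        raise ValueError("not a notification attempt line")
    # Epoch seconds at the start of the log entry.
    attempt = datetime.fromtimestamp(int(m.group(1)), tz)
    # Human-readable local time at the end of the log entry.
    last = datetime.strptime(m.group(2), "%a %b %d %H:%M:%S %Y").replace(tzinfo=tz)
    return attempt, last


attempt, last = attempt_and_last(LINE)
# notification_interval of 5 minutes, as configured in this report.
next_valid = last + timedelta(minutes=5)

print("attempt (UTC):  ", attempt.astimezone(timezone.utc))
print("attempt (local):", attempt)
print("last == attempt:", last == attempt)
print("next valid:     ", next_valid)
```

If last == attempt is true, as it is for this line, the interval check can never pass at the moment of the attempt, which matches the "not enough time has passed" message in the second log entry.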

It's as if Naemon starts notifying, updates a number of internal counters and variables (including last_notification_time), and only afterwards checks the notification_interval value and reconsiders.

Five minutes later the log shows exactly the same thing, except that last_notification is again set to the time of the current attempt, and so on, which basically renders renotifications useless.

1442524062 = Thu, 17 Sep 2015 21:07:42 GMT = Thu, 17 Sep 2015 23:07:42 (local time)
[1442524062.422325] [032.0] [pid=28500] ** Service Notification Attempt ** Host: 'hostX', Service: 'serviceX', Type: NORMAL, Options: 0, Current State: 2, Last Notification: Thu Sep 17 23:07:42 2015
[1442524062.422370] [032.0] [pid=28500] SERVICE NOTIFICATION SUPPRESSED: hostX;serviceX;Re-notification blocked for this problem because not enough time has passed since last notification.[1442524062.422382] [032.1] [pid=28500] Next valid notification time: Thu Sep 17 23:12:42 2015

As usual: as far as I can tell, the config is OK.
Naemon Core 2015.c.1 (obtained from git5 repo)
Any pointers would be appreciated.
Frederik Vervaet

