[naemon-users] Notifications being sent despite in scheduled downtime.

Steve Traylen steve at traylen.net
Mon Sep 5 16:53:45 CEST 2022


Hi,

I have been running a naemon/thruk instance for a few years now.

For the last 2 or 3 days my instance has been sending notifications
for services and hosts that are in scheduled downtime.

As far as I can tell, I have not changed anything.

The logs are confusing, however: they look correct and do not
show the problem.

# A downtime was entered from 1 second ago to 1 year later.
[1662374724] EXTERNAL COMMAND: SCHEDULE_SVC_DOWNTIME;my.example.org;lbd;1662374723;1693910723;1;0;0;User lxman;roger appstatus destroy : Machine had been created more than 5184000 seconds ago
[1662374724] SERVICE DOWNTIME ALERT: my.example.org;lbd;STARTED; Service has entered a period of scheduled downtime

# Indeed the downtime reports that the service is now suppressed for notification.
[1662374724] SERVICE NOTIFICATION SUPPRESSED: my.example.org;lbd;Notifications about SCHEDULED DOWNTIME events blocked for this object.

# Service goes bad.
[1662375882] SERVICE ALERT: my.example.org;lbd;CRITICAL;SOFT;1;SNMP CRITICAL - *-1*
[1662376182] SERVICE ALERT: my.example.org;lbd;CRITICAL;SOFT;2;SNMP CRITICAL - *-1*
[1662376482] SERVICE ALERT: my.example.org;lbd;CRITICAL;HARD;3;SNMP CRITICAL - *-1*

# Indeed now at hard state, and the service is logged as notification suppressed, as you would expect.
[1662376482] SERVICE NOTIFICATION SUPPRESSED: my.example.org;lbd;Notification blocked for object currently in a scheduled downtime.

1662376482 = Monday, 5 September 2022 13:14:42

However, a notification was definitely sent out at this time.

```
:fire: __PROBLEM__ my.example.org/lbd is CRITICAL for 0d 0h 10m 1s. , lnodes/login, production, S513-C-VM960.
SNMP CRITICAL - *-1*
[Nag :link:](https://cernnag.example.ch/thruk/cgi-bin/extinfo.cgi?type=2&host=my.example.org&service=lbd),
[Monit :link:](https://monit-grafana.example.ch/d/RwtmMDXmz/single-host-metrics?orgId=1&var-hostname=my.example.org),
[Fore :link:](https://judy.example.ch/hosts/my.example.org),
[SSH :link:](ssh://root@my.example.org)
```
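
To double-check the timestamps in the log excerpt, here is a minimal Python sketch. It assumes the usual Nagios-style field order for SCHEDULE_SVC_DOWNTIME (host, service, start, end, fixed, trigger id, duration, author, comment) and a fixed +02:00 offset for CEST; the helper name `parse_svc_downtime` is made up for illustration and is not part of naemon.

```python
from datetime import datetime, timedelta, timezone

# Assumed field order of the SCHEDULE_SVC_DOWNTIME external command.
FIELDS = ("host", "service", "start", "end", "fixed",
          "trigger_id", "duration", "author", "comment")

# Fixed offset used here instead of a tz database lookup (assumption: CEST = UTC+2).
CEST = timezone(timedelta(hours=2), "CEST")


def parse_svc_downtime(command: str) -> dict:
    """Split a SCHEDULE_SVC_DOWNTIME command string into named fields."""
    name, _, args = command.partition(";")
    if name != "SCHEDULE_SVC_DOWNTIME":
        raise ValueError(f"unexpected command: {name}")
    # Limit the split so a ';' inside the comment stays in the comment.
    entry = dict(zip(FIELDS, args.split(";", len(FIELDS) - 1)))
    for key in ("start", "end", "fixed", "trigger_id", "duration"):
        entry[key] = int(entry[key])
    return entry


command = (
    "SCHEDULE_SVC_DOWNTIME;my.example.org;lbd;1662374723;1693910723;1;0;0;"
    "User lxman;roger appstatus destroy : Machine had been created more than "
    "5184000 seconds ago"
)
downtime = parse_svc_downtime(command)
hard_state_at = 1662376482  # timestamp of the CRITICAL;HARD;3 alert

window_days = (downtime["end"] - downtime["start"]) / 86400
in_window = downtime["start"] <= hard_state_at <= downtime["end"]

print(datetime.fromtimestamp(hard_state_at, CEST).strftime("%A, %d %B %Y %H:%M:%S %Z"))
print("downtime window length in days:", window_days)
print("hard state inside downtime window:", in_window)
```

With the values from the log, the hard state falls inside the fixed downtime window, which matches the SERVICE NOTIFICATION SUPPRESSED entry that naemon wrote.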

The service appears as in downtime in the thruk interface. There is no naemon.log
entry for the notification that went out.
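
A rough way to confirm that the log holds no sent notification for the service is to filter the log lines by host and service. The sketch below is Python and makes two assumptions: that every line has the `[timestamp] TYPE: fields` shape seen above, and that a sent notification is logged as `SERVICE NOTIFICATION: contact;host;service;...` (the usual Nagios-style layout). The function name `entries_for` is hypothetical.

```python
import re

# "[1662376482] SERVICE ALERT: host;service;..." -> timestamp, type, remainder
LOG_LINE = re.compile(r"^\[(\d+)\] ([A-Z ]+): (.*)$")


def entries_for(lines, host, service):
    """Yield (timestamp, entry_type, fields) for log entries about host/service."""
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue
        timestamp, entry_type, rest = match.groups()
        fields = rest.split(";")
        # Assumption: sent notifications list the contact before the host.
        offset = 1 if entry_type == "SERVICE NOTIFICATION" else 0
        if fields[offset:offset + 2] == [host, service]:
            yield int(timestamp), entry_type, fields


sample = [
    "[1662376482] SERVICE ALERT: my.example.org;lbd;CRITICAL;HARD;3;SNMP CRITICAL - *-1*",
    "[1662376482] SERVICE NOTIFICATION SUPPRESSED: my.example.org;lbd;"
    "Notification blocked for object currently in a scheduled downtime.",
]
found = list(entries_for(sample, "my.example.org", "lbd"))
sent = [entry for entry in found if entry[1] == "SERVICE NOTIFICATION"]
print(len(found), "entries for the service,", len(sent), "logged as sent")
```

To run it against the real log, pass an open file handle for naemon.log instead of `sample`. An empty `sent` list while a message still arrives would suggest that the message did not come through the notification path this naemon.log records.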

The only recent action I can think of: during some network instability on the
naemon server itself I hit 'Disable all notifications' and then 'Enable all
notifications'. That said, I have tried to remove all history for the service
by stopping naemon and cleaning up all these files.
* /var/lib/naemon/  objects.cache, retention.cache, status.dat.
* /var/log/naemon  naemon.log archives/*

Any idea why notifications might still be sent?

Versions on CentOS 7:
rpm -q naemon thruk naemon-livestatus
naemon-1.3.1-0.noarch
thruk-2.48.3-11458.1.x86_64
naemon-livestatus-1.3.1-0.x86_64

Many Thanks

Steve.

-- 
Steve Traylen