[naemon-users] unsubscribe

Paul M Dubuc work at paul.dubuc.org
Tue Sep 6 00:15:29 CEST 2022


On 9/5/22 10:53 AM, Steve Traylen wrote:
> Hi,
> 
> I have been running a Naemon/Thruk instance for a few years now.
> 
> For the last 2 or 3 days my instance has been sending notifications
> for services and hosts that are in scheduled downtime.
> 
> I am not aware of having changed anything.
> 
> The logs are confusing, however, since they look correct and do not
> show the problem.
> 
> # A downtime was entered from 1 second ago to 1 year later.
> [1662374724] EXTERNAL COMMAND: SCHEDULE_SVC_DOWNTIME;my.example.org;lbd;1662374723;1693910723;1;0;0;User lxman;roger appstatus destroy : Machine had been created more than 5184000 seconds ago
> [1662374724] SERVICE DOWNTIME ALERT: my.example.org;lbd;STARTED; Service has entered a period of scheduled downtime
> 
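For reference, the fields of that external command can be pulled apart to confirm that the window really is one year. A minimal sketch in Python, assuming the usual Nagios-style field order for SCHEDULE_SVC_DOWNTIME (host, service, start, end, fixed, trigger id, duration, author, comment). The function name is made up for illustration, and the sample line is the one quoted above with the archive's auto-link residue removed:

```python
FIELD_NAMES = ["command", "host", "service", "start", "end",
               "fixed", "trigger_id", "duration", "author", "comment"]

def parse_schedule_svc_downtime(line):
    """Split a SCHEDULE_SVC_DOWNTIME log line into named fields (hypothetical helper)."""
    # Drop the "[epoch] EXTERNAL COMMAND: " prefix, keep the command itself.
    _, _, command = line.partition("EXTERNAL COMMAND: ")
    # Limit the split so that semicolons inside the comment stay in the comment.
    fields = dict(zip(FIELD_NAMES, command.split(";", len(FIELD_NAMES) - 1)))
    for key in ("start", "end", "fixed", "trigger_id", "duration"):
        fields[key] = int(fields[key])
    return fields

line = ("[1662374724] EXTERNAL COMMAND: SCHEDULE_SVC_DOWNTIME;my.example.org;lbd;"
        "1662374723;1693910723;1;0;0;User lxman;roger appstatus destroy : "
        "Machine had been created more than 5184000 seconds ago")
downtime = parse_schedule_svc_downtime(line)
# Length of the downtime window in days.
print((downtime["end"] - downtime["start"]) / 86400)  # 365.0
```

With fixed set to 1, the downtime should run exactly from start to end, so the CRITICAL state at 1662376482 falls well inside the window.
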
> # Indeed, the log reports that notifications for the service are now
> # suppressed.
> [1662374724] SERVICE NOTIFICATION SUPPRESSED: my.example.org;lbd;Notifications about SCHEDULED DOWNTIME events blocked for this object.
> 
> # Service goes bad.
> [1662375882] SERVICE ALERT: my.example.org;lbd;CRITICAL;SOFT;1;SNMP CRITICAL - *-1*
> [1662376182] SERVICE ALERT: my.example.org;lbd;CRITICAL;SOFT;2;SNMP CRITICAL - *-1*
> [1662376482] SERVICE ALERT: my.example.org;lbd;CRITICAL;HARD;3;SNMP CRITICAL - *-1*
> 
> # Now at hard state, and the notification is logged as suppressed, as
> # you would expect.
> [1662376482] SERVICE NOTIFICATION SUPPRESSED: my.example.org;lbd;Notification blocked for object currently in a scheduled downtime.
> 
> 1662376482 = Monday, 5 September 2022 13:14:42
> 
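The epoch conversion above can be checked with a short Python sketch. It assumes the quoted wall-clock time is CEST (UTC+2), which is not stated in the message but matches the time zone in the list header; a fixed offset is used here instead of a named zone to keep the example self-contained:

```python
from datetime import datetime, timedelta, timezone

# Assumption: the quoted local time is CEST, i.e. UTC+2.
CEST = timezone(timedelta(hours=2), "CEST")

def epoch_to_local(epoch, tz=CEST):
    """Convert a naemon.log epoch timestamp to an aware datetime."""
    return datetime.fromtimestamp(epoch, tz)

hard_state = epoch_to_local(1662376482)
print(hard_state.isoformat())  # 2022-09-05T13:14:42+02:00
```
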
> 
> However, a notification was in fact sent out at this time.
> 
> ```
> :fire: __PROBLEM__ my.example.org/lbd is CRITICAL for 0d 0h 10m 1s. , lnodes/login, production, S513-C-VM960.
> 
> SNMP CRITICAL - *-1* [Nag :link:](https://cernnag.example.ch/thruk/cgi-bin/extinfo.cgi?type=2&host=my.example.org&service=lbd), [Monit :link:](https://monit-grafana.example.ch/d/RwtmMDXmz/single-host-metrics?orgId=1&var-hostname=my.example.org), [Fore :link:](https://judy.example.ch/hosts/my.example.org), [SSH :link:](ssh://root@my.example.org)
> ```
> 
> The service appears as in downtime in the Thruk interface. There is no
> naemon.log entry for the notification that went out.
> 
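One way to double-check the "no log entry" observation is to scan naemon.log for both kinds of notification line for the affected service. A rough sketch in Python: the SUPPRESSED layout (host;service;reason) is taken from the excerpt above, while the layout for sent notifications (contact;host;service;...) is an assumption based on the usual Nagios-style log format. The function name is hypothetical:

```python
import re

# Matches "[epoch] SERVICE NOTIFICATION: ..." and
# "[epoch] SERVICE NOTIFICATION SUPPRESSED: ...".
LINE_RE = re.compile(r"^\[(\d+)\] SERVICE NOTIFICATION( SUPPRESSED)?: (.*)$")

def notification_events(lines, host, service):
    """Return (sent, suppressed) lists of epoch timestamps for one service."""
    sent, suppressed = [], []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        epoch = int(match.group(1))
        fields = match.group(3).split(";")
        if match.group(2):
            # Suppressed entries: host;service;reason (as in the excerpt above).
            target, bucket = fields[:2], suppressed
        else:
            # Assumed layout for sent entries: contact;host;service;...
            target, bucket = fields[1:3], sent
        if target == [host, service]:
            bucket.append(epoch)
    return sent, suppressed

sample = [
    "[1662374724] SERVICE NOTIFICATION SUPPRESSED: my.example.org;lbd;"
    "Notifications about SCHEDULED DOWNTIME events blocked for this object.",
    "[1662376482] SERVICE ALERT: my.example.org;lbd;CRITICAL;HARD;3;SNMP CRITICAL - *-1*",
    "[1662376482] SERVICE NOTIFICATION SUPPRESSED: my.example.org;lbd;"
    "Notification blocked for object currently in a scheduled downtime.",
]
sent, suppressed = notification_events(sample, "my.example.org", "lbd")
print(sent, suppressed)  # [] [1662374724, 1662376482]
```

Passing the lines of /var/log/naemon/naemon.log instead of the sample would show whether this Naemon instance logged any sent notification for the service. An empty first list alongside a delivered message would suggest the message came from somewhere other than this instance's normal notification path.
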
> The only recent action: during some network instability of the Naemon
> server itself, I hit 'Disable all notifications' and then 'Enable all
> notifications'.
> That said, I have tried to remove all history for the service by
> stopping naemon and cleaning up all these files:
> * /var/lib/naemon/  objects.cache, retention.cache, status.dat.
> * /var/log/naemon  naemon.log archives/*
> Any idea why notifications might still be sent?
> 
> Versions on CentOS 7:
> rpm -q naemon thruk naemon-livestatus
> naemon-1.3.1-0.noarch
> thruk-2.48.3-11458.1.x86_64
> naemon-livestatus-1.3.1-0.x86_64
> 
> Many Thanks
> 
> Steve.
> 
> -- 
> Steve Traylen


