Unexpected trends reports

Edgar Shine eshine at mcosta.eng.br
Tue Apr 19 23:38:48 CEST 2005


Hi,

After reading my post, I decided to resubmit it, because the problem was 
poorly described in my first (and second) email.
Let's try again: :P

I'm using Nagios (2.0b2) to monitor remote radios (about 300 devices) 
with the ping plugin. I have a problem with the trend reports.

Problem description:
1) The trend report shows an outage: "Critical - Time range: Thu Apr 7 
13:13:57 2005 to Thu Apr 7 15:34:47 2005 - Duration: 0d 2h 20m 50s - 
State Info: Critical - Plugin timed out after 10 seconds".
2) I've found that this is not true: the real outage lasted less than 
5 minutes. Looking at the service alert history, I found these lines:
---begin---
[04-07-2005 13:13:57] SERVICE ALERT: tajuras_comercial;PING;CRITICAL;HARD;1;CRITICAL - Plugin timed out after 10 seconds
[04-07-2005 13:17:58] SERVICE ALERT: tajuras_comercial;PING;WARNING;SOFT;1;PING WARNING - Packet loss = 40%, RTA = 25.30 ms
[04-07-2005 13:18:57] SERVICE ALERT: tajuras_comercial;PING;OK;SOFT;2;PING OK - Packet loss = 40%, RTA = 29.40 ms
[04-07-2005 15:34:47] Caught SIGTERM, shutting down...
[04-07-2005 15:34:47] Nagios 2.0b2 starting...(PID=31270)
---end---
3) The nagios.log file has these lines:
---begin---
[1112890437] SERVICE ALERT: tajuras_comercial;PING;CRITICAL;HARD;1;CRITICAL - Plugin timed out after 10 seconds
[1112890678] SERVICE ALERT: tajuras_comercial;PING;WARNING;SOFT;1;PING WARNING - Packet loss = 40%, RTA = 25.30 ms
[1112890737] SERVICE ALERT: tajuras_comercial;PING;OK;SOFT;2;PING OK - Packet loss = 0%, RTA = 29.40 ms
[1112898887] INITIAL SERVICE STATE: tajuras_comercial;PING;OK;HARD;1;PING OK - Packet loss = 0%, RTA = 35.50 ms
--eof---
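
To line these raw entries up with the alert history, the epoch timestamps can be converted to local time. The following is only a sketch: the parse_entry helper is hypothetical, and the UTC-3 offset is an assumption inferred from matching the two excerpts, not something Nagios reports.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the Nagios server runs at UTC-3 (inferred by comparing the
# epoch values with the alert-history timestamps, not stated in the logs).
LOCAL_TZ = timezone(timedelta(hours=-3))

LOG = [
    "[1112890437] SERVICE ALERT: tajuras_comercial;PING;CRITICAL;HARD;1;CRITICAL - Plugin timed out after 10 seconds",
    "[1112890678] SERVICE ALERT: tajuras_comercial;PING;WARNING;SOFT;1;PING WARNING - Packet loss = 40%, RTA = 25.30 ms",
    "[1112890737] SERVICE ALERT: tajuras_comercial;PING;OK;SOFT;2;PING OK - Packet loss = 0%, RTA = 29.40 ms",
    "[1112898887] INITIAL SERVICE STATE: tajuras_comercial;PING;OK;HARD;1;PING OK - Packet loss = 0%, RTA = 35.50 ms",
]

def parse_entry(line):
    """Split '[epoch] TYPE: host;service;state;state_type;attempt;output'."""
    stamp, rest = line.split("] ", 1)
    entry_type, fields = rest.split(": ", 1)
    # Limit the split so semicolons inside the plugin output are preserved.
    host, service, state, state_type, attempt, output = fields.split(";", 5)
    when = datetime.fromtimestamp(int(stamp.lstrip("[")), tz=LOCAL_TZ)
    return when, entry_type, host, service, state, state_type, int(attempt), output

for line in LOG:
    when, entry_type, _host, _service, state, state_type, _attempt, _output = parse_entry(line)
    print(when.strftime("%m-%d-%Y %H:%M:%S"), entry_type, state, state_type)
```

With that offset, the four entries fall at 13:13:57, 13:17:58, 13:18:57 and 15:34:47 on 04-07-2005, the same times shown in the alert history.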

I presume that after a hard CRITICAL state, trends.cgi expects a hard 
recovery before it graphs a recovery, but here there is only a soft 
recovery following a soft WARNING alert.
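
The arithmetic behind this guess can be checked with a small sketch. It is not the trends.cgi algorithm, only an illustration of the two readings: the critical_seconds and fmt helpers are hypothetical, and the events are copied from the nagios.log excerpt.

```python
# (epoch, state, state_type), copied from the nagios.log excerpt.
EVENTS = [
    (1112890437, "CRITICAL", "HARD"),
    (1112890678, "WARNING", "SOFT"),
    (1112890737, "OK", "SOFT"),
    (1112898887, "OK", "HARD"),  # INITIAL SERVICE STATE after the restart
]

def critical_seconds(events, include_soft):
    """Time spent in CRITICAL, closing each interval at the next counted event."""
    counted = [e for e in events if include_soft or e[2] == "HARD"]
    total = 0
    for (start, state, _), (end, _, _) in zip(counted, counted[1:]):
        if state == "CRITICAL":
            total += end - start
    return total

def fmt(seconds):
    """Format a duration the way the trend report prints it."""
    days, rest = divmod(seconds, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{days}d {hours}h {minutes}m {secs}s"

hard_only = critical_seconds(EVENTS, include_soft=False)
with_soft = critical_seconds(EVENTS, include_soft=True)
print("hard states only:", fmt(hard_only))
print("soft states too: ", fmt(with_soft))
```

Counting only hard states gives 0d 2h 20m 50s, exactly the duration in the trend report, while counting the soft WARNING as the end of the outage gives 0d 0h 4m 1s. That is consistent with the presumption, though it does not prove how trends.cgi actually works.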

As a workaround, I set the warning thresholds (199.99 ms, 79%) very 
close to the critical thresholds (200 ms, 80%), so the warning state is 
almost never reached. I would still like to use warning states, though, 
because they help my team prioritize which of these polled devices to 
fix first.

System info:
- Linux (Debian 3.0 - stable):
- libgd1: 1.8.4-17
- libgd2: 2.0.1-10
- zlib1g-dev: 1.1.4-1.0
- libpng2-dev: 1.0.12-3
- libjpeg62-dev: 6b-5

I'd appreciate any tips on this issue.
Thanks in advance for your time.

rgds,
Edgar Shine


_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users




