BUG HostStateFlapping calculation - RFC

Percy Jahn jahn at fg-networking.de
Wed Feb 15 10:28:37 CET 2006


Hello,

IMHO, host state flapping is currently calculated in a very odd way.
A "no-state-change" event is added to the flapping history once wait_threshold 
has elapsed.
Current calculation:
wait_threshold=(hst->total_service_check_interval*interval_length)/hst->total_services;

In other words: if a flapping host has 10 services, each checked at a 5-minute 
interval, the formula gives (10 * 5 * 60) / 10 = 300 seconds (assuming 
interval_length = 60), so a "no-state-change" event is added only every 
5 minutes, while host checks (for a down host) are executed every 30 seconds.
In our installation, we use a service called "config_backup", which backs up 
our routers/switches and runs every 1440 minutes. With even one such service 
attached to a host, wait_threshold rises sharply, so a very long time must 
pass without a state change before the host counts as non-flapping.
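
To make the effect concrete, here is a minimal sketch in C of the calculation 
quoted above. The function name mean_wait_threshold and the array-based 
interface are hypothetical (the real code reads the totals from the hst 
struct), and returning 0 for a host without services is an assumption made 
only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic-mean threshold, mirroring the quoted formula:
 * (total_service_check_interval * interval_length) / total_services.
 * check_intervals[] is given in interval units; the result is in seconds.
 * Returns 0 when the host has no services (hypothetical convention). */
unsigned long mean_wait_threshold(const unsigned long *check_intervals,
                                  size_t total_services,
                                  unsigned long interval_length)
{
    unsigned long total_service_check_interval = 0;
    size_t i;

    assert(interval_length > 0);
    if (total_services == 0)
        return 0;

    /* Sum the check intervals of all services on the host. */
    for (i = 0; i < total_services; i++)
        total_service_check_interval += check_intervals[i];

    return (total_service_check_interval * interval_length) / total_services;
}
```

With interval_length = 60, a host whose only service runs every 1440 
intervals gets a threshold of 86400 seconds (24 hours). A host with one 
5-minute service plus config_backup still gets 43350 seconds, about 12 hours.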

The attached patch calculates the wait_threshold as an 
"average_service_check_interval" (W), where N(i) is the check interval of 
service i and n is the number of services on the host:

(1/N(1)) + (1/N(2)) + ... + (1/N(n)) = (1/W)

In other words: if a flapping host has 10 services, each checked at a 5-minute 
interval, a "no-state-change" event is added every 30 seconds.
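
The formula for W can be sketched in C as follows. This is not the attached 
patch, only an illustration of the arithmetic: the function name 
combined_wait_threshold, the array-based interface, the use of double, and 
the handling of zero intervals are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Combined-rate threshold: solves 1/W = 1/N(1) + ... + 1/N(n) for W.
 * check_intervals[] is given in interval units; the result is in seconds.
 * Zero-length intervals are skipped; returns 0.0 when no usable service
 * remains (hypothetical convention). */
double combined_wait_threshold(const unsigned long *check_intervals,
                               size_t total_services,
                               unsigned long interval_length)
{
    double rate_sum = 0.0;
    size_t i;

    assert(interval_length > 0);

    /* Each service contributes 1/N(i) checks per interval unit. */
    for (i = 0; i < total_services; i++) {
        if (check_intervals[i] > 0)
            rate_sum += 1.0 / (double)check_intervals[i];
    }
    if (rate_sum <= 0.0)
        return 0.0;

    /* W = 1 / rate_sum interval units, converted to seconds. */
    return (double)interval_length / rate_sum;
}
```

With interval_length = 60, ten services at a 5-minute interval give roughly 
30 seconds, and adding a service with a 1440-minute interval barely changes 
the result, because the threshold is dominated by the frequently checked 
services.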

RFC

Best regards
Percy Jahn
-------------- next part --------------
A non-text attachment was scrubbed...
Name: hoststateflapping.patch
Type: application/octet-stream
Size: 2127 bytes
Desc: not available
URL: <https://www.monitoring-lists.org/archive/developers/attachments/20060215/2b8babb7/attachment.obj>

