[naemon-dev] Ideas about future features

Matthias Eble psychotrahe at gmail.com
Fri Dec 27 02:06:36 CET 2013


Hi all,

Na*ios has been working quite well for me in the past.
However, I'd like to share some thoughts about how the system could be
improved from my point of view.
It'd be great to hear how Naemon will evolve in the future.

There is probably already a plan in the team's heads, or maybe written
down somewhere.

I have some ideas in mind. They are rather raw and indefinite, but
I'd still like to discuss them with you.

For me, two things would be really helpful:
1) have a feature to monitor per metric rather than per check_command.
   * Today, many plugins check lots of things.
   * Typical examples are check_disk and check_snmp.
   * Depending on the configuration method, acknowledging a problem
with /mytmpmount also disables notifications for /var
     * To fix that, we'd need to create a stricter plugin output
standard that contains per-metric status codes.
        * metrics would be /mytmpmount-freespace, TCP-response-time,
http-status-code or http-match-string
     * The core would need to create sub-services at run-time and
populate their results.
     * Benefit: per-metric actions and logging, especially per-metric
downtime and acknowledgements
     * Maybe it could also be used for receiving snmp traps or log
pattern matching checks
        * different alerting for different patterns/traps

   * Today, many folks wrap the plugin call and submit results to a
passive check.
      * works, but all possible services need to be in the config.
      * That's where folks start generating nagios configs and reload
the daemon.
      * Is that what we want? Maybe?
         * problems arise when there are syntax problems

   * raw proposal:
define service {
...
    check_command  check_disk
    contact_group  os_admins
    define metric {
        metric_name    ^/oracle.*
        contact_group  oracle_admins
    }
}

   * maybe another layer could be added for check_multi-like plugins.
      * but they could also be forced to structure metric names

2) Exceptions
   * today, we have different services for systems with 24/7 and 8/5
      * inheriting a service's notification period from the host is
not sufficient
      * usually you want to be notified about a host down event earlier
than about a full filesystem
   * an expression-like syntax would be neat: non_prod_hosts && ! db_hosts

define service {
...
    check_command check_disk
    notification_period  24x7
    except {
        hostgroup_name  non_prod_hosts
        notification_period 8x5
    }
    except ...
}
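
As a sketch of how such "except" blocks could be resolved, the Python
snippet below applies every matching exception on top of the base service.
The data model, the function names, and the choice to put the hostgroup
expression directly into the except block are assumptions of mine. The
tiny expression evaluator only understands "&&" and a leading "!" per term.

```python
# Hypothetical data model for the "except" proposal; all names are assumptions.
SERVICE = {
    "check_command": "check_disk",
    "notification_period": "24x7",
    "except": [
        # (hostgroup expression, directives to override)
        ("non_prod_hosts && ! db_hosts", {"notification_period": "8x5"}),
    ],
}


def matches(expression, host_groups):
    """Evaluate a flat 'a && ! b' expression against a host's hostgroups.

    Only '&&' and a leading '!' per term are supported in this sketch.
    """
    for term in expression.split("&&"):
        term = term.strip()
        negated = term.startswith("!")
        group = term.lstrip("!").strip()
        # A plain term must be present, a negated term must be absent.
        if (group in host_groups) == negated:
            return False
    return True


def effective_settings(service, host_groups):
    """Apply every matching except block on top of the base service."""
    settings = {k: v for k, v in service.items() if k != "except"}
    for expression, overrides in service["except"]:
        if matches(expression, host_groups):
            settings.update(overrides)
    return settings


if __name__ == "__main__":
    print(effective_settings(SERVICE, {"non_prod_hosts"}))
    print(effective_settings(SERVICE, {"non_prod_hosts", "db_hosts"}))
```

With this, a single check_disk service definition would cover both the
24x7 and the 8x5 hosts, instead of one service per notification period.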


The topics above depend heavily on how Naemon is meant to be configured:
   * is it "the way to go" to run generator scripts?
      * if so, is there a need to make generation/activation/testing smoother?
   * Flexible config file format for easy grouping/assignment?
      * use templating/grouping for everything except inventory data?
   * have another system to ask for hosts and service assignments?
   * event based host/service creation at run-time?

In other words, if the desired way to do configuration is to have an
intelligent config generator that produces stupid service/host
definitions without templating etc., all of the above is obsolete.

So, that's it for now.
What do you think? What's the focus of the dev-team?

Matthias

