Log monitoring with SEC and Nagios.

Risto Vaarandi risto.vaarandi at seb.ee
Fri Aug 31 13:14:45 CEST 2007


Nate Campi wrote:
> On Thu, Aug 30, 2007 at 11:11:17AM +1000, Stanley.Hopcroft at Dest.gov.au wrote:
>> Dear Risto
>>
>> (Thank you very much for SEC, the king of event correlators).
>  
> I also thank you, SEC saves my SA staff a lot of trouble every day.

Thanks - it's good to hear that :)

>>> few weeks ago I posted a question to this list about passive service
>>> checks - I was actually experimenting with Nagios as an event log
>>> monitoring GUI. I am tracking event logs with SEC and also sending out
>>> alerts with it, but I would still like to see correlated log messages
>>> in Nagios web interface as well.
>>>
>> I used to use (and enjoy) SEC to inject passive service check results
>> to Nagios.
> 
> I also do this, but it forces me to define a different check for every
> thing that I might see - because if I submit a second, different bad
> result (like a different system error message for a "syslog" check)
> it'll overwrite the last submitted results. There are ways around this
> on the SEC side if you want to keep state, but you'd probably like
> people to be able to wipe events clear independently on the Nagios side
> (like with a passive submission from the CGI) and not have that old
> result come back. I hate to state that like it's fact when I'm at best
> an intermediate Nagios admin, no expert. Am I overlooking anything here?
> 
> You could have a feedback loop between Nagios logs and SEC that helps
> detect the passive submission that clears your prior alerts, but that
> seems overly complicated. If it was like a traditional NMS that just
> accepts arbitrary events, then it might be more like what Risto is
> looking for.
> 
> What exactly are your needs, Risto?

I'd just like to display correlated alerts in the Nagios web interface.
For now, I have settled on a strategy where all events come in with
CRITICAL severity, and the user has to submit a passive check by hand
to set the service back to the OK state.
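
For readers who want to script that manual reset, here is a minimal sketch
of how a passive result could be written to the Nagios external command
file using the PROCESS_SERVICE_CHECK_RESULT command. The function names,
the host name and the command file path are illustrative assumptions, not
part of my setup; use the path configured as command_file in your
nagios.cfg.

```python
import time


def format_passive_result(host, service, return_code, output, now=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line.

    return_code follows the plugin convention:
    0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
    """
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return code must be 0, 1, 2 or 3")
    timestamp = int(time.time() if now is None else now)
    # A newline would terminate the command early, so flatten the output.
    clean_output = output.replace("\n", " ")
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        timestamp, host, service, return_code, clean_output)


def submit_passive_result(command_file, line):
    """Write one command line to the Nagios command file (a named pipe)."""
    with open(command_file, "w") as pipe:
        pipe.write(line)


if __name__ == "__main__":
    # Hypothetical host name; prints the command instead of submitting it.
    print(format_passive_result("host1", "Syslog", 0, "Cleared by operator"),
          end="")
    # To submit for real (the path is an assumption, check your nagios.cfg):
    # submit_passive_result("/usr/local/nagios/var/rw/nagios.cmd", line)
```

An SEC rule could produce the same line format with return code 2 to raise
the alert; the sketch only covers the reset to OK.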

The service definition looks as follows:

define service{
         use                             generic-service
         hostgroup_name                  linux
         service_description             Syslog
         is_volatile                     1
         active_checks_enabled           0
         max_check_attempts              1
         normal_check_interval           5
         retry_check_interval            1
         contact_groups                  nagios-admin
         check_command                   check_passive
         }

I have disabled host checks altogether, so that alerts from the Syslog
service are displayed with minimal lag (it is now just around 5-6
seconds, while with host checks enabled it was sometimes several
minutes, since every submitted passive service check triggered a host
check).

The only thing I am missing is a way to disable host checks on a
per-service basis - for example, the service definition could have a
special parameter called 'enable_host_check' that the user can set to 0.
The reason I'd like to have this parameter in the future is that I have
active checks for the 'linux' hostgroup as well, like

define service{
         use                             generic-service
         hostgroup_name                  linux
         service_description             SSH
         max_check_attempts              3
         normal_check_interval           3
         retry_check_interval            1
         contact_groups                  nagios-admin
         check_command                   check_ssh
         }

It would be good to have host checks enabled for such services, so that
no service alert is generated if the host is down.

However, that was just a small suggestion and not a complaint :)
I'd like to thank people who are developing Nagios ;)

br,
risto
