nagios as message log server

Neil neil-on-nagios at restricted.dyndns.org
Sun Feb 22 22:52:11 CET 2004


Hi Stanley, 

This is awesome. I just need to find some interesting events on one of our
Windows machines and implement something similar to what you have suggested
to me. It looks hard, but I will keep trying examples.

I will also read up on the volatility of a service. I haven't tried modifying
that value in my services.cfg.

Thanks. You ROCK!!! :) 

Neil 

Stanley Hopcroft writes: 

> Dear Sir, 
> 
> I am writing to thank you for your letter and say, 
> 
> On Sat, Feb 21, 2004 at 08:39:09PM -0800, nagios-users-request at lists.sourceforge.net wrote: 
> 
>> Message: 2
>> From: "Neil" <neil-on-nagios at restricted.dyndns.org>
>> Subject: [Nagios-users] Re: nagios as message log server 
>> 
> 
>   .. preamble snipped 
> 
>>  
>> 
>> It's nice to have all the system/critical events from all over the
>> enterprise sent to a central logging system,
> 
> yeah hup !  
> 
>> in this case, nagios. But what worries me now is that we aren't
>> actually monitoring a service, just waiting for a critical message in
>> /var/log/messages or a critical event sent by Snare for Windows.
> 
>   .. snip: synopsis of the problem: an event log entry raises the CRITICAL
>            status of the corresponding Nagios service, but how does it get
>            set back to OK?
> 
>> 
>> Since this isn't a service, I can't find a way to restore the state
>> to OK.
> 
> Either issue a 'submit a passive check result' from the Nagios service
> description panel
> 
> or, 
> 
> Employ an Event Correlator (i.e. software that understands the
> significance and relationships of messages _in_ time) such as Sec to
> unlatch the CRITICAL after a sufficient interval following the CRITICAL
> message, if that is the appropriate processing for the service. In fact,
> if the service represents an IDS alarm you may not want to do this. In
> any case, you are probably better off having Nagios treat the service
> as "Volatile".
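A passive check result of this kind can also be submitted from a script rather than from the web panel. Below is a minimal Python sketch, assuming the command file path used in the Sec rule later in this message; the helper names `format_passive_result` and `submit_passive_result` are made up for illustration, and the host and service names are whatever your Nagios configuration defines.

```python
import time

# Path taken from the Sec rule quoted in this thread; adjust for your install.
NAGIOS_CMD_FILE = "/usr/local/nagios/var/rw/nagios.cmd"

# Standard Nagios service return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def format_passive_result(host, service, code, output, now=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line."""
    timestamp = int(time.time() if now is None else now)
    return (
        f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{code};{output}\n"
    )


def submit_passive_result(host, service, code, output, cmd_file=NAGIOS_CMD_FILE):
    """Write the command to the Nagios command file (hypothetical helper)."""
    with open(cmd_file, "w") as fifo:
        fifo.write(format_passive_result(host, service, code, output))
```

For example, `submit_passive_result("tsitc", "Authentication traps", OK, "reset by hand")` would set that service back to OK, provided Nagios accepts external commands and passive checks for it.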
> 
> There is a significant difference between software like Swatch, which
> is mainly intended to react to patterns in log files, and Sec, which
> understands that the pattern represents the start of __event
> processing__.
> 
> Examples of events (rather than patterns representing messages) are 
> 
> . log message rate rises above threshold (eg SU failed on ...) 
> 
> . log message rate falls below another threshold 
> 
> . a log message followed by another one within no more than 't' seconds
> of the first message (the messages can be completely different) 
> 
> . all the log messages matching a pattern in an interval 
> 
> None of these can be processed without referring to the time intervals
> between the messages. 
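To make the first kind of event concrete, here is a small Python sketch of a rate check over a sliding time window. It is only an illustration of why timestamps are needed, not how Sec implements it; the class name `RateDetector` and its parameters are invented for this example.

```python
from collections import deque


class RateDetector:
    """Count matching messages seen in the last `window` seconds."""

    def __init__(self, window, threshold):
        self.window = window        # length of the sliding window, in seconds
        self.threshold = threshold  # message count that counts as "too many"
        self.times = deque()        # timestamps of recent matching messages

    def observe(self, timestamp):
        """Record one matching message; return True if the rate is at or above threshold."""
        self.times.append(timestamp)
        # Drop messages that have aged out of the window.
        while self.times and timestamp - self.times[0] >= self.window:
            self.times.popleft()
        return len(self.times) >= self.threshold
```

With `RateDetector(window=60, threshold=3)`, three failed-su messages inside one minute trip the detector, while the same three messages spread over an hour do not. The pattern alone cannot tell those two cases apart.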
> 
> <off topic> 
> 
> Here is a complete worked example of resetting a service with Sec. 
> 
> 1 Here is the Sec rule to process the events. The rationale is below. 
> 
> type=SingleWithSuppress
> ptype=RegExp
> pattern=Authentication Failure Trap .+?IpAddress: (\S+)
> desc=Authentication traps
> action=assign %a $1;
>    eval %n ( $a = '%a'; %%hn = ('10.a.b.c1' => 'foo',
>                                 '10.a.b.c2' => 'bar',
>                                 '10.a.b.c3' => 'blech',
>                                 '10.a.b.c4' => 'baz',
>                                );
>              exists $hn{$a} ? "$hn{$a}/$a" : "Unknown hostname/$a" ;);
>    eval %o ('Trap from %n. Print spooler may be scanning all addresses with Snmp to discover an offline printer.') ;
>    assign %h tsitc;
>    delete auth_traps_seen;
>    create auth_traps_seen 960 ( assign %o No auth traps caused by %n for last 16 minutes.;
>                                 write /usr/local/nagios/var/rw/nagios.cmd ([%u] PROCESS_SERVICE_CHECK_RESULT;%h;%s;0;%o);
>                               ) ;
>    write /usr/local/nagios/var/rw/nagios.cmd ([%u] PROCESS_SERVICE_CHECK_RESULT;%h;%s;2;%o)
> window=900
>         
> 
> The intent is to process a stream of 'Authentication failure Trap' 
> messages by 
> 
> 1 At the start of the stream, inform the Nagios service monitor 
> (writing a formatted Nagios 'passive service check result' to the 
> Nagios command input fifo) that the Nagios service corresponding to 
> these traps is 'critical' (the ';2' in the last Sec write command). 
> 
> 2 While the stream continues, inform Nagios every 15 minutes after the 
> last notification. 
> 
> 3 If there is no Auth Trap message for 16 minutes after the last one,
> inform Nagios that the 'Nagios service' (corresponding to the traps) is
> OK (the ';0;' in the first write command). 
> 
> (The rule uses a Perl mini-program to map the IP address of the cause of
> the Auth Trap to a few a priori known culprits (qw(foo bar baz)).) 
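For readers who do not read Perl, the lookup can be sketched in Python as follows. The regular expression and the placeholder addresses are copied from the rule; the function names `extract_ip` and `describe_source` are invented for this sketch.

```python
import re

# Same pattern as the Sec rule: capture the token after "IpAddress: ".
TRAP_RE = re.compile(r"Authentication Failure Trap .+?IpAddress: (\S+)")

# Placeholder addresses exactly as they appear in the rule.
KNOWN_HOSTS = {
    "10.a.b.c1": "foo",
    "10.a.b.c2": "bar",
    "10.a.b.c3": "blech",
    "10.a.b.c4": "baz",
}


def extract_ip(line):
    """Return the captured address, or None if the line is not an auth trap."""
    match = TRAP_RE.search(line)
    return match.group(1) if match else None


def describe_source(ip):
    """Mirror the rule's %n value: 'name/ip' for known hosts, else a fallback."""
    name = KNOWN_HOSTS.get(ip)
    return f"{name}/{ip}" if name is not None else f"Unknown hostname/{ip}"
```

The returned string plays the role of %n in the rule, which then ends up in the plugin output that Nagios displays.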
> 
> The rationale is 
> 
> SingleWithSuppress compresses the trap messages into one write() every 15 
> minutes. 
> 
> After the last Sec action (write) then
> 1 if there is another trap within 15 minutes   => rule fails: no write
> 2 if there is a trap between 15 and 16 minutes => rule matches: write
> 3 if there are no traps for 16 minutes         => rule context expires:
>                                                   write OK. 
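The three cases can be checked with a small Python simulation. This is a simplified model of the rule, not Sec itself: it assumes the 900-second suppression window and the 960-second context lifetime both start at each CRITICAL write, and that a trap arriving exactly when the window ends counts as a new match. The function name `simulate_rule` is invented for this sketch.

```python
CRITICAL, OK = 2, 0


def simulate_rule(trap_times, window=900, context_lifetime=960):
    """Return the (time, state) passive results the modelled rule would write."""
    writes = []
    window_start = None    # start of the current suppression window
    context_expiry = None  # when the auth_traps_seen context would expire
    for t in sorted(trap_times):
        # The context may have expired before this trap arrived: write OK.
        if context_expiry is not None and context_expiry <= t:
            writes.append((context_expiry, OK))
            context_expiry = None
        if window_start is None or t - window_start >= window:
            # Rule matches: write CRITICAL and restart both timers.
            writes.append((t, CRITICAL))
            window_start = t
            context_expiry = t + context_lifetime
        # Otherwise the trap falls inside the window and is suppressed.
    if context_expiry is not None:
        writes.append((context_expiry, OK))
    return writes
```

With traps at 0, 300, 930 and 5000 seconds, the model writes CRITICAL at 0 and 930, OK at 1890, CRITICAL at 5000 and OK at 5960. The trap at 300 is suppressed, and in this model the 16-minute timer runs from the last CRITICAL write.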
> 
> </off topic> 
> 
> Yours sincerely. 
> 
>  
> 
> 
> -- 
> ------------------------------------------------------------------------
> Stanley Hopcroft
> ------------------------------------------------------------------------ 
> 
> '...No man is an island, entire of itself; every man is a piece of the
> continent, a part of the main. If a clod be washed away by the sea,
> Europe is the less, as well as if a promontory were, as well as if a
> manor of thy friend's or of thine own were. Any man's death diminishes
> me, because I am involved in mankind; and therefore never send to know
> for whom the bell tolls; it tolls for thee...' 
> 
> from Meditation 17, J Donne. 
> 
> 
> -------------------------------------------------------
> SF.Net is sponsored by: Speed Start Your Linux Apps Now.
> Build and deploy apps & Web services for Linux with
> a free DVD software kit from IBM. Click Now!
> http://ads.osdn.com/?ad_id=1356&alloc_id=3438&op=click
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
> ::: Messages without supporting info will risk being sent to /dev/null
 

