Using Nagios to monitor "service-less" hosts

Tedman Eng teng at dataway.com
Thu Nov 9 00:30:51 CET 2006


Are you sure you haven't got check_interval configured in the host
directive, or inherited from the template being applied?  The "active
checks" setting does something different.

"check_interval"         =  how often to perform SCHEDULED host checks
"active_checks_enabled"  =  whether or not Nagios executes a check when
needed

"when needed" can be triggered by "host check_interval" or "service
non-ok"
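
A minimal sketch of how the two directives might sit together in a host
definition, assuming Nagios 2.x object syntax. The host name is taken from
this thread; the address and the check-host-alive command name are
placeholders, not details from the poster's configuration. Other required
directives (notification settings, contact groups) are left out and would
normally come from a template.

```
# Illustrative host definition (values are assumptions, not from the thread).
define host{
        host_name               SC-Gateway
        address                 192.0.2.1          ; placeholder address
        check_command           check-host-alive   ; assumed command name
        max_check_attempts      2
        check_interval          0    ; no regularly SCHEDULED host checks
        active_checks_enabled   1    ; on-demand checks may still run
        }
```

With values like these, the host would be expected to show no next
scheduled check while still being checked on demand, for example when one
of its services goes non-OK.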


With retention turned on, some settings are retained and thus ignore changes
in .cfg files.
For more detailed info, see
http://nagios.sourceforge.net/docs/2_0/xodtemplate.html#retention_notes

NOTE: The retention notes specify that it only applies to settings changed
during runtime, but I've seen cases where undefining a setting does not
"clear" the setting if it's been set in the past through a .cfg file.
For example, enabling "flap detection" in the .cfg and then later leaving it
undefined in the .cfg left the host with flap detection still enabled.
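
One possible way to guard against this, sketched with Nagios 2.x directive
names: state the setting explicitly instead of leaving it undefined, and
optionally stop the host from retaining non-status settings. The values are
illustrative, the host definition is only a fragment, and the thread does
not confirm that this clears an already-retained value in every case.

```
# nagios.cfg (main config file): global state retention switch
retain_state_information=1

# Object config file: host definition fragment (illustrative values)
define host{
        host_name                     SC-Gateway
        flap_detection_enabled        0   ; set explicitly instead of omitting
        retain_status_information     1   ; keep up/down state across restarts
        retain_nonstatus_information  0   ; do not keep retained settings
        }
```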


> -----Original Message-----
> From: Andy Shellam (Mailing Lists)
> [mailto:andy.shellam-lists at mailnetwork.co.uk]
> Sent: Wednesday, November 08, 2006 2:45 PM
> To: Tedman Eng
> Cc: nagios-users at lists.sourceforge.net
> Subject: Re: [Nagios-users] Using Nagios to monitor 
> "service-less" hosts
> 
> 
> Ted,
> 
> I've stopped Nagios, removed all ".dat" files from var, and restarted it
> - all checks are now pending.
> However, I did look through retention.dat (I presume this is what you 
> meant - status.sav didn't exist) before I killed it, and the 
> check_interval parameter was not defined for any host.
> 
> I would think, surely, "state retention" only retains the service/host
> check states so if, for example, the Nagios machine reboots, when it
> comes back up it knows where it left off.  Otherwise if you change the
> config, you'd have to remember to remove all the .dat files (or at least
> retention.dat) in var before the config change takes effect, and I
> certainly haven't had to do that before.
> 
> And as far as Nagios was concerned, "scheduled active host checks" were
> OFF - or so it said in the config viewer.
> 
> I'll wait a couple of minutes, see where it goes from here......
> 
> OK 5 minutes have passed - no different.
> Service SC-Gateway - Ping = checked and confirmed OK at 22:38:15
> The host SC-Gateway = checked and confirmed OK at 22:38:15, then checked
> again at 22:39:40 and again at 22:41:40.
> And note "Next active scheduled check" reads N/A.
> 
> Andy.
> 
> Tedman Eng wrote:
> > If you have state retention enabled, then Nagios remembers lots of
> > settings and does not "reset" them when reloading a config (otherwise it
> > wouldn't be retaining).  "Host Active Checks Enabled" likely did not
> > disable themselves after changing the .cfg file, because the state was
> > "remembered" from previous runs.  Try stopping Nagios, clearing the
> > status.sav and restarting Nagios.
> >
> >   
> >> -----Original Message-----
> >> From: Andy Shellam (Mailing Lists)
> >> [mailto:andy.shellam-lists at mailnetwork.co.uk]
> >> Sent: Wednesday, November 08, 2006 12:58 PM
> >> To: Tedman Eng
> >> Cc: nagios-users at lists.sourceforge.net
> >> Subject: Re: [Nagios-users] Using Nagios to monitor 
> >> "service-less" hosts
> >>
> >>
> >> Hi Ted,
> >>
> >> I understand the distinction - I *did* have host checks actively
> >> scheduled (ie. the host parameter 'check_interval' set to 1 - this is
> >> now 0 so host checks shouldn't be scheduled, right?)  Yet Nagios IS
> >> checking the hosts every few minutes roughly, regardless of child
> >> service status.
> >>
> >> Here's a dead simple example - the FH-Gateway - it has a single service,
> >> which is a Ping.  The host also has a Ping set as its
> >> active_check_command parameter.
> >> Now, if I show you the service breakdown for the Ping _service_ on
> >> FH-Gateway:
> >>
> >> Current Status:                OK
> >> Status Information:            PING OK - Packet loss = 0%, RTA = 3.02 ms
> >> Performance Data:
> >> Current Attempt:               1/2
> >> State Type:                    HARD
> >> Last Check Type:               ACTIVE
> >> Last Check Time:               08-11-2006 20:49:37
> >> Status Data Age:               0d 0h 0m 51s
> >> Next Scheduled Active Check:   08-11-2006 20:50:37
> >> Latency:                       0.607 seconds
> >> Check Duration:                9.013 seconds
> >> Last State Change:             08-11-2006 10:46:46
> >> Current State Duration:        0d 10h 3m 42s
> >>
> >>
> >> Nagios reports it's been in the same state (ie. OK) for 10 hours, 3
> >> minutes, and 42 seconds right?
> >> So why was the host checked only a few seconds ago?
> >>
> >> Host Status:                   UP
> >> Status Information:            PING OK - Packet loss = 0%, RTA = 0.27 ms
> >> Performance Data:
> >> Current Attempt:               1/2
> >> State Type:                    HARD
> >> Last Check Type:               ACTIVE
> >> Last Check Time:               08-11-2006 20:50:49
> >> Status Data Age:               0d 0h 0m 39s
> >> Next Scheduled Active Check:   N/A
> >> Latency:                       9.113 seconds
> >> Check Duration:                9.011 seconds
> >> Last State Change:             07-11-2006 06:20:35
> >> Current State Duration:        1d 14h 30m 53s
> >> Last Host Notification:        N/A
> >> Current Notification Number:   0
> >> Is This Host Flapping?         NO
> >> Percent State Change:          0.00%
> >> In Scheduled Downtime?         NO
> >> Last Update:                   08-11-2006 20:51:16
> >>
> >>
> >> If the general line of thinking is correct, Nagios should have last
> >> checked the host back at (or around) 10:46 this morning when there was a
> >> blip in the service check.  But it didn't.  It does check them every 1-2
> >> minutes.
> >> My check_interval parameter is 0 - the config viewer in the web CGIs
> >> shows "enabled active checks" as NO for each host.
> >>
> >> Since I've been writing this - the above host has been checked again at
> >> 20:54:49 - exactly 4 minutes since the last check.  No change in the
> >> service status - 10 hours, 9 minutes now.
> >>
> >> Any ideas?
> >>
> >> Andy.
> >>
> >>
> >>
> >> Tedman Eng wrote:
> >>> Host checks are not actively scheduled in normal operation.
> >>>
> >>> You could go months without requiring a host check, and the status age of
> >>> the host check will show something like 81 days for example.
> >>>
> >>> If you see recent host checks, then that means there was a service problem
> >>> and Nagios wanted to be sure it wasn't the host.
> >>>
> >>> Perhaps if you thought of "host check" as "network link status", it would
> >>> make the distinction more clear.
> >>>
> >>>> -----Original Message-----
> >>>> From: Andy Shellam (Mailing Lists)
> >>>> [mailto:andy.shellam-lists at mailnetwork.co.uk]
> >>>> Sent: Wednesday, November 08, 2006 11:56 AM
> >>>> To: Sloane, Robert Raymond
> >>>> Cc: nagios-users at lists.sourceforge.net
> >>>> Subject: Re: [Nagios-users] Using Nagios to monitor 
> >>>> "service-less" hosts
> >>>>
> >>>>
> >>>> Sloane, Robert Raymond wrote:
> >>>>>> Last Check Time: 	08-11-2006 19:34:40
> >>>>>> Next Scheduled Active Check:   	N/A
> >>>>>
> >>>>> Interesting.  Nagios thinks the last check was run over a month ago.
> >>>>
> >>>> No, thankfully!  That date is the 8th November (British format.)
> >>>>
> >>>>> You wouldn't see anything about hosts in the scheduling queue.  Host
> >>>>> checks are run immediately, not through the queue.  That is why it is
> >>>>> best to not use them.
> >>>>
> >>>> I did when the check_interval was set to 1 in the hosts - it showed the
> >>>> host name and a blank service column.
> >>>> I'd mentioned this only to prove the point that the checks do not seem
> >>>> to be scheduled any more, so I cannot figure out why it's still running
> >>>> the host checks at (seemingly) regular intervals.
> >>>>
> >>>> There are no hosts under that machine (or indeed above it), and all
> >>>> services checks are up and have been for a good 6-8 hours.
> >>>>
> >>>> I'm stumped!
> >>>>
> >>>> Andy.
> >>>>
> >>>> -------------------------------------------------------------------------
> >>>> Using Tomcat but need to do more? Need to support web services, security?
> >>>> Get stuff done quickly with pre-integrated technology to make your job easier
> >>>> Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
> >>>> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
> >>> _______________________________________________
> >>> Nagios-users mailing list
> >>> Nagios-users at lists.sourceforge.net
> >>> https://lists.sourceforge.net/lists/listinfo/nagios-users
> >>> ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
> >>> ::: Messages without supporting info will risk being sent to /dev/null
> >>>
> 
> 
