Nagios scheduling queue seems to be lagging behind real time

tom.welsh at bt.com
Fri May 28 14:26:55 CEST 2004


Hi,

Only the main nagios process is running, but lots of child processes appear while checks are being executed, e.g.:

ps -eaf | grep nagios
     pms 21318     1  1 08:50:08 ?        3:40 /opt/pms/nagios/bin/nagios -d /opt/pms/nagios/etc/nagios.cfg
     pms   822   344  0 13:24:25 pts/18   0:00 grep nagios
     pms   806   805  0 13:24:22 ?        0:00 sh -c /opt/pms/nagios/libexec/check_ping -H 172.16.118.103 -w 1000,70% -c 1000,
     pms   683     1  0 13:24:17 ?        0:00 /opt/pms/nagios/bin/nagios -d /opt/pms/nagios/etc/nagios.cfg
     pms   691     1  0 13:24:17 ?        0:00 /opt/pms/nagios/bin/nagios -d /opt/pms/nagios/etc/nagios.cfg
     pms   766     1  0 13:24:18 ?        0:00 /opt/pms/nagios/bin/nagios -d /opt/pms/nagios/etc/nagios.cfg
     pms   807   806  0 13:24:22 ?        0:00 /opt/pms/nagios/libexec/check_ping -H 172.16.118.103 -w 1000,70% -c 1000,80% -p
     pms   538     1  0 13:24:15 ?        0:00 /opt/pms/nagios/bin/nagios -d /opt/pms/nagios/etc/nagios.cfg
     pms   547     1  0 13:24:15 ?        0:00 /opt/pms/nagios/bin/nagios -d /opt/pms/nagios/etc/nagios.cfg
     pms   754     1  0 13:24:17 ?        0:00 /opt/pms/nagios/bin/nagios -d /opt/pms/nagios/etc/nagios.cfg
     pms   805 21318  0 13:24:22 ?        0:00 /opt/pms/nagios/bin/nagios -d /opt/pms/nagios/etc/nagios.cfg
     pms   707     1  0 13:24:17 ?        0:00 /opt/pms/nagios/bin/nagios -d /opt/pms/nagios/etc/nagios.cfg
     pms   555     1  0 13:24:15 ?        0:00 /opt/pms/nagios/bin/nagios -d /opt/pms/nagios/etc/nagios.cfg
     pms   602     1  0 13:24:16 ?        0:00 /opt/pms/nagios/bin/nagios -d /opt/pms/nagios/etc/nagios.cfg
     pms   610     1  0 13:24:16 ?        0:00 /opt/pms/nagios/bin/nagios -d /opt/pms/nagios/etc/nagios.cfg

I restarted the nagios service at 08:50:00, so I believe that I have only one main process running.
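One way to check that claim mechanically is to separate the long-lived daemon from the short-lived check children in the `ps` listing. The sketch below is only a rough heuristic, not anything Nagios provides: it treats a `nagios -d` process as a main daemon if its parent is PID 1 and it is at least `min_age` seconds older than the newest process in the snapshot. The function names and the 120-second threshold are hypothetical (the threshold is simply chosen to exceed the service_check_timeout of 63 quoted below), and the parsing assumes every process started on the same day, so STIME is always HH:MM:SS.

```python
# A few lines copied from the ps listing above (columns: UID PID PPID C STIME TTY TIME CMD).
PS_OUTPUT = """\
     pms 21318     1  1 08:50:08 ?        3:40 /opt/pms/nagios/bin/nagios -d /opt/pms/nagios/etc/nagios.cfg
     pms   822   344  0 13:24:25 pts/18   0:00 grep nagios
     pms   806   805  0 13:24:22 ?        0:00 sh -c /opt/pms/nagios/libexec/check_ping -H 172.16.118.103 -w 1000,70% -c 1000,
     pms   683     1  0 13:24:17 ?        0:00 /opt/pms/nagios/bin/nagios -d /opt/pms/nagios/etc/nagios.cfg
     pms   805 21318  0 13:24:22 ?        0:00 /opt/pms/nagios/bin/nagios -d /opt/pms/nagios/etc/nagios.cfg
"""


def parse_ps(text):
    """Split each ps line into its 8 columns; skip lines that do not fit."""
    procs = []
    for line in text.splitlines():
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue
        _uid, pid, ppid, _c, stime, _tty, _time, cmd = fields
        procs.append({"pid": int(pid), "ppid": int(ppid), "stime": stime, "cmd": cmd})
    return procs


def to_seconds(hms):
    """Convert an HH:MM:SS start time to seconds since midnight."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


def long_running_daemons(procs, min_age=120):
    """Heuristic: nagios -d processes owned by init that are clearly older than a check."""
    if not procs:
        return []
    # Use the newest start time in the listing as an approximation of "now".
    newest = max(to_seconds(p["stime"]) for p in procs)
    return [
        p for p in procs
        if "/bin/nagios -d" in p["cmd"]
        and p["ppid"] == 1
        and newest - to_seconds(p["stime"]) >= min_age
    ]


main_daemons = long_running_daemons(parse_ps(PS_OUTPUT))
print([p["pid"] for p in main_daemons])  # -> [21318]
```

On this sample only PID 21318 qualifies; the other `nagios -d` entries with parent 1 started within seconds of the snapshot, which is what short-lived check children would look like if they are reparented to init.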

Cheers

Tom

-----Original Message-----
From: Thales Maia [mailto:tchagas at uolinc.com]
Sent: 28 May 2004 13:05
To: Welsh,T,Tom,XJH2A C
Cc: nagios-users at lists.sourceforge.net
Subject: Re: [Nagios-users] Nagios scheduling queue seems to be
lagging behind real time


Another trick: Maybe more than 1 main nagios process is running.


On Fri, 2004-05-28 at 04:49, tom.welsh at bt.com wrote:
> Hi All
> 
> I'm new to this list so try to be kind :)
> 
> OS: Solaris with current patches applied
> 
> Hardware: Sun E220R, 2 * 450MHz UltraSPARC II CPUs, 1 GB RAM, 2 * 18 GB SCSI drives
> 
> Monitored Hosts: 64
> 
> Monitored Services: 2094
> 
> Monitoring interval: 5 mins
> 
> Nagios version: 1.0 (yes, I know there is a newer version, but an upgrade is not an option for us just now :( )
> 
> Problem:
> Nagios appears to be running fine, but when I look at my scheduling queue, the checks at the top of the queue are constantly 16 - 17 mins behind real time. The current time on my box is 08:40, yet the next entry in the queue was scheduled to run at 08:24.
> 
> I have read the section in the docs regarding scheduling and adjusted my max_concurrent_checks accordingly. My service_reaper_frequency is still set to 10.
> 
> Can anyone point out where I can make changes to bring this system back into line with real time?
> 
> Here is the output from nagios -s ../etc/nagios.cfg
> 
> 
>         SERVICE SCHEDULING INFORMATION
>         -------------------------------
>         Total services:             2094
>         Total hosts:                64
> 
>         Command check interval:     -1 sec
>         Check reaper interval:      10 sec
> 
>         Inter-check delay method:   SMART
>         Average check interval:     341.117 sec
>         Inter-check delay:          0.163 sec
> 
>         Interleave factor method:   SMART
>         Average services per host:  32.719
>         Service interleave factor:  33
> 
>         Initial service check scheduling info:
>         --------------------------------------
>         First scheduled check:      1085729805 -> Fri May 28 08:36:45 2004
>         Last scheduled check:       1085730148 -> Fri May 28 08:42:28 2004
> 
>         Rough guidelines for max_concurrent_checks value:
>         -------------------------------------------------
>         Absolute minimum value:     62
>         Recommend value:            186
> 
> 
> From nagios.cfg...
> 
> inter_check_delay_method=s
> service_interleave_factor=s
> max_concurrent_checks=186
> service_reaper_frequency=10
> sleep_time=1
> service_check_timeout=63
> host_check_timeout=30
> event_handler_timeout=30
> notification_timeout=30
> ocsp_timeout=5
> perfdata_timeout=5
> 
> 
> Thanks for your help
> 
> Regards,
> 
> 
> Tom 
> 
> 
> 
> 
> -------------------------------------------------------
> This SF.Net email is sponsored by: Oracle 10g
> Get certified on the hottest thing ever to hit the market... Oracle 10g. 
> Take an Oracle 10g class now, and we'll give you the exam FREE.
> http://ads.osdn.com/?ad_id149&alloc_id66&op=click
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
> ::: Messages without supporting info will risk being sent to /dev/null
-- 
THALES MAIA CHAGAS
Sysadmin - UOL S/A
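The scheduling figures in the quoted `nagios -s` output can be cross-checked with a little arithmetic. The sketch below uses only numbers from that output and from nagios.cfg. The relationships it applies (delay = average interval / services, interleave = services per host rounded up, minimum concurrency = reaper interval / delay rounded up, recommended = 3 x minimum) are assumptions that happen to reproduce the reported values; they are not confirmed here as the exact way Nagios 1.0 computes them.

```python
import math

# Values copied from the quoted "nagios -s" output and nagios.cfg.
total_services = 2094
total_hosts = 64
avg_check_interval = 341.117      # seconds
reaper_interval = 10              # service_reaper_frequency, seconds
first_check = 1085729805          # epoch seconds
last_check = 1085730148           # epoch seconds

# Assumed relationship: spread all services evenly over one average interval.
inter_check_delay = avg_check_interval / total_services
print(round(inter_check_delay, 3))            # -> 0.163

# Assumed relationship: interleave factor is services per host, rounded up.
services_per_host = total_services / total_hosts
print(round(services_per_host, 3))            # -> 32.719
print(math.ceil(services_per_host))           # -> 33

# Assumed relationship: checks started during one reaper interval, rounded up.
min_concurrent = math.ceil(reaper_interval / inter_check_delay)
print(min_concurrent, 3 * min_concurrent)     # -> 62 186

# Window over which the initial checks are spread, and the implied start rate.
print(last_check - first_check)               # -> 343
print(round(1 / inter_check_delay, 2))        # -> 6.14

# Lag reported in the message: clock at 08:40, head of queue due at 08:24.
lag_minutes = (8 * 60 + 40) - (8 * 60 + 24)
print(lag_minutes)                            # -> 16
```

Read this way, the configuration asks Nagios to start roughly six checks per second. A queue that sits a steady 16 minutes behind suggests checks are being started or reaped more slowly than that, though the arithmetic alone does not say why.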








