Nagios HANGS scheduling info

Fernando Shayani fernando.shayani at bsb.politec.com.br
Mon Dec 12 17:54:20 CET 2005


Well,

PERFORMANCE SUGGESTIONS
-----------------------
I have no suggestions - things look okay.

That's what it says... But I will set max_concurrent_checks to 5 anyway.

Thanks for the tip. 


Fernando Shayani
fernando.shayani at bsb.politec.com.br
(61) 3038-6951
POLITEC - Brasília - DF

-----Original Message-----
From: Marco Ramos [mailto:mramos at co.sapo.pt] 
Sent: Monday, December 12, 2005 12:35
To: Fernando Shayani
Cc: nagios-users at lists.sourceforge.net
Subject: RE: [Nagios-users] Nagios HANGS scheduling info


The problem is most likely that max_concurrent_checks is set to 0, which places no limit on the number of checks run in parallel. Run "nagios -s nagios.cfg" and set max_concurrent_checks to the value it suggests.
This should fix it.
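
To double-check the scheduling directives before and after a change like this, something along the following lines could help. It is only a rough sketch: the function names (parse_main_config, scheduling_report) and the list of keys are made up for illustration and are not part of Nagios, and it assumes the main config file uses plain "key=value" lines with "#" comments.

```python
# Sketch only: parse "key=value" lines in the style of nagios.cfg and
# report the scheduling directives discussed in this thread.
# The helper names and SCHEDULING_KEYS are illustrative assumptions.
SCHEDULING_KEYS = (
    "max_concurrent_checks",
    "service_reaper_frequency",
    "sleep_time",
    "max_service_check_spread",
)


def parse_main_config(text):
    """Return a dict of directive -> value, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def scheduling_report(settings):
    """Return human-readable notes about the scheduling directives."""
    notes = []
    for key in SCHEDULING_KEYS:
        notes.append(f"{key}={settings.get(key, '<not set>')}")
    if settings.get("max_concurrent_checks") == "0":
        notes.append("max_concurrent_checks=0 means no limit on parallel checks")
    return notes


if __name__ == "__main__":
    sample = """
    sleep_time=1
    max_concurrent_checks=0
    service_reaper_frequency=2
    max_service_check_spread=15
    """
    for note in scheduling_report(parse_main_config(sample)):
        print(note)
```

The sketch only reports what is configured. The value to use for max_concurrent_checks should still come from the "nagios -s" output for the actual installation.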

Best regards,
Marco Ramos 

On Mon, 2005-12-12 at 10:54 -0300, Fernando Shayani wrote:
> Well, it STILL hangs... Here is my configuration:
> 
> log_file=/usr/local/nagios/var/nagios.log
> temp_file=/usr/local/nagios/var/nagios.tmp
> status_file=/usr/local/nagios/var/status.dat
> aggregate_status_updates=1
> status_update_interval=3
> nagios_user=nagios
> nagios_group=nagios
> enable_notifications=1
> execute_service_checks=1
> accept_passive_service_checks=1
> enable_event_handlers=1
> log_rotation_method=d
> log_archive_path=/usr/local/nagios/var/archives
> check_external_commands=1
> command_check_interval=-1
> command_file=/usr/local/nagios/var/rw/nagios.cmd
> downtime_file=/usr/local/nagios/var/downtime.dat
> comment_file=/usr/local/nagios/var/comments.dat
> lock_file=/usr/local/nagios/var/nagios.lock
> retain_state_information=1
> state_retention_file=/usr/local/nagios/var/retention.dat
> retention_update_interval=360
> use_retained_program_state=1
> use_syslog=1
> log_notifications=1
> log_service_retries=0
> log_host_retries=0
> log_event_handlers=1
> log_initial_states=0
> log_external_commands=0
> log_passive_checks=0
> sleep_time=1
> service_interleave_factor=s
> max_concurrent_checks=0
> service_reaper_frequency=2
> interval_length=60
> use_aggressive_host_checking=0
> enable_flap_detection=1
> low_service_flap_threshold=20
> high_service_flap_threshold=80
> low_host_flap_threshold=20
> high_host_flap_threshold=80
> soft_state_dependencies=0
> service_check_timeout=25
> host_check_timeout=10
> event_handler_timeout=30
> notification_timeout=15
> ocsp_timeout=60
> perfdata_timeout=60
> obsess_over_services=0
> process_performance_data=0
> check_for_orphaned_services=1
> check_service_freshness=0
> freshness_check_interval=60
> date_format=euro
> illegal_object_name_chars=`~!$%&*|'"<>?,()=
> illegal_macro_output_chars=`~$&|'"<>
> admin_email=fernando.shayani at bsb.politec.com.br
> service_inter_check_delay_method=s
> max_service_check_spread=15
> host_inter_check_delay_method=s
> max_host_check_spread=15
> auto_reschedule_checks=1
> auto_rescheduling_interval=30
> auto_rescheduling_window=180
> 
> 
> And here is my STATS:
> 
> CURRENT STATUS DATA
> ----------------------------------------------------
> Status File:                          /usr/local/nagios/var/status.dat
> Status File Age:                      0d 0h 0m 3s
> Status File Version:                  2.0b6
> 
> Program Running Time:                 0d 2h 47m 39s
> 
> Total Services:                       314
> Services Checked:                     314
> Services Scheduled:                   313
> Active Service Checks:                314
> Passive Service Checks:               0
> Total Service State Change:           0.000 / 12.110 / 0.299 %
> Active Service Latency:               0.003 / 358.274 / 16.984 %
> Active Service Execution Time:        0.036 / 25.014 / 2.207 sec
> Active Service State Change:          0.000 / 12.110 / 0.299 %
> Active Services Last 1/5/15/60 min:   48 / 182 / 267 / 277
> Passive Service State Change:         0.000 / 0.000 / 0.000 %
> Passive Services Last 1/5/15/60 min:  0 / 0 / 0 / 0
> Services Ok/Warn/Unk/Crit:            304 / 3 / 4 / 3
> Services Flapping:                    0
> Services In Downtime:                 0
> 
> Total Hosts:                          129
> Hosts Checked:                        129
> Hosts Scheduled:                      1
> Active Host Checks:                   129
> Passive Host Checks:                  0
> Total Host State Change:              0.000 / 10.260 / 0.291 %
> Active Host Latency:                  0.000 / 0.176 / 0.001 %
> Active Host Execution Time:           0.000 / 5.973 / 2.012 sec
> Active Host State Change:             0.000 / 10.260 / 0.291 %
> Active Hosts Last 1/5/15/60 min:      1 / 2 / 3 / 6
> Passive Host State Change:            0.000 / 0.000 / 0.000 %
> Passive Hosts Last 1/5/15/60 min:     0 / 0 / 0 / 0
> Hosts Up/Down/Unreach:                128 / 1 / 0
> Hosts Flapping:                       0
> Hosts In Downtime:                    0
> 
> 
> Please... Help... 
> 
> 
> Fernando Shayani
> fernando.shayani at bsb.politec.com.br
> (61) 3038-6951
> POLITEC - Brasília - DF
> 
> -----Original Message-----
> From: nagios-users-admin at lists.sourceforge.net 
> [mailto:nagios-users-admin at lists.sourceforge.net] On Behalf Of 
> Fernando Shayani
> Sent: Thursday, December 8, 2005 07:45
> To: Marco Ramos
> Cc: nagios-users at lists.sourceforge.net
> Subject: RE: [Nagios-users] Nagios HANGS scheduling info
> 
> OK. It keeps hanging...
>  
> Now I have changed service_reaper_frequency from 5 to 2... Let's see.
> Thanks
> Fernando
> 
> 	-----Original Message-----
> 	From: Marco Ramos [mailto:mramos at co.sapo.pt]
> 	Sent: Tue 06-Dec-05 16:42
> 	To: Fernando Shayani
> 	Cc: nagios-users at lists.sourceforge.net
> 	Subject: RE: [Nagios-users] Nagios HANGS scheduling info
> 	
> 	
> 
> 
> 	Try to tune your service_reaper_frequency and max_concurrent_checks
> 	values. Take a look at http://nagios.org/faqs/viewfaq.php?faq_id=115.
> 	
> 	I had the same problem a while ago and managed to solve it by tuning these
> 	two options.
> 	
> 	regards,
> 	Marco Ramos
> 	
> 	On Tue, 2005-12-06 at 14:53 -0300, Fernando Shayani wrote:
> 	> Well, my configuration is:
> 	>
> 	> Inter-check sleep time (sleep_time=0.25)
> 	> Service inter-check delay method (service_inter_check_delay_method=s)
> 	> Maximum service check spread (max_service_check_spread=2)
> 	> Service interleave factor (service_interleave_factor=s)
> 	> Maximum concurrent service checks (max_concurrent_checks=0)
> 	> Service reaper frequency (service_reaper_frequency=5)
> 	> Host inter-check delay method (host_inter_check_delay_method=s)
> 	> Maximum host check spread (max_host_check_spread=2)
> 	> Timing interval length (interval_length=60)
> 	> Aggressive host checking option (use_aggressive_host_checking=0)
> 	>
> 	> The following options are not set.
> 	> Auto-rescheduling option
> 	> Auto-rescheduling interval
> 	> Auto-rescheduling window
> 	>
> 	>
> 	> I will read the configuration and recheck it all.
> 	>
> 	> I also got this SYSLOG line right after the problem occurred. I hope it helps you help me.
> 	>
> 	> Dec  6 06:33:24 bsbserv007 nagios: Warning: The check of service 'CPU LOAD' on host 'BSBSERV017' could not be performed due to a fork() error.  The check will be rescheduled.
> 	>
> 	>
> 	> Thanks for the help.
> 	>
> 	>
> 	> Fernando Shayani
> 	> fernando.shayani at bsb.politec.com.br
> 	> (61) 3038-6951
> 	> POLITEC - Brasília - DF
> 	>
> 	> -----Original Message-----
> 	> From: Marcel Mitsuto Fucatu Sugano [mailto:msugano at uolinc.com]
> 	> Sent: Monday, December 5, 2005 15:33
> 	> To: Fernando Shayani
> 	> Cc: Eli Stair; nagios-users at lists.sourceforge.net
> 	> Subject: RE: [Nagios-users] Nagios HANGS scheduling info
> 	>
> 	> On Mon, 2005-12-05 at 09:01 -0300, Fernando Shayani wrote:
> 	> > Well, I upgraded to b6, enabled the orphaned service check, and it still
> 	> > hangs...
> 	> > 
> 	> > Is there any other clue?
> 	> > 
> 	> > Fernando Shayani
> 	> > fernando.shayani at bsb.politec.com.br
> 	> > (61) 3038-6951
> 	> > POLITEC - Brasília - DF
> 	>
> 	> Have you followed the instructions available at:
> 	> http://nagios.sourceforge.net/docs/2_0/configmain.html ?
> 	>
> 	> Read that and check the following configs:
> 	> Inter-check sleep time
> 	> Service inter-check delay method
> 	> Maximum service check spread
> 	> Service interleave factor
> 	> Maximum concurrent service checks
> 	> Service reaper frequency
> 	> Host inter-check delay method
> 	> Maximum host check spread
> 	> Timing interval length
> 	> Auto-rescheduling option
> 	> Auto-rescheduling interval
> 	> Auto-rescheduling window
> 	>
> 	> Aggressive host checking option
> 	>
> 	> These are all related to scheduling and might be the source of your problem. Have you changed any of those variables? Either way, post the values of these configuration variables.
> 	>
> 	> HTH,
> 	> --
> 	> Marcel Mitsuto Fucatu Sugano <msugano at uolinc.com> Universo Online S.A. -- http://www.uol.com.br
> 	>
> 	>
> 	>
> 	> -------------------------------------------------------
> 	> This SF.net email is sponsored by: Splunk Inc. Do you grep through log files
> 	> for problems?  Stop!  Download the new AJAX search engine that makes
> 	> searching your log files as easy as surfing the  web.  DOWNLOAD SPLUNK!
> 	> http://ads.osdn.com/?ad_id=7637&alloc_id=16865&op=click
> 	> _______________________________________________
> 	> Nagios-users mailing list
> 	> Nagios-users at lists.sourceforge.net
> 	> https://lists.sourceforge.net/lists/listinfo/nagios-users
> 	> ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
> 	> ::: Messages without supporting info will risk being sent to /dev/null
> 	>
> 	
> 	
> 
> 
> 
> 








