Ndoutils bad performance

Cristiano Casado co.casado at gmail.com
Fri May 8 05:27:51 CEST 2009


Hello everyone.

I am using Nagios 3.0.6 with ndoutils 1.4b7 and MySQL 5.0 on a CentOS 5.2
Linux machine (kernel 2.6.18) with 2 GB of RAM and a 64-bit Intel Xeon
2.50 GHz processor, in a lab environment.

I chose ndoutils to keep historical information about checks in the
database (tables nagios_servicechecks and nagios_hostchecks), and NagVis
for the graphical representation of my network.

The functional tests went well, but the project reached the point where I
had to run a benchmark with 7,000 services, simulating my production
environment.

With the ndomod broker module loaded but the ndo2db daemon not running,
service check performance is good: almost 95% of services are checked
within the 5-minute window, and check latency is low. Note that all
services are active (not passive), with a check_interval of 5 minutes.

When I run the ndo2db daemon, which opens a Unix socket and starts
writing to the database, service check performance drops considerably:
only 9% of services are checked within the 5-minute window, check latency
is high, and CPU iowait sits between 90% and 100%. Checks that Nagios
should run every 5 minutes are delayed by up to 1 hour.
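For anyone who wants to reproduce the iowait measurement without extra tools, here is a minimal sketch that derives the iowait percentage from two samples of the aggregate "cpu" line in /proc/stat. The sample lines in the demo are invented, not taken from the machine in question, and the helper names are hypothetical.

```python
def parse_cpu_line(line):
    """Return the counters of the aggregate 'cpu' line from /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(v) for v in fields[1:]]


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two samples, in percent."""
    # Order: user nice system idle iowait irq softirq steal.
    # Only these first eight are summed, because later fields (guest time)
    # are already counted inside 'user' on newer kernels.
    b = parse_cpu_line(before)[:8]
    a = parse_cpu_line(after)[:8]
    deltas = [y - x for x, y in zip(b, a)]
    total = sum(deltas)
    return 100.0 * deltas[4] / total if total else 0.0


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat (Linux only)."""
    with open(path) as handle:
        return handle.readline()


if __name__ == "__main__":
    # Invented sample values, only to show the calculation.
    before = "cpu  100 0 50 800 50 0 0 0"
    after = "cpu  110 0 60 810 950 0 0 0"
    print(f"iowait: {iowait_percent(before, after):.1f}%")
```

On a live system, call read_cpu_line() twice a few seconds apart and pass both results to iowait_percent().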

Since MySQL runs on the same machine, I tuned it (buffers, threads, etc.)
and added extra table indexes.
As a test, I switched the tables in the nagios database to the BLACKHOLE
storage engine, so the database accepts connections and operations (select,
insert, delete, ...) but does not write the data to disk. The iowait stayed
high and performance stayed bad, so I no longer think the database is the
main suspect.
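The BLACKHOLE test can be repeated with statements like the ones this sketch generates. It assumes the test was done by changing the storage engine per table with ALTER TABLE, which the post does not spell out; the helper name is hypothetical, and the two table names are the ones mentioned earlier.

```python
def blackhole_statements(tables):
    """Build ALTER TABLE statements that switch tables to BLACKHOLE.

    BLACKHOLE accepts writes and discards them, so run this only against
    a throwaway copy of the database.
    """
    statements = []
    for table in tables:
        if "`" in table:
            raise ValueError(f"unexpected character in table name: {table!r}")
        statements.append(f"ALTER TABLE `{table}` ENGINE=BLACKHOLE;")
    return statements


if __name__ == "__main__":
    for sql in blackhole_statements(["nagios_servicechecks", "nagios_hostchecks"]):
        print(sql)
```

Note that SELECTs against a BLACKHOLE table return no rows, so anything that reads the data back (NagVis, for example) will see an empty database during the test.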

I noticed that the ndo2db daemon uses only 1 connection to the database
for many operations per second, and each operation waits for the previous
one to finish. I find this particularly bad.
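To illustrate why a single serialized connection worries me, here is a toy model: if every statement must wait for the previous one, throughput is capped at 1 / latency, and any arrival rate above that cap builds a backlog that grows linearly. All numbers in the demo are hypothetical and were not measured on this system.

```python
def serial_ceiling(latency_s):
    """Maximum statements per second when statements run strictly one by one."""
    return 1.0 / latency_s


def backlog_after(arrival_rate, service_rate, seconds):
    """Statements still waiting after `seconds` if arrivals exceed the ceiling."""
    return max(0.0, (arrival_rate - service_rate) * seconds)


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the shape of the problem.
    latency_s = 0.02        # assumed 20 ms per statement round trip
    arrivals_per_s = 70.0   # assumed rate of statements produced by the broker

    ceiling = serial_ceiling(latency_s)
    backlog = backlog_after(arrivals_per_s, ceiling, 300)
    print(f"ceiling: {ceiling:.0f} statements/s")
    print(f"backlog after 5 minutes: {backlog:.0f} statements")
```

The model ignores batching and buffering, so it only shows the direction of the effect, not its size.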

Question: is anyone using ndoutils with a database in a large Nagios
installation (> 1500 hosts, > 7000 services) without performance problems,
and could you give me some help?

Nagios settings are below.

I followed some of the recommendations in the Nagios tuning document:
http://nagios.sourceforge.net/docs/3_0/tuning.html


### Nagios Config ###
log_file=/var/log/nagios/nagios.log
object_cache_file=/var/nagios/objects.cache
precached_object_file=/var/nagios/objects.precache
resource_file=/etc/nagios/resource.cfg
status_file=/var/nagios/status/status.dat
status_update_interval=10
nagios_user=nagios
nagios_group=nagios
check_external_commands=1
command_check_interval=15s
command_file=/var/nagios/rw/nagios.cmd
external_command_buffer_slots=4096
lock_file=/var/run/nagios.pid
temp_file=/var/nagios/nagios.tmp
temp_path=/tmp
event_broker_options=-1
broker_module=/usr/libexec/ndomod-3x.o config_file=/etc/nagios/ndomod.cfg
log_rotation_method=d
log_archive_path=/var/log/nagios/archives
use_syslog=1
log_notifications=1
log_service_retries=1
log_host_retries=1
log_event_handlers=1
log_initial_states=0
log_external_commands=1
log_passive_checks=1
service_inter_check_delay_method=s
max_service_check_spread=30
service_interleave_factor=s
host_inter_check_delay_method=s
max_host_check_spread=30
max_concurrent_checks=0
check_result_reaper_frequency=5
max_check_result_reaper_time=5
check_result_path=/var/nagios/spool/checkresults
max_check_result_file_age=3600
cached_host_check_horizon=30
cached_service_check_horizon=60
enable_predictive_host_dependency_checks=1
enable_predictive_service_dependency_checks=1
soft_state_dependencies=0
auto_reschedule_checks=0
auto_rescheduling_interval=30
auto_rescheduling_window=180
sleep_time=0.25
service_check_timeout=60
host_check_timeout=30
event_handler_timeout=30
notification_timeout=30
ocsp_timeout=5
perfdata_timeout=5
retain_state_information=1
state_retention_file=/var/nagios/retention.dat
retention_update_interval=60
use_retained_program_state=1
use_retained_scheduling_info=1
retained_host_attribute_mask=0
retained_service_attribute_mask=0
retained_process_host_attribute_mask=0
retained_process_service_attribute_mask=0
retained_contact_host_attribute_mask=0
retained_contact_service_attribute_mask=0
interval_length=60
use_aggressive_host_checking=0
execute_service_checks=1
accept_passive_service_checks=1
execute_host_checks=1
accept_passive_host_checks=1
enable_notifications=1
enable_event_handlers=1
process_performance_data=0
obsess_over_services=0
obsess_over_hosts=0
translate_passive_host_checks=0
passive_host_checks_are_soft=0
check_for_orphaned_services=1
check_for_orphaned_hosts=1
check_service_freshness=1
service_freshness_check_interval=60
check_host_freshness=0
host_freshness_check_interval=60
additional_freshness_latency=15
enable_flap_detection=1
low_service_flap_threshold=5.0
high_service_flap_threshold=20.0
low_host_flap_threshold=5.0
high_host_flap_threshold=20.0
date_format=us
p1_file=/usr/bin/p1.pl
enable_embedded_perl=1
use_embedded_perl_implicitly=1
illegal_object_name_chars=`~!$%^&*|'"<>?,()=
illegal_macro_output_chars=`~$&|'"<>
use_regexp_matching=0
use_true_regexp_matching=0
admin_email=nagios@localhost
admin_pager=pagenagios@localhost
daemon_dumps_core=0
use_large_installation_tweaks=1
enable_environment_macros=0
debug_level=0
debug_verbosity=1
debug_file=/var/log/nagios/nagios.debug
max_debug_file_size=1000000
cfg_file=/etc/nagiosql/commands.cfg
cfg_file=/etc/nagiosql/contactgroups.cfg
cfg_file=/etc/nagiosql/contacts.cfg
cfg_file=/etc/nagiosql/contacttemplates.cfg
cfg_file=/etc/nagiosql/hostdependencies.cfg
cfg_file=/etc/nagiosql/hostescalations.cfg
cfg_file=/etc/nagiosql/hostextinfo.cfg
cfg_file=/etc/nagiosql/hostgroups.cfg
cfg_dir=/etc/nagiosql/hosts
cfg_file=/etc/nagiosql/hosttemplates.cfg
cfg_file=/etc/nagiosql/servicedependencies.cfg
cfg_file=/etc/nagiosql/serviceescalations.cfg
cfg_file=/etc/nagiosql/serviceextinfo.cfg
cfg_file=/etc/nagiosql/servicegroups.cfg
cfg_dir=/etc/nagiosql/services
cfg_file=/etc/nagiosql/servicetemplates.cfg
cfg_file=/etc/nagiosql/timeperiods.cfg
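
For anyone comparing this configuration against their own, a small sketch like the following can load a nagios.cfg-style file into a dictionary. It assumes simple key=value lines with '#' comments; the function name and the short SAMPLE excerpt are for illustration only. Repeated keys such as cfg_file are collected into lists, and only the first '=' is treated as the separator so that the broker_module line stays intact.

```python
from collections import defaultdict

# Short excerpt of the settings above, used as sample input.
SAMPLE = """\
# main config excerpt
event_broker_options=-1
broker_module=/usr/libexec/ndomod-3x.o config_file=/etc/nagios/ndomod.cfg
use_large_installation_tweaks=1
enable_environment_macros=0
cfg_file=/etc/nagiosql/commands.cfg
cfg_file=/etc/nagiosql/contacts.cfg
"""


def parse_nagios_cfg(text):
    """Parse key=value lines into {key: [values...]}, keeping repeated keys."""
    settings = defaultdict(list)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")  # split on the first '=' only
        if not sep:
            continue  # not a key=value line
        settings[key.strip()].append(value.strip())
    return dict(settings)


if __name__ == "__main__":
    cfg = parse_nagios_cfg(SAMPLE)
    for key in ("event_broker_options", "use_large_installation_tweaks"):
        print(key, "=", cfg.get(key))
    print("cfg_file entries:", len(cfg.get("cfg_file", [])))
```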