Nagios 3.0.4 performance issue

Alloo, Vincent v-alloo at ti.com
Wed Nov 19 12:18:40 CET 2008


After removing all "servicegroups" definitions, my load is back to normal.
I had put all my NRPE services into a single service group in order to set up a service dependency, which means 3600 services in the same service group.
That was causing the huge load seen on my machine.
Is this normal behavior, or is it a bug?
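
As a rough illustration of the scale involved, the Python sketch below counts how many individual dependency pairs a single group-based dependency could stand for. It assumes (this is not confirmed anywhere in this thread) that a dependency defined on service groups is expanded into one entry per (master, dependent) service pair. The function name and the two scenarios are hypothetical; only the figure of 3600 services comes from the message above.

```python
def expanded_dependency_count(master_services: int, dependent_services: int) -> int:
    """Number of (master, dependent) pairs if every master is paired with every dependent.

    Assumption: group-based dependencies expand to the full cross product.
    """
    return master_services * dependent_services


group_size = 3600  # services in the single service group (from the message)

# Hypothetical scenario A: one master service, the whole group depends on it.
one_master = expanded_dependency_count(1, group_size)

# Hypothetical scenario B: the same group is referenced on both sides.
group_on_both_sides = expanded_dependency_count(group_size, group_size)

print(one_master)
print(group_on_both_sides)
```

Under that assumption, referencing the same 3600-service group on both sides of a dependency would stand for roughly 13 million pairs, which would be one plausible source of the load. How the dependency was actually defined is not stated here.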

Regards,

Vincent Alloo
TI France Design Systems Operations Manager
Europe, Middle East and Africa IT Services
Texas Instruments France

E-Mail: v-alloo at ti.com
Phone: +33 4 93 22 26 97
Mobile: +33 6 82 13 00 80

From: Alloo, Vincent
Sent: Tuesday, November 18, 2008 5:35 PM
To: nagios-users at lists.sourceforge.net
Subject: [Nagios-users] Nagios 3.0.4 performance issue

Hello,
For the past few days (I can't see what changed), I have had a big performance issue (CPU load) with my Nagios install. I have tweaked some parameters without success. Any help is welcome!
My hardware is a Sun X4150, dual-core, 16 GB RAM, running Solaris 10u4.

Thanks.

load averages:  9.73,  11.0,  11.9;                    up 68+06:30:36                                17:31:47
267 processes: 252 sleeping, 6 running, 3 zombie, 6 on cpu
CPU states: 20.7% idle, 71.7% user,  7.6% kernel,  0.0% iowait,  0.0% swap
Memory: 16G phys mem, 12G free mem, 25G total swap, 25G free swap

   PID USERNAME LWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
 17332 nagios     1  50    0   14M 5224K sleep    0:00  8.71% nagios
 17333 nagios     1  50    0   14M 5200K sleep    0:00  8.69% nagios
 17330 nagios     1  50    0   14M 5252K sleep    0:00  8.67% nagios
 17348 nagios     1  40    0   14M 4968K run      0:00  5.72% nagios
 17347 nagios     1  40    0   14M 4812K cpu/1    0:00  4.57% nagios
 17349 nagios     1  30    0   14M 4792K cpu/2    0:00  4.11% nagios
 17351 nagios     1  30    0   14M 4748K run      0:00  3.12% nagios
 17353 nagios     1  40    0   14M 4668K cpu/0    0:00  1.84% nagios
 17352 nagios     1  40    0   14M 4600K run      0:00  0.87% nagios
 16307 nagios     2  50    0   14M   12M sleep    1:18  0.60% nagios
 17357 nagios     1  50    0 3556K 2260K sleep    0:00  0.21% check_nrpe
   230 valloo     1  59    0 3396K 2324K cpu/3    0:15  0.20% top
 17359 nagios     1  50    0 3548K 2132K sleep    0:00  0.14% check_nrpe
 17345 nagios     1  50    0 3552K 2136K sleep    0:00  0.13% check_nrpe
 17350 nagios     1  50    0   14M 2608K sleep    0:00  0.12% nagios

> ./nagiostats

Nagios Stats 3.0.4
Copyright (c) 2003-2008 Ethan Galstad (www.nagios.org)
Last Modified: 10-15-2008
License: GPL

CURRENT STATUS DATA
------------------------------------------------------
Status File:                            /opt/nagios/var/status.dat
Status File Age:                        0d 0h 0m 8s
Status File Version:                    3.0.4

Program Running Time:                   0d 0h 29m 19s
Nagios PID:                             16307
Used/High/Total Command Buffers:        0 / 1 / 4096

Total Services:                         3615
Services Checked:                       3611
Services Scheduled:                     3567
Services Actively Checked:              3615
Services Passively Checked:             0
Total Service State Change:             0.000 / 24.670 / 0.158 %
Active Service Latency:                 0.000 / 7.371 / 0.487 sec
Active Service Execution Time:          0.000 / 18.744 / 1.248 sec
Active Service State Change:            0.000 / 24.670 / 0.158 %
Active Services Last 1/5/15/60 min:     488 / 3350 / 3404 / 3567
Passive Service Latency:                0.000 / 0.000 / 0.000 sec
Passive Service State Change:           0.000 / 0.000 / 0.000 %
Passive Services Last 1/5/15/60 min:    0 / 0 / 0 / 0
Services Ok/Warn/Unk/Crit:              3478 / 40 / 15 / 82
Services Flapping:                      6
Services In Downtime:                   0

Total Hosts:                            1013
Hosts Checked:                          1013
Hosts Scheduled:                        1012
Hosts Actively Checked:                 1013
Host Passively Checked:                 0
Total Host State Change:                0.000 / 11.320 / 0.011 %
Active Host Latency:                    0.000 / 1.892 / 0.655 sec
Active Host Execution Time:             0.010 / 14.015 / 0.145 sec
Active Host State Change:               0.000 / 11.320 / 0.011 %
Active Hosts Last 1/5/15/60 min:        225 / 992 / 1012 / 1012
Passive Host Latency:                   0.000 / 0.000 / 0.000 sec
Passive Host State Change:              0.000 / 0.000 / 0.000 %
Passive Hosts Last 1/5/15/60 min:       0 / 0 / 0 / 0
Hosts Up/Down/Unreach:                  1011 / 2 / 0
Hosts Flapping:                         0
Hosts In Downtime:                      0

Active Host Checks Last 1/5/15 min:     297 / 1106 / 3263
   Scheduled:                           283 / 1033 / 3013
   On-demand:                           14 / 73 / 250
   Parallel:                            285 / 1046 / 3056
   Serial:                              0 / 0 / 0
   Cached:                              12 / 60 / 207
Passive Host Checks Last 1/5/15 min:    0 / 0 / 0
Active Service Checks Last 1/5/15 min:  578 / 3288 / 10054
   Scheduled:                           578 / 3288 / 10047
   On-demand:                           0 / 0 / 7
   Cached:                              0 / 0 / 2
Passive Service Checks Last 1/5/15 min: 0 / 0 / 0

External Commands Last 1/5/15 min:      0 / 0 / 0
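
To put the nagiostats figures in perspective, here is a minimal Python sketch that pulls the numbers out of a few of the lines above and estimates the check rate and the average number of service checks in flight (rate times average execution time, i.e. Little's law). The helper name parse_values and the choice of lines are illustrative assumptions; the numbers are copied from the output above.

```python
import re

# Three lines copied from the nagiostats output above.
SAMPLE = """\
Active Service Latency:                 0.000 / 7.371 / 0.487 sec
Active Service Execution Time:          0.000 / 18.744 / 1.248 sec
Active Service Checks Last 1/5/15 min:  578 / 3288 / 10054
"""


def parse_values(text):
    """Map each label to the list of numbers after the colon (hypothetical helper)."""
    stats = {}
    for line in text.splitlines():
        label, sep, values = line.partition(":")
        if not sep:
            continue  # not a "label: values" line
        stats[label.strip()] = [float(v) for v in re.findall(r"\d+(?:\.\d+)?", values)]
    return stats


stats = parse_values(SAMPLE)

# Checks completed in the last 5 minutes, converted to checks per second.
checks_per_sec = stats["Active Service Checks Last 1/5/15 min"][1] / 300.0

# nagiostats reports min / max / avg, so index 2 is the average.
avg_exec_sec = stats["Active Service Execution Time"][2]

# Little's law: average number in flight = arrival rate * average time in system.
avg_in_flight = checks_per_sec * avg_exec_sec

print(f"{checks_per_sec:.2f} checks/sec, about {avg_in_flight:.1f} checks running on average")
```

With these figures the estimate comes to roughly 11 service checks per second and about 14 in flight on average. This is a back-of-the-envelope calculation, not a measurement.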


accept_passive_host_checks=1
accept_passive_service_checks=1
additional_freshness_latency=15
admin_email=nagios at localhost
admin_pager=pagenagios at localhost
auto_reschedule_checks=1
auto_rescheduling_interval=30
auto_rescheduling_window=180
cached_host_check_horizon=15
cached_service_check_horizon=15
check_external_commands=1
check_for_orphaned_hosts=1
check_for_orphaned_services=1
check_host_freshness=0
check_result_path=/opt/nagios/var/spool/checkresults
check_result_reaper_frequency=4
check_service_freshness=1
command_check_interval=-1
command_file=/opt/nagios/var/rw/nagios.cmd
daemon_dumps_core=0
date_format=us
debug_file=/opt/nagios/var/nagios.debug
debug_level=0
debug_verbosity=1
enable_embedded_perl=1
enable_environment_macros=1
enable_event_handlers=1
enable_flap_detection=1
enable_notifications=1
enable_predictive_host_dependency_checks=1
enable_predictive_service_dependency_checks=1
event_broker_options=-1
event_handler_timeout=30
execute_host_checks=1
execute_service_checks=1
external_command_buffer_slots=4096
high_host_flap_threshold=20.0
high_service_flap_threshold=20.0
host_check_timeout=30
host_freshness_check_interval=60
host_inter_check_delay_method=s
illegal_macro_output_chars=`~$&|'"<>
illegal_object_name_chars=`~!$%^&*|'"<>?,()=
interval_length=60
lock_file=/opt/nagios/var/nagios.lock
log_archive_path=/db/sysadmin/nagios/archives/svxnagios03
log_event_handlers=1
log_external_commands=1
log_file=/opt/nagios/var/nagios.log
log_host_retries=1
log_initial_states=0
log_notifications=1
log_passive_checks=1
log_rotation_method=d
log_service_retries=1
low_host_flap_threshold=5.0
low_service_flap_threshold=5.0
max_check_result_file_age=3600
max_check_result_reaper_time=30
max_concurrent_checks=0
max_debug_file_size=1000000
max_host_check_spread=10
max_service_check_spread=10
nagios_group=nagios
nagios_user=nagios
notification_timeout=30
object_cache_file=/opt/nagios/var/objects.cache
obsess_over_hosts=0
obsess_over_services=0
ocsp_timeout=5
p1_file=/opt/nagios/3.0rc3/bin/p1.pl
passive_host_checks_are_soft=0
perfdata_timeout=5
precached_object_file=/opt/nagios/var/objects.precache
process_performance_data=1
resource_file=/opt/nagios/etc/resource.cfg
retain_state_information=1
retained_contact_host_attribute_mask=0
retained_contact_service_attribute_mask=0
retained_host_attribute_mask=0
retained_process_host_attribute_mask=0
retained_process_service_attribute_mask=0
retained_service_attribute_mask=0
retention_update_interval=60
service_check_timeout=60
service_freshness_check_interval=60
service_inter_check_delay_method=s
service_interleave_factor=s
service_perfdata_file=/var/spool/nagios/perfdata.log
service_perfdata_file_mode=a
service_perfdata_file_processing_command=process-service-perfdata
service_perfdata_file_processing_interval=30
service_perfdata_file_template=$LASTSERVICECHECK$||$HOSTNAME$||$SERVICEDESC$||$SERVICEOUTPUT$||$SERVICEPERFDATA$
sleep_time=0.125
soft_state_dependencies=0
state_retention_file=/opt/nagios/var/retention.dat
status_file=/opt/nagios/var/status.dat
status_update_interval=10
temp_file=/opt/nagios/var/nagios.tmp
temp_path=/tmp
translate_passive_host_checks=0
use_aggressive_host_checking=0
use_embedded_perl_implicitly=1
use_large_installation_tweaks=1
use_regexp_matching=1
use_retained_program_state=1
use_retained_scheduling_info=1
use_syslog=1
use_true_regexp_matching=0
