service latency troubles

Antoine Musso antoine.musso at laposte.fr
Fri Oct 10 17:26:12 CEST 2008


Hello,

  For testing purposes, we are monitoring 1750 services on 580 hosts. We 
would like to check each service every 300 seconds. Unfortunately, Nagios 
reports a service check latency of 150 to 250 seconds!

I noticed that Nagios launches roughly 200 checks and then idles for 
roughly a minute. I observed this with a crude script:

   while true; do ps -u nagios | wc -l; sleep 1; done;
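
A slightly more robust variant of that loop, as a sketch only: `ps -o pid=` 
suppresses the header line (which otherwise inflates the count by one), and 
each sample is timestamped so bursts and idle periods can be compared with 
the Nagios log. The function names and the `nagios` user name are 
assumptions made for illustration.

```shell
# Count processes owned by a user; "-o pid=" prints bare PIDs with no header.
count_procs() {
    ps -u "$1" -o pid= 2>/dev/null | wc -l | tr -d ' '
}

# Print "HH:MM:SS count" once per second for a given number of samples.
sample_procs() {
    user=$1
    samples=$2
    i=0
    while [ "$i" -lt "$samples" ]; do
        printf '%s %s\n' "$(date +%H:%M:%S)" "$(count_procs "$user")"
        i=$((i + 1))
        # Sleep between samples, but not after the last one.
        if [ "$i" -lt "$samples" ]; then
            sleep 1
        fi
    done
}
```

For example, `sample_procs nagios 120` would record two minutes of samples.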


Our service_inter_check_delay_method and service_interleave_factor are 
both set to smart:

     SERVICE SCHEDULING INFORMATION
     -------------------------------
     Total services:                     1750
     Total scheduled services:           1750
     Service inter-check delay method:   SMART
     Average service check interval:     300.00 sec
                                         ^^^
This is our aim -----------------------///

     Inter-check delay:                  0.17 sec
     Interleave factor method:           SMART
     Average services per host:          3.01
     Service interleave factor:          4
     Max service check spread:           5 min
     First scheduled check:              Fri Oct 10 17:00:15 2008
     Last scheduled check:               Fri Oct 10 17:05:15 2008
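
As a sanity check on the figures above, here is a small sketch of the 
arithmetic, assuming the smart method simply spreads all checks evenly over 
the average check interval. The function names are made up for this example.

```shell
# Even spreading: delay between consecutive checks = interval / services.
inter_check_delay() {
    awk -v interval="$1" -v services="$2" \
        'BEGIN { printf "%.2f\n", interval / services }'
}

# Throughput needed to visit every service once per interval (checks/sec).
required_rate() {
    awk -v interval="$1" -v services="$2" \
        'BEGIN { printf "%.2f\n", services / interval }'
}

inter_check_delay 300 1750   # prints 0.17
required_rate 300 1750       # prints 5.83
```

The 0.17 sec result matches the reported inter-check delay, so the schedule 
itself asks for just under 6 checks per second.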


The maximum number of concurrent service checks is set to 200:

      CHECK PROCESSING INFORMATION
      ----------------------------
      Check result reaper interval:       10 sec
      Max concurrent service checks:      200
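
To compare these reported values with what is actually configured, the 
scheduling-related directives can be pulled out of the main configuration 
file. This is a sketch: which directives are present depends on the Nagios 
version (for instance, the reaper setting is named differently in 2.x and 
3.x), and both the function name and the path in the usage note are 
assumptions.

```shell
# Print the scheduling-related directives found in a Nagios main config file.
show_sched_settings() {
    grep -E '^[[:space:]]*(max_concurrent_checks|service_inter_check_delay_method|service_interleave_factor|max_service_check_spread|sleep_time|service_reaper_frequency|check_result_reaper_frequency|max_check_result_reaper_time)[[:space:]]*=' "$1"
}
```

Usage, assuming a default source install: 
`show_sched_settings /usr/local/nagios/etc/nagios.cfg`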


And here is an overview of the nagiostats output (we use neither passive 
checks nor flap detection):

   Total Services:                        1750
   Services Checked:                      1750
   Services Scheduled:                    1750
   Services Actively Checked:             1750

   Total Service State Change:            0.000 / 55.720 / 0.530 %
   Active Service Latency:                146.819 / 214.765 / 177.500 sec
   Active Service Execution Time:         0.083 / 12.984 / 1.074 sec
   Active Service State Change:           0.000 / 55.720 / 0.530 %
   Active Services Last 1/5/15/60 min:    92 / 1098 / 1750 / 1750
   Services Ok/Warn/Unk/Crit:             1671 / 60 / 7 / 12

   Active Service Checks Last 1/5/15 min: 176 / 1159 / 3432
      Scheduled:                          176 / 1159 / 3432
      On-demand:                          0 / 0 / 0
      Cached:                             0 / 0 / 0
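
A rough reading of these statistics, sketched as arithmetic. The inputs are 
taken from the output above; the function names are invented for this 
example, and the concurrency estimate applies Little's law (rate times 
average duration), which is an approximation rather than anything Nagios 
reports.

```shell
# Observed throughput in checks/sec over a time window.
observed_rate() {
    awk -v checks="$1" -v window="$2" \
        'BEGIN { printf "%.2f\n", checks / window }'
}

# Seconds needed to get through every service once at the observed throughput.
effective_interval() {
    awk -v services="$1" -v checks="$2" -v window="$3" \
        'BEGIN { printf "%.0f\n", services * window / checks }'
}

# Little's law: average checks in flight = arrival rate * average run time.
avg_concurrency() {
    awk -v rate="$1" -v exec_time="$2" \
        'BEGIN { printf "%.1f\n", rate * exec_time }'
}

observed_rate 1159 300            # prints 3.86
effective_interval 1750 1159 300  # prints 453
avg_concurrency 5.83 1.074        # prints 6.3
```

At about 3.86 checks per second, a full pass takes roughly 453 seconds 
instead of 300, which is broadly in line with the reported latency. At the 
required rate and a 1.074 sec average execution time, only about 6 checks 
would need to be in flight on average, far below the limit of 200, so the 
plugins' run time alone does not appear to explain the delay.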


I have not found a way to make Nagios launch service checks more often 
than once a minute. Does anyone have any idea? :)


-- 
Antoine MUSSO
DISIT/PROD/QFO/OUT
mailto:antoine.musso at laposte.fr
tél. 02 40 12 73 62