Nagios scaling

Jason Martin jhmartin at toger.us
Tue Sep 7 18:22:34 CEST 2004


I compiled the latest Nagios 2.0cvs with gcc's profiling and debugging
options. After loading 10,002 services and launching nagios, I ran
status.cgi. As the profile below shows, two functions account for
roughly 80% of the runtime. Any improvement to these two functions
would significantly speed up the responsiveness of the CGIs with
large service lists.

$ gprof status.cgi |head
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 41.13      3.94     3.94    10002     0.39     0.41  add_service_status
 38.83      7.66     3.72    10002     0.37     0.40  add_service
  4.28      8.07     0.41  1122533     0.00     0.00  strip

For the same set of services I profiled the daemon. 

$ gprof nagios |head
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 53.58      2.77     2.77    10002     0.28     0.28  xodtemplate_find_service
 24.95      4.06     1.29    12361     0.10     0.10  add_event
  9.09      4.53     0.47    10002     0.05     0.05  add_service
  4.84      4.78     0.25       11    22.73    22.73  xsddefault_save_status_data

I noticed that the expensive functions perform linear list
traversals, which become costly at this scale. Perhaps some
sort of in-memory index could be built to optimize that?
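
To make the idea concrete, here is a minimal sketch of one possible
in-memory index: a chained hash table keyed on host name plus service
description, kept alongside the linked list. Everything in it is an
assumption for illustration and is not taken from the Nagios source:
the simplified `service` struct, the names `index_add_service` and
`index_find_service`, the bucket count, and the djb2-style hash.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bucket count: a power of two, at least the expected service count. */
#define INDEX_BUCKETS 16384

/* Hypothetical, simplified stand-in for a service record (not the Nagios struct). */
typedef struct service {
    const char *host_name;
    const char *description;
    struct service *next;       /* linked list, still available for full traversals */
    struct service *hash_next;  /* chain of entries sharing one hash bucket */
} service;

service *service_list = NULL;
static service *service_index[INDEX_BUCKETS];

/* djb2-style hash over host name, a separator byte, and description. */
static unsigned long hash_key(const char *host, const char *desc)
{
    unsigned long h = 5381;
    const unsigned char *p;

    for (p = (const unsigned char *)host; *p; p++)
        h = h * 33 + *p;
    h = h * 33 + '\n';
    for (p = (const unsigned char *)desc; *p; p++)
        h = h * 33 + *p;
    return h & (INDEX_BUCKETS - 1);
}

/* Lookup walks only one bucket chain instead of the whole service list. */
service *index_find_service(const char *host, const char *desc)
{
    service *s;

    assert(host != NULL && desc != NULL);
    for (s = service_index[hash_key(host, desc)]; s != NULL; s = s->hash_next)
        if (strcmp(s->host_name, host) == 0 && strcmp(s->description, desc) == 0)
            return s;
    return NULL;
}

/* Adds a service to both the list and the index.
   Returns NULL on a duplicate or on allocation failure.
   The strings are not copied; the caller must keep them alive.
   New entries are prepended, so list order is newest first. */
service *index_add_service(const char *host, const char *desc)
{
    unsigned long b;
    service *s;

    if (index_find_service(host, desc) != NULL)
        return NULL;
    s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->host_name = host;
    s->description = desc;
    s->next = service_list;
    service_list = s;
    b = hash_key(host, desc);
    s->hash_next = service_index[b];
    service_index[b] = s;
    return s;
}
```

With something along these lines, a duplicate check or a lookup by
host and description would cost one hash plus a short chain walk
rather than a pass over every service already loaded. Whether that
fits the actual data structures in the daemon and CGIs is something
I have not verified.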

Thanks,
-Jason Martin