RETRY: CPU Question

Fred Albrecht Fred.Albrecht at za.tiscali.com
Mon Apr 7 15:43:59 CEST 2003


The system idles at 80% because I run other apps on the machine as well, such as RTG to collect my router interface stats and scripts for passive checks, and these run continuously.  I have used top, and the only processes that really load the system are the CGIs.  I've also placed some checks in the status.c file to time how long its different parts take to run.  What I found was:
TIME TO read_all_object_configuration_data=7.000000
TIME TO read_all_status_data=3.000000
TIME TO finish all=6.000000
TIME TO run=16.000000

These are in seconds.  So reading the object configuration data takes about 44% of the time, reading the status data about 19%, and generating the web interface about 38% (rounded, out of 16 seconds in total).
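
As a quick check of that breakdown, the shares can be recomputed from the timings above.  This is only a minimal Python sketch and not part of Nagios; the `timings` dictionary and the `shares()` helper are names made up for illustration.

```python
# Timings printed by the instrumented status.c, in seconds.
timings = {
    "read_all_object_configuration_data": 7.0,
    "read_all_status_data": 3.0,
    "finish all": 6.0,
}


def shares(timings):
    """Return each phase's share of the summed run time, as a percentage."""
    total = sum(timings.values())
    return {name: 100.0 * secs / total for name, secs in timings.items()}


if __name__ == "__main__":
    # The three phases add up to the reported "TIME TO run" of 16 seconds.
    print("total:", sum(timings.values()), "seconds")
    for name, pct in shares(timings).items():
        print(f"{name}: {pct:.2f}%")
```

Since the three phases sum to the reported total, nothing significant is happening outside them, and reading the configuration and status files together accounts for well over half of each request.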
 
Lane, what type of system do you run, and what are its specs, so that I can compare it with what I have?  Please.
 
Thanx
 
:)
fred


-----Original Message-----
From: Williams, P. Lane [mailto:Lane.Williams at jhuapl.edu]
Sent: 07 April 2003 01:52 PM
To: Fred Albrecht
Subject: RE: [Nagios-users] RETRY: CPU Question


The fact that your system normally idles at 80% may have something to do with it.  All the Linux distributions I've used have typically idled at 98.xxx% or better when not under load.  I've also done what you've done, cycling through the CGIs to test performance.  I would see a momentary spike in CPU use, but only to around 30%-40%.  At the moment I do not have as many checks as you.  If you haven't already, you may want to use 'top' to see whether you have any runaway processes or possible memory leaks.
 
Lane

-----Original Message-----
From: Fred Albrecht [mailto:Fred.Albrecht at za.tiscali.com]
Sent: Monday, April 07, 2003 7:33 AM
To: Williams, P. Lane
Subject: RE: [Nagios-users] RETRY: CPU Question


No, I am saying that no swap is being used; there's no need.  The system is configured with a gig's worth of swap, but everything manages to run in memory without swapping to disk.  Looking at the system now, there is 3 MB of swap used and 980 MB free, with 43 MB of normal memory free.  Thanx for your reply.
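
For anyone wanting to check the same thing on their own box, swap usage can be read from /proc/meminfo.  The following is a rough Python sketch, assuming the usual `SwapTotal:` and `SwapFree:` lines in kB; the `swap_usage_kb()` helper is a made-up name, and the sample text is illustrative only, not output from my machine.

```python
def swap_usage_kb(meminfo_text):
    """Return (total, free, used) swap in kB from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only "Name:   12345 kB" style lines; skip headers and blanks.
        if sep and parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    total = fields["SwapTotal"]
    free = fields["SwapFree"]
    return total, free, total - free


# Illustrative sample only (hypothetical values, not real output).
# On Linux the real text would come from open("/proc/meminfo").read().
SAMPLE = """\
MemTotal:       514000 kB
MemFree:         44032 kB
SwapTotal:     1006592 kB
SwapFree:      1003520 kB
"""

if __name__ == "__main__":
    total, free, used = swap_usage_kb(SAMPLE)
    print(f"swap total={total} kB, free={free} kB, used={used} kB")
```

A small, steady "used" figure like this means pages were swapped out at some point; it does not by itself show that the machine is actively swapping.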

-----Original Message-----
From: Williams, P. Lane [mailto:Lane.Williams at jhuapl.edu]
Sent: 07 April 2003 01:15 PM
To: Fred Albrecht
Subject: RE: [Nagios-users] RETRY: CPU Question


Are you saying you have no "swap" file?  
 
Lane

-----Original Message-----
From: Fred Albrecht [mailto:Fred.Albrecht at za.tiscali.com]
Sent: Monday, April 07, 2003 4:04 AM
To: nagios-users at lists.sourceforge.net
Subject: [Nagios-users] RETRY: CPU Question



Hi 

Not having received a reply on my previous question, I'll try again. :)  (Please tell me where I can ask this question, if this is the wrong place to ask.)

My CGIs take about 30 seconds from clicking on their links to displaying something on my screen.  I'm running a P4 with 512 MB of RAM on Red Hat 7.2 (uname shows Linux 2.4.20).  The system idles at 80% CPU free most of the time, until I hit a CGI, which drops the idle figure to 0% until the CGI finishes (as mentioned earlier, 25-30 seconds later), after which the system goes back to 80% idle.  No swap is being used.

I've done the following optimizations: 

Placed my critical files on ramdisk.  They are: 

-rwxr-xr-x    1 nagios   nagios        755 Apr  4 15:43 contactgroups.cfg 
-rwxr-xr-x    1 nagios   nagios       2822 Apr  4 15:43 contacts.cfg 
-rwxr-xr-x    1 nagios   nagios      14999 Apr  7 09:43 hostextinfo.cfg 
-rwxr-xr-x    1 nagios   nagios       1565 Apr  4 15:43 hostgroups.cfg 
-rwxr-xr-x    1 nagios   nagios      26585 Apr  4 15:43 hosts.cfg 
-rwxr-xr-x    1 nagios   nagios        536 Apr  4 15:43 hosts-uses.cfg 
drwxr-xr-x    2 nagios   nagios      12288 Apr  3 16:23 lost+found 
-rwxr-xr-x    1 nagios   nagios       3092 Apr  4 15:43 misccommands.cfg 
-rwxr-xr-x    1 nagios   nagios    1987817 Apr  4 15:43 serviceextinfo.cfg 
-rwxr-xr-x    1 nagios   nagios    1696675 Apr  4 15:43 services.cfg 
-rwxr-xr-x    1 nagios   nagios       3941 Apr  4 15:43 services-uses.cfg 
-rw-r--r--    1 nagios   nagiocmd   759981 Apr  7 09:50 status.log 
-rw-rw-r--    1 nagios   nagios     209360 Apr  7 09:43 status.sav 
-rwxr-xr-x    1 nagios   nagios       1112 Apr  4 15:43 timeperiods.cfg 
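
To put those sizes in perspective, the listing can be totalled up.  This is a small Python sketch over the `ls -l` output above; the `cfg_sizes()` helper is a made-up name, and it is only an assumption that the CGIs parse all of these .cfg files on every request.

```python
LISTING = """\
-rwxr-xr-x    1 nagios   nagios        755 Apr  4 15:43 contactgroups.cfg
-rwxr-xr-x    1 nagios   nagios       2822 Apr  4 15:43 contacts.cfg
-rwxr-xr-x    1 nagios   nagios      14999 Apr  7 09:43 hostextinfo.cfg
-rwxr-xr-x    1 nagios   nagios       1565 Apr  4 15:43 hostgroups.cfg
-rwxr-xr-x    1 nagios   nagios      26585 Apr  4 15:43 hosts.cfg
-rwxr-xr-x    1 nagios   nagios        536 Apr  4 15:43 hosts-uses.cfg
drwxr-xr-x    2 nagios   nagios      12288 Apr  3 16:23 lost+found
-rwxr-xr-x    1 nagios   nagios       3092 Apr  4 15:43 misccommands.cfg
-rwxr-xr-x    1 nagios   nagios    1987817 Apr  4 15:43 serviceextinfo.cfg
-rwxr-xr-x    1 nagios   nagios    1696675 Apr  4 15:43 services.cfg
-rwxr-xr-x    1 nagios   nagios       3941 Apr  4 15:43 services-uses.cfg
-rw-r--r--    1 nagios   nagiocmd   759981 Apr  7 09:50 status.log
-rw-rw-r--    1 nagios   nagios     209360 Apr  7 09:43 status.sav
-rwxr-xr-x    1 nagios   nagios       1112 Apr  4 15:43 timeperiods.cfg
"""


def cfg_sizes(listing):
    """Map each *.cfg file name in `ls -l` output to its size in bytes."""
    sizes = {}
    for line in listing.splitlines():
        fields = line.split()
        # ls -l columns: mode, links, owner, group, size, month, day, time, name
        if len(fields) == 9 and fields[8].endswith(".cfg"):
            sizes[fields[8]] = int(fields[4])
    return sizes


if __name__ == "__main__":
    sizes = cfg_sizes(LISTING)
    total = sum(sizes.values())
    big_two = sizes["services.cfg"] + sizes["serviceextinfo.cfg"]
    print("config bytes in total:", total)
    print("services.cfg + serviceextinfo.cfg:", big_two)
```

The two service files make up nearly all of the configuration text, so they are the obvious place to look if parsing time is the problem.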

retention_update_interval=15 

aggregate_status_updates=15 

My nagios stats are as follows: 

Check Execution Time: 0 / 7 / 0.052 sec 
Check Latency: 0 / 14 / 0.605 sec 
# Active Checks: 3404 
# Passive Checks: 334 
I've done everything that I could implement in the "Tuning Nagios For Maximum Performance" section. 
At one stage I even NFS-mounted the nagios directory on another machine, from which I let my clients access the CGIs.  Sharing the CPU load this way worked fine, meaning that whenever the web interface becomes too slow, I can just add another server to my Nagios farm.  The only drawback is that the clients can't write to the nagios.cmd file across the NFS mount.  It would have been a nice feature if it did work.  Which raises the next question: Nagios is a distributed NMS, so how about making it a distributed client interface system, if you follow what I mean?  How can I get this done?

Is there anything else I can do to improve the response time of the CGIs?  Is this a hardware or software issue? 

Any suggestions will be highly appreciated. 

Thanx 

fred 
