high load on nagios server due to status.cgi

Marc Powell marc at ena.com
Thu Aug 20 17:43:57 CEST 2009


Please always respond on-list. More below.

On Aug 20, 2009, at 9:11 AM, rakesh kakde wrote:

> Hello Marc,
>
> Please find my answer to your question.
>
>
> What specific status.cgi view is causing this?
>
> Whenever we look up a particular server that is monitored by
> Nagios in the Nagios portal, the information about all of its
> services takes a long time to render (not 2 minutes; it is
> exactly 40 to 45 seconds).
> Each such query spawns one status.cgi process, which drives up
> the CPU load.
> There is no problem with the I/O devices; %iowait is always normal (zero %).
>
> For the service comment and acknowledgement:
>
> We have the Remedy ticketing system integrated with mastercell and Nagios.
> Whenever there is a problem with any monitored service or host,
> Nagios fires a mail alert.
> This alert is sent to mastercell and a ticket is created in the
> Remedy ticketing tool.
> Once the ticket is created in Remedy, mastercell acknowledges the
> problem and sets the service or host comment and acknowledgement
> in the status.dat file with that ticket number.

You should configure mastercell to _not_ set the 'Persistent Comment'  
flag in the acknowledgment. Note that the meaning of this flag changed  
with 3.x.
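
As a rough illustration of what that looks like on the wire, here is a
minimal Python sketch that builds an ACKNOWLEDGE_SVC_PROBLEM external
command with the persistent field set to 0. The helper name, the host,
service and author values, and the ticket number format are made up for
the example. I don't know how mastercell actually submits its
acknowledgements, and the meaning of the sticky value should be checked
against the external command documentation for your Nagios version.

```python
import time


def build_service_ack(host, service, author, comment,
                      sticky=2, notify=True, persistent=False, now=None):
    """Return one ACKNOWLEDGE_SVC_PROBLEM line for the Nagios command file.

    Field order: host;service;sticky;notify;persistent;author;comment.
    This helper is hypothetical; only the command name and field order
    come from the Nagios external command interface.
    """
    timestamp = int(time.time()) if now is None else int(now)
    # Fields are separated by ';' and the command ends at the newline, so
    # this sketch conservatively rejects both characters in every field.
    for value in (host, service, author, comment):
        if ";" in value or "\n" in value:
            raise ValueError("field contains ';' or a newline: %r" % value)
    return "[%d] ACKNOWLEDGE_SVC_PROBLEM;%s;%s;%d;%d;%d;%s;%s\n" % (
        timestamp, host, service,
        int(sticky),        # passed through as-is; check your version's docs
        int(notify),        # 1 = send an acknowledgement notification
        int(persistent),    # 0 = leave the 'Persistent Comment' flag unset
        author, comment,
    )


if __name__ == "__main__":
    # Example values only. The resulting line would be appended to the
    # Nagios command file (the command_file setting in nagios.cfg).
    print(build_service_ack("web01", "HTTP", "mastercell",
                            "Remedy ticket INC000123"), end="")
```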

>
> After the problem is resolved, Nagios sends the recovery alert,
> but mastercell has no mechanism to check the status of the
> Remedy ticket and remove the comments and acknowledgement
> from status.dat.

If the service recovers, nagios will automatically remove the  
acknowledgement and the comment (if Persistent Comment is not set).  
This may or may not fit into your business processes, but it's there.

>
> Because of this, the number of comments and acknowledgements
> associated with hosts and services in the status.dat file keeps
> increasing, so the size of status.dat keeps growing, which in
> turn causes the rendering issue.

I will be moving to an automatic system similar to yours. We use Remedy
as well, but with manual acknowledgements. The process was essentially
the same. Under 2.x I had a script that would look for ticket numbers
in comments, get the status of each ticket and, if it was closed, remove
the comment. I don't expect to need that under 3.x any longer, but I
still have it running because it does a few other things I still need,
such as checking whether a ticket is closed while the service is still
not OK and, if so, removing the acknowledgement so the problem shows up
again for our NOC.
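
This is not my actual script, but a cleanup along those lines could be
sketched in Python roughly as follows. It assumes a 3.x-style status.dat
with "servicestatus {" and "servicecomment {" blocks of key=value lines,
a made-up ticket number format (INC followed by digits), and a
ticket_is_closed callback that you would have to write against your own
Remedy installation. It only builds DEL_SVC_COMMENT and
REMOVE_SVC_ACKNOWLEDGEMENT external commands; writing them to the Nagios
command file is left out.

```python
import re
import time

BLOCK_RE = re.compile(r"^(\w+)\s*\{\s*$")
TICKET_RE = re.compile(r"\b(INC\d+)\b")  # hypothetical ticket number format


def parse_blocks(text):
    """Yield (block_type, fields) for each 'type { key=value ... }' block."""
    block_type, fields = None, {}
    for raw in text.splitlines():
        line = raw.strip()
        if block_type is None:
            match = BLOCK_RE.match(line)
            if match:
                block_type, fields = match.group(1), {}
        elif line == "}":
            yield block_type, fields
            block_type = None
        elif "=" in line:
            key, _, value = line.partition("=")
            fields[key] = value


def cleanup_commands(status_text, ticket_is_closed, now=None):
    """Build external commands for comments whose ticket is closed.

    ticket_is_closed is a caller-supplied function (an assumption of this
    sketch) that takes a ticket number and returns True or False.
    """
    ts = int(time.time()) if now is None else int(now)
    acked_problems = set()  # (host, service) pairs: not OK and acknowledged
    comments = []
    for block_type, f in parse_blocks(status_text):
        key = (f.get("host_name"), f.get("service_description"))
        if block_type == "servicestatus":
            if (f.get("current_state") != "0"
                    and f.get("problem_has_been_acknowledged") == "1"):
                acked_problems.add(key)
        elif block_type == "servicecomment":
            comments.append((key, f))

    commands = []
    for key, f in comments:
        match = TICKET_RE.search(f.get("comment_data", ""))
        if not match or "comment_id" not in f:
            continue
        if not ticket_is_closed(match.group(1)):
            continue
        # Ticket is closed: drop the comment that references it.
        commands.append("[%d] DEL_SVC_COMMENT;%s\n" % (ts, f["comment_id"]))
        # Ticket closed but service still not OK: clear the acknowledgement
        # so the problem becomes visible again.
        if key in acked_problems:
            commands.append("[%d] REMOVE_SVC_ACKNOWLEDGEMENT;%s;%s\n"
                            % (ts, key[0], key[1]))
            acked_problems.discard(key)  # emit at most once per service
    return commands
```

Host comments and acknowledgements would need the same treatment with
the corresponding host block types and the DEL_HOST_COMMENT and
REMOVE_HOST_ACKNOWLEDGEMENT commands.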

--
Marc






