nagios threads and process model

Andreas Ericsson ae at op5.se
Mon Aug 27 14:46:04 CEST 2007


Yishai Hadas wrote:
> Hi List,
> 
> Where can I find some information about the Nagios threads and process
> model:
> 
> * How many threads are running?
> * What are their main tasks?
> * Are there threads that start, do some work, and then terminate?
> * What about the process model (parent/child, fork)? When is fork
> called, and how does it relate to the thread model?
> 

Primarily inside the heads of a very few nagios hackers.

Basically, nagios runs 2 threads in addition to the main one:
The command_worker_thread continuously reads the unbuffered FIFO through
which a user or a program can send commands to nagios.

The service_result_worker_thread reaps service check results.

The "main" thread is responsible for scheduling and launching checks. It
also takes care of whatever action is needed upon a state change (such
as sending notifications, executing event-handlers, updating logfiles, etc).
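
The division of labour described above can be sketched roughly as follows.
This is a toy model in Python, not nagios code: the function names, the
"CHECK <host>" command format and the use of os.pipe() in place of the
external command FIFO are all assumptions made for illustration, and the
real main thread forks child processes to run checks rather than producing
results inline.

```python
import os
import queue
import threading


def command_worker(read_fd, commands):
    """Read newline-terminated commands from a pipe and queue them."""
    with os.fdopen(read_fd, "r") as pipe:
        for line in pipe:              # loop ends when the write end is closed
            line = line.strip()
            if line:
                commands.put(line)
    commands.put(None)                 # sentinel: no more commands


def result_worker(results, reaped):
    """Collect finished check results until the sentinel arrives."""
    while True:
        item = results.get()
        if item is None:
            break
        reaped.append(item)


def run_demo():
    commands = queue.Queue()
    results = queue.Queue()
    reaped = []

    read_fd, write_fd = os.pipe()      # stands in for the command FIFO
    cmd_thread = threading.Thread(target=command_worker,
                                  args=(read_fd, commands))
    res_thread = threading.Thread(target=result_worker,
                                  args=(results, reaped))
    cmd_thread.start()
    res_thread.start()

    # An external program writing commands (hypothetical format).
    with os.fdopen(write_fd, "w") as writer:
        writer.write("CHECK web01\nCHECK db01\n")

    # "Main" thread: take commands, do the work, hand results to the reaper.
    while True:
        cmd = commands.get()
        if cmd is None:
            break
        host = cmd.split()[1]
        results.put((host, "OK"))      # a real check would run a plugin here
    results.put(None)

    cmd_thread.join()
    res_thread.join()
    return reaped
```

Calling run_demo() returns the reaped results in the order the commands
were written. The point of the layout is that neither worker ever blocks
the main loop: each one only moves items onto or off a queue.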

> Is there any documentation about those issues?
> 

Not really, no. Unfortunately, the code handling the threads is sparingly
commented as to what the particular threads do, but you can find the main()
function of each thread in base/utils.c of a [23].x nagios work tree.

> Can someone give here some light?
> 

Hopefully I just did. I assume you want this information in order to hack
on nagios, in which case your C-skills are hopefully sufficient to grok
the rest of it by looking at the source.

> This information can help understanding what is reasonable to do and
> what is not.
> 

For a given value of "reasonable", practically anything is ;-)

> For example, assuming that process_performance_data is enabled,
> service_perfdata_command is called periodically.
> 
> Nagios uses perfdata_timeout, which is 5 seconds by default, to wait for
> the command; if it hasn't finished by then, it is terminated.
> 
> If the command has work to do that might take much longer, e.g.
> 1 minute, is it reasonable to do it in this command and let Nagios
> wait 1 minute for it?
> 

I'd highly recommend that you try to keep your check execution times to
an absolute minimum. Nagios was designed to run small programs that
perform a limited task quickly, and then to react to the outcome of that
small program. If you put nagios to use for a different purpose, it will
ultimately perform worse.
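
To make the timeout behaviour from the question concrete, here is a minimal
sketch of running an external command with a deadline and killing it when
the deadline passes. It only illustrates the general idea and is not how
nagios implements perfdata_timeout; run_with_timeout() and demo() are
hypothetical names, and the child commands are stand-ins for a real
perfdata command.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_seconds):
    """Run argv; return (finished, returncode). The child is killed on timeout."""
    try:
        completed = subprocess.run(argv, timeout=timeout_seconds)
        return True, completed.returncode
    except subprocess.TimeoutExpired:
        # subprocess.run() has already killed and waited for the child here.
        return False, None


def demo():
    # Stand-in commands: one returns immediately, one sleeps far too long.
    quick = [sys.executable, "-c", "pass"]
    slow = [sys.executable, "-c", "import time; time.sleep(30)"]
    return run_with_timeout(quick, 5.0), run_with_timeout(slow, 1.0)
```

In demo() the quick command finishes well inside its limit, while the slow
one is terminated after one second and its work is lost, which is what
would happen to a one-minute perfdata command under a 5 second timeout.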

> Which other Nagios tasks will be delayed?
> 

That would depend on how nagios lays out your scheduling queue. If it's
a host-check that's expected to take several minutes, nagios 2.x and
earlier will stall absolutely everything while waiting for the check to
finish. If it's a service-check, only the other services that happened
to end up in the same batch will be delayed, and then only with respect
to reporting their results back to the main thread.
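
The difference between a check that blocks everything and checks that run
side by side can be shown with a small arithmetic model. This is a
deliberately simplified sketch with made-up durations; it ignores batching
and result reaping and says nothing about the actual nagios scheduler.

```python
def finish_times_serial(durations):
    """Each job starts only after the previous one has finished."""
    clock = 0.0
    finished = []
    for d in durations:
        clock += d
        finished.append(clock)
    return finished


def finish_times_parallel(durations):
    """All jobs start at time zero and run independently."""
    return [float(d) for d in durations]


# Hypothetical numbers: one 180 s check followed by three 2 s checks.
durations = [180, 2, 2, 2]
serial = finish_times_serial(durations)      # [180.0, 182.0, 184.0, 186.0]
parallel = finish_times_parallel(durations)  # [180.0, 2.0, 2.0, 2.0]
```

In the serial case every short check waits behind the long one, which
mirrors the blocking host-check behaviour; in the parallel case the short
checks finish on time no matter how long the slow one takes.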

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users