Question about NRPE operation

Andreas Ericsson ae at op5.se
Fri Sep 9 10:41:13 CEST 2005


schönfeld / in-medias-res wrote:
> Hi,
> 
> I'm having some problems checking an ncpfs filesystem and
> have a suspicion about the cause, so I have a question about how
> NRPE operates.
> 
> Ok, here we go:
> Nagios initiates a check of a particular service on Host X,
> so it sends a request to the NRPE daemon on Host X.
> 
> Host X checks the request, starts the plugin that performs the
> requested service check, and switches into a waiting state => waiting
> for the plugin's answer.
> 
> Now imagine that the ncpfs is busy because of another "heavy operation"
> on it. So the plugin runs and runs, but has to wait for the filesystem
> and does not return a result to NRPE in the meantime.
> 

The plugin itself is supposed to exit gracefully after a specified
maximum amount of time. This is generally achieved by installing a
signal handler for SIGALRM and making an alarm(2) call. The signal
handler should make sure all locks and resources are released (the
kernel will release them otherwise, but relying on that is considered
terribly bad form).
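
As a rough illustration of the SIGALRM/alarm(2) pattern described above,
here is a minimal Python sketch. It is not taken from any real plugin:
run_check(), PluginTimeout and the 10-second default are hypothetical
names and values chosen for the example. Exit code 3 is the conventional
Nagios UNKNOWN state. Note that a process blocked in uninterruptible I/O
on a network filesystem may not see the signal until the kernel call
returns, so this only helps where the wait is interruptible.

```python
import signal

STATE_UNKNOWN = 3        # conventional Nagios exit code for UNKNOWN
TIMEOUT_SECONDS = 10     # assumed default; real plugins usually take an option


class PluginTimeout(Exception):
    """Raised from the SIGALRM handler when the check runs too long."""


def on_alarm(signum, frame):
    # Runs in the main thread when the alarm fires; unwind the check.
    raise PluginTimeout()


def run_check(check, timeout=TIMEOUT_SECONDS):
    """Run check() under an alarm and return (exit_code, message).

    check is any callable returning (exit_code, message). timeout must
    be a whole number of seconds, because signal.alarm() takes an int.
    """
    previous = signal.signal(signal.SIGALRM, on_alarm)
    signal.alarm(timeout)
    try:
        return check()
    except PluginTimeout:
        return STATE_UNKNOWN, "UNKNOWN - check timed out after %s seconds" % timeout
    finally:
        # Cancel any pending alarm and restore the old handler, so
        # resources are released on both the normal and timeout paths.
        signal.alarm(0)
        signal.signal(signal.SIGALRM, previous)
```

A plugin would typically print the returned message and pass the
returned code to sys.exit().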

> So now the question is: Does the NRPE server have a timeout after which
> it'll *kill* the plugin?

Yes, naturally. Otherwise it could risk filling up the process-table, or 
plugins with infinite loops could bring the entire system down.
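
The daemon-side timeout mentioned above can be sketched like this. This
is not NRPE's actual code (NRPE is written in C). It is a hypothetical
run_plugin() helper showing the general idea: wait for the child up to
a limit, then kill it and reap it so it does not linger in the process
table. In nrpe.cfg the limit is, as far as I know, the command_timeout
setting.

```python
import subprocess

STATE_UNKNOWN = 3  # conventional Nagios exit code for UNKNOWN


def run_plugin(argv, timeout):
    """Run a plugin command; kill it if it exceeds timeout seconds.

    Returns (exit_code, first_output_text). The name and return shape
    are assumptions made for this example only.
    """
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    try:
        out, _ = proc.communicate(timeout=timeout)
        return proc.returncode, out.decode(errors="replace").strip()
    except subprocess.TimeoutExpired:
        proc.kill()         # SIGKILL on POSIX systems
        proc.communicate()  # reap the child so no zombie is left behind
        return STATE_UNKNOWN, "UNKNOWN - plugin timed out after %s seconds" % timeout
```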

> If so: Linux ncpfs is not capable of threading
> ncpfs operations. So if one process is accessing the ncpfs and gets
> a SIGKILL, the ncp connection becomes invalid and the source of my
> problem would be identified.
> 

This really can't be. Any locks and resources held by a terminated 
process should be cleared by the kernel (if not by the process itself). 
If they aren't, you've found a kernel bug.
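
The claim that the kernel releases a killed process's locks can be
checked on an ordinary local filesystem with a small experiment. The
sketch below is only that: it uses flock(2) on a temporary file, the
helper names are hypothetical, and it says nothing about how ncpfs or
an NCP connection behaves. It assumes a POSIX system where flock() is
available.

```python
import fcntl
import os
import subprocess
import sys
import tempfile

# Child program: take an exclusive lock, announce it, then hang.
CHILD = (
    "import fcntl, sys, time\n"
    "f = open(sys.argv[1], 'w')\n"
    "fcntl.flock(f, fcntl.LOCK_EX)\n"
    "print('locked', flush=True)\n"
    "time.sleep(60)\n"
)


def try_lock(path):
    """Return True if an exclusive flock on path can be taken right now."""
    with open(path, "w") as f:
        try:
            fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return False
        return True  # closing f on exit drops our own lock again


def lock_state_around_sigkill():
    """Return (held_before_kill, held_after_kill) for the child's lock."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    child = subprocess.Popen([sys.executable, "-c", CHILD, path],
                             stdout=subprocess.PIPE)
    try:
        child.stdout.readline()          # wait until the child holds the lock
        held_before = not try_lock(path)
        child.kill()                     # SIGKILL: the child cannot clean up
        child.wait()
        held_after = not try_lock(path)
        return held_before, held_after
    finally:
        if child.poll() is None:
            child.kill()
            child.wait()
        child.stdout.close()
        os.unlink(path)
```

If the kernel cleans up as described, the lock is held before the kill
and free afterwards.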

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Lead Developer


_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users




