Core 4 Remote Workers

Eric Stanley estanley at nagios.com
Sat Feb 2 15:12:50 CET 2013


All,

I've been giving some thought to remote workers for core 4 and wanted to 
run those thoughts by this list. I see remote workers as a very useful 
extension to the worker concept in core 4.

To implement remote workers, I think four basic things would need to 
be done:
1. Implement the ability to listen to multiple query handler interfaces 
(precursor to #2)
2. Implement the ability to create and listen on TCP socket query 
handler interfaces.
3. Add a host key to the worker registration to allow a worker to 
specify the host(s) for which it will handle checks.
4. Write a stand-alone remote worker that can connect to the core 
instance via TCP.

I have kept steps 1 and 2 separate rather than combining them for two 
reasons. First, a generalized solution is more extensible. Second, I 
think multiple TCP listeners are a reasonable use case on a multi-homed 
system where you may not want to listen on all interfaces.
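As a rough illustration of steps 1 and 2, here is a minimal Python 
sketch (not code from core) of one process listening on several TCP 
sockets, each bound to a specific interface address. The function names 
open_listeners and accept_ready are hypothetical, and the real query 
handler would be implemented in C inside core.

```python
import selectors
import socket


def open_listeners(addresses):
    """Bind one listening TCP socket per (host, port) pair.

    Hypothetical helper: each pair names a single interface address,
    so the process does not have to listen on all interfaces.
    """
    sel = selectors.DefaultSelector()
    socks = []
    for host, port in addresses:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        s.bind((host, port))
        s.listen()
        s.setblocking(False)
        # One selector watches every listener for incoming connections.
        sel.register(s, selectors.EVENT_READ)
        socks.append(s)
    return sel, socks


def accept_ready(sel, timeout=1.0):
    """Accept one pending connection from each listener that is ready.

    Returns a list of (connection, listener_address) tuples.
    """
    accepted = []
    for key, _events in sel.select(timeout):
        listener = key.fileobj
        conn, _peer = listener.accept()
        accepted.append((conn, listener.getsockname()))
    return accepted
```

The listener address returned with each connection lets the caller tell 
which interface a worker connected through.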

The host key should be allowed to specify one or more IP addresses, IP 
subnets, contiguous IP address ranges, host names and host name 
patterns/wildcards (e.g. *.example.com). If multiple workers register 
for the same host, some sort of distribution mechanism should be used to 
balance the load across those workers.
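To make the host key idea concrete, here is a minimal Python sketch 
under several assumptions of mine: subnets are written in CIDR notation, 
ranges as "first-last", wildcards use shell-style globbing, and 
round-robin is the distribution mechanism. None of that syntax is 
settled, and host_matches and WorkerPool are hypothetical names.

```python
import fnmatch
import ipaddress
from collections import defaultdict


def host_matches(key, host_name, host_addr):
    """Return True if one host key entry matches the given host."""
    addr = ipaddress.ip_address(host_addr)
    # Contiguous range, written "first-last" (assumed syntax).
    first, dash, last = key.partition("-")
    if dash:
        try:
            first_addr = ipaddress.ip_address(first)
            last_addr = ipaddress.ip_address(last)
        except ValueError:
            pass  # Not a range, e.g. a host name containing a hyphen.
        else:
            same_version = first_addr.version == last_addr.version == addr.version
            return same_version and first_addr <= addr <= last_addr
    # Single address or subnet in CIDR notation (assumed syntax).
    try:
        return addr in ipaddress.ip_network(key, strict=False)
    except ValueError:
        pass
    # Otherwise treat the key as a host name or wildcard pattern.
    return fnmatch.fnmatchcase(host_name.lower(), key.lower())


class WorkerPool:
    """Round-robin distribution among workers registered for a host."""

    def __init__(self):
        self._workers = []              # (worker_id, [host keys])
        self._turn = defaultdict(int)   # per-host round-robin counter

    def register(self, worker_id, host_keys):
        self._workers.append((worker_id, list(host_keys)))

    def pick(self, host_name, host_addr):
        candidates = [
            worker_id
            for worker_id, keys in self._workers
            if any(host_matches(k, host_name, host_addr) for k in keys)
        ]
        if not candidates:
            return None
        turn = self._turn[host_name]
        self._turn[host_name] = turn + 1
        return candidates[turn % len(candidates)]
```

Round-robin is only the simplest choice here; a real implementation 
might instead weight workers by their current load.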

Using host as a second criterion for determining which worker gets a 
check raises the question of the order of precedence among the criteria. 
Initially, I think the host should have precedence over plugin, but I 
can see implementing an order-of-precedence option in the core 
configuration file. This would become more important if additional 
worker selection criteria were added.
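One possible reading of that precedence rule, sketched in Python: walk 
the criteria in the configured order and use the first one for which 
any worker matches. The function select_workers, the dictionary layout 
and the exact-match host comparison are all assumptions made for this 
illustration, not existing core behavior.

```python
def select_workers(workers, check, precedence=("host", "plugin")):
    """Return the names of the workers eligible to run a check.

    workers:    list of dicts with "name" and optional "hosts" and
                "plugins" collections (hypothetical layout).
    check:      dict with "host" and "plugin" keys.
    precedence: criteria in the order they should be consulted, as a
                configuration option might specify.
    """
    matchers = {
        "host": lambda w: check["host"] in w.get("hosts", ()),
        "plugin": lambda w: check["plugin"] in w.get("plugins", ()),
    }
    for criterion in precedence:
        matched = [w["name"] for w in workers if matchers[criterion](w)]
        if matched:
            # The first criterion with any match wins outright.
            return matched
    return []
```

With the default order, a worker registered for the host is chosen even 
when another worker is registered for the plugin; reversing the tuple 
reverses that outcome. Adding a new criterion would mean adding one 
entry to the matchers table.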

It should be possible to protect the communication between the remote 
worker and the core process with SSL. The remote worker will also need a 
mechanism to retry the connection if the network drops it.
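A minimal sketch of what the worker side might look like, using 
Python's standard ssl module and exponential backoff between attempts. 
The backoff policy, the function names and the default values are my 
assumptions; nothing here reflects an existing worker protocol.

```python
import socket
import ssl
import time


def connect_with_retry(connect, max_attempts=5, base_delay=1.0,
                       max_delay=30.0, sleep=time.sleep):
    """Call connect() until it succeeds or max_attempts is reached.

    The wait doubles after each failure, capped at max_delay. The
    final failure is re-raised to the caller.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except OSError:  # includes ssl.SSLError and connection errors
            if attempt == max_attempts:
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay)


def tls_connect(host, port, cafile=None):
    """Open a TCP connection to core and wrap it in TLS.

    The server certificate is verified against cafile, or against the
    system trust store when cafile is None.
    """
    context = ssl.create_default_context(cafile=cafile)
    raw = socket.create_connection((host, port), timeout=10)
    try:
        return context.wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise
```

A worker could then call connect_with_retry(lambda: 
tls_connect("core.example.com", 15667)), where both the host name and 
the port are placeholders, and call it again whenever a read or write 
on the connection fails.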

I realize this is a sizable change and we may not want it to happen 
before the release of 4.0. Thoughts on this are welcome.

Further down the road, I can see developing a remote worker proxy, whose 
sole job is to broker the communication between core and even more 
remote workers. This would enable a tree-shaped worker hierarchy for 
monitoring environments that are both large and dispersed geographically 
and/or topologically. This would require a re-registration process so 
the proxy workers could keep core updated with their abilities as 
leaf-node workers connect and disconnect.
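To illustrate the re-registration idea, here is a small Python sketch 
in which a proxy advertises the union of its leaf workers' host keys 
and re-registers with core only when that union changes. The ProxyWorker 
class, its method names and the send_to_core callback are all 
hypothetical; the actual registration message format is left open.

```python
class ProxyWorker:
    """Tracks leaf workers and re-registers with core on changes."""

    def __init__(self, send_to_core):
        # send_to_core: callable taking the sorted list of host keys
        # the proxy can currently serve (hypothetical interface).
        self._send = send_to_core
        self._leaves = {}               # leaf_id -> frozenset of keys
        self._advertised = frozenset()  # what core was last told

    def leaf_connected(self, leaf_id, host_keys):
        self._leaves[leaf_id] = frozenset(host_keys)
        self._reregister()

    def leaf_disconnected(self, leaf_id):
        self._leaves.pop(leaf_id, None)
        self._reregister()

    def _reregister(self):
        combined = frozenset().union(*self._leaves.values())
        # Only bother core when the proxy's abilities really changed.
        if combined != self._advertised:
            self._advertised = combined
            self._send(sorted(combined))
```

Sending only on change keeps registration traffic low when many leaf 
workers with overlapping host keys come and go.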

Thoughts?

-- 
Eric Stanley
___
Developer
Nagios Enterprises, LLC
Email:estanley at nagios.com
Web:www.nagios.com  

