Discussion of future of Nagios NT client agents and their functions (NSClient and NRPE_Win) - long

Tim Shouldice tim at mintoskatingclub.com
Thu May 15 16:36:37 CEST 2003


All:

Development of a second client for Windows (nrpe_win) has been started by Michael 
Wirtgen. 

This client will have much of the nrpe functionality that I had proposed for NSClient
3.0: specifically, use of the check_nrpe client, use of SSL, responding only to
pre-specified IP addresses, and having nrpe_win work by executing plugins on the
Windows server.

The plugins can be written in any language - Perl, Python, WSH, VB, C, Delphi, whatever.
With the increasing emphasis by Microsoft on exposing easy-to-use interfaces to system
configuration and performance data (WMI), I see a very bright future for this approach.

A plugin, however, has some limitations: it is executed, collects its data, determines
the state, and returns the state and a text string. This is good for many things (getting
CPU percentages, etc.) but not where the monitoring process needs to be ongoing.
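
To make the plugin life cycle concrete, here is a minimal sketch in Python. The return codes (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN) are the standard Nagios plugin convention; the function names, the thresholds and the wording of the status line are made up for illustration and are not taken from nrpe_win or NSClient.

```python
# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate_cpu(percent, warn=80.0, crit=90.0):
    """Map a CPU percentage to (return_code, status_line).

    The thresholds are illustrative defaults, not values from any real agent.
    """
    if percent >= crit:
        state = CRITICAL
    elif percent >= warn:
        state = WARNING
    else:
        state = OK
    return state, "CPU %s - load %.1f%%" % (STATE_NAMES[state], percent)


def run_plugin(percent):
    """Print the one-line status text and return the exit code."""
    state, line = evaluate_cpu(percent)
    print(line)
    return state

# A real plugin would measure the value itself and finish with
#     import sys; sys.exit(run_plugin(measured_percent))
# so that the exit code carries the state back to the agent.
```

Everything the plugin knows is gone once it exits, which is the limitation discussed next.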

Ongoing monitoring is needed (or easier) for some types of checks. Anything that
submits a passive check usually does so from a continually running process. A good example
would be monitoring the NT Event Log and sending a passive check when certain criteria
are met. A plugin would carry a lot of overhead to perform this type of check: it would
have to maintain a state file, read it on execution to determine the last event it
examined, open the event log, scan all events from that point to the most current one,
and apply each event to the alerting criteria. This would take more time than most active
checks are permitted (10-30 seconds) and would be a strain on Nagios, which would be
waiting for these checks to return if they are being executed against 20-50 Windows
servers at once.
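
The state-file bookkeeping described above can be sketched as follows. This is only an illustration of the overhead, not real Event Log code: the events are passed in as plain tuples, the position marker is assumed to be an ascending record number, and the JSON state file layout and all function names are hypothetical.

```python
import json
import os


def load_last_record(state_path):
    """Return the last record number examined, or 0 if there is no state file yet."""
    if not os.path.exists(state_path):
        return 0
    with open(state_path) as fh:
        return json.load(fh)["last_record"]


def save_last_record(state_path, record_number):
    """Remember how far the scan got, for the next plugin run."""
    with open(state_path, "w") as fh:
        json.dump({"last_record": record_number}, fh)


def scan_new_events(events, state_path, alert_ids):
    """Return events newer than the saved position whose event id is in alert_ids.

    `events` is an iterable of (record_number, event_id, message) tuples in
    ascending record order, standing in for the real Event Log.
    """
    last = load_last_record(state_path)
    newest = last
    matches = []
    for record_number, event_id, message in events:
        if record_number <= last:
            continue  # already examined on an earlier run
        newest = max(newest, record_number)
        if event_id in alert_ids:
            matches.append((record_number, event_id, message))
    save_last_record(state_path, newest)
    return matches
```

Every run pays for reading the state file, walking the log and writing the state back, even when nothing new has been logged.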

A second example of where ongoing monitoring is needed is monitoring through Windows
callback functions. A callback function is one that your program registers with Windows
and that Windows calls when an event has occurred. The Service Control Manager uses
callback functions to query a service for its status on startup and shutdown. This is why
wrappers for services are less than ideal. There are a number of useful things in Windows
that can be monitored through callback functions; in particular, a variety of security
events are implemented through callbacks. Some of Microsoft's applications, such as
Message Queuing, also use callbacks due to the asynchronous nature of the events.
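
The callback pattern itself looks roughly like this. The sketch is plain Python with no Windows calls: `EventSource` is a hypothetical stand-in for whatever Windows facility invokes the callback, and `Monitor` stands in for the long-running agent. The point is only that the registration lives inside the process, so the process has to stay alive to receive anything.

```python
class EventSource:
    """Hypothetical stand-in for a Windows facility that calls registered functions."""

    def __init__(self):
        self._callbacks = []

    def register(self, callback):
        """Remember a function to call whenever an event occurs."""
        self._callbacks.append(callback)

    def fire(self, event):
        """Simulate the system reporting an event to every registered callback."""
        for callback in self._callbacks:
            callback(event)


class Monitor:
    """Long-running monitor that collects events delivered through its callback."""

    def __init__(self, source):
        self.pending = []
        source.register(self.on_event)

    def on_event(self, event):
        # Called by the event source, not by our own code.
        self.pending.append(event)

    def drain(self):
        """Return and clear the events collected so far, e.g. to send as passive checks."""
        events, self.pending = self.pending, []
        return events
```

A plugin that starts, runs and exits has no opportunity to be called back, which is why this kind of check does not fit the plugin model.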

A third example is wait functions. These use a messaging-type paradigm: you register
the wait function in one part of your code, and in another part you go into a wait
loop and react when the function returns. Code of this kind needs to be threaded because
it is blocking by nature. Good things that can be monitored by wait functions include
FindFirstChangeNotification() for changes to files and directories and
FindFirstPrinterChangeNotification() for changes to printers.
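
A rough sketch of why such code ends up threaded: the wait blocks, so it runs in its own thread while the rest of the service keeps answering requests. A `queue.Queue` stands in for the Windows wait function here, and the names `watch` and `start_watcher` are invented for the example.

```python
import queue
import threading


def watch(notifications, handle, stop_token=None):
    """Block until a notification arrives, handle it, and repeat until stop_token."""
    while True:
        item = notifications.get()  # blocks, much like a wait function would
        if item is stop_token:
            break
        handle(item)


def start_watcher(notifications, handle):
    """Run the blocking wait loop in its own thread and return that thread."""
    thread = threading.Thread(target=watch, args=(notifications, handle), daemon=True)
    thread.start()
    return thread
```

The main thread stays free to serve active check requests while the watcher thread sits in its wait loop; putting `None` on the queue ends the loop.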

I see NSClient evolving to handle these types of events. NSClient would also continue to
provide its existing functionality for backwards compatibility.

I see it working as follows:

NSClient is loaded as a service, monitoring only CPU (as currently).

check_nt requests active check data and gets the results returned immediately.

A mechanism then needs to be created to request that NSClient go into event-based
monitoring mode for things such as Event Log events, directory changes, printer changes,
security changes, etc. I haven't decided on the best mechanism to submit these requests.
The results would be sent to Nagios as passive check results. There could be a second Unix
client that maintains a config file of NT clients and checks. This could be executed as
part of the Nagios startup process, which at least would keep configuration centralized.
An alternative would be a Windows client that populates a config file that NSClient reads
on startup.
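
For reference, a passive service check result reaches Nagios as a PROCESS_SERVICE_CHECK_RESULT external command line. The sketch below only formats that line; the host name, service description and message are invented, and how the line would travel from the Windows server to the Nagios host is deliberately left open, since that mechanism is undecided.

```python
import time


def format_passive_result(host, service, return_code, output, timestamp=None):
    """Build a Nagios external command line for a passive service check result."""
    if timestamp is None:
        timestamp = int(time.time())
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s" % (
        timestamp, host, service, return_code, output)


# Example with made-up values:
#   format_passive_result("ntserver1", "EventLog", 2, "Event 7000 logged")
```

The host and service names must match a host and a service with passive checks enabled in the Nagios configuration, which is one argument for keeping the list of NT clients and checks on the Nagios side.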

Some may wonder - why two services instead of one? First, with two developers, maintaining
the code requires CVS and release coordination. Second, NSClient is currently in Delphi,
and short of someone else picking it up and rewriting it in C, it will continue to be in
Delphi, whereas nrpe_win will be in C/C++. Also, there is much to be said for modularity.
BMC Patrol is an industrial-strength monitoring tool for Windows and it implements its
monitoring through 3 NT services; Oracle databases are typically implemented as 3-5
services. So I don't see a big issue with Nagios's Windows client being implemented as
two services.

Thoughts, comments, suggestions?

Tim Shouldice
NSClient Support - http://support.tsmgsoftware.com

This thread is also in the enhancements section of the above forum.

