Hundreds of passive checks in a second

Sean Dilda agrajag at dragaera.net
Fri Sep 3 23:00:58 CEST 2004


I have a plugin that I'm running for a few hundred hosts that checks
what SGE (Sun GridEngine) thinks the status of the node is.  The thing
is, these checks don't hit the actual nodes.  Instead they all interact
with the main SGE master daemon.  This means that all these checks can
quickly run up the load if they're run close enough together.

In order to keep from running the load up I was thinking about setting
up a cron job that would do a single query against the SGE master daemon
for all the hosts (only slightly more overhead than querying about a
single host) and submitting all the results to Nagios as passive checks.
The problem is that Nagios uses a named pipe for the command file, and
the pipe buffer on Linux is only 4k.  So I can't write all those checks at
once as it would overrun the 4k buffer.
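
To make the cron-job idea concrete, here is a rough sketch of what the submitting side might look like.  It formats each result as a PROCESS_SERVICE_CHECK_RESULT external command and groups the lines into chunks of at most 4096 bytes, so that no single write is larger than the pipe buffer.  The host names, the "sge_status" service description and the command file path are placeholders I made up, and the 4096-byte limit is an assumption about the local system.  With a blocking open, each write simply waits until Nagios has drained the pipe, so whether this is fast enough still depends on how often Nagios reads the command file.

```python
import time

PIPE_BUF = 4096  # assumed atomic-write limit for pipes on this system


def format_result(host, service, code, output, now=None):
    """Build one passive service check line in external command format."""
    ts = int(time.time() if now is None else now)
    return f"[{ts}] PROCESS_SERVICE_CHECK_RESULT;{host};{service};{code};{output}\n"


def batch_lines(lines, limit=PIPE_BUF):
    """Yield strings of whole lines whose encoded size never exceeds limit."""
    chunk, size = [], 0
    for line in lines:
        n = len(line.encode())
        if n > limit:
            raise ValueError("single command line larger than pipe buffer")
        if chunk and size + n > limit:
            yield "".join(chunk)
            chunk, size = [], 0
        chunk.append(line)
        size += n
    if chunk:
        yield "".join(chunk)


def submit(results, command_file):
    """Write all results to the command file, one chunk per write.

    results is an iterable of (host, service, code, output) tuples.
    Opening a FIFO for writing blocks until a reader has it open.
    """
    lines = [format_result(*r) for r in results]
    with open(command_file, "wb", buffering=0) as fifo:
        for chunk in batch_lines(lines):
            fifo.write(chunk.encode())


if __name__ == "__main__":
    # Hypothetical data standing in for the single query to the SGE master.
    fake = [(f"node{i:03d}", "sge_status", 0, "OK") for i in range(300)]
    for chunk in batch_lines(format_result(*r) for r in fake):
        print(len(chunk.encode()), "bytes in chunk")
```

Calling submit(fake, "/path/to/nagios.cmd") with the real command file path would push the results to Nagios; the demo above only prints the chunk sizes so it can be run without a pipe.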

I looked and there's an option for nagios to check the file as often as
possible, but I read the code and found out that's a lie.  I would think
that using select(2) (as opposed to sleep(3)) would really allow nagios
to check as often as possible.
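
To illustrate the difference I mean, here is a small Python sketch of a reader that waits on the pipe with select instead of sleeping for a fixed interval.  This is not the Nagios code (which is C) and the function name is made up; it only shows the pattern: the call returns as soon as data arrives, and the timeout is just an upper bound on the wait.

```python
import os
import select


def drain(fd, timeout, pending=b""):
    """Wait up to timeout seconds for data on fd, then return whole lines.

    Returns (lines, pending), where pending holds any trailing partial
    line to be passed back in on the next call.
    """
    ready, _, _ = select.select([fd], [], [], timeout)
    if ready:
        pending += os.read(fd, 4096)
    # Everything before the last newline is complete; the rest is kept.
    *lines, pending = pending.split(b"\n")
    return [l.decode() for l in lines], pending


if __name__ == "__main__":
    # An anonymous pipe stands in for the command file here.
    r, w = os.pipe()
    os.write(w, b"first command\nsecond command\n")
    lines, rest = drain(r, timeout=1.0)
    print(lines)
```

With a sleep-based loop the reader picks up new commands only once per interval; with select it wakes up whenever a writer has put something in the pipe.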

Does anyone have any ideas of how to work around this?  Or has anyone
already tried replacing that sleep() call with a select() call?

Thanks,


Sean



_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users