ocsp command design and FIFO locking?

Fred f1216 at yahoo.com
Sun Sep 11 16:11:44 CEST 2005


Does anyone have an idea why the ocsp command (for distributed monitoring)
would kick off more than one command at a time?  For example, if a number of
checks complete at about the same time, nagios kicks off multiple ocsp
scripts (submit commands).

This forces the submit command to throttle access to whatever resources it
needs to touch.  With the default send_nsca command, there can now be many
send_nsca processes kicked off at once, and each of these ends up trying to
write to the nagios FIFO on the target server.  The nagios FIFO can get
horribly overloaded.  If the nagios master daemon is not aggressively reading
the FIFO (command_check_interval=-1), then the daemons can stack up and
eventually consume socket resources, memory, etc.  As far as I can tell, nsca
doesn't lock the FIFO, which also means that its writes get intermixed with
writes from plug-ins that might be running on the master system.  (I have
seen this over and over.)

To avoid this, I have had to implement serious locking in all plug-ins and
not use nsca as it has no locking mechanism (that I know of).
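
For anyone wondering what that locking amounts to, here is a minimal sketch
in Python and not my actual plug-in code.  It assumes every writer agrees on
one separate lock file and takes flock() on it before touching the FIFO.  The
function names and the lock file are made up for illustration; the line
format is the standard PROCESS_SERVICE_CHECK_RESULT external command.

```python
import fcntl
import os
import time


def format_check_result(host, service, return_code, output, now=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line."""
    timestamp = int(time.time() if now is None else now)
    # A newline inside the plugin output would split the command, so flatten it.
    text = output.replace("\n", " ")
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        timestamp, host, service, return_code, text)


def submit_locked(command_file, lock_file, line):
    """Write one command line while holding an exclusive advisory lock.

    lock_file is a hypothetical path that all cooperating writers share.
    """
    with open(lock_file, "a") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)  # blocks until other writers finish
        try:
            fd = os.open(command_file, os.O_WRONLY | os.O_APPEND)
            try:
                os.write(fd, line.encode("utf-8"))  # one write() per result
            finally:
                os.close(fd)
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
```

The lock is advisory only, so it helps only if every process that writes to
the FIFO takes the same lock.  A writer that skips it (nsca, as far as I can
tell) can still interleave its output.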

Right now I am fighting with the ocsp commands, which can launch dozens of
copies at a time.  Each of these (in my case) writes to a local file that
will eventually be pushed up to the master and written (under a lock) to the
nagios FIFO.
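
A rough sketch of that local spool, again in Python and only an illustration
under assumptions: each submit command appends under an exclusive flock() on
the spool file itself, and a separate job drains it under the same lock
before pushing the lines to the master.  The names spool_append and
spool_drain are hypothetical.

```python
import fcntl


def spool_append(spool_file, line):
    """Append one result line to the local spool under an exclusive lock."""
    with open(spool_file, "a") as spool:
        fcntl.flock(spool, fcntl.LOCK_EX)
        try:
            spool.write(line)
            spool.flush()  # make sure the data is written before unlocking
        finally:
            fcntl.flock(spool, fcntl.LOCK_UN)


def spool_drain(spool_file):
    """Return all spooled lines and empty the spool, under the same lock."""
    with open(spool_file, "a+") as spool:
        fcntl.flock(spool, fcntl.LOCK_EX)
        try:
            spool.seek(0)
            lines = spool.readlines()
            # Truncate in place (no rename) so writers keep locking this file.
            spool.truncate(0)
        finally:
            fcntl.flock(spool, fcntl.LOCK_UN)
    return lines
```

Dozens of simultaneous submit commands then just queue up on the lock for a
moment each, and the pusher never sees a half-written line.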

So ... I guess my questions are:

1) Should nagios be forking off more than one ocsp command at a time?
2) Has anyone else run into FIFO corruption because of the lack of advisory
   locking in all the plug-ins? 

Thanks in advance for any thoughts or observations here.
-FredC
