monitor number of open files on linux

Parkman, Mikhail Mikhail_Parkman at cable.comcast.com
Fri Jun 8 23:30:45 CEST 2012


Thank you, Allan - yes, this is so obvious after you spelled it out for me.
Somehow I was thinking in terms of absolute numbers of open files instead of % of the total.

From: Allan Clark [mailto:allanc at chickenandporn.com]
Sent: Friday, June 08, 2012 2:09 PM
To: Nagios Users List
Subject: Re: [Nagios-users] monitor number of open files on linux

On Fri, Jun 8, 2012 at 1:53 PM, Parkman, Mikhail <Mikhail_Parkman at cable.comcast.com> wrote:
Thanks - I decided to go with check_open_files.pl
http://exchange.nagios.org/directory/Plugins/Uncategorized/Operating-Systems/Linux/check-open-files/details

I didn't find help_me/read_me info for this plugin.
After I installed it on the target box into /usr/local/nagios/libexec and just executed it, I got:
----------
[root@target_host libexec]# ./check_open_files.pl
Usage:  -w <warn> -c <crit> [-t <timeout>] [-v version] [-h help]
[root@target_host libexec]#
======
That told me that I should run it at least with "-w some_value1 -c some_value2"
Then I tried running it with different -w and -c values, and I am not clear why I am getting different threshold values (the two numbers in parentheses):
===============
[root@target_host libexec]# ./check_open_files.pl -w 500 -c 10000
OK: open files (4590) is below threshold (16194515/323890300)|open_files=4590;16194515;323890300
[root@target_host libexec]# ./check_open_files.pl -w 1000 -c 10000
OK: open files (4590) is below threshold (32389030/323890300)|open_files=4590;32389030;323890300
[root@target_host libexec]# ./check_open_files.pl -w 10 -c 100
OK: open files (4590) is below threshold (323890/3238903)|open_files=4590;323890;3238903
===============
Why do I get two threshold values in the response, and why do they change each time I enter different warning and critical limits?

In general terms, compared with other plugins:

1) You're getting "OK" because 4590 is less than the thresholds you've set; had it exceeded 323890 (in the "-w 10" example) you'd get a WARNING, and had it exceeded the other threshold (3238903), a CRITICAL response.  The actual thresholds are returned because they are based on a calculation; when the value is below them but the user thinks it shouldn't be, the Nagios/Icinga screen shows the reference values as well as a comment.

2) your question as to why the numbers change might be more complex than I'm reading, but it's clearly taking % of total system files as a threshold:

-w 500 --> 500% of (cat /proc/sys/fs/file-max) ==> 16194515
-c 10000 --> 10000% of (cat /proc/sys/fs/file-max) ==> 323890300
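
The arithmetic can be checked with a short Python sketch. It is only an illustration and is not taken from the plugin's source: the function name percent_of is made up, the file-max value 3238903 is inferred from the "-c 100" output quoted above, and rounding down is an assumption that happens to match the quoted numbers.

```python
# Sketch only: reproduces the thresholds seen in the plugin output above.
# FILE_MAX is inferred from the "-c 100" run (100% ==> 3238903); on a live
# system the value would come from /proc/sys/fs/file-max.
FILE_MAX = 3238903


def percent_of(file_max: int, percent: int) -> int:
    """Return `percent` % of `file_max`, rounded down (assumed rounding)."""
    return file_max * percent // 100


# The three (-w, -c) pairs tried in the original message.
for warn_pct, crit_pct in [(500, 10000), (1000, 10000), (10, 100)]:
    warn = percent_of(FILE_MAX, warn_pct)
    crit = percent_of(FILE_MAX, crit_pct)
    print(f"-w {warn_pct} -c {crit_pct} ==> {warn}/{crit}")
```

Each printed pair matches the "threshold (warn/crit)" values in the three runs above, which is consistent with the percentage reading.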

Have I misread your question(s)?

I would suggest you set your thresholds to alarm on percentages; I'm not sure 50% and 80% are good numbers, but "-w 50 -c 80" would achieve those.
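
As a rough sketch of what "-w 50 -c 80" would mean in practice, the Python below classifies an open-file count against percentage thresholds. The names check_open_files and read_file_nr are hypothetical, the ">=" comparison and the message wording are assumptions modelled on the output quoted earlier, and reading /proc/sys/fs/file-nr is one possible way to get the count, not necessarily what the plugin does.

```python
# Sketch only, not the plugin's code. Exit codes follow the usual Nagios
# plugin convention: 0 = OK, 1 = WARNING, 2 = CRITICAL.
OK, WARNING, CRITICAL = 0, 1, 2


def check_open_files(open_files, file_max, warn_pct, crit_pct):
    """Compare open_files with thresholds given as a % of file_max."""
    warn = file_max * warn_pct // 100
    crit = file_max * crit_pct // 100
    if open_files >= crit:          # assumed comparison; could be ">"
        code, label = CRITICAL, "CRITICAL"
    elif open_files >= warn:
        code, label = WARNING, "WARNING"
    else:
        code, label = OK, "OK"
    # Human-readable text, then perfdata after the "|" as in the output above.
    message = (f"{label}: open files ({open_files}), thresholds ({warn}/{crit})"
               f"|open_files={open_files};{warn};{crit}")
    return code, message


def read_file_nr(path="/proc/sys/fs/file-nr"):
    """Return (allocated handles, system maximum) from file-nr on Linux."""
    with open(path) as fh:
        allocated, _unused, maximum = (int(x) for x in fh.read().split())
    return allocated, maximum


# Values from this thread: 4590 open files, file-max inferred as 3238903.
code, message = check_open_files(4590, 3238903, 50, 80)
print(code, message)
```

On a Linux host, the pair returned by read_file_nr() could be passed as the first two arguments instead of the fixed numbers.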

Allan
--
allanc at chickenandporn.com  "金鱼" http://linkedin.com/in/goldfish
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
::: Messages without supporting info will risk being sent to /dev/null

