nagios-cvs: Too many open files?
Gerd Mueller
gmueller at netways.de
Thu Feb 8 18:45:52 CET 2007
> 1. Too many physical temp files.
> 2. Too many open files that were deleted files, but still have kernel
> references
>
> For #1, could you sort the files by modification time and see how they
> look. If you've got a lot of "old" files (> 1 hour), there's a problem.
> Some of these older files are normal, as I've mentioned before, and
> it's best to run something like tmpwatch on the directory to remove them.
Yes, a few of them (about 150) have content and are older than 1h. OK, I
will take care of them. Btw, Debian's counterpart to tmpwatch is called
tmpreaper :-)
nag01:/tmp# ls -al | grep nagios | grep -e "nagios\s*0\s*" -v | wc -l
150
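The grep pipeline above only separates empty from non-empty files; it does not look at age. A minimal Python sketch that splits a directory into "old, non-empty" and "0-byte" files could look like this. The function name, the one-hour default, and the optional uid filter are illustrative assumptions, not anything Nagios ships.

```python
import os
import time


def split_temp_files(directory, uid=None, max_age_s=3600, now=None):
    """Return (old_nonempty, zero_byte) lists of regular files in `directory`.

    uid        -- if given, only files owned by this numeric uid are considered
    max_age_s  -- a non-empty file counts as "old" when its mtime is older than this
    """
    now = time.time() if now is None else now
    old_nonempty = []  # (mtime, path) pairs so we can sort oldest first
    zero_byte = []
    with os.scandir(directory) as entries:
        for entry in entries:
            try:
                if not entry.is_file(follow_symlinks=False):
                    continue
                st = entry.stat(follow_symlinks=False)
            except OSError:
                continue  # the file vanished while we were scanning
            if uid is not None and st.st_uid != uid:
                continue
            if st.st_size == 0:
                zero_byte.append(entry.path)
            elif now - st.st_mtime > max_age_s:
                old_nonempty.append((st.st_mtime, entry.path))
    old_nonempty.sort()
    return [path for _, path in old_nonempty], sorted(zero_byte)
```

Assuming the checks run as a user named nagios, it could be called as `split_temp_files("/tmp", uid=pwd.getpwnam("nagios").pw_uid)` after `import pwd`.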
> For #2...
> lsof reports a number of temp files that are still open, but were
> deleted. You can see if this is your problem by running:
>
> lsof | grep nagios | grep DEL
>
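On Linux, lsof marks deleted memory-mapped files with DEL in its FD column, and roughly the same information can be read from /proc/<pid>/maps, where such mappings carry a trailing " (deleted)". The sketch below parses that text. It assumes the usual six-field maps layout, and the function names are made up for illustration.

```python
def deleted_mappings(maps_text):
    """Return sorted, unique pathnames of mappings marked '(deleted)'.

    `maps_text` is the content of a /proc/<pid>/maps file.
    """
    suffix = " (deleted)"
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode pathname
        parts = line.rstrip().split(None, 5)
        if len(parts) == 6 and parts[5].endswith(suffix):
            paths.add(parts[5][: -len(suffix)])
    return sorted(paths)


def deleted_mappings_for_pid(pid):
    """Read /proc/<pid>/maps (Linux only) and list its deleted mappings."""
    with open(f"/proc/{pid}/maps") as fh:
        return deleted_mappings(fh.read())
```

Reading another process's maps file normally requires running as the same user or as root.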
> I did some digging and this was caused by mmap() and munmap() when
> Nagios encountered a temp file of 0 byte size, which will happen when
> checks have no output. I changed the code to skip mmap()ing altogether
> when it encounters 0 byte files, and that solved the problem for me. A
> patch will be in CVS shortly for this...
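The guard described above can be sketched as follows. This is not the Nagios patch (Nagios is written in C and the patch itself is not shown in this thread); it is a Python illustration of the same idea, with a hypothetical function name: check the file size first and never call mmap() on a 0-byte file.

```python
import mmap
import os


def read_check_output(path):
    """Return the file's contents as bytes, skipping mmap() for 0-byte files."""
    with open(path, "rb") as fh:
        size = os.fstat(fh.fileno()).st_size
        if size == 0:
            # Nothing to map: the check produced no output.
            return b""
        # Map the file read-only and copy its contents out before unmapping.
        with mmap.mmap(fh.fileno(), size, access=mmap.ACCESS_READ) as mapped:
            return mapped[:]
```

Both the mapping and the file descriptor are released when the `with` blocks exit, so no reference to the temp file is left behind on either path.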
It's true we are checking hundreds of services on unreachable networks
to test the new host checking logic. ;-) Most of these checks end with
"(Service Check Timed Out)". So that must be what produces these 0-byte files.
nag01:/tmp# ls -al | grep nagios | grep -e "nagios\s*0\s*" | wc -l
21473
I will check your patch on my nagios-cvs installations as soon as it is
available.