Warning: Monitoring process may not be running! (take 2)

Andreas Ericsson ae at op5.se
Fri Feb 18 21:01:49 CET 2005


Tom Gwilt wrote:
> Hi all,
> 
> I recently installed nagios 1.2 on a FreeBSD 5.3 box from the ports, 
> along with the plugins, slifeed plugins, snmp plugins, and nagiostat.
> 
> I'm new to Nagios, but not to Unix, nor to config files - so I've spent 
> the previous few days reading the documentation, writing the config 
> files, and testing.
> 

That's a first. Most users dive headlong into the list after having 
failed to get the sample config to monitor their entire network. ;)

> Everything seems to be well, except for the error referenced in the 
> subject.
> 
> Here is some of the pertinent info:
> 
> cgi.cfg - nagios_check_command:
> /usr/local/libexec/check_nagios /var/spool/nagios/status.sav 5 
> '/usr/local/bin/nagios'
> 
> nagios.cfg - logfiles, etc.
> log_file=/var/spool/nagios/nagios.log
> status_file=/var/spool/nagios/status.log
> command_file=/var/spool/nagios/rw/nagios.cmd
> 
> etc.
> 
> When checking the Process Status Information via the web interface, here 
> is the message:
> 
> Check Command Output:     Nagios problem: located 1 process, status log 
> updated 1108752482 seconds ago
> 
> What's interesting is the update looks like it goes back to the epoch.
> 
> /var/spool/nagios/status.log and status.sav show 0 bytes. There is a lot 
> of free space on var:
> 

What does strace tell you? Perhaps it's stuck somewhere (filesystem 
spinlocks?) in interruptible IO. I know this can happen with named pipes 
on Linux, but I've never seen Nagios stall from it.
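
On the "goes back to the epoch" observation: a rough sketch of why an empty 
status log could produce that number. It assumes (not confirmed here) that 
the check subtracts the newest timestamp it finds in the status log from the 
current time, and that the timestamp stays 0 when the file is empty. The 
function name status_age is made up for illustration.

```python
from datetime import datetime, timezone


def status_age(now, latest_entry_time=0):
    """Seconds between `now` and the newest status-log timestamp.

    Assumption: latest_entry_time stays 0 when the status log is empty.
    """
    return now - latest_entry_time


# Value taken from the "Check Command Output" quoted above.
reported_age = 1108752482

# If the newest timestamp was 0, the reported age is simply the Unix time
# at which the check ran, so decoding it should give a plausible date.
check_ran_at = datetime.fromtimestamp(reported_age, tz=timezone.utc)
print(check_ran_at.isoformat())  # 2005-02-18T18:48:02+00:00
```

If that assumption holds, the age is not a real age at all: it just says 
that no timestamp was found in the 0-byte status file.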

> lab01# df -H | grep var
> 
> Filesystem     Size    Used   Avail Capacity  Mounted on
> /dev/da0s1e    1.0G     29M    902M     3%    /var
> 
> Also a lot of free inodes. Permissions are as follows:
> 

Neither permissions nor inodes should be a problem since the files are 
created properly.

> lab01# ls -l /var/spool
> total 18
> drwxrwx---  2 smmsp   smmsp   512 Feb 17 15:35 clientmqueue
> drwxr-xr-x  3 root    daemon  512 Feb  4 23:56 cups
> drwxrwxr-x  2 uucp    dialer  512 Feb 17 22:42 lock
> drwxr-xr-x  2 root    daemon  512 Nov  4 20:23 lpd
> drwxr-xr-x  2 root    daemon  512 Nov  4 20:23 mqueue
> drwxrwxr-x  5 nagios  nagios  512 Feb 18 13:29 nagios
> drwx------  2 root    daemon  512 Nov  4 20:23 opielocks
> drwxr-xr-x  4 root    daemon  512 Feb 14 12:31 output
> drwxrwxrwt  2 root    wheel   512 Feb  4 23:56 samba
> 
> lab01# ls -l /var/spool/nagios
> total 14
> drw-r--r--  2 nagios  nagios   512 Feb 18 10:47 archives
> -rw-r--r--  1 nagios  nagios     0 Feb 18 12:02 comment.log
> -rw-r--r--  1 nagios  nagios     0 Feb 18 12:02 downtime.log
> -rw-r--r--  1 nagios  nagios     6 Feb 18 13:29 nagios.lock
> -rw-r--r--  1 nagios  nagios  5886 Feb 18 13:29 nagios.log

Any hints in nagios.log? The fact that it has written some data to it is 
encouraging.
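
A small sketch for skimming nagios.log, in case it helps. Nagios log lines 
start with a Unix timestamp in square brackets; everything else here (the 
function name readable_entries, the keyword list, and the sample lines) is 
made up for illustration and is not taken from the log in question.

```python
import re
from datetime import datetime, timezone

# "[<unix timestamp>] <message>"
LOG_LINE = re.compile(r"^\[(\d+)\]\s+(.*)$")


def readable_entries(lines, keywords=("warning", "error")):
    """Return (UTC time string, message) pairs for lines mentioning a keyword.

    The keyword list is an assumption; pass keywords=() to keep every line.
    """
    entries = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip lines that do not start with a [timestamp]
        stamp, message = int(match.group(1)), match.group(2)
        if keywords and not any(k in message.lower() for k in keywords):
            continue
        when = datetime.fromtimestamp(stamp, tz=timezone.utc)
        entries.append((when.strftime("%Y-%m-%d %H:%M:%S"), message))
    return entries


# Hypothetical sample lines, only to show the shape of the output.
sample = [
    "[1108752482] Warning: example message one",
    "[1108752483] example informational message",
]
for when, message in readable_entries(sample):
    print(when, message)
```

To run it against the real file, something like 
readable_entries(open("/var/spool/nagios/nagios.log")) should do.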

> drwxr-xr-x  2 root    nagios   512 Feb 18 10:26 nagiostatrrd
> drwxrwxr-x  2 nagios  www      512 Feb 18 13:29 rw
> -rw-r--r--  1 nagios  nagios     0 Feb 18 13:29 status.sav
> -rw-r--r--  1 nagios  nagios     0 Feb 18 13:29 status.log
> 
> The output of the ps command as defined by the config.h for the plugins:
> 
> /bin/ps -axwo 'stat uid ppid ucomm command'
> 
> Ss    1003     1 nagios              /usr/local/bin/nagios -d 
> /usr/local/etc/nagios/nagios.cfg
> 
> Anybody got any thoughts?
> 

Run it in gdb. Step 1000 instructions or so at a time and put a 
breakpoint at write_to_all_logs so you see what's going on. Run an 
unstripped version. You could also try compiling from source (and bypass 
the ports directory). I know for a fact that Nagios runs successfully on 
many *bsd systems, so it's at least solvable.

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Lead Developer

