False alerts on http service

francis picabia fpicabia at gmail.com
Wed Sep 12 19:03:16 CEST 2012


We have used nagios successfully for many years and have never seen
a case like this.  I cannot get nagios to see that the remote
http service is up, although the check command reports it as up
and the remote apache log shows that nagios visited with no error.

The site to monitor runs webwork, a math quiz system.  I have it
set to redirect / to /webwork and also redirect insecure to https.

At first I did a plain check_http.

I then switched to the -S option and added -u with the full URL to avoid
hitting the redirects and get a clean code 200 returned, in case the
redirects were muddling things.  No difference.

When I look at the apache log, I can see the visits from nagios.
During the early morning visits no one is using the system, so
load on the server can't be making it unresponsive.

Here is my check command:


# 'check_www_ssl' command definition
define command{
        command_name    check_www_ssl
        command_line    $USER1$/check_http -S -I $HOSTADDRESS$ -f follow -w 5 -c 20 -t 60 -u $ARG1$
        }

Here is my service:


define service{
        use                             generic-service
        host_name                       webwork
        is_volatile                     0
        service_description             Webwork Web Service
        check_command                   check_www_ssl!'https://webwork.example.com/webwork/'
        check_period                    24x7
        contact_groups                  unix-admins
        max_check_attempts              3
        normal_check_interval           3
        retry_check_interval            1
        notification_interval           120
        notification_period             24x7
        notification_options            w,u,c,r
        }

Of course I have changed the actual domain to example.com in the above.
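
For reference, Nagios splits the check_command value at "!": the part
before it names the command, and the parts after it become $ARG1$,
$ARG2$ and so on in that command's command_line.  The sketch below
shows the substitution for $ARG1$ only.  The function name expand_arg1
is made up for illustration, and the sketch ignores every other macro
($USER1$, $HOSTADDRESS$) and any further "!"-separated arguments.

```shell
# Hypothetical helper: print the command line that results from putting
# the argument of a check_command value into a command_line template.
# Only $ARG1$ is handled; real Nagios expands many more macros.
expand_arg1() {
    check_command="$1"   # e.g. check_www_ssl!/webwork/
    template="$2"        # the command_line of the command definition
    arg1=${check_command#*!}              # text after the first "!"
    case "$template" in
        *\$ARG1\$*)
            before=${template%%\$ARG1\$*} # text before the first $ARG1$
            after=${template#*\$ARG1\$}   # text after the first $ARG1$
            printf '%s%s%s\n' "$before" "$arg1" "$after"
            ;;
        *)
            # Template does not use $ARG1$: print it unchanged.
            printf '%s\n' "$template"
            ;;
    esac
}
```

Calling it with the check_command value from the service definition and
the command_line from the command definition (both in single quotes so
the shell leaves "$" and "!" alone) prints the template with the URL in
place of $ARG1$.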

The alert report:

***** Nagios 3.2 *****

Notification Type: PROBLEM
Host: webwork
State: DOWN
Address: 131.162.201.91
Info: Server answer:

Date/Time: Wed Sept 12 06:59:04 ADT 2012


Here is a sample visit from nagios in the webwork apache log file
before this time.

XXX.YYY.2.50 - - [12/Sep/2012:06:58:50 -0300] "GET https://webwork.acadiau.ca/webwork/ HTTP/1.0" 200 5015 "-" "check_http/v1.4.14 (nagios-plugins 1.4.14)"

Our apache logs show nagios is visiting every 3 minutes, 24 hours a day.  None
of these visits results in an error.

In a nagios log, this is all that appears for webwork for the day:

# grep webwork nagios-09-11-2012-00.log
[1347246000] CURRENT HOST STATE: webwork;DOWN;HARD;1;Server answer:
[1347246000] CURRENT SERVICE STATE: webwork;Webwork Web Service;OK;HARD;1;HTTP OK: HTTP/1.1 200 OK - 4053 bytes in 0.274 second response time
[1347270994] HOST NOTIFICATION: david;webwork;DOWN;host-notify-by-email;Server answer:
[1347270994] HOST NOTIFICATION: bob;webwork;DOWN;host-notify-by-email;Server answer:
[1347270994] HOST NOTIFICATION: winston;webwork;DOWN;host-notify-by-email;Server answer:
[1347270994] HOST NOTIFICATION: larry;webwork;DOWN;host-notify-by-email;Server answer:
[1347299794] HOST NOTIFICATION: david;webwork;DOWN;host-notify-by-email;Server answer:
[1347299794] HOST NOTIFICATION: bob;webwork;DOWN;host-notify-by-email;Server answer:
[1347299794] HOST NOTIFICATION: winston;webwork;DOWN;host-notify-by-email;Server answer:
[1347299794] HOST NOTIFICATION: larry;webwork;DOWN;host-notify-by-email;Server answer:
[1347328594] HOST NOTIFICATION: david;webwork;DOWN;host-notify-by-email;Server answer:
[1347328595] HOST NOTIFICATION: bob;webwork;DOWN;host-notify-by-email;Server answer:
[1347328595] HOST NOTIFICATION: winston;webwork;DOWN;host-notify-by-email;Server answer:
[1347328595] HOST NOTIFICATION: larry;webwork;DOWN;host-notify-by-email;Server answer:
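
The grep output above mixes two kinds of entries: host entries
(CURRENT HOST STATE, HOST NOTIFICATION) and service entries
(CURRENT SERVICE STATE).  A rough way to tally them per entry type is
sketched below.  It assumes the "[epoch] TYPE: field;field;..." layout
seen in the excerpt, and the function name count_entry_types is made up
for illustration.

```shell
# Hypothetical helper: count Nagios log entries per entry type for one host.
# Reads log lines on stdin and prints "<count> <entry type>" lines.
count_entry_types() {
    awk -v host="$1" '
        {
            # Split "[1347246000] CURRENT HOST STATE: webwork;DOWN;..." at
            # the first ": " into an entry type and a ";"-separated payload.
            sep = index($0, ": ")
            if (sep == 0) next
            kind = substr($0, 1, sep - 1)
            sub(/^\[[0-9]+\] /, "", kind)      # drop the leading timestamp
            split(substr($0, sep + 2), f, ";")
            # The host name is field 1 on state lines and field 2 on
            # notification lines (field 1 there is the contact).
            if (f[1] == host || f[2] == host) count[kind]++
        }
        END { for (k in count) print count[k], k }
    ' | sort
}
```

It could be run as "count_entry_types webwork < nagios-09-11-2012-00.log".
A contact whose name matches the host name would be miscounted, which
is acceptable for a quick look.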

If I do the check_http manually, I seem to get through fine:

# /usr/lib/nagios3.2/libexec/check_http 0-S -I webwork -f follow -w5 -c 20 -t 60 -u https://webwork.example.com/webwork
HTTP OK: HTTP/1.1 200 OK - 5162 bytes in 0.025 second response time |time=0.024700s;5.000000;20.000000;0.000000 size=5162B;;;0
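
Besides the text it prints, a plugin reports its result to Nagios
through its exit status: 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is
UNKNOWN.  A small wrapper for checking that during a manual run is
sketched below.  The names plugin_state_name and run_check are made up
for illustration.

```shell
# Hypothetical helper: map a Nagios plugin exit status to its state name.
plugin_state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "OUT OF RANGE" ;;
    esac
}

# Hypothetical wrapper: run a plugin command line as given, let its own
# output through, then print the exit status and the state it stands for.
run_check() {
    "$@"
    status=$?
    echo "exit status $status ($(plugin_state_name "$status"))"
    return "$status"
}
```

Prefixing the manual command with run_check, for example
"run_check /usr/lib/nagios3.2/libexec/check_http" followed by the same
arguments, prints the plugin output and then the status line.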

Can anyone spot why this alert is not set up properly, or suggest
a better way to do it?
