check attempts problem

kudjat kudjat at voila.fr
Wed Jan 28 12:27:52 CET 2004


Hello,

I wrote my own Nagios plugins.
They are run remotely over SSH.
They are written in Perl and run on Linux or Win2K servers.
They work, in the sense that their service status and description are displayed in Nagios.

However, when such a check returns a CRITICAL or WARNING state, Nagios does not retry the check 1 minute later, as I configured it to do in services.cfg.
As a consequence, the service never reaches max_check_attempts, so Nagios never notifies me.

Thanks for your help.

==>Here is the service definition from /etc/nagios/services.cfg:

define service{
use             generic-service
host_name       nt_francois_2
service_description IS_WINDOWS_CLOCK_RUNNING
is_volatile     0
contact_groups                 nt-admins
notification_period            workhours
notification_interval          15
normal_check_interval   5
retry_check_interval    1
max_check_attempts      5
check_period    workhours
check_command   check_ssh_pour_checks_persos!"./check_service_generique_v3 --process W32Time --desc IS_WINDOWS_CLOCK_RUNNING"
}

==>Here is my check_ssh_pour_checks_persos command definition from the checkcommands.cfg file:

define command{
command_name    check_ssh_pour_checks_persos
command_line    $USER1$/check_ssh_pour_checks_persos -H $HOSTADDRESS$ -C $ARG1$
}
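
To make the link between the two definitions explicit, here is a minimal sketch of how the check_command value is turned into a command line. It is a simplified model, not Nagios code: expand_command is a hypothetical helper, the value given for $USER1$ is an assumption (it is normally set in resource.cfg), and the address is the one that appears in the log extract further down.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of Nagios: a simplified model of how a
# check_command value is split on "!" and substituted into command_line.
sub expand_command {
    my ($check_command, $command_line, %macros) = @_;
    my ($name, @args) = split /!/, $check_command;
    # $ARG1$, $ARG2$, ... come from the "!"-separated fields after the name.
    for my $i (0 .. $#args) {
        $macros{'ARG' . ($i + 1)} = $args[$i];
    }
    # Replace each known $MACRO$; unknown macros are left untouched.
    $command_line =~ s/\$(\w+)\$/exists $macros{$1} ? $macros{$1} : "\$$1\$"/ge;
    return $command_line;
}

my $expanded = expand_command(
    'check_ssh_pour_checks_persos!"./check_service_generique_v3 --process W32Time --desc IS_WINDOWS_CLOCK_RUNNING"',
    '$USER1$/check_ssh_pour_checks_persos -H $HOSTADDRESS$ -C $ARG1$',
    USER1       => '/usr/lib/nagios/plugins',   # assumed plugin directory
    HOSTADDRESS => '10.207.51.121',             # address seen in the log
);
print "$expanded\n";
```

Under these assumptions, the whole quoted string after the "!" ends up as the single -C argument of the wrapper script.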

==>Here is my Perl check_ssh_pour_checks_persos script (I know it is a really dirty script):


#! /usr/bin/perl
$NAGIOS_HOME="/usr/lib/nagios/plugins";
$NAGIOS_COMMAND_FILE_PATH="/var/log/nagios/rw/nagios.cmd";
# I did not paste the argument-checking lines...


$cmd_a_exec="$NAGIOS_HOME/check_by_ssh -H $host_address  -t 40 -C '$cmd_a_executer_chez_l_hote'";
$return_code = system("$cmd_a_exec > /var/log/nagios/rw/test");
# Useful for monitoring what the ssh check returns: while true; do more /var/log/nagios/rw/test; done
open(FH,"/var/log/nagios/rw/test");
$first_line=<FH>;
close(FH);
$la_date=time();
$to_nagios_cmd_file="[$la_date] PROCESS_SERVICE_CHECK_RESULT;$host_address;$first_line";
system("echo \"$to_nagios_cmd_file\" >> $NAGIOS_COMMAND_FILE_PATH");
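
For reference, Nagios takes the result of an active check from the plugin's standard output and its exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), while a passive result is a single line written to the command file. The sketch below covers both pieces under stated assumptions: exit_code_from_status and passive_result_line are hypothetical helpers, the host field is shown as the configured host_name rather than the address, and the timestamp is simply the value that appears in the log extract below.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: convert the wait status returned by system()
# into a plugin exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).
sub exit_code_from_status {
    my ($status) = @_;
    # -1 means the command could not be started; the low 7 bits hold a signal.
    return 3 if $status == -1 || ($status & 127);
    my $code = $status >> 8;
    return $code <= 3 ? $code : 3;
}

# Hypothetical helper: build one passive-result line for the command file.
# Field order: [timestamp] COMMAND;host;service description;return code;output
sub passive_result_line {
    my ($timestamp, $host, $service, $code, $output) = @_;
    $output =~ s/[\r\n]+\z//;    # drop the newline read from the file
    return "[$timestamp] PROCESS_SERVICE_CHECK_RESULT;$host;$service;$code;$output";
}

my $line = passive_result_line(
    1075217867,                          # example timestamp from the log
    'nt_francois_2',
    'IS_WINDOWS_CLOCK_RUNNING',
    exit_code_from_status(2 << 8),       # wait status of a command that exited 2
    "Process W32Time arrete! Demarres le!\n",
);
print "$line\n";
```

A wrapper that is itself run as an active check would also need to print the output line and exit with that code. The script above redirects the check_by_ssh output to a file and ends without an explicit exit code, which may be related to the "(No output!)" entry in the log.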

==>Here is a log extract:

==>Performing the "Re-schedule the next check of this service" command:
Jan 27 16:37:49 nagios nagios: EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;nt_francois_2;IS_WINDOWS_CLOCK_RUNNING;1075217867
Jan 27 16:37:55 nagios nagios: EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;10.207.51.121;IS_WINDOWS_CLOCK_RUNNING;2;Process W32Time arrete! Demarres le!
Jan 27 16:38:00 nagios nagios: SERVICE ALERT: nt_francois_2;IS_WINDOWS_CLOCK_RUNNING;CRITICAL;SOFT;1;Process W32Time arrete! Demarres le!
==>This seems OK, since there is a service alert. I then wait for the next check, which should happen 1 minute later, as set in the services.cfg file.
==>No further checks and no further SERVICE ALERT... I was expecting the 4 remaining checks before the notification.
==>What follows is a non-forced check:
Jan 27 16:42:54 nagios nagios: EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;10.207.51.121;IS_WINDOWS_CLOCK_RUNNING;2;Process W32Time arrete! Demarres le!
==>I wonder why I get the next line, and why it is followed by the correct output:
Jan 27 16:43:00 nagios nagios: SERVICE ALERT: nt_francois_2;IS_WINDOWS_CLOCK_RUNNING;OK;SOFT;2;(No output!)
Jan 27 16:43:00 nagios nagios: SERVICE ALERT: nt_francois_2;IS_WINDOWS_CLOCK_RUNNING;CRITICAL;SOFT;1;Process W32Time arrete! Demarres le!
==>And then the next check will happen at 16:48...

==>So I would like to know whether the missing notification (max_check_attempts never being reached) is caused by the nearly blank line (...OK;SOFT;2;(No output!)) inserted in the Nagios log file.
PS: notifications work fine with the standard Nagios plugins.
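
The effect of an interleaved OK result on the attempt counter can be sketched with a deliberately simplified model of retry handling. This is an assumption for illustration, not the Nagios scheduler: replay_results is a hypothetical function, and it ignores check scheduling, recovery logging, and transitions between different non-OK states.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @STATE_NAMES = qw(OK WARNING CRITICAL UNKNOWN);

# Simplified model (an assumption, not Nagios source): each non-OK result
# increments the attempt counter until max_check_attempts is reached, at
# which point the state becomes HARD; an OK result resets the counter.
sub replay_results {
    my ($max_attempts, @codes) = @_;
    my $attempt = 0;
    my @trace;
    for my $code (@codes) {
        if ($code == 0) {
            $attempt = 0;
            push @trace, 'OK';
        } else {
            $attempt++ if $attempt < $max_attempts;
            my $type = $attempt >= $max_attempts ? 'HARD' : 'SOFT';
            push @trace, "$STATE_NAMES[$code];$type;$attempt";
        }
    }
    return @trace;
}

# Five consecutive CRITICAL results with max_check_attempts = 5.
print join("\n", replay_results(5, 2, 2, 2, 2, 2)), "\n\n";

# CRITICAL results separated by OK results, as in the log above.
print join("\n", replay_results(5, 2, 0, 2, 0, 2)), "\n";
```

In this model the first sequence ends in CRITICAL;HARD;5, whereas the second never gets past CRITICAL;SOFT;1, which matches the pattern seen in the log.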

Many thanks for your help

François
