Regarding Trends status after Network Outage

Nilesh nilesh at databasedba.com
Wed Dec 8 09:38:58 CET 2004


Dear Eric,

I tried your Perl script, but no luck;
the problem still persists.
Is there any foolproof solution for this problem?

Waiting for your reply.
Regards,
Linux Admin

BOLLENGIER Eric wrote:

>Hi,
>
>I have the same bug (Nagios 1.2); it shows up as a race condition after a
>host reboot:
>
>ssh down -> reboot -> host up -> ssh down -> ssh up
>
>[1099042385] SERVICE ALERT: test;ssh;CRITICAL;SOFT;1;Connection refused
>[1099042445] SERVICE ALERT: test;ssh;CRITICAL;SOFT;2;Socket timeout
>[1099042525] SERVICE ALERT: test;ssh;CRITICAL;HARD;3;Socket timeout
>[1099042715] HOST ALERT: test;DOWN;SOFT;1;CRITICAL
>[1099042725] HOST ALERT: test;DOWN;SOFT;2;CRITICAL
>[1099042735] HOST ALERT: test;DOWN;SOFT;3;CRITICAL
>[1099042745] HOST ALERT: test;DOWN;SOFT;4;CRITICAL
>[1099042755] HOST ALERT: test;DOWN;HARD;5;CRITICAL
>[1099042755] SERVICE ALERT: test;ping;CRITICAL;HARD;1;CRITICAL
>[1099042935] HOST ALERT: test;UP;HARD;1;PING OK
>[1099042935] SERVICE ALERT: test;ping;OK;HARD;1;PING OK
>[1099042945] SERVICE ALERT: test;ssh;CRITICAL;SOFT;1;Socket timeout
>[1099043005] SERVICE ALERT: test;ssh;OK;SOFT;2;TCP OK
>
>====> BUG: ssh was in a CRITICAL HARD state, but the OK recovery is logged as SOFT!
>
>[1099043265] SERVICE ALERT: test;ssh;CRITICAL;SOFT;1;Socket timeout
>[1099043335] SERVICE ALERT: test;ssh;CRITICAL;SOFT;2;Socket timeout
>[1099043395] SERVICE ALERT: test;ssh;CRITICAL;HARD;3;Socket timeout
>[1099043475] HOST ALERT: test;DOWN;SOFT;1;CRITICAL
>[1099043485] HOST ALERT: test;DOWN;SOFT;2;CRITICAL
>[1099043495] HOST ALERT: test;DOWN;SOFT;3;CRITICAL
>[1099043505] HOST ALERT: test;DOWN;SOFT;4;CRITICAL
>[1099043515] HOST ALERT: test;DOWN;HARD;5;CRITICAL
>[1099043565] SERVICE ALERT: test;ping;CRITICAL;HARD;1;CRITICAL
>[1099043715] HOST ALERT: test;UP;HARD;1;PING OK
>[1099043715] SERVICE ALERT: test;ping;OK;HARD;1;PING OK
>[1099043745] SERVICE ALERT: test;ssh;CRITICAL;SOFT;1;Socket timeout
>[1099043815] SERVICE ALERT: test;ssh;CRITICAL;SOFT;2;Socket timeout
>[1099043865] SERVICE ALERT: test;ssh;OK;HARD;3;TCP OK
>
>=====> Here it is OK, because ssh comes back up after 2 checks
>
>If you want to look for this bug in your Nagios log files, you can use
>my simple Perl script (see attachment).
>
>PS: to use it, run it over each log file:
>
>for i in nagios-*2004*
>do
>	./mayday_bug_trends.pl "$i"
>done
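
The attached Perl script is not reproduced in this message. As a rough illustration of the pattern marked as BUG in the log excerpt above, here is a minimal Python sketch that flags a service whose last HARD state was a problem state and which then logs its OK recovery as SOFT. It is not Eric's script: the function name `find_soft_recoveries` is made up for this example, and it assumes log lines in the `[timestamp] SERVICE ALERT: host;service;state;type;attempt;output` form shown above.

```python
import re
import sys

# Matches lines such as:
#   [1099042385] SERVICE ALERT: test;ssh;CRITICAL;SOFT;1;Connection refused
# An optional leading ">" is tolerated so quoted excerpts can be pasted in.
SERVICE_ALERT = re.compile(
    r"^>?\s*\[(\d+)\] SERVICE ALERT: ([^;]+);([^;]+);([^;]+);(SOFT|HARD);(\d+);(.*)$"
)


def find_soft_recoveries(lines):
    """Return (timestamp, host, service) for each SOFT OK alert that
    follows a HARD non-OK state of the same service (hypothetical helper)."""
    last_hard = {}   # (host, service) -> state of the most recent HARD alert
    suspects = []
    for line in lines:
        match = SERVICE_ALERT.match(line.strip())
        if not match:
            continue  # HOST ALERT lines and other entries are ignored
        timestamp, host, service, state, state_type = match.group(1, 2, 3, 4, 5)
        key = (host, service)
        if state_type == "HARD":
            last_hard[key] = state
        elif state == "OK" and last_hard.get(key, "OK") != "OK":
            # Recovery logged as SOFT while the last HARD state was a problem.
            suspects.append((int(timestamp), host, service))
    return suspects


if __name__ == "__main__":
    # Usage: python find_soft_recoveries.py nagios-*2004*
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            for ts, host, service in find_soft_recoveries(handle):
                print(f"{path}: [{ts}] {host};{service} SOFT recovery after HARD problem")
```

Services that have no HARD alert earlier in the same file are treated as OK, so a problem that started in a previous log file is not detected by this sketch.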
>
>Regards
>
>On Thursday 02 December 2004 at 10:05 +0530, Nilesh wrote:
>  
>
>>Dear All,
>>
>>I have noticed strange behaviour of Trends in Nagios.
>>I'm using nagios-1.2.
>>
>>Whenever there is a network outage, Nagios updates the status
>>immediately.
>>After network connectivity recovers, all host checks and service checks
>>run again and update the availability information for hosts and
>>services. But many times Trends keeps showing hosts with
>>"HOST UNREACHABLE" status and services with "CRITICAL" status.
>>
>>In such cases, rebooting the Nagios server clears it,
>>but that is not a solution.
>>
>>So how can I resolve this problem?
>>What I want is that, as soon as a host and/or service check succeeds
>>after a network outage, Trends is updated immediately.
>>
>>Waiting for your reply.
>>With regards,
>>
>>Linux Admin
>>



_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users