bischeck suddenly stops working

Francesco Giuseppe Toffoli ftoffoli at skylogic.it
Tue Jul 25 09:55:08 CEST 2017


Hi Anders,
thanks for your reply. Here are my answers to your questions:

(1) The Java version is:

openjdk version "1.8.0_91"
OpenJDK Runtime Environment (build 1.8.0_91-b14)
OpenJDK 64-Bit Server VM (build 25.91-b14, mixed mode)

and has not been updated recently. In our test environment (where the
problem does not occur) the version is nearly the same (1.8.0_121).
The OS (CentOS release 6.6) has not been updated either.

(2) Redis has not been updated recently (Redis 2.8.23). At the moment
about 13,000 keys are in use.

(3) We add checks regularly, roughly once a week. The issue started some
months ago, but everything can run fine for 2 or 3 weeks and then we
get several crashes within a single week. I'm not inclined to blame
newly added checks, also because the test server is kept aligned with
the production one.


(5) Yes, the restart is done via '/etc/init.d/bischeckd restart' and it
solves the issue. Physical memory on the server is always OK, so I don't
think the JVM is running out of memory.

In the Bischeck logs I didn't notice any errors. However, at the next
crash I'll take a closer look at them.
Are there any other logs I could look at?
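
One way to collect evidence at the next hang, before restarting, would be
a JVM thread dump. What follows is only a minimal sketch under stated
assumptions, not something taken from bischeck itself: it assumes that the
bischeck log stops being written when the daemon stalls (plausible with
DEBUG logging, but unverified), that the JDK tool jstack is installed, and
that the script runs as the same user as the bischeck JVM. The function
names, the log path and the PID passed in are hypothetical.

```python
import os
import subprocess
import time


def log_is_stale(log_path, max_age_seconds, now=None):
    """Return True if log_path was last modified more than max_age_seconds ago.

    Assumption: a stalled daemon stops writing to its log file.
    """
    if now is None:
        now = time.time()
    return (now - os.path.getmtime(log_path)) > max_age_seconds


def capture_thread_dump(pid, out_path):
    """Run `jstack <pid>` and write its output to out_path.

    Returns the jstack exit code. jstack ships with the JDK, not the JRE,
    and must run as the same user as the target JVM.
    """
    with open(out_path, "w") as out:
        result = subprocess.run(
            ["jstack", str(pid)],
            stdout=out,
            stderr=subprocess.STDOUT,
        )
    return result.returncode
```

A cron job could call log_is_stale() on the bischeck log every few minutes
and, when it returns True, call capture_thread_dump() with the daemon's PID
before anyone restarts the service. If jstack is not available, sending
SIGQUIT to the JVM (kill -3 <pid>) also makes HotSpot print a thread dump,
but it goes to the JVM's standard output, and where the init script sends
that is not known here.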

Thanks,
Francesco





On 24/07/2017 21:57, Anders Håål wrote:
>
> Hi Giuseppe,
>
> It sounds strange that it just stopped working after a long time of
> stability unless something has changed:
>
> - Has anything changed on the server you run bischeck on: OS, JDK
> version, ...?
>
> - Was the Redis version updated? Any change in configuration?
>
> - Have you added any new bischeck checks or changed something in the configuration?
>
> - Anything else you can think of that may have changed?
>
> When you say restarting, is it the normal /etc/init.d/bischeckd restart
> that fixes the problem? The reason I ask is that the script just does a
> kill with the TERM signal. If the JVM were in an out-of-memory
> situation that might not be enough, but you should have seen that in the
> log, I guess. Are you sure you do not have any ERROR or WARN entries in
> the log?
>
> /Anders
>
>
>
> On 07/24/2017 02:14 PM, Francesco Giuseppe Toffoli wrote:
>>
>> Hi,
>> we are experiencing a critical problem with Bischeck. For the last
>> couple of months it has occasionally stopped working without warning:
>> the daemon /etc/init.d/bischeckd is running but no check results are
>> sent to Nagios. Restarting the bischeck daemon fixes the issue.
>> Unfortunately we can't find any clue about the root cause in the
>> bischeck logs, not even with DEBUG logging enabled. The Redis database
>> seems to be working properly and no increase in memory or CPU usage is
>> reported on the server hosting bischeck while the issue occurs.
>>
>> Do you have any suggestions on how to investigate this in more depth?
>>
>> Regards,
>> Francesco
>>
>> -- 
>>
>> Francesco Giuseppe Toffoli
>> Monitoring Engineer
>>
>> GSE Department
>>
>> Tel: +39 01127387488
>>
>> Mobile: +39 349.800.60.35
>> Email: ftoffoli at skylogic.it
>>
>> Skylogic S. p. A.
>> Strada Pianezza, 289
>> 10151 Torino, Italy
>>
>>
>>
>> This message contains confidential information and is intended only 
>> for the individual named. If you are not the named addressee you 
>> should not disseminate, distribute or copy this e-mail. Please notify 
>> the sender immediately by e-mail if you have received this e-mail by 
>> mistake and delete this e-mail from your system. E-mail transmission 
>> cannot be guaranteed to be secure or error-free as information could 
>> be intercepted, corrupted, lost, destroyed, arrive late or 
>> incomplete, or contain viruses. The sender therefore does not accept 
>> liability for any errors or omissions in the contents of this 
>> message, which arise as a result of e-mail transmission. If 
>> verification is required please request a hard-copy version. Please 
>> note that any views or opinions presented in this email are solely 
>> those of the author and do not necessarily represent those of the 
>> Company.
>> No employee or agent is authorized to conclude any binding agreement 
>> on behalf of this Company nor, through this latter, any of the 
>> Eutelsat Communication group with another party by email without 
>> express written confirmation by a duly authorized officer of the 
>> Company. The list of duly authorized officers and the scope of their 
>> powers is published on the Trade Register according to the national 
>> law of each affiliate.
>
> -- 
>
>
> Ingby<http://www.ingby.com>
>
> bischeck - dynamic and adaptive monitoring for Nagios<http://www.bischeck.org>
>
> anders.haal at ingby.com<mailto:anders.haal at ingby.com>
>
> Mjukvara genom ingenjörsmässig kreativitet och kompetens
>
> Ingenjörsbyn
> Box 531
> 101 30 Stockholm
> Sweden
> www.ingby.com  <http://www.ingby.com/>
> Mobil: +46 70 575 35 46
> Tele: +46 75 75 75 090
> Fax:  +46 75 75 75 091

