NSCA strange behaviour

Cedric Jeanneret cedric.jeanneret at camptocamp.com
Tue Dec 8 11:40:00 CET 2009


Hello,

I'm having trouble with NSCA.
What we have:

- about 47 passive hosts
- about 220 passive services

Versions: all are Red Hat servers, with:
- NSCA 2.7.2 (latest one)
- Nagios 3.1.2

We have a single "Nagios aggregator", which collects the NSCA status from all the other hosts.

What's happening:
A host was reinstalled yesterday (say client22), and now the NSCA daemon on the aggregator (say server01) doesn't seem to collect its data.

What I've done:

- Ran tcpdump on both client22 and server01: both show traffic between them on the default NSCA port (5667).

- Checked the iptables rules: all is OK (the traffic seen in tcpdump confirms it).

- Tried pushing the status by hand from client22 to server01: ALL packets are sent successfully ("1 data packet(s) sent to host successfully."). I did this with a loop like this:
for i in $(seq 1000); do /usr/local/bin/submit_ochp $(hostname -f) UP 'Host is up'; sleep 2; done

- Enabling debug for NSCA on server01 doesn't show me anything interesting. I just don't see where NSCA picks up client22's status, and it keeps saying:
Warning: The results of host 'client22.domain.lt' are stale by 0d 0h 2m 0s (threshold=0d 0h 6m 0s).  I'm forcing an immediate check of the host.

On the other hand, it shows me:
[1260267958.216051] [016.1] [pid=23191] Check results for service 'Cron service' on host 'client22.domain.lt' are fresh.
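
For reference, here is a minimal sketch of what a wrapper such as submit_ochp typically does, assuming it feeds send_nsca with the tab-separated host check line described in the NSCA README (host name, return code, plugin output). The function names are hypothetical, and the send_nsca path and config file location in the usage comment are assumptions, not our actual setup:

```shell
#!/bin/sh
# Hypothetical helper: map a host state name to the numeric code
# used for passive host checks (0=UP, 1=DOWN, 2=UNREACHABLE).
host_state_code() {
    case "$1" in
        UP)          echo 0 ;;
        DOWN)        echo 1 ;;
        UNREACHABLE) echo 2 ;;
        *) echo "unknown host state: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: print one host check result in the form
# send_nsca reads on stdin: <host>TAB<return code>TAB<output>NEWLINE
build_host_result() {
    code=$(host_state_code "$2") || return 1
    printf '%s\t%s\t%s\n' "$1" "$code" "$3"
}

# Usage (not executed here; binary path and config file are assumptions):
# build_host_result "$(hostname -f)" UP 'Host is up' \
#     | /usr/sbin/send_nsca -H server01 -p 5667 -c /etc/nagios/send_nsca.cfg
```

Printing the line without piping it to send_nsca makes it easy to compare the host name being sent with the host_name defined on the aggregator.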


I really don't know where to look for a solution, nor where the real problem is. We have another network with about 200 passive hosts and over 350 passive services, and it works fine.

The only differences are:
- the working network is Debian-only
- the working network's NSCA server does nothing other than act as the central Nagios server. server01 does some other things as well, such as syslog server and collectd server... maybe there's a bottleneck there, but I can't be sure about that.

Does anyone have an idea?

Thank you in advance.

Best regards,

C.



-- 
Cédric Jeanneret                 |  System Administrator
021 619 10 32                    |  Camptocamp SA
cedric.jeanneret at camptocamp.com  |  PSE-A / EPFL