Problems with initial install of Nagios

Sean R. Clark sclark at nyroc.rr.com
Mon Aug 16 16:01:35 CEST 2004



I was tasked with converting our Big Brother installation over to Nagios.

I am running Nagios 1.2 on Gentoo, with 670 hosts defined in hosts.cfg.

Currently I am only running check_ping against them to see how well it scales.


Right now I have 57 Down, 0 Unreachable, 55 Up, and 558 Pending.

The "pending" list seems to go down at a rate of about 1 host per hour,
which makes it seem like the testing is very serial in its nature.
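
To put that rate in perspective, here is a rough back-of-the-envelope
estimate. It only uses the two figures quoted above (558 pending hosts,
about 1 host per hour) and assumes the rate stays constant, which may not
hold in practice.

```python
# Rough estimate only; both figures are taken from the post above.
pending_hosts = 558        # hosts still shown as Pending
drain_rate_per_hour = 1    # observed: about one host leaves Pending per hour

# Assumption: the drain rate stays constant over the whole run.
hours_to_clear = pending_hosts / drain_rate_per_hour
days_to_clear = hours_to_clear / 24

print(f"{hours_to_clear:.0f} hours, or roughly {days_to_clear:.0f} days")
```

At that pace the pending list would take a little over three weeks to
clear, against a configured check interval of one minute.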

The hosts themselves say things like "Service check scheduled for Mon Aug
16 09:16:38 2004", but it's 9:54 and there has still been no check.

I tried using parallelization by setting max_concurrent_checks to 0, but
this did not make the list go down at all.

I then set max_concurrent_checks to 700, and this did not help either.

Here are my timeout values: 

service_check_timeout=60
host_check_timeout=30
event_handler_timeout=30
notification_timeout=30
ocsp_timeout=5
perfdata_timeout=5


nagios -s gives me:

        SERVICE SCHEDULING INFORMATION
        -------------------------------
        Total services:             672
        Total hosts:                670

        Command check interval:     -1 sec
        Check reaper interval:      5 sec

        Inter-check delay method:   SMART
        Average check interval:     61.607 sec
        Inter-check delay:          0.092 sec

        Interleave factor method:   SMART
        Average services per host:  1.003
        Service interleave factor:  2

        Initial service check scheduling info:
        --------------------------------------
        First scheduled check:      1092664412 -> Mon Aug 16 09:53:32 2004
        Last scheduled check:       1092664473 -> Mon Aug 16 09:54:33 2004

        Rough guidelines for max_concurrent_checks value:
        -------------------------------------------------
        Absolute minimum value:     55
        Recommend value:            165
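
As a cross-check, the figures in this output appear to be consistent with
one another. The sketch below recomputes them from the printed values. The
formulas for the max_concurrent_checks guideline (reaper interval divided
by inter-check delay, rounded up, then tripled) are an assumption that
happens to reproduce the numbers shown, not something confirmed against
the Nagios source here. The UTC-4 offset is likewise an assumption chosen
because it matches the local times printed above.

```python
import math
from datetime import datetime, timedelta, timezone

# Values copied from the "nagios -s" output above.
total_services = 672
average_check_interval = 61.607   # seconds
reaper_interval = 5               # seconds
first_check = 1092664412          # epoch seconds
last_check = 1092664473           # epoch seconds

# Spreading all services evenly over one average interval.
inter_check_delay = average_check_interval / total_services
window = last_check - first_check

# ASSUMPTION: guideline = ceil(reaper interval / inter-check delay), then x3.
min_concurrent = math.ceil(reaper_interval / inter_check_delay)
recommended = min_concurrent * 3

# ASSUMPTION: the server's local time is UTC-4.
local = timezone(timedelta(hours=-4))
first_local = datetime.fromtimestamp(first_check, tz=local)

print(f"inter-check delay: {inter_check_delay:.3f} sec")
print(f"scheduling window: {window} sec")
print(f"minimum / recommended: {min_concurrent} / {recommended}")
print(f"first check (local): {first_local:%H:%M:%S}")
```

So the scheduler intends to spread all 672 initial checks over roughly one
minute, which is very different from the behaviour actually observed.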



All the hosts fall under this service definition:

define service {
    use    generic-service
    host_name    *
    service_description    PING
    contact_groups    rdc-staff
    check_period    24x7
    notification_interval    480
    notification_options    w,u,c,r
    notification_period    24x7
    check_command    check_ping!100.0,20%!500.0,60%
    max_check_attempts    1
    normal_check_interval    1
    retry_check_interval    1
}
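
For readers unfamiliar with the check_command syntax above: the parts
after each "!" are passed to check_ping as its warning and critical
thresholds, each written as round-trip time in milliseconds followed by a
packet-loss percentage. The sketch below illustrates how such a pair could
be interpreted. The function names are hypothetical, and treating a value
at or above a threshold as a breach is an assumption about the plugin's
comparison.

```python
def parse_threshold(text):
    """Split a threshold such as '100.0,20%' into (rta_ms, loss_pct)."""
    rta, loss = text.split(",")
    return float(rta), float(loss.rstrip("%"))


def classify(rta_ms, loss_pct, warn, crit):
    """Return a state name for one ping result (illustrative helper only)."""
    warn_rta, warn_loss = parse_threshold(warn)
    crit_rta, crit_loss = parse_threshold(crit)
    # ASSUMPTION: reaching a threshold counts as breaching it.
    if rta_ms >= crit_rta or loss_pct >= crit_loss:
        return "CRITICAL"
    if rta_ms >= warn_rta or loss_pct >= warn_loss:
        return "WARNING"
    return "OK"


# The check_command from the service definition, split on "!".
command, warn, crit = "check_ping!100.0,20%!500.0,60%".split("!")
print(command, parse_threshold(warn), parse_threshold(crit))
print(classify(12.5, 0, warn, crit))
```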


I have fping installed also, which is what I was using with Big Brother, and
that took at most 300 seconds to give me the status for all the hosts, on
the same hardware.
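
To illustrate what "all hosts in one pass" means, here is a minimal sketch
of a concurrent sweep. It is not how fping or Nagios works internally: it
simply runs one system ping per host from a thread pool. The names
ping_once and sweep and the worker count of 50 are hypothetical; the
"-n -c 1" flags are the ones visible in the ps output further down.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ping_once(host):
    """Return True if a single echo request to host is answered."""
    result = subprocess.run(
        ["ping", "-n", "-c", "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def sweep(hosts, probe=ping_once, workers=50):
    """Probe every host concurrently and return {host: is_up}."""
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so zip pairs hosts with results.
        return dict(zip(hosts, pool.map(probe, hosts)))
```

A call such as sweep(["172.16.20.139"]) would return a dictionary mapping
each address to True or False. The probe argument exists so the sweep logic
can be exercised without sending any real traffic.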

The box does not seem taxed at all, either

top - 09:56:56 up 18 days, 19:00,  1 user,  load average: 0.00, 0.00, 0.00
Tasks:  47 total,   1 running,  46 sleeping,   0 stopped,   0 zombie
Cpu(s):   1.2% user,   0.4% system,   0.0% nice,  98.4% idle
Mem:   1292264k total,   687864k used,   604400k free,   288856k buffers
Swap:   999576k total,     8016k used,   991560k free,   150448k cached




And it seems like only one ping process is running:

 ps aux  | grep ping
nagios   12453  0.1  0.0  1688  712 ?        S    09:57   0:00 /usr/nagios/libexec/check_ping -H 172.16.20.139 -w 3000.0,80% -c 5000.0,100% -p 1
nagios   12454  0.1  0.0  1840  656 ?        S    09:57   0:00 /bin/ping -n -U -c 1 172.16.20.139



Sorry for the lengthy first post, but I am at my wits' end with this. Also,
even though the plug-in asked me where fping was, it seems to be using just
ping to do the pings. Can anyone point me in the right direction?



-Sean




_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users