Optimal Config for hundreds of passive checks && Hundreds of Nagios procs

Mooney, Ryan ryan.mooney at pnl.gov
Wed Jun 25 02:17:31 CEST 2003


On Linux you can use tmpfs, which is mounted from /etc/fstab at boot and needs no formatting:

<snip /etc/fstab>
none                    /tmpfs                  tmpfs   size=50M,mode=1777 0 0
</snip>
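
If you want to confirm that a given Nagios file really ended up on the tmpfs mount, something along these lines might help. It is only a sketch: it assumes the usual Linux /proc/mounts layout (device, mount point, filesystem type, ...), and the function names and example paths are made up for illustration.

```python
import os.path


def find_mount(path, mounts_text):
    """Return (mount_point, fs_type) for the longest mount point containing path.

    mounts_text is expected to look like the contents of /proc/mounts.
    Returns None if no mount point matches.
    """
    path = os.path.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        # "/tmpfs" must match "/tmpfs/x" but not "/tmpfsx/x"
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) > len(best[0]):
                best = (mount_point, fs_type)
    return best


def is_on_tmpfs(path, mounts_text):
    """True if the filesystem holding path is of type tmpfs."""
    mount = find_mount(path, mounts_text)
    return mount is not None and mount[1] == "tmpfs"
```

On a live box you would read /proc/mounts into a string and call, say, is_on_tmpfs("/tmpfs/status.log", text) for each file you pointed at the ramdisk in nagios.cfg (status_file, command_file, temp_file).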

Also, using passive checks that bundle data can help a LOT.  I'm running over 6K
checks per 5-minute interval reasonably happily this way.  Otherwise the number
of processes gets a bit large.  Not sure what your checks are, and that part of
the setup usually ends up being annoyingly site-dependent (I know basically none
of the checks I do would be meaningful scripts anywhere else :< ).  This requires
some programming, though, so....
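
To give a rough idea of what "bundling" can look like (this is not my actual code, just a sketch): one collector gathers many results and hands them to Nagios in a single write, instead of one process per check. The PROCESS_SERVICE_CHECK_RESULT external command and the tab-separated send_nsca input format are standard Nagios/NSCA; the function names, host names and service names below are invented for the example.

```python
import time


def _clean(text):
    """Keep each result on one line; embedded newlines would split the record."""
    return str(text).replace("\r", " ").replace("\n", " ")


def external_command_line(host, service, code, output, now=None):
    """Format one passive result for the local Nagios command file."""
    ts = int(time.time() if now is None else now)
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        ts, host, service, code, _clean(output))


def nsca_line(host, service, code, output):
    """Format one passive result as a line of send_nsca standard input."""
    return "%s\t%s\t%d\t%s\n" % (host, service, code, _clean(output))


def bundle(results, now=None):
    """Join many (host, service, code, output) tuples into one payload."""
    return "".join(external_command_line(h, s, c, o, now)
                   for h, s, c, o in results)


def submit(results, command_file):
    """Write the whole bundle to the command file in a single open/write."""
    with open(command_file, "w") as pipe:
        pipe.write(bundle(results))
```

For example, submit([("web01", "disk_root", 0, "OK - 40% used"), ("web01", "load", 1, "WARNING - load 5.2")], command_file) pushes both results at once, where command_file is whatever your nagios.cfg names. The same tuples run through nsca_line() can be piped to a single send_nsca invocation when the results need to go to a central server.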

> -----Original Message-----
> From: solo molo [mailto:solomolo90 at hotmail.com]
> Sent: Tuesday, June 24, 2003 4:59 PM
> To: jlancaster at affinity.com; nagios-users at lists.sourceforge.net
> Subject: Re: [Nagios-users] Optimal Config for hundreds of passive
> checks && Hundreds of Nagios procs
> 
> 
> Well I tried your config (except for the ramdisk) and Nagios is still
> getting behind on processing the results.  I can't understand why Nagios is
> getting behind since my load averages are so low (less than 2 most of the
> time).
> 
> Anyway I'm going to try the ramdisk, I'm curious as to how you set yours up.
> Do you have a script that recreates and formats the disk at startup?
> 
> 
> >From: "Jason Lancaster" <jlancaster at affinity.com>
> >To: "solo molo" <solomolo90 at hotmail.com>, 
> ><nagios-users at lists.sourceforge.net>
> >Subject: Re: [Nagios-users] Optimal Config for hundreds of passive checks
> >&& Hundreds of Nagios procs
> >Date: Thu, 19 Jun 2003 15:16:39 -0400
> >
> >With regards to this thread and the thread titled "Hundreds of Nagios
> >procs," I thought I'd share the configuration file I use in my
> >implementation. This (complete) config is similar on all systems, with
> >various tweaks for each one. The monitoring servers are all at least 1 GHz
> >machines with around 3000 services. Every server has a ramdisk and all
> >monitoring servers run a custom "ocsp sweeper" application to send nsca
> >stats in bulk to the central server. This lightened the load on monitoring
> >servers quite a bit as each ocsp command takes execution time.
> >
> >It seems like there are a lot of threads on this mailing list right now
> >asking about why implementations of Nagios have a huge queue of results to
> >process. You can fix it... it just needs to be tweaked.
> >
> >Let me know if you have any questions.
> >
> >-Jason
> >
> >log_file=/usr/local/nagios/var/nagios.log
> >cfg_file=/usr/local/nagios/etc/checkcommands.cfg
> >cfg_file=/usr/local/nagios/etc/misccommands.cfg
> >cfg_file=/usr/local/nagios/etc/contactgroups.cfg
> >cfg_file=/usr/local/nagios/etc/contacts.cfg
> >cfg_file=/usr/local/nagios/etc/dependencies.cfg
> >cfg_file=/usr/local/nagios/etc/escalations.cfg
> >cfg_file=/usr/local/nagios/etc/hostgroups.cfg
> >cfg_file=/usr/local/nagios/etc/hosts.cfg
> >cfg_file=/usr/local/nagios/etc/services.cfg
> >cfg_file=/usr/local/nagios/etc/timeperiods.cfg
> >resource_file=/usr/local/nagios/etc/resource.cfg
> >status_file=/usr/local/nagios/ramdisk/status.log
> >nagios_user=nagios
> >nagios_group=nagios
> >check_external_commands=1
> >command_check_interval=-1
> >command_file=/usr/local/nagios/ramdisk/nagios.cmd
> >comment_file=/usr/local/nagios/var/comment.log
> >downtime_file=/usr/local/nagios/var/downtime.log
> >lock_file=/usr/local/nagios/var/nagios.lock
> >temp_file=/usr/local/nagios/ramdisk/nagios.tmp
> >log_rotation_method=d
> >log_archive_path=/usr/local/nagios/var/archives
> >use_syslog=0
> >log_notifications=1
> >log_service_retries=1
> >log_host_retries=1
> >log_event_handlers=1
> >log_initial_states=1
> >log_external_commands=1
> >log_passive_service_checks=1
> >inter_check_delay_method=s
> >service_interleave_factor=s
> >max_concurrent_checks=600
> >service_reaper_frequency=1
> >sleep_time=1
> >service_check_timeout=60
> >host_check_timeout=30
> >event_handler_timeout=30
> >notification_timeout=30
> >ocsp_timeout=30
> >perfdata_timeout=5
> >retain_state_information=1
> >state_retention_file=/usr/local/nagios/var/status.sav
> >retention_update_interval=0
> >use_retained_program_state=0
> >interval_length=60
> >use_agressive_host_checking=0
> >execute_service_checks=1
> >accept_passive_service_checks=1
> >enable_notifications=1
> >enable_event_handlers=1
> >process_performance_data=0
> >obsess_over_services=0
> >check_for_orphaned_services=0
> >check_service_freshness=1
> >freshness_check_interval=1200
> >aggregate_status_updates=1
> >status_update_interval=5
> >enable_flap_detection=0
> >low_service_flap_threshold=5.0
> >high_service_flap_threshold=20.0
> >low_host_flap_threshold=5.0
> >high_host_flap_threshold=20.0
> >date_format=us
> >illegal_object_name_chars=`~!$%^&*|'"<>?,()=
> >illegal_macro_output_chars=`~$&|'"<>
> >admin_email=nagios
> >admin_pager=pagenagios
> >
> >
> >----- Original Message -----
> >From: "solo molo" <solomolo90 at hotmail.com>
> >To: <nagios-users at lists.sourceforge.net>
> >Sent: Wednesday, June 18, 2003 18:41
> >Subject: [Nagios-users] Optimal Config for hundreds of passive checks
> >
> >
> > > I have nagios running on redhat 8.0 on a compaq DL360 with dual 800 MHz
> > > procs and 1GB ram.  Nagios receives 400 passive check results every 10
> > > minutes and another 100+ active checks are performed every 5 minutes.  My
> > > loads are never very high, but nagios gets way behind on processing the
> > > passive checks.  The problem is especially bad when some of the passive
> > > checks return critical results.  I've seen the delay as bad as 20 hours.
> > > That is, when I check the log, nagios is receiving current passive
> > > results, but displaying results from 20 hours ago in the UI.  I'd
> > > appreciate any suggestion as to how I can configure nagios to process the
> > > passive results more quickly.  I'm using the following config:
> > >
> > > # I can't use smart because I have a few checks that only run once
> > > # every 24 hours and throw off the average.
> > > inter_check_delay_method=d
> > >
> > > service_interleave_factor=s
> > > max_concurrent_checks=0
> > > service_reaper_frequency=5
> > > sleep_time=1
> > >
> > >
> > >
> > >
> > > -------------------------------------------------------
> > > This SF.Net email is sponsored by: INetU
> > > Attention Web Developers & Consultants: Become An INetU 
> Hosting Partner.
> > > Refer Dedicated Servers. We Manage Them. You Get 10% 
> Monthly Commission!
> > > INetU Dedicated Managed Hosting 
http://www.inetu.net/partner/index.php
> > _______________________________________________
> > Nagios-users mailing list
> > Nagios-users at lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/nagios-users
> > ::: Please include Nagios version, plugin version (-v) and OS when
>reporting any issue.
> > ::: Messages without supporting info will risk being sent to /dev/null
> >





