I set up my Nagios system to monitor 256-odd nodes, each with about 6 services (direct and NRPE). It is working fine, but my load averages have started edging upwards. It is not critical yet, but I would like some tips to make things more efficient and to find out whether I have done anything inefficiently.

One of the points I identified is this: I am doing a ping check and an ssh check on each server. This seems redundant. Is there a way to set it up so that it does an ssh check first and, if that succeeds, treats ping as obviously OK? If ssh fails, it would then do a ping check and report on that.

How about the other way around too? I have a bunch of NRPE checks: load_average, total-processes, scratch and home dir usage, pbs_mom, ntp_time. If ssh fails then there is obviously no reason to try these other checks, right? But I think the monitoring_host still wastes its cycles trying them (based on the "Last Check" time).

Any tips on how I can achieve these efficiency tweaks? Or is there a problem in my strategy?

Any other performance tweaks so that I can squeeze out every ounce of Nagios performance? I am already using NRPE rather than check_by_ssh, since I was told the latter might be inefficient in terms of load on the monitoring host.

-- Rahul
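
P.S. To make the "skip the NRPE checks when ssh is down" idea concrete, this is roughly the kind of thing I have in mind. It is only an untested sketch using a servicedependency object; the host name node001 and the service descriptions "SSH" and "NRPE load_average" are made-up placeholders, not my real config, and I am not sure this is the right approach.

```
# Sketch only: node001, "SSH" and "NRPE load_average" are hypothetical names.
define servicedependency {
    host_name                       node001             ; host running the master service
    service_description             SSH                 ; master service
    dependent_host_name             node001
    dependent_service_description   NRPE load_average   ; service that depends on SSH
    execution_failure_criteria      u,c                 ; do not actively check the dependent while SSH is UNKNOWN or CRITICAL
    notification_failure_criteria   u,c                 ; and do not notify about it in those states
}
```

I would presumably need one of these (or a hostgroup-based equivalent) for each NRPE service, which is part of why I am asking whether there is a cleaner way.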