Nagios 2.0!

Andreas Ericsson ae at op5.se
Thu Oct 21 12:52:24 CEST 2004


Finally, we get to see the elusive Ethan on the mailing list.

Ethan Galstad wrote:
> Okay, there have been a number of messages on the list over the past 
> few days, relating to Nagios 2.0 development (or lack thereof), that 
> need to be addressed.
> 
> First, this project does not rule my life.  I imagine the plugin 
> developers feel the same about their involvement, though I can't 
> speak for them.  This project is something we work on in our spare 
> time.  We don't work at this full time and we don't get paychecks 
> from Nagios, Inc.  We all have day jobs and, believe me, we don't 
> rush home after a full day of work and plop ourselves back down at a 
> computer to eagerly apply all the latest patches so we can get a warm 
> fuzzy feeling inside.
> 

We all understand that, and I'm sure many of us have similar problems. 
Those of us who have discussed this matter feel there should be a more 
effective way for us to help you speed up development than posting 
patches to the mailing list. More on that below.

> Development on this project has its ups and downs, its slow periods 
> and its frenetic periods.  This is a slower time as far as 
> development is concerned.  Please realize that without slowing down 
> occasionally, we'd all go crazy, end up hating this project, and 
> eventually abandon it altogether.  Amazingly, this project has 
> managed to survive and thrive over the past 5+ years.
> 

It is an excellent program. Excellence has a way of surviving.

> As far as patches are concerned, yes there is a bit of a backlog.  
> That's just the way I've had to juggle things lately.  Every so often 
> I'll go through and apply some of the backlogged patches.  Some, not 
> all.  I don't always think all the patches have merit.  Some patches 
> I sit on and think about for months before I decide whether or not 
> they should be incorporated.

Which is your right as the founding father.

>  Those that I do commit are often 
> rewritten or mangled before doing so.  I rarely, *rarely*, ever apply 
> patches to CVS verbatim.  Sometimes I edit for coding style, 
> other times it's to change the patch so it doesn't break things 
> elsewhere.  I always manually review the patches that come in, so I 
> can completely understand what they're doing and what they'll affect. 
>  As such, it doesn't matter to me if different developers submit 
> conflicting patches or patches against a slightly older version of 
> the code.  I can manage that just fine.
> 

Someone (Wayne, I think) suggested you give a couple of devoted 
patch-submitters 'official tester and reviewer' status. That should let 
you do less work per patch, since testing, coding style and such could 
be managed by your reviewers. It would also increase the understanding 
of the inner logic for the reviewers, which is never a bad thing.

> As far as giving additional developers CVS write access, I'm not at 
> that point yet.  After 2.0 or 3.0 I may very well decide to leave 
> this project for good and hand over the reins to others.  At that 
> point, you can all go nuts and do whatever the new maintainers allow. 
>  For the time being, however, patches for the core program still need 
> to go through me.  If you're not happy with that, you can always:
> 
> 1.  Run 1.x and not 2.0 alpha code in your production environment
> 2.  Keep bugging me until I commit the patch to CVS
> 3.  Maintain a separate repository with your own patches (a mini-fork)
> 4.  Fully fork the code into another project
> 

#2 is hard to do when nothing on the mailing list indicates that the 
maintainers have received the patch or are even aware of the problem. 
Re-posting the same issue several times is bad form. Pestering is very 
rude, and if we should send patches anywhere other than nagios-devel, 
then the list description on SourceForge is somewhat askew.

I suggested #3 to a couple of fellow maintainers. It was intended to 
hold proven and tested bugfixes only. No extensions or add-ons.

I have done #4 for the plugins. So far I've fixed numerous bugs, made 
several optimizations, moved most of the plugins (with auditing for 
standard options and such) from contrib to standard, added several new 
plugins and removed a plethora of compilation requirements.
The nagiosplug project won't be able to benefit from these fixes because 
it's not being maintained actively enough (developers don't respond to 
patches and contributions, check anything in to CVS, or appear on the 
mailing list), and porting the fixes to the old code would simply require 
too much work. My own development has been moving at a much faster and 
steadier pace since I stopped sending in patches and waiting for 
responses that never came before moving on.

> If you choose option #3, you might very well run into the problem 
> where you have a highly customized version of Nagios which is no 
> longer stock.  As I mentioned previously, I don't accept all patches 
> and I rewrite/mangle many of them before committing them to CVS.  As 
> long as you're able to keep on top of the Nagios CVS commits when 
> they occur, you can manage it, but it'll keep you busy.  Some big 
> organizations do something like this, so they can have a customized 
> version of Nagios in house.  Of course, they have some extra work to 
> do when Nagios CVS commits are made and when new versions are 
> released.
> 
> If you want to fork the project, please feel free to do so.  Many of 
> you are well qualified to do this, and I am certain that your project 
> will succeed, so long as you can dedicate the time and energy to 
> maintain the project over a number of versions and years.  Just don't 
> name the forked project anything similar to "Nagios", as I have a 
> trademark on the name.
> 
> I've heard mention of the fact that "some people" may be abandoning 
> Nagios because the alpha 2.0 code isn't being patched quickly enough 
> or released soon enough.  What is this?  Slashdot??  What FUD!  
> Nagios/NetSaint has been around for over 5 years and it gets better 
> and has more users with every version.  
> 

That was me airing my concern that users may do that because of a 
problem that can easily be fixed by people eager to help. No one said 
they would leave Nagios, and I would be surprised if anyone who actually 
did took the time to write about it.

"Lack of timely updates" has long been the strongest argument of closed 
source advocates (marketing people at Microsoft, mostly). It's a shame 
to give that argument validity if it can be helped at all.

> Am I to believe that people who have used NetSaint and upgraded 
> through Nagios 1.x are going to abandon it for a commercial app 
> because 2.0 isn't coming out soon enough?  If that's true, why in the 
> world were they running NetSaint x.xx or Nagios 1.x in the first 
> place then?  Those versions didn't have the new features that Nagios 
> 2.0 will.  And yet, amazingly, they chose to use it.  Give me a 
> break.  If you desperately need the features that Nagios 2.0 will 
> have (or won't) and a commercial app offers it, lay down the cash and 
> buy it.  Geez!   Don't use Open Source for purely philosophical 
> reasons when your business would be better off with a commercial app.
> 

With NetSaint and early versions of Nagios 1.x most people were more or 
less on their own. Since then a lot of companies have started selling 
support and packaged versions of Nagios. Some companies have deployed 
resources to actively find and fix bugs in Nagios. If bugfixes aren't 
incorporated or at least given a message "will be incorporated" or "will 
not be incorporated", some of them might fork their own versions and 
thus the resources are lost to the Nagios project.

> What about the people running 2.0 alpha code, you ask?  What about them?  

Just for the record: I've been running Nagios 2.0 in a plethora of 
different setups and configurations since late July without crashes or 
misbehaviour. I would consider that mature enough for a beta release, but 
the decision is of course up to you.

> Oh dear!  If you choose to run *alpha* code, you are asking to get 
> put through the wringer on a few things.  Bugs galore, "slow" patches, 
> etc.  If you want stable, run 1.x.  If you want bleeding edge, try 
> 2.0.  But don't complain too loudly if it doesn't work perfectly.  
> Don't complain if all the new patches don't get applied fast enough, 
> or at all.  If you're using alpha code in a production environment 
> and your business depends on it, you should tidy up your resume 
> immediately because nothing is guaranteed when it comes to this 
> stuff.
> 

Hence the resources set aside for code auditing and prompt 
patching/bugfixing.

> Bottom line is: don't run Nagios because of what features it *might* 
> have in the future.  Run it because it works for you *now*.  Ask 
> yourself, "Why am I running Nagios *right now*"?  Present moment.  
> It's not a Zen thing, it's just common sense.  If it doesn't work well 
> enough for you right now, put your energy towards finding something 
> that does.
> 

It does work, but not flawlessly in its present CVS version (not 
counting the last checkin, which I haven't had time to test). Hence the 
submitted patches.

> Andreas, you've stated that you're concerned by the lack of CVS 
> activity with regards to patches.  Okay then.  In May I spun off NRPE 
> as a separate project from the Nagios CVS repository in order to help 
> free up some of my time and let others take over as maintainers.  You 
> and Derrick volunteered to be the primary maintainers, with me as a 
> backup.  At that time you had made some mods to NRPE that were 
> supposedly going to be committed to CVS.  Five months down the road 
> and there's still nothing in CVS.  The project site 
> (http://sourceforge.net/projects/nrpe) is deserted, other than for 
> the  barebones home page I put up.  What's the status with this?  
> This is as much of a concern to me as backlogged patches for Nagios.  
> Should I import the old NRPE CVS repository into the new and/or 
> recruit other maintainers?  Please let me know.
> 

NRPE is undergoing active development, but I get sluggish connections to 
the SourceForge repository, so I've got a separate CVS for maintenance. The 
next major version will make use of SSL certificates for authentication 
so that command arguments can be enabled without risk, but it's a lot of 
work to test it on various different distributions. Seeing as nobody has 
really discovered the separate NRPE project yet, I've taken this time to 
re-design the codebase and give myself the luxury of thorough testing 
before releasing publicly.

> Alright, enough for now.  I'm tired, irritated, most likely 
> irrational, and have probably managed to tick off more than a few 
> people.  I'll post a followup in the next few days with a list of 
> outstanding 2.0 patches that I'm aware of and list off which ones 
> aren't going to make it, which ones are, etc.  Ciao.  
> 

Tired and irritated I can understand, but I have to disagree about 
irrational.

To sum it up (for further discussions if nothing else):
CVS write access is kept strictly to Ethan (and Karl, although it's unclear why).
Some people have found bugs and created patches, or thought up new 
extensions they would like to see and implemented them but are a bit 
vexed about the lack of responses from Ethan to submissions to the list. 
Contributors are obviously eager to help, but feel it's a bit hard to do so.

Someone suggested introducing code reviewers who could respond to patch 
submitters and review patches for coding style and such.
I proposed to a select group of active developers that we set up and 
maintain a separate Nagios repository holding up-to-date bugfixes only.


In conclusion to the conclusions, I think contributors would be much 
happier if they could get a response from someone 'in charge', saying 
that they've received the patch, what they think about it (commit, no 
commit) and that they'll review it before sending it to Ethan. 
Reviewers/testers should be able to do this just fine and it doesn't 
require CVS write access.

As it stands now, several people find and fix the same problem and then 
sit back and wait anxiously for the next checkin to see if their patch 
made it before being able to move on. It's very cumbersome to have 
several patches against the same CVS tree unless you know some of them 
will be submitted, even if in an altered form. Reviewers could relieve 
contributors of some of their worries and help speed up their 
contribution pace while keeping some of the workload of testing and 
basic editing off Ethan's shoulders.

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Lead Developer

