Open Source Projects Manage Themselves?

Dream On.

By Chuck Connell

Much has been written about the open source method of software development. One of the most tantalizing claims about open source development is that these projects manage themselves. Gone are layers of do-nothing managers with bloated bureaucracies and interminable development schedules. In their place is a new paradigm of self-organizing software developers with no overhead and high efficiency.

The dust jacket for Eric Raymond's open source manifesto The Cathedral and the Bazaar makes this statement clearly. It says: "…the development of the Linux operating system by a loose confederation of thousands of programmers -- without central project management or control -- turns on its head everything we thought we knew about software project management. … It [open source] suggested a whole new way of doing business, and the possibility of unprecedented shifts in the power structures of the computer industry." This is not just marketing hype on a book cover, as Raymond expands the point inside: "… the Linux community seemed to resemble a great babbling bazaar of differing agendas and approaches … out of which a coherent and stable system could seemingly emerge only by a succession of miracles." Other open source adherents make similar statements in trumpeting the virtues of open source programming.

There is just one problem with the statement that open source projects manage themselves. It is not true. This article shows that open source projects are about as far as you can get from self-organizing. In fact, these projects use strong central control, which is crucial to their success. As evidence, I examine Raymond's fetchmail project (which is the basis of The Cathedral and the Bazaar) and Linus Torvalds's work with Linux. This article describes a clearer way to understand what happens on successful open source projects and suggests limits on the growth of the open source method.

(Note: This article addresses issues raised in the essay titled The Cathedral and the Bazaar. The essay also is included in a book with the same title, which contains other essays as well.)

What Really Happened with fetchmail

The Cathedral and the Bazaar revolves around Raymond's experience in creating a program called fetchmail by the open source method. As he describes the software development process, he annotates the story with lessons about open source programming and how well it worked for him. One of Raymond's key points is that the normal functions of management are not needed with open source development.

Raymond lists the responsibilities of traditional software managers as: define goals and keep everybody pointed in the same direction, monitor the project and make sure details don't get skipped, motivate people to do boring but necessary work, organize the deployment of people for best productivity, and marshal resources needed to sustain the project over a long period of time. Raymond then states that none of these tasks are needed for open source projects. Unfortunately, the majority of The Cathedral and the Bazaar describes, in detail, how important these management functions are and how Raymond performed them.

Eric Raymond decided what piece of software he would use as a test for open source programming. He decided what features fetchmail would have, and what features it would not. He generalized and simplified its design. In other words, he defined the software project. Raymond guided the project over a considerable period of time, remaining a constant as volunteers came and went. In other words, he marshaled resources. He surely was careful about source code control and build procedures (or his releases would have been of poor quality), so he monitored the project. And, most significantly, Raymond heaped praise on volunteers who helped him, which motivated those people to help some more. (In his essay, Raymond devotes considerable space to describing how he and Torvalds motivate their helpers.) In short, fetchmail made full use of traditional and effective management operations, except that Eric Raymond did all of them.

Another compelling (and often-quoted) section of The Cathedral and the Bazaar is the discussion about debugging. Raymond says: "Given enough eyeballs, all bugs are shallow" and "Debugging is parallelizable." These assertions simply are not true and are distortions of how the development of fetchmail proceeded. It is true that many people, in parallel, looked for bugs and proposed fixes. But only one person (Raymond) actually made fixes, by incorporating the proposed changes into the official code base. Debugging (the process of fixing the program) was performed by one person, from suggestions made by many people. If Raymond had blindly applied all proposed code changes, without reading them and thinking about them, the result would have been chaos. It is a rare bug that can be fixed completely in isolation, with no effect on the rest of the program.

Lessons from Linux

In a similar way, on an even larger scale, Linus Torvalds pulled off a great feat of software engineering: he coordinated the work of thousands of people to create a high-quality operating system. But the basic method was the same one Raymond used for fetchmail. Torvalds was in charge of Linux. He made all major decisions, assigned subsystems to a few trusted people (to organize the work), resolved conflicts between competing ideas, and inspired his followers.

Raymond provides evidence of Torvalds' control over Linux when he describes the numbering system that Torvalds used for kernel releases. When a significant set of new features was added to the code, the release would be considered "major" and given a new even number. (For example, release 1.4 would lead to release 1.6.) When a smaller set of bug fixes was added, the release would get just a new minor number. (For example, release 1.4.8 would become 1.4.9.) But who made the decisions about when to declare a major release or what fixes were minor? Torvalds. The Linux project was (and still is) his show.
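The numbering rule just described can be sketched in a few lines of Python. This is only an illustration of the scheme as summarized in this article, not the actual Linux release tooling; the function name next_release and the significant flag are hypothetical. The flag is the interesting part: it stands in for the human judgment about whether a set of changes is significant, which the code cannot make for itself.

```python
def next_release(version: str, significant: bool) -> str:
    """Return the next release number under the scheme described above.

    `significant` is a hypothetical flag representing the project
    leader's decision; nothing in the code can compute it.
    """
    parts = [int(p) for p in version.split(".")]
    major, series = parts[0], parts[1]
    patch = parts[2] if len(parts) > 2 else 0

    if significant:
        # Significant new features: move to the next even series number.
        series += 2 if series % 2 == 0 else 1
        return f"{major}.{series}"

    # Smaller set of bug fixes: bump only the last number.
    return f"{major}.{series}.{patch + 1}"


if __name__ == "__main__":
    print(next_release("1.4", significant=True))
    print(next_release("1.4.8", significant=False))
```

The arithmetic is trivial. The decision passed in as the flag is where the control lies.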

Further proof of Torvalds' key role is the fact that the development of Linux slowed to a crawl when Torvalds was distracted. The birth of his daughter and his work at Transmeta corresponded precisely with a period of slow progress for Linux. Why? The manager of Linux was busy with other things. The project could not proceed efficiently without him.

Finally, there is a quote from Torvalds himself during an interview with Bootnet.com. "Boot: You've got a full slate of global developers who are working on Linux. Why hasn't it developed into a state of chaos? Torvalds: It's a chaos that has some external constraints put on it. … the only entity that can really succeed in developing Linux is the entity that is trusted to do the right thing. And as it stands right now, I'm the only person/entity that has that degree of trust."

Open Source Revisited

So, if the open source model is not a bazaar, what is it? To the certain consternation of Raymond and other open source advocates, their bazaar is really a cathedral. The fetchmail and Linux projects were built by single, strong architects with lots of help -- just like the great cathedrals of Europe. Beautiful cathedrals were guided by one person, over many years, with inexpensive help from legions of workers, just as open source software is. And, just as with open source software, the builders of the cathedrals were motivated by religious fervor and a divine goal. Back then, it was celebrating the glory of God; now it is toppling Bill Gates. (Some people think these goals are not so different.)

Consider three diagrams showing different ways of organizing a software development project.


Figure #1 -- Traditional


Figure #2 -- Cathedral / Open Source


Figure #3 -- Bazaar

The first method (traditional) shows a Vice President of Development at the top, with several Directors of Engineering reporting to the VP. Below the Directors are Engineering Managers, and finally the engineers who write the code. Many organizations use this model, and everyone agrees it is sometimes grossly inefficient. The second method (cathedral or open source) uses a single designer/architect at the top, with many engineers reporting directly to the architect. The third method (bazaar) is a peer-to-peer network of many engineers, all reporting to and coordinating with each other, without central control. In The Cathedral and the Bazaar, Raymond claims open source projects are run in the third style. In fact, they are run in the style shown by the second diagram.
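One way to make the difference between the three structures concrete is to count the reporting or coordination links each one needs for a given number of engineers. The sketch below is a rough illustration, not a measurement of any real project. The function names, the span parameter (how many people report to one manager), and the figure of 100 engineers are all assumptions chosen for the example.

```python
import math


def traditional_links(engineers: int, span: int = 10) -> int:
    """Links in a management tree, assuming at most `span` reports per manager."""
    if span < 2:
        raise ValueError("span must be at least 2")
    links = 0
    level = engineers
    # Add one layer of managers at a time until a single person is at the top.
    while level > 1:
        links += level                   # each person reports to one manager
        level = math.ceil(level / span)  # managers needed for this layer
    return links


def cathedral_links(engineers: int) -> int:
    """Links in a star: every engineer reports directly to one architect."""
    return engineers


def bazaar_links(engineers: int) -> int:
    """Links in a full peer-to-peer network: every pair coordinates."""
    return engineers * (engineers - 1) // 2


if __name__ == "__main__":
    n = 100  # hypothetical team size
    print("traditional:", traditional_links(n))
    print("cathedral:  ", cathedral_links(n))
    print("bazaar:     ", bazaar_links(n))
```

Under these assumptions the tree and the star both grow roughly in proportion to the number of engineers, while the peer-to-peer count grows with its square. In the star, every one of those links ends at the same person.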

For more information about the parallel between software construction and cathedral building, see the classic work of software engineering The Mythical Man-Month by Fred Brooks. In this book, now 25 years old, Brooks describes the chief-programmer (or surgeon) method of software development. It is remarkably similar, in its basic philosophy, to the open source method that Raymond and Torvalds used. Interestingly, the book even contains a picture of a cathedral and relates it to this style of development.

A Real Bazaar

If fetchmail and Linux were not run as bazaar projects, what would a true bazaar project look like? A real bazaar software development method would proceed as follows.

Would this development method work well? While I don't know this for a fact, I suspect it would not. I believe significant human endeavors (such as software projects) need some type of unified control in order to create high-quality results.

It would be an interesting experiment to run an open source project in the method I describe above. (If you know of such a project, or want to start one, please let me know.) And it would be very exciting if it actually worked. This would indeed fulfill the hype on the dust jacket of The Cathedral and the Bazaar of a "whole new way of doing business, and the possibility of unprecedented shifts in the power structures of the computer industry."

Conclusion

I am sure many readers got out their flamethrowers at the beginning of this article, and are now resetting their weapons from stun to vaporize. So let me make myself clear. Open source projects are an important development in the computer world. The open source programming method is an exciting innovation in software engineering. But these projects do not manage themselves. They are not run by groupthink or any self-organizing dynamism. Successful open source projects are run by smart, effective project leaders. In other words, these projects have good managers.

To be fair, Raymond does address the issue of project leader control in his essay. He quickly dismisses the importance of this control, however, by claiming he and Torvalds did not have crucial roles in the design or creation of their software projects. He states that he and Torvalds did not design anything new, but merely recognized good ideas from others. Raymond is not seeing clearly his and Torvalds' contributions. It is very hard to sift through thousands of suggestions from swarms of users, find the good ones, synthesize them, and incorporate them into an existing code base. This constitutes strong management and central control.

The need for good management suggests the scalability of the open source method may be limited. How many people have the technical sophistication to make good software design decisions, the people skills to motivate hundreds of contributors, and the time to dedicate to a complex project? We should be wary about assuming that the open source method can solve the world's software problems. It is possible that only a small number of humans, such as Raymond and Torvalds, have the requisite skill set to run an effective open source project. If this is the case, as I suspect it is, the number of true success stories in open source development will be small. The open source method will run into the same wall as traditional software development. Good technical managers are few and far between.

Biography: Chuck Connell is a Domino/Notes consultant with 15 years of experience. He taught software engineering at Boston University and writes frequently on computer topics. Chuck can be reached at www.chc-3.com.

Revision History

September 2000 -- Published by IBM/Lotus Developers Network.

January 2007 -- Rehosted on my own server, after IBM dropped older articles from their site.