Wednesday, January 3, 2007

Open Source Problems and Limitations

Open source is an interesting and important phenomenon, but like science it has its own set of problems that one needs to understand. For each free/open source software package we have a developer or group of developers committed to making that package the best it can be. If the product has a bug, the developer responsible wants it fixed. But the reality is not that simple. Below is my limited and incomplete attempt at classifying open source problems.

Does open source really provide a rapid development environment?

"Open source does work, but it is most definitely not a panacea ... Software is hard. The issues aren't that simple."


Suspicions were voiced in the "Halloween I" memo indicating that the re-creation of existing systems and protocols could be achieved faster by using the Internet. For other kinds of projects it could be slower than traditional development processes. A known target could be used as a testbed for checking progress; it could also provide parallelism that could be beneficial for Internet-connected communities of developers.

When programming work requires everyone to develop on a single path, the pace of open source development can become as slow as, or even slower than, that of traditional projects. In these cases there is no implicit advantage in using the Internet. A key factor is often the personality of the leading developer. For example, TeX was created without the medium of the Internet; I doubt that the Internet or an Internet-based community of developers could have made any difference.

At the same time, a large community inhibits innovation, so a given project may stagnate. As Jamie Zawinski put it:

"There exist counterexamples to this, but in general, great things are accomplished by small groups of people who are driven, who have unity of purpose. The more people involved, the slower and stupider their union is."

But there is one more problem with speed and open source development. I feel that if speed is really important, then authoritarian methods have distinct advantages over democratic methods. "Speed kills" and the first victim is democracy (first noted by Frederick Brooks in his analysis of OS/360 development). A purely authoritarian style will lead to the creation of a project elite and a strict hierarchy that sooner or later will kill any given open source project. Therefore any attempt to speed up an existing OSS project beyond certain limits could lead to unforeseen consequences, including changes in the project's social structure. For this reason, competition with Microsoft (as promoted by some Linux evangelists) is a dangerous threat to the open source movement as a whole: to compete with an authoritarian organization, speed of delivery becomes a matter of survival.

Small projects do not need explicit coordination structures. Coordination is usually performed automatically by the leader of a project or a group of leaders. This approach does not scale well. For large projects an authoritarian model is the only choice if speed of development is a major issue. If speed is perceived as an important project goal, the community of developers will, consciously or unconsciously, reinforce the authoritarian tendencies of their leader. Eventually this can lead to problems.

As Jordan Hubbard put it "Despite what some free-software advocates may erroneously claim from time to time, centralized development models like the FreeBSD Project's are hardly obsolete or ineffective in the world of free software. A careful examination of the success of reputedly anarchistic or "bazaar" development models often reveals some fairly significant degrees of centralization that are still very much a part of their development process."

It also depends on whether a project has a single, clear direction (e.g. recreate UNIX, implement the HTTP protocol) that can be efficiently communicated and acted upon by a group.

Generally, for any rapidly evolving OSS project the level of centralization is high. This need for centralization may be perceived as a "cult of personality", where one developer has a tremendous amount of authority (for example, Linus Torvalds and Linux). This in turn can create a high level of discontent, sometimes splitting a project into two or more competing development tracks. Examples are NetBSD/OpenBSD, Emacs/XEmacs, and gcc/egcs. If Linus Torvalds were to drop too many important patches and be perceived as a bottleneck for all Linux kernel development, we could see Linux kernel development split in the future.

Open source generally emphasizes quality and simplicity, not speed of development. An emphasis on quality as a project goal actually improves the chances of a project surviving in the long run.

The Town council effect or the effect of "The committee for the administration of the structural planning of the Linux kernel"

"Show me the source."

Developers do not create programs in isolation. Users have the final word, and users with e-mail can play an important part in the process. That e-mail often takes the form of flames, or serves to raise the sender's status in the movement by benefiting a particular bureaucratic superstructure: the "town council effect." Alan Cox coined both the term "town council effect" and the phrase "The committee for the administration of the structural planning of the Linux kernel." In his paper "Cathedrals, Bazaars and the Town Council" Alan Cox wrote (italics are mine):

"The problem that started to arise was the arrival of a lot of (mostly well meaning) and dangerously half clued people with opinions -- not code, opinions. They knew enough to know how it should be written but most of them couldn't write "hello world" in C. So they argue for weeks about it and they vote about what compiler to use and whether to write one - a year after the project started using a perfectly adequate compiler. They were busy debating how to generate large model binaries while ignoring the kernel swapper design.

Linux 8086 went on, the real developers have many of the other list members in their kill files so they can communicate via the list and there are simply too many half clued people milling around. It ceased to be a bazaar model and turns into a core team, which to a lot of people is a polite word for a clique. It is an inevitable defensive position in the circumstances.

In the Linux case the user/programmer base grew slowly and it grew from a background group of people who did contribute code and either had a basis in the original Minix hacking community or learned a few things the hard way reboot by reboot. As the project grew people who would have turned into "The committee for the administration of the structural planning of the Linux kernel" instead got dropped in an environment where they were expected to deliver and where failure wasn't seen as a problem. To quote Linus "show me the source"."

Signal to noise ratio of e-mail conferences and supersized ego problems

"To succeed in the world, it is not enough to be stupid, you must also be well-mannered."

The content of messages on discussion groups might give one the wrong impression that the Internet is filled with junk and jerks. It is common for Internet inhabitants to complain bitterly about the lack of cooperation, decorum, and useful information. This is not completely true, but the signal-to-noise ratio is bad and getting worse.

A casual trip through cyberspace will turn up evidence of hostility, selfishness, and simple nonsense (much like a random walk in the real world will yield evidence of hostility, selfishness and nonsense). The recent developments over the open source trademark provide an excellent example. For many, it is much easier to be hostile in an e-mail discussion than face-to-face. That characteristic partially explains many flame wars that consume useful bandwidth of otherwise technical discussions.

Let me provide some examples. Fred Moody wrote:

"When I contacted one resolute Microsoft-hater, who works at what is probably the Internet's busiest Linux-using commercial site, he replied immediately via e-mail that he was willing to detail Linux's numerous shortcomings, but only anonymously. "I work with all these linux zealots who have nearly fired me over my pokings at linux."

Most of Linux's failings, according to my expert witness, are about what you would expect from a free operating system that exists in almost as many versions as there are people who use it. (Like Unix, Linux keeps mutating because the source code is available to anyone who wants to tinker with it.)

"Linux isn't secure and it isn't stable," my informant writes, with his usual bracing disdain for grammar and punctuation. "its a moving target that never really gets out of beta. sure people run production sites on linux. I know a lot of these people. they don't get much sleep and have grown opaque from the lack of sunlight. I have admin'd large Linux shops. they require huge amounts of admin overhead, and if you want shit to really work you are going to spend alot of time manually fixing things. the number of outstanding security holes and lack of stable functionality is monumental."

... Discussions of Linux's weaknesses can be found on several Web sites, along with programs used by hackers to attack Linux sites. To outsiders, many of the exchanges between devotees of BSD, Solaris or other Unix variations sound opaque or shrill: ("It will be a cold day at the equator before L. Torvalds sets aside his ego for the sake of someone else's better ideas.")."

There are diatribes against FreeBSD as well; Larry McVoy assessed the situation this way:

"I'd like to point out that some BSD bigots (like myself) have abandoned BSD for Linux for reasons other than the purely technical ones. In particular, the BSD world is elitist, antagonistic, and uncooperative. You are either part of the in crowd or you are not and it makes a big difference. I'm someone that certainly has the credentials to be part of the "in crowd" but has rejected the invitations because I don't like organizations that play that game. Linux is a much nicer place to be.

Finally, Linux is covered by the GPL - BSD is not. If BSD ever gets popular again, the people running the show can, and will, take the source access away (try and get all the BSDi source, for example)."