Thursday, February 22, 2007

Why Open Source Software / Free Software (OSS/FS, FLOSS, or FOSS)? Look at the Numbers!

1. Introduction

Open Source Software / Free Software (OSS/FS) (also abbreviated as FLOSS or FOSS) has risen to great prominence. Briefly, OSS/FS programs are programs whose licenses give users the freedom to run the program for any purpose, to study and modify the program, and to redistribute copies of either the original or modified program (without having to pay royalties to previous developers).

The goal of this paper is to convince you to consider using OSS/FS when you’re looking for software, using quantitative measures. Some sites provide a few anecdotes on why you should use OSS/FS, but for many that’s not enough information to justify using OSS/FS. Instead, this paper emphasizes quantitative measures (such as experiments and market studies) to justify why using OSS/FS products is in many circumstances a reasonable or even superior approach. I should note that while I find much to like about OSS/FS, I’m not a rabid advocate; I use both proprietary and OSS/FS products myself. Vendors of proprietary products often work hard to find numbers to support their claims; this page provides a useful antidote of hard figures to aid in comparing proprietary products to OSS/FS.

I believe that this paper has met its goal; others seem to think so too. The 2004 report of the California Performance Review, a report from the state of California, urges that “the state should more extensively consider use of open source software”, and specifically references this paper. A review at the Canadian Open Source Education and Research (CanOpenER) site stated “This is an excellent look at the some of the reasons why any organisation should consider the use of [OSS/FS]... [it] does a wonderful job of bringing the facts and figures of real usage comparisons and how the figures are arrived at. No FUD or paid for industry reports here, just the facts”. This paper has been referenced by many other works, too. It’s my hope that you’ll find it useful as well.

The following subsections describe the paper’s scope, challenges in creating it, the paper’s terminology, and the bigger picture. This is followed by a description of the rest of the paper’s organization (listing the sections such as market share, reliability, performance, scalability, security, and total cost of ownership). Those who find this paper interesting may also be interested in the other documents available on David A. Wheeler’s personal home page.

1.1 Scope

As noted above, the goal of this paper is to convince you to consider using OSS/FS when you’re looking for software, using quantitative measures. Note that this paper’s goal is not to show that all OSS/FS is better than all proprietary software. Certainly, there are many who believe this is true on ethical, moral, or social grounds. It’s true that OSS/FS users have fundamental control and flexibility advantages, since they can modify and maintain their own software to their liking. And some countries perceive advantages to not being dependent on a sole-source company based in another country. However, no numbers could prove the broad claim that OSS/FS is always “better” (indeed you cannot reasonably use the term “better” until you determine what you mean by it). Instead, I’ll simply compare commonly-used OSS/FS software with commonly-used proprietary software, to show that at least in certain situations and by certain measures, some OSS/FS software is at least as good as or better than its proprietary competition. Of course, some OSS/FS software is technically poor, just as some proprietary software is technically poor. And remember -- even very good software may not fit your specific needs. But although most people understand the need to compare proprietary products before using them, many people fail to even consider OSS/FS products, or they create policies that unnecessarily inhibit their use; those are errors this paper tries to correct.

This paper doesn’t describe how to evaluate particular OSS/FS programs; a companion paper describes how to evaluate OSS/FS programs. This paper also doesn’t explain how an organization would transition to an OSS/FS approach if one is selected. Other documents cover transition issues, such as The Interchange of Data between Administrations (IDA) Open Source Migration Guidelines (November 2003) and the German KBSt’s Open Source Migration Guide (July 2003) (though both are somewhat dated). Organizations can transition to OSS/FS in part or in stages, which for many is a more practical transition approach.

I’ll emphasize the operating system (OS) known as GNU/Linux (which many abbreviate as “Linux”), the Apache web server, the Mozilla Firefox web browser, and the OpenOffice.org office suite, since these are some of the most visible OSS/FS projects. I’ll also primarily compare OSS/FS software to Microsoft’s products (such as Windows and IIS), since Microsoft Windows has a significant market share and Microsoft is one of proprietary software’s strongest proponents. Note, however, that even Microsoft makes and uses OSS/FS themselves (they have even sold software using the GNU GPL license, as discussed below).

I’ll mention Unix systems as well, though the situation with Unix is more complex; today’s Unix systems include many OSS/FS components or software primarily derived from OSS/FS components. Thus, comparing proprietary Unix systems to OSS/FS systems (when examined as whole systems) is often not as clear-cut. This paper uses the term “Unix-like” to mean systems intentionally similar to Unix; both Unix and GNU/Linux are “Unix-like” systems. The most recent Apple Macintosh OS (Mac OS X) presents the same kind of complications; older versions of MacOS were wholly proprietary, but Apple’s OS has been redesigned so that it’s now based on a Unix system with substantial contributions from OSS/FS programs. Indeed, Apple is now openly encouraging collaboration with OSS/FS developers.

1.2 Challenges

It’s a challenge to write any paper like this; measuring anything is always difficult, for example. Most of these figures are from other works, and it was difficult to find many of them. But there are two special challenges that you should be aware of: legal problems in publishing data, and dubious studies (typically those funded by a product vendor).

Many proprietary software product licenses include clauses that forbid public criticism of the product without the vendor’s permission. Obviously, there’s no reason that such permission would be granted if a review is negative -- such vendors can ensure that any negative comments are reduced and that harsh critiques, regardless of their truth, are never published. This significantly reduces the amount of information available for unbiased comparisons. Reviewers may choose to change their report so it can be published (omitting important negative information), or not report at all -- in fact, they might not even start the evaluation. Some laws, such as UCITA (a law in Maryland and Virginia), specifically enforce these clauses forbidding free speech, and in many other locations the law is unclear -- making researchers bear substantial legal risk that these clauses might be enforced. These legal risks have a chilling effect on researchers, and thus make it much harder for customers to receive complete unbiased information. This is not merely a theoretical problem; these license clauses have already prevented some public critique, e.g., Cambridge researchers reported that they were forbidden to publish some of their benchmarked results of VMware ESX Server and Connectix/Microsoft Virtual PC. Oracle has had such clauses for years. Hopefully these unwarranted restraints of free speech will be removed in the future. But in spite of these legal tactics to prevent disclosure of unbiased data, there is still some publicly available data, as this paper shows.

This paper omits or at least tries to warn about studies funded by a product’s vendor, which have a fundamentally damaging conflict of interest. Remember that vendor-sponsored studies are often rigged (no matter who the vendor is) to make the vendor look good instead of being fair comparisons. Todd Bishop’s January 27, 2004 article in the Seattle Post-Intelligencer Reporter discusses the serious problems when a vendor funds published research about itself. A study funder could directly pay someone and ask them to directly lie, but it’s not necessary; a smart study funder can produce the results they wish without, strictly speaking, lying. For example, a study funder can make sure that the evaluation carefully defines a specific environment or extremely narrow question that shows a positive trait of their product (ignoring other, probably more important factors), require an odd measurement process that happens to show off their product, seek unqualified or unscrupulous reviewers who will create positive results (without careful controls or even without doing the work!), create an unfairly different environment between the compared products (and not say so or obfuscate the point), require the reporter to omit any especially negative results, or even fund a large number of different studies and only allow the positive reports to appear in public. (A song by Steve Taylor expresses these kinds of approaches eloquently: “They can state the facts while telling a lie”.)
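The last tactic in that list, funding many studies and releasing only the favorable ones, can be illustrated with a small simulation. This is only a sketch under stated assumptions: the two products are taken to be exactly equal (each head-to-head trial is a fair coin flip), and the function names, the number of studies, and the number of trials per study are all hypothetical choices made for illustration, not figures from any study cited in this paper.

```python
import random


def run_studies(n_studies, n_trials, seed=0):
    """Simulate studies comparing two products that are truly equal.

    Each study runs n_trials head-to-head trials, and each trial is a
    fair coin flip (an assumption of this sketch). Returns, for each
    study, the fraction of trials won by the funder's product.
    """
    rng = random.Random(seed)
    return [
        sum(rng.random() < 0.5 for _ in range(n_trials)) / n_trials
        for _ in range(n_studies)
    ]


def publish_only_favorable(results, threshold=0.5):
    """Keep only the studies in which the funder's product won a majority."""
    return [r for r in results if r > threshold]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Hypothetical scenario: 200 funded studies of 20 trials each.
all_results = run_studies(n_studies=200, n_trials=20)
published = publish_only_favorable(all_results)

print(f"all studies:       {len(all_results)}, mean win rate {mean(all_results):.2f}")
print(f"published studies: {len(published)}, mean win rate {mean(published):.2f}")
```

Every published number in this sketch is an honest measurement, yet a reader who sees only the published subset would conclude that the funder’s product wins most of the time, even though by construction neither product is better.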

This doesn’t mean that all vendor-funded studies are misleading, but many are, and there’s no way to be sure which studies (if any) are actually valid. For example, Microsoft’s “get the facts” campaign identifies many studies, but nearly every study is entirely vendor-funded, and I have no way to determine if any of them are valid. After a pair of vendor-funded studies were publicly lambasted, Forrester Research announced that it will no longer accept projects that involve paid-for, publicized product comparisons. One ad, based on a vendor-sponsored study, was found to be misleading by the UK Advertising Standards Authority (an independent, self-regulatory body), which formally adjudicated against the vendor. This example is important because the study was touted as being fair by an “independent” group, yet it was found unfair by an organization that examines advertisements; failing to meet the standard for truth for an advertisement is a very low bar.

Steve Hamm’s BusinessWeek article “The Truth about Linux and Windows” (April 22, 2005) noted that far too many reports are simply funded by one side or another, and even when they say they aren’t, it’s difficult to take some seriously. In particular, he analyzed a report by the Yankee Group’s Laura DiDio, asking deeper questions about the data, and found many serious problems. His article explained why he just doesn’t “trust its conclusions” because “the work seems sloppy [and] not reliable” (a Groklaw article also discussed these problems).

Many companies fund studies that place their products in a good light, not just Microsoft, and the concerns about vendor-funded studies apply equally to vendors of OSS/FS products. I’m independent; I have received no funding of any kind to write this paper, and I have no financial reason to prefer either OSS/FS or proprietary software. I recommend that you

This paper includes data over a series of years, not just the past year; all relevant data should be considered when making a decision, instead of arbitrarily ignoring older data. Note that the older data shows that OSS/FS has a history of many positive traits, as opposed to being a temporary phenomenon.

1.3 Terminology and Conventions

You can see more detailed explanation of the terms “open source software” and “Free Software”, as well as related information, in the appendix and my list of Open Source Software / Free Software (OSS/FS) references at http://www.dwheeler.com/oss_fs_refs.html. Note that those who use the term “open source software” tend to emphasize technical advantages of such software (such as better reliability and security), while those who use the term “Free Software” tend to emphasize freedom from control by another and/or ethical issues. The opposite of OSS/FS is “closed” or “proprietary” software.

Other alternative terms for OSS/FS, besides either of those terms alone, include “libre software” (where libre means free as in freedom), “livre software” (same thing), free-libre / open-source software (FLOS software or FLOSS), open source / Free Software (OS/FS), free / open source software (FOSS or F/OSS), open-source software (indeed, “open-source” is often used as a general adjective), “freed software,” and even “public service software” (since often these software projects are designed to serve the public at large). I recommend the term “FLOSS” because it is easy to say and directly counters the problem that “free” is often misunderstood as “no cost”. However, since I began writing this document before the term “FLOSS” was coined, I have continued to use OSS/FS here.

Software that cannot be modified and redistributed without further limitation, but whose source code is visible (e.g., “source viewable” or “open box” software, including “shared source” and “community” licenses), is not considered here since such software doesn’t meet the definition of OSS/FS. OSS/FS is not “freeware”; freeware is usually defined as proprietary software given away without cost, and does not provide the basic OSS/FS rights to examine, modify, and redistribute the program’s source code.

A few writers still make the mistake of saying that OSS/FS is “non-commercial” or “public domain”, or they mistakenly contrast OSS/FS with “commercial” products. However, today many OSS/FS programs are commercial programs, supported by one or many for-profit companies, so this designation is quite wrong. Don’t make the mistake of thinking OSS/FS is equivalent to “non-commercial” software! Also, nearly all OSS/FS programs are not in the public domain. The term “public domain software” has a specific legal meaning -- software that has no copyright owner -- and that’s not true in most cases. In short, don’t use the terms “public domain” or “non-commercial” as synonyms for OSS/FS.

An OSS/FS program must be released under some license giving its users a certain set of rights; the most popular OSS/FS license is the GNU General Public License (GPL). All software released under the GPL is OSS/FS, but not all OSS/FS software uses the GPL; nevertheless, some people do inaccurately use the term “GPL software” when they mean OSS/FS software. Given the GPL’s dominance, however, it would be fair to say that any policy that discriminates against the GPL discriminates against OSS/FS.

This is a large paper, with many acronyms. A few of the most common acronyms are:
Acronym    Meaning
GNU        GNU’s Not Unix (a project to create an OSS/FS operating system)
GPL        GNU General Public License (the most common OSS/FS license)
OS, OSes   Operating System, Operating Systems
OSS/FS     Open Source Software/Free Software

This paper uses logical style quoting (as defined by Hart’s Rules and the Oxford Dictionary for Writers and Editors); quotations do not include extraneous punctuation.

http://www.dwheeler.com/oss_fs_why.html