- Sep 4, 2023
Why and how companies do open source
This is a memo I initially wrote for internal consumption at Edgeless Systems to decide which open-source license to use for Constellation, our "always-encrypted" Kubernetes. We ended up going with AGPL. Here are the considerations that led to this decision...
Traditionally, companies have formed around existing, successful open-source projects. In these cases, open source wasn't a deliberate choice and the companies had to work with what was already there. Examples of this setting include well-known "older" companies like Elastic, GitLab, or Redis, as well as some newcomers like Chainguard.
However, these days, open source is often a deliberate choice. Companies are started with the goal of building an open-source business, or existing companies decide to release products as open source. Examples include well-known companies like CockroachDB, MongoDB, or HashiCorp, as well as an abundance of newcomers like n8n or Kong. For these companies, open source is a means to an end, a strategic choice to maximize value.
With open source, at least some part of the product is free to use in some way. The main monetization options are:
- Selling support
- Selling commercial licenses for extended features ("open core")
- Selling SaaS
Benefits of open source
Roughly, the benefits of open source for companies are:
Low entry barriers: People, especially developers, are drawn to open-source software when they're looking for a solution they can use right away.
Trust: People can read the source code and modify it.
Community: People contribute to the software and help each other with problems.
Image: People perceive the company as friendly and not only focused on shareholder value.
Word of mouth: People are more likely to recommend open-source software.
Some of these are directly connected. For example, community and word of mouth are closely related. Further, low entry barriers and image clearly also impact word of mouth.
Risks of open source
The risks that come with the above benefits are:
Free riding: The open-source version of the software may be so good that no one wants to buy anything.
Competition: Other companies may build commercial products with or around the open-source product.
Loss of IP value: The open sourcing of products may lower the IP value and thus the valuation of the company.
It is common wisdom that competition often doesn't turn out to be a huge problem, because people typically like to buy from the people who originally built the software. For infrastructure SaaS like databases, the story may be different though, because here people typically prefer to buy from the big cloud service providers (CSPs) regardless.
Balancing benefits and risks
All open-source companies need to balance the benefits against the risks. It is virtually impossible to have the former without the latter. Central to this tension is the license under which a product's source is published. A loose license that lets people do anything with the code will push all five forms of benefits, while also pushing all three forms of risks. A tighter license will mitigate the risks but will also dampen the benefits.
Types of open-source licenses
There are roughly three groups of licenses: loose, copyleft, and restricted.
Popular examples of loose (permissive) licenses include MIT, BSD, and Apache. They essentially allow anyone to do anything with the software. Only Apache reserves some trademark rights. Many infrastructure software companies initially use Apache. It is also the license that CNCF projects are generally required to use.
Copyleft licenses also allow anyone to do anything with the software. However, they require that source code is made available to users of derived software under certain circumstances. Popular examples include:
GPL: when the software is distributed, any modifications and linked software need to be made available under the GPL. Therefore, the GPL is considered "viral".
AGPL: the same as the GPL, but it also covers "network users", which means that SaaS deployments are affected too. For example, if Company A offers AGPL-licensed software as SaaS, it needs to make all modifications and linked software available to its customers under the AGPL.
MPL: the loosest of the copyleft licenses; only modifications of existing source files need to be made available. EGo and MarbleRun are licensed under MPL.
It's important to note that AGPL only applies if users directly interact with the software. Thus, it would not apply in cases where someone is running cloud offerings on top of Constellation. It would only apply in cases where Constellation is made available as SaaS itself.
A range of restricted licenses exists that reserve certain rights. These include:
Business Source License (BSL): introduced by MariaDB. By default, the licensed software cannot be used "in production" by anyone. The vendor can define certain exceptions like: "for less than 3 nodes it's fine". After a maximum of 4 years, each version of the software automatically becomes available under a GPL-compatible open-source license chosen by the vendor (for example, the GPL itself). The vendor is free to set the period to anything below 4 years.
Elastic License: anyone is free to do anything with the software, except: (1) offer SaaS and (2) remove license checks.
Server Side Public License (SSPL): introduced by MongoDB. Like AGPL, but ALL code related to a managed/SaaS offering needs to be made public, including all infrastructure code. This is in essence a more complicated way of saying: "SaaS is not allowed".
Restricted licenses are typically not considered "real" open source. Rule of thumb: a license is real open source if it is accepted by the Open Source Initiative (OSI). If one declares a product released under a "restricted" license as "open source", one risks a shitstorm. At least for open-source purists, the correct term here is "source available".
Different licenses for different monetization models
The monetization model plays an important role in the choice of license. On one side of the spectrum, SaaS-focused and user-facing companies like n8n, Gitpod, or Confluent can mostly only gain "trust" and "image" from open source, because their product is best consumed as SaaS and few people wish to manage the software on their own. Correspondingly, "competition" and "loss of IP value" are the primary risks. If the audience of a SaaS company is not tech-savvy, then open source doesn't even make sense in most cases. There is little to gain and much to lose. Think of companies like Personio or HubSpot. These are actually at the extreme end of the spectrum. Consequently, such companies often opt for "restricted" licenses, which prevent "competition" (and by extension "loss of IP value") and only moderately impact "trust" and "image".
Some developer-focused SaaS companies like Gitpod or Crowd.dev also pick a middle ground with the "copyleft" AGPL: while the license doesn't rule out SaaS competition, it at least requires potential competitors to open-source their offerings, putting them at some disadvantage. This still effectively keeps out a lot of potential competitors while preserving the gains in "image".
On the other side of the spectrum is infrastructure software like cloud-native tooling, which is often used in self-managed deployments by experts and is primarily monetized through support and open core. For this type of software, all benefits and risks apply to their fullest. Therefore, the gains but also the losses can be big. The choice of license is critical for the success of the company. Here, many companies go for copyleft or loose licenses. This includes younger companies like HashiCorp, Kong, or Puppet as well as older ones like Red Hat (OpenShift) or VMware (Tanzu). However, the older ones typically have a lot of paid enterprise features and no real community traction.
In between infrastructure software and user-facing SaaS is "somewhat infrastructure & somewhat SaaS" software like databases or AI frameworks. One can observe all types of licenses here, possibly with a bias towards "restricted" licenses.
Main takeaway: In the infrastructure space, most companies go for "loose" or "copyleft" licenses.
Different licenses for different phases
It can be observed that quite a few companies, especially in the database space, have changed their licenses in recent years. Examples include:
- Elastic changed from Apache to the Elastic License
- CockroachDB changed from Apache to BSL
- Redis changed from BSD to a custom tight license
- MongoDB changed from AGPL to SSPL
All these companies cited competition from cloud vendors as their primary motivation for the switch. For example, AWS was offering managed versions of Elasticsearch and MongoDB without paying the original vendors. In the case of Elasticsearch, AWS responded to the license change by forking the last permissively licensed version and maintaining it in the open as OpenSearch.
However, one can also interpret these license switches as follows: all of these companies are mature businesses. Some of them are even publicly traded. Their growth was substantially fueled by the benefits of open source. Now that they have reached a certain point in their development, they have less to gain from the benefits than they can lose to the risks. However, they can't afford to fully alienate their communities and therefore do an awkward dance around licenses that in effect exclude most competition and also strictly limit free riding.
Main takeaway: Evidence seems to suggest that permissive licenses mostly benefit early-stage companies. (However, one could also interpret the evidence in such a way that permissive licenses are a relic of the past and companies are now switching to more modern, restrictive licensing schemes.)
Is open source good for everyone?
It is common wisdom that open source works best for markets with many potential customers. There needs to be enough room for an attractive number of paying customers even if a large fraction of customers is "free riding". If there are only five potential (huge) customers in the world, one probably shouldn't go for a "loose" or "copyleft" license. However, it could still work out if some converted to paying customers over time.