About a year ago news stories began to surface that Wikipedia was losing more contributors than it was gaining. These stories were based on the research of Felipe Ortega, who had downloaded and analyzed the data of millions of contributors.
This is a question of importance to all of us. Crowdsourcing has been a powerful and disruptive force, socially and economically, in the short history of the web. Organizations like Wikipedia and Mozilla (at the large end of the scale) and millions of much smaller examples have destroyed old business models, spawned new industries and redefined our ideas about how we can work together. Understanding how these communities grow and evolve is of paramount importance.
In response to Ortega’s research, Wikipedia posted a reply on its blog that challenged the methodology and offered some clarity:
First, it’s important to note that Dr. Ortega’s study of editing patterns defines as an editor anyone who has made a single edit, however experimental. This results in a total count of three million editors across all languages. In our own analytics, we choose to define editors as people who have made at least 5 edits. By our narrower definition, just under a million people can be counted as editors across all languages combined. Both numbers include both active and inactive editors. It’s not yet clear how the patterns observed in Dr. Ortega’s analysis could change if focused only on editors who have moved past initial experimentation.
This is actually quite fair. But the specifics are less interesting than the overall trend described by the Wikimedia Foundation. It’s worth noting that no open source or peer production project can grow infinitely. There is (a) a finite number of people in the world and (b) a finite amount of work that any system can absorb. At some point participation must stabilize. I’ve tried to illustrate this trend in the graphic below.
As luck would have it, my friend Diederik Van Liere was recently hired by the Wikimedia Foundation to help them get a better understanding of editor patterns on Wikipedia – how many editors are joining and leaving the community at any given moment, and over time.
I’ve been thinking about Diederik’s research, and three things have come to mind when I look at the above chart:
1. The question isn’t how do you ensure continued growth, nor is it always how do you stop decline. It’s about ensuring the continuity of the project.
Rapid growth should probably be expected of an open source or peer production project in its early stage, when it has LOTS of buzz around it (as Wikipedia did back in 2005). There’s lots of work to be done (so many articles HAVEN’T been written).
Decline may also be reasonable after the initial burst. I suspect many open source projects lose developers after the product moves out of beta. Indeed, some research Diederik and I have done on the Firefox community suggests this is the case.
Consequently, it might be worth inverting his research question. In addition to figuring out participation rates, figure out the minimum critical mass of contributors needed to sustain the project. For example, how many editors does Wikipedia need to, at a minimum, (a) prevent vandals from destroying the current article inventory and/or, at the maximum, (b) sustain an article update and growth rate that supports the current rate of traffic (which, notably, continues to grow significantly)? The purpose of Wikipedia is not to have many or few editors; it is to maintain the world’s most comprehensive and accurate encyclopedia.
I’ve represented this minimum critical mass in the graphic above with a “Maintenance threshold” line. Figuring out the metric for that feels like it may be more important than participation rates on their own, as such a metric could form the basis for a dashboard that would tell you a lot about the health of the project.
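To make the dashboard idea a little more concrete, here is a minimal Python sketch of the simplest possible check: flag any month in which the number of active editors drops below the maintenance threshold. Everything in it is an assumption for illustration. The `MonthlySnapshot` type, the function name, the sample numbers and the threshold value are all made up, and working out what the real threshold should be is exactly the open question.

```python
from dataclasses import dataclass


@dataclass
class MonthlySnapshot:
    month: str           # e.g. "2010-03"
    active_editors: int  # editors meeting whatever activity definition is chosen


def months_below_threshold(snapshots, maintenance_threshold):
    """Return the months in which active editors fell below the threshold."""
    return [s.month for s in snapshots if s.active_editors < maintenance_threshold]


# Made-up numbers, for illustration only.
snapshots = [
    MonthlySnapshot("2010-01", 1200),
    MonthlySnapshot("2010-02", 950),
    MonthlySnapshot("2010-03", 1010),
]

# Assumed value; finding the real threshold is the open research question.
HYPOTHETICAL_THRESHOLD = 1000

at_risk = months_below_threshold(snapshots, HYPOTHETICAL_THRESHOLD)
```

A real dashboard would presumably track the threshold itself over time as well, since it is unlikely to stay fixed as the project grows.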
2. There might be an interesting equation describing participation rates
Another thing that struck me was that each open source project may have a participation quotient: a number that describes the amount of participation required to sustain a given unit of work in the project. For example, in Wikipedia, it may be that every new page that is added needs 0.000001 new editors in order to be sustained. If page growth outpaces editor growth (or the community shrinks), at a certain point the project size outstrips the capacity of the community to sustain it. I can think of a few variables that might help ascertain this quotient, and I accept it wouldn’t be a fixed number. Change the technologies or rules around participation and you might increase the effectiveness of a given participant (lowering the quotient) or you might make it harder to sustain work (raising the quotient). Indeed, the trend of a participation quotient would itself be interesting to monitor… projects will have to continue to find innovative ways to keep it constant even as the project’s article archive or code base gets more complex.
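As a rough illustration of how such a quotient might be monitored, the Python sketch below compares the observed ratio of editors to pages in each period against an assumed required quotient. Note the distinction: the observed ratio is only a proxy, while the quotient itself is the minimum ratio at which the project is still sustained, which would have to be estimated separately. The function names, the sample history and the required value are all hypothetical.

```python
def participation_ratio(active_editors, total_pages):
    """Observed editors per page; a rough proxy, not the quotient itself."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    return active_editors / total_pages


def understaffed_periods(history, required_quotient):
    """Return labels of periods whose observed ratio is below the required quotient.

    history: iterable of (label, active_editors, total_pages) tuples.
    """
    return [
        label
        for label, editors, pages in history
        if participation_ratio(editors, pages) < required_quotient
    ]


# Made-up numbers, for illustration only.
history = [
    ("year 1", 1000, 500_000),
    ("year 2", 1500, 1_000_000),
    ("year 3", 1600, 4_000_000),
]

# Assumed value; a real one would have to be estimated from project data.
ASSUMED_REQUIRED_QUOTIENT = 0.001

flagged = understaffed_periods(history, ASSUMED_REQUIRED_QUOTIENT)
```

In this toy history the editor count keeps rising, yet the last period is still flagged because pages grew much faster than editors, which is the scenario described above.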
3. Finding a test case – study a wiki or open source project in the decline phase
One thing about open source projects is that they rarely die. Indeed, there are lots of open source projects out there that are walking zombies: a small, dedicated community struggles to keep intact and functioning a code base that is much too large for it to manage. My sense is that peer production/open source projects can collapse (would MySpace count as an example?) but they rarely collapse and die.
Diederik suggested that maybe one should study a wiki or open source project that has died. The fact that they rarely do is actually a good thing from a research perspective, as it means that the infrastructure (and thus the data about the history of participation) is often still intact, ready to be downloaded and analyzed. By finding such a community we might be able to (a) ascertain what the “maintenance threshold” of the project was at its peak, (b) see how its “participation quotient” evolved (or didn’t evolve) over time and, most importantly, (c) see if there are subtle clues or actions that could serve as predictors of decline or collapse. Obviously, in some cases these might be exogenous forces (e.g. new technologies or processes made the project obsolete) but these could probably be controlled for.
Anyways, hopefully there is lots here for metrics geeks and community managers to chew on. These are only some preliminary thoughts, so I hope to flesh them out some more with friends.