Predicting the rise and fall of an open source project
Having abandoned or unmaintained components in your projects is not ideal.
Why? Because unmaintained open source projects are tech debt. Depending on projects that are no longer maintained is a serious risk and a potential black hole for resources.
Most companies are used to battling vulnerabilities and license issues
Why do projects fail?
Maintenance practices and the lack thereof
To figure out whether we can in fact predict failure, we first need to understand why projects fail.
While developing Open Source Select, the Debricked R&D team read a lot of research and wrangled the data to figure out what correlates with the rise or fall of OSS projects. We asked ourselves, “Are there any distinguishing characteristics of abandoned projects compared to continuously maintained ones?”
A wish for documentation
It turns out that projects that do not document their contribution process, or do not make it easy for new contributors to onboard themselves, fail more often than those that do.
This is typically embodied in the Contributing{.md/.txt/.rst} file, where developers read up on how to contribute or submit code changes to the project. 16% of failed projects, 72% of top-performing projects, and 32% of a random sample had such a file. Furthermore, a CI setup for the continuous test, build, and deployment
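As a rough illustration of the documentation signal described above, here is a minimal sketch that checks whether a local checkout of a repository contains a contribution guide. The function name `find_contributing_file` and the `subdirs` parameter are hypothetical and only used for illustration. Looking in `.github/` and `docs/` as well as the repository root is an assumption based on where GitHub looks for the file, not something the study specifies.

```python
from pathlib import Path

# Extensions named in the article: Contributing{.md/.txt/.rst}
CONTRIBUTING_EXTENSIONS = {".md", ".txt", ".rst"}


def find_contributing_file(repo_root, subdirs=("", ".github", "docs")):
    """Return the path of the first contribution guide found, or None.

    `subdirs` is an assumption: "" means the repository root, and the
    other two folders are places where GitHub also recognises the file.
    """
    root = Path(repo_root)
    for subdir in subdirs:
        folder = root / subdir
        if not folder.is_dir():
            continue
        # Sort the entries so the result is deterministic.
        for entry in sorted(folder.iterdir()):
            if (
                entry.is_file()
                and entry.stem.lower() == "contributing"
                and entry.suffix.lower() in CONTRIBUTING_EXTENSIONS
            ):
                return entry
    return None
```

Calling `find_contributing_file("path/to/checkout")` returns a `Path` when a guide exists and `None` otherwise. On its own this is a weak signal, so it is best read together with the other indicators in this post.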
Look left and right before importing
The truck factor of different projects
When looking at which projects fail, it is important to analyse the Truck Factor of said projects. By Truck Factor or Truck Factor Developers, we refer to the number of developers that must stop contributing to the project for the project to die. Truck Factor developers can be found in multiple ways, for instance, by analysing the commit history
In Open Source Select, we call this “Core Team Commitment”. This feature analyses the activity of all maintainers with merge rights to the repository, an approach chosen for its scalability. In this study, the authors investigated the impact of the Truck Factor on failed projects and found that 66% of all failed projects had a Truck Factor of 1, while 57% of all repositories in their dataset had a Truck Factor of 1.
On top of that, 16% of projects in their dataset faced a “failure event”, where a core team no longer existed in the project and activity was dead for at least a year. Luckily, in 41% of these cases, the failed projects managed to get back on their feet.
So now we know that without Truck Factor developers or proper documentation, projects run a bigger risk of failing. If you cannot easily onboard a new contributor, you will eventually dry up your supply of developers. As they say, “A one-man ship can only stay afloat for so long”.
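To make the Truck Factor idea more concrete, here is a minimal sketch that estimates it from commit history. It is a simplified heuristic and an assumption on our part, not the algorithm used in the cited studies and not how Core Team Commitment is computed. Each file is treated as "owned" by whoever committed to it most often, and owners are removed one by one, largest first, until more than half of the files have lost their owner. The function name `estimate_truck_factor` and the `(author, file_path)` input format are hypothetical; such pairs could, for example, be extracted from `git log --name-only`.

```python
from collections import Counter, defaultdict


def estimate_truck_factor(commits, threshold=0.5):
    """Estimate the truck factor from (author, file_path) pairs.

    Simplified heuristic (an assumption, not the cited studies' method):
    a file is owned by its most frequent committer, and owners are removed
    greedily until more than `threshold` of all files are orphaned.
    """
    touches = defaultdict(Counter)
    for author, path in commits:
        touches[path][author] += 1
    if not touches:
        return 0

    # Count how many files each author owns.
    owned_files = Counter(
        counter.most_common(1)[0][0] for counter in touches.values()
    )

    total_files = len(touches)
    orphaned = 0
    truck_factor = 0
    # Remove the author who owns the most files first.
    for _author, file_count in owned_files.most_common():
        orphaned += file_count
        truck_factor += 1
        if orphaned > threshold * total_files:
            break
    return truck_factor


history = [
    ("alice", "core.py"),
    ("alice", "api.py"),
    ("alice", "cli.py"),
    ("bob", "docs.py"),
]
print(estimate_truck_factor(history))  # prints 1
```

In the example, alice owns three of the four files, so losing her alone orphans more than half of the code base and the estimate is 1. A project where ownership is spread evenly across many people needs more departures before it crosses the threshold.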
Maintainers leaving; what’s up with that?
Life as an Open Source Maintainer is quite the balancing act
Why maintainers leave
Taking a closer look, Project consists of Obsolete (your project isn’t relevant anymore), Outdated Technologies, and Low Maintainability, where Obsolete is the most common. The Team (35%) factor consists of Lack of time, Lack of interest, and Conflict among developers
So, how do I find well-maintained projects?
Our tool Open Source Select digs rather deep into some of this data to help developers choose and compare open source projects. Choosing carefully can potentially save you lots of time, sweat and money in the long run. Select gives you some of these key data points and helps you make informed decisions
Open Source Select is only a baby, a beta baby, and there’s a lot more to be added. For example, we want to help you search in code (yes, in the code) for functionality, contextualize your searches to make sure you get results that align with your organizational policies, and much more. Stay tuned!
This post also exists in the form of a talk; watch it here.
References
[Why Modern Open Source Projects Fail, Jailton Coelho and Marco Tulio Valente. 2017, https://doi.org/10.1145/3106237.3106246]
[On the abandonment and survival of open source projects: An empirical investigation, Guilherme Avelino and Eleni Constantinou and Marco Tulio Valente and Alexander Serebrenik. 2019, https://arxiv.org/abs/1906.08058]
[Understanding the Factors That Impact the Popularity of GitHub Repositories, Hudson Borges and Andre Hora and Marco Tulio Valente. 2016, https://ieeexplore.ieee.org/document/7816479]
Wanna rewatch the talk? Just follow this link: https://lnkd.in/dEqRa7nR