OSS Community Health
Thanks to a nudge from Kiran Mova, I was inspired to dig up some previous work I did with the Cloud Native Computing Foundation (CNCF) project OpenEBS. Given that the annual CNCF conference is happening next week in Chicago, I thought it would be fitting to throw back to MJ's Bulls and share some insights on how we can assess the health and practices of an open source project community. The idea is to provide a real-time instrumentation fabric from which project maintainers and contributors can collectively make decisions to further the progress of the project. Here is what we did:
Here are some of the visualizations and the insights that followed. By intersecting the git repos, team members, and Slack messages, and centering the analysis on each individual project within the community, we found four collaboration patterns:
Looking at these, I think it would be interesting to keep slicing and dicing into the various projects and to superimpose the various companies in play, as that may signal the political dynamics that always exist in such communities. It may also surface opportunities to course-correct behavioral patterns such as tribal knowledge, hero culture, and collaboration silos.
Further to this, we expanded on the collaboration theme to look into some characterizations, which we grouped into the following four groups:
So using this, if someone claims to be an active contributor, one can measure the validity of that claim. More importantly, we can find the unsung heroes: those who may not be vocal in public forums and conferences, but are quietly leading behind the scenes!
Shifting over to purer code repo analysis, we decided to look at "risk" as it relates to code stability. From that, we told the developers which repos carried the highest risk, based on rate of change, pace of change, and dependencies derived from cross-repo file usage. This gave a sense of which code repos warrant a deeper level of caution...
We subsequently also spotted areas for mitigation based on repo-to-repo dependencies, and recommended when and where collaboration between contributors should be intensified, with an extra level of peer review given the heightened risk profile.
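To make the risk ranking above a little more concrete, here is a minimal sketch of how such a score could be put together. It is not the model we actually used: the field names, the 90-day window, the equal weights, and the repo names are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class RepoStats:
    """Per-repo inputs. All three measures are hypothetical stand-ins."""
    name: str
    commits_last_90d: int        # assumed proxy for "rate of change"
    lines_changed_last_90d: int  # assumed proxy for "pace of change"
    dependent_repos: int         # assumed proxy for cross-repo file usage


def normalize(values):
    """Scale values into [0, 1] by dividing by the largest one."""
    hi = max(values, default=0)
    return [v / hi if hi else 0.0 for v in values]


def rank_by_risk(repos, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Return (name, score) pairs, highest risk first.

    The score is a weighted sum of the three normalized signals.
    Equal weights are an arbitrary starting point, not a finding.
    """
    rates = normalize([r.commits_last_90d for r in repos])
    paces = normalize([r.lines_changed_last_90d for r in repos])
    deps = normalize([r.dependent_repos for r in repos])
    scored = [
        (r.name, weights[0] * a + weights[1] * b + weights[2] * c)
        for r, a, b, c in zip(repos, rates, paces, deps)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up numbers, only to show the shape of the input.
example = [
    RepoStats("repo-a", 100, 5000, 4),
    RepoStats("repo-b", 10, 500, 1),
    RepoStats("repo-c", 50, 5000, 0),
]
for name, score in rank_by_risk(example):
    print(f"{name}: {score:.2f}")
```

The repos at the top of such a list are the ones where the extra peer review mentioned above would be focused first. In practice the weights would need tuning against what maintainers already know about their own code.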
On this note, we realized that it would be beneficial to deepen the analysis and plot collaboration patterns within and across specific busy, and not so busy, repos. The most important insight was that some areas combined a high amount of community chatter with an intensely high degree of forking, suggesting that certain code files could be made into libraries or 'platform' entities to encourage reuse and save time and energy!
Last but not least, we had the beginnings of a rudimentary value stream of the development activities within the repos. What we found was completely counter-intuitive (to me at least): the issues and PRs closed by a bot had a higher cycle time than those closed manually by humans. As I scratch my head on that one, could it mean that humans aren't always slower or less efficient than the machine? Who knows. Mind you, this was pre-GenAI days ;-) so... more data needed for sure.
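For anyone who wants to try the bot-versus-human comparison on their own project, here is a rough sketch of the calculation. It assumes each closed issue or PR has been exported as a record with `opened_at`, `closed_at`, and `closed_by` fields (hypothetical names), and that bot accounts can be recognized by a login ending in `[bot]`. The sample records are invented.

```python
from datetime import datetime
from statistics import median


def is_bot(login):
    """Assumption: bot accounts have logins ending in '[bot]'."""
    return login.endswith("[bot]")


def median_cycle_hours(items):
    """Median open-to-close time in hours, split by who closed the item.

    Each item is a dict with ISO 8601 'opened_at' and 'closed_at'
    timestamps and a 'closed_by' login (hypothetical field names).
    A group with no items maps to None.
    """
    groups = {"bot": [], "human": []}
    for item in items:
        opened = datetime.fromisoformat(item["opened_at"])
        closed = datetime.fromisoformat(item["closed_at"])
        hours = (closed - opened).total_seconds() / 3600
        key = "bot" if is_bot(item["closed_by"]) else "human"
        groups[key].append(hours)
    return {key: median(vals) if vals else None for key, vals in groups.items()}


# Invented sample records, only to show the expected input shape.
sample = [
    {"opened_at": "2023-01-01T00:00:00", "closed_at": "2023-01-03T00:00:00",
     "closed_by": "stale[bot]"},
    {"opened_at": "2023-01-01T00:00:00", "closed_at": "2023-01-01T12:00:00",
     "closed_by": "alice"},
    {"opened_at": "2023-01-01T00:00:00", "closed_at": "2023-01-02T00:00:00",
     "closed_by": "bob"},
]
print(median_cycle_hours(sample))
```

One thing worth checking before reading too much into the result is what kind of closures the bots perform. If they are mostly sweeping up stale items after a long idle period, a higher cycle time for bot-closed items would be expected by construction.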
This was fun work, and I think various other CNCF projects can definitely benefit from such analysis and more - there is much more where it came from! Finally, none of this would have been possible without the encouragement of my friends in the OSS/CNCF space: Nithya Ruff, Chris Aniszczyk, Evan Powell, Murat Karslioglu, Martin Casado and Bogomil Balkansky, to name a few...
Have fun in Chicago during the CNCF Conference!
Senior Engineering Manager - VMware Cloud Foundation (VCF) | Building the best Kubernetes offering on vSphere
1y: Thank you, Sumit, for taking the time to publish this! This is going into my talk, and I hope more projects will come forward to do more in this space! Something on my mind for this conference is to search for the recipe for building a sustainable open source community that powers enterprises and enables enterprises to fund more open source projects.
Very interesting ways to think about community interactions!
Many time founder & 5 exits - lots of open source - now working to reimagine cyber security with deep learning
1y: Thanks Sumit - it was a great study. I still believe this sort of approach can be helpful in understanding the sustainability & “resilience” of a community, team, and company.