OSS Community Health

Thanks to a nudge from Kiran Mova, I was inspired to dig up some previous work I did with OpenEBS, a Cloud Native Computing Foundation (CNCF) project. Given that the annual CNCF conference is happening next week in Chicago, I thought it would be fitting to throw back to MJ's Bulls and also share what I learned about assessing the health and practices of an open source project community. The idea was to provide a real-time instrumentation fabric from which project maintainers and contributors can collectively make decisions to further the progress of the project. Here is what we did:

  1. First, we understood the process Kiran and team were following, and then aggregated the exhaust sources, namely Slack (for collaboration), Git (for planning and source code) and Travis (for CI/CD), merging the data from these sources together.
  2. Next, we aligned the data over a time series and started to look for patterns and any key 'captain obvious' messages.
  3. Then, we built a series of visualizations that informed a hypothetical narrative that we continued to fine-tune, test out, prove and disprove.
  4. Finally, we arrived at a set of insights for Kiran and team, who as project maintainers appreciated the visibility and transparency into what was happening within the open source community, and why...
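
To make the first two steps concrete, here is a minimal sketch of putting events from several sources onto one shared time axis. It is not the tooling we used: the event tuples, actor names, dates and the `daily_counts` helper are all made up for illustration, and it assumes each source has already been exported as (timestamp, source, actor) records.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical, already-exported events: (ISO timestamp, source, actor).
# All values are invented for illustration only.
events = [
    ("2019-03-01T09:15:00", "slack", "alice"),
    ("2019-03-01T10:02:00", "git", "alice"),
    ("2019-03-01T10:20:00", "travis", "ci-bot"),
    ("2019-03-02T14:45:00", "slack", "bob"),
    ("2019-03-02T15:00:00", "git", "bob"),
]

def daily_counts(events):
    """Count events per (day, source) so all sources share one time axis."""
    buckets = defaultdict(Counter)
    for timestamp, source, _actor in events:
        day = datetime.fromisoformat(timestamp).date().isoformat()
        buckets[day][source] += 1
    # Sort by day so the result reads as a time series.
    return {day: dict(counts) for day, counts in sorted(buckets.items())}

timeline = daily_counts(events)
```

Once everything sits in day-sized buckets, spotting a spike in Slack chatter next to a run of Travis builds becomes a matter of reading across a row.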

Here are some of the visualizations, and subsequent insights, which we found. By intersecting the team members of specific Git repos with Slack messages, and centering this around each of the individual projects within the community, we found four collaboration patterns:

  • Kumbaya chamber: a collaborative, inclusive and vibrant community with lots of bi-directional interactions.
  • Silo chamber: lots of 1:1 chats, not very team oriented and quite individualistic.
  • Echo chamber: lots of command-and-control style, uni-directional communication.
  • Needy chamber: the exact opposite of the echo chamber, where we found lots of people talking / reporting to a single person. Again, very uni-directional.
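
As a rough illustration of how the four patterns above could be told apart from message data, here is a small sketch. It is my own simplification, not the method used in the study: it assumes messages are reduced to (sender, recipient) pairs, and the `classify_pattern` function and its thresholds are hypothetical.

```python
from collections import Counter, defaultdict

def classify_pattern(messages, dominance=0.6, reciprocity_floor=0.5):
    """Label a non-empty list of (sender, recipient) messages with a pattern.

    The thresholds are illustrative assumptions, not values from the study.
    """
    senders = Counter(s for s, _ in messages)
    recipients = Counter(r for _, r in messages)
    edges = set(messages)
    total = len(messages)

    # Share of directed edges that are answered in the other direction.
    reciprocity = sum((r, s) in edges for s, r in edges) / len(edges)

    # Average number of distinct conversation partners per participant.
    partners = defaultdict(set)
    for s, r in edges:
        partners[s].add(r)
        partners[r].add(s)
    avg_partners = sum(len(p) for p in partners.values()) / len(partners)

    if reciprocity < reciprocity_floor:
        if senders.most_common(1)[0][1] / total >= dominance:
            return "echo"    # one voice broadcasting to many
        if recipients.most_common(1)[0][1] / total >= dominance:
            return "needy"   # many voices reporting to one person
    if reciprocity >= reciprocity_floor and avg_partners > 1:
        return "kumbaya"     # two-way talk spread across the group
    return "silo"            # mostly isolated 1:1 conversations

# Hypothetical example: one lead talking at three people.
print(classify_pattern([("lead", "a"), ("lead", "b"), ("lead", "c")]))
```

Real channels will be messier than this, so the thresholds would need tuning against what the visualizations actually show.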

Looking at these, I think it would be interesting to continue to slice and dice into the various projects and superimpose the various companies in play, as it may signal the political dynamics that always exist in such communities. It may also surface opportunities to course correct certain behavioral patterns such as tribal knowledge, hero culture and collaboration silos.

Further to this, we expanded on the collaboration theme to look into some characterizations, which we grouped into the following four groups:

  • Active Contributor / Leader centric behaviour: Lots of bi-directional collaboration, spread across many people signalling a strong network.
  • Sales and Marketing centric behaviour: Lots of outbound one-way conversations.
  • Customer/User centric behaviour: Lots of inbound interactions from sales people, as well as some level of bi-directional chats, but nowhere near the level of a leader.
  • Lurker: These are the dormant people, with very sporadic activity.
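
The four characterizations above can be approximated per person from the same kind of message data. The sketch below is an assumption-laden illustration rather than the study's method: the `characterize` function, its thresholds and the sample names are all hypothetical, and messages are again reduced to (sender, recipient) pairs.

```python
from collections import Counter, defaultdict

def characterize(messages, min_activity=3, skew=0.7, min_partners=3):
    """Map each participant to a behaviour label.

    All thresholds are illustrative assumptions, not values from the study.
    """
    sent = Counter(s for s, _ in messages)
    received = Counter(r for _, r in messages)
    partners = defaultdict(set)
    for s, r in messages:
        partners[s].add(r)
        partners[r].add(s)

    labels = {}
    for person in partners:
        outbound, inbound = sent[person], received[person]
        total = outbound + inbound
        if total < min_activity:
            labels[person] = "lurker"                     # sporadic activity
        elif outbound / total >= skew:
            labels[person] = "sales/marketing"            # mostly one-way outbound
        elif inbound / total >= skew:
            labels[person] = "customer/user"              # mostly inbound
        elif len(partners[person]) >= min_partners:
            labels[person] = "active contributor/leader"  # two-way, wide network
        else:
            labels[person] = "customer/user"              # two-way, narrow network
    return labels

# Hypothetical sample: "lee" talks both ways with three people,
# "sam" only sends, "u" only receives, "zed" barely shows up.
sample = [
    ("lee", "a"), ("a", "lee"), ("lee", "b"), ("b", "lee"),
    ("lee", "c"), ("c", "lee"),
    ("sam", "u"), ("sam", "u"), ("sam", "u"), ("sam", "a"),
    ("zed", "a"),
]
labels = characterize(sample)
```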

So using this, if someone claims to be an active contributor, one can measure the validity of that claim - but more importantly, we can find the unsung heroes, the ones who may not be vocal in public physical forums and conferences, but are quietly the leaders behind the scenes!

Shifting over to more pure code repo analysis, we decided to look at the topic of "risk" as it relates to code stability. From that, we informed the developers which repos were the highest risk, based on rate of change, pace of change, and dependencies based on cross-repo file usage. This gave a sense of which code repos warranted a deeper level of caution...
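
One simple way to combine signals like these into a ranking is a weighted sum of normalized values. The sketch below is only an illustration under my own assumptions: I am reading "rate of change" as commits per week and "pace of change" as lines changed per commit, the repo names and numbers are invented, and `risk_scores` is a hypothetical helper rather than the scoring we actually used.

```python
def risk_scores(repos, weights=(1.0, 1.0, 1.0)):
    """Rank repos by a weighted sum of max-normalized signals (higher = riskier)."""
    keys = ("commits_per_week", "lines_per_commit", "dependent_repos")
    # Normalize each signal by its maximum; fall back to 1 to avoid dividing by zero.
    maxima = {k: max(r[k] for r in repos.values()) or 1 for k in keys}
    scores = {
        name: sum(w * r[k] / maxima[k] for w, k in zip(weights, keys))
        for name, r in repos.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Invented numbers for three hypothetical repos.
repos = {
    "repo-a": {"commits_per_week": 40, "lines_per_commit": 120, "dependent_repos": 5},
    "repo-b": {"commits_per_week": 10, "lines_per_commit": 300, "dependent_repos": 1},
    "repo-c": {"commits_per_week": 2, "lines_per_commit": 30, "dependent_repos": 0},
}
ranked = risk_scores(repos)
```

The weights are the knob to turn if, say, cross-repo dependencies should count for more than raw commit volume.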

We subsequently also spotted areas of mitigation based on repo-to-repo dependencies, and recommended when and where collaboration between contributors should be intensified, with an extra level of peer review given the heightened risk profile.

On this note, we realized that it would be beneficial to deepen the analysis and plot out collaboration patterns within, and across, specific busy and not-so-busy repos. The most important insight was that there were areas with high amounts of community chatter, coupled with an intensely high degree of forking, suggesting that certain code files could be made into libraries or 'platform' entities so as to encourage reuse and save time and energy!

Finally, we had the beginnings of a rudimentary value stream of the development activities within the repos. What we found was completely counter-intuitive (to me at least): the issues and PRs closed by a bot had a higher cycle time than those closed manually by humans. As I scratch my head on that one, could it mean that humans aren't always slower or less efficient than the machine? Who knows. Mind you, this was pre-GenAI days ;-) so... more data needed for sure.
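
For anyone wanting to repeat the bot-versus-human comparison on their own repos, a minimal sketch could look like the following. The records, account names and the `median_cycle_days` helper are hypothetical, the numbers are invented and are not the study's results, and I am assuming cycle time means the elapsed time from open to close.

```python
from datetime import datetime
from statistics import median

def median_cycle_days(items, bot_accounts):
    """Return median open-to-close time in days for bot- and human-closed items."""
    durations = {"bot": [], "human": []}
    for opened, closed, closed_by in items:
        elapsed = datetime.fromisoformat(closed) - datetime.fromisoformat(opened)
        group = "bot" if closed_by in bot_accounts else "human"
        durations[group].append(elapsed.total_seconds() / 86400)
    # Skip a group entirely if nothing was closed by it.
    return {group: median(days) for group, days in durations.items() if days}

# Invented records: (opened, closed, closed_by).
items = [
    ("2019-01-01T00:00:00", "2019-01-11T00:00:00", "example-bot"),
    ("2019-01-05T00:00:00", "2019-01-25T00:00:00", "example-bot"),
    ("2019-01-02T00:00:00", "2019-01-04T00:00:00", "alice"),
    ("2019-01-03T00:00:00", "2019-01-07T00:00:00", "bob"),
]
cycle = median_cycle_days(items, bot_accounts={"example-bot"})
```

I have used the median here rather than the mean so that a handful of very old items does not dominate the comparison.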

This was fun work, and I think various other CNCF projects can definitely benefit from such analysis, and more - there is much more where it came from! Last but not least, all this would not have been possible without the encouragement of my friends in this OSS/CNCF space: Nithya Ruff, Chris Aniszczyk, Evan Powell, Murat Karslioglu, Martin Casado and Bogomil Balkansky, to name a few...

Have fun in Chicago during the CNCF conference!

Kiran Mova

Senior Engineering Manager - VMware Cloud Foundation (VCF) | Building the best Kubernetes offering on vSphere

1 year ago

Thank you, Sumit, for taking the time to publish this! This is going into my talk, and I hope more projects will come forward to do more in this space! Something on my mind for this conference is to search for the recipe for building a sustainable open source community that powers enterprises and enables enterprises to fund more open source projects.

Very interesting ways to think about community interactions!

Evan Powell

Many time founder & 5 exits - lots of open source - now working to reimagine cyber security with deep learning

1 year ago

Thanks Sumit - it was a great study. I still believe this sort of approach can be helpful in understanding the sustainability & “resilience” of a community, team, and company.
