The team's carrying capacity
Franjo Stipanovic
engineering manager | software engineer | security engineer | reverse engineer | penetration tester | security manager | ctf player
Ivan Klaric asked a question about a team's carrying capacity, and my answer exceeded the comment's maximum character limit, so I decided to post it like this.
I've been toying with this idea for the past few weeks: there's a limited number of production lines of code a team can have per engineer before quality starts to suffer through unexpected bugs or reliability issues. Let's call it the team's carrying capacity :-) Teams can increase their carrying capacity by investing in testing and integration practices, reducing model complexity, and as their engineers learn more about the domain and the tech stack. Carrying capacity probably drops temporarily as you add more people to the team or add more external dependencies to the project.

This framing is interesting because it opens up trade-off discussions such as:

* Is adding engineers to the team worth it? Our carrying capacity will drop for the next 6 months, so we won't do new stuff, or we will start seeing bugs in existing stuff if we were already at capacity.
* Not deprecating a project will either decrease our quality or prevent us from building new features.
* Is it worth adding a dependency to the project instead of doing it ourselves?

It feels intuitive, but obviously needs more digging. I'd be curious to see if tools like Sourcegraph or Swarmia could be used to start measuring the ratio of production lines of code per engineer and see if there's something useful there. Has anyone seen attempts at measuring something similar?
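(As an aside before my answer: a crude version of that production-LOC-per-engineer ratio can be approximated with plain git. The sketch below is hypothetical: the file extensions, the six-month author window, and equating distinct commit authors with engineers are all arbitrary choices, and tools like Sourcegraph or Swarmia would do this far better.)

```python
# Crude production-LOC-per-engineer sketch using plain git.
import subprocess

def production_loc(repo: str) -> int:
    """Count lines in tracked source files (the extensions are an arbitrary choice)."""
    files = subprocess.run(
        ["git", "-C", repo, "ls-files", "*.py", "*.java", "*.go"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    total = 0
    for path in files:
        with open(f"{repo}/{path}", encoding="utf-8", errors="ignore") as fh:
            total += sum(1 for _ in fh)
    return total

def active_engineers(repo: str, since: str = "6 months ago") -> int:
    """Approximate team size as distinct commit authors in a recent window."""
    authors = subprocess.run(
        ["git", "-C", repo, "log", f"--since={since}", "--format=%ae"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    return len(set(authors))

loc, engineers = production_loc("."), active_engineers(".")
print(f"carrying-capacity proxy: {loc / max(engineers, 1):.0f} LOC per engineer")
```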
Here’s how I do it.
In my case, “carrying capacity” is not (directly) related to the number of lines of code but to the number of applications and features per team. The complexity of an application or feature is connected to the number of lines, but it can also come from domain complexity (for example, building identity and access management services). To be more precise, I am trying to measure my teams’ cognitive load and their total ownership (the number of applications, components, and features they own).
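To make “total ownership” concrete, here is a toy sketch of the ratio I’m describing. The team names, sizes, and component lists are entirely hypothetical:

```python
# Hypothetical ownership data: team -> team size and owned components/features.
ownership = {
    "identity": {"engineers": 4,
                 "components": ["login", "oauth2-gateway", "mfa", "user-admin"]},
    "billing":  {"engineers": 6,
                 "components": ["invoicing", "payments"]},
}

for team, data in ownership.items():
    ratio = len(data["components"]) / data["engineers"]
    print(f'{team}: {len(data["components"])} components, '
          f'{ratio:.2f} components per engineer')
```

The absolute number matters less than the trend and the comparison across teams, which is also why the conversation comes first.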
I use a combination of task-tracking and source-control tools (in one of my cases, Jira and Bitbucket).
# Conversation with the team
Initially, I ask the team how they feel about their work: do they feel overwhelmed, or do they think they can take on more work? That’s just a conversation starter, and yes, there can be many reasons why they feel the way they do. After a deeper conversation, if they feel overwhelmed by context switching and work complexity, I look at the tasks. The answers can be biased, but they are good for building relationships and can still be useful data points for different analyses.
# Task view
Let’s assume the team has good hygiene and tasks are appropriately labeled: by application/component/feature, and as feature work vs. bugs.
I try to figure out the lead time. Yes, this metric can be worthless if the planning framework is poorly designed, but let’s assume it’s not.
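As an illustration, pulling lead times out of Jira Cloud can be a few lines against its REST search endpoint. This is a minimal sketch, not my production setup; the instance URL, credentials, and JQL filter are placeholders you’d replace:

```python
# Minimal lead-time sketch against the Jira Cloud REST API.
from datetime import datetime

import requests

JIRA_URL = "https://your-company.atlassian.net"   # placeholder
AUTH = ("you@example.com", "your-api-token")      # placeholder

def lead_times_days(jql: str) -> list[float]:
    """Lead time in days (created -> resolved) for issues matching the JQL."""
    resp = requests.get(
        f"{JIRA_URL}/rest/api/2/search",
        params={"jql": jql, "fields": "created,resolutiondate", "maxResults": 100},
        auth=AUTH,
        timeout=30,
    )
    resp.raise_for_status()
    days = []
    for issue in resp.json()["issues"]:
        fields = issue["fields"]
        if not fields.get("resolutiondate"):
            continue  # skip unresolved issues
        created = datetime.strptime(fields["created"][:19], "%Y-%m-%dT%H:%M:%S")
        resolved = datetime.strptime(fields["resolutiondate"][:19], "%Y-%m-%dT%H:%M:%S")
        days.append((resolved - created).total_seconds() / 86400)
    return days

# Hypothetical filter: everything resolved in the last 90 days in one project.
times = sorted(lead_times_days("project = IAM AND resolved >= -90d"))
if times:
    print(f"median lead time: {times[len(times) // 2]:.1f} days")
```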
If the lead time is long, I also try to figure out the work distribution. Is the same person doing all the tasks on one application/component/feature? Is the work inherently complicated (for example, an OAuth2 integration), or does it simply involve a lot of code?
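One quick way to spot that kind of concentration is to count tasks per assignee per component. The sketch below works from hypothetical (component, assignee) pairs; in practice you’d export them from the tracker:

```python
from collections import Counter, defaultdict

# Hypothetical (component, assignee) pairs exported from the task tracker.
tasks = [
    ("oauth2-gateway", "ana"), ("oauth2-gateway", "ana"),
    ("oauth2-gateway", "ana"), ("oauth2-gateway", "marko"),
    ("user-admin", "ivana"), ("user-admin", "marko"),
]

by_component: defaultdict[str, Counter] = defaultdict(Counter)
for component, assignee in tasks:
    by_component[component][assignee] += 1

for component, counts in by_component.items():
    top_person, top_count = counts.most_common(1)[0]
    total = sum(counts.values())
    print(f"{component}: {top_person} did {top_count / total:.0%} of {total} tasks")
```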
This one is tricky and usually requires manual work, but I try to figure out how many tasks are genuinely new work and how many are “rewrites” or “upgrades” of previously poorly defined tasks. I don’t mean bugs here: bugs reveal opportunities in code quality, pull request review, test automation, etc., while these “rewrites” and “upgrades” reveal opportunities in the planning phase.
For example, this gave me enough data to be 100% sure we needed to ditch the in-house-built identity solution and migrate to an external (SaaS) one.
# Code view
Code and the activities around code also give me interesting data. For example, if a task’s lead time is unusually long, I look at the code level: how large or complex is the pull request, how many commits were there, how long was the pull request open, how many comments were there, were there multiple nested question/answer threads, and how many people participated in the review? Unfortunately, Bitbucket does not have good (well, actually, any) metrics, so I built my own solution that visualizes these as graphs and tables (using Python daily cron jobs, git, Bitbucket hooks, and Redash).

The code view can also help engineering managers in other scenarios, like collecting data points that recognize someone’s significant mentoring effort so you can reward them accordingly. It can also surface someone’s blind spot, a gap in knowledge visible in the number or content of review comments, and inform an appropriate education plan.
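My pipeline is more involved than this, but as a flavor of what’s available, here is a minimal sketch against the Bitbucket Cloud 2.0 pull request API. The workspace, repository, and credentials are placeholders, and treating the last-update timestamp of a merged PR as its merge time is a rough approximation:

```python
# Minimal sketch of pull-request signals from the Bitbucket Cloud 2.0 API.
from datetime import datetime

import requests

# Placeholder workspace/repository and app-password credentials.
API = "https://api.bitbucket.org/2.0/repositories/acme/identity-service/pullrequests"
AUTH = ("bot-user", "app-password")

def pr_signals(max_prs: int = 50) -> None:
    url, params, seen = API, {"state": "MERGED"}, 0
    while url and seen < max_prs:
        resp = requests.get(url, params=params, auth=AUTH, timeout=30)
        resp.raise_for_status()
        page = resp.json()
        for pr in page["values"]:
            opened = datetime.fromisoformat(pr["created_on"])
            updated = datetime.fromisoformat(pr["updated_on"])
            open_days = (updated - opened).total_seconds() / 86400  # rough time-to-merge proxy
            print(f'PR #{pr["id"]}: open ~{open_days:.1f} days, '
                  f'{pr["comment_count"]} comments')
            seen += 1
        url = page.get("next")  # Bitbucket paginates via a "next" link
        params = None           # the "next" URL already carries the query string

pr_signals()
```

From there it is a short hop to pushing the rows into something like Redash for trending.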
PS: I trust that you are familiar with the DORA, SPACE, DevEx, Jedi, and Sith metrics, as well as with the McKinsey piece on measuring software developer productivity and the reactions to it by Dan North, Kent Beck, and others.
Yes, there's a lot of “it depends” here :)
VP of Product Engineering at Cognism
Nice one Fritz, thanks a lot for adding more color to the topic! Your approach is exactly what an engineering manager should be doing, in my opinion: take a look at the total cognitive load on the team from different angles and act accordingly (cut scope, reduce work in progress, reward desirable behaviors, outsource some logic to 3rd parties, etc.). This is the core of what EMs are supposed to do imho. If you take a broader view of the bigger organization, you're not going to be able to go so deep with all teams, and you might still want to know whether certain teams are carrying more cognitive load than others and, especially if they're successful with it, what they are doing differently. That's where I thought a crude proxy metric like production_loc/engineer might be useful. Also, as you allude to at the end, any engineering productivity metric can be abused in more ways than it can be properly used :-)