The myth of full utilization - Part 1: The downsides of parallelizing software development
Manuel Drews
Director Of Engineering | Instruments & FX + Developer Platform at Native Instruments
It sounds like a good idea: "We try to work in parallel as much as possible to be as efficient as possible. We want to spend our time working, not discussing." I heard this statement from the lead developer of a project I was consulting for a while ago. And at first glance it seems to make a lot of sense, especially to an engineer: most modern software uses multiple threads running on multiple processor cores to parallelize work, and often derives a significant performance boost from that. If parallelism speeds up the software itself, it should be able to speed up its development as well, right? It should also keep every engineer constantly busy, so that every resource is fully utilized and the company gets maximum value for the salaries it pays.
Unfortunately this idea often does not work out well in reality, as it is based on incorrect assumptions and ignores some important aspects.
Let's start by looking at it from an engineering perspective: an important rule to consider when thinking about parallelizing work is Amdahl's law (1). It originates from computer science but is applicable to any situation where work is split up into tasks that are to be executed in parallel. Amdahl's law states that the maximum achievable speedup is limited by the portion of the work that is interdependent and thus has to be carried out sequentially. For example, even if 75% of all tasks can be done in parallel, the maximum speedup factor is 4, no matter how many resources are thrown at the problem.
Even if 75% of all tasks can be done in parallel, the maximum achievable speedup is 4x
A factor of 4 is a great improvement; if we could transfer that to software development, that would be awesome. So what's wrong with it?
First of all, this is a theoretical limit that, according to Amdahl's formula, requires at least 64 people working on the parallel tasks just to get close to it (and whom you'd have to manage without additional overhead). With a more realistic team size of 4, the achievable speedup is only about 2.3. And remember, this is for a project where 75% of the work can be parallelized. Unfortunately this is hardly ever the case in software development, however much one might wish it to be so. The engineers are working on the same codebase and their code is supposed to contribute to the same product, so there has to be a certain amount of synchronization. I think assuming that 50% of the work is truly independent is still a generous assumption for most projects (*). The achievable speedup with a 4-person team is then only 1.6. OK, that's still much better than nothing, so let's go with it, right?
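If you want to check the arithmetic yourself, here is a minimal Python sketch of Amdahl's formula that reproduces the numbers above (the function name is just for illustration):

```python
# Amdahl's law: speedup = 1 / ((1 - p) + p / n), where p is the fraction of the
# work that can be parallelized and n is the number of people working in parallel.
def speedup(p: float, n: float) -> float:
    return 1.0 / ((1.0 - p) + p / n)

print(round(speedup(0.75, float("inf")), 2))  # 4.0  -- theoretical limit with 75% parallel work
print(round(speedup(0.75, 4), 2))             # 2.29 -- 75% parallel work, team of 4
print(round(speedup(0.50, 4), 2))             # 1.6  -- 50% parallel work, team of 4
```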
Not quite. The second faulty assumption is that every task can be executed equally well by any resource. That might be true for classical multiprocessor architectures, where all the CPU cores are indeed identical. It doesn't hold up for humans, and even less so when the job requires expertise and context, as software development does. The more the engineers work in isolation from each other, the longer the ramp-up time will be when one of them has to jump into a different area of the codebase. This introduces a significant risk of resource bottlenecks where several tasks are waiting to be taken on by the same person. So, ironically, parallelizing too much can actually force you to serialize work that could be tackled at the same time. This is exactly what happened in the software project I mentioned at the beginning: towards the end, 16 of the remaining 20 tickets in the backlog could only reasonably be worked on by one single engineer.
Parallelizing too much can actually force you to serialize work that could be tackled at the same time.
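To put some (admittedly simplified) numbers on that: assume all 20 remaining tickets take roughly the same time, measured in "ticket-lengths". If anyone can pick up anything, a team of 4 drains the backlog in about 5 ticket-lengths; if 16 tickets can only be handled by one engineer, that single queue alone takes 16, no matter how idle the rest of the team is. A quick back-of-the-envelope calculation:

```python
import math

ENGINEERS = 4
TICKETS = 20

# Everyone can work on every ticket: the backlog drains evenly across the team.
evenly_split = math.ceil(TICKETS / ENGINEERS)

# 16 tickets are stuck behind a single engineer: their queue dominates,
# no matter how quickly the other three finish the remaining 4 tickets.
locked_to_one = 16
bottlenecked = max(locked_to_one, math.ceil((TICKETS - locked_to_one) / (ENGINEERS - 1)))

print(evenly_split)  # 5
print(bottlenecked)  # 16
```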
A common side-effect of this situation is that developers who can't work on the most important tasks start to pick up other, less important tickets. As a consequence, you end up with a lot of loose ends and unfinished features that clutter the codebase, making it harder (=slower) to work with. Managing all those open tasks increases the organizational overhead as well.
Lastly, imagine that the engineer responsible for those 16 tasks suddenly becomes unavailable due to sickness or other circumstances. This so-called 'bus factor' (2) (or 'lottery factor' for the more optimistic among you) is usually quite unfavorable in a team where a lot of these information islands exist. Any person who suddenly becomes unavailable puts the whole project timeline at risk.
Speaking of risk: trying to work in parallel for as long as possible has other severe downsides in terms of risk management. In this work mode, feature integration and testing tend to be delayed significantly in an attempt to avoid the synchronization overhead that comes with them. But the later integration happens, the later problems are identified and the less time remains to fix them. Software modules that work well in isolation might display unexpected side-effects when connected. You might find some fundamental conceptual issues that nobody thought of before. Some usability problems only become visible once people are actually able to try out the application. Whatever it is, the later in the game you are, the harder these issues will be to fix. Some might not be fixable at all anymore, or only in ways that severely hurt the code quality and future maintainability.
The later integration happens, the later problems are identified and the less time remains to fix them.
If you push integration to the very end of the project, you also lose the flexibility to modify or reduce the feature set in order to make a release date or react to changing requirements. And what about your marketing department, or the people translating your software and writing documentation for it? They need finished features to work with too.
The bottom line is that the speedup achievable by parallelizing is often not that great, and that it significantly increases risk and uncertainty if you don't keep an eye on the side-effects.
But that's still not all, there's yet another aspect: not only does over-parallelizing not really speed up your project, it also reduces its quality. Every developer has a skewed perspective on their own work, and it's all too easy to over-engineer, to forget about corner cases, or to simply miss a bug. If the engineers don't talk to each other, it's unrealistic to expect that they will always find the best solution to a problem. If they try to save time and effort by not doing code reviews, it's almost certain that problematic code will make it into production unnoticed. You'll also very likely end up with a codebase in which the same problems are solved multiple times in different ways, with all the negative consequences that brings.
So far I've talked a lot about the negative aspects of parallelizing work. Criticizing without offering alternatives is terribly unconstructive, though. Therefore, in my follow-up article I will address ways to improve software development throughput without falling into these traps.
Thanks for reading. As always, I'm interested in feedback and your experiences with the topic.
TL;DR:
Working in parallel can speed up software development to a certain degree. Parallelizing too much, however, will result in resource bottlenecks and increased risk for the whole project, as well as reduced software quality. So apply it with care, or the drawbacks will rapidly outweigh the benefits.
* 50% is a ballpark estimate that reflects my personal experience. It varies with the complexity of the application and the quality of the codebase, especially its modularity. As my former colleague Tom Smith pointed out, the number also greatly depends on the domain (e.g. web development tends to be more parallelizable than work on a complex desktop application). The takeaway here is that you have to look closely at the project at hand and assess the respective trade-offs carefully.
References:
1. https://en.wikipedia.org/wiki/Amdahl%27s_law
2. https://en.wikipedia.org/wiki/Bus_factor
Acknowledgements
Many thanks to my colleagues and friends who have reviewed this article and shared their feedback and insights: Anna Gough, Alex Pukinskis and Tom Smith