TDD for legacy code
A bowl of spaghetti

How can you apply TDD to huge legacy code bases that were not written with TDD in mind?

There are several challenges in this scenario, both technical and non-technical.

So it is tempting to just talk about TDD in the context of greenfield projects, or projects where the code is organized in such a way that it is straightforward to build and run isolated units of code in a consistent manner.

But unfortunately that is the rare case (in my experience). Much important and business-critical code - designed and implemented perhaps decades ago, and maintained since by a significant number of developers with different coding and development styles - cannot usually be compiled and tested as individual "units" in any straightforward way.

So applying TDD to legacy systems can be very challenging. There are technical challenges, and then there are social/organizational/personal challenges. As usual with such things, the technical challenges are comparatively "easy", and the social/organizational/personal challenges are "hard". But given the right approach it is possible, and it has huge benefits.

TDD for legacy code - the technical challenge

Assume you have identified a section of code that has a bug. A TDD approach suggests writing an automatable test that provokes the bug before fixing it. This step is difficult if the code is embedded inside a context of significant size. In this case - and without refactoring - the unit under test cannot be compiled or run individually. So to test it, the developer has to set up and compile the entire context. This is rarely feasible, as it may involve e.g. starting a UI, a web server context, a database connection, or something a lot more complicated.

Instead the code has to be lifted out of its bigger context and into a much smaller - preferably trivial - standalone context. When doing that, the input to the unit under test must be identified. This input can be in the form of simple basic datatypes, which are easy to pass to the unit, but in real-world scenarios the input is most likely in the form of complex datatypes that carry more information than the unit actually uses and that themselves have huge dependencies. So how can such huge data objects be passed as input? One approach is to not pass the entire object, but to identify only those pieces of data that the unit under test actually uses and pass those alone. Another (perhaps supplementary) approach is to cast the complex input object to a much simpler interface type that exposes only those pieces of data the unit uses. Every programming language has some way of applying these methods in a more or less structured manner.
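As a minimal sketch of the narrow-interface approach in Java (all names here - PressureReading, PressureAlarm, SensorFrame - are invented for illustration, not taken from any concrete code base):

```java
// Hypothetical example: the unit under test reads only one field from a
// huge legacy object, so we narrow its input to a tiny interface.

// The narrow interface exposes only what the unit actually uses. The huge
// legacy type can implement it elsewhere without any other change, e.g.:
//   class SensorFrame implements PressureReading { ... }
interface PressureReading {
    double pressureBar();
}

// The unit under test now depends only on the narrow interface.
final class PressureAlarm {
    static boolean isOverPressure(PressureReading reading, double limitBar) {
        return reading.pressureBar() >= limitBar;
    }
}

// In a test, the narrow interface is trivial to satisfy without the
// legacy context -- a lambda is enough.
class PressureAlarmTest {
    public static void main(String[] args) {
        PressureReading fake = () -> 12.5;  // stands in for the huge object
        if (!PressureAlarm.isOverPressure(fake, 10.0))
            throw new AssertionError("12.5 bar should exceed a 10.0 bar limit");
        if (PressureAlarm.isOverPressure(fake, 15.0))
            throw new AssertionError("12.5 bar should not exceed a 15.0 bar limit");
        System.out.println("tests passed");
    }
}
```

In the original runtime context the real legacy object is passed in unchanged; in the test, a one-line stand-in suffices.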

A similar exercise has to be done for the output from the unit under test. The output can be either returned data or side effects (message passing, writing to global variables, etc.). The challenge is mainly to decide how much code to lift out of its context (at least the code containing the bug, of course) and how to handle the input/output of the new unit so that it has a manageable dependency extent and can be executed in isolation.
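For side effects, one option is to route them through a small output interface that a test can observe. Again a hypothetical sketch (AlarmSink and PressureMonitor are invented names):

```java
// Hypothetical sketch: a side effect (raising an alarm) is routed through
// a tiny output interface instead of e.g. a global message queue, so a
// test can observe it.

interface AlarmSink {
    void raise(String message);
}

final class PressureMonitor {
    private final AlarmSink sink;
    PressureMonitor(AlarmSink sink) { this.sink = sink; }

    void check(double pressureBar, double limitBar) {
        if (pressureBar >= limitBar)
            sink.raise("over-pressure: " + pressureBar);  // the side effect, now observable
    }
}

class PressureMonitorTest {
    public static void main(String[] args) {
        java.util.List<String> raised = new java.util.ArrayList<>();
        PressureMonitor monitor = new PressureMonitor(raised::add);
        monitor.check(12.5, 10.0);
        if (raised.size() != 1)
            throw new AssertionError("expected exactly one alarm, got " + raised);
        System.out.println("captured side effect: " + raised.get(0));
    }
}
```

The recording list here is a hand-rolled fake rather than a framework mock; as noted below, choosing the unit boundary well can often avoid even this.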

The concept of mocks can be used here, but I usually avoid it if possible. I prefer selecting the unit and the input/output so that mocks are not needed.

In all cases that I have come across, it has indeed been possible to lift a unit of code out and detach it from its context, so that it can be plugged into its original runtime/business context but also executed autonomously in a test. Sometimes it is easy, but most times it requires some effort and creativity.

If developers follow this path, more and more units of code will be moved out of their original context and into contexts that are independent of e.g. the hardware (for embedded), the technology stack (for backends), or the UI framework (for frontends). Following the path consistently and to completion will reorganize the legacy code base into two parts: 1) a thin platform-specific layer (hardware, technology stack, UI) that is inherently unportable and untestable and depends highly on the details of the platform, and 2) a much larger portion of business logic that is independent of the currently chosen platform and can be tested in isolation.
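A minimal sketch of what that two-part split can look like (the dosing scenario and all names are invented for illustration):

```java
// Part 2: portable business logic -- no platform dependencies, testable
// in isolation.
final class DosingLogic {
    static int pulsesFor(double millilitres, double mlPerPulse) {
        return (int) Math.ceil(millilitres / mlPerPulse);
    }
}

// Part 1: the thin platform-specific layer -- the only code that touches
// the hardware (or technology stack, or UI framework). It contains no
// logic worth testing; it only delegates to the portable part.
final class PumpDriver {
    void emitPulse() { /* e.g. toggle a GPIO pin, send a CAN frame, ... */ }

    void dose(double millilitres) {
        int pulses = DosingLogic.pulsesFor(millilitres, 0.05);
        for (int i = 0; i < pulses; i++) emitPulse();
    }
}
```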

Having this separation is hugely advantageous for development costs and maintainability (for many reasons), but getting there usually requires effort. In some cases the advantages outweigh the effort required; in others they do not. Any organization has a responsibility to its owners and shareholders to evaluate and make the right choice so the business can be as profitable as possible.

But it is technically possible.

TDD for legacy code - the non-technical challenges

Imagine a lengthy debugging session has finally paid off and you have found the error! And imagine that the required change is easy to make (e.g. '>' should be '>=' instead). As a developer you can make the change, deploy it to production almost immediately, and right after proudly receive the praise of your peers, management and customers.

But if you follow TDD, you should write a test first, which in turn requires you to isolate the unit under test into an autonomous unit as described above. Then you add the test to a suite of regression tests, fix the error, and deploy only after all regression tests have completed successfully. To make the situation even more frustrating, imagine that the act of lifting out the unit under test carries some risk of adverse side effects. So if you are a real TDDer, instead of just making the fix, you have to tell your manager, colleagues and even customers that you know what the issue is, but that it will take you a day or two to actually fix this one line of code in the right manner.
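The test itself is typically tiny once the unit is isolated. A hypothetical sketch of the '>' versus '>=' case (ThresholdCheck is an invented name):

```java
// Hypothetical sketch: the one-character fix ('>' vs '>='), pinned down by
// a regression test written first. With the buggy '>' the test fails; with
// '>=' it passes and stays in the regression suite forever.

final class ThresholdCheck {
    static boolean limitReached(int value, int limit) {
        return value >= limit;  // was: value > limit  (the bug)
    }
}

class ThresholdCheckTest {
    public static void main(String[] args) {
        if (!ThresholdCheck.limitReached(10, 10))
            throw new AssertionError("a value equal to the limit must count as reached");
        System.out.println("regression test passed");
    }
}
```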

This is exactly why TDD fails: it is so much easier to fix errors inline where they are found than to apply the whole TDD approach. So why should you ever apply TDD to legacy systems? Because we should remember why we got here: the bug was only revealed after a lengthy debugging session! There should not have been any debugging session in the first place - had the entire code base been under TDD control, there would have been a lot less debugging.

But in the trenches under stress and impending deadlines it is very hard for the individual developer to carry the burden of applying TDD - especially for legacy code bases. It is so hard that - even if we all agree that TDD is great - in the real everyday world of the developer, it can be almost impossible to uphold the standards.

How can we solve this issue? I personally don't know, but I do know that it is not a technical issue and that it cannot be solved through technical means. And in my opinion, all talk about TDD for legacy systems is a waste of time and resources if this core issue is not acknowledged and handled first.

İsmail Halit KARAKAŞ

Embedded Software Engineer - SLB

1y

God helps you :(

Martin Schröder

Senior Embedded Firmware Engineer | Embedded Firmware Expert | Educator | Scrum Master | Contact me if your company needs to build secure connected products on Zephyr RTOS

1y

It has to start with refactoring and untangling the code. Then not stopping until everything is tested.
