Navigating Technical Debt: How to Identify, Measure, and Manage

What is Technical Debt?

Imagine you're building a LEGO castle. You're in a hurry, so you start using pieces that don't fit together quite right. The castle looks fine from a distance, but up close, you can see the mismatched parts.

In software development, technical debt is like those mismatched pieces. They can lead to increased maintenance times, bugs, and difficulty in adding new features or scaling.

[Image: A properly maintained technical asset]

Paying off technical debt means taking the time to go back and replace those pieces with well-thought-out, robust solutions that align with the overall design. It's an investment that ensures that software is maintainable, scalable, and performs well for the long run.

Measuring to Improve

At AppDirect, we believe in driving continuous improvement based on what we can measure. But how can we measure technical debt?

It turns out that measuring the level of technical debt in a system is a recognized problem in the software industry. Recently, researchers at Google attempted to correlate technical debt as understood by different engineering teams with 116 existing metrics that measure codebases.

"The results were disappointing, to say the least. No single metric predicted reports of technical debt from engineers; our linear regression models predicted less than 1% of the variance in survey responses."

They found that over time, these measurement techniques didn't correlate with team member estimates of technical debt. Their conclusion was that existing software engineering metrics can't be used to accurately measure technical debt. New ways of measurement are needed.

We Love Hard Problems

A conclusion like this is exciting because it represents a widespread, unsolved problem in the software engineering industry.

One reason that measuring technical debt is so difficult is that the ideal state of a software project is held in the minds of the team that maintains it. What measurements other than an opinion survey can look into the team's imagination? The question we're asking is, what impedes our forward progress? What could be better?

For this reason, it's the responsibility of each team to assess and document their own level of technical debt as part of their understanding of platform resilience. Not all technical debt needs to be resolved. It's important to focus on technical debt that directly impacts a team's productive work.

Kinds of Technical Debt

Let's examine the different kinds of technical debt to consider how they apply to engineering strategy. Consider four major categories of tech debt.

Codebase Challenges

Challenges with the codebase are the most obvious kind of technical debt. Problems that accumulate here correspond to issues with other engineering processes. That's the nature of debt: it accumulates. So proactive processes around refactoring during routine development are essential to prevent larger problems down the road.

Code Quality

In this case, product architecture or code within a project was not well designed. It may have been rushed, or was a demo that became a production system. In the interests of time to market, a team is racing ahead to build features needed for product/market fit. "We can fix it later," we say, but later never comes. This becomes a migration challenge, discussed below.

[Image: Shipping the hackathon project to production]

While this might be okay as long as the software is delivering value for the business, these poor architecture choices or outdated code standards may lead to maintainability issues. Refactoring and adherence to design principles are key.

Dead and/or abandoned code

In this case, code, features, or projects were replaced but not removed. This reflects a lack of proper code maintenance and clean-up. Regular audits can mitigate this issue.
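As a rough illustration of what a small, automated audit could look like, the sketch below uses Python's standard `ast` module to list functions that are defined in a piece of source code but never referenced within it. The sample source and the function names in it (`new_checkout`, `old_checkout`) are hypothetical, and the check is deliberately naive: it looks at a single file and would miss code that is only called dynamically or from other modules, so treat its output as candidates for review rather than proof of dead code.

```python
import ast


def unreferenced_functions(source: str) -> list[str]:
    """Return names of functions defined in `source` but never referenced in it."""
    tree = ast.parse(source)
    defined = set()
    referenced = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            defined.add(node.name)
        elif isinstance(node, ast.Name):
            # Plain name usage, e.g. the `helper` in `helper(x)`.
            referenced.add(node.id)
        elif isinstance(node, ast.Attribute):
            # Attribute usage, e.g. the `helper` in `module.helper(x)`.
            referenced.add(node.attr)
    return sorted(defined - referenced)


# Hypothetical module in which a replaced function was never removed.
SAMPLE = '''
def new_checkout(cart):
    return sum(cart)

def old_checkout(cart):
    return sum(cart) + 0

def main():
    return new_checkout([1, 2])

main()
'''

if __name__ == "__main__":
    print(unreferenced_functions(SAMPLE))
```

Running a check like this on a schedule is one way to make "regular audits" concrete, with a human deciding what is actually safe to delete.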

Dependencies

Modern software uses an ever-growing list of dependencies. These need to be kept up to date, and sometimes dependencies can be unstable, rapidly changing, or trigger rollbacks. Problems here highlight the need for strategic management and version control.

To address issues with dependencies, sometimes teams decide to move an implementation from one dependency to another, triggering another migration.
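One simple way to make dependency drift visible is to compare the versions a project pins against the latest released versions and report how many major versions behind each one is. The sketch below is a minimal example under stated assumptions: the package names and version numbers are made up, the "latest" versions are supplied by hand rather than fetched from a registry, and versions are assumed to be purely numeric `MAJOR.MINOR.PATCH` strings.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Parse a numeric dotted version such as "2.3.1" into (2, 3, 1).

    Assumption: no pre-release or build tags. Real version schemes
    would need a proper parser.
    """
    return tuple(int(part) for part in version.split("."))


def major_version_lag(pinned: dict, latest: dict, min_lag: int = 1) -> list:
    """Return (name, lag) pairs for dependencies at least `min_lag` majors behind.

    Results are sorted with the most out-of-date dependency first.
    """
    lagging = []
    for name, pinned_version in pinned.items():
        if name not in latest:
            continue  # No information about the latest release; skip it.
        lag = parse_version(latest[name])[0] - parse_version(pinned_version)[0]
        if lag >= min_lag:
            lagging.append((name, lag))
    return sorted(lagging, key=lambda item: (-item[1], item[0]))


# Hypothetical data for illustration only.
PINNED = {"web-framework": "2.3.1", "json-parser": "5.0.0", "queue-client": "1.9.4"}
LATEST = {"web-framework": "5.0.2", "json-parser": "5.1.0", "queue-client": "2.0.0"}

if __name__ == "__main__":
    for name, lag in major_version_lag(PINNED, LATEST):
        print(f"{name}: {lag} major version(s) behind")
```

A report like this does not say which upgrade matters most, but it gives a team a starting list to weigh against the impact on its productive work.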

Pipeline and Process Challenges

Very closely related to codebase issues are challenges around the pipeline for testing and releasing the codebase. Investments made in seed data, unit, and integration tests act as technical assets that can help prevent the accumulation of debt.

Testing Issues

Poor test quality or coverage, such as missing tests or poor test data, results in fragility, flaky tests, or lots of rollbacks. This reveals the necessity for robust testing, which strengthens the product and minimizes post-deployment issues.

Automated tests provide confidence for frequent, rapid delivery. If tests cannot be trusted, cycle times stretch out.
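To decide whether tests can be trusted, it helps to separate tests that fail because the code changed from tests that fail intermittently. The sketch below shows one common heuristic, not a definitive method: a test is flagged as flaky if it has both passed and failed against the same commit. The record format `(test_name, commit_sha, passed)` and the sample data are assumptions made for illustration; a real CI system would export its history in its own shape.

```python
from collections import defaultdict


def find_flaky_tests(runs):
    """Return sorted names of tests that both passed and failed on one commit.

    `runs` is an iterable of (test_name, commit_sha, passed) tuples.
    """
    outcomes = defaultdict(set)
    for test_name, commit_sha, passed in runs:
        outcomes[(test_name, commit_sha)].add(passed)
    # Two distinct outcomes (True and False) for identical code means flaky.
    return sorted({name for (name, _), seen in outcomes.items() if len(seen) == 2})


# Hypothetical CI history for illustration only.
RUNS = [
    ("test_login", "abc123", True),
    ("test_login", "abc123", False),     # same commit, different result
    ("test_checkout", "abc123", True),
    ("test_checkout", "def456", False),  # failed only after the code changed
    ("test_search", "def456", True),
]

if __name__ == "__main__":
    print(find_flaky_tests(RUNS))
```

Here `test_checkout` is not flagged, because its failure coincides with a code change and may well be a legitimate regression.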

Release process issues

The ability to deploy faster helps solve many problems in software development. To enable this, the rollout and monitoring of production need to be updated, migrated, or maintained. A need for an updated release process reveals the importance of deployment planning and monitoring in maintaining product quality.

Large changes to the build, testing, or deployment pipeline can require switching from one technology to another. These are complex migrations that need to be managed carefully.

At AppDirect, we have seen significant changes to our release process when we moved from a weekly release of a monolithic application to on-demand deployment of smaller services. With improved testing automated into our release process, we can deploy hundreds of times per day.

Knowledge Challenges

In this era of doing more with less, we're faced with more challenges around knowledge management. Simultaneously, we have more tools than ever to help address them.

Documentation Gaps

As time passes at a startup, information about how a project works becomes hard to find, missing, or incomplete. This may include documentation on APIs or inherited code. It can also include documentation about functionality and how features are intended to be used or supposed to work. A lack of clear information can slow down development and increase mistakes, underlining the need for comprehensive documentation practices.

[Image: A documentation problem that builds up over time]

At AppDirect, we've continued to invest in tools to improve access to documentation, such as our own internal Knowledge Center, Stack Overflow for Teams, and custom Udemy Business courses.

More recently, we've experimented with building our own tools that leverage AI and large language models to consolidate information scattered across the enterprise. You can then ask your technical questions ChatGPT-style. While these internal tools were quite capable and useful, maintaining them is not our core competency, and so we recently became early users of a tool called Unblocked.

As clever as AI tools can be, if they're reading artifacts of the software development process like Jira tickets and pull requests, getting the best results from them requires teammates to create high-quality documentation. Text documents, such as markdown stored with code repositories, work best because they're easier to update together with the code. The first answer to learning something can't be to jump on a call, watch a video, or find a slide presentation.

Team lacks necessary expertise

This may be due to staffing gaps and turnover or inherited orphaned code/projects. A lack of skilled resources can lead to inefficiencies and reduced quality, emphasizing the need for ongoing training and proper hiring strategies.

Migration Challenges

These may be motivated by the need to scale, due to mandates, to reduce dependencies, or to avoid deprecated technology. In the Google study referenced above, 20% of the technical debt types were identified as migration issues.

As we've seen, significant challenges in every other category of technical debt result in migrations. In effect, paying down technical debt itself is a migration. If not managed well, these migrations themselves become a form of technical debt.

Migrations are projects, and while there are challenges in measuring other kinds of technical debt, there are a lot of measurements and strategies we can deploy around project management.

Migration is needed, planned, or ongoing

At AppDirect, our engineering team has global initiatives that require large migrations, such as moving from RabbitMQ to Kafka. This requires special coordination between teams to address properly, because migrations like this involve message services with producers and consumers in different teams. These kinds of changes are internal to the engineering function and need to be planned and completed.

[Image: A carefully managed migration in progress]

Some migrations are initiated by our enablement teams. These teams create and operate services used by every engineering team in the organization, such as the cloud platform and the data platform. They're migrating to new technologies that the rest of the organization can then adopt.

There are also migrations that are specific to the individual teams working on parts of the value stream. As these teams build new capabilities for the business, migrating away from existing functionality must be planned out. How customers are impacted is a big part of this.

A whopper of a migration is something like moving from a monolith to microservices, which many companies have struggled with. In migrations like this, functionality is typically extracted in different ways. Extraction by reimplementation remains the slowest, most comprehensive way to decompose a monolith. Often this method of extraction is coupled to a rewrite story, which offers its own challenges.

Migration was poorly executed or abandoned

Sometimes, a migration that's in progress might become overlooked due to shifting company priorities, staffing changes, lack of documentation, or some other reason. When migrations of any nature are not well managed, they can become abandoned.

Multiple redundant versions in production simultaneously

This can make an already complex system a lot more complex, especially if it happens a lot. The system can become more difficult to support and maintain because of the many different code paths caused by the combinatorial explosion of complexity.

It's incredibly important to make sure that the team minimizes the amount of time that multiple versions of the same functionality need to be supported in production.
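The arithmetic behind that combinatorial explosion is easy to sketch: if each capability can be served by any of its live versions, the number of distinct combinations a request might pass through is the product of the version counts. In the example below, the capability names and version labels are hypothetical and chosen only for illustration.

```python
from itertools import product
from math import prod


def count_code_paths(live_versions: dict) -> int:
    """Number of version combinations across all capabilities."""
    return prod(len(versions) for versions in live_versions.values())


def enumerate_code_paths(live_versions: dict) -> list:
    """List every combination explicitly, one tuple per possible path."""
    return list(product(*live_versions.values()))


# Hypothetical system with three migrations in flight at once.
DURING_MIGRATIONS = {
    "billing": ["v1", "v2"],
    "auth": ["v1", "v2"],
    "messaging": ["rabbitmq", "kafka"],
}

# The same system after two of the three migrations are finished.
AFTER_CLEANUP = {
    "billing": ["v2"],
    "auth": ["v2"],
    "messaging": ["rabbitmq", "kafka"],
}

if __name__ == "__main__":
    print(count_code_paths(DURING_MIGRATIONS))  # 2 * 2 * 2 = 8
    print(count_code_paths(AFTER_CLEANUP))      # 1 * 1 * 2 = 2
```

Finishing two of the three migrations cuts the combinations to support from eight to two, which is why shortening the overlap period pays off so quickly.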

As we've seen, while there are four major categories of tech debt, migrations are a special category. Large challenges from the other areas eventually become migrations when we prioritize them.

Prioritizing Technical Debt

I urge all engineering teams to incorporate the prioritization of and reduction in technical debt into their day-to-day work. This commitment requires:

  • Regular Assessment: Continuously evaluate areas of potential debt and align them with business goals and customer needs within each technical domain, and in the interface points between domains.
  • Measure and Quantify: Develop techniques to quantify the results of your assessments so that you can track progress over time.
  • Strategic Planning: Build clear plans for tackling existing debt, considering both immediate needs and long-term scalability. Plan for debt reduction as part of building new capabilities.
  • Ongoing Education: Foster a culture of learning and collaboration, where understanding technical debt is a shared responsibility. This isn't merely the responsibility of a few; it's shared by all.
  • Alignment with Business Objectives: Ensure that actions to reduce technical debt align with overall business strategies and contribute to a seamless customer experience. Just as not all debt is bad, not all technical debt needs to be addressed.
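As one possible way to approach the "Measure and Quantify" step above, a team could run a short recurring survey in which each member rates how much each of the four categories of technical debt impedes their work, then track the averages over time. The sketch below is only an illustration under stated assumptions: the 1-to-5 scale, the category keys, and the sample responses are all hypothetical, not a description of an established method.

```python
from statistics import mean

# The four categories discussed in this article, as hypothetical survey keys.
CATEGORIES = ("codebase", "pipeline_and_process", "knowledge", "migration")


def category_scores(responses: list) -> dict:
    """Average rating per category (1 = no impediment, 5 = severe)."""
    return {c: round(mean(r[c] for r in responses), 2) for c in CATEGORIES}


def trend(previous: dict, current: dict) -> dict:
    """Change in average score per category; negative means improvement."""
    return {c: round(current[c] - previous[c], 2) for c in CATEGORIES}


# Hypothetical responses from a two-person team in consecutive quarters.
Q1 = [
    {"codebase": 4, "pipeline_and_process": 2, "knowledge": 3, "migration": 5},
    {"codebase": 3, "pipeline_and_process": 2, "knowledge": 4, "migration": 4},
]
Q2 = [
    {"codebase": 3, "pipeline_and_process": 2, "knowledge": 3, "migration": 5},
    {"codebase": 2, "pipeline_and_process": 2, "knowledge": 3, "migration": 4},
]

if __name__ == "__main__":
    q1_scores = category_scores(Q1)
    q2_scores = category_scores(Q2)
    for category, delta in trend(q1_scores, q2_scores).items():
        print(f"{category}: {q1_scores[category]} -> {q2_scores[category]} ({delta:+})")
```

The absolute numbers matter less than the direction of travel: a score that keeps rising in one category is a signal about where to focus debt reduction next.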

By embracing these principles, at AppDirect we're not just building the #1 subscription commerce platform; we're crafting resilient, efficient systems that can adapt and thrive.

Your Turn: How Are You Tackling Technical Debt?

We've explored the complexities and challenges of identifying and managing technical debt. Now, I'd love to hear from you. How is your organization prioritizing and tackling technical debt? Are you using specific metrics or methodologies that you've found effective? Share your insights and strategies in the comments below. Together, let's build a more resilient and efficient software engineering culture.

Kathy Hadizadeh

Empowering IT, Product & Engineering Leaders: Elevate your leadership, amplify your impact, and secure the promotions you deserve without burnout. Discover LeaderSHIFT's transformative approach within weeks | Speaker

1 year ago

Kudos on the article! Technical debt indeed is a slippery slope. Understanding its nuances helps in better project management and ultimately, a more robust engineering culture. Sharing your insights is like a lighthouse in the fog...much appreciated! @Mathew Spolin

Joseph (Joe) Mardini

Managing Partner at Empower Industries

1 year ago

fantastic article Mathew Spolin i am very far from being a sotware engineer and to be honest may not fully understand the article but i have a better grasp of what Technical Debt is and its impact. Well done!!!

Kevin Connelly

{Accelerating Application Development w/MongoDB}

1 year ago

Knew it was going to be a great piece when you started with a LEGO analogy. Mathew Spolin Respect tech debt - and pay down the principal. Mark Porter wrote an interesting article in 2021 about the "innovation tax" and how applications are the "currency of the new economy." Very insightful!

Daniele Fusi

Fusi-On Engineering Group - Manufacturing support in Asia

1 year ago

Great topic Matthew, and one of top problems afflicting tech companies. Do you think it applies to hardware as well?

Stefanie Lingle Beasley

Partner at Beasley & McCusker Communications

1 year ago

Insightful piece Mathew Spolin
