A Technical Debt Fairy Tale
Generated by MidJourney, prompt: software architecture diagram in decline, looking like a fairy tale

Once upon a time, there was a lead developer called Annabel. She worked for vakation.com, a travel booking site. She was the tech lead of the DevOps team that maintained the web-app front-end. Springtime was upon the land, for ’twas April, a busy time, when people were starting to book their summer holidays. But sales were disappointing. Conversion of visits into actual bookings was getting worse and worse.

One morning Ramon, the team’s business owner, requested an urgent gathering: they had found one of the root causes of the problem that was plaguing them. “Our site does not show visitors whether the accommodations they are booking have facilities to cast streaming media from their phones to the TV in the room,” he said, a frown upon his face. “It turns out that many of our competitors now do show that information, and clients end up booking their travel on other sites. So we need to add the media casting info to the vakation.com website with great speed and utmost alacrity!”

Annabel turned to Edwin, who headed up the hotel reservation back-end system, and asked him whether the media casting info was available in the back-end system. Edwin smiled, for he had good news: the latest hotel booking communication standard included the media casting information! It was already present in their accommodations database. Vakation.com used an Enterprise Service Bus (ESB) to connect the various systems in their landscape, so Annabel asked Edwin to expose the desired info on the ESB for the web-app front-end to display. Seeing the look of joyful expectation on Annabel’s face, Edwin was happy to oblige. But things were not as merry as they seemed – for at that time, Edwin’s team had quite a full backlog: it looked like they wouldn’t have time for the change until June.

Now Annabel started to get worried: fearing Ramon’s wrath, she didn’t want to go back to him and tell him that she wouldn’t be able to fix the website until June. But she soon cheered up, for due to a stroke of fortune, the web-app front-end was on the same physical database as the hotel reservation back-end system, and Annabel’s team of diligent developers knew the table structure. So Annabel decided to temporarily ignore the company’s ESB policy and obtain the media casting information directly from the accommodations database, taking on some technical debt. Her mind was firmly set on refactoring the temporary fix as soon as the info was exposed on the ESB, hopefully in June.

Two months went by, after which Edwin perused his backlog, and lo and behold: he spotted the story card with Annabel’s request. With some trepidation in his heart, he visited Annabel in her lair, and asked her: “Fair Annabel, is this story still needed? Some other, more urgent stories have popped up. Would it be terrible if we pushed back the media casting story a few more sprints? After all, things are working now, and nobody is complaining.” Failing to reach agreement, Edwin and Annabel decided to ask Ramon, the product’s business owner. But to their great surprise, Ramon had trouble remembering that the fix was temporary, and was not at all interested in prioritizing the refactoring.

Days, weeks and months passed by. Eventually, it took until November before the info was finally accessible through the ESB. In the meantime, it turned out that some new members of Annabel’s team had been copying the method of accessing data in the hotel reservation system directly. Having run into Annabel’s temporary fix in the code, they felt comfortable using the same method – this time without even asking Edwin’s team. The cursed technical debt had multiplied itself! This had already led to the website breaking down after Edwin’s team had made some changes to their table structure, unaware that it was accessed directly by other teams. These outages had caused Ramon to wax angry and scold both teams: “Why did you allow things to become so bad? You should have managed that better!”

As we close the magical book on this horror story, there are some questions we may ask ourselves:

  • Was Annabel’s temporary fix to access Edwin’s team’s database directly a wise decision? After all, the company would have lost a lot of revenue if she had waited two months and made the fix the “right way”.
  • What about Edwin’s decision to postpone the exposure of the media casting data on the ESB? Doesn’t it make sense to prioritize by business value?
  • Was Ramon right to be upset with the teams? After all, he was completely uninterested in fixing the shortcut while it wasn’t yet causing any problems.
  • Code scanning tools like SonarQube claim to detect technical debt, but would such tools have flagged this temporary fix as technical debt?
  • What could Annabel, Edwin and Ramon have done to prevent the problems?


I look forward to reading your answers!

Disclaimer: all characters and organizations in this story are fictitious, and any resemblance to real persons or companies is purely accidental.

Dr Melina Vidoni

Engineering Leader | Building Strong, Empowered Teams | Equity Advocate | Tackling Backend Complexities & Driving Technical Success

1 year ago

I think Gartner's definition of technological debt would have fit better here, since that is the debt and cost associated with continuing to do business in a software-factory/software-dependent environment. The issue is never "just the devs" (as the original definition implied) but every stakeholder with decisions affecting the software. TechDebt is _always_ a business risk, not a developers' problem on its own. Excellent article!

Even though I am from the embedded world, it is interesting to read your story. If her team had understood systems thinking, and it seems they had the time, then they could have created the necessary APIs and published them on the ESB themselves. To me it looks like a poor architectural decision to go for a centrally managed ESB in a service-oriented landscape. We are having similar problems in the embedded automotive industry, where we rely heavily on CAN and LIN communication links, and as of today they require a centrally managed "signal database" where you prewire signals (payload) into link frames. That will always be the bottleneck.

Christos Charalampous

Senior Full-stack Software Engineer at Relex Solutions

1 year ago

Isn’t the problem lack of visibility that there is an issue here? From the description it seems like people thought that “things are working fine” and were comfortable copying and reusing a solution, so they thought the solution was “right” and it was okay to copy and reuse. So there was no indication, metric, comment, documentation, implementation detail, (etc) that would point to the issue here. If there were, the whole team could have been accountable for it (which I find more powerful than assigning a single responsible person to this).

I love this story! Thank you for posting it. But this is actually not the worst case. At least Annabel knew that she was taking on debt and planned to do something about it. But debt often accumulates without developers even being aware that they are doing anything wrong. After all (says the blithe developer): I can see that class, and I know the method signature, so I can just insert a call to it (never mind that there was an abstraction interface that I should have used). Each of these changes to the source code is innocuous in isolation, but they accumulate until the overall structure is completely eroded. For example, I once reverse engineered the "layered structure" of HDFS. It turns out that the layers were mostly in the minds of a few people, and did not exist in practice. In practice it was a big ball of random connections. That, I think, is where the biggest problems originate.
