Insights From Software Architecture: A Tale of Two Systems
A colleague understands a new system using his software architecture skills, and I try to figure out how he did it.
Part 1 (How is a Software Architect Like a Football Coach?) introduced the idea that developers can reason about the behavior of large-scale, complex software systems using high-level abstractions drawn from software architecture principles.
A Tale of Two Systems
A former coworker once recounted a project where her team’s enterprise system was getting dangerously close to exceeding its capacity. The large, expensive Windows servers they ran it on were fast enough to handle the load, but some tasks were exceeding the limits of what could fit in a single Win32 process. So they undertook a major overhaul of the system’s architecture to make it more scalable.
The new system had a different architecture, but functionally it was intended to be equivalent. Their strategy was to deploy the two systems in parallel so that any discrepancy in their outputs could be tracked down and fixed.
They expected that the new system’s state-of-the-art highly-scalable architecture would significantly outperform the old system’s more primitive and over-extended architecture. To their surprise, when they measured the performance in production, the new architecture was giving them no performance boost at all. In fact, the new system was significantly slower.
What happens next? Panic? Finger pointing?
My first thought as a software engineer was that somebody must have done something wrong. Another coworker, who has a background in software architecture, heard the same story, but his reaction was different. He said that of course the new system was slower. What?!
I’m starting to notice that it is quite useful for a software engineer to know a bit about software architecture.
First, the long answer for why he was right.
Here’s a sample piece of the system. The program’s task was to read blocks of data from input files, add together all blocks that are the same type (i.e. same color in the diagram), and write the combined blocks to the output file.
The output file was opened as a memory-mapped file, which means the program interacted with the file as if it were just a big block of virtual memory. What’s cool about a memory-mapped file is that it takes the same hardware-accelerated machinery that makes virtual memory fast and uses it to speed up reading and writing a file.
And this was the problem. In 32-bit Windows, a process gets at most 2 GB (or 3 GB with a boot-time switch) of user address space, and a memory-mapped view has to fit into a single contiguous stretch of whatever part of it is still unused. The bigger the output file gets, the harder that contiguous space is to find.
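To make the old design concrete, here is a minimal sketch in Python rather than the original Win32 code. The fixed block size, the first-byte type tag, and the toy byte-wise "addition" are all assumptions for illustration; the article does not describe the real formats. The point to notice is that the whole output mapping must fit in one contiguous run of the process’s address space.

```python
import mmap
from collections import defaultdict

RECORD_SIZE = 4096  # assumed fixed block size; the article gives no real sizes


def combine_blocks(input_paths, output_path):
    """Sum all input blocks of the same type and write each combined block
    into a memory-mapped output file (the old, single-process design)."""
    combined = defaultdict(lambda: bytearray(RECORD_SIZE))

    for path in input_paths:
        with open(path, "rb") as f:
            while True:
                block = f.read(RECORD_SIZE)
                if len(block) < RECORD_SIZE:
                    break
                block_type = block[0]  # assume the first byte tags the block type
                acc = combined[block_type]
                acc[0] = block_type
                for i in range(1, RECORD_SIZE):
                    acc[i] = (acc[i] + block[i]) % 256  # toy stand-in for "add together"

    # The entire output file is mapped into the process's address space.
    # On 32-bit Windows that mapping must fit into one contiguous run of
    # free virtual address space, which is the limit described above.
    output_size = len(combined) * RECORD_SIZE
    with open(output_path, "w+b") as f:
        f.truncate(output_size)
        with mmap.mmap(f.fileno(), output_size) as out:
            for slot, (block_type, acc) in enumerate(sorted(combined.items())):
                out[slot * RECORD_SIZE:(slot + 1) * RECORD_SIZE] = bytes(acc)
```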
In the new architecture, the program hadn’t changed much in the way it worked, except now the input and output “files” weren’t really files anymore. Instead of writing to a memory-mapped file, it now wrote through a remote procedure call API to a clustered in-memory data grid. The data was stored on another set of computers.
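Here is the same write path under the new architecture, again as a hedged sketch: GridClient, its put() method, and the host names are illustrative stand-ins, since the article never names the actual data grid product or its API.

```python
class GridClient:
    """Illustrative stand-in for a clustered in-memory data grid client;
    the article does not name the real product or its API."""

    def __init__(self, hosts):
        self.hosts = hosts          # e.g. ["grid-node-1:5701", "grid-node-2:5701"]
        self._stub_store = {}       # local stand-in; a real client holds no data locally

    def put(self, key, value):
        # A real put() serializes the value and sends it over the network to
        # whichever cluster node owns the key's partition. That per-write
        # round trip is a cost the memory-mapped design never paid.
        self._stub_store[key] = value


def write_combined_blocks(grid, combined):
    # Same combining output as before, but every "write to the output file"
    # is now a remote call instead of a store into mapped memory.
    for block_type, acc in combined.items():
        grid.put(("combined", block_type), bytes(acc))
```

The combining logic is essentially unchanged; what changed is that every write now pays for serialization and a network round trip.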
Well, if you put it that way… I see why it makes sense that it’s slower. So what were the system architects thinking?
Remember that the original problem was scalability. Until now, if the program ran too slowly, they solved the problem by buying a bigger server, a so-called “scale up” tactic. That worked until they began to hit other barriers. To get beyond those, they needed to change the architecture to something that provided what the old architecture was lacking: the ability to scale beyond what can fit inside one machine. But that ability to scale comes at a price in performance.
So let’s say that the old software could finish the job with a million records in 30 minutes, but the new architecture running on the same machine takes an hour. Is that a problem? What about if the job grows to 2 million records and takes 2 hours on the new architecture, whereas with the old software you could finish the job in… NEVER. You couldn’t do a job that size on the old software. It would simply abort with a memory allocation error. Now that’s a problem.
But what if 2 hours is a problem? The new architecture allows for that problem to be solved in a way that the old architecture couldn’t directly support, and that is to add more computers, a so-called “scale out” tactic.
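For completeness, a tiny illustration of the “scale out” idea: if block types are hashed across worker nodes, each node combines only its own share, and capacity grows by adding nodes. The node names, counts, and hash choice here are purely illustrative, not from the article.

```python
import zlib

# Illustrative cluster: add nodes to add capacity.
NODES = ["worker-1", "worker-2", "worker-3", "worker-4"]


def owner_node(block_type: bytes) -> str:
    """Hash a block type to the node responsible for combining it."""
    return NODES[zlib.crc32(block_type) % len(NODES)]


print(owner_node(b"blue"))   # every "blue" block lands on the same worker,
print(owner_node(b"green"))  # so each worker combines its own types independently
```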
Architecture Insight
I could see the logic once it was explained, but how did the developer with the software architecture training know the answer right away, when I had to work through all this detail before I could see it too?
He was looking past the details to see underlying patterns that were familiar to him, and once he could see the patterns, he could discard the irrelevant detail and just reason about the patterns in the abstract.
This kind of insight is valuable for all developers. If the rationale for all the architecture design decisions is to increase the system’s scalability, then you can make sure that your smaller design and implementation decisions remain consistent with that goal.
See part 3 (Developer Happiness).
Thanks for reading. Please like and share. You can find my previous LinkedIn articles here.