Simplicity in Distributed Complexity
We all want our work to be simple. Simplicity makes work easier, faster, and less stressful; it eases learning curves and carries all sorts of other benefits. It can be a major goal in the workplace, and it certainly has been one in my experience building cloud computing platforms. Now for the big question: how do we get that?
We don't. Between business logic and technical needs, if you are doing something fundamentally complicated, there is no simple way to do it. If you can make money doing only simple stuff, then you are smart, lucky, and exceptional, and you don't need this article. So how do we get the benefits of simplicity anyway? We distribute complexity so that for any one task, we are only dealing with a small part of it.
Domain Divisions
The first approach is to look at your business model and processes, break them down into well-defined, distinct parts, and follow that breakdown in implementation. Whether you are using modules, separate services, or just namespacing things, this is part of Domain Driven Design, and I have described how to separate systems this way here. While certainly valuable and often even vital, there are other important divisions to consider.
Within domain divisions, you have both business domains and technical domains. For example, you can treat management of payments and a general-purpose tool that lets you handle thousands of transactions per minute as separate domains. In this article, we will concentrate on the division of work across four technical layers:
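As a minimal single-file sketch of that business/technical split (all names here are invented for illustration), the business domain owns the rules and the technical domain owns the mechanics, meeting only at one narrow call site:

```python
from dataclasses import dataclass
from typing import Callable, List

# --- technical domain: generic high-throughput batching, no business rules ---
def process_in_batches(items: List, handler: Callable, batch_size: int = 100) -> None:
    """Feed items to a handler in fixed-size batches."""
    for start in range(0, len(items), batch_size):
        handler(items[start:start + batch_size])

# --- business domain: payments, no batching mechanics ---
@dataclass
class Payment:
    account: str
    amount_cents: int

def settle(batch: List[Payment]) -> None:
    """Business rule: settle each payment (stubbed here)."""
    for p in batch:
        print(f"settling {p.amount_cents} cents for {p.account}")

if __name__ == "__main__":
    payments = [Payment("acct-1", 500), Payment("acct-2", 1250)]
    # The only coupling between the two domains is this call site.
    process_in_batches(payments, settle, batch_size=100)
```

A task in either domain rarely requires reading the other, which is the point of the division.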
Application Layer
Here we implement the logic specific to our uses. There is extensive literature on best practices here, but one split does not get enough attention: complexity can live in functions or in the relations between them. There is such extensive work on making individual functions simple that linters have length and complexity limits built in. Taking it too far, however, produces illegible logic flows that run through many separate modules, delegations, and methods, inconvenient or impossible to display together, so you cannot see the overall process. That drives illegibility, duplicated work, and more bugs and other trouble. To be clear, even if each function is simple, nothing gets better if that simplicity is achieved only by making more functions to juggle. It is widely known that things should not go too far the other way either: logic coupled more closely in implementation than in the underlying model leads to the exact same pain points.
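As a hedged illustration (the function names are invented for this sketch), here is a checkout flow fragmented so far that no single screen shows what actually happens, even though every function passes any linter:

```python
# Anti-pattern sketch: each function is trivially simple, but the
# overall checkout flow is scattered across five hops.

def checkout(cart):
    return _begin(cart)

def _begin(cart):
    return _validated(cart)

def _validated(cart):
    if not cart:
        raise ValueError("empty cart")
    return _priced(cart)

def _priced(cart):
    total = sum(item["price"] for item in cart)
    return _finish(cart, total)

def _finish(cart, total):
    return {"items": len(cart), "total": total}

print(checkout([{"price": 5}, {"price": 7}]))  # {'items': 2, 'total': 12}
```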
You can strike a good balance between well-organized, DRY functions and simple flows by working from the business model or desired UX. If you understand those, you will see when some logic only ever arises in the context of some other logic. If two steps only ever come as a package in the underlying logic, there is rarely a reason to separate them into different methods. The same goes for actors in the business logic: if two only ever come as a unit, they are not two separate actors requiring separate classes. A good rule is that, unless technical requirements demand otherwise, the coupling in the code should match the coupling in the business logic or the users' interactions.
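Applying that rule to the fragmented sketch above, and assuming validation and pricing only ever occur together during checkout, the same flow reads better as one method whose steps mirror the business process:

```python
def checkout(cart):
    """One legible flow: validate, price, summarize.
    These steps only ever occur together in the business process,
    so splitting them into separate methods buys nothing."""
    if not cart:  # validate
        raise ValueError("empty cart")
    total = sum(item["price"] for item in cart)  # price
    return {"items": len(cart), "total": total}  # summarize

print(checkout([{"price": 5}, {"price": 7}]))  # {'items': 2, 'total': 12}
```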
Tools Layer
Outside the immediate business logic, there is typically a layer of tools, including libraries (or packages, gems, crates, or whatever they are called). Tools also include CI/CD pipelines and other elements with well-defined responsibilities that need not be discussed here. Using libraries carries risks: you do not know exactly what you are putting into your system nor how it will interact with other software, and libraries are harder to modify than code you completely control. On the other hand, unlike homebrew tools, they are often already production-ready, well-documented, optimized, clearly separated from your business logic, and already known by some new team members. Libraries can also be vectors for supply-chain attacks and sources of deployment difficulty, but those issues can be addressed by verifying and caching versions, then drawing from your cache when deploying. Third-party services mostly have the same issues and benefits, but more so: aside from security concerns, they are supported according to their Service Level Agreements, generally cannot be modified or extended for idiosyncratic needs at all, and take on liability for errors or leaks that would be costly to users.
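A minimal sketch of that verify-and-cache mitigation, assuming a local artifact cache and a hand-maintained pin list (both hypothetical):

```python
import hashlib
from pathlib import Path

# Hypothetical pin list: artifact filename -> expected SHA-256 digest.
PINNED = {
    "somelib-1.4.2.tar.gz":
        "0000000000000000000000000000000000000000000000000000000000000000",
}
CACHE_DIR = Path("vendor_cache")  # hypothetical local artifact cache

def verify_cache() -> None:
    """Refuse to deploy if any cached dependency drifts from its pin."""
    for name, expected in PINNED.items():
        artifact = CACHE_DIR / name
        if not artifact.exists():
            raise RuntimeError(f"{name}: missing from cache")
        digest = hashlib.sha256(artifact.read_bytes()).hexdigest()
        if digest != expected:
            raise RuntimeError(f"{name}: hash mismatch, possible tampering")
    print("all cached dependencies match their pins")

if __name__ == "__main__":
    verify_cache()
```

Most ecosystems offer this natively (pip's --require-hashes mode, the integrity hashes in npm and Cargo lockfiles), which is usually preferable to a hand-rolled check like this one.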
A good approach here is to consider whether each requirement is specific to your needs or an industry-wide issue. If it is a general need, then there is probably a library or third-party service handling it better than would be practical with a homebrew tool, and you do not need to risk cluttering your application with extra non-business logic. If it is specific to you, though, go homebrew instead of trying to shoehorn your needs into a dependency: your needs will change, and that inflexibility will be a problem. Also, IDPs, payment processors, and other lawsuits waiting to happen should usually not be handled internally unless that is your business. A development task then usually involves navigating libraries, third-party services, or homebrew tools for idiosyncratic needs, but not all three if each is kept to its own domain, so each task stays simple while meeting complex needs, as sketched below.
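One way to keep those worlds from bleeding into each other is a thin seam between them. This sketch uses a hypothetical third-party payments SDK to show the shape; the names are invented:

```python
from typing import Protocol

class PaymentGateway(Protocol):
    """The only payments surface the application layer ever sees."""
    def charge(self, account: str, amount_cents: int) -> str: ...

class ThirdPartyGateway:
    """Adapter over a hypothetical external processor's SDK.
    All vendor-specific types and quirks stay inside this class."""
    def charge(self, account: str, amount_cents: int) -> str:
        # The real vendor SDK call would go here.
        return f"txn-{account}-{amount_cents}"

def settle_order(gateway: PaymentGateway, account: str, cents: int) -> str:
    # Application code depends on the interface, not the vendor,
    # so a task here never requires navigating the vendor's SDK.
    return gateway.charge(account, cents)

print(settle_order(ThirdPartyGateway(), "acct-1", 500))
```

Swapping the processor, or replacing it with a homebrew tool later, then touches one adapter instead of the whole application.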
Infrastructure Layer
Here we have different elements of the platform, with many ways to split logic that both ease overall implementation and confine individual tasks to small elements. Webserver architectures run on a spectrum from Monoliths to fine-grained Microservices. In Monoliths, every feature addition involves checking for eccentricities and interference with other teams' projects. Variants such as Monorepo patterns, serverless computing, and Modular Monoliths partially mitigate those problems, but different modules typically still share the tools layer, and issues still arise. Fine-grained microservices, meanwhile, are hard to test, must be backwards-compatible for smooth deployment, and raising new services requires configuration of resources.

Databases have a similar balancing act. Running many different database technologies complicates data management, as the application must juggle them to store or assemble information, and it demands extensive cross-training. Shoehorning data into inappropriate technologies to limit the number used can create challenges in platform administration, as some have very different optimizations from others. Determining how much computing should run inside the database is a balancing act with the limits of database CPU and network input/output, neither of which is always scalable.

The front-end is no different. Running many frameworks together can create coordination challenges and demand cross-training, while shoehorning all dynamics into one can force developers to create and maintain hacky solutions and destroy performance. How much logic to manage in the front-end vs. the back-end is another balancing act, between network bandwidth and client-side CPU limits, either of which could be prohibitive depending on the user.
To break down the system and avoid make-work, I recommend separate servers for separate business domains. Beyond that, another useful separation is between webservers and purely computational servers: the logic involved in heavy computations often only loosely couples to user interactions, even when it involves the same actors, so you can split it off, use more appropriate technology and resources, and avoid navigating its logic when working on other tasks. For databases, picking technologies with broad ranges of application, but only using them within their intended ranges, strikes a good balance. I recommend a formal mapping of purposes to types of databases, selecting one technology for each, such as Redis for caching volatile data and Apache Druid for logging. That way it is clear what data live where, responsibilities are clearly divided and assigned to appropriate technologies, and niches are known to be filled, so new technologies are not needlessly introduced; a sketch of such a mapping follows below. Keeping database-side processing to tools-layer uses, and out of application logic, should help balance database CPU load against network load. I am not an expert on front-ends, but I understand a similar approach would work there too.
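A minimal sketch of such a purpose-to-technology mapping. Redis and Apache Druid come from the examples above; the third niche and its choice are assumptions for illustration:

```python
from enum import Enum
from typing import Dict

class Purpose(Enum):
    CACHE = "volatile cache"
    LOGGING = "append-heavy logs and metrics"
    TRANSACTIONS = "relational records needing ACID"  # assumed niche

# One technology per niche, agreed in advance, so nobody introduces
# a new datastore for a niche that is already filled.
DATABASE_FOR: Dict[Purpose, str] = {
    Purpose.CACHE: "Redis",
    Purpose.LOGGING: "Apache Druid",
    Purpose.TRANSACTIONS: "PostgreSQL",  # assumed choice for illustration
}

def datastore_for(purpose: Purpose) -> str:
    """The code-review question: does this data's purpose match its store?"""
    return DATABASE_FOR[purpose]

print(datastore_for(Purpose.LOGGING))  # Apache Druid
```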
Human Layer
At some level, users will step away from the platform and do whatever they need to do. In many cases, we want to make life as easy as possible and automate as much as we can, but sometimes it is better to let users judge and act for themselves outside of the platform.
Whether for reasons of liability, complex logic, or just a need for human judgement, many tasks are closely controlled by humans. Some business logic is a costly rabbit hole to automate, needs to be configured by users, or changes frequently. It can be quicker and more maintainable to document an interface and train users. Depending on needs, this can be as easy as a button labelled "Click Here to Pay", or as involved as building components for a Workflow Engine and training users to build or customize their own workflows. Internally, these cases also arise in system administration. While we like to automate scaling, self-repair, etc., DevOps and Platform Ops teams are skilled and exist for a reason. Parts of their work can be automated within the platform or with third-party services like New Relic, but attempts to build their judgment into the platform, even where it can be done, usually both fail and produce unmaintainable elements.
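As a hedged sketch of the workflow-engine end of that spectrum (all step names invented), the platform supplies vetted building blocks while users compose them, keeping their judgment outside the codebase:

```python
from typing import Callable, Dict, List

# Platform-provided, tested building blocks (hypothetical examples).
STEPS: Dict[str, Callable[[dict], dict]] = {
    "validate_order": lambda ctx: {**ctx, "valid": bool(ctx.get("items"))},
    "request_approval": lambda ctx: {**ctx, "approved": ctx.get("valid", False)},
    "notify": lambda ctx: {**ctx, "notified": True},
}

def run_workflow(step_names: List[str], ctx: dict) -> dict:
    """Execute a user-composed sequence of platform-vetted steps."""
    for name in step_names:
        ctx = STEPS[name](ctx)
    return ctx

# Users choose the sequence (their judgment), not the implementation.
print(run_workflow(["validate_order", "request_approval", "notify"],
                   {"items": ["widget"]}))
```

The platform's job ends at making each step reliable and documented; deciding when and in what order to run them stays with the humans.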
The Big Takeaway
We care about avoiding make-work and needless complications in whatever we are doing right now. That demands local simplicity, not global simplicity, which is just as well, because business and technical requirements set a minimum global complexity. Concentrating logic in one element drives up both the frequency with which we have to work with that element and the difficulty of doing so. Distributing complexity across logically distinct elements will not cut our overall work, but it will cut delays, errors, and stress. The common theme throughout is that optimization is a balancing act. Dogmatically going for microservices, minimal infrastructure, total automation, or whatever the latest buzzword is, and running to any one end of any spectrum, will have roughly the same effects: bugs, slow development, and burnout. The most important tools for achieving this balance are well-defined principles of labour-division to guide a team.
This post was motivated by repeated professional experiences and object lessons in the importance of recognizing the nuances of simplicity. I have I.T. horror stories for anyone who wants to laugh and/or cry.