Data Should Be Immutable in Any Modern Architecture
David Strickland
Software Engineering Leader Specializing in Legacy Codebase Transformation & Team Revitalization
One of the most transformative concepts Data Mesh presented to me was the notion that data is inherently immutable. Like first grasping recursion or the power of code generation, this principle fundamentally alters your perspective as a developer once it is fully understood. At first glance it may appear counterintuitive, since the language of "changing data" is ubiquitous in the field. In practice, however, the underlying data remains unaltered; what changes are the aggregates, the abstractions derived from the original data.
The Concept of Immutability
Consider a seemingly straightforward example, such as an address. On average, individuals relocate approximately every seven years, leading us to perceive this as a change in address. However, the historical fact that a client resided at Address A from Date 1 to Date 2, and subsequently relocated to Address B, remains incontrovertibly true. This historical data persists irrespective of changes. While a business might primarily focus on the current address, this focus represents an abstraction defined by specific business rules, rather than an alteration of the original data itself. The underlying truth—the chronological sequence of addresses—remains unchanged. Even if the historical information is not recorded and only the current address is retained, the fundamental reality does not change. In this respect, data is immutable.
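To make the distinction concrete, here is a minimal Python sketch (the names and dates are hypothetical, chosen only for illustration) contrasting the immutable history of residencies with the "current address" aggregate derived from it:

```python
from dataclasses import dataclass
from datetime import date

# Each residency is an immutable historical fact: the client lived at
# this address over this interval. The record is never updated in place.
@dataclass(frozen=True)
class Residency:
    address: str
    moved_in: date
    moved_out: date | None = None  # None marks the current residence

# The full history is the underlying data.
history = [
    Residency("Address A", date(2010, 3, 1), date(2017, 6, 15)),
    Residency("Address B", date(2017, 6, 15)),
]

# "Current address" is not a mutation of that data; it is an abstraction
# derived from the history by a business rule.
def current_address(history: list[Residency]) -> str:
    return max(history, key=lambda r: r.moved_in).address
```

Deriving the current address from the history, rather than overwriting a single field, is what keeps the underlying facts intact.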
Developer Decisions and Data Preservation
As a developer, I often tried to convey to Product Owners and Managers that I was continually making decisions that significantly impacted their business. Among these decisions was the determination of which data to preserve, which to aggregate, and which to discard. Consider a scenario involving a user interaction, such as clicking a button. Do we retain all contextual information surrounding that action, or do we simply execute the action itself, such as deleting an account and purging all associated records from the database?
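What "retaining the contextual information" might look like is sketched below; the field names are hypothetical, not a prescribed schema:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# One user interaction, captured together with its surrounding context.
# Whether this record is persisted, aggregated, or discarded is a
# developer decision with real consequences for the business.
@dataclass(frozen=True)
class InteractionEvent:
    user_id: str
    action: str        # e.g. "clicked_delete_account"
    occurred_at: datetime
    context: dict      # page, session, preceding actions, and so on

event = InteractionEvent(
    user_id="user-42",
    action="clicked_delete_account",
    occurred_at=datetime.now(timezone.utc),
    context={"page": "/settings", "session_id": "abc123"},
)
```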
User Story Example
User Story: As a user, I want to delete my account so that my data is removed from the system.
When a user story states, "As a user, I want to delete my account," should we track every preceding action that led to the deletion, or simply execute a cascading delete operation keyed on the user ID? Whether or not we opt to record these events, the facts they represent (the button click, the account deletion, the temporal sequence of actions) remain immutable. Choosing not to preserve them does not change what happened; it only means we can no longer recover it. The contrast between the two approaches is sketched below.
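This is a hedged sketch of the two options: `db` stands for a hypothetical DB-API-style connection, and a plain in-memory list stands in for the event log.

```python
# Hard delete: the aggregate is removed, and with it the history.
def hard_delete(db, user_id: str) -> None:
    # Hypothetical connection; cascading deletes wipe the related rows.
    db.execute("DELETE FROM accounts WHERE user_id = ?", (user_id,))

# Event-based delete: append one more immutable fact to the log and let
# downstream projections stop showing the account.
def event_delete(event_log: list[dict], user_id: str,
                 occurred_at: str) -> None:  # occurred_at: ISO timestamp
    event_log.append({
        "type": "AccountDeleted",
        "user_id": user_id,
        "occurred_at": occurred_at,
    })
```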
Event Sourcing and Data Integrity
The concept of Event Sourcing has gained increasing prominence, as it brings us closer to the goal of preserving all data in its original, unaltered form. Event Sourcing involves capturing state changes as a sequence of immutable events, which can then be replayed to reconstruct the current state of a system. Rather than merely updating an address, we now record that Person X declared their new address as Y at Time Z—an event that is retained indefinitely. To determine an individual's current address, one need only identify the most recent "New Address" event.
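A minimal sketch of this idea in Python, with hypothetical names and an in-memory list standing in for a durable event store:

```python
from dataclasses import dataclass
from datetime import datetime

# Person X declared their new address as Y at time Z: an event that is
# appended once and retained indefinitely.
@dataclass(frozen=True)
class AddressDeclared:
    person_id: str
    address: str
    declared_at: datetime

# The event log is append-only; events are never modified or removed.
event_log: list[AddressDeclared] = []

def declare_address(person_id: str, address: str, at: datetime) -> None:
    event_log.append(AddressDeclared(person_id, address, at))

# State is reconstructed from the log: the current address is simply
# the most recent declaration for that person.
def current_address(person_id: str) -> str | None:
    declarations = [e for e in event_log if e.person_id == person_id]
    if not declarations:
        return None
    return max(declarations, key=lambda e: e.declared_at).address
```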
The primary challenge associated with Event Sourcing, however, is the coexistence of legacy and contemporary data paradigms. Event Sourcing is inherently temporal, representing a series of events in chronological order. When systems are initially constructed to store aggregates, rather than the complete sequence of events, transitioning to an event-sourced architecture becomes highly complex. The legacy aggregate data must be treated as the definitive source of truth unless superseded by more recent events, thereby creating a dual state that complicates the adoption process.
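One common way to cope with that dual state is a read path that prefers events but falls back to the legacy aggregate. A minimal sketch, assuming the legacy store is a simple mapping and events are dictionaries:

```python
# During a migration, a read must consult both worlds: events recorded
# after the cutover supersede the legacy aggregate; in their absence,
# the legacy aggregate remains the source of truth.
def current_address(person_id: str,
                    legacy_addresses: dict[str, str],
                    event_log: list[dict]) -> str | None:
    declarations = [e for e in event_log if e["person_id"] == person_id]
    if declarations:
        # The newest event wins over anything in the legacy store.
        return max(declarations, key=lambda e: e["declared_at"])["address"]
    # No events yet for this person: fall back to the legacy aggregate.
    return legacy_addresses.get(person_id)
```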
The Costs and Benefits of Immutability
While continuing to rely on aggregate data may seem more straightforward, the costs of this approach are becoming increasingly prohibitive. With the advent of machine learning and the data mesh paradigm, the intrinsic value of raw, historical data is becoming more apparent. A compelling argument now exists for maintaining legacy aggregates while simultaneously capturing every subsequent event. Event Sourcing does involve managing an enormous volume of data, which requires substantial computational power to query, process, and store. However, we are entering an era in which the costs of computation and storage are diminishing, mitigating these barriers.
Simultaneously, advances in machine learning are providing the means to leverage this comprehensive dataset to develop sophisticated predictive models. As data becomes increasingly recognized as a product, we are better positioned to store and analyze it in its entirety. If current trends continue, the costs of storage and processing may become largely inconsequential within the next five to ten years. Yet, because data is immutable, failing to capture it today results in a permanent loss.
Conclusion
The significance of understanding data as immutable cannot be overstated. It compels developers and organizations to fundamentally reconsider data architecture, Event Sourcing, and the enduring value of information. By adopting this perspective, we position ourselves to fully harness the potential of a data-driven future.