The Schema is not the Model: Some thoughts on software patterns.
Hi, I'm Lennard, a building-architect turned software-builder. These are my notes as I relate the world of software design to my experience in architectural design. For the past few weeks, I've been working through the DB and application layer for a piece of enterprise software. These are some thoughts from that experience.
TLDR
When using a layered approach to application design, domain models and database schemas serve fundamentally different purposes. Collapsing them can lead to grim consequences. But separating them leads to the next question: what comes first, the model or the schema? Here's my take, informed by the many giants who walked this path before me.
The Parti
In architectural design, we have a term: 'parti'. It's an essential diagram capturing the main relationships of spaces in a building: public vs private, serving vs served, open vs closed. Having a clear and sensible parti is fundamental to good design. It allows sensible flexibility and guides sound decision-making across the team.
Similarly, for any piece of software, how a team treats the relationship between domain models and database schemas is a foundational question. The answer shapes the long-term adaptability and robustness of a project.
The initial instinct to treat domain models and data schemas as overlapping concepts can quickly prove to be an anti-pattern. This seemingly innocent conflation, which appears to follow the 'Keep It Simple, Stupid' principle, tends to produce maintenance nightmares and rigid architectures poorly suited to the way software needs to evolve.
The Anti-Pattern Example: Model and Schema Overlap
"You reach for the Banana, You get the Gorilla"
Consider a trading platform handling stocks, bonds and options. If your team falls into the schema-model coupling trap, your solution often ends up as one wide table and one model class that mirrors it column for column.
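A minimal sketch of what that coupling can look like in Python. The `Instrument` class, its field names and the single `instruments` table it mirrors are all hypothetical, invented here purely for illustration:

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Instrument:
    """One class mirroring one wide 'instruments' table (hypothetical schema)."""
    id: int
    instrument_type: str  # "stock", "bond" or "option"
    symbol: str
    price: float
    # Bond-only columns: NULL for every other row.
    coupon_rate: Optional[float] = None
    maturity_date: Optional[date] = None
    # Option-only columns: NULL for every other row.
    strike_price: Optional[float] = None
    expiry_date: Optional[date] = None

    def validate(self) -> List[str]:
        # Business rules end up branching on a type column,
        # and every new instrument type adds another branch.
        errors = []
        if self.instrument_type == "bond" and self.coupon_rate is None:
            errors.append("bond requires coupon_rate")
        if self.instrument_type == "option" and self.strike_price is None:
            errors.append("option requires strike_price")
        if self.instrument_type == "stock" and self.strike_price is not None:
            errors.append("stock must not have strike_price")
        return errors
```

Nothing in the type itself stops a stock from carrying a strike price. The rule only exists inside `validate`, and any change to the table shape ripples straight into the business logic.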
At first glance, this approach seems elegant in its simplicity. One table, one model, done. However, this apparent simplicity masks several serious problems that emerge as the system grows.
A Solution: Domain Classes and Persistence, Relationship-Ignorant Persistence
You stir clockwise to mix the cake batter. You stir anti-clockwise to unmix it. Tada.
As in baking, once things are mixed they are hard to un-mix. So instead of coupling models and schemas, it makes sense to set clear boundaries early, before things get too mixed up, and split the design into two layers:
Domain Layer
Persistence Layer
In our trading system example, this means having distinct models for different instrument types, each encapsulating its own pricing logic, validation rules, and risk calculations. These models don't need to know anything about how they're stored. They're pure expressions of business concepts and operations.
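As a rough sketch of such storage-agnostic models, assuming three hypothetical classes (`Stock`, `Bond`, `CallOption`) with deliberately simplified formulas that stand in for real pricing and risk logic:

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Stock:
    symbol: str
    price: float

    def market_value(self, quantity: int) -> float:
        return self.price * quantity


@dataclass(frozen=True)
class Bond:
    symbol: str
    face_value: float
    coupon_rate: float  # annual rate as a fraction, e.g. 0.05
    maturity: date

    def __post_init__(self) -> None:
        # Validation lives with the concept it protects.
        if not 0 <= self.coupon_rate <= 1:
            raise ValueError("coupon_rate must be a fraction between 0 and 1")

    def annual_coupon(self) -> float:
        return self.face_value * self.coupon_rate


@dataclass(frozen=True)
class CallOption:
    underlying: str
    strike: float
    expiry: date

    def intrinsic_value(self, spot: float) -> float:
        # Simplified: value if exercised now, never below zero.
        return max(spot - self.strike, 0.0)
```

None of these classes imports a database driver or mentions a table. A `Bond` cannot exist without a coupon rate, and a `Stock` has no strike price field to misuse.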
Each of these classes has a bounded context and "owns" several tables within the DB. Everything within the scope of those tables has to be updated as a batch, because the tables work together. This provides a clear scope for concurrency. To connect the layers, employ the repository pattern, implemented with an Object-Relational Mapping (ORM) tool. The repository acts as a translator, mapping our domain objects to simpler database structures.
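The repository idea can be sketched as follows. To keep the example runnable without extra dependencies, it swaps the ORM for hand-written mapping over Python's built-in `sqlite3` module. The `Portfolio` aggregate, the `SqlitePortfolioRepository` class and both table names are hypothetical:

```python
import sqlite3
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Portfolio:
    """Domain object that owns its positions (symbol -> quantity)."""
    portfolio_id: str
    positions: Dict[str, int] = field(default_factory=dict)

    def buy(self, symbol: str, quantity: int) -> None:
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        self.positions[symbol] = self.positions.get(symbol, 0) + quantity


class SqlitePortfolioRepository:
    """Translates Portfolio objects to and from two plain tables."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        with self.conn:
            # No foreign keys: the relationship is managed in the application.
            self.conn.execute(
                "CREATE TABLE IF NOT EXISTS portfolios (id TEXT PRIMARY KEY)"
            )
            self.conn.execute(
                "CREATE TABLE IF NOT EXISTS positions "
                "(portfolio_id TEXT, symbol TEXT, quantity INTEGER)"
            )

    def save(self, portfolio: Portfolio) -> None:
        # One transaction: both tables change together or not at all.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO portfolios (id) VALUES (?)",
                (portfolio.portfolio_id,),
            )
            self.conn.execute(
                "DELETE FROM positions WHERE portfolio_id = ?",
                (portfolio.portfolio_id,),
            )
            self.conn.executemany(
                "INSERT INTO positions (portfolio_id, symbol, quantity) "
                "VALUES (?, ?, ?)",
                [(portfolio.portfolio_id, s, q)
                 for s, q in portfolio.positions.items()],
            )

    def get(self, portfolio_id: str) -> Portfolio:
        row = self.conn.execute(
            "SELECT id FROM portfolios WHERE id = ?", (portfolio_id,)
        ).fetchone()
        if row is None:
            raise KeyError(portfolio_id)
        rows = self.conn.execute(
            "SELECT symbol, quantity FROM positions WHERE portfolio_id = ?",
            (portfolio_id,),
        ).fetchall()
        return Portfolio(portfolio_id, dict(rows))


repo = SqlitePortfolioRepository(sqlite3.connect(":memory:"))
portfolio = Portfolio("desk-1")
portfolio.buy("ACME", 10)
repo.save(portfolio)
loaded = repo.get("desk-1")
```

The domain class never sees SQL, and the tables never see business rules. Swapping the hand-written mapping for a real ORM would change the repository's internals but not the `save` and `get` interface the rest of the application depends on.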
In simple terms, we've moved relationship management to the application layer. This architectural decision unlocks significant benefits:
By offloading relational complexities from the database, you achieve a clean separation of concerns. As a result, the database is freed up for speed and stability, while the application layer stays flexible enough to absorb changes in business logic as the solution seeks product-market fit.
(For more context on this, I recommend the excellent book "Architecture Patterns with Python" by Bob Gregory and Harry Percival.)