Data modeling for the rest of us
If AI-driven image prompts could create an effective data model for business users to describe the data inside their organizations and identify key metrics that they wanted to track, it might look like the slightly insane diagram generated above. A magical process like this might also determine if metrics were based on counting entity data or contingent on events that happen in an environment.
As data professionals, we don’t always have a simple way to explain entity data, metrics, and events to the “typical” business user. “It depends” is a familiar answer when talking about a data problem, and the conversation usually ends only when we coalesce on a technical definition that we memorialize with a SQL query or a very specific data definition language that lets us create relationships between data objects.
We don’t have a simple process to make data modeling easier for the average business stakeholder, but maybe we need to have one, especially when the discussion of data modeling looks like the confused faces you see on the other side of a Zoom call.
Begin with the End in Mind: how do we do business?
A perfect conversation that ends in a proper data model (like the movie posters above that magically align some bubbles that look like they make sense, but actually need critical detail) starts with a few assumptions about the way the company does business.
What are the entities that make up the key parts of that business?
When we talk about an entity, we mean a thing in the business that we want to model in data. For most businesses, this starts with thinking about:
We do business by identifying key events that happen to actors or items in our environment that help us make decisions about what to do next. Data modeling should reinforce this conversation and give us decision support to take action.
A First Modeling Discussion
Here’s one way to think about modeling data if you’ve never tried it before.
The first time you imagine the model, you’ll need to start by defining the list of things you care about.
As we reviewed above, you’ll probably start with types of data like:
Each one of these items represents a set of data that you need to define to organize this information in your company’s systems.
What do you need to know about an entity?
This diagram shows one way to enter the conversation about data modeling, from the initial identification of entities through the relatedness of entities to one another to the metrics you’ll use to track things.
Entities consist of some basic attributes:
For example, your “Person” entity might be as simple as adding these fields: first name, last name, email, PersonID, persontype, companyId, datecreated, and lastupdated. However, you could easily add more fields.
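To make the “Person” example concrete, here is a minimal sketch of that entity as a Python dataclass. The field names follow the list above; the types, the defaults, and the sample values are assumptions made for illustration, not a prescribed schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


def _now() -> datetime:
    """Current time in UTC, used as the default for the timestamp fields."""
    return datetime.now(timezone.utc)


@dataclass
class Person:
    # Field names mirror the example list in the text; types are assumptions.
    person_id: int
    first_name: str
    last_name: str
    email: str
    person_type: str                   # hypothetical values: "customer", "employee"
    company_id: Optional[int] = None   # points at a Company entity, if there is one
    date_created: datetime = field(default_factory=_now)
    last_updated: datetime = field(default_factory=_now)


# Hypothetical sample record
pat = Person(
    person_id=1,
    first_name="Pat",
    last_name="Lee",
    email="pat@example.com",
    person_type="customer",
    company_id=42,
)
```

Writing the entity down this way surfaces the questions that come next: which field identifies the record, which fields are required, and which ones point at other entities.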
Once you know which fields are in your entity, you need to know:
Understanding the basic schema and metadata of your entity gives you a good head start on other conversations about how the data in your system relates.
Extending the conversation to talk about metrics
Now that you’ve thought about entities and how to relate to other entity data, let’s talk about how to think about metrics in context.
Metrics count entity or event data, bounded by conditions and a time schedule, to create numbers to compare.
Here’s a simple diagram to set up a metric. You need to think about which items are involved, the conditions or constraints that limit the number of records you are counting, and how often you need to count this metric.
Simple metrics involve counting items.
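The three ingredients of a metric (the items, the conditions, and the time window) can be sketched in a few lines of Python. Everything here is hypothetical: the `Event` shape, the `count_metric` helper, and the sample data are assumptions chosen to illustrate the idea, not a reference implementation.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable, Iterable


@dataclass
class Event:
    name: str           # hypothetical event names: "login", "meeting_completed"
    person_id: int
    occurred_on: date


def count_metric(
    events: Iterable[Event],
    condition: Callable[[Event], bool],
    period_start: date,
    period_days: int,
) -> int:
    """Count events that meet the condition inside the time window.

    The window includes period_start and excludes the end date.
    """
    period_end = period_start + timedelta(days=period_days)
    return sum(
        1
        for e in events
        if period_start <= e.occurred_on < period_end and condition(e)
    )


# Hypothetical sample data
events = [
    Event("login", 1, date(2024, 1, 1)),
    Event("login", 2, date(2024, 1, 3)),
    Event("meeting_completed", 1, date(2024, 1, 4)),
    Event("login", 1, date(2024, 1, 9)),   # falls outside the first week
]

# "Logins per week": items = events, condition = name is "login", schedule = 7 days
weekly_logins = count_metric(
    events, lambda e: e.name == "login", date(2024, 1, 1), 7
)
```

Changing any one ingredient produces a different metric, which is why each needs to be written down explicitly.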
As you identify events that matter (a completed meeting, a login in your application), you may want to consider more complicated ideas:
The combination of metrics happening in a specific time period with a specific entity field change represents a trigger for action. It’s the key business question you want to answer when modeling. When the opportunity status becomes closed-won, what needs to happen in the business so that we can welcome a new customer?
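As a rough sketch, the closed-won example might look like the following in Python. It covers only the field-change half of a trigger (the time-period half is left out for brevity), and the record shape, function names, and follow-up actions are all hypothetical.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


def field_changed_to(before: Record, after: Record, field: str, value: Any) -> bool:
    """True only when the field moved to `value` in this update."""
    return before.get(field) != value and after.get(field) == value


def on_opportunity_update(
    before: Record,
    after: Record,
    actions: List[Callable[[Record], str]],
) -> List[str]:
    """Run every follow-up action when the status becomes closed-won."""
    if field_changed_to(before, after, "status", "closed-won"):
        return [action(after) for action in actions]
    return []


# Hypothetical follow-up actions the business agrees on
def send_welcome_email(opp: Record) -> str:
    return f"welcome email queued for {opp['account']}"


def create_onboarding_task(opp: Record) -> str:
    return f"onboarding task created for {opp['account']}"


before = {"id": 7, "account": "Acme", "status": "negotiation"}
after = {"id": 7, "account": "Acme", "status": "closed-won"}

results = on_opportunity_update(
    before, after, [send_welcome_email, create_onboarding_task]
)
```

Comparing the record before and after the update matters: an opportunity that was already closed-won should not welcome the same customer twice.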
Second-order (and more interesting) metrics happen when you start chaining trigger events in patterns and tying them to specific entities. When this happens often enough, you start gathering data that could be used for predictive decisions.
Applying metrics to action
Metrics in a vacuum are just numbers you count. These definitions need to be shared with the whole organization so that anyone who needs to understand what’s being measured and who owns that result can go to one place to find it. Without agreement from the team, you’ve got siloed numbers.
One way to drive agreement and provide a living reference is a metrics catalog. This could be as simple as a spreadsheet that lists the top priorities of the organization, or it could be tracked by a more sophisticated system, listing:
The iterative process might look something like this:
The act of counting items or ratios to drive business decisions is the key reason that we model data. By defining the rules of engagement for the business and writing out how they happen, we make it a lot easier to discuss changes and understand how to materialize metrics in reports, dashboards, and alerts.
What’s the takeaway? Data modeling may sound like an esoteric topic. It’s a key task that we need to do as we create and run our business to clarify the most important metrics we track. Creating a decision log (and definitions) of our business objectives makes it easier to know whether we are doing well or not.
This article was originally published on Substack.