Data modeling for the rest of us
Nathan Duck / Unsplash / https://unsplash.com/photos/PzRuJU9v-oc

“A data model created by business users who want to describe their Salesforce data, illustration, movie poster” - created in Midjourney by PromptHunt.com

If AI-driven image prompts could create an effective data model for business users to describe the data inside their organizations and identify key metrics that they wanted to track, it might look like the slightly insane diagram generated above. A magical process like this might also determine if metrics were based on counting entity data or contingent on events that happen in an environment.

As data professionals, we don’t always have a simple way to explain entity data, metrics, and events to the “typical” business user. “It depends” is a familiar answer when talking about a data problem. The discussion eventually coalesces on a technical definition that we memorialize with a SQL query or with very specific data definition language (DDL) statements that create relationships between data objects.

We don’t have a simple process to make data modeling easier for the average business stakeholder, but maybe we need to have one, especially when the discussion of data modeling looks like the confused faces you see on the other side of a Zoom call.

A group of confused middle managers on a Zoom call, made with Midjourney

Begin with the End in Mind: how do we do business?

A perfect conversation that ends in a proper data model starts with a few assumptions about the way the company does business. (The movie posters above magically align some bubbles that look like they make sense, but they lack critical detail.)

What are the entities that make up the key parts of that business?

When we talk about an entity, we mean a thing in the business that we want to model in data. For most businesses, this starts with thinking about:

  • People - customers, leads, contacts, and employees that represent the people of the business. You might want to model a person as a single entity, use multiple entities to represent different kinds of people, or identify the key transitions that change a person’s attributes.
      • For example, what’s the moment when you know a lead becomes a contact associated with an account that is a customer? This probably happens after a sale and implies that most people need to be associated with a company.
  • Companies - the businesses we try to sell to as we run our business. Companies are related to people (they can’t exist very well without them) and have their own set of particular attributes.
      • You might want to differentiate between companies of different sizes because this informs your sales motion and how you engage with various types of companies.
  • Business-specific items - the unique items that you manage in your business. Whether you have a digital-first or a physical business, these are the atomic units of your business. Just like an airline has planes, reservations, tickets, and destinations, your business has its specific lingo for the items you manage.
      • At the most basic level, to model data, it helps to think about the items you would want to count or the events you would want to capture. If you have a business that creates project management software, you might measure tasks created, tasks completed, tasks abandoned, the number of tasks in a project, and the number of people assigned to a task.

We do business by identifying key events that happen to actors or items in our environment that help us make decisions about what to do next. Data modeling should reinforce this conversation and give us decision support to take action.

A First Modeling Discussion

Here’s one way to think about modeling data if you’ve never tried it before.

The first time you imagine the model, you’ll need to start by defining the list of things you care about.

As we reviewed above, you’ll probably start with types of data like:

  • People - a general repository for people data
  • Leads - information about people who have expressed sales interest and aren’t yet customers
  • Contacts - people at companies who purchase from you
  • Companies - a general repository for company data
  • Accounts - companies who buy from you
  • and specific entities to map to business goals in your environment

Each one of these items represents a set of data that you need to define to organize this information in your company’s systems.

What do you need to know about an entity?

One way to enter the data modeling process

This diagram shows one way to enter the conversation about data modeling, from the initial identification of entities through the relatedness of entities to one another to the metrics you’ll use to track things.

An entity definition consists of some basic metadata:

  • Name
  • Description
  • Attributes or fields
  • A date when the schema (this list of metadata) was last updated
  • Any relationships to other entities, and whether each one is a one-to-one, one-to-many, or many-to-many relation
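
The list above can be written down as a small data structure. This is only a sketch of one possible shape: the class names (`EntityDefinition`, `Relationship`, `Cardinality`), the field names, and the date are assumptions made for illustration, not part of any particular tool.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum


class Cardinality(Enum):
    """How records of one entity relate to records of another."""
    ONE_TO_ONE = "one-to-one"
    ONE_TO_MANY = "one-to-many"
    MANY_TO_ONE = "many-to-one"
    MANY_TO_MANY = "many-to-many"


@dataclass
class Relationship:
    target_entity: str
    cardinality: Cardinality


@dataclass
class EntityDefinition:
    """The metadata that describes one entity (hypothetical shape)."""
    name: str
    description: str
    fields: list[str]
    schema_last_updated: date
    relationships: list[Relationship] = field(default_factory=list)


# Example: a Contact entity that belongs to exactly one Company.
contact = EntityDefinition(
    name="Contact",
    description="People at companies who purchase from you",
    fields=["first_name", "last_name", "email", "company_id"],
    schema_last_updated=date(2023, 1, 15),  # placeholder date
    relationships=[Relationship("Company", Cardinality.MANY_TO_ONE)],
)

for rel in contact.relationships:
    print(f"{contact.name} -> {rel.target_entity}: {rel.cardinality.value}")
```

Writing the definition this way gives the team one artifact to review, even if the real schema later lives in a database or a CRM.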

For example, your “Person” entity might be as simple as these fields: first name, last name, email, PersonID, persontype, companyId, datecreated, and lastupdated. However, you could easily add more fields.

Once you know which fields are in your entity, you need to know:

  • Where does this data come from?
  • Is there a particular system that is the best source of this data in your organization? For example, Salesforce might be the best source of contact data in your organization, as it’s confirmed by sellers during the sales engagement process. However, your billing system might be the best source of the company address if you send them a physical bill.
  • How often do you need to check this information? Many attributes don’t change all that often; others, like email, decay quickly.
  • How is this data related to other entities? A contact has a company relationship that is many-to-one, implying that for every contact that exists, you need to have a company record.
  • How will you know that this information is unique, and which of these fields are required?
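
Several of these questions can be turned into simple checks. The sketch below assumes people records are plain dictionaries and that `person_id`, `email`, and `company_id` are the required fields. Both the sample data and the helper names are made up for illustration.

```python
from collections import Counter

# Assumption: these are the fields your team agreed must always be filled.
REQUIRED_FIELDS = ("person_id", "email", "company_id")

people = [
    {"person_id": 1, "first_name": "Ada", "last_name": "Lopez",
     "email": "ada@example.com", "company_id": 10},
    {"person_id": 2, "first_name": "Ben", "last_name": "Ito",
     "email": "ada@example.com", "company_id": 10},
    {"person_id": 3, "first_name": "Cy", "last_name": "Roy",
     "email": "cy@example.com", "company_id": None},
]
companies = {10: "Acme"}  # company_id -> company name


def missing_required(record, required=REQUIRED_FIELDS):
    """Return the required field names that are empty in this record."""
    return [name for name in required if record.get(name) in (None, "")]


def duplicate_values(records, key):
    """Return values of `key` that appear on more than one record."""
    counts = Counter(r[key] for r in records if r.get(key))
    return sorted(value for value, n in counts.items() if n > 1)


def orphaned(records, parents, foreign_key):
    """Return ids of records whose many-to-one parent does not exist."""
    return [r["person_id"] for r in records if r.get(foreign_key) not in parents]


print(missing_required(people[2]))               # fields to fix on record 3
print(duplicate_values(people, "email"))         # emails shared by several people
print(orphaned(people, companies, "company_id")) # people without a company record
```

Checks like these are a cheap way to find out whether the rules you agreed on in conversation actually hold in the data.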

Understanding the basic schema and metadata of your entity gives you a good head start on other conversations about how the data in your system is related.

Extending the conversation to talk about metrics

Now that you’ve thought about entities and how they relate to other entity data, let’s talk about how to think about metrics in context.

Metrics count entity or event data, bounded by conditions and a time schedule, to create numbers to compare.

Here’s a simple diagram to set up a metric. You need to think about which items are involved, the conditions or constraints that limit the number of records you are counting, and how often you need to count this metric.
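
In code, those three ingredients (items, conditions, and a time window) might look like the sketch below. The function name `count_metric`, the `created_at` field, and the sample tasks are assumptions chosen for illustration.

```python
from datetime import datetime


def count_metric(records, condition, start, end, timestamp_field="created_at"):
    """Count records that satisfy `condition` inside the window [start, end)."""
    return sum(
        1
        for r in records
        if start <= r[timestamp_field] < end and condition(r)
    )


# Items: tasks from a hypothetical project management tool.
tasks = [
    {"id": 1, "status": "completed", "created_at": datetime(2024, 1, 2)},
    {"id": 2, "status": "abandoned", "created_at": datetime(2024, 1, 3)},
    {"id": 3, "status": "completed", "created_at": datetime(2024, 1, 9)},
]

# Condition: only completed tasks. Schedule: the first week of January.
completed_first_week = count_metric(
    tasks,
    condition=lambda t: t["status"] == "completed",
    start=datetime(2024, 1, 1),
    end=datetime(2024, 1, 8),
)
print(completed_first_week)  # 1
```

Running the same function with a different window each week gives you the series of numbers to compare.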

Simple metrics involve counting items.

  • How many customers do we have?
  • How many people visited the home page this week?

As you identify events that matter (a completed meeting, a login to your application), you may want to consider more complicated ideas:

  • Of the people who started an application flow, how many completed that flow within 60 minutes?
  • When people take this action in our system, do we want to contact them?
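
The first question above is a small funnel calculation. Here is a minimal sketch, assuming each event is a dictionary with a person, an event name, and a timestamp, and that events arrive in time order. The event names `application_started` and `application_completed` are hypothetical.

```python
from datetime import datetime, timedelta

events = [
    {"person_id": 1, "event": "application_started", "at": datetime(2024, 1, 1, 9, 0)},
    {"person_id": 1, "event": "application_completed", "at": datetime(2024, 1, 1, 9, 45)},
    {"person_id": 2, "event": "application_started", "at": datetime(2024, 1, 1, 10, 0)},
    {"person_id": 2, "event": "application_completed", "at": datetime(2024, 1, 1, 12, 0)},
    {"person_id": 3, "event": "application_started", "at": datetime(2024, 1, 1, 11, 0)},
]


def completed_within(events, window=timedelta(minutes=60)):
    """Return (people who started, people who completed inside the window)."""
    started, completed = {}, {}
    for e in events:
        # setdefault keeps the first event seen per person (events are in time order).
        if e["event"] == "application_started":
            started.setdefault(e["person_id"], e["at"])
        elif e["event"] == "application_completed":
            completed.setdefault(e["person_id"], e["at"])
    on_time = [
        pid
        for pid, start in started.items()
        if pid in completed and timedelta(0) <= completed[pid] - start <= window
    ]
    return len(started), len(on_time)


started_count, on_time_count = completed_within(events)
print(f"{on_time_count} of {started_count} completed within 60 minutes")
```

With this sample data, three people start the flow and one finishes within the hour. The second person finishes too late and the third never finishes.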

The combination of metrics happening in a specific time period with a specific entity field change represents a trigger for action. It’s the key business question you want to answer when modeling. When the opportunity status becomes closed-won, what needs to happen in the business so that we can welcome a new customer?
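
One way to picture a trigger is to compare two snapshots of the same records and act on the ones whose field just changed. The sketch below assumes opportunities are keyed by id. The snapshot data and the `welcome_new_customer` action are placeholders for whatever your business needs to do.

```python
def detect_status_changes(before, after, field="status", target="closed-won"):
    """Return ids whose `field` changed to `target` between two snapshots."""
    return [
        opp_id
        for opp_id, record in after.items()
        if record.get(field) == target
        and before.get(opp_id, {}).get(field) != target
    ]


def welcome_new_customer(opp_id):
    """Placeholder action; a real one might open an onboarding task."""
    return f"Start onboarding for {opp_id}"


yesterday = {
    "opp-1": {"status": "negotiation"},
    "opp-2": {"status": "closed-won"},
}
today = {
    "opp-1": {"status": "closed-won"},   # changed: should trigger
    "opp-2": {"status": "closed-won"},   # already won: should not trigger again
    "opp-3": {"status": "prospecting"},  # new, but not won
}

actions = [welcome_new_customer(o) for o in detect_status_changes(yesterday, today)]
print(actions)  # ['Start onboarding for opp-1']
```

Only the change itself fires the action, which is why an opportunity that was already closed-won yesterday is skipped.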

Second-order (and more interesting) metrics happen when you start chaining trigger events in patterns and tying them to specific entities. When this happens often enough, you start gathering data that could be used for predictive decisions.

Applying metrics to action

Metrics in a vacuum are just numbers you count. These definitions need to be shared with the whole organization so that anyone who needs to understand what’s being measured and who owns that result can go to one place to find it. Without agreement from the team, you’ve got siloed numbers.

One way to drive agreement and provide a living reference is a metrics catalog. This could be as simple as a spreadsheet that lists the top priorities of the organization, or it could be tracked by a more sophisticated system, listing:

  • What metrics we care about
  • Who owns them
  • How they are defined
  • Where to find them and how to calculate them
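
If a spreadsheet feels too loose, the same catalog can be sketched as structured data. Everything below (the `MetricDefinition` class, the metric names, owners, and sources) is a hypothetical example of the four columns listed above, not a reference to any specific catalog product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    name: str          # what metric we care about
    owner: str         # who owns it
    definition: str    # how it is defined, in plain language
    source: str        # where to find it
    calculation: str   # how to calculate it


catalog = {
    m.name: m
    for m in [
        MetricDefinition(
            name="new_customers_per_week",
            owner="Sales Operations",
            definition="Opportunities that became closed-won during the week",
            source="CRM opportunity records",
            calculation="Count opportunities whose status changed to closed-won, per week",
        ),
        MetricDefinition(
            name="tasks_completed_per_week",
            owner="Product",
            definition="Tasks marked completed during the week",
            source="Application event log",
            calculation="Count task-completed events, per week",
        ),
    ]
}


def find_metric(name):
    """Look up a metric by name; returns None if nobody has defined it yet."""
    return catalog.get(name)


metric = find_metric("new_customers_per_week")
print(f"{metric.name} is owned by {metric.owner}")
```

A lookup that returns nothing is useful too: it tells you a number is being discussed that no one has defined or agreed to own.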

The iterative process might look something like this:

The act of counting items or ratios to drive business decisions is the key reason that we model data. By defining the rules of engagement for the business and writing out how they happen, we make it a lot easier to discuss changes and understand how to materialize metrics in reports, dashboards, and alerts.

What’s the takeaway? Data modeling may sound like an esoteric topic, but it’s a key task that we need to do as we create and run our business to clarify the most important metrics we track. Creating a decision log (and definitions) of our business objectives makes it easier to know whether we are doing well or not.

This article was originally published on Substack.
