Building a Media Mix Modeling Practice at The Knot Worldwide
Introduction
At The Knot Worldwide, we focus on helping couples navigate the complexities of their entire wedding journey and beyond. We offer couples a Vendor Marketplace to connect with wedding professionals as well as an assortment of planning tools, invitation and registry services, and much more.
As with any modern company, marketing plays a key role in informing our customers about our services; and as with all marketing investments, questions regarding the measurement of marketing performance are crucial for reporting and strategic planning. As a data scientist supporting marketing at The Knot Worldwide, I spend a lot of time thinking about these questions and the complexity of the problem.
In this post, we’ll provide an overview of how we’re approaching marketing measurement at The Knot Worldwide; in particular, we’ll focus on our utilization of media mix modeling (MMM).
Measuring Marketing Performance
Historically, marketing at The Knot Worldwide was heavily focused on digital channels and, consequently, last-touch attribution played a key role in measuring performance and informing planning decisions. However, as we have evolved and our marketing mix has expanded, we have come to need a more holistic approach to measurement. Although it has been in use since the 1960s, MMM is uniquely suited to this role for the following reasons:
The Knot Worldwide has recently expanded its marketing mix significantly with the launch of its largest-ever integrated marketing campaign (recently announced by our CMO, Jenny Lewis) to include a number of additional channels, including podcasts, online video (OLV), connected TV (CTV), and more.
With the increased investment in upper-funnel brand-building channels to complement our lower-funnel performance channels, the limitations of last-touch attribution and the benefits of MMM for reporting have become more pronounced. Because MMM is relatively new as a core competency for us, and because I was tasked with helping to get the practice up and running as quickly as possible, one early and important task was evaluating the MMM options available to us.
A Key Decision: Build vs. Buy
For companies wishing to utilize MMM, a number of options are on the table, including:
No single one of these options is the best fit for every company. We chose the third option: building our MMM practice in house. Thanks to the hard work of open-source software developers, however, we don’t have to start from scratch and can leverage a variety of open-source tools and frameworks tailored to MMM.
Our Immediate Needs
Although we are a global company with many facets to our business, our first concerns were building national-level MMMs for digital marketplace leads in the United States and finding an MMM framework that allowed us to build trustworthy models with both statistical and business validity.
With these considerations in mind, and having decided to go the route of building an MMM practice in house, how did we go about choosing the open-source frameworks to utilize? The remainder of this post will provide an abbreviated version of our approach to selecting and validating an MMM framework, while also demonstrating the challenge of marketing measurement.
Robyn and pymc-marketing: A High-Level Comparison
Two open-source modeling frameworks receiving significant buzz in the MMM community are Robyn (developed by Meta) and pymc-marketing (developed by the consultancy PyMC Labs). Both frameworks are useful, and as a data scientist, I naturally wanted to take both for a test drive before applying them to our internal data. I’ll use a simple example with simulated data taken from pymc-marketing’s documentation.
pymc-marketing
The newest MMM framework on the block, pymc-marketing, is a Python-based Bayesian approach developed by the consultancy PyMC Labs. Bayesian statistics brings with it a number of benefits, including:
While we won’t be exploring these features (or the many additional benefits of a Bayesian approach) in this post, suffice it to say that pymc-marketing provides an out-of-the-box MMM of a kind widely used by media mix modelers. PyMC Labs has a nice example showing its ability to recover key marketing performance measurements. In particular, their simulated weekly dataset with two media channels has the following true values:
To demonstrate the inherent complexity of media measurement even in this simple scenario, we’ll repeat a subset of their exercise using another framework, which we’ll now introduce. (For a more detailed illustration of tuning Bayesian MMMs such as those implemented by pymc-marketing, please see Slava Kisilevich’s excellent article Modeling Marketing Mix Using PyMC3.)
Robyn
Robyn is an approach to MMM based on traditional machine learning principles that also attempts to incorporate business context in a semi-automated, “human-in-the-loop” fashion, reducing the time it takes to produce actionable insights. Its codebase is a mixture of R and Python and is currently limited to national-level models (i.e., those without a geographic split, such as a split by Nielsen DMA). Instead of utilizing probability distributions to model uncertainty, it performs a large search over the space of MMM parameters, building thousands of models in the process and whittling down the options to those that have both a good statistical fit and a good “business fit” to the observed data. A data scientist can then choose from these models or further refine their MMM. (For anyone wishing for a more detailed overview of Robyn, this post by Recast and Robyn’s excellent documentation are highly recommended resources.)
While there are numerous differences compared to the model implemented in pymc-marketing (e.g., the saturation curves implemented, the various options for how carryover effects are implemented, etc.), we’d like to see what applying Robyn to PyMC Labs’s simulated dataset with minimal tuning returns.
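For readers less familiar with the saturation curves mentioned above, here is a minimal Python sketch of two commonly used shapes: a logistic-style curve and a Hill-style curve. This is not code from either framework; the function names, parameter names, and parameter values are assumptions chosen for illustration, and each framework's documentation should be consulted for its exact formulation.

```python
import math


def logistic_saturation(x, lam):
    """Logistic-style curve: 0 at x = 0, approaching 1 as x grows."""
    return (1 - math.exp(-lam * x)) / (1 + math.exp(-lam * x))


def hill_saturation(x, half_sat, shape):
    """Hill-style curve: 0 at x = 0, 0.5 at x = half_sat, approaching 1."""
    return x ** shape / (x ** shape + half_sat ** shape)


# Hypothetical (scaled) spend levels, used only to compare the two shapes.
spend_levels = [0.0, 1.0, 2.0, 4.0, 8.0]
logistic_response = [logistic_saturation(x, lam=1.0) for x in spend_levels]
hill_response = [hill_saturation(x, half_sat=2.0, shape=3.0) for x in spend_levels]
```

Both curves flatten out at high spend, which is the diminishing-returns behavior an MMM tries to capture, but they bend differently at low spend. That alone can lead two frameworks to somewhat different channel estimates on the same data.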
Considerations for Marketing Data Scientists
Out of the box, Robyn did not recover the known source-of-truth values of the simulated data; however, its core estimates were directionally consistent with these values. This is not to suggest that Robyn is a bad framework to consider for MMM. Quite the opposite! In fact, there are many additional avenues to fine-tune Robyn models not explored here (e.g., more complex adstock transformations, addition of different features, etc.). Additionally, it is important to note some differences between the simulated data and the model Robyn applies to these data. For example, the so-called geometric adstock in Robyn does not apply a maximum lag on the carryover effect’s impact, whereas in the simulated data the maximum duration of the carryover effect was set to be 8 weeks.
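The adstock difference just described can be sketched in a few lines of Python. This is an illustrative sketch, not code from Robyn or pymc-marketing: the function name, the decay rate of 0.5, and the spend series are assumptions, and real implementations differ in details such as normalization and exactly how the lag cutoff is counted.

```python
def geometric_adstock(spend, decay, max_lag=None):
    """Carry a share `decay` of each week's effect into the following weeks.

    max_lag=None lets the effect decay indefinitely (no cutoff).
    max_lag=k counts only the current week and the k - 1 weeks before it.
    """
    adstocked = []
    for t in range(len(spend)):
        start = 0 if max_lag is None else max(0, t - max_lag + 1)
        adstocked.append(
            sum(spend[s] * decay ** (t - s) for s in range(start, t + 1))
        )
    return adstocked


# Hypothetical input: one burst of spend in week 0 and nothing afterwards.
spend = [100.0] + [0.0] * 11
unbounded = geometric_adstock(spend, decay=0.5)
truncated = geometric_adstock(spend, decay=0.5, max_lag=8)
```

With a single burst of spend in week 0, the two versions agree through week 7. From week 8 onward the truncated version reports no remaining effect, while the unbounded version still carries a small, shrinking tail. A model that assumes one form will therefore not exactly reproduce data generated under the other.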
Our main point is to highlight the challenges of accurate marketing measurement. As with all business applications of data science, working closely with stakeholders and utilizing all our business knowledge and all the measurement tools in our arsenal is critical to construct a useful model that matches our business reality as closely as possible. As the saying familiar to every statistician goes: “All models are wrong, but some are useful.” Knowing we’ll never be able to capture all of reality in our model, our goal is to have a useful model that empowers marketing stakeholders with actionable insights. Robyn and Bayesian models such as those supported by pymc-marketing both enable marketing data scientists to produce useful models.
In this post we focused on how well MMMs recover selected high-level reporting measurements. Of course, at The Knot Worldwide we’re also using MMMs for another of their key capabilities: understanding marketing channel saturation and, consequently, using this information to perform scenario planning and run optimized budget allocations across these channels. Both Robyn and pymc-marketing outputs support such functionality; so, when choosing an MMM framework, what additional considerations should data scientists take into account?
For all these reasons, although Robyn has been a wonderful framework to work with to help “bootstrap” our MMM practice at The Knot Worldwide, a Bayesian approach is an increasingly attractive option for our future MMM work.
Concluding Thoughts
While The Knot Worldwide is still at a relatively early stage in its MMM journey, we are excited about where we are and more excited about what’s coming. From stakeholder buy-in to implementation, building an in-house MMM practice is not simple. However, thanks to the work of open-source developers authoring the frameworks considered here, it becomes a much easier task.
As parting words, I’d like to thank the developers of both Robyn and pymc-marketing, as developing open-source software can often feel like a thankless job. I greatly appreciate all their hard work, and I’m sure the rest of the MMM community would agree!
Written by: James Pooley