MMM is a real-life multi-objective problem

As a co-creator of Robyn, if I had to pick the single most important innovation from this project, it would be solving the Marketing Mix Modelling (MMM) problem with multi-objective optimization.

Despite the popularity of the project, this aspect is rarely discussed. I was very glad to find that it is covered in the latest episode of Mobile Dev Memo by Eric Seufert, Julian Runge from Northwestern University and Prof. Dr. Koen Pauwels from Northeastern University.

The "Sanity Check" in MMM

Every MMM practitioner knows this "sanity check": when facing multiple model candidates with similar goodness of fit (e.g. adjusted R-squared), it's common to prefer a candidate that fits slightly worse but is more "plausible", meaning its share of contribution is closer to the current spend allocation. According to Manchanda, Rossi, and Chintagunta (2004, as in Taylor 2023; Runge, Skokan and Gufeng 2024), a model in which spend share and effect share align more closely will be deemed more plausible than a model that suggests major deviations or produces extreme effects.

For example, for a dataset with 2 channels splitting 90%/10% media spend, a model candidate that predicts the contribution of 10%/90% is considered rather implausible, no matter how good the model fit is. In comparison, another candidate with 70%/30% contribution split would be considered more plausible.
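One way to put a number on this intuition is the root sum of squared distances between spend shares and effect shares, which is the idea behind the DECOMP.RSSD metric described later in this article. The sketch below is a minimal illustration using the shares from the example above. The function name `share_distance` is hypothetical, and this is not Robyn's actual implementation.

```python
import math

def share_distance(spend_share, effect_share):
    """Root sum of squared distances between two share vectors (lower = closer)."""
    return math.sqrt(sum((s - e) ** 2 for s, e in zip(spend_share, effect_share)))

# Shares taken from the two-channel example in the text.
spend = [0.9, 0.1]
implausible = [0.1, 0.9]   # contribution split of the first candidate
plausible = [0.7, 0.3]     # contribution split of the second candidate

d_implausible = share_distance(spend, implausible)  # sqrt(0.8^2 + 0.8^2)
d_plausible = share_distance(spend, plausible)      # sqrt(0.2^2 + 0.2^2)
print(round(d_implausible, 3), round(d_plausible, 3))
```

The second candidate ends up roughly four times closer to the spend allocation than the first, which matches the practitioner's judgment that it is the more plausible of the two.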

Runge and the Meta team (2023) have also conducted a qualitative survey of MMM providers (n = 7) to gain more insight into this common practice. Despite the limited sample size, we found that all respondents practise the "sanity check": three out of seven employ specific formulas or heuristics for evaluating model output against business logic, whereas the remaining four rely on a blend of expertise and experience.

What is a Multi-Objective Problem

Most real-life problems are multi-objective problems. If you're buying a car, you might consider price, comfort, performance and emissions as the purchase criteria. The final purchase decision is most likely a trade-off between these four "objectives". Of course there might be a theoretical "perfect car" that's the cheapest, the most comfortable and the fastest, with the lowest emissions. In reality, that is very unlikely.

The same holds for marketers and marketing science professionals who deal with marketing reality. You want to select the model that's most precise in outcome prediction, while you also need to explain the "causal effect" of your marketing activities as sales or outcome drivers.

In other words, a good MMM needs to be both predictive and interpretable. This is the frequently cited conflict between science and craft in MMM.

Robyn's Multi-Objective Optimization

In Robyn, we build the "sanity check" into the optimization by expressing it as additional objective functions. The implementation of multi-objective hyperparameter optimization is considered the most important innovation in Robyn. At the same time, the use of hyperparameters enables stronger automation of the parameter selection for adstocking, saturation, regularization penalty and even the training size for time-series validation.

Robyn uses Nevergrad, Meta's gradient-free optimization platform, to perform this task through its so-called "ask & tell" interface. Simply put, Robyn "asks" Nevergrad for new, mutated hyperparameter values and "tells" it how the previous values scored on the objective functions.
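To make the "ask & tell" pattern concrete, here is a minimal sketch of such a loop in plain Python. Everything in it is an assumption for illustration: `ToyAskTellOptimizer` is a hypothetical stand-in written for this example (it is not Nevergrad's API and not Robyn's code), the hyperparameter names `theta` and `alpha` are placeholders, and `toy_loss` is a made-up single objective rather than an MMM fit.

```python
import random

class ToyAskTellOptimizer:
    """Hypothetical gradient-free optimizer with an ask/tell interface."""

    def __init__(self, bounds, seed=0, step=0.1):
        self.bounds = bounds              # {name: (low, high)}
        self.rng = random.Random(seed)
        self.step = step                  # mutation size as a fraction of the range
        self.best = None                  # (loss, candidate) seen so far

    def ask(self):
        # First call: sample uniformly. Later calls: mutate the best candidate.
        if self.best is None:
            return {k: self.rng.uniform(lo, hi) for k, (lo, hi) in self.bounds.items()}
        candidate = {}
        for k, (lo, hi) in self.bounds.items():
            value = self.best[1][k] + self.rng.gauss(0, self.step * (hi - lo))
            candidate[k] = min(hi, max(lo, value))  # keep within bounds
        return candidate

    def tell(self, candidate, loss):
        # Remember the candidate if it scores better (lower) than the best so far.
        if self.best is None or loss < self.best[0]:
            self.best = (loss, candidate)


def toy_loss(params):
    # Made-up objective with its minimum at theta=0.3, alpha=2.0.
    return (params["theta"] - 0.3) ** 2 + (params["alpha"] - 2.0) ** 2


optimizer = ToyAskTellOptimizer({"theta": (0.0, 1.0), "alpha": (0.5, 3.0)})
first_loss = None
for _ in range(200):
    candidate = optimizer.ask()        # "ask" for hyperparameter values
    loss = toy_loss(candidate)         # in an MMM this would be fitting and scoring a model
    if first_loss is None:
        first_loss = loss
    optimizer.tell(candidate, loss)    # "tell" the optimizer how they scored

best_loss, best_params = optimizer.best
print(best_loss, best_params)
```

The point of the pattern is the separation of concerns: the optimizer only proposes values and receives scores, while the caller decides how a candidate is evaluated. In a multi-objective setting the caller would report several scores per candidate instead of one.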

Currently, Robyn implements the following three objective functions:

  • NRMSE: The Normalized Root Mean Square Error is also referred to as the prediction error. Robyn allows time-series validation by splitting the dataset into train / validation / test. When fitting without time-series validation, the training error nrmse_train is the objective function for the evolving iterations. With time-series validation, the validation error nrmse_val becomes the objective function, while nrmse_test is used to assess the out-of-sample predictive performance.
  • DECOMP.RSSD: The Decomposition Root Sum of Squared Distance is also referred to as the business error and is a key invention of Robyn. It represents the difference between share of spend and share of effect for paid media variables. We're aware that this metric is controversial because it leads media ROAS to converge. In reality, multiple objectives always "work together" and trade off against each other in the optimisation process. DECOMP.RSSD rules out models with extreme decomposition and helps narrow down model selection.
  • MAPE.LIFT (optional): The Mean Absolute Percentage Error for experiments is activated when calibrating and is referred to as the calibration error. It's a key invention of Robyn and allows Robyn to minimise the difference between predicted effect and causal effect.
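As a rough illustration of how a prediction error and a business error can trade off, the sketch below computes an NRMSE and then keeps only the candidates that are not dominated on both objectives (the Pareto front, the usual way to summarize trade-offs in multi-objective optimization). It rests on assumptions: normalizing RMSE by the range of the actual values is one common convention and may not match Robyn's exact formula, the helper names are hypothetical, and the candidate scores are made up.

```python
import math

def nrmse(actual, predicted):
    """RMSE divided by the range of the actual values (assumed normalization)."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse) / (max(actual) - min(actual))

def pareto_front(candidates):
    """Return candidates not dominated on (nrmse, decomp_rssd); lower is better."""
    front = []
    for c in candidates:
        dominated = any(
            o["nrmse"] <= c["nrmse"] and o["decomp_rssd"] <= c["decomp_rssd"]
            and (o["nrmse"] < c["nrmse"] or o["decomp_rssd"] < c["decomp_rssd"])
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return front

# Tiny made-up series to show the NRMSE calculation.
example_nrmse = nrmse([10, 20, 30], [12, 18, 30])

# Made-up scores for four hypothetical model candidates.
candidates = [
    {"id": "A", "nrmse": 0.05, "decomp_rssd": 0.60},  # best fit, extreme decomposition
    {"id": "B", "nrmse": 0.06, "decomp_rssd": 0.20},  # slightly worse fit, more plausible
    {"id": "C", "nrmse": 0.10, "decomp_rssd": 0.10},  # closest to spend share
    {"id": "D", "nrmse": 0.12, "decomp_rssd": 0.30},  # worse than B on both objectives
]
front_ids = [c["id"] for c in pareto_front(candidates)]
print(round(example_nrmse, 4), front_ids)
```

Candidate D drops out because B beats it on both errors. A, B and C remain, and choosing among them is exactly the trade-off between fit and plausibility that the "sanity check" describes.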

We hope this article inspires further research into this area. For more details, please see the paper "Packaging Up Media Mix Modeling: An Introduction to Robyn's Open-Source Approach" by Runge, Skokan & Zhou, 2024.

Dr. Peter Cain

Executive Partner and co-founder at marketscience

1 month ago

Sounds akin to using Dorfmann-Steiner as a prior.

Edoardo Piras

Manager | Strategy | Marketing measurement | Advanced Analytics | Experience in data driven strategic planning & Insights - Levi Strauss & Co.

2 months ago

Great enhancements!

Venkat Raman

Co-Founder & CEO at Aryma Labs | Building Marketing ROI Solutions For a Privacy First Era | Statistician |

2 months ago

Indeed. MMM is a real life multi objective function. I would say Decomp RSSD and Multi objective function are two of the hallmarks for Robyn. In case you would be interested Gufeng Zhou, in our recent research paper 'Only two can tango at the pareto front' we found that if you would index on Decomp RSSD in the multi objective function, one stands to get reasonably accurate model as opposed to indexing on say NRMSE, MAPE or KL Divergence. https://arymalabs.com/only-two-can-tango-at-the-pareto-front/
