Tech & Data Diary - Entry #020: Building MMMs Through BigQuery
As marketers, understanding the impact of various digital marketing channels on overall revenue is essential to optimize spending and maximize ROI. A Marketing Mix Model (MMM) allows you to measure the contribution of different channels to overall revenue and forecast future sales. In this article, I’ll walk you through how to build a full MMM using Google Analytics 4 (GA4) data, focusing on revenue as the dependent variable, and using SQL in BigQuery to analyze, model, and forecast future sales based on different marketing channel spends.
Step 1: Setting up Your Data in BigQuery
First, ensure that your Google Analytics 4 data is linked to BigQuery. GA4 exports data, including revenue and traffic sources, to BigQuery, which will serve as the foundation for our MMM.
You can query your GA4 data in BigQuery with a query like this:
SELECT DATE(TIMESTAMP_MICROS(event_timestamp)) AS date, source AS channel, SUM(totals.revenue) AS revenue, SUM(totals.transactions) AS transactions, SUM(totals.sessions) AS sessions FROM your_project.your_dataset.ga4_table WHERE event_name = 'purchase' GROUP BY date, channel ORDER BY date;
This query gives you the total revenue generated from each channel per day, which will be used as the dependent variable in the marketing mix model.
Step 2: Gathering Digital Marketing Spend Data
Next, collect the marketing spend data from your different channels. This could include Google Ads, Facebook, Instagram, TikTok, and others. Make sure to aggregate daily spends for each channel.
Here’s an example of how to format your spend data in BigQuery:
SELECT date, channel, SUM(spend) AS spend FROM your_project.your_dataset.marketing_spend GROUP BY date, channel ORDER BY date;
This gives you a breakdown of daily spend by channel, which will be the key independent variables in your model.
Step 3: Preparing Data for the Marketing Mix Model
Now that you have revenue and spend data for each channel, you need to join these datasets on the date and channel level:
WITH revenue_data AS ( SELECT DATE(TIMESTAMP_MICROS(event_timestamp)) AS date, source AS channel, SUM(totals.revenue) AS revenue FROM your_project.your_dataset.ga4_table WHERE event_name = 'purchase' GROUP BY date, channel ), spend_data AS ( SELECT date, channel, SUM(spend) AS spend FROM your_project.your_dataset.marketing_spend GROUP BY date, channel ) SELECT revenue_data.date, revenue_data.channel, revenue_data.revenue, spend_data.spend FROM revenue_data JOIN spend_data ON revenue_data.date = spend_data.date AND revenue_data.channel = spend_data.channel;
This step consolidates revenue and marketing spend data, setting up the foundation for building the MMM.
Step 4: Building the Marketing Mix Model
Now we can start modeling the impact of different marketing channels on revenue. To build the MMM, we will use linear regression, where revenue is the dependent variable, and marketing spends from various channels are the independent variables.
领英推荐
We can create a regression model using SQL in BigQuery. First, normalize the data using log transformation:
WITH log_transformed_data AS ( SELECT date, channel, LOG(revenue + 1) AS log_revenue, LOG(spend + 1) AS log_spend FROM your_project.your_dataset.revenue_spend_data )
Then, apply linear regression to estimate the impact of each channel on revenue:
SELECT channel, ML.LINEAR_REG( MODEL_NAME, STRUCT( log_revenue AS label, log_spend AS spend ) ) AS coefficient FROM log_transformed_data GROUP BY channel;
This model gives you the coefficients for each channel, which represent their relative impact on revenue.
Step 5: Forecasting Future Sales
Once the model is built, you can forecast future sales based on different marketing spend scenarios. For example, you can simulate what happens if you increase Google Ads spend by 10% or decrease Facebook spend by 5%.
To predict future sales, use the coefficients obtained from the model and plug them into a forecast equation:
WITH future_spend AS ( SELECT date, channel, spend * 1.1 AS adjusted_spend FROM your_project.your_dataset.future_spend_data WHERE channel = 'Google Ads' ) SELECT date, SUM(EXP(adjusted_spend * coefficient) - 1) AS forecasted_revenue FROM future_spend JOIN coefficients ON future_spend.channel = coefficients.channel GROUP BY date;
This query forecasts future revenue based on the new spend inputs for each channel.
Step 6: Visualizing Results
Finally, visualize the results of the model and the forecasts. You can use tools like Google Data Studio, Looker Studio, or even Tableau to create visuals that show the predicted revenue across various marketing spend scenarios.
This step helps stakeholders better understand the relationship between marketing spend and revenue, allowing them to make more informed budgeting decisions.
Conclusion
Implementing a full marketing mix model using GA4 revenue data allows businesses to measure the impact of different marketing channels and forecast future sales. By following the steps outlined above, you can build a comprehensive model that provides valuable insights and helps optimize marketing investments.
The power of MMM lies in its ability to quantify the contribution of each channel to the overall revenue and predict future performance, enabling marketers to make data-driven decisions that improve ROI.
Head of Operations | Product & Solution Consultant | ex-Google
2 个月Combining media spend and GA data for forecasting sounds powerful. In my experience, handling seasonality and external factors can be tricky. How does the model account for those?
Business Lead | Bridging gaps between brands & agencies one operation at a time.
2 个月Love this. While not a full blown econometric model, this is a great piece of analytics for beginners and can add a lot of value to marketeers. Keep the great diary entries coming Ryan Fletcher ! Mira Al-Qasimi Rishad Alavi try this.
Measurement Expert - Marketing Science
2 个月Great article! It’s worth mentioning that Marketing Mix Modeling (MMM) typically includes additional aspects like non-linear effects, lagged effects, and external factors. These can be important for capturing things like carryover effects (ad stock) and diminishing returns, which help explain how marketing spend impacts performance over time. To tackle these complexities, MMM often incorporates advanced techniques such as non-linear regression and distributed lag models to reflect delayed and non-linear effects. Bayesian methods and machine learning approaches can also provide deeper insights by handling complex interactions between channels. If you’re interested, you can explore Meta’s Robyn or Google’s Meridian, which are open-source MMM models designed to address these challenges.