How accurate is GA4? We reviewed 33 accounts and have an answer for you...
Andy Crestodina
Co-Founder and CMO at Orbit Media | SEO, Analytics, AI, Content Strategy and Website Optimization
New Research! We compared GA4 data to other sources of truth and calculated the average level of inaccuracy in Google Analytics...
Every number in your Google Analytics account is wrong.
Maybe the reasons are obvious. Some visitors block tracking codes. Some don’t accept cookies. Some browse in private mode. We all have the right to not be tracked.
Maybe it’s not a big deal. You don’t need perfect data. There are still plenty of useful insights waiting to be discovered in every GA4 account.
But how inaccurate is Google Analytics?
It’s still an interesting and important question. Today we are answering it. We’ll show our approach for measuring GA4 accuracy and show you how to check for yourself in your own account.
Why is GA4 inaccurate?
Because there are so many ways to avoid being tracked.
Google Analytics records data when a little bit of JavaScript (gtag.js) talks to cookies on the visitor’s device (_ga, _gid, etc.). If this little conversation doesn’t happen, no data is recorded in GA4. There are many reasons for this. The big ones are related to privacy and the trend toward Do Not Track (DNT).
There are several other reasons (could be GA4 setup problems!) but these are the big ones. So GA4 underreports all traffic and conversions, but there are other ways to record activity on your website…
How can you measure the accuracy of GA4?
Compare Google Analytics to other sources of truth.
Even if the visitor uses an ad blocker or doesn’t accept cookies, they may still submit a contact form, subscribe to a newsletter, buy a product, make a donation, apply for a job, etc.
So the databases that record these actions are better sources of truth. They are in your CMS (WordPress) or marketing automation tools (HubSpot), the leads in your CRM (Zoho), the sales in your ecommerce system (Shopify), etc.
Comparing the recorded conversions with the “key events” in GA4 shows us how much GA4 is underreporting.
You can do this for yourself very easily. Just pick a nice long date range in both tools and compare the data. If HubSpot shows 100 leads and GA4 shows 85, then GA4 is underreporting leads by 15%.
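Here is that arithmetic as a minimal Python sketch. The function name is made up for illustration, and the 100 and 85 are just the example numbers from above; swap in the counts from your own tools for the same date range.

```python
def underreporting_rate(source_count: float, ga4_count: float) -> float:
    """Share of conversions in the source of truth that GA4 did not record."""
    if source_count <= 0:
        raise ValueError("source_count must be positive")
    return (source_count - ga4_count) / source_count


# Example from the article: 100 leads in the CRM, 85 key events in GA4.
rate = underreporting_rate(source_count=100, ga4_count=85)
print(f"GA4 is underreporting by {rate:.0%}")
```

A negative result would mean GA4 recorded more than the source of truth, which usually points to a setup problem (double-firing events, mismatched date ranges) rather than privacy tools.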
We did this 60 times, comparing various recorded conversions across 33 accounts, comparing newsletter signups, get-a-demo requests, ecommerce revenue and many other interactions in many other databases and in GA4.
We also compared our own GA4 traffic (not just key events) with traffic from Plausible, a cookie-less analytics tool, and we'll share that comparison below.
A lot depends on cookie consent, so we separated the data for sites with and without cookie consent banners. Sites that use a consent management platform (CMP) such as Cookiebot give the visitor the option to accept cookies. When visitors are given the option, some visitors choose not to be tracked, so those accounts have slightly less accurate data.
Here are the average levels of underreporting by Google Analytics.
Looking more closely at the data, we see that the level of underreporting varies widely across accounts.
Here you can see the number of comparisons in each percentage range for sites without consent banners. Although the average was 11%, sometimes the data was off by just 1%, while others were off by 30%.
We separated the accounts that use a cookie consent banner from those that do not. Let’s look closer at the impact of all of those “Accept Cookie” buttons.
How do cookie consent banners affect GA4 accuracy?
It’s courteous to let people opt out of tracking. It’s also good to comply with privacy laws.
The two biggies are the European Union’s GDPR (General Data Protection Regulation) and California’s CCPA (California Consumer Privacy Act). To comply with these, website owners need to use a CMP (Consent Management Platform), which puts a cookie consent banner at the bottom of the website, at least for visitors in EU countries and California.
The Orbit Media site uses a cookie consent banner for visitors from those places. So we’re able to compare those users to other users to estimate the impact cookie consent banners have on GA4’s accuracy.
Here we’ll use another method to measure the difference. It’s more accurate, but uses a smaller dataset. We’ll compare traffic recorded in Orbit Media’s GA4 to traffic recorded in Plausible. Plausible is a cookie-less and GDPR-compliant analytics tool. It tracks traffic using only JavaScript. No cookies (or consent) necessary.
In both GA4 and Plausible, we separated the traffic from places where the banner was shown (for us, that’s California and European users) from traffic from places where the banner wasn’t shown. It was a tedious process, but it revealed the impact of the cookie consent banner on recorded traffic.
Note: This is different from the data above because although those websites use consent banners, they may not show it to all users. Here we are looking specifically at users who did and did not see the banner.
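If you want to run the same kind of comparison yourself, the calculation is just GA4 users divided by the reference tool's users, done once per segment. Below is a rough Python sketch. The `Segment` class and the user counts are placeholders invented for illustration; they are not the numbers from our analysis.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    ga4_users: int        # users reported by GA4 for this segment
    reference_users: int  # users reported by a cookie-less tool for the same segment

    def capture_rate(self) -> float:
        """Fraction of the reference tool's users that GA4 also recorded."""
        if self.reference_users <= 0:
            raise ValueError("reference_users must be positive")
        return self.ga4_users / self.reference_users


# Placeholder counts only (hypothetical, not study results).
segments = [
    Segment("banner shown", ga4_users=500, reference_users=1000),
    Segment("banner not shown", ga4_users=900, reference_users=1000),
]

for segment in segments:
    print(f"{segment.name}: GA4 recorded {segment.capture_rate():.0%} of reference users")
```

The gap between the two capture rates is a rough estimate of what the banner costs you in recorded traffic, assuming the two segments are otherwise similar.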
Remember, cookie consent banners do not affect your traffic. They only affect your recorded traffic. And 45% of your traffic data is plenty to get valuable insights. A true GA4 pro can come up with great hypotheses and make informed decisions while looking at small datasets. Of course, accounts with very low data have other issues. Those GA4 reports are sometimes filled with internal traffic.
Keep in mind, the analysis above is all from one GA4 property with a dataset of around 115K users. The impact of consent banners varies widely depending on three big factors.
Of course, implied consent will allow you to gather more Analytics data, but it is not GDPR compliant. Here’s what the most permissive settings look like:
Our good friend and GA4 expert, Dana, has some tips for better data quality...
“Don’t try for 100% accuracy in GA4 – it isn’t possible. Instead, make sure that the data that you do have is of good quality. Some tips:
Now you know the average level of underreporting for all GA4 data. That’s conversion tracking gaps, inaccurate real time data and missing transactions, regardless of the traffic source and attribution model. Keep in mind that metrics showing percentages (bounce rates, engagement rates, conversion paths, key event rates) should be more accurate, regardless of privacy tools and user behavior.
The future of the web: more privacy and less data
It’s safe to predict that more privacy laws are coming. It’s likely that in the future, more users will use tools that protect their privacy. Big tech that prioritizes users (Apple, Mozilla) will get better at blocking trackers by big tech ad companies (Google, Meta).
This all means a bit less data for marketers.
“GA4 is a bit of a mess. We frequently hear complaints about its user experience, privacy implications and surprisingly, its data accuracy. It is significantly impacted by cookie consent/GDPR banners, privacy-focused browsers and ad blockers. And the data modeling used to fill in the gaps caused by the missing data doesn’t seem to be that accurate either.”
What is GA4 data modeling?
The Analytics team at Google tries to fill in these gaps with data modeling based on machine learning, but this doesn’t work for everyone. You need to have consent mode turned on, you need to have enough traffic and you need to set your “Reporting Identity” to Blended. Then it might work.
Even if it does, it will never know how many visitors are using AdBlock or in private browsing mode. This is Google making its best guess.
There’s another gotcha with Google Analytics guessing, according to our friend and Analytics pro, Chris Penn.
“The problem is that not every audience is the same. If our data were missing at random, we could be assured that the guessing was more or less accurate. When data isn’t missing at random, then we run into problems.
For example, iPhone data is often missing due to Apple’s privacy protections. Are iPhone customers different from Android customers? You bet they are. They’re a different demographic with different purchase patterns. Is GA4 compensating for that? We don’t know. If a missing audience is assumed to behave the same as the existing audience, there could be disastrously wrong assumptions.”
What about GA4 consent mode?
Google has also launched Consent Mode V2, which is a method for having your consent management system tell GA4 whether your visitor has granted or declined tracking. The idea is that if they decline to be tracked, Google will know you have a visitor, but track them just a little bit.
Ironic, right? Tell the tracker you don’t want to be tracked!
What about other sources of data?
Beyond GA4, there are other places to look for more accurate website data, such as your CDN or server logs. Server logs are the ultimate for accuracy because nothing can hide the fact that pages are being served. But they’re hard to check and don’t have great reporting. And because they don’t filter bot traffic the numbers might look very strange.
In this screenshot, Cloudflare shows 20x more users than GA4. How many bots are out there? This isn’t helpful…
In a separate analysis comparing GA4 to Matomo (an open source analytics package that runs on the server), Chris found that GA4’s traffic numbers were incorrect, but they were unpredictably incorrect. They ranged from 49% to 173% of Matomo’s numbers, and the variance didn’t correlate with other sources of traffic.
In theory, where the variance in your own account turns out to be predictable, you could use it as a modifier on your Analytics data to calculate more accurate traffic stats, compensating for GA4 inaccuracies.
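As a rough illustration of the modifier idea, here is a short Python sketch. It assumes the ratio between a reference source and GA4 stays stable from one period to the next, which may not hold in practice. The function names and numbers are hypothetical.

```python
def calibration_modifier(reference_count: float, ga4_count: float) -> float:
    """Ratio of reference count to GA4 count for a calibration period."""
    if ga4_count <= 0:
        raise ValueError("ga4_count must be positive")
    return reference_count / ga4_count


def adjusted_estimate(ga4_value: float, modifier: float) -> float:
    """Scale a GA4 number by the modifier to estimate the real figure."""
    return ga4_value * modifier


# Hypothetical calibration period: the reference source shows 100, GA4 shows 85.
modifier = calibration_modifier(reference_count=100, ga4_count=85)

# Apply the modifier to a later GA4 number (assumes the ratio has not drifted).
estimate = adjusted_estimate(ga4_value=170, modifier=modifier)
print(f"Estimated actual count: {estimate:.0f}")
```

It is worth recalculating the modifier over several date ranges first. If it jumps around, the adjusted numbers are no more trustworthy than the raw ones.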
In the end, none of these workarounds will ever give you perfect data. But is that really a problem?
What if GA4 had perfect data?
Try this thought experiment: open GA4 and pretend for a moment that every number you see is 100% correct. Now proceed with your analysis.
Need ideas for analysis? Try answering these questions:
All done? How did it go?
Now ask yourself: Do you really think the insights you found were better because of the data accuracy? Is the lack of GA4 accuracy really holding back your marketing? Be honest.
In my experience, the problem with Google Analytics isn’t accuracy. It’s that marketers don’t approach it consistently, with curiosity and with an eye toward action.
“Let’s stop talking about tools and start talking about data literacy, critical thinking and storytelling with data. Unfortunately, too few digital marketing professionals learn basic analysis and blindly trust digital analytics tools. Also the pressure from Google and Meta to spend more and from agencies profiting from big budgets creates a conflict of interest that isn’t in the interest of brands.
My advice is to take part of your media budget and invest in your data literacy. Then build a simple, consistent approach to data collection, analysis and interpretation. This could be based on GA4 and Search Console or any other software. But analytics is essential for businesses that invest in Google Ads or SEO. Therefore it’s a smart choice to get your data to work for your business.”
To get the most from your GA4 account, find those 10x performers and double down on those activities.
Find unicorns and make baby unicorns!
Want to share or cite this article? Consider using the original version on the Orbit blog. We are happy to post these here, but we're thrilled when the original source gets referenced!
Digital Marketing & Automation | Video & Content Specialist | Occasional Brewer
2 months ago: Vijay Krishnan
Serving Notice Period | Data Engineer at Cognizant
2 months ago: It's interesting to see how GA4's accuracy varies, especially with the use of cookie consent banners. I'm currently working with GA4 and facing similar issues with data accuracy. The 11% underreporting without banners and 20% with them is definitely concerning, but your point about whether more accurate data would actually make a significant difference is thought-provoking. Thanks for sharing this analysis. I'll definitely check out the steps to assess underreporting in my own GA4 account.
Freelance Digital Trafficker & Paid Media Specialist | PPC, SEM, SEA, Paid Search, Paid Social (Self-employed)
3 months ago: Really insightful, Andy Crestodina. Thanks for sharing! Do you recommend using both GA4 and tools like Plausible Analytics?
Virtual CMO and Go-to-Market Builder for Video Tech Companies
3 months ago: Clayton Christensen emphasized the importance of understanding "jobs to be done." When it comes to analytics, the real job is making informed decisions. As long as GA4 provides directional insights that drive strategy effectively, perfect accuracy isn't crucial. Founders should focus on actionable data rather than obsessing over every missed percentage point.
Strategy Director at Adido
3 months ago: Hi Andy Crestodina, thanks for sharing this analysis. I actually stumbled upon your post because I'm having GA4 data headaches and wondered if anyone has found the answer. Whilst your analysis doesn't cover it exactly, I wondered if you had done a comparison with the data that GA4 reveals for device only vs. blended? The headache I have, even with a 95% opt-in rate on my user data, is that GA4 in blended mode is increasing my user and session counts by a whopping 50%! And then it's not surfacing any further details so all the extra dimensions are (not set) like landing page etc. Curious to know if you have seen anything similar when you've been doing these comparisons? Assuming that your benchmarking data set has been GA4 device-only vs. Plausible (and other sources) rather than the blended reporting profile. Dana DiTomaso curious to know if you have also seen similar staggering uplifts with any of the work you've done switching between the reporting profiles and the impact the machine learning has (I currently can't believe this is an effective machine model yet, and if that's the case should we be opting out for at least a year so that the system trains itself better?)