Google's Semantic Search: Going to the Dogs?
Google is the undisputed leader in web search – technically a monopoly, in fact. The coverage of web properties (good and bad) is vast – about 400 billion documents – so in quantitative terms it's really very good.
But what about in qualitative terms? Google says it does "semantic search" and uses knowledge graphs (they actually popularized the term in 2012), at least since they acquired Metaweb in 2010. Unfortunately, there seems to be a recent uptick in reports of worse quality search results. So my team collected a bit of data to get more insights into what's going on.
Why should you care? Over 1 million companies rely on Google Ads and 76% of retail search ad spend goes to Google Shopping, generating 85% of clicks for retailers. So when Google search does something wrong, a huge chunk of the economy leaves a mountain of cash on the table. Maybe all these companies should pay a little more attention to exactly how good Shopping search really is.
The Google search bar suggests that different search engines are available for everything on the web, for images, videos, forums, and for the specific use case of Shopping. But the Shopping search use case is different from the others: it offers a more focused search task with much clearer user intent. We can safely assume that most users who opt for the Shopping experience are interested in acquiring something (as described in a query) and are willing to pay for it. So this clearer use case should enable a more targeted evaluation of how relevant the results really are – and more relevant results in the end.
Is Google Search barking up the wrong tree?
If Google is indeed doing semantic search for shoppers, then the most basic thing it should be able to deal with really well is synonyms. Queries with the same meaning should simply yield the same or very similar results. That's it, just same or different. Nothing fancy. If you get different results for different synonyms, advertisers run a serious risk that customers won't find what they're looking for (your products!) when they (without knowing) use the "wrong" term. And every unsuccessful search means dollars lost. Lots of unsuccessful searches means lots of dollars lost.
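For readers who want to put a number on "same or very similar results", here is a minimal sketch of one possible measure: the Jaccard overlap between the result lists of two queries, averaged over all query pairs. It assumes each result can be identified by some product ID. The function names and the toy data are invented for illustration; they are not our collected results and not necessarily how we scored them.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two result lists, treated as sets of product IDs."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # two empty result lists are trivially identical
    return len(set_a & set_b) / len(set_a | set_b)


def mean_pairwise_overlap(results_by_query):
    """Average Jaccard similarity over every pair of queries."""
    pairs = list(combinations(results_by_query.values(), 2))
    if not pairs:
        return 1.0  # fewer than two queries: nothing to compare
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy data: these product IDs are made up for the example.
toy_results = {
    "dog": ["p1", "p2", "p3", "p4"],
    "pooch": ["p1", "p2", "p3", "p5"],
    "mutt": ["p9", "p8", "p7", "p6"],
}

if __name__ == "__main__":
    print(jaccard(toy_results["dog"], toy_results["pooch"]))
    print(mean_pairwise_overlap(toy_results))
```

A score near 1 would mean synonyms are treated alike; a score near 0 would mean each synonym sends the shopper to a different shelf. Jaccard ignores ranking, so a rank-aware measure could be swapped in if the order within the top 10 matters.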
So to double check this, we started with a very, very simple scenario: go to the Shopping tab and buy a dog. We simply wanted to buy a four-legged creature as a pet. Seems simple enough, right? If they can't even get that right, why bother checking more sophisticated stuff?
Apparently it's not as simple as it seems.
We looked at the very, very best results that Google's purebred search could offer – the top 10 results under All Products. Not the sponsored items, popular items, or Google Maps results, which use different tech. Just plain vanilla search.
We even gave Google a little help: we tested 16 synonyms (or near synonyms) for dog to make sure we didn't miss anything at all: dog, pooch, mutt, cur, barker, collie, beagle, canine, puppy, puppy dog, Snoopy, show dog, mastiff, mongrel, doggie, whelp. With 10 results per query, this yielded a total of 160 search results.
Here's what we found:
I'm not making this stuff up. We tried hard – with 16 different, very simple queries (how many users will go that far?!) – but had to work quite a bit to find out where to buy a regular old pooch.
It's clear that Google's "semantic" search cannot even deal systematically with synonyms of simple, high-frequency words like dog. All the mumbo-jumbo about deep learning and transformer models and web-scale natural language processing doesn't seem to be doing them much good at all.
What does Google Shopping search return instead? A dog's breakfast!
Why these results? Apparently because these specific items included the string dog or one of the other query terms. This is spelling-based search, not meaning-based search. As other advertisers used to say, "Where's the beef?" There's precious little semantics in their so-called semantic search.
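To make the spelling-versus-meaning distinction concrete, here is a toy sketch. It is not Google's implementation, which we have no visibility into. The synonym set is hand-written and hypothetical, the product titles in the usage example are invented, and the helper names (`string_match`, `concept_match`) are made up for this illustration.

```python
import re

# Hypothetical, hand-written synonym set; a real system would draw on a
# curated thesaurus or knowledge graph rather than this toy list.
DOG_SYNONYMS = {"dog", "pooch", "mutt", "canine", "puppy", "doggie"}


def tokens(text):
    """Lowercase alphabetic tokens of a product title."""
    return re.findall(r"[a-z]+", text.lower())


def string_match(query, title):
    """Spelling-based: the query string must literally appear in the title."""
    return query.lower() in title.lower()


def concept_match(query, title, synonyms=DOG_SYNONYMS):
    """Meaning-based (toy): a query in the synonym set matches any member of it."""
    q = query.lower()
    terms = synonyms if q in synonyms else {q}
    return any(token in terms for token in tokens(title))


if __name__ == "__main__":
    title = "Labrador puppy, 8 weeks old"  # invented product title
    print(string_match("pooch", title))    # literal lookup
    print(concept_match("pooch", title))   # synonym-expanded lookup
```

With string matching, a shopper who types pooch never sees the puppy listing; with even this crude synonym expansion, they do. Note that expansion alone does not handle word senses: a title like "Hot dog roller grill" still matches dog under both functions.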
And instead of fixing this super basic issue, they hide it. Have you noticed that more and more "special" results are appearing above the plain ol' search output – and pushing it down "below the fold"? Google Maps, Popular Items, Sponsored Items, Fast Pickup items, and now a generative AI pane – Google apparently sees that these other results are now all more useful than search itself.
The lesson here is very clear. Their engineers are chasing quantity over quality, math over meaning. So what you see is what you get – mentions of strings, not meaningful groupings or relevant products. Clients pay for smart semantic search but that's clearly not what they get.
So for lack of an effective approach to semantics, there's a huge amount of money that Google's 1 million clients are NOT earning. A mountain of missed sales and an ocean of missed opportunities.
If only I had a penny for each irrelevant result...
Head of Operations at Ortelius, Transforming Data Complexity into Strategic Insights
2 months ago: As I was recently told at a tech conference, with a frown directed at me: "… are you still using Google?" It was said the way people, not long ago, talked about Facebook.
Principal at Legacy Software, Ltd.
2 months ago: @MikeD - To the best of my knowledge, one thing not mentioned is that Google search is built on the existence of EXPLICIT HTML links. Lots of "content" lacks such hard links. The "context" is assumed, not stated.
Creating content that means business
3 months ago: I've been writing a lot of articles lately for the veterinary industry, and I mean for the veterinary profession itself, not pet parents! Then I wrote an article on Privacy Enhancing Technologies (PETs), and almost all of my search results were about pet care. Which is especially odd since not once, in the research I did for the veterinary industry, did I include the word "pets" in my search terms.
Librarian specializing in data processing, information retrieval, and knowledge representation, with a focus on business and technology.
3 months ago: Alexander Rodrigues Silva
Disambiguation Specialist
3 months ago: Mike Dillinger, PhD - Excellent post. There is a reason why Google's Search Appliance failed in the enterprise and has since been discontinued. It could not semantically separate structurally identical strings like 'order' as a functional noun and an imperative verb. Nor, as you point out, could it deal with equivalences like synonyms. Basically, it said: "Yep, Mike: I found the string 'order' 2,192 times in 12 nanoseconds. You're on your own to determine whether there is one on the list that makes sense to you." It sucks at acronyms, codes, and abbreviations, too.