
Why is it so hard to sort by price?

Daniel Tunkelang

Query Understanding

Published: August 7, 2018

A common use case for online shopping is searching for products — say, coffee makers — and then sorting the results by price, from low to high, in order to find a cheap one. Surprisingly, this use case fails on most ecommerce sites.

You’d think that ecommerce sites would have solved this problem years ago. But sorting by price isn’t as simple as it sounds. Why is it so hard?

Sorting exposes the difference between “relevance” and “ranking”.

Most shopping sites allow searchers to sort results by price, but the results often aren’t what you’d expect. For example, the cheapest results for “coffee maker” usually aren’t coffee makers. Instead, they’re accessories like coffee filters and single-use capsules. These irrelevant results match the keywords “coffee” and “maker”, and they cost less than the cheapest coffee maker.

The problem is that most search engines conflate relevance with ranking. In theory, a search engine should only return results that are relevant to the query. In practice, search engine developers focus primarily (or even exclusively) on the relevance of results that searchers are likely to see. Indeed, search evaluation metrics like discounted cumulative gain (DCG) are position-biased, mostly measuring the relevance of top-ranked results.
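To make the position bias concrete, here is a minimal Python sketch of DCG using the common log2 discount. The relevance labels are made up for illustration (1 = relevant, 0 = irrelevant). It shows that appending irrelevant results below the top ranks leaves DCG unchanged, even though precision over the whole result set collapses.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: the gain at 1-based rank i is divided by log2(i + 1)."""
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(relevances, start=1))

def precision(relevances):
    """Fraction of the whole result set that is relevant (labels are 0 or 1)."""
    return sum(relevances) / len(relevances)

# Hypothetical labels: three relevant results, then 97 irrelevant ones buried below them.
top_only = [1, 1, 1]
padded = top_only + [0] * 97

print(dcg(top_only) == dcg(padded))            # irrelevant tail adds nothing to DCG
print(precision(top_only), precision(padded))  # whole-set precision drops sharply
```

A metric like this rewards a clean first page and is blind to everything below it, which is exactly the part of the result set that a price sort can bring to the top.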

Position bias isn’t a problem when results are sorted by their relevance scores — indeed, it’s the whole point of such a sort. But when searchers sort results by price — or by any attribute other than relevance score — then the irrelevant results that had been buried suddenly show up front and center.

So it’s just a relevance problem?

Yes, it’s just a relevance problem. But it’s a problem that isn’t easy to solve.

Search engines face a trade-off between precision (aka relevance) and recall. No search engine is perfect. Increasing precision causes recall to suffer: attempts to remove irrelevant results lead to removing some of the relevant results. Conversely, increasing recall causes precision to suffer: attempts to include more of the relevant results increase the number of irrelevant results.
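The trade-off can be sketched with a toy example. The scores and relevance labels below are invented, and `precision_recall` is a hypothetical helper rather than part of any search library. Raising the score cutoff improves precision at the cost of recall.

```python
def precision_recall(scored, threshold):
    """scored: list of (relevance_score, is_relevant). Keeps items with score >= threshold."""
    kept = [is_rel for score, is_rel in scored if score >= threshold]
    total_relevant = sum(is_rel for _, is_rel in scored)
    true_positives = sum(kept)
    # By convention here, an empty result set has precision 1.0.
    precision = true_positives / len(kept) if kept else 1.0
    recall = true_positives / total_relevant if total_relevant else 1.0
    return precision, recall

# Hypothetical scored results for one query.
scored = [(0.95, True), (0.90, True), (0.70, True), (0.65, False),
          (0.50, True), (0.40, False), (0.30, False), (0.20, True), (0.10, False)]

for threshold in (0.0, 0.45, 0.8):
    p, r = precision_recall(scored, threshold)
    print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f}")
```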

When searchers sort by relevance score, the search engine can afford to err on the side of recall, since searchers won’t see most of the irrelevant results. But when searchers sort by price, poor precision has a very negative impact on the search experience.

In fact, sorting by price tends to amplify precision issues. Consider the coffee makers example: irrelevant results like accessories tend to be outliers on price, so they dominate the lowest-priced results.

Allowing searchers to sort requires separating relevance from ranking.

A search engine that allows searchers to sort by price (or by other attributes that don’t correlate with the relevance score) needs to separate relevance from ranking. Rather than optimizing for a position-biased relevance metric like DCG, it needs to consider precision for the entire result set.

In other words, search engines can’t keep relying on relevance ranking as a crutch that allows them to err aggressively on the side of recall. Instead, they have to face a real trade-off between precision and recall.

In practice, that means establishing a relevance threshold and filtering out results whose relevance score is below that threshold.
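A minimal sketch of that idea, assuming each result carries a relevance score in [0, 1]. The `Result` type, the sample products, and the 0.5 cutoff are all hypothetical. Without a threshold, the accessories lead the price sort. With one, the cheapest actual coffee maker comes first.

```python
from dataclasses import dataclass

@dataclass
class Result:
    title: str
    price: float
    relevance: float  # hypothetical score in [0, 1] from the engine's relevance model

def sort_by_price(results, min_relevance=0.0):
    """Drop results scoring below min_relevance, then sort the rest by ascending price."""
    kept = [r for r in results if r.relevance >= min_relevance]
    return sorted(kept, key=lambda r: r.price)

# Invented result set for the query "coffee maker".
results = [
    Result("Paper coffee filters, 100 ct", 3.99, 0.25),
    Result("Single-use capsules, 12 ct", 6.49, 0.20),
    Result("French press coffee maker", 18.00, 0.85),
    Result("Automatic drip coffee maker", 24.99, 0.95),
    Result("Super automatic espresso machine", 499.00, 0.80),
]

print([r.title for r in sort_by_price(results)][:2])                     # accessories first
print([r.title for r in sort_by_price(results, min_relevance=0.5)][:2])  # coffee makers first
```

Note that the threshold decides only which results are eligible. The ordering itself ignores the relevance score entirely, which is the separation of relevance from ranking.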

It’s important to provide transparency and control to searchers.

Filtering results using a relevance threshold inevitably filters out some results that are relevant. As discussed above, it’s a trade-off. But, in a sort by price, searchers may think that the search engine is deliberately, and deviously, filtering out less expensive results.

In order to earn searchers’ trust, it’s important for the search engine to provide transparency and control. A best practice for implementing a relevance threshold is to show a prominent message at the top of the results, e.g., “We removed some results to show you the most relevant listings. Click here to view all results.”

Offering searchers transparency and control helps assure them that the search engine is making a good-faith effort to satisfy their intent. It’s also useful as a quality check: if many searchers click the link to view all results, it probably means that the relevance threshold is too aggressive.
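The quality check could be as simple as tracking how often searchers who saw filtered results asked to view everything. The sketch below assumes a hypothetical session log of boolean pairs, and the 20% alert level is an arbitrary placeholder, not a recommended value.

```python
def view_all_rate(sessions):
    """sessions: iterable of (results_were_filtered, clicked_view_all) boolean pairs."""
    clicks = [clicked for was_filtered, clicked in sessions if was_filtered]
    return sum(clicks) / len(clicks) if clicks else 0.0

def threshold_looks_too_aggressive(sessions, max_rate=0.2):
    # max_rate is an illustrative cutoff; a real system would tune it from its own data.
    return view_all_rate(sessions) > max_rate

# Invented log: four sessions showed the "removed some results" message, two clicked through.
sessions = [(True, False), (True, True), (True, True), (False, False), (True, False)]
print(view_all_rate(sessions))
print(threshold_looks_too_aggressive(sessions))
```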

Maybe the searcher doesn’t really want to sort by price.

Given the challenges of implementing sorting by price, it’s a good idea to offer searchers alternative ways to refine or organize the results.

One alternative to sorting by price is filtering by price, e.g., allowing searchers to only see results that cost less than $20. Filters don’t change the sort, i.e., results that satisfy the price filter are still sorted by relevance score. But it is important to provide price filters that are appropriate to the search query, e.g., different ranges for coffee makers than for diamond rings.
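One way to get query-appropriate ranges, offered here as an assumption rather than a method the article prescribes, is to derive the filter boundaries from the price distribution of the results for that query. The sketch uses `statistics.quantiles` from the Python standard library, with invented prices.

```python
import statistics

def price_filter_bounds(prices, buckets=4):
    """Cut points that split the prices into roughly equal-sized buckets."""
    return statistics.quantiles(prices, n=buckets)

# Hypothetical prices of the relevant results for two different queries.
coffee_maker_prices = [18, 25, 30, 45, 60, 120, 250, 499]
diamond_ring_prices = [400, 750, 1200, 2500, 4000, 9000]

print(price_filter_bounds(coffee_maker_prices))
print(price_filter_bounds(diamond_ring_prices))
```

In practice the cut points would probably be rounded to friendlier values before being shown as filter labels.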

Another alternative is to organize results using facets or clusters. In the case of coffee makers, that could mean organizing results by the different kinds of coffee makers: automatic drip, French press, super automatic espresso, etc. Indeed, sorting results by price isn’t very useful for heterogeneous result sets, but it can be quite useful once the searcher has narrowed down the result set.
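As a rough illustration, faceting can be sketched as grouping results by product type and sorting by price only within each group. The product data and the `facet_by_type` helper are hypothetical.

```python
from collections import defaultdict

def facet_by_type(results):
    """Group (title, product_type, price) tuples by type; each group is sorted by price."""
    groups = defaultdict(list)
    for title, product_type, price in results:
        groups[product_type].append((price, title))
    return {product_type: sorted(items) for product_type, items in groups.items()}

# Invented catalog entries for the query "coffee maker".
results = [
    ("Basic drip brewer", "automatic drip", 24.99),
    ("Programmable drip brewer", "automatic drip", 59.00),
    ("Steel French press", "French press", 32.50),
    ("Glass French press", "French press", 18.00),
    ("Bean-to-cup espresso machine", "super automatic espresso", 499.00),
]

facets = facet_by_type(results)
for product_type, items in facets.items():
    cheapest_price, cheapest_title = items[0]
    print(f"{product_type}: from {cheapest_price:.2f} ({cheapest_title})")
```

Within a single facet the results are homogeneous, so the cheapest item is a meaningful answer rather than an accessory that happens to match the keywords.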

Many searchers use sorting by price as a proxy for some other intent. If the search engine can directly address that intent, everybody wins.

Summary

Sorting by price is surprisingly difficult, and we haven’t even considered issues like unit pricing and product variations. Allowing users to sort results requires the search engine to separate relevance from ranking, which requires it to face the trade-off between precision and recall. If a search engine implements a relevance threshold, it’s important that it provides transparency and control to earn searchers’ trust. Finally, sorting by price may be a proxy for some other intent that the search engine can address directly.

Prathyusha Senthil Kumar

Engineering Leader, Machine Learning at Meta | Trust & Safety | Generative AI

6 years ago

Wow, that was quick! :)

Sanjeev Somani

CEO @ Tribyl. Improve revenue conversion by eliminating silos, guesswork and opinions.

6 years ago

Another factor at play here is that the User is implicitly going through a conjoint analysis trade-off when making a purchase decision. A user's decision is rarely as simple as picking the lowest price product, even if the results belong to the same product category. A great search experience needs to have a "dialog" with the User, AFTER the basic search is executed -- e.g., what features does the User truly care about, and is willing to pay (more) for. This requires a deep understanding of the customer.

Travel sites have done a particularly good job of this, e.g., allowing the User to refine initial results based on # of stops. In the offline world, a good Salesperson goes through "Discovery" before telling you the actual price. A lazy approach of loading the product catalog into the search engine leaves money on the table for the vendor, in addition to creating the precision vs recall UX issue you mention in your article. That's why this is so hard to do right.

Ashutosh Garg

The right career for everyone in the world

6 years ago

Daniel Tunkelang in general, price sort breaks when you are comparing items of different types. What I have seen work well is asking the user to pick a product category before sorting by price. Another way is to predict the user's primary intent and provide the sort within it.

Benoit Rottembourg

Algorithmic Audit & Regulation at INRIA

6 years ago

There is a special case I have seen in the travel industry, working on an OTA web site. The searcher looks for, say, a flight from Paris to Hammamet on a Friday afternoon. He would like to see the offers by ascending price. The OTA could produce this list but knows that the customer might be seduced by a less costly flight ... not exactly corresponding to his search. And miracle, the OTA has a splendid product, for Hammamet, but on Thursday. Strangely the OTA has a very attractive price for this Thursday .... $200 less. Why? Because the flight is empty and the OTA has negotiated a special allotment with the airline. In that case the OTA doesn't really want to give the exact product of the search, but something good enough. And more attractive in terms of price (and margin for the OTA). I have studied the stats there and obviously... disobeying the search yields considerable benefit. There was a latent search that was not expressed by the customer.

Henry Blaufox

Digital technology & marketing professional - technology adoption, onboarding and process transformation

6 years ago

Dan nails a critical problem in the middle of this essay. For users, when we don't see the products we identified in the search on a price list (usually ascending order), we get frustrated, even suspicious. Are they trying to keep me from seeing the lower priced items? Not exactly a stellar experience, more like the old days rummaging through Alexander's or E J Korvettes. One is left with the impression we are supposed to be frustrated.
