Why is it so hard to sort by price?
A common use case for online shopping is searching for products — say, coffee makers — and then sorting the results by price, from low to high, in order to find a cheap one. Surprisingly, this use case fails on most ecommerce sites.
You’d think that ecommerce sites would have solved this problem years ago. But sorting by price isn’t as simple as it sounds. Why is it so hard?
Sorting exposes the difference between “relevance” and “ranking”.
Most shopping sites allow searchers to sort results by price, but the results often aren’t what you’d expect. For example, the cheapest results for “coffee maker” usually aren’t coffee makers. Instead, they’re accessories like coffee filters and single-use capsules. These irrelevant results match the keywords “coffee” and “maker”, and they cost less than the cheapest coffee maker.
The problem is that most search engines conflate relevance with ranking. In theory, a search engine should only return results that are relevant to the query. In practice, search engine developers focus primarily, or even exclusively, on the relevance of results that searchers are likely to see. Indeed, search evaluation metrics like discounted cumulative gain (DCG) are position-biased, mostly measuring the relevance of top-ranked results.
Position bias isn’t a problem when results are sorted by their relevance scores — indeed, it’s the whole point of such a sort. But when searchers sort results by price — or by any attribute other than relevance score — then the irrelevant results that had been buried suddenly show up front and center.
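To make the position bias concrete, here is a minimal sketch of DCG using the common log2 discount. The binary relevance labels and the 20-result list are made up for illustration. It compares how much DCG drops when one irrelevant result sits at the bottom of the list versus at the top.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: each gain is divided by log2(position + 1)."""
    return sum(rel / math.log2(pos + 1) for pos, rel in enumerate(relevances, start=1))

# Hypothetical labels (1 = relevant, 0 = irrelevant) for 20 results.
all_relevant = [1] * 20
bad_at_bottom = [1] * 19 + [0]   # one irrelevant result buried at position 20
bad_at_top = [0] + [1] * 19      # the same irrelevant result at position 1

loss_bottom = dcg(all_relevant) - dcg(bad_at_bottom)
loss_top = dcg(all_relevant) - dcg(bad_at_top)

print(f"DCG lost with the bad result at position 20: {loss_bottom:.3f}")
print(f"DCG lost with the bad result at position 1:  {loss_top:.3f}")
```

The buried result costs less than a quarter of what the same result costs at position 1, which is why a team tuning for DCG has little incentive to remove it. A sort by price can move that result to position 1.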
So it’s just a relevance problem?
Yes, it’s just a relevance problem. But it’s a problem that isn’t easy to solve.
Search engines face a trade-off between precision (aka relevance) and recall. No search engine is perfect. Increasing precision causes recall to suffer: attempts to remove irrelevant results lead to removing some of the relevant results. Conversely, increasing recall causes precision to suffer: attempts to include more of the relevant results increase the number of irrelevant results.
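As a small illustration of the trade-off, the sketch below computes precision and recall for two hypothetical retrieval strategies over a made-up catalog: a strict one that returns few results and a loose one that returns many. The product identifiers are invented for the example.

```python
def precision_recall(returned, relevant):
    """Precision = share of returned results that are relevant.
    Recall = share of relevant results that were returned."""
    returned, relevant = set(returned), set(relevant)
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical ground truth: the four actual coffee makers in the catalog.
relevant = {"drip", "french_press", "espresso", "percolator"}

# A strict matcher returns only coffee makers, but misses half of them.
strict = ["drip", "espresso"]
# A loose matcher finds every coffee maker, along with accessories.
loose = ["drip", "french_press", "espresso", "percolator",
         "filters", "capsules", "mug", "descaler"]

print(precision_recall(strict, relevant))  # high precision, low recall
print(precision_recall(loose, relevant))   # low precision, high recall
```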
When searchers sort by relevance score, the search engine can afford to err on the side of recall, since searchers won’t see most of the irrelevant results. But when searchers sort by price, poor precision has a very negative impact on the search experience.
In fact, sorting by price tends to amplify precision issues. Consider the coffee makers example: irrelevant results like accessories tend to be outliers on price, so they dominate the lowest-priced results.
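The effect is easy to reproduce on a toy result set. In the sketch below, the titles, prices, and relevance scores are all invented for illustration. The same six results are sorted once by relevance score and once by price.

```python
from dataclasses import dataclass

@dataclass
class Result:
    title: str
    price: float
    relevance: float  # hypothetical score in [0, 1]

# Made-up results for the query "coffee maker".
results = [
    Result("12-cup drip coffee maker", 39.99, 0.95),
    Result("French press coffee maker", 24.99, 0.90),
    Result("Espresso machine", 199.00, 0.85),
    Result("Paper coffee filters, 100 ct", 3.49, 0.30),
    Result("Single-use capsules, 10 ct", 6.99, 0.25),
    Result("Coffee maker descaling solution", 8.99, 0.35),
]

by_relevance = sorted(results, key=lambda r: r.relevance, reverse=True)
by_price = sorted(results, key=lambda r: r.price)

top3_by_relevance = [r.title for r in by_relevance[:3]]
top3_by_price = [r.title for r in by_price[:3]]

print("Top 3 by relevance:", top3_by_relevance)
print("Top 3 by price:    ", top3_by_price)
```

Sorted by relevance, the top three are all coffee makers. Sorted by price, the top three are all accessories, even though the result set itself has not changed.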
Allowing searchers to sort requires separating relevance from ranking.
A search engine that allows searchers to sort by price, or by other attributes that don’t correlate with the relevance score, needs to separate relevance from ranking. Rather than optimizing for a position-biased relevance metric like DCG, it needs to consider precision for the entire result set.
In other words, search engines can’t keep relying on relevance ranking as a crutch that allows them to err aggressively on the side of recall. Instead, they have to face a real trade-off between precision and recall.
In practice, that means establishing a relevance threshold and filtering out results whose relevance score is below that threshold.
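A minimal sketch of that approach follows. The threshold value, the field names, and the sample results are assumptions made for the example. A real threshold would need to be tuned, and may need to vary by query. The function filters first and sorts second, and it also reports how many results were removed so the interface can tell the searcher.

```python
# Hypothetical cutoff; in practice this would be tuned on judged data.
RELEVANCE_THRESHOLD = 0.5

def sort_by_price_with_threshold(results, threshold=RELEVANCE_THRESHOLD):
    """Drop results scoring below the threshold, then sort the rest by price.

    Returns the sorted results and the number of results removed.
    """
    kept = [r for r in results if r["relevance"] >= threshold]
    removed_count = len(results) - len(kept)
    return sorted(kept, key=lambda r: r["price"]), removed_count

# Made-up results for the query "coffee maker".
results = [
    {"title": "12-cup drip coffee maker", "price": 39.99, "relevance": 0.95},
    {"title": "French press coffee maker", "price": 24.99, "relevance": 0.90},
    {"title": "Paper coffee filters, 100 ct", "price": 3.49, "relevance": 0.30},
    {"title": "Single-use capsules, 10 ct", "price": 6.99, "relevance": 0.25},
]

kept, removed_count = sort_by_price_with_threshold(results)
for r in kept:
    print(f'{r["price"]:>7.2f}  {r["title"]}')
print(f"{removed_count} less relevant results hidden")
```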
It’s important to provide transparency and control to searchers.
Filtering results using a relevance threshold inevitably filters out some results that are relevant. As discussed above, it’s a trade-off. But, in a sort by price, searchers may think that the search engine is deliberately, and deviously, filtering out less expensive results.
In order to earn searchers’ trust, it’s important for the search engine to provide transparency and control. A best practice for implementing a relevance threshold is to show a prominent message at the top of the results, e.g., “We removed some results to show you the most relevant listings. Click here to view all results.”
Offering searchers transparency and control helps assure them that the search engine is making a good-faith effort to satisfy their intent. It’s also useful as a quality check: if many searchers click the link to view all results, it probably means that the relevance threshold is too aggressive.
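The quality check can be sketched as a simple rate over logged events. The counter names and the 20% alert level below are assumptions for illustration. The right alert level depends on the site and should be set empirically.

```python
def view_all_rate(filtered_searches, view_all_clicks):
    """Share of threshold-filtered searches where the searcher asked for all results."""
    if filtered_searches == 0:
        return 0.0
    return view_all_clicks / filtered_searches

def threshold_looks_too_aggressive(filtered_searches, view_all_clicks, max_rate=0.2):
    """Flag the threshold for review when the view-all rate exceeds max_rate.

    max_rate is a hypothetical alert level, not a recommended value.
    """
    return view_all_rate(filtered_searches, view_all_clicks) > max_rate

# Hypothetical weekly counts from search logs.
print(threshold_looks_too_aggressive(filtered_searches=1000, view_all_clicks=50))
print(threshold_looks_too_aggressive(filtered_searches=1000, view_all_clicks=300))
```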
Maybe the searcher doesn’t really want to sort by price.
Given the challenges of implementing sorting by price, it’s a good idea to offer searchers alternative ways to refine or organize the results.
One alternative to sorting by price is filtering by price, e.g., allowing searchers to only see results that cost less than $20. Filters don’t change the sort, i.e., results that satisfy the price filter are still sorted by relevance score. But it is important to provide price filters that are appropriate to the search query, e.g., different ranges for coffee makers than for diamond rings.
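One way to get query-appropriate ranges is to derive them from the prices of the results themselves. The sketch below is one possible approach, not a standard one. It cuts the price distribution into quartiles using `statistics.quantiles` from the Python standard library. The price lists are made up.

```python
import statistics

def price_filter_ranges(prices, buckets=4):
    """Derive (low, high) price ranges from the quantiles of the result prices.

    None marks an open end, i.e. the first range is "under X" and the
    last range is "X and over".
    """
    cuts = sorted({round(c) for c in statistics.quantiles(prices, n=buckets)})
    edges = [None] + cuts + [None]
    return list(zip(edges[:-1], edges[1:]))

# Hypothetical result prices for two very different queries.
coffee_maker_prices = [19.99, 24.99, 29.99, 39.99, 49.99, 79.99, 129.00, 199.00]
diamond_ring_prices = [450, 800, 1200, 1500, 2500, 4000, 7500, 12000]

coffee_ranges = price_filter_ranges(coffee_maker_prices)
ring_ranges = price_filter_ranges(diamond_ring_prices)

print("coffee makers:", coffee_ranges)
print("diamond rings:", ring_ranges)
```

Rounding the cut points keeps the filter labels readable. A production version would likely snap them to friendlier values such as 25, 50, or 100.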
Another alternative is to organize results using facets or clusters. In the case of coffee makers, that could mean organizing results by the different kinds of coffee makers: automatic drip, French press, super automatic espresso, etc. Indeed, sorting results by price isn’t very useful for heterogeneous result sets, but it can be quite useful once the searcher has narrowed down the result set.
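As a rough sketch of the faceted approach, the code below groups results by a product type field and sorts by price only within each group. The `type` field and the sample data are assumptions for the example. In a real catalog the type would come from structured product data or a classifier.

```python
from collections import defaultdict

def facet_by_type(results):
    """Group results by product type, sorting each group by ascending price."""
    facets = defaultdict(list)
    for r in results:
        facets[r["type"]].append(r)
    return {t: sorted(items, key=lambda r: r["price"]) for t, items in facets.items()}

# Made-up results with a hypothetical "type" attribute.
results = [
    {"title": "Programmable drip brewer", "type": "automatic drip", "price": 59.99},
    {"title": "Glass French press, 34 oz", "type": "French press", "price": 24.99},
    {"title": "Basic 5-cup drip brewer", "type": "automatic drip", "price": 19.99},
    {"title": "Steel French press, 34 oz", "type": "French press", "price": 34.99},
]

facets = facet_by_type(results)
for product_type, items in facets.items():
    print(product_type, [(r["title"], r["price"]) for r in items])
```

Within each facet the results are homogeneous, so the cheapest item in a group is a meaningful answer to the searcher's question.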
Many searchers use sorting by price as a proxy for some other intent. If the search engine can directly address that intent, everybody wins.
Summary
Sorting by price is surprisingly difficult, and we haven’t even considered issues like unit pricing and product variations. Allowing users to sort results requires the search engine to separate relevance from ranking, which in turn requires it to face the trade-off between precision and recall. If a search engine implements a relevance threshold, it’s important that it provides transparency and control to earn searchers’ trust. Finally, sorting by price may be a proxy for some other intent that the search engine can address directly.
Engineering Leader, Machine Learning at Meta | Trust & Safety | Generative AI
6 years ago: Wow, that was quick! :)
CEO @ Tribyl. Improve revenue conversion by eliminating silos, guesswork and opinions.
6 years ago: Another factor at play here is that the User is implicitly going through a conjoint analysis trade-off when making a purchase decision. A user's decision is rarely as simple as picking the lowest price product, even if the results belong to the same product category. A great search experience needs to have a "dialog" with the User, AFTER the basic search is executed -- e.g., what features does the User truly care about, and is willing to pay (more) for. This requires a deep understanding of customer. Travel sites have done a particularly good job of this, e.g., allowing the User to refine initial results based on # of stops. In the offline world, a good Salesperson goes through "Discovery" before telling you the actual price. A lazy approach of loading the product catalog into the search engine leaves money on the table for the vendor, in addition to creating the precision vs recall UX issue you mention in your article. That's why this is so hard to do right.
The right career for everyone in the world
6 years ago: Daniel Tunkelang, in general price sort breaks when you are comparing items of different types. What I have seen work well is asking the user to pick a product category before sorting by price. Another way is to predict the user's primary intent and provide sorting within it.
Algorithmic Audit & Regulation at INRIA
6 years ago: There is a special case I have seen in the travel industry, working on an OTA web site. The searcher looks for, say, a flight from Paris to Hammamet on a Friday afternoon. He would like to see the offers by ascending price. The OTA could produce this list but knows that the customer might be seduced by a less costly flight ... not exactly corresponding to his search. And, miracle, the OTA has a splendid product for Hammamet, but on Thursday. Strangely, the OTA has a very attractive price for this Thursday .... $200 less. Why? Because the flight is empty and the OTA has negotiated a special allotment with the airline. In that case the OTA doesn't really want to give the exact product of the search, but something good enough, and more attractive in terms of price (and margin for the OTA). I have studied the stats there and obviously... disobeying the search yields considerable benefit. There was a latent search that was not expressed by the customer.
Digital technology & marketing professional - technology adoption, onboarding and process transformation
6 years ago: Dan nails a critical problem in the middle of this essay. For users, when we don't see the products we identified in the search on a price list (usually in ascending order), we get frustrated, even suspicious. Are they trying to keep me from seeing the lower priced items? Not exactly a stellar experience, more like the old days rummaging through Alexander's or E J Korvettes. One is left with the impression we are supposed to be frustrated.