You Won't Believe the Insights a Data Scientist Uncovers about Google Maps!
Personalisation on Google Search

Google Maps is an amazing product, and it brings a lot of value to people's lives. Today, I want to share my views on how someone could build a product like this from scratch.

While it is true that search involves both engineering and data science, many companies may underestimate the importance of data science in search. Search is not just about building a technical infrastructure to process and return results for a given query, but also about understanding and anticipating the needs and preferences of users, and using data to tailor the search experience to individual users.

As a data scientist, I find search personalization and ranking to be a complex and interesting challenge. We capture users' interest by examining: user search queries, past search history, and click data.

This newsletter will answer all your questions about:

  • What data is required?
  • Why do Maps need ranking and relevance?
  • Why does a particular result show up?

Disclaimer: Please DM before reposting this strategy on other platforms, as it is entirely original to me and was not copied. I don’t endorse any brand; the examples shared are just for learning. Anyone can create their own maps from scratch using this concept.

If you would like a consultation from me, reach out HERE

Why this Result and How?

What we are looking for: I live in Gurugram and am searching for the "Eiffel Tower, Paris"

Let's start by typing "ei" on Maps and analyzing what comes back. The query returned the top-5 results listed below.

Fig.1 Search Analysis

Why are there only five results? Search is all about ranking and relevance. The aim of the algorithm is to get users to their desired result with the shortest possible query. At the same time, typing a few more characters to give the algorithm more context is preferable to scrolling through a longer list of results returned by the query "ei."

Reason for this outcome: popularity and contextual signals come into play here. Google Maps recognizes my location and displays the five most popular locations visited or clicked by people who have previously typed "ei." All the results are near my location (see Fig. 1). The query "eif" puts the Eiffel Tower in 3rd position, even though it is far from my location, because of its higher popularity score (see Fig. 2).

Fig.2 Popularity and Distance-Based Search

How to build a user's location-based search?

What do we have:

  • We know the User's Location
  • Locations of all Entities registered - Shops, Cafes, Restaurants, Hotels, etc.

Naive Approach

Using the Haversine formula, calculate the distance between User Lat/Long and other locations within a city or zip code. (see Fig.3)

Fig.3 Haversine Formula
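
As a minimal sketch of the naive approach, the Haversine distance can be computed against every entity and the results sorted. The function names, the mean Earth radius of 6371 km, and the (name, lat, lon) tuple layout are my own assumptions for illustration, not how Google Maps does it.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_entities(user, entities, limit=5):
    """Naive approach: measure the distance to every entity, then sort."""
    scored = [(haversine_km(user[0], user[1], lat, lon), name)
              for name, lat, lon in entities]
    return sorted(scored)[:limit]
```

Every query touches every entity, so the cost grows linearly with the size of the catalogue.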

At Google Maps scale, with billions of entities and millions of users, this naive approach does not scale, because every query would require a distance calculation against every entity.

Smarter Approach with DS Intelligence

  • Let us say we are building this search only for the Bangalore region.
  • We cluster every registered entity, breaking the entire region down into smaller sub-entity clusters, let us say 50. (see Fig.4)

Fig.4 Bangalore into sub-entity clusters

  • When a user comes online, check their lat/long to find the cluster they fall in, then rank the entities within that cluster by popularity. This approach adds both distance-based and popularity-based elements to our search results.
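
The cluster idea above can be sketched as follows. I assume the cluster centroids were already computed offline by some clustering method (k-means, for example), that each entity carries a numeric popularity value, and that plain squared lat/long differences are a good enough distance within a single city. All names here are hypothetical.

```python
from collections import defaultdict


def nearest_centroid(lat, lon, centroids):
    """Index of the closest centroid (squared lat/long difference)."""
    return min(
        range(len(centroids)),
        key=lambda i: (lat - centroids[i][0]) ** 2 + (lon - centroids[i][1]) ** 2,
    )


def build_index(entities, centroids):
    """Offline step: bucket entities by cluster, most popular first."""
    index = defaultdict(list)
    for name, lat, lon, popularity in entities:
        index[nearest_centroid(lat, lon, centroids)].append((popularity, name))
    for bucket in index.values():
        bucket.sort(reverse=True)
    return index


def search(user_lat, user_lon, centroids, index, limit=5):
    """Online step: find the user's cluster, return its most popular entities."""
    cluster = nearest_centroid(user_lat, user_lon, centroids)
    return [name for _, name in index.get(cluster, [])[:limit]]
```

At query time the user is compared against the 50 or so centroids rather than against every entity, and the per-cluster ranking is already precomputed.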

PS. Some of you might ask whether this can be done within Elasticsearch. Check out Geo-Sorting (see Fig.5).

Fig.5 ES Geo-Sorting
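
For reference, here is a rough sketch of a search request body that uses Elasticsearch's `_geo_distance` sort. The field names `name` and `location` are assumptions on my part (`location` would need to be mapped as a `geo_point`), and the coordinates are approximate values for central Bangalore.

```python
import json


def geo_sorted_query(prefix, user_lat, user_lon, size=5):
    """Build a search body that matches a name prefix and sorts hits by distance."""
    return {
        "size": size,
        # "name" is an assumed text field holding the place name
        "query": {"match_phrase_prefix": {"name": prefix}},
        "sort": [
            {
                "_geo_distance": {
                    # "location" is an assumed geo_point field
                    "location": {"lat": user_lat, "lon": user_lon},
                    "order": "asc",
                    "unit": "km",
                }
            }
        ],
    }


body = geo_sorted_query("ei", 12.9716, 77.5946)
print(json.dumps(body, indent=2))
```

This sorts purely by distance; blending in popularity would need an additional scoring step.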

How to make Search more Contextual or Personalized?

To make search more contextual or personalized, you can consider using data-based approaches that take into account the user's specific needs or preferences. For example, if Mr. Wolf is searching for restaurants and has made multiple restaurant-related queries within a short period of time or the same session, you can prioritize showing him results that are personalized to his location and also consider his past search history. This can involve ranking restaurants near him higher in the search results and presenting him with options that are tailored to his tastes or preferences or his previous visits to different locations.

By using data-based approaches, you can provide Mr. Wolf with more relevant and targeted search results rather than simply relying on popular or distance-based criteria.
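
One simple way to picture this is a re-ranking step that boosts categories the user has been searching for in the current session. This is only a sketch under my own assumptions: the `boost` factor, the (name, category, base_score) layout, and the scoring formula are all hypothetical.

```python
from collections import Counter


def personalised_rank(candidates, session_categories, boost=0.5, limit=5):
    """Re-rank candidates, boosting categories seen in the user's session.

    candidates: list of (name, category, base_score) tuples.
    session_categories: categories of the user's recent queries.
    """
    counts = Counter(session_categories)
    total = sum(counts.values()) or 1

    def score(item):
        _, category, base = item
        share = counts[category] / total  # fraction of session queries in this category
        return base * (1 + boost * share)

    ranked = sorted(candidates, key=score, reverse=True)
    return [name for name, _, _ in ranked[:limit]]
```

With an empty session the order falls back to the base score, so the popularity and distance ranking is untouched for users we know nothing about.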

Someone may ask: "Shaurya, you have talked about the popularity aspect of search, but what are some different ways to identify a location as popular?"

  • Number of views/clicks in the last month
  • Number of people who visited the place in the last month
  • Cross-platform proxy: Google Pay transaction analysis, where a merchant's shop is considered popular if its transaction volume is high
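
The signals above could be rolled into a single score along these lines. The event layout, the 30-day window, and especially the weights are hypothetical values I picked for illustration; in practice they would need tuning.

```python
from collections import Counter
from datetime import datetime, timedelta


def popularity_scores(events, now, window_days=30, weights=None):
    """Score places from (place, signal, timestamp) events in the recent window."""
    # Hypothetical weights: a visit counts more than a click.
    weights = weights or {"click": 1.0, "visit": 3.0, "transaction": 2.0}
    cutoff = now - timedelta(days=window_days)
    scores = Counter()
    for place, signal, ts in events:
        if ts >= cutoff and signal in weights:
            scores[place] += weights[signal]
    return scores
```

Events older than the window are ignored, so a place that was busy long ago does not keep ranking high.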

Recommendations from Past Cached Searches

No Data Science model can beat the simplicity of well-presented historic data (see Fig. 6) with caching algorithms: LFU (Least Frequently Used) or LFU with Dynamic Aging (check the below link for a detailed LFU with Dynamic Aging explanation).

Fig. 6 Cached Search
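
To make LFU with Dynamic Aging a little more concrete, here is a minimal sketch. Each entry's priority is its hit count plus the cache "age", and the age is raised to the evicted entry's priority on every eviction, so stale entries that were once popular can eventually be displaced. The class and attribute names are my own, and eviction uses a simple linear scan rather than a heap.

```python
class LFUDACache:
    """Tiny LFU-with-Dynamic-Aging cache for recent searches (illustrative)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.age = 0.0      # cache age, raised on every eviction
        self.values = {}    # query -> cached value
        self.freq = {}      # query -> hit count
        self.priority = {}  # query -> hit count + cache age at last access

    def _touch(self, key):
        self.freq[key] = self.freq.get(key, 0) + 1
        self.priority[key] = self.freq[key] + self.age

    def get(self, key):
        if key not in self.values:
            return None
        self._touch(key)
        return self.values[key]

    def put(self, key, value):
        if self.capacity <= 0:
            return
        if key not in self.values and len(self.values) >= self.capacity:
            # Evict the lowest-priority entry and age the cache to its priority.
            victim = min(self.priority, key=self.priority.get)
            self.age = self.priority[victim]
            for table in (self.values, self.freq, self.priority):
                del table[victim]
        self.values[key] = value
        self._touch(key)
```

Plain LFU is the special case where the age never changes, which is why old, once-frequent entries can linger in it.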


Summary of our Analysis

1. Why only 5 results are shown in the search results list

2. Ranking of results based on Popularity + Distance scoring

3. Geo-Sorting in Elasticsearch

4. Personalisation in search based on the historic category of queries

5. How to identify an entity as a popular location

6. Recommendations based on caching of past search queries and places visited

Thank you everyone for gifting me this Award
Noonies Tech 2022
Connect, Follow or Endorse me on LinkedIn if you found this read useful. To learn more about me visit: Here

This newsletter has a large subscriber base, with over 4,800 subscribers. If you are a company or individual working on artificial intelligence or data-related products or services, you are invited to consider sponsoring one of the upcoming newsletter issues. For more information on sponsorship opportunities, please contact [email protected].

Subscribe to get an email notification for every newsletter published: HERE
Shaurya Uppal

Data Scientist | MS CS, Georgia Tech | AI, Python, SQL, GenAI | Inventor of Ads Personalization RecSys Patent | Makro | InMobi (Glance) | 1mg | Fi
