Modern Visual RecSys: How does a recommender work?
I have worked in the data industry for over seven years and had the privilege of designing, building, and deploying two recommender systems (RecSys) that went on to serve millions of customers. In this series of articles, I will introduce modern approaches to visual recommenders by walking through case studies with code and sharing some of my experience designing RecSys.
This is part of my Modern Visual RecSys series; feel free to check out the rest of the series at the end of the article.
RecSys Basics — Spotify Case Study
We begin with a case study of Spotify to understand how RecSys works and introduce several key concepts, including a modern approach called convolutional neural networks (CNN), applied to music.
My “discover weekly” recommendations from Spotify
Let us take a look at my personalized music recommendations from Spotify. They contain a mix of Chinese/Japanese/English pop music, with new music and old tunes going back some 20 years. A few observations as I scroll through the recommendations:
- None of the music is from artists whose songs I have saved to “Liked”.
- The genre is similar to what I usually listen to.
- The tunes are similar to what I usually listen to.
- There is a mix of new and old songs.
It seems that this recommendation product is trying to help me discover new music that is familiar yet different from my usual listening habits. But how is this achieved? Chris Johnson from Spotify has the following slide on the architecture of Discover Weekly:
Source: slide from presentation: From Idea to Execution: Spotify’s Discover Weekly by Chris Johnson
We see that there are three main methods employed (the red box):
- Collaborative filtering (CF) using user behavior (play logs) and music content (track metadata).
- Natural Language Processing (NLP) and text mining/scraping of news/blogs/text and music content (track metadata).
- Audio models that analyze the raw audio data.
The result is a “Spotify blob” for each user, with an ever-shifting musical preference based on user interactions and an expanding library of music from Spotify. Visually, the goal of Discover Weekly is to find these white contour lines that cut across the user’s musical preferences.
Source: The magic that makes Spotify’s Discover Weekly playlists so damn good by Quartz & Spotify
Let us dive deeper into each of these methods.
Collaborative filtering (CF)
Source: The magic that makes Spotify’s Discover Weekly playlists so damn good by Quartz & Spotify
CF is the classic method employed across different RecSys. It takes your interactions (clicks, saves, likes, purchases, etc.) and matches you with other users in the system who have similar tastes (in music, films, fashion, etc.).
CF assumes that users with similar tastes will appreciate content from others within the same community.
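To make this assumption concrete, here is a minimal sketch of user-based CF in plain Python. The play counts, user names, and the `recommend` helper are all made up for illustration; this is not how Spotify implements it, and production systems usually rely on matrix factorization rather than comparing every pair of users.

```python
from math import sqrt

# Hypothetical play counts (user -> {track: plays}); not real Spotify data.
plays = {
    "alice": {"track_a": 5, "track_b": 3, "track_c": 4},
    "bob": {"track_a": 4, "track_b": 2, "track_d": 5},
    "carol": {"track_e": 5, "track_f": 4},
}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(user, plays, top_n=3):
    """Score unseen tracks by the similarity-weighted plays of other users."""
    seen = plays[user]
    scores = {}
    for other, their_plays in plays.items():
        if other == user:
            continue
        sim = cosine(seen, their_plays)
        if sim <= 0:
            continue  # no shared taste, so this user contributes nothing
        for track, count in their_plays.items():
            if track not in seen:
                scores[track] = scores.get(track, 0.0) + sim * count
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


alice_recs = recommend("alice", plays)  # alice overlaps with bob only
carol_recs = recommend("carol", plays)  # carol overlaps with nobody
```

In this toy data, alice shares two tracks with bob, so she is recommended the track that only bob has played. carol shares nothing with anyone, so she gets an empty list.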
There are drawbacks such as:
- Echo chambers (Facebook showing you left/right-wing posts over and over again based on your reading behavior)
- Safe but boring recommendations (recommending yet another Artist A song to someone who is already a big fan of Artist A)
- Cold start problem where CF cannot match new items/users due to a lack of data — CF will always need to be paired with a backup plan in deployment (top/most popular products for example)
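The backup plan for the cold start problem can be sketched as a simple fallback: serve personalized results when they exist and pad the rest with the most popular items. The function names, the play log, and the `personalized` lookup below are hypothetical; a real deployment would also filter out tracks the user has already heard.

```python
from collections import Counter

# Hypothetical play log of (user, track) events.
play_log = [
    ("alice", "track_a"), ("bob", "track_a"), ("carol", "track_a"),
    ("bob", "track_b"), ("carol", "track_b"),
    ("carol", "track_c"),
]

# Hypothetical output of a CF model; a brand-new user has no entry here.
personalized = {"alice": ["track_d", "track_b"]}


def most_popular(play_log):
    """Return all tracks ranked by total play count, most played first."""
    counts = Counter(track for _, track in play_log)
    return [track for track, _ in counts.most_common()]


def recommend_with_fallback(user, personalized, play_log, top_n=3):
    """Use personalized results when present, then pad with popular tracks."""
    recs = list(personalized.get(user, []))[:top_n]
    for track in most_popular(play_log):
        if len(recs) >= top_n:
            break
        if track not in recs:
            recs.append(track)
    return recs


known_user = recommend_with_fallback("alice", personalized, play_log)
new_user = recommend_with_fallback("dave", personalized, play_log)
```

Here alice keeps her two personalized tracks and gets one popular track as padding, while the unknown user dave receives the popularity ranking only.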
There are various implementations of CF, for example Spark’s alternating least squares (ALS), FastAI’s collab, and Surprise (for explicit/user rating data). Check out the further readings section for more tutorials.
Natural Language Processing (NLP)
One way to handle the cold start problem, especially for new releases, is to scrape internet news and blogs and fill in metadata about the song (artist, title, mood {happy, love…}, genre {pop, Korean}, etc.) with web scrapers like Beautiful Soup and Scrapy.
Source: slide from presentation: From Idea to Execution: Spotify’s Discover Weekly by Chris Johnson
With the scraped textual data, together with details from the playlist, it is possible to associate keywords with individual artists/playlists.
Modern approaches make use of word embeddings to construct sentence/document vectors: mathematical representations that allow for comparison across the vector space.
Common techniques are word2vec, doc2vec, and Latent Dirichlet Allocation (LDA). Vectorization is key to the content-based recommender my team built at Tech in Asia.
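As a rough illustration of vectorization, the sketch below averages word vectors into a document vector and compares documents by cosine similarity. The three-dimensional embeddings, the artist names, and the keywords are invented for the example. Real word2vec or doc2vec vectors are learned from a corpus and are much larger, and this is not the exact method used at Spotify or Tech in Asia.

```python
from math import sqrt

# Hypothetical 3-dimensional word embeddings (made up for illustration).
embeddings = {
    "upbeat": [0.9, 0.1, 0.0],
    "dance": [0.8, 0.2, 0.1],
    "pop": [0.7, 0.3, 0.1],
    "mellow": [0.1, 0.9, 0.2],
    "acoustic": [0.0, 0.8, 0.3],
    "ballad": [0.1, 0.9, 0.1],
}


def document_vector(text):
    """Average the embeddings of the known words in a piece of text."""
    words = [w for w in text.lower().split() if w in embeddings]
    if not words:
        return None  # nothing recognizable to embed
    dims = len(next(iter(embeddings.values())))
    return [sum(embeddings[w][i] for w in words) / len(words) for i in range(dims)]


def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical keywords scraped for two artists, plus a user's taste profile.
artist_keywords = {
    "artist_x": "dance pop",
    "artist_y": "mellow acoustic ballad",
}
query = document_vector("upbeat dance pop")

ranked = sorted(
    artist_keywords,
    key=lambda name: cosine(query, document_vector(artist_keywords[name])),
    reverse=True,
)
```

Because both the query and `artist_x` sit in the "upbeat pop" region of this toy vector space, `artist_x` ranks first even though it has no play history at all, which is why this approach helps with new releases.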
Source: Introducing Tech in Asia’s unique content recommender by Will Ho & Joshua Lim
Audio Models
Sander Dieleman (Research Scientist at DeepMind) once interned at Spotify and wrote a great article on Recommending music on Spotify with deep learning. He used a technique called convolutional neural networks (CNN) that we will cover in later chapters. Intuitively, our goal is for each filter (shown as columns in the image below) to pick up a distinct musical feature.
Visualization of the filters learned in the first convolutional layer. The time axis is horizontal, and the frequency axis is vertical. Source: Recommending music on Spotify slides by Sander Dieleman
If we zoom in to take a look at the specific filters, we can pick up trends as noted by Sander:
Closeup of filters 14, 242, 250 and 253. Source: Recommending music on Spotify slides by Sander Dieleman
- "Note that the time axis is horizontal, the frequency axis is vertical (Frequency increases from top to bottom). Negative values are red, positive values are blue, and white is zero".
- "Filter 14 seems to pick up vibrato singing. [Notice the recurring blue shades for column 14 across different frequencies]"
- "Filter 242 picks up some kind of ringing ambience. [Notice the blue stripe + red base]"
- "Filter 250 picks up vocal thirds, i.e., multiple singers singing the same thing, but the notes are a major third (4 semitones) apart. [Notice the neat recurring alternation between red and blue rows]"
- "Filter 253 picks up various types of bass drum sounds. [Notice that most of the music exists within a small range of frequencies at the top]".
These musical patterns act as a musical signature, allowing Spotify to mix and match songs of similar signatures to generate playlists that sound familiar but with a controlled degree of novelty for the user.
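To give a feel for how a filter "picks up" a pattern, here is a toy sketch of one filter sliding along the time axis of a tiny spectrogram, followed by pooling over time to get a single signature value. The spectrogram and the hand-written `onset_kernel` are assumptions for illustration. In a real CNN the filter weights are learned from data, and there are hundreds of filters rather than one.

```python
def convolve_over_time(spectrogram, kernel):
    """Slide a kernel along the time axis; return one activation per position.

    Both inputs are lists of rows (frequency bins) by columns (time frames).
    The kernel covers every frequency bin, so it only moves in time.
    """
    n_bins = len(spectrogram)
    n_frames = len(spectrogram[0])
    width = len(kernel[0])
    activations = []
    for start in range(n_frames - width + 1):
        total = 0.0
        for f in range(n_bins):
            for t in range(width):
                total += spectrogram[f][start + t] * kernel[f][t]
        activations.append(max(total, 0.0))  # ReLU: keep positive responses only
    return activations


# Toy spectrogram: row 0 is the lowest frequency bin (drawn at the top,
# matching the figures above); columns are time frames.
spectrogram = [
    [1, 0, 0, 1, 0, 0, 1, 0],  # regular low-frequency hits, like a bass drum
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0, 1],  # higher-frequency content between the hits
]

# Hand-written kernel: rewards a low-frequency hit followed by silence and
# penalizes high-frequency energy in the same frame.
onset_kernel = [
    [1.0, -0.5],
    [0.0, 0.0],
    [-1.0, 0.0],
]

activations = convolve_over_time(spectrogram, onset_kernel)
signature_value = max(activations)  # global max pooling over time
```

The filter fires at frames 0, 3, and 6, exactly where the low-frequency hits occur. Stacking the pooled values of many such filters gives a vector per track, which is one way to think about the "musical signature".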
The data scientists can always dial the novelty mix up or down based on user response to the recommendations. Such is the power of modern tools like CNN.
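One deliberately crude way to picture such a dial is below. The `novelty` parameter, the `build_playlist` helper, and the two-dimensional signatures are all hypothetical; I am not suggesting this is Spotify's method. Familiar slots go to the tracks closest to the user's signature, and the remaining slots go to the tracks furthest from it.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_playlist(user_signature, catalog, size=4, novelty=0.25):
    """Mix familiar and novel tracks.

    `novelty` is the fraction of slots given to the least similar tracks;
    the remaining slots go to the most similar ones.
    """
    ranked = sorted(
        catalog,
        key=lambda track: cosine(user_signature, catalog[track]),
        reverse=True,
    )
    n_novel = min(round(size * novelty), len(ranked))
    n_familiar = min(size - n_novel, len(ranked) - n_novel)
    familiar = ranked[:n_familiar]
    novel = ranked[len(ranked) - n_novel:] if n_novel else []
    return familiar + novel


# Hypothetical 2-dimensional signatures; real ones would have many dimensions.
user_signature = [1.0, 0.0]
catalog = {
    "t1": [0.9, 0.1],
    "t2": [0.8, 0.2],
    "t3": [0.7, 0.3],
    "t4": [0.5, 0.5],
    "t5": [0.2, 0.8],
    "t6": [0.0, 1.0],
}

safe = build_playlist(user_signature, catalog, novelty=0.0)
default = build_playlist(user_signature, catalog, novelty=0.25)
adventurous = build_playlist(user_signature, catalog, novelty=0.5)
```

Raising `novelty` from 0 to 0.5 swaps the third and fourth familiar tracks for the two least similar ones. In practice the value would be tuned against user response, such as skips and saves.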
What have we learned
RecSys are very interesting models to explore. Even a seemingly simple music playlist recommendation can involve a diverse array of models that bring together user interactions, content, external data, and domain-specific techniques such as the audio models in this Spotify case study.
In the next chapter, we will learn how to design a recommender.
Reflections
Take a look at your recommendations on Spotify (or Amazon, Netflix, YouTube, or any other personalized service you use).
- Are they relevant to you? What % of the recommendations are spot on? What % is terrible?
- How will you improve the recommendations?
- Will you put more weight on recent behaviors vs. historical?
- How will you introduce new products?
- What will you show new users?
- How will you design a recommender that keeps up with the latest trends?
Explore the rest of Modern Visual RecSys Series
- How does a Recommender Work? [Foundational][we are here]
- How to Design a Recommender? [Foundational]
- Intro to Visual RecSys [Core]
- Convolutional Neural Networks Recommender [Pro]
- COVID-19 Case Study with CNN [Pro]
- Building a Personalized Real-Time Fashion Collection Recommender [Pro]
- Temporal Modeling [Pro]
- The Future of Visual Recommender Systems: Four Practical State-Of-The-Art Techniques [Foundational]
Series labels:
- Foundational: general knowledge and theories, minimum coding experience needed.
- Core: more challenging materials with code.
- Pro: difficult materials and code, with production-grade tools.
Further Readings
- Recommending music on Spotify with deep learning
- How Does Spotify Know You So Well?
- Introducing Tech in Asia’s unique content recommender
- Machine Learning for Recommender systems by Pavel Kordík | Part2
- Introduction to RecSys by Kung-Hsiang, Huang (Steeve) | Part2
- Introduction to recommender systems by Baptiste Rocca
- Machine Learning — Recommender System by Jonathan Hui