Spotify System Architecture!

Thanks to the original writer and article:

https://medium.com/geekculture/spotify-system-architecture-6bb418db6084

System Functional Requirements

  1. Download music (up to 10,000 songs on a maximum of 5 devices under the same account)
  2. Discover music (the system provides a tailored playlist generated by its recommendation algorithm)
  3. Spotify Connect (lets you play your music on different devices; for example, you can use your mobile phone as a remote control for music playing on your PC under the same account)
  4. Discover friend activity (see what the friends you follow are listening to in real time, and get friend recommendations in your search lists when connected to other social media apps such as Facebook and Instagram)
  5. Share music with your friends through other platforms (copy song links, embed codes, and Spotify URLs, and share them to your other social media apps)
  6. Provide lyrics (through a collaboration between Spotify and Genius, lyrics are available for a selection of songs)
  7. Create playlists (you can turn any of your playlists into a collaborative playlist, in which your friends can add, remove, and reorder songs)
  8. Private mode (your friends cannot see your listening activity)
  9. Support for all device types (mobile, tablet, PC, web player, TV, car audio, smartwatch, game console, smart display, etc.)
  10. Daily Mixes (playlists built from what you have listened to previously and the genres you like, that is, your taste in music)
  11. Spotify Radio
  12. Listening history (users can check what they have listened to)
  13. Search (users can search for the song they want to listen to)
  14. Shuffle play
  15. Podcasts and shows
  16. Sort and filter
  17. Follow artists
  18. Autoplay
  19. Concerts
  20. Crossfade tracks
  21. Artist Fundraising Pick
  22. Visualization with WinAmp
  23. Ad-free
  24. Unlimited skips

System Non-Functional Requirements

  1. Usability
  2. Reliability
  3. Good Performance
  4. Low latency

The Scale Estimation

  1. Active user base = 365 million
  2. Active premium user base = 158 million
  3. Low-quality audio = 3 MB per song
  4. High-quality audio = 10 MB per song
  5. Users can download up to 10,000 songs
  6. 30+ languages supported
  7. 60,000+ songs uploaded daily
  8. 8 million song creators yearly
  9. 8k active artists
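
The figures above can be turned into rough storage numbers. The following is only a back-of-the-envelope sketch: it uses decimal units (1 GB = 1000 MB) and assumes, purely for illustration, that every uploaded song is stored once in each quality. Replication and metadata are ignored, and the variable names are made up for this example.

```python
# Back-of-the-envelope storage estimates based on the scale figures in the list.
MB_PER_GB = 1000  # decimal units, for simplicity
GB_PER_TB = 1000

LOW_QUALITY_MB = 3        # per song
HIGH_QUALITY_MB = 10      # per song
MAX_DOWNLOADS = 10_000    # songs a user may download
DAILY_UPLOADS = 60_000    # new songs per day

# Space needed on a device that downloads the maximum number of songs.
device_low_gb = MAX_DOWNLOADS * LOW_QUALITY_MB / MB_PER_GB
device_high_gb = MAX_DOWNLOADS * HIGH_QUALITY_MB / MB_PER_GB

# Assumption: each upload is kept in both qualities (3 MB + 10 MB).
daily_ingest_gb = DAILY_UPLOADS * (LOW_QUALITY_MB + HIGH_QUALITY_MB) / MB_PER_GB
yearly_ingest_tb = daily_ingest_gb * 365 / GB_PER_TB

print(f"Full download, low quality:  {device_low_gb:.0f} GB")
print(f"Full download, high quality: {device_high_gb:.0f} GB")
print(f"New audio per day:           {daily_ingest_gb:.0f} GB")
print(f"New audio per year:          {yearly_ingest_tb:.1f} TB")
```

Under these assumptions a full high-quality download needs about 100 GB on the device, and the catalogue grows by a few hundred terabytes per year before replication.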

System Component Design

The high-level diagram

The assumptions we made :

  • The users are already logged in
  • The system can accommodate millions of users
  • The users play specific songs

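1. Load balancer

  • Manages and distributes the traffic coming into the servers.
  • When a user requests a specific song, the request first goes to the load balancer.
  • The technique used here is Round Robin load balancing, which hands requests to the servers in rotation so that no single server is overloaded and latency stays low. (Sending each request to the server with the least traffic is a different strategy, usually called least connections.)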
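
Round robin simply rotates through the servers, whereas picking the server with the least traffic is usually called least connections. A minimal sketch of both strategies is shown here; the class names and server names are hypothetical, and a real load balancer would also handle health checks and concurrency.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation, ignoring their current load."""

    def __init__(self, servers):
        self._rotation = cycle(servers)

    def pick(self):
        return next(self._rotation)


class LeastConnectionsBalancer:
    """Hands out the server that currently has the fewest active requests."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # On a tie, min() returns the first server in insertion order.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        self.active[server] -= 1


rr = RoundRobinBalancer(["app-1", "app-2", "app-3"])
rr_picks = [rr.pick() for _ in range(5)]

lc = LeastConnectionsBalancer(["app-1", "app-2"])
first = lc.acquire()    # both idle, so the first server is chosen
second = lc.acquire()   # app-1 is busy, so app-2 is chosen
lc.release(first)       # app-1 finishes its request
third = lc.acquire()    # app-1 is idle again
```

Round robin is the simpler of the two and works well when requests cost roughly the same; least connections adapts better when some song requests take much longer than others.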

2. Publish/Subscribe

  • A message queuing system that collects incoming requests and hands them off for processing.
  • RabbitMQ or Apache Kafka can be adopted.
  • When a user clicks play, the request is processed by the system and the data is sent to the artist server. The number of times the user has listened to the song is stored in the Artist’s-User DB. A cache could also be added to this flow.
  • The data warehouse is then updated with the new information.
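
The play flow described in this section can be sketched with a tiny in-process broker. This is an illustration only, not how RabbitMQ or Apache Kafka work internally: delivery here is synchronous and in memory, and the topic name, handlers, and the two stand-in stores are assumptions made for the example.

```python
from collections import Counter, defaultdict


class Broker:
    """A toy publish/subscribe broker: topics map to lists of handlers."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        # Every subscriber of the topic receives every event.
        for handler in self._subscribers[topic]:
            handler(event)


play_counts = Counter()  # stand-in for the Artist's-User DB
warehouse_log = []       # stand-in for the data warehouse


def count_play(event):
    play_counts[(event["user_id"], event["song_id"])] += 1


def archive_event(event):
    warehouse_log.append(event)


broker = Broker()
broker.subscribe("song.played", count_play)
broker.subscribe("song.played", archive_event)

# A user clicks play on the same song three times.
for _ in range(3):
    broker.publish("song.played", {"user_id": "u1", "song_id": "s42"})

print(play_counts[("u1", "s42")], len(warehouse_log))
```

The publisher never calls the play counter or the warehouse directly, so new subscribers (a cache updater, for instance) can be added without touching the code that handles the play request.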

3. App Server Producer/App Server Consumer

  • Producers are the client applications that request a song; the app server consumer reads and processes each request. Producers and consumers are fully decoupled and agnostic of each other, which allows high scalability.
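
The decoupling between producers and consumers can be illustrated with a shared queue from the Python standard library. This is a single-process sketch under simplifying assumptions: threads stand in for client applications and app servers, and the client names, song IDs, and the `STOP` sentinel are invented for the example.

```python
import queue
import threading

request_queue = queue.Queue()  # the only thing producers and consumers share
processed = []
STOP = object()                # sentinel that tells the consumer to exit


def producer(client_id, song_ids):
    """A client application: enqueues song requests and moves on."""
    for song_id in song_ids:
        request_queue.put((client_id, song_id))


def consumer():
    """An app server: reads requests from the queue and processes them."""
    while True:
        item = request_queue.get()
        if item is STOP:
            break
        client_id, song_id = item
        processed.append(f"{client_id} -> {song_id}")


worker = threading.Thread(target=consumer)
worker.start()

clients = [
    threading.Thread(target=producer, args=(f"client-{i}", ["s1", "s2"]))
    for i in range(3)
]
for client in clients:
    client.start()
for client in clients:
    client.join()

request_queue.put(STOP)  # all requests are queued; let the consumer finish
worker.join()

print(len(processed), "requests processed")
```

Neither side knows how many counterparts exist, so more consumers can be started to absorb load without changing the producers.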


4. Data Warehouse

  • Stores millions, or up to billions, of records (songs).
  • The Hadoop Distributed File System (HDFS) is used.

5. Payment Gateway

  • Checkout API: validates the payment details and initiates the payment workflow. A state machine is used.
  • Billing API: communicates with the payment providers to charge the user. This API is the bridge between the payment providers and the payment backend, that is, the second internal API involved in completing a payment.
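
The state machine mentioned for the Checkout API could look roughly like the following sketch. The states and allowed transitions are assumptions chosen for illustration; the source does not say which states the real payment workflow uses.

```python
from enum import Enum


class PaymentState(Enum):
    CREATED = "created"
    VALIDATED = "validated"   # Checkout API accepted the payment details
    CHARGING = "charging"     # Billing API is talking to the payment provider
    SUCCEEDED = "succeeded"
    FAILED = "failed"


# Hypothetical transition table: each state maps to the states it may move to.
TRANSITIONS = {
    PaymentState.CREATED: {PaymentState.VALIDATED, PaymentState.FAILED},
    PaymentState.VALIDATED: {PaymentState.CHARGING},
    PaymentState.CHARGING: {PaymentState.SUCCEEDED, PaymentState.FAILED},
    PaymentState.SUCCEEDED: set(),
    PaymentState.FAILED: set(),
}


class Payment:
    def __init__(self):
        self.state = PaymentState.CREATED
        self.history = [self.state]

    def advance(self, new_state):
        """Move to new_state, rejecting anything the table does not allow."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"illegal transition {self.state.name} -> {new_state.name}"
            )
        self.state = new_state
        self.history.append(new_state)


payment = Payment()
payment.advance(PaymentState.VALIDATED)
payment.advance(PaymentState.CHARGING)
payment.advance(PaymentState.SUCCEEDED)
print([state.name for state in payment.history])
```

Keeping the allowed transitions in one table makes it impossible, for example, to charge a payment twice or to revive a failed one by accident.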

The Recommendation Algorithm

  • Latent factor models (a machine learning technique)
  • A latent factor model explains observable variables (such as which songs a user plays) in terms of hidden, or latent, variables that are not observed directly.
  • It is usually more complicated to work with because the latent variables are not fixed and have to be inferred. This brings about 2 problems.
  • 1) an equation may not be derivable
  • 2) even if we can work the equation out, it is difficult to apply
  • Erik Bernhardsson spent 6 years developing the first music recommendation algorithm at Spotify
  • The links below share some insightful information about this model
  • The recommendation system detects similarity among elements (users, music, listening frequency) using implicit collaborative filtering based on latent factor models, so songs similar to what a user already plays are added to that user's playlists.
  • Erik Bernhardsson also built two open-source tools that are still used by Spotify: Luigi (a data pipeline engine) and Annoy (an approximate nearest-neighbor search library).
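
To make the idea of a latent factor model concrete, here is a toy sketch of implicit collaborative filtering by matrix factorization. It is not Spotify's actual algorithm: the play matrix is invented, play counts are reduced to a binary "played or not" preference, and the factors are fitted with plain stochastic gradient descent rather than the weighted methods used in production systems. All names and hyperparameters are assumptions.

```python
import random

random.seed(0)

# Rows are users, columns are songs, entries are play counts (made-up data).
plays = [
    [5, 3, 0, 0],
    [4, 0, 0, 1],
    [0, 0, 4, 5],
    [0, 1, 5, 4],
]
n_users, n_songs = len(plays), len(plays[0])

# Implicit feedback: 1.0 if the user ever played the song, else 0.0.
pref = [[1.0 if count > 0 else 0.0 for count in row] for row in plays]

K = 2        # number of latent factors
LR = 0.05    # learning rate
REG = 0.02   # L2 regularization strength

# Small random starting values for the user and song factor vectors.
U = [[random.uniform(-0.1, 0.1) for _ in range(K)] for _ in range(n_users)]
V = [[random.uniform(-0.1, 0.1) for _ in range(K)] for _ in range(n_songs)]


def predict(u, s):
    """Predicted preference: dot product of user and song factors."""
    return sum(a * b for a, b in zip(U[u], V[s]))


def loss():
    """Sum of squared errors over every user/song pair."""
    return sum(
        (pref[u][s] - predict(u, s)) ** 2
        for u in range(n_users)
        for s in range(n_songs)
    )


initial_loss = loss()
for _ in range(500):
    for u in range(n_users):
        for s in range(n_songs):
            err = pref[u][s] - predict(u, s)
            for k in range(K):
                u_k, v_k = U[u][k], V[s][k]
                U[u][k] += LR * (err * v_k - REG * u_k)
                V[s][k] += LR * (err * u_k - REG * v_k)
final_loss = loss()


def recommend(u):
    """Unplayed songs for user u, best predicted score first."""
    ranked = sorted(range(n_songs), key=lambda s: predict(u, s), reverse=True)
    return [s for s in ranked if plays[u][s] == 0]


print(f"loss: {initial_loss:.2f} -> {final_loss:.2f}")
print("recommendations for user 0:", recommend(0))
```

Each user and each song ends up as a short vector of latent factors, and songs whose vectors point in a similar direction to a user's vector are the ones recommended. Libraries such as Annoy are designed to make that nearest-vector lookup fast at catalogue scale.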
