Using machine learning to predict new movie ratings

Kiran Brahma

Co-Founder/CEO Knighthood - Get the Right Staffing Solutions for Your Business | Entrepreneur, Mentor & Investor | ISB

Published: November 8, 2017

Before deciding whether a movie is worth watching, most of us look for a general opinion on it so that we are not disappointed in the end. Before release, most movies are reviewed by critics, which sets the tone for the initial rating. As more people see the movie over time, its rating drifts towards what will become its aggregate average. After a certain period the rating stabilizes, with only minor changes. The rating we observe at that point can be termed the actual rating of the movie. Imagine that, before a movie's release, we could come up with an estimate of that rating, so that appropriate plans could be developed to maximize the rating given the quality of the movie. It is important to note that a movie's initial rating has a major impact on its earnings, and since movies need to earn most of their money in a short period, the initial rating is a powerful influence.

Most of us head over to IMDb to check a movie's rating and to rate movies we have already watched. Simply put, IMDb depends on crowdsourcing to arrive at a movie's rating.

Crowdsourcing is a specific sourcing model in which individuals or organizations use contributions from Internet users to obtain needed services or ideas (1).

The IMDb rating does not affect a movie much in its opening weekend, when marketing has more influence, but once the initial euphoria fades, this single number surely plays a role in determining the movie's further collections. Many of us can name movies that went on to become blockbusters despite a low opening, thanks to high ratings from the general public. For example, Titanic, which has one of the highest cumulative gross earnings, did not do great in its opening weekend. It ranked around 336th for its opening-week collection but 2nd when overall earnings are considered (2).

In this entire equation, IMDb depends on a certain mass of users seeing a movie before any rating exists; they provide the initial set of observations, after which other people decide whether to see it. With the current advances in AI, can we provide a rating for a movie that is close to the rating people will give over time? Most of us are aware of recommendation systems, where you are recommended another product or service on the basis of what you currently consume. We can see such systems in full force on Amazon, Netflix and numerous other sites. However, such a system is unable to recommend new products, as it does not have sufficient data on them. Some may argue that we can treat a similar existing product as a stand-in for the new one to arrive at a good guess, but for songs or movies it is not as simple as it is for other products or services.

So how does one crowdsource opinion on a movie before anyone has viewed it? The solution that I predict should work well is borrowed from Spotify, which uses machine learning to recommend new songs to its subscribers on a weekly basis (3). One of the approaches it has adopted converts a song's raw audio file into data that is fed as training data to a deep learning algorithm. It then matches the result against the songs a listener prefers when recommending new songs. This allows Spotify to recommend newly released songs that have yet to attract a large enough audience for the usual recommendation algorithm to work. For movies, we can adopt a similar approach, wherein the new movie is broken down from its raw video files into data and compared with other movies. Since in our case we need to arrive at a final rating, simply put, we need to draw comparisons with the existing movies rated by a user and predict the rating that same user would give the new movie.
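To make the last step concrete, here is a minimal sketch of how a single user's rating for a new movie could be estimated from the movies that user has already rated. It assumes that each movie has already been reduced to a numeric feature vector (how that is done from raw video is the hard part and is not shown), and it uses cosine similarity with a similarity-weighted average, which is my own illustrative choice rather than what Spotify does. The function names and the sample data are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector carries no information
    return dot / (norm_a * norm_b)


def predict_rating(new_movie_features, rated_movies):
    """Estimate a user's rating for a new movie.

    rated_movies is a list of (feature_vector, rating) pairs for movies
    the user has already rated. Each past rating is weighted by how
    similar that movie is to the new one; negative similarities are
    ignored. Returns None when no rated movie is similar at all.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for features, rating in rated_movies:
        weight = max(cosine_similarity(new_movie_features, features), 0.0)
        weighted_sum += weight * rating
        total_weight += weight
    if total_weight == 0:
        return None
    return weighted_sum / total_weight


# Hypothetical two-dimensional features, e.g. (action score, romance score).
history = [([1.0, 0.0], 8.0), ([0.0, 1.0], 2.0)]
print(predict_rating([1.0, 0.0], history))  # identical to the first movie
print(predict_rating([1.0, 1.0], history))  # equally close to both movies
```

A new movie that looks exactly like one the user loved inherits that rating, while a movie halfway between two rated movies lands between their ratings.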

Implementing this idea can be computationally heavy, and I am not even sure we have the computational horsepower to run such an analysis for each movie at the user level to arrive at a final prediction of each user's rating. To reduce the computational load, we can adopt the following structure:

Club multiple users into segments, so that we focus on how a certain segment, rather than an individual user, will rate the movie. The user segments can be defined by us or arrived at through clustering, which is commonly used by marketing teams. We can compare the ratings users have given across various genres to arrive at segments that we feel are homogeneous in nature. The final number of segments needs to be cognizant of the current computational capability: if there are too many segments, we will need longer to arrive at our predictions, thereby failing to serve any purpose. We can even simplify this process by defining our own segments, as clustering can throw up a segment that is practically impossible to act on (imagine a segment whose users are spread across 100 different locations).
IBM Watson has already demonstrated how AI can be used to develop a movie trailer (4). Borrowing from this idea, we can train our AI system to evaluate scenes from different genres of movies to understand how the combination of all such scenes influences the final movie rating given by each segment defined above.
Now, when a new movie is released, we bucket it into a genre and then draw comparisons with our pre-determined data sets for that genre to predict how each segment will rate the movie. We could even remove the genre bucketing and do a general analysis. However, people are fans of specific genres, so it makes little sense to have a segment of comedy fans provide the rating for a horror movie, as the initial watchers will mostly be fans of the genre. For a more objective analysis, rather than guessing which is better, we can run both and see which gives the better prediction.
Finally, we take a weighted average of the ratings predicted for each segment to arrive at a final rating for the movie.
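The first and last steps above can be sketched in a few lines. The snippet below is only an illustration under my own assumptions: each user is described by their average rating per genre, segments come from a bare-bones k-means (seeded with the first k users so the result is repeatable), and the per-segment predicted ratings are simply made-up numbers standing in for the output of the scene-analysis model, which is not shown. All names and data are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two rating profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_segments(profiles, centroids):
    """Label each user with the index of the nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
        for p in profiles
    ]


def update_centroids(profiles, labels, centroids):
    """Move each centroid to the mean profile of its members."""
    updated = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(profiles, labels) if label == k]
        if not members:
            updated.append(old)  # keep an empty segment where it was
            continue
        updated.append([sum(col) / len(members) for col in zip(*members)])
    return updated


def kmeans(profiles, k, iterations=10):
    """Tiny k-means: returns (labels, centroids)."""
    centroids = [list(p) for p in profiles[:k]]
    labels = assign_segments(profiles, centroids)
    for _ in range(iterations):
        centroids = update_centroids(profiles, labels, centroids)
        new_labels = assign_segments(profiles, centroids)
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids


def aggregate_rating(segment_ratings, segment_weights):
    """Weighted average of per-segment predicted ratings."""
    total = sum(segment_weights.values())
    return sum(segment_ratings[s] * w for s, w in segment_weights.items()) / total


# Each row: a user's average rating for (comedy, horror) movies.
profiles = [[9, 2], [2, 9], [8, 3], [3, 8]]
labels, centroids = kmeans(profiles, k=2)

# Hypothetical model output: predicted rating per segment for a new movie,
# weighted by how many viewers we expect from each segment.
final = aggregate_rating({0: 8.0, 1: 4.0}, {0: 3, 1: 1})
print(labels, final)
```

Keeping the number of segments small is what keeps this cheap: the expensive prediction runs once per segment instead of once per user.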

We will obviously need to compare our predicted ratings with actual ratings so that we can develop a more robust learning system. We will need to compare the following metrics between our predicted values and the actuals:

The average rating given by each user segment.
The weightage of each segment in the final rating. The final aggregate rating depends on which segments actually decided to view and rate the movie, compared with what we had predicted early on.
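As a rough sketch of this comparison, the snippet below reports, for each segment, how far the predicted average rating was from the actual one and how far the predicted share of raters was from the actual share. The segment names, the numbers and the choice of a simple signed difference as the error measure are all assumptions made for illustration.

```python
def normalize(weights):
    """Convert raw viewer counts per segment into shares that sum to 1."""
    total = sum(weights.values())
    return {segment: w / total for segment, w in weights.items()}


def compare_to_actuals(pred_ratings, actual_ratings, pred_weights, actual_weights):
    """Per-segment error in average rating and in share of raters.

    Positive values mean the actual figure was higher than predicted.
    """
    pred_share = normalize(pred_weights)
    actual_share = normalize(actual_weights)
    report = {}
    for segment in pred_ratings:
        report[segment] = {
            "rating_error": actual_ratings[segment] - pred_ratings[segment],
            "share_error": actual_share[segment] - pred_share[segment],
        }
    return report


# Hypothetical predictions versus what was observed after release.
report = compare_to_actuals(
    pred_ratings={"comedy_fans": 7.5, "horror_fans": 5.0},
    actual_ratings={"comedy_fans": 7.0, "horror_fans": 6.0},
    pred_weights={"comedy_fans": 60, "horror_fans": 40},
    actual_weights={"comedy_fans": 50, "horror_fans": 50},
)
for segment, errors in report.items():
    print(segment, errors)
```

Tracking the two errors separately shows whether a bad final prediction came from misjudging how a segment would rate the movie or from misjudging who would turn up to rate it.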

The final predicted rating of a movie can have multiple real-world applications. Some of the ideas that I can think of are as follows:

Understand which user segments will give a higher rating, so that movie marketing efforts can be focused on those specific segments.
Movie makers can define their own segments on the basis of geographical location to predict how a movie will fare across different geographies. This can assist in planning where the movie needs to be screened and where to limit it. Most movies have a limited time to recoup their money, and if a movie is released in a location that does not favor it, the resulting poor rating can lead to a poor showing even in locations where the movie is expected to do well.
If we have a limited budget and can screen in only a limited number of theaters, we can optimize the locations where the movie is screened to get the best possible final rating.
Movie makers can understand what makes a movie work and what makes it fail on the basis of learnings from prior data.

I am sure that one can think of numerous other ideas on how to derive further value from this process. I won't be shocked if IMDb (an Amazon company) actually makes such a tool available to producers and generates a steady stream of revenue from it. In fact, it could even use a similar concept to understand its Prime Video users better and deliver better content, thereby locking more users into Prime and developing a more powerful grip on its consumers on its way to world domination.

Notes:

1: Crowdsourcing: https://en.wikipedia.org/wiki/Crowdsourcing

2: Titanic Box-Office Earnings: https://www.boxofficemojo.com/movies/?id=titanic.htm

3: Spotify Weekly New Song Recommendations: https://hackernoon.com/spotifys-discover-weekly-how-machine-learning-finds-your-new-music-19a41ab76efe

4: IBM Watson used in making Movie Trailer: https://www.ibm.com/blogs/think/2016/08/cognitive-movie-trailer/
