Finding Similar Movies Based on Ranking Using Singular Value Decomposition (SVD)
Introduction
Movie recommendation systems power platforms like Netflix, Amazon Prime, and Disney+, helping users discover films based on preferences. One powerful technique for finding similar movies is Singular Value Decomposition (SVD), which is widely used in collaborative filtering.
Goal: Use SVD to analyze movie rankings and recommend similar movies based on user ratings.
1. What is Singular Value Decomposition (SVD)?
SVD Overview
SVD is a matrix factorization technique that factors a large matrix, here the user-movie ratings matrix, into three smaller matrices. Keeping only the strongest components makes patterns in the ratings easier to find.
- Extracts hidden relationships between users and movies
- Reduces dimensionality, improving efficiency
- Helps predict missing ratings in a user-movie matrix
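To make the factorization concrete, here is a minimal sketch using NumPy's dense SVD. The 3x3 matrix `R` and the helper name `rank_k_approximation` are made up for illustration. The sketch shows that multiplying the three factors back together recovers the original matrix, and that keeping fewer singular values gives a lower-rank approximation.

```python
import numpy as np

# Hypothetical toy ratings: rows are users, columns are movies.
R = np.array([
    [5.0, 4.0, 1.0],
    [4.0, 5.0, 1.0],
    [1.0, 2.0, 5.0],
])

# Dense SVD: R = U @ diag(s) @ Vt, singular values in descending order.
U, s, Vt = np.linalg.svd(R, full_matrices=False)

def rank_k_approximation(U, s, Vt, k):
    """Rebuild the matrix from only the k largest singular values."""
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

R_full = rank_k_approximation(U, s, Vt, k=len(s))   # exact reconstruction
R_rank2 = rank_k_approximation(U, s, Vt, k=2)       # reduced to 2 dimensions

print("Singular values:", np.round(s, 2))
print("Rank-2 approximation:\n", np.round(R_rank2, 2))
```

The size of the singular values indicates how much each latent dimension contributes, which is why dropping the smallest ones loses the least information.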
2. Dataset Overview: Movie Ratings Matrix
We use MovieLens-style data, where users rate movies on a scale of 1-5. The ratings are arranged as a matrix with one row per user and one column per movie.
Goal: Given a movie, recommend other movies based on similarity in user ratings.
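Rating files are often stored in long format, with one row per user-movie pair, so they need reshaping before SVD can be applied. The sketch below assumes hypothetical column names (`userId`, `title`, `rating`) and made-up values. It shows one way to pivot such a table into a user-by-movie matrix with pandas.

```python
import pandas as pd

# Hypothetical long-format ratings (one row per user/movie pair).
# Column names and values are assumptions made for illustration.
long_ratings = pd.DataFrame({
    "userId": [1, 1, 2, 2, 3],
    "title": ["Inception", "Avatar", "Inception", "Titanic", "Avatar"],
    "rating": [5.0, 4.0, 4.0, 3.0, 5.0],
})

# Rows become users, columns become movies; unrated pairs show up as NaN.
user_movie = long_ratings.pivot_table(
    index="userId", columns="title", values="rating"
)

print(user_movie)
```

The NaN cells are the missing ratings. They have to be filled in some way before a plain SVD can run on the matrix.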
3. Implementing SVD for Movie Recommendations
Step 1: Import Required Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.sparse.linalg import svds
Step 2: Load & Prepare Data
# Sample Movie Rating Matrix
ratings = np.array([
[5, 4, 3, 5, 4],
[4, 5, 4, 3, 5],
[3, 4, 5, 2, 4],
[5, 3, 2, 5, 4]
])
movies = ["Inception", "Avatar", "Titanic", "The Dark Knight", "Interstellar"]
# Convert to DataFrame
ratings_df = pd.DataFrame(ratings, columns=movies)
print(ratings_df)
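Step 3: Apply Singular Value Decomposition (SVD)
# Perform truncated SVD; svds needs a floating-point array, so convert the DataFrame first
U, sigma, Vt = svds(ratings_df.to_numpy(dtype=float), k=2)  # k=2 latent dimensions
# Convert the singular values into a diagonal matrix
sigma = np.diag(sigma)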
print("User Feature Matrix (U):\n", U)
print("\nSingular Values (Sigma):\n", sigma)
print("\nMovie Feature Matrix (Vt):\n", Vt)
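SVD decomposes the matrix into three parts: U (one row of latent features per user), Sigma (the singular values, which weight each latent dimension), and Vt (one column of latent features per movie).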
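One way to sanity-check the three factors is to multiply them back together and compare the result with the original ratings. The sketch below repeats the sample matrix so it runs on its own. It also sorts the singular values explicitly, since `svds` does not promise a particular order and in practice often returns the smallest first.

```python
import numpy as np
from scipy.sparse.linalg import svds

ratings = np.array([
    [5, 4, 3, 5, 4],
    [4, 5, 4, 3, 5],
    [3, 4, 5, 2, 4],
    [5, 3, 2, 5, 4],
], dtype=float)  # svds expects a floating-point matrix

U, s, Vt = svds(ratings, k=2)

# Reorder so the largest singular value comes first.
order = np.argsort(s)[::-1]
U, s, Vt = U[:, order], s[order], Vt[order, :]

# Multiply the factors back together to get the rank-2 approximation.
approx = U @ np.diag(s) @ Vt

print("Shapes:", U.shape, s.shape, Vt.shape)
print("Rank-2 reconstruction:\n", np.round(approx, 2))
print("Largest absolute error:", np.round(np.abs(ratings - approx).max(), 2))
```

With k=2 the reconstruction is only approximate. The gap between it and the original ratings is the information carried by the discarded dimensions.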
Step 4: Find Similar Movies Based on Ranking
# Find movies similar to a given movie (e.g., "Inception")
movie_idx = movies.index("Inception")
# Each column of Vt holds one movie's latent features; scale every column to unit length
movie_features = Vt / np.linalg.norm(Vt, axis=0, keepdims=True)
movie_vector = movie_features[:, movie_idx]
# Cosine similarity is the dot product of unit-length vectors
similarities = movie_features.T @ movie_vector
# Sort movies by similarity score, highest first
similar_movies = sorted(zip(movies, similarities), key=lambda x: x[1], reverse=True)
# Display similar movies, skipping the query movie itself
print("\nMovies similar to 'Inception':")
for movie, score in similar_movies:
    if movie != "Inception":
        print(f"{movie}: {score:.2f}")
Example output (illustrative only; actual scores depend on the rating matrix and on k):
Movies similar to 'Inception':
Interstellar: 0.98
The Dark Knight: 0.85
Avatar: 0.67
Titanic: 0.52
Interpretation:
- In this example, Interstellar scores closest to Inception, meaning users tend to rate the two films alike
- Titanic scores lowest, so its rating pattern differs most from Inception's
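The same lookup can be generalized to any title by computing all movie-to-movie similarities at once. The sketch below uses hypothetical helper names (`movie_similarity_matrix`, `top_similar`) and makes one design choice that differs from the walkthrough: it weights each latent dimension by its singular value, so the scores equal the cosine similarity between columns of the rank-k reconstruction. It also assumes no movie ends up with an all-zero feature vector.

```python
import numpy as np

def movie_similarity_matrix(ratings, k):
    """Cosine similarity between movies in a k-dimensional latent space."""
    _, s, Vt = np.linalg.svd(np.asarray(ratings, dtype=float), full_matrices=False)
    # One row per movie: latent coordinates weighted by the singular values.
    features = (np.diag(s[:k]) @ Vt[:k, :]).T
    # Assumes every movie has a non-zero feature vector.
    unit = features / np.linalg.norm(features, axis=1, keepdims=True)
    return unit @ unit.T

def top_similar(title, movies, sim, n=3):
    """Return the n titles most similar to `title`, best first."""
    idx = movies.index(title)
    ranked = np.argsort(sim[idx])[::-1]
    return [(movies[j], float(sim[idx, j])) for j in ranked if j != idx][:n]

movies = ["Inception", "Avatar", "Titanic", "The Dark Knight", "Interstellar"]
ratings = [
    [5, 4, 3, 5, 4],
    [4, 5, 4, 3, 5],
    [3, 4, 5, 2, 4],
    [5, 3, 2, 5, 4],
]

sim = movie_similarity_matrix(ratings, k=2)
for title, score in top_similar("Inception", movies, sim):
    print(f"{title}: {score:.2f}")
```

Precomputing the full matrix is convenient for a handful of movies. For a large catalogue it grows quadratically, so computing similarities per query may be the better fit.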
4. Analyzing the Results
Why SVD Works for Movie Recommendations:
- Identifies hidden (latent) patterns in user preferences
- Groups movies by how users actually rate them rather than by genre labels
- Can estimate missing ratings from the low-rank reconstruction, once the gaps have been filled with an initial guess (plain SVD needs a complete matrix)
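The point about missing ratings can be sketched as follows. Plain SVD cannot run on a matrix with gaps, so this example fills each gap with the movie's mean observed rating first, which is just one simple imputation choice among many. The matrix values are made up for illustration.

```python
import numpy as np

# Hypothetical ratings with gaps; np.nan marks a movie the user has not rated.
R = np.array([
    [5.0, 4.0, np.nan, 5.0],
    [4.0, np.nan, 4.0, 3.0],
    [np.nan, 4.0, 5.0, 2.0],
    [5.0, 3.0, 2.0, np.nan],
])

observed = ~np.isnan(R)

# Fill each gap with that movie's mean observed rating (column mean).
col_means = np.nanmean(R, axis=0)
filled = np.where(observed, R, col_means)

# Rank-k reconstruction of the filled matrix.
k = 2
U, s, Vt = np.linalg.svd(filled, full_matrices=False)
approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Keep known ratings unchanged; use the reconstruction only for the gaps,
# clipped to the 1-5 rating scale.
predicted = np.where(observed, R, np.clip(approx, 1.0, 5.0))

print(np.round(predicted, 2))
```

The predicted values depend heavily on how the gaps were filled. On sparse real data, factorization methods that fit only the observed ratings are a common alternative.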
Conclusion & Future Enhancements
- SVD finds similar movies from user ratings alone
- Dimensionality reduction represents each movie with only k latent features, which makes similarity comparisons cheaper
- Future enhancements: