Hands-On Example: Google Trends Visualization with Pytrends API – Keyword Volumes, Regional Insights, and Correlations
Björn Thomsen
Marketing Lead at meshcloud.io | Driving B2B Market Growth for Platform Engineering Company | Performance Marketing, Data Analytics, Marketing Strategy
When it comes to keyword research, spotting trends, identifying anomalies, or uncovering correlations between search terms, you don’t necessarily need a bloated SEO tool. While Google Trends is a useful platform, its interface is quite limited. So why not take matters into your own hands by pulling Google Trends data directly through an API and analyzing it in Python?
In this article, I’ll walk you through a practical project using Pytrends, a Python library that connects to Google Trends. We'll combine it with popular data science tools like Matplotlib and Seaborn to create engaging visualizations. For this project, we’ll focus on comparing keyword search volumes, analyzing regional search data, and uncovering correlations in search trends over time. I’ll also share some code snippets to help you build a simple analysis tool from scratch.
To make things even more interesting, I’ve included an example at the end of integrating the data with my favorite library, Pygwalker. If you’re unfamiliar, Pygwalker is a lightweight, Python-friendly alternative to tools like Power BI or Tableau. It allows you to create quick and effective visualizations, making exploratory data analysis not only efficient but also visually compelling.
Installation and Documentation
Let's assume you already know Google Trends (https://trends.google.com/trends/). To analyze Google Trends data effectively, we will use Pytrends. It allows users to connect to Google Trends, bypass rate limits with proxies, and retrieve various datasets such as interest over time, region-specific trends, related topics, and trending searches.
Advanced functionalities include historical hourly data, real-time trends, and keyword suggestions, all with customizable parameters like language, region, and timeframe: https://pypi.org/project/pytrends/
Our first action will be to install this library by using the command-line tool:
pip install pytrends
Initializing Libraries & Creating GUI
Typically, I don't pay much attention to user interfaces, but in this case, we'll create a straightforward GUI using Tkinter to allow users to input keywords and a timeframe. This approach simplifies interaction.
Using the Pytrends library, we'll connect to Google Trends and validate the user-provided data to ensure that neither the keywords nor the timeframe are left blank. If any invalid input is detected, the program will display an error message via a message box and terminate gracefully. Once validated, the keywords are cleaned and processed into a list for further use.
This setup not only makes the tool more user-friendly but also eliminates the need to hardcode parameters in the script. Of course, we’ll also use essential libraries like Pandas for processing the data and Matplotlib and Seaborn for visualization to round out the functionality.
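Before wiring it into the GUI, the keyword-cleaning logic can be tried out on its own. Here is a minimal, standalone sketch of that step (the helper name parse_keywords is my own, purely illustrative; the full script below does the same thing inline):

```python
def parse_keywords(raw: str) -> list[str]:
    """Split a comma-separated input string into a cleaned keyword list.

    Raises ValueError when the input is empty or contains only
    commas/whitespace, mirroring the validation in the GUI script.
    """
    if not raw:
        raise ValueError("No keywords entered.")
    keywords = [kw.strip() for kw in raw.split(",") if kw.strip()]
    if not keywords:
        raise ValueError("Keywords cannot be empty or just commas.")
    return keywords

print(parse_keywords("python, rust , go"))  # ['python', 'rust', 'go']
```

The same function can then be reused whether the input comes from Tkinter, a command-line argument, or a config file.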
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from pytrends.request import TrendReq
import time
import random
import tkinter as tk
from tkinter import simpledialog, messagebox
# Initialize pytrends with custom user agent
pytrends = TrendReq(hl='en-US', tz=360)
# Create a GUI to get keywords and timeframe
root = tk.Tk()
root.withdraw() # Hide the root window
try:
    keywords = simpledialog.askstring("Input", "Enter keywords separated by commas:", parent=root)
    if not keywords:
        raise ValueError("No keywords entered.")
    keywords = [keyword.strip() for keyword in keywords.split(',') if keyword.strip()]
    if not keywords:
        raise ValueError("Keywords cannot be empty or just commas.")
    timeframe = simpledialog.askstring("Input", "Enter the time frame (e.g., 2023-11-01 2023-12-31):", parent=root)
    if not timeframe:
        raise ValueError("No timeframe entered.")
except ValueError as e:
    messagebox.showerror("Input Error", str(e))
    exit()
Fetching Data & API Retry Logic
Before analyzing the data, it is crucial to structure and prepare it properly, as our goal is to compare keyword volumes, including regional differences, and identify correlations. In doing so, we need to be cautious about the frequency of API requests to avoid hitting rate limits imposed by Google Trends:
# Top 10 largest economies (country names as returned by Google Trends)
largest_economies = ["United States", "China", "Japan", "Germany", "India", "United Kingdom", "France", "Italy", "Brazil", "Canada"]
# Retry logic for API requests
def fetch_data_with_retry(pytrends_method, retries=5, delay=5, *args, **kwargs):
    for attempt in range(retries):
        try:
            return pytrends_method(*args, **kwargs)
        except Exception as e:
            print(f"Attempt {attempt + 1} failed: {e}")
            if attempt < retries - 1:
                sleep_time = delay + random.uniform(0, 2)
                print(f"Retrying after {sleep_time:.2f} seconds...")
                time.sleep(sleep_time)
            else:
                print("Max retries reached. Skipping this request.")
                return None

# Build payload and fetch interest over time data
pytrends.build_payload(keywords, timeframe=timeframe, geo="")
interest_over_time = fetch_data_with_retry(pytrends.interest_over_time)
if interest_over_time is None:
    print("Failed to fetch interest over time data.")
    interest_over_time = pd.DataFrame()

# Fetch regional interest for the top 10 largest economies
top_regions = fetch_data_with_retry(pytrends.interest_by_region, resolution='COUNTRY', inc_low_vol=True, inc_geo_code=False)
if top_regions is not None:
    top_regions = top_regions[top_regions.index.isin(largest_economies)]
else:
    print("Failed to fetch regional interest data.")
    top_regions = pd.DataFrame()
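One nice property of the retry wrapper is that it accepts any callable, so you can verify its behavior without touching the live API. Below is a quick sketch using a deliberately flaky function (the shortened delay and the names fetch_with_retry, flaky are mine, for demonstration only):

```python
import random
import time

# Compact copy of the article's retry helper, with a near-zero delay
# so the demo runs instantly instead of sleeping several seconds.
def fetch_with_retry(method, retries=5, delay=0.01, *args, **kwargs):
    for attempt in range(retries):
        try:
            return method(*args, **kwargs)
        except Exception as e:
            print(f"Attempt {attempt + 1} failed: {e}")
            if attempt < retries - 1:
                time.sleep(delay + random.uniform(0, 0.01))
            else:
                return None

# A deliberately flaky callable: fails twice, then succeeds,
# simulating the intermittent 429 responses Google Trends can return.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated 429 rate limit")
    return "ok"

print(fetch_with_retry(flaky))  # prints "ok" after two simulated failures
```

If the callable keeps failing, the wrapper gives up after the configured number of retries and returns None, which is exactly why the main script checks each result before plotting.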
# Data Visualization
plt.figure(figsize=(10, 10))
sns.set_palette("coolwarm")
Generating & Displaying Search Volume Plots
# Interest Over Time: Line Plot
plt.subplot(2, 2, 1)
if not interest_over_time.empty:
    interest_over_time[keywords].plot(ax=plt.gca(), colormap="cool")
    plt.title("Interest Over Time", fontsize=12, fontweight='bold')
    plt.xlabel("Date", fontsize=12)
    plt.ylabel("Search Interest", fontsize=12)
    plt.legend(keywords)
    plt.grid(True, linestyle="--", alpha=0.7)
else:
    plt.text(0.5, 0.5, "No data available", horizontalalignment='center', verticalalignment='center', transform=plt.gca().transAxes, fontsize=12)
    plt.title("Interest Over Time", fontsize=12, fontweight='bold')

# Regional Interest: Bar Plot
plt.subplot(2, 2, 2)
if not top_regions.empty:
    top_regions_sorted = top_regions.sort_values(by=keywords, ascending=False).head(10)
    top_regions_sorted.plot(kind="bar", ax=plt.gca(), colormap="cool")
    plt.title("Regional Interest Top 10 Economies", fontsize=12, fontweight='bold')
    plt.xlabel("Region", fontsize=12)
    plt.ylabel("Search Interest", fontsize=12)
    plt.xticks(rotation=45)
else:
    plt.text(0.5, 0.5, "No data available", horizontalalignment='center', verticalalignment='center', transform=plt.gca().transAxes, fontsize=12)
    plt.title("Regional Interest Top 10 Economies", fontsize=12, fontweight='bold')

# Seasonal Trends: Combined Line Plot
plt.subplot(2, 2, 3)
if not interest_over_time.empty:
    interest_over_time[keywords].rolling(12).mean().plot(ax=plt.gca(), colormap="cool")
    plt.title("Seasonal Trends 12 Month Avg.", fontsize=12, fontweight='bold')
    plt.xlabel("Date", fontsize=12)
    plt.ylabel("Search Interest", fontsize=12)
    plt.legend(keywords)
    plt.grid(True, linestyle="--", alpha=0.7)
else:
    plt.text(0.5, 0.5, "No data available", horizontalalignment='center', verticalalignment='center', transform=plt.gca().transAxes, fontsize=12)
    plt.title("Seasonal Trends 12 Month Avg.", fontsize=12, fontweight='bold')

# Correlation Analysis: Heatmap
plt.subplot(2, 2, 4)
if not interest_over_time.empty:
    correlation = interest_over_time[keywords].corr()
    sns.heatmap(correlation, annot=True, cmap="cool", ax=plt.gca(), cbar_kws={'shrink': 0.8})
    plt.title("Correlation", fontsize=12, fontweight='bold')
else:
    plt.text(0.5, 0.5, "No data available", horizontalalignment='center', verticalalignment='center', transform=plt.gca().transAxes, fontsize=12)
    plt.title("Correlation", fontsize=12, fontweight='bold')

# Adjust layout and display
plt.tight_layout()
plt.show()
The plt.tight_layout() call at the end ensures that all plots fit neatly within the figure without overlapping labels or titles, and plt.show() displays the completed visualizations. Together, these plots provide a simple analysis of search trends for marketing, SEO, or general research.
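Under the hood, the correlation heatmap is nothing more than pandas' DataFrame.corr(), which computes pairwise Pearson correlations between the keyword columns. Here is a self-contained sketch on synthetic data (the keyword names and numbers are invented for illustration):

```python
import numpy as np
import pandas as pd

# Synthetic weekly "search interest" for three made-up keywords.
rng = np.random.default_rng(42)
dates = pd.date_range("2023-11-01", periods=8, freq="W")
base = rng.integers(20, 80, size=8)
df = pd.DataFrame({
    "keyword_a": base,
    "keyword_b": base + rng.integers(-5, 5, size=8),  # tracks keyword_a closely
    "keyword_c": rng.integers(20, 80, size=8),        # independent series
}, index=dates)

# Pairwise Pearson correlation matrix, values in [-1, 1]; this is the
# exact DataFrame that sns.heatmap renders in the fourth subplot.
corr = df.corr()
print(corr.round(2))
```

Because keyword_b is just keyword_a plus small noise, its correlation with keyword_a comes out close to 1, while the independent keyword_c typically shows a much weaker relationship.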
Pygwalker: A Tableau-like Alternative
I’ve written extensively about Pygwalker, a free and lightweight alternative to tools like Power BI and Tableau. With Pygwalker, you can save yourself a lot of coding effort by handing over your data directly to an intuitive user interface for visualization and analysis. We could have easily used it with our dataset right from the start—or even passed the data to Google Looker Studio for similar results.
import pygwalker as pyg
import webbrowser
import os

# Render Pygwalker visualization and open it in the browser
if not interest_over_time.empty:
    # Create a PygWalker object
    pygwalker_obj = pyg.walk(interest_over_time)
    # Export the PygWalker object to an HTML string
    pyg_html = pygwalker_obj.to_html()
    # Save the HTML string to a file
    file_path = os.path.abspath("pygwalker_visualization.html")
    with open(file_path, "w", encoding="utf-8") as f:
        f.write(pyg_html)
    # Open the HTML file in the default web browser
    webbrowser.open(f"file://{file_path}")
    print(f"Pygwalker visualization opened in browser: {file_path}")
else:
    print("No data available for Pygwalker visualization.")
In previous articles, I’ve demonstrated how to use Pygwalker to analyze data, visualize geospatial information, and even save views for future reference. If you're interested in more details, I recommend checking the Kanaries website: https://kanaries.net/pygwalker
Conclusion
Admittedly, tools like Semrush, Ahrefs, Sistrix, Screaming Frog, Google Trends, and Google Search Console are essential in any marketing team's toolkit. However, Pytrends offers the flexibility to generate custom views and insights that are hard to achieve with those tools, all while pulling actual Google data directly. This makes it particularly useful for automating repetitive tasks through small, tailored programs.
How practical this approach is may be debatable, but the experiment itself felt insightful. That said, working with the Google API can be finicky—it’s crucial to incorporate delays and randomize timings to avoid hitting rate limits. This is why my fetching function ended up being a bit more elaborate, but it ensures a reliable workflow.