2 Undeniable Technical SEO Factors to Implement in Any SEO Campaign
Dr. Tuhin Banik
Founder of ThatWare, Forbes DGEMs 200 | TEDx & BrightonSEO Speaker | Pioneering Hyper-Intelligence & AI-Based SEO | International SEO Expert | 100 Influential Tech Leaders | Global Frontrunner in SEO | Ex-Forbes Council
URL Inspection API Automation Using Screaming Frog
The Google URL Inspection API allows users to request the data Search Console has about the indexed version of a URL, including index status, coverage, rich results, mobile usability and more. This means you’re able to check in bulk whether URLs are indexed on Google, and if there are warnings or issues.
The URL Inspection API has been integrated into the Screaming Frog SEO Spider, so users can pull in data for up to 2k URLs per property a day alongside all the usual crawl data.
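For context, the same data can also be requested directly from Google’s URL Inspection API outside of the SEO Spider. The snippet below is a minimal Python sketch, assuming you already have an OAuth 2.0 access token with Search Console access for the property; the token variable and example URLs are placeholders.
import requests
# Minimal sketch: inspect a single URL via the URL Inspection API.
# ACCESS_TOKEN is a placeholder for an OAuth 2.0 token with Search Console scope.
ACCESS_TOKEN = "YOUR_OAUTH_ACCESS_TOKEN"
endpoint = "https://searchconsole.googleapis.com/v1/urlInspection/index:inspect"
payload = {
    "inspectionUrl": "https://example.com/some-page/",
    "siteUrl": "https://example.com/"  # must match a verified Search Console property
}
response = requests.post(
    endpoint,
    json=payload,
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"}
)
index_status = response.json().get("inspectionResult", {}).get("indexStatusResult", {})
print(index_status.get("verdict"), index_status.get("coverageState"))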
How to Connect to the URL Inspection API
Click ‘Config > API Access > Google Search Console’, connect to a Search Console account and choose the property.
Then under the ‘URL Inspection’ tab, select ‘Enable URL Inspection’.
Then start the website crawl. URL Inspection API data will be populated in the ‘Search Console’ tab, alongside the usual Search Analytics data (impressions, clicks, etc.).
Start Crawl Analysis
Select the ‘Search Console’ tab, which includes filters for URL Inspection API related issues, such as URLs that are not indexed on Google.
Export Google Rich Result types, errors and warnings, details on referring pages and Sitemaps via the ‘Bulk Export > URL Inspection’ menu.
If you have hit the 2k URLs per day per property limit for the URL Inspection API, you will receive a warning message in the SEO Spider.
If your website has more than 2k URLs, how do you choose which pages to send to the URL Inspection API?
Use the SEO Spider configuration to focus the crawl on key sections, pages or particular template types.
Some of the main options include:
Under ‘Config > API Access > Google Search Console’ and the ‘URL Inspection’ tab, you can ‘Ignore Non-Indexable URLs for URL Inspection’, if you’re only interested in data for URLs that are Indexable in a crawl.
Wait 24 hours, re-open the crawl, connect to the API again and then bulk highlight and ‘re-spider’ the next 2k URLs to get URL Inspection API data.
Alternatively, export the previous crawl, copy the URLs you want URL Inspection data for, and upload them in list mode, before exporting and combining with the previous day’s crawl data (a small Python sketch for preparing daily batches follows below).
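If you want to prepare those daily batches up front, here is a minimal Python sketch that splits an exported URL list into .txt files of 2,000 URLs each; the input and output file names are purely illustrative.
# Minimal sketch: split an exported URL list into daily batches of 2,000.
# "all_urls.txt" and the output file names are illustrative placeholders.
BATCH_SIZE = 2000
with open("all_urls.txt") as f:
    urls = [line.strip() for line in f if line.strip()]
for day, start in enumerate(range(0, len(urls), BATCH_SIZE), start=1):
    batch = urls[start:start + BATCH_SIZE]
    with open(f"url_inspection_day_{day}.txt", "w") as out:
        out.write("\n".join(batch))
    print(f"Day {day}: {len(batch)} URLs")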
How to Automate URL Inspection Data & Index Monitoring
Go to ‘File > Scheduling’ and, under ‘General’, choose a task and project name and a daily interval.
Next, click ‘Start Options’ and switch ‘Crawler Mode’ to ‘List’.
For ‘File path’, click ‘browse’ and select a .txt file with the URLs you want to check every day for URL Inspection data.
For ‘Crawl Config’ in the scheduling ‘Start Options’, save your current configuration via ‘File > Config > Save As’, then select the saved config file.
Enable the ‘Google Search Console’ API, click ‘Configure’ and select the account and property.
On the ‘Export’ tab, enable ‘Headless’ and choose the ‘Google Drive Account’ to export the URL Inspection API data to a Google Sheet.
Next, click ‘Export For Data Studio’ and then the ‘Configure’ button next to it.
Select the ‘Site Crawled’, ‘Date’ and ‘Time’ metrics, then search for ‘Search Console’ to see the list of metrics available for this tab. Select the bottom 7 metrics, which are related to URL Inspection, and click the right arrow.
When the scheduled crawl has run, the ‘Export for Data Studio’ Google Sheet will be exported into your chosen Google Drive account.
By default the ‘Export for Data Studio’ location is ‘My Drive > Screaming Frog SEO Spider > Project Name > [task_name]_crawl_summary_report’.
Now make a copy of our URL Inspection Monitoring Data Studio template and connect to your own Google Sheet with data from the ‘Export for Data Studio’ crawl summary report.
You now have a daily index monitoring system for the most important URLs on the website, which will alert you to any URLs that are not indexed, or have issues.
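If you prefer a script-based check on top of the Data Studio report, the sketch below reads a downloaded copy of the Search Console export as a CSV and lists URLs that are not reported as indexed. The file name and the ‘Address’ and ‘Summary’ column names are assumptions; adjust them to whatever your export actually contains.
import csv
# Minimal sketch, assuming a CSV export with "Address" and "Summary" columns
# (both column names are assumptions; rename them to match your actual export).
not_indexed = []
with open("search_console_export.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        summary = (row.get("Summary") or "").strip()
        if summary and summary != "URL is on Google":
            not_indexed.append((row.get("Address"), summary))
for address, summary in not_indexed:
    print(f"Needs attention: {address} ({summary})")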
Update Website Cache Using Python and the Google PageSpeed Insights API
First, you will need to create a Google API key to access the PageSpeed Insights API. To do this, go to the Google Cloud Console (https://console.cloud.google.com/) and sign in with your Google account.
In your Python file, import the necessary libraries:
import requests
import json
Define the API endpoint and parameters for the PageSpeed Insights API. Replace “YOUR_API_KEY” with your actual API key, and replace “https://example.com” with the URL of the website you want to analyze:
url = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"
params = {
    "url": "https://example.com",
    "strategy": "desktop",
    "fields": "lighthouseResult/cachedPages/cachedHtml",
    "key": "YOUR_API_KEY"
}
Make an HTTP GET request to the API endpoint using the requests library.
# Make an HTTP GET request to the API endpoint
response = requests.get(url, params=params)
Parse the JSON response using the json library.
# Parse the JSON response
data = json.loads(response.content)
Extract the cached HTML data from the response.
# Extract the relevant data from the response
cached_html = data["lighthouseResult"]["cachedPages"]["cachedHtml"]
Update the cache with the new data by creating a dictionary with the key “website_data” and the value of the cached HTML data. You can use any caching mechanism you like, such as writing the data to disk or storing it in a database.
# Update the cache with the new data
cache = {"website_data": cached_html}
# code to write the updated cache back to disk or a database
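For example, one simple option is to persist that dictionary as a JSON file on disk; the file name below is purely illustrative.
# Sketch: write the cache dictionary to disk as JSON ("website_cache.json" is an illustrative name)
with open("website_cache.json", "w", encoding="utf-8") as f:
    json.dump(cache, f)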
Save your Python file and run it using the command line or your IDE’s run command. If everything is set up correctly, your Python code should make a request to the PageSpeed Insights API, retrieve the cached HTML data, and update your website cache accordingly.
That’s it! By following these steps, you should now be able to update website cache using Python and the Google PageSpeed Insights API.
Full Python Code:
import requests
import json
# Define the API endpoint and parameters
url = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"
params = {
    "url": "https://example.com",
    "strategy": "desktop",
    "fields": "lighthouseResult/cachedPages/cachedHtml",
    "key": "YOUR_API_KEY"
}
# Make an HTTP GET request to the API endpoint
response = requests.get(url, params=params)
# Parse the JSON response
data = json.loads(response.content)
# Extract the relevant data from the response
cached_html = data["lighthouseResult"]["cachedPages"]["cachedHtml"]
# Update the cache with the new data
cache = {"website_data": cached_html}
# code to write the updated cache back to disk or a database
These undeniable technical SEO factors are certain to improve your SEO campaign’s performance, so don’t hesitate to implement them right away!