Retrieving Real Estate Data from the Zillow API with Python
Photo by Tierra Mallorca on Unsplash

Have you ever wanted to build a real estate data analysis tool, but weren't sure where to start? One option is to use APIs to access real estate data from a third party. In this article, we'll explore how to use the Zillow API with Python to retrieve real estate data for analysis.

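First, let's import the necessary libraries and set up our API credentials. The request headers below go through RapidAPI, so you'll need a RapidAPI key in place of the placeholder: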

import asyncio
import pandas as pd
import requests
import time


# Set up Zillow API credentials
headers = {
    "X-RapidAPI-Key": "API_SECRET_KEY",  # replace with your own key
    "X-RapidAPI-Host": "zillow-com1.p.rapidapi.com"
}
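
Hard-coding the key in a script makes it easy to leak by accident. Here is a minimal sketch of one alternative, assuming the key is stored in an environment variable called RAPIDAPI_KEY. That name and the build_headers helper are made up for this example; nothing in the API requires them:

```python
import os


def build_headers(env=os.environ):
    """Build the RapidAPI headers, reading the key from the environment.

    RAPIDAPI_KEY is a hypothetical variable name chosen for this sketch.
    """
    api_key = env.get("RAPIDAPI_KEY")
    if not api_key:
        raise RuntimeError("Set the RAPIDAPI_KEY environment variable first")
    return {
        "X-RapidAPI-Key": api_key,
        "X-RapidAPI-Host": "zillow-com1.p.rapidapi.com",
    }
```

Passing the environment in as an argument keeps the function easy to try out with a plain dictionary, for example build_headers({"RAPIDAPI_KEY": "abc"}).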

        

Next, we'll define a function to make a request to the Zillow API to retrieve the details of a page of multi-family properties for sale in a given county. This function will take the county and page number as input parameters, and return the list of properties from the API response:


async def get_properties(county, page_number):
    # Set up request parameters
    url = "https://zillow-com1.p.rapidapi.com/propertyExtendedSearch"
    querystring = {
        "location": county,
        "status_type": "ForSale",
        "home_type": "Multi-family",
        "count": 10,  # number of properties requested per page
        "page": page_number,
    }

    # Send request to Zillow API (requests is blocking, so nothing
    # else runs on the event loop while this call is in progress)
    response = requests.get(url, headers=headers, params=querystring, timeout=30)

    # Convert the response to a dictionary
    response_dict = response.json()

    # Check if the response contains an error before reading "props"
    if "error" in response_dict:
        # Print the error details
        print(response_dict["error"])

        # Return an empty list of properties
        return []

    # Return the list of properties from the response
    return response_dict.get("props", [])
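
One way to make the error check easier to test without hitting the network is to pull it into a small function that works on the already-decoded JSON. This is only a sketch: extract_properties is a hypothetical helper, and the response shape (a "props" list on success, an "error" key on failure) is assumed from the code in this article rather than taken from the API documentation:

```python
def extract_properties(payload):
    """Return the list of properties from a decoded API response.

    Assumes the shape used in this article: a dict with a "props" list on
    success, or an "error" key on failure. Anything else yields [].
    """
    if not isinstance(payload, dict):
        return []
    if "error" in payload:
        print(f"API error: {payload['error']}")
        return []
    props = payload.get("props", [])
    return props if isinstance(props, list) else []


# Made-up payloads for illustration
ok = extract_properties({"props": [{"address": "1 Main St"}]})
failed = extract_properties({"error": "quota exceeded"})
print(len(ok), len(failed))
```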

Now that we can retrieve a single page of properties, let's define a function to retrieve all of the properties for a given county. This function will make multiple requests to the Zillow API using the get_properties function until all of the properties have been retrieved:


# Define a function to retrieve the details of all properties
# for the specified county
async def get_all_properties(county):
    # Set up an empty list to store property details
    properties = []
    # Set up a variable to track the current page number
    page_number = 1

    # Loop until all properties have been retrieved
    while True:
        # Make a request to the Zillow API to retrieve the
        # details of the current page of properties
        props = await get_properties(county, page_number)

        # Stop at the first empty page (this assumes the API returns
        # no properties once we go past the last page)
        if not props:
            break

        # Add the current page of properties to the list
        properties.extend(props)

        # Increment the page number
        page_number += 1

        # Pause between requests to reduce API overload
        await asyncio.sleep(0.3)

    # Return the list of property details
    return properties
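
The paging logic can be tried out without an API key by swapping the real request for a stand-in. The sketch below assumes paging ends at the first empty page and adds a max_pages cap as a safety net. Both fetch_all_pages and fake_fetch_page are hypothetical names used only for this illustration:

```python
import asyncio


async def fetch_all_pages(fetch_page, max_pages=50, delay=0.0):
    """Collect items page by page until an empty page or max_pages is reached.

    fetch_page is any coroutine function that takes a 1-based page number
    and returns a list. max_pages is a safety cap chosen for this sketch.
    """
    items = []
    for page_number in range(1, max_pages + 1):
        page = await fetch_page(page_number)
        if not page:
            break
        items.extend(page)
        # Pause between requests without blocking the event loop
        await asyncio.sleep(delay)
    return items


async def fake_fetch_page(page_number):
    """Stand-in for an API call: three pages of fake listings, then nothing."""
    data = {1: ["a", "b"], 2: ["c", "d"], 3: ["e"]}
    return data.get(page_number, [])


result = asyncio.run(fetch_all_pages(fake_fetch_page))
print(result)  # ['a', 'b', 'c', 'd', 'e']
```

The cap matters because an API that keeps returning data for out-of-range pages would otherwise keep the loop running forever.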
        

Now that we can retrieve the details of all properties for a given county, let's define a function to retrieve the details of all properties for multiple counties. This function will make requests to the Zillow API using the get_all_properties function for each county in a list:


# Define a function to retrieve the details of all properties
# for the specified counties
async def get_properties_for_counties(counties):
    # Set up a list to store the property details for each county
    property_lists = []

    # Loop through each county
    for county in counties:
        # Retrieve the details of all properties for the current county
        properties = await get_all_properties(county)
        # Add the list of properties for the current county to the list
        property_lists.append(properties)
        # Add a delay before making the next request to avoid overloading the API
        await asyncio.sleep(1.1)

    # Return the list of property lists
    return property_lists
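
Because requests is a blocking library, awaiting these functions one after another does not download the counties in parallel. If you want the lookups to overlap, one option is to run each blocking call in a worker thread. This is a minimal sketch, assuming Python 3.9 or later for asyncio.to_thread. Here slow_lookup is a hypothetical stand-in for a requests-based fetch, not part of the API:

```python
import asyncio
import time


def slow_lookup(county):
    """Hypothetical blocking call standing in for a requests-based fetch."""
    time.sleep(0.2)
    return [f"{county}: listing"]


async def lookup_many(counties):
    """Run the blocking lookups in worker threads and gather the results."""
    tasks = [asyncio.to_thread(slow_lookup, county) for county in counties]
    # gather returns results in the same order as the input list
    return await asyncio.gather(*tasks)


start = time.perf_counter()
results = asyncio.run(lookup_many(["union county, nj", "middlesex county, nj"]))
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")
```

Overlapping requests also raises the request rate, so any rate limit on your API plan still needs to be respected.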
        

Now that we have a way to retrieve the details of all properties for multiple counties, let's use this function to retrieve the details of all properties for sale in Union County, Middlesex County, and Somerset County in New Jersey:


# Set up a list of counties to search for properties
counties = ["union county, nj", "middlesex county, nj", "somerset county, nj"]


# Retrieve the details of all properties for the specified counties
property_lists = asyncio.run(get_properties_for_counties(counties))
        

We can then store the property details in a Pandas DataFrame and save the DataFrame to a CSV file:

# Build one pandas DataFrame per county, then combine them.
# DataFrame.append was removed in pandas 2.0, so use pd.concat instead.
frames = [pd.DataFrame(properties) for properties in property_lists]
output = pd.concat(frames, ignore_index=True)


# Resolve the final URL for the image of each property (following any
# redirects) and store it back in the output DataFrame. Skip this step
# if the response did not include an "image_url" column.
if "image_url" in output.columns:
    output["image_url"] = output["image_url"].apply(
        lambda x: requests.get(x).url if isinstance(x, str) else x
    )


# Save the output DataFrame to a CSV file
output.to_csv("properties.csv", index=False)
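
Once the per-county lists are combined, there is no record of which search produced each row. A small sketch of one way to keep that information, using made-up sample data in place of real API results (the addresses, prices, and column names are invented for illustration):

```python
import pandas as pd

# Made-up sample data standing in for API results
counties = ["union county, nj", "somerset county, nj"]
property_lists = [
    [{"address": "1 Main St", "price": 500000}],
    [
        {"address": "2 Oak Ave", "price": 650000},
        {"address": "3 Elm Rd", "price": 700000},
    ],
]

frames = []
for county, properties in zip(counties, property_lists):
    df = pd.DataFrame(properties)
    df["county"] = county  # record the search term that produced each row
    frames.append(df)

# Combine everything into one table with a fresh 0..n-1 index
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

With the county column in place, grouping by county (for example combined.groupby("county")["price"].mean()) becomes a one-liner.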
        

And that's it! With just a few lines of code, we were able to retrieve real estate data from the Zillow API and save it to a CSV file for further analysis. Whether you're a data scientist, real estate investor, or just curious about the housing market, the Zillow API is a great resource for accessing real estate data.
