API requests with Rate Limit in Python

This short article shows how to apply API rate limiting in Python using the ratelimit library, specifically its limits and sleep_and_retry decorators.

Most APIs employ rate limiting to ensure that the number of requests they receive per unit of time does not exceed their capacity. Even advanced APIs equipped with load balancing and efficient scaling configurations still employ rate limiting as an additional safeguard against being strained beyond capacity. It can also serve as an implicit barrier against denial-of-service (DoS) attacks.

Sending data to and retrieving data from APIs is common in Data Engineering, and one ought to be familiar with how APIs are constructed and how to work with them.

The question now is: how does one do rate limiting in Python? There are many ways to go about it, but the option we will explore is the limits and sleep_and_retry decorators from the ratelimit library, as mentioned earlier.

For this, I have written a class that shows how this can be done. See the code at the link below:

https://github.com/jp0793/public_code_shares/blob/main/APIRequestsWithRateLimit

Alternatively, the code is shown below, although LinkedIn's handling of code indentation is quite lacking.

import requests
from ratelimit import limits, sleep_and_retry

class RateLimitedAPIClient:
    def __init__(self, api_endpoint: str, headers: dict, rate_limit: int = 100, period: int = 1):
        """
        Initialize the RateLimitedAPIClient.
        
        :param api_endpoint: The API endpoint URL.
        :param headers: Dictionary containing the API request headers.
        :param rate_limit: The number of requests per period (default is 100 requests per second).
        :param period: The time period in seconds (default is 1 second).
        """
        self.api_endpoint = api_endpoint
        self.headers = headers
        self.rate_limit = rate_limit
        self.period = period

    def __str__(self) -> str:
        """
        Dunder method for an end user-friendly string of the class object description
        """
        return f"RateLimitedAPIClient: api_endpoint: {self.api_endpoint}, headers: {self.headers}, rate_limit: {self.rate_limit}, period: {self.period}"
    
    def __repr__(self) -> str:
        """
        Dunder method for a programmer-friendly description of the class object
        """
        class_name = type(self).__name__
        return f"{class_name}, api_endpoint: {self.api_endpoint}, headers: {self.headers}, rate_limit: {self.rate_limit}, period: {self.period}"
    

    @sleep_and_retry
    @limits(calls=100, period=1)  # Adjust rate limit here (hard-coded; the constructor's rate_limit and period are not used)
    def __patch_request(self, data):
        """
        Send a PATCH request to the API with rate limiting.
        
        :param data: The JSON data to be sent in the request body.
        :return: The response object from the API request.
        """
        response = requests.patch(self.api_endpoint, json=data, headers=self.headers)
        if response.status_code != 200:
            raise Exception(f"API request failed with status code {response.status_code}")
        return response

    def send_request(self, json_list) -> list:
        """
        Upload a list of JSON objects to the API. The method also returns a list of all failed messages together with the errors which caused them to fail.
        
        :param json_list: List of JSON objects (dictionaries) to be uploaded.
        :return: a list of dictionaries, each holding a failed message ("message") and its error ("error").
        """
        
        # Initiate list to store messages which failed to send
        failed_messages_list = []
        
        # Sequentially send requests with rate limiting
        for data in json_list:
            try:
                response = self.__patch_request(data)
                print(f"Success: {response.status_code}")
            except Exception as e:
                # Construct the dictionary in iteration
                failed_msg_dict = {"message": data, "error": str(e)}
                
                # Append "failed_messages" dictionary
                failed_messages_list.append(failed_msg_dict)
                
                print(f"The message failed to upload due to error: {e}")
        
        return failed_messages_list        

The class includes two Python dunder methods (__str__ and __repr__) which, though not essential to the core functionality, are good programming practice and are very helpful for debugging.

Beyond the dunder methods, there are two further methods. The first is __patch_request, which sends PATCH requests to the API. One can edit it to perform a different operation (e.g. a PUT request); for our purposes, we will use PATCH. The method uses the API endpoint and the headers to connect to the API in question. Of course, a PATCH request needs a payload: a Python dictionary (serialized to JSON) with the key fields and values formatted as the API requires. For example, see the image below:


An example of data to be sent in a PATCH request
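As a rough illustration of such a payload, here is a minimal sketch. The field names (`id`, `status`, `quantity`) and their values are hypothetical; the actual keys depend entirely on the documentation of the API in question.

```python
import json

# Hypothetical payload; the real field names come from the API documentation.
patch_data = {
    "id": 101,
    "status": "shipped",
    "quantity": 3,
}

# requests serializes a dictionary passed via `json=` into a JSON body like this one.
body = json.dumps(patch_data)
print(body)
```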

Using Python decorators, the behaviour of limits and sleep_and_retry is applied to the __patch_request method without editing the body of the latter. In other words, if the code reaches the rate limit (e.g. 100 messages before a second is up), the limits decorator raises an exception, which sleep_and_retry catches; the code then sleeps until the remainder of the period has passed and only then attempts to send more messages.
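To make that behaviour concrete, here is a minimal standard-library sketch of a fixed-window limiter that sleeps instead of failing. It is not the ratelimit library's implementation, only an approximation of the idea; the name `sleep_when_limited` and its `clock` and `sleep` parameters are hypothetical, added so the waiting can be observed without real delays.

```python
import time
from functools import wraps


def sleep_when_limited(calls, period, clock=time.monotonic, sleep=time.sleep):
    """Allow at most `calls` invocations per `period` seconds; sleep when the window is full."""
    def decorator(func):
        window_start = clock()
        used = 0

        @wraps(func)
        def wrapper(*args, **kwargs):
            nonlocal window_start, used
            now = clock()
            if now - window_start >= period:
                # The previous window has expired, so start a fresh one.
                window_start, used = now, 0
            if used >= calls:
                # Window is full: wait for the rest of the period, then reset.
                sleep(period - (now - window_start))
                window_start, used = clock(), 0
            used += 1
            return func(*args, **kwargs)

        return wrapper
    return decorator


# Demonstration with a fake clock so that nothing actually sleeps.
fake_now = [0.0]
naps = []


def fake_sleep(seconds):
    naps.append(seconds)
    fake_now[0] += seconds


@sleep_when_limited(calls=2, period=1, clock=lambda: fake_now[0], sleep=fake_sleep)
def send(message):
    return f"sent {message}"


results = [send(i) for i in range(3)]
print(results, naps)
```

The third call arrives while the window is still full, so the wrapper waits out the remaining second before sending. This sketch is not thread-safe and is meant only to show the control flow.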

NOTE: Without delving into too much theory about Python decorators, here is a summary for those who might not be familiar with them. Decorators are a powerful feature that lets you modify or enhance the behaviour of functions or methods without altering their code. A decorator can itself be implemented as a class, so when you decorate the methods of a class, another class can hold the decorator logic.

In essence, a decorator is a callable (like a function or class) that takes a function or method as an argument and returns a new function or method that usually extends or alters the original one.
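A minimal sketch of both forms, a function-based and a class-based decorator, may help. The names `log_calls`, `CountCalls`, `add` and `greet` are illustrative only and have nothing to do with the ratelimit library.

```python
from functools import wraps


def log_calls(func):
    """Function-based decorator: prints the function name before each call."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        print(f"calling {func.__name__}")
        return func(*args, **kwargs)
    return wrapper


class CountCalls:
    """Class-based decorator: the instance replaces the function and counts calls."""
    def __init__(self, func):
        self.func = func
        self.count = 0

    def __call__(self, *args, **kwargs):
        self.count += 1
        return self.func(*args, **kwargs)


@log_calls
def add(a, b):
    return a + b


@CountCalls
def greet(name):
    return f"Hello, {name}"


total = add(2, 3)
greet("Ada")
greet("Grace")
print(total, greet.count)
```

In both cases the original function body is untouched; the extra behaviour lives entirely in the wrapper. The simple class-based form shown here is intended for plain functions.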

Moving on, the headers are specific to the API, and one would need to read its documentation to ascertain the specifics. For our purposes, we will focus on an API which uses an API key for authentication, with headers that look as shown below:


Headers example
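A headers dictionary for an API-key-authenticated API might look roughly like the sketch below. The header name `x-api-key` and the environment variable `API_KEY` are assumptions for illustration; the exact names come from the API documentation.

```python
import os

# Read the key from the environment rather than hard-coding it.
# "API_KEY" is a hypothetical variable name.
api_key = os.environ.get("API_KEY", "")

# Hypothetical header layout; check the API documentation for the exact names.
headers = {
    "Content-Type": "application/json",
    "x-api-key": api_key,
}
```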

The API endpoint is self-explanatory and is also specific to each API. It will be available in the API documentation.

The __patch_request method returns a response from the API. The double leading underscore triggers Python's name mangling, so the method is internal to the class and is not meant to be called directly from outside it. Rather, it is used by the send_request method, which simply calls __patch_request with appropriate error handling in place. send_request also builds failed_messages_list, a list of dictionaries for all failed messages, with each dictionary (failed_msg_dict) in the format shown below:


Failed message dictionary format
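The format mirrors what send_request builds in the class above. A small sketch, with a hypothetical message and error, shows the shape of one entry:

```python
# Shape of one entry in failed_messages_list, mirroring send_request above.
data = {"id": 101, "status": "shipped"}  # hypothetical message
error = Exception("API request failed with status code 500")

failed_msg_dict = {"message": data, "error": str(error)}
failed_messages_list = [failed_msg_dict]

for failure in failed_messages_list:
    print(f"{failure['message']} -> {failure['error']}")
```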

This list will contain all messages which returned an error, together with the error message returned. It can be used for debugging or written to logs. Apart from using the __patch_request method to send requests to the API, the send_request method returns the failed_messages_list as discussed above.

Wrapping this up, the class can be used as shown in the code below:

1. Retrieve the API endpoint and the API key from a key vault or from environment variables. The way it is done below is for illustrative purposes. In practice, always use a key vault or environment variables.

2. Define your headers dictionary as per your particular API's documentation.

3. Get a list of dictionaries for the data you want to send to the API via PATCH, PUT, etc. (as per your requirement). This data could come from your dataframes and be converted into a list of dictionaries, or from whatever your use case is; as long as it is in the format of a list of dictionaries, it will be usable in this class. The send_request method loops over the items in the list and, in each iteration, sends a request to the API. For illustrative purposes, a list of dictionaries can look like the one below:


List of dictionaries example
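A small sketch of such a list follows. The records and field names are hypothetical, and the tuple-to-dictionary conversion is just one possible way of getting other data into the required shape.

```python
# Hypothetical records; each dictionary becomes the body of one request.
json_list = [
    {"id": 101, "status": "shipped"},
    {"id": 102, "status": "pending"},
    {"id": 103, "status": "cancelled"},
]

# Rows held as (id, status) tuples can be converted the same way.
rows = [(104, "shipped"), (105, "pending")]
json_list += [{"id": row_id, "status": status} for row_id, status in rows]
print(len(json_list))
```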

In the real world, of course, this list will likely be more complex and much larger (possibly thousands of items).

With the above in mind, this class can be used as shown below:

Use of class in main function
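The three steps can be sketched in a main function as follows. To keep the snippet runnable without network access, a hypothetical `StubClient` with the same constructor and `send_request` interface stands in for RateLimitedAPIClient; the environment variable names, the placeholder endpoint and the `x-api-key` header are likewise assumptions.

```python
import os


class StubClient:
    """Stand-in for RateLimitedAPIClient so this sketch runs without network access."""
    def __init__(self, api_endpoint, headers, rate_limit=100, period=1):
        self.api_endpoint = api_endpoint
        self.headers = headers

    def send_request(self, json_list):
        # Pretend that every message without an "id" key fails.
        return [
            {"message": item, "error": "missing id"}
            for item in json_list
            if "id" not in item
        ]


def main(client_class=StubClient):
    # Step 1: hypothetical environment variable names and placeholder endpoint.
    api_endpoint = os.environ.get("API_ENDPOINT", "https://example.com/api/items")
    # Step 2: headers as required by the (hypothetical) API.
    headers = {
        "Content-Type": "application/json",
        "x-api-key": os.environ.get("API_KEY", ""),
    }
    # Step 3: the data to send, as a list of dictionaries.
    json_list = [{"id": 101, "status": "shipped"}, {"status": "pending"}]

    client = client_class(api_endpoint, headers, rate_limit=100, period=1)
    failed = client.send_request(json_list)
    print(f"{len(failed)} of {len(json_list)} messages failed")
    return failed


if __name__ == "__main__":
    main()  # in real use: main(RateLimitedAPIClient)
```

Passing the class in as an argument is a design choice made only for this sketch; with the real class defined, one would construct RateLimitedAPIClient directly.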

NB: Lastly, there is a big issue we did not address here. This code ensures that the rate limit of the API will never be exceeded. However, in most applications, this kind of code will not get anywhere close to the actual rate limit. This depends on a number of factors, chiefly the loop: requests are sent one at a time, and each waits for the previous response before the next is sent. In the next article, we will talk about improving this code's performance drastically, while still ensuring the rate limit of the API is never exceeded, by using the ThreadPoolExecutor class to send requests in parallel.
