Concurrency with Frameworks and Servers for Python

I would like to return to the topic of concurrent programming in Python today, but this time we're going to move a few levels higher. I'm talking about web technologies: we'll look at web frameworks and web servers, how different types of concurrent programming are implemented there, and also what ASGI and WSGI are, what their key differences are, and why we need them at all.

When we talk about web frameworks and web servers, concurrency starts to interest us in terms of how requests are handled. There are essentially only two models, synchronous and asynchronous, on which all interaction is built. They have fundamental differences that make them more or less suitable for certain tasks, and it is important to understand these differences in order to choose the right tool for different scenarios and application requirements. Let's start with frameworks.

Frameworks

Synchronous Web Frameworks

Synchronous web frameworks handle each request one at a time. When a request is received, it is processed completely before the framework can accept the next one from the server. In other words, each request is blocking: it occupies a worker's resources until it is fully handled.
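To make the blocking behavior concrete, here is a minimal sketch (assuming Flask is installed; the route and delay are purely illustrative). While the handler sleeps, the worker serving it cannot accept another request:

import time
from flask import Flask

app = Flask(__name__)

@app.route("/slow")
def slow():
    # Blocks this worker for the full second; other requests must wait
    # unless the server runs additional workers or threads.
    time.sleep(1)
    return "done"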

Key features:

  • Execution model: Sequential, blocking operations. Each request blocks the server from processing other requests until it completes.
  • Concurrency: Typically achieved through multithreading or multiprocessing, which can lead to higher memory usage and context switching overhead.
  • Complexity: Generally simpler to understand, implement and debug since code execution is sequential.
  • Resource Utilization: Higher resource consumption due to thread/process overhead and blocking behavior.
  • Use Cases: Best suited for applications with low to moderate concurrency needs, or for CPU-bound tasks where concurrency gains are limited.

Most popular choices:

  • Django: Full-featured, "batteries-included" framework that traditionally handles requests synchronously; suitable for large and complex applications.
  • Flask: Microframework, simple and flexible, great for small to medium-sized projects.
  • Pyramid: Scalable and flexible, suitable for both simple and complex applications.
  • Bottle: Lightweight, single-file framework, ideal for small applications and rapid prototyping.

Asynchronous Web Frameworks

Asynchronous web frameworks handle requests in a non-blocking manner. They use an event loop (usually initialized by the server) to manage independent operations, allowing multiple requests to be handled concurrently without waiting for one to complete before starting another.
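Here is a minimal sketch of the underlying idea using plain asyncio (no framework involved): two simulated requests overlap because each yields control to the event loop while it waits.

import asyncio

async def handle_request(name: str) -> None:
    print(f"{name}: started")
    await asyncio.sleep(1)  # yields control to the event loop while waiting
    print(f"{name}: finished")

async def main() -> None:
    # Both finish in about 1 second total, not 2, because the waits overlap.
    await asyncio.gather(handle_request("req-1"), handle_request("req-2"))

asyncio.run(main())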

Key features:

  • Execution model: Non-blocking, event-driven operations. Requests can yield control back to the event loop when waiting, for example, for I/O operations, allowing other requests to be processed simultaneously.
  • Concurrency: Uses an event loop for cooperative multitasking. Efficiently handles many connections with lower resource consumption compared to threading or multiprocessing.
  • Complexity: Requires understanding of async/await syntax and event-driven programming, which can be more complex to write and debug.
  • Resource Utilization: More efficient resource utilization with lower overhead, due to the use of lightweight coroutines managed by an event loop running in a single thread.
  • Use Cases: Ideal for high-concurrency scenarios, especially I/O-bound tasks like web scraping, real-time communication, and APIs with high traffic.

Most popular choices:

  • FastAPI: High-performance, easy-to-use, great for APIs and microservices.
  • AIOHTTP: Flexible, supports both client and server, ideal for real-time applications.
  • Sanic: Simple and powerful, designed for high performance and fast HTTP responses.
  • Starlette: Lightweight, high-performance, suitable for async web applications and as a building block for other frameworks.
  • Quart: Flask-compatible, async/await support, perfect for migrating to asynchronous web applications.

I should note that in practice the categorization of frameworks is not so clear-cut. For example, in Flask you can use async/await, but it will still work according to the blocking model (see the sketch below). And Django can work as both an asynchronous and a synchronous framework. But if we talk about which model a particular framework traditionally implements, the list above is correct.
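To illustrate the Flask case, here is a minimal sketch (assuming Flask 2.0+ installed with the async extra, i.e. flask[async]): the view is a coroutine, but each request still occupies a worker thread for its whole duration, so the model remains blocking.

import asyncio
from flask import Flask

app = Flask(__name__)

@app.route("/")
async def index():
    # The await suspends this coroutine, but the WSGI worker thread that
    # runs it stays occupied until the response is returned.
    await asyncio.sleep(1)
    return "done"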

Okay, we've learned a little bit about frameworks. Now let's go a level higher and see how things work with web servers.

Servers

I'm sure you're aware that web servers are the backbone of any web application: they manage incoming requests, pass them to applications (frameworks), then receive responses from the applications and deliver them back to users. In the Python ecosystem, web servers can also be categorized as synchronous and asynchronous.

In general, synchronous and asynchronous servers are built on the same principles as synchronous and asynchronous frameworks respectively, since they are closely related to frameworks and directly interact with them. Let's take a look.

Synchronous Web Servers

Synchronous web servers handle each request one at a time. In other words, once a request is received, the server passes it to the framework for processing, and that particular server worker is blocked until the framework returns a response. Much the same principles as for synchronous frameworks. Let's move on.

Most popular choices:

  • Gunicorn: Pre-fork worker model, simple, widely supported, great for WSGI applications.
  • uWSGI: Highly configurable, supports multiple protocols, efficient resource management.

Asynchronous Web Servers

Asynchronous web servers handle requests in a non-blocking manner. They use an event loop to manage I/O operations, allowing the server to handle multiple requests simultaneously without waiting for the framework to finish one before starting another. Each server worker can handle hundreds or thousands of client connections concurrently.

Most popular choices:

  • Uvicorn: High-performance ASGI server, ideal for modern async frameworks like FastAPI.
  • Daphne: Developed for Django Channels, supports HTTP/2 and WebSockets, ASGI compliant.
  • Hypercorn: Flexible ASGI server with support for HTTP/2 and WebSockets.
  • Sanic: Fast web server and framework with built-in support for asynchronous request handling.
  • AIOHTTP: Comprehensive async HTTP client/server framework with WebSocket support.

The server is the base on which any framework runs, and it effectively dictates the rules of how the framework interacts with it. They communicate using the ASGI and WSGI interfaces, which we have already mentioned. Let's talk a little bit about them.

ASGI and WSGI

ASGI (Asynchronous Server Gateway Interface) and WSGI (Web Server Gateway Interface) are specifications that define a standard interface between web servers and Python web applications or frameworks. They serve similar purposes but are designed with different capabilities in mind, reflecting the evolution of web technologies and application requirements.

WSGI

The synchronous pioneer: WSGI, introduced in 2003, standardized the way Python web applications communicate with web servers. It is a synchronous, blocking interface in which each worker processes one request at a time.
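A complete WSGI application is just a callable taking the environ dict and a start_response callback, per PEP 3333. The sketch below is runnable with the standard library's wsgiref server, or with any WSGI server such as Gunicorn:

from wsgiref.simple_server import make_server

def app(environ, start_response):
    # The server calls this once per request and blocks until it returns.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello, WSGI!"]

if __name__ == "__main__":
    make_server("127.0.0.1", 8000, app).serve_forever()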

ASGI

The asynchronous evolution: ASGI, introduced in 2016, extends the concepts of WSGI to support asynchronous programming, WebSockets, and other protocols beyond HTTP. ASGI allows for more efficient handling of concurrent connections and long-lived requests.
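The corresponding minimal ASGI application is an async callable taking scope, receive, and send, which lets it interleave with other connections. Runnable with any ASGI server, e.g. uvicorn module_name:app (module name illustrative):

async def app(scope, receive, send):
    # Only handles plain HTTP here; ASGI also covers WebSockets, lifespan, etc.
    assert scope["type"] == "http"
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"Hello, ASGI!"})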

Summary

We've dealt with this question a bit, and it looks like we've put everything in its right place, but let me break it down again and look at this variety of tools from a different angle. In fact, it is not quite correct to divide frameworks and servers only into synchronous and asynchronous ones. From a technical point of view, they should also be categorized by which gateway interface they implement.

For example, WSGI is the traditional and probably still the preferred option for synchronous communication, while the newer ASGI can offer both asynchronous and synchronous communication. Or take Uvicorn, an ASGI server: it can serve a WSGI framework like Flask via a special WSGI-to-ASGI adapter (I don't know why you would do that, but it works nevertheless; see the sketch below). Gunicorn can run a special Uvicorn worker class, acting as the process manager for Uvicorn and eventually serving an ASGI framework. And frameworks such as Django or Quart can be run on both ASGI and WSGI servers.
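Here is a minimal sketch of the Flask-on-Uvicorn case (assuming Flask and asgiref are installed; names are illustrative), using asgiref's WsgiToAsgi adapter:

from flask import Flask
from asgiref.wsgi import WsgiToAsgi

flask_app = Flask(__name__)

@flask_app.route("/")
def index():
    return "Hello from a WSGI app behind an ASGI server!"

# Wrap the WSGI app so an ASGI server can serve it.
asgi_app = WsgiToAsgi(flask_app)
# Run with: uvicorn module_name:asgi_app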

As you can see, all this is a bit confusing, but let's try to sort this whole “zoo” by both criteria at once; I think this will give the clearest and most technically appropriate result.

Synchronous (WSGI) Frameworks and Servers

  • Frameworks: Django (traditional use), Flask, Pyramid, Bottle
  • Servers: Gunicorn, uWSGI

Asynchronous (ASGI) Frameworks and Servers

  • Frameworks: FastAPI, Starlette, Sanic, AIOHTTP, Django (with Channels)
  • Servers: Uvicorn, Daphne, Hypercorn, Sanic (built-in server), AIOHTTP (built-in server)

Frameworks Supporting Both Sync (WSGI) and Async (ASGI)

  • Frameworks: Django (with Django Channels), Quart (Flask-compatible with async/await support)

We have already covered a lot of ground, but we still haven't reached the main topic of the article. I apologize for the long build-up, but I'm glad to say the moment has come: we have just gained the critical mass of necessary knowledge and can now discuss what you've been reading all the way to these lines for.

So, how can concurrency be achieved?

WSGI Frameworks and Servers

For this setup, concurrency can usually be achieved with three main approaches:

1. Multiprocessing

This is the only approach that offers true parallelism in CPython at the moment (the GIL prevents threads from running Python bytecode in parallel). It can lead to significant performance improvements and is widely used for CPU-bound tasks.

Benefits:

  • Improved Performance on Multi-Core Systems: Multiprocessing allows you to leverage multiple CPU cores, enabling parallel execution of requests.
  • Isolation of Processes: Each process runs in its own memory space. If one process crashes, it doesn't affect the others. This can enhance the stability and reliability of the server.
  • Better Handling of Blocking Operations: Multiprocessing can handle blocking operations better than threading, as each process has its own Python interpreter and Global Interpreter Lock (GIL). This can be advantageous for I/O-bound tasks as well.
  • Scalability: By increasing the number of worker processes, you can scale the application to handle more concurrent requests.
  • Simpler to Implement: Multiprocessing is often easier to implement than threading, as it avoids many of the complexities and pitfalls associated with thread synchronization and shared memory.

Drawbacks:

  • Increased Memory Usage: Each process has its own memory space, which can lead to higher memory consumption compared to a multithreaded approach where threads share the same memory space.
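Outside the server context, the same idea looks like this in plain Python: a minimal sketch of CPU-bound work parallelized across processes, each with its own interpreter and GIL (the workload is illustrative).

from multiprocessing import Pool

def cpu_task(n: int) -> int:
    # Pure Python number crunching; threads could not run this in parallel.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        # The four tasks run on up to four cores simultaneously.
        print(pool.map(cpu_task, [10**6] * 4))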

Example of running several workers (processes) with Gunicorn server:

gunicorn main:app --workers=4

2. Multithreading

This approach is traditionally used to improve the performance of I/O-bound workloads on WSGI servers.

Benefits:

  • Improved Resource Utilization: Threads share the same memory space, leading to more efficient memory usage compared to multiprocessing, where each process has its own memory space.
  • Lower Overhead: Creating and managing threads generally incurs less overhead than creating and managing processes, leading to faster context switching and lower resource consumption.
  • Better for I/O-Bound Tasks: Multithreading can significantly improve performance for I/O-bound applications since threads can handle multiple I/O operations concurrently, allowing other threads to proceed while one is waiting for I/O operations to complete.
  • Easier Communication: Threads share the same memory space, making it easier to share data and communicate between threads without the need for inter-process communication mechanisms.

Drawbacks:

  • Global Interpreter Lock (GIL): In Python, the Global Interpreter Lock (GIL) can be a significant limitation for CPU-bound applications. The GIL prevents multiple native threads from executing Python bytecodes simultaneously, which means multithreading does not effectively utilize multiple CPU cores for CPU-bound tasks.
  • Thread Safety Issues: Multithreading can introduce complexity related to thread safety, including issues like race conditions, deadlocks, and synchronization problems, which require careful management and can be challenging to debug.
  • Limited Scalability for CPU-Bound Tasks: Due to the GIL, multithreading in Python does not provide significant performance benefits for CPU-bound tasks, as threads are not truly concurrent at the bytecode execution level.
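Before the server-level example, here is a minimal sketch of why threads still help for I/O despite the GIL: blocking I/O (simulated by sleep here) releases the GIL while waiting, so the waits overlap.

import time
from concurrent.futures import ThreadPoolExecutor

def io_task(n: int) -> int:
    time.sleep(1)  # stands in for a network or disk wait; releases the GIL
    return n

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(io_task, range(4)))
print(f"{results} in {time.perf_counter() - start:.1f}s")  # ~1s, not 4s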

Example of running several threads per worker with Gunicorn server:

gunicorn main:app --worker-class=gthread --workers=4 --threads=2

It's generally recommended to use the gthread worker class when you want to leverage multiple threads, as it is designed for threaded operation. In fact, if you set --threads above 1 with the default sync worker, Gunicorn switches to the gthread worker type automatically.

3. Gevent worker class

This approach leverages greenlets (lightweight coroutines) and cooperative multitasking to handle concurrency efficiently, especially for I/O-bound tasks.

Benefits:

  • Non-Blocking I/O: Gevent uses non-blocking I/O, allowing a single worker to handle many I/O-bound tasks concurrently without being blocked by slow I/O operations.
  • Lightweight Concurrency: Greenlets are much lighter than threads or processes, allowing you to run thousands of concurrent tasks within a single process.
  • Reduced Memory Usage: Since greenlets are lightweight, they consume less memory compared to threads or processes.
  • Faster Context Switching: Switching between greenlets is faster than switching between threads or processes, leading to improved performance under high concurrency.
  • Simpler Code for Concurrency: Gevent allows you to write asynchronous code in a synchronous style, which can be more intuitive and easier to maintain compared to callback-based asynchronous programming.
  • Better Resource Utilization: Gevent can make better use of a single process, handling many tasks concurrently, which can be advantageous in environments with constrained resources.

Drawbacks:

  • Limited CPU-Bound Performance: Gevent's cooperative multitasking doesn't provide significant benefits for CPU-bound tasks, as these tasks do not yield control frequently enough to allow other greenlets to run.
  • Complex Debugging and Profiling: Debugging and profiling issues related to greenlets can be more complex than traditional threading or multiprocessing, especially when dealing with race conditions and context switching.
  • Monkey Patching Required: To work effectively, gevent often requires monkey patching of standard library modules to make them non-blocking (see the sketch after this list). This can lead to compatibility issues and unexpected behaviors if not handled carefully.
  • Potential Compatibility Issues: Not all libraries and codebases are fully compatible with gevent’s monkey patching, which can limit its applicability or require significant code changes.
  • Potential for Blocking Operations: If any part of the application or its dependencies performs blocking I/O without yielding control, it can block the entire process, reducing the concurrency benefits of gevent.
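To make the monkey-patching point concrete, here is a minimal sketch (assuming gevent is installed; the URL is illustrative). The patch must run before other imports so that standard library I/O becomes cooperative:

from gevent import monkey
monkey.patch_all()  # patches socket, ssl, time, etc.; must come first

import gevent
import urllib.request  # now non-blocking under the hood thanks to the patch

def fetch(url: str) -> int:
    return urllib.request.urlopen(url).status

# Both fetches run concurrently on greenlets within a single thread.
jobs = [gevent.spawn(fetch, "https://example.com") for _ in range(2)]
gevent.joinall(jobs)
print([job.value for job in jobs])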

Example of running several gevent workers (processes) with Gunicorn server:

gunicorn main:app --worker-class=gevent --workers=5 --worker-connections=1000

You can also set the --worker-connections option to a high value to allow each worker to handle as many concurrent connections as you need. The default value is 1000, which is a good starting point.

ASGI Frameworks and Servers

There are 2 main approaches:

1. Multiprocessing

Much the same as for WSGI, which we already discussed above: we can increase the number of async workers, which will run in parallel.

Example of running several workers (processes) with the Uvicorn server:

uvicorn main:app --workers=4 --limit-concurrency=1000

You can adjust the --limit-concurrency parameter to set a limit on the number of concurrent tasks (connections) that the server will handle simultaneously. It allows you to tune server performance to your hardware capabilities and application requirements, and prevents the server from becoming overwhelmed by too many concurrent requests.

Example of running several workers (processes) with Gunicorn using the Uvicorn worker class:

gunicorn main:app -k uvicorn.workers.UvicornWorker --workers=4 --worker-connections=1000

2. Asynchronous Programming

The core of asynchronous programming is the event loop, where tasks are scheduled and executed. In Python, asyncio provides the standard event loop implementation, or a more performant alternative such as uvloop can be used. In your application you define functions with async def, called coroutines, which can pause their execution (await) to allow other coroutines to run. The switching between coroutines is handled by the event loop.
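Here is a minimal sketch of this model in an endpoint (assuming FastAPI is installed; the route and delay are illustrative). While one request awaits, the same worker serves others:

import asyncio
from fastapi import FastAPI

app = FastAPI()

@app.get("/slow")
async def slow():
    # The await suspends this coroutine; the event loop is free to run
    # other request handlers in the meantime.
    await asyncio.sleep(1)
    return {"status": "done"}

Run it with, for example: uvicorn main:app --workers=4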

Benefits:

  • Efficient and Non-blocking I/O Operations: Asynchronous programming allows you to handle many concurrent connections using a single thread or process, making it well-suited for I/O-bound tasks like network requests, file I/O, or database queries.
  • High Throughput and Lower Latency: By leveraging asynchronous I/O, ASGI frameworks can handle more requests per second compared to traditional synchronous frameworks. This is because while one request is waiting for an I/O operation, the server can process other requests.
  • Simplified Code with Async/Await Syntax: The async/await syntax simplifies writing and understanding asynchronous code, which makes it easier to maintain compared to traditional callback-based approaches.
  • Rich Async Ecosystem: The async ecosystem in Python includes libraries like httpx, asyncpg, aiohttp that support non-blocking operations and integrate well with ASGI frameworks.

Drawbacks:

  • Debugging Challenges: Asynchronous code can introduce complexities such as race conditions and subtle bugs that are difficult to reproduce and debug. The event loop model requires a different approach to error handling compared to synchronous code.
  • Learning Curve: Developers who are used to synchronous programming may need time to adapt to asynchronous programming paradigms such as coroutines, the event loop, and async context managers.
  • Library Compatibility: Not all third-party libraries support asynchronous operations. While many popular libraries are async-compatible, some libraries may only offer synchronous APIs, requiring workarounds or alternatives.
  • Risk of Blocking Operations: If your application or its dependencies perform blocking operations (e.g., synchronous HTTP requests or long-running tasks), it can negate the benefits of asynchronous programming.

Workers and Threads Configuration (Gunicorn and Uvicorn example)

Synchronous WSGI server:

For CPU-bound application:

  • Number of workers: <(2 x number_of_cores) + 1> - This can help ensure that you have enough workers to handle spikes in load.
  • Number of threads: <1 (default) per worker> - Multiple threads don't help much for CPU-bound tasks due to Python's Global Interpreter Lock (GIL).

For I/O-bound application:

  • Number of workers: <(2 x number_of_cores) + 1> - This can help ensure that you have enough workers to handle spikes in load.
  • Number of threads: <2-4 per worker> - Since I/O-bound tasks can benefit from threading, you should configure each worker to handle multiple threads.

Asynchronous WSGI server:

Such servers are specifically designed to be highly effective for I/O-bound applications, so we will discuss only the I/O-bound case:

  • Number of workers: <number_of_cores - 1> - You can potentially increase this as Gevent workers are pretty lightweight.
  • Number of threads: <not applicable> - Since Gevent workers use greenlets instead of threads.

Asynchronous ASGI server:

For such servers, the recommended configuration for I/O-bound and CPU-bound applications is pretty similar:

  • Number of workers: <number_of_cores> - Start with a number of workers close to the number of CPU cores to fully utilize the processing power, as async workers can handle many connections concurrently.
  • Number of threads: <not applicable> - Since Uvicorn workers are usually executed in one thread. The focus is on using async/await for concurrency within each worker.

Remember, these are starting points. The optimal configuration can vary based on your specific application, infrastructure, and workload. It's important to test and monitor your application under realistic loads to fine-tune these settings.
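As a starting point, here is a minimal sketch of a Gunicorn configuration file applying the I/O-bound WSGI guideline above (the file name and concrete values are illustrative):

# gunicorn_conf.py -- run with: gunicorn main:app -c gunicorn_conf.py
import multiprocessing

workers = (2 * multiprocessing.cpu_count()) + 1  # the (2 x cores) + 1 rule of thumb
worker_class = "gthread"  # threaded workers for I/O-bound WSGI apps
threads = 4               # keep at 1 for CPU-bound workloads
timeout = 30              # seconds before an unresponsive worker is restarted
max_requests = 1000       # recycle workers periodically to contain memory leaks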

General Best Practices:

  1. Monitor and adjust: Start with these guidelines and monitor your application's performance. Adjust as needed.
  2. Consider memory usage: Each worker consumes memory. Ensure you have enough RAM for your chosen configuration.
  3. Database connections: Be aware of the total number of database connections (workers x threads).
  4. Max requests: Use the --max-requests option to restart workers periodically to help manage memory leaks.
  5. Timeout: Set appropriate timeout values using --timeout.
  6. Keep some capacity in reserve: Don't max out your CPU; leave some capacity for spikes and system tasks.

Conclusion

We haven't covered all the possibilities for achieving concurrency across the full variety of Python's frameworks and servers. There are plenty more approaches to touch on, such as uWSGI's async mode, Tornado and AIOHTTP specifics, and ASGI-WSGI adapters. But as you can see, the article is already long enough, and I don't think it makes sense to explore every option at once. Let's say we covered the widely used approaches and mentioned other possibilities that you can explore in more detail if you need them.
