Why you need Python GIL and how it works

The Python Global Interpreter Lock (GIL) is a lock that allows only one thread at a time to control the Python interpreter. Let's take a look at how it works.

The GIL is a mutex: a lock that only one thread can hold at a time. Because a thread must hold it to run, only one thread can be executing Python bytecode at any given moment.

Developers writing single-threaded programs will not notice the GIL. In multi-threaded programs, however, the GIL can become a performance bottleneck for CPU-bound code.

Since the GIL allows only one thread to execute at a time, even in a multi-threaded application running on a multi-core machine, it has gained a reputation as an "infamous" feature of Python.

This article will discuss how the GIL affects the performance of applications and how this impact can be mitigated.

What problem does GIL solve in Python?

Python uses reference counting for memory management. Every object created in Python has a reference count variable that stores the number of references pointing to that object. When this count drops to zero, the memory allocated for the object is freed.

Here's a small code example demonstrating the operation of reference count variables:

import sys 
a = [] 
b = a 
sys.getrefcount(a)

Result: 3

In this example, the reference count of the empty list is 3. The list is referenced by the variable a, by the variable b, and by the argument passed to sys.getrefcount().
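The absolute number reported by sys.getrefcount() depends on the interpreter version, so it can be easier to watch how the count changes. The following is a minimal sketch, assuming CPython; the names base, after_bind and after_del are only illustrative.

```python
import sys

a = []
base = sys.getrefcount(a)        # count while only `a` refers to the list

b = a                            # bind a second name to the same list
after_bind = sys.getrefcount(a)

del b                            # drop that second reference again
after_del = sys.getrefcount(a)

print(after_bind - base)         # change caused by `b = a`
print(after_del - base)          # change once `b` is deleted
```

On CPython the first difference should be 1 and the second 0: binding a new name adds one reference, and deleting it removes that reference again.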

The problem the GIL solves is that, in a multi-threaded application, several threads could increment or decrement an object's reference count at the same time. Such a race can either leak memory that is never freed or, worse, free an object that is still referenced.

Protecting the reference count variable by adding locks to all data structures that are accessed by multiple threads can ensure that the variable is modified sequentially.
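As a rough illustration of the per-object locking idea described above, the sketch below guards a shared counter with its own lock. LockedCounter and hammer are hypothetical names made up for this example; this is not how CPython stores reference counts.

```python
import threading

class LockedCounter:
    """Hypothetical stand-in for a reference count guarded by its own lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time can run this read-modify-write step.
        with self._lock:
            self.value += 1

def hammer(counter, times):
    for _ in range(times):
        counter.increment()

counter = LockedCounter()
threads = [threading.Thread(target=hammer, args=(counter, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value)  # 4 threads x 10,000 increments = 40000
```

Every increment pays for one lock acquisition and release, which is the repeated locking cost the next paragraph refers to.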

However, adding a lock to every object creates another problem: deadlocks, which can only happen when there is more than one lock. Repeatedly acquiring and releasing locks would also reduce performance.
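The deadlock risk can be sketched with two locks taken in opposite order by two threads. This is an illustrative example, not code from the interpreter: lock_a, lock_b and worker are made-up names, and a timeout is used so the demo reports the failure instead of hanging forever.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
barrier = threading.Barrier(2)
acquired_second = {}

def worker(name, first, second):
    with first:
        barrier.wait()                      # both threads now hold their first lock
        got = second.acquire(timeout=0.2)   # give up instead of blocking forever
        if got:
            second.release()
        acquired_second[name] = got
        barrier.wait()                      # keep `first` held until both attempts are over

t1 = threading.Thread(target=worker, args=("t1", lock_a, lock_b))
t2 = threading.Thread(target=worker, args=("t2", lock_b, lock_a))
t1.start()
t2.start()
t1.join()
t2.join()

print(acquired_second)  # neither thread could get its second lock
```

Without the timeout, each thread would wait forever for the lock the other one holds. With a single global lock this situation cannot arise.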

The GIL is a single lock on the interpreter itself. It adds one rule: executing any Python bytecode requires holding the interpreter lock. Deadlocks are ruled out because the GIL is the only lock, and the overhead is small because there is only one lock to acquire and release. The price is that the GIL effectively makes any CPU-bound Python program single-threaded.

Although a GIL is also used by other interpreters, such as Ruby's, it is not the only solution to this problem. Some languages avoid the need for a GIL by managing memory with approaches other than reference counting, such as garbage collection.

On the other hand, such languages often have to make up for the single-threaded performance advantages of a GIL by adding other performance features, such as JIT compilers.

Why was GIL chosen to solve the problem?

So why was a solution that seems so limiting chosen for Python? How much does this decision matter to developers?

According to Larry Hastings, the architectural decision of GIL is one of those things that made Python popular.

Python has existed since the time when operating systems had no concept of threads. It was designed to be easy to use in order to speed up development, and more and more developers adopted it.

Many extensions were written around existing C libraries whose features Python needed. To prevent inconsistent changes, these C extensions required thread-safe memory management, which the GIL provided.

The GIL was easy to implement and integrate into Python. It also kept single-threaded programs fast, since only one lock has to be managed.

It became easier to integrate C extensions that were not thread-safe. These C extensions became one of the reasons why the Python community started to expand.

As you can see, the GIL was a pragmatic solution to a problem that CPython developers faced early in Python's life.

The impact of GIL on multi-threaded applications

How the GIL affects a program (not necessarily one written in Python) depends on whether the program is CPU-bound or I/O-bound.

CPU-bound programs spend most of their time computing: matrix multiplication, searching, image processing, and so on.

I/O-bound programs spend most of their time waiting for input or output from a user, a file, a database, or the network. They can wait a long time, because the source may need to do its own processing before the data is ready. For example, a user may be thinking about what to type into a search box, or a database may still be running a query.

Below is a simple CPU-bound program that counts down:

import time
from threading import Thread

COUNT = 50000000

def countdown(n):
    while n > 0:
        n -= 1

start = time.time()
countdown(COUNT)
end = time.time()

print('Elapsed time -', end - start)

Running this on a 6-core machine gave the following result:

Elapsed time - 2.5986981868743896

Here is the same program with a slight modification. Now the countdown is done in two parallel threads:

import time
from threading import Thread

COUNT = 50000000

def countdown(n):
    while n > 0:
        n -= 1

t1 = Thread(target=countdown, args=(COUNT//2,))
t2 = Thread(target=countdown, args=(COUNT//2,))

start = time.time()
t1.start()
t2.start()
t1.join()
t2.join()
end = time.time()

print('Elapsed time -', end - start)

And here is the result:

Elapsed time - 2.760884475708008

As can be seen from the results, both variants took approximately the same amount of time. In the multithreaded version, GIL prevented parallel execution of threads.

The GIL has little effect on the performance of I/O-bound multi-threaded programs, because a thread releases the lock while it waits for I/O, so other threads can run in the meantime.

However, a program whose threads are entirely CPU-bound (for example, one that processes an image in parts with several threads) not only becomes effectively single-threaded because of the lock, but also takes longer to run than a strictly single-threaded version.

The extra time is the overhead of acquiring and releasing the lock.
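To contrast with the CPU-bound countdown, the I/O-bound case can be sketched with time.sleep() standing in for a blocking I/O call. This is an assumption made for illustration: in CPython, time.sleep() releases the GIL while waiting, much as blocking reads and network calls do. The helper fake_io and the constant WAIT are hypothetical.

```python
import time
from threading import Thread

WAIT = 0.2  # seconds each simulated I/O call spends waiting

def fake_io(seconds):
    # Stand-in for a blocking I/O call; the GIL is released while sleeping.
    time.sleep(seconds)

# Two waits, one after the other.
start = time.perf_counter()
fake_io(WAIT)
fake_io(WAIT)
sequential = time.perf_counter() - start

# The same two waits in separate threads.
threads = [Thread(target=fake_io, args=(WAIT,)) for _ in range(2)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print('Sequential -', sequential)
print('Threaded   -', threaded)
```

The threaded run should take roughly one wait instead of two, because the waits overlap even though only one thread can execute bytecode at a time.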

Why is GIL still being used?

Language developers have received plenty of complaints regarding GIL. However, for a popular language like Python, such a radical change as removing GIL cannot be made easily, as it would naturally lead to a slew of compatibility issues.

In the past, attempts were made to remove the GIL. However, all of them broke existing C extensions, which rely heavily on the guarantees the GIL provides. There are alternatives to the GIL, of course, but they either degrade the performance of single-threaded and multi-threaded I/O-bound programs or are simply too difficult to implement. You wouldn't want your program to run slower in newer versions, would you?

Guido van Rossum, the creator of Python, addressed this issue in September 2007 in an article titled "It isn't Easy to Remove the GIL":

"I'd welcome a set of patches into Py3k only if the performance for a single-threaded program (and for a multi-threaded but I/O-bound program) does not decrease."

Since then, none of the attempts made met this condition.

Why wasn't GIL removed in Python 3?

Python 3 did have the opportunity to rebuild many features from scratch, and in the process it broke some existing C extensions, which then had to be updated and ported. This is one reason why early versions of Python 3 were adopted so slowly by the community.

But why not remove GIL alongside the update to Python 3?

Removing it would have made single-threaded programs slower in Python 3 than in Python 2, and you can imagine the consequences of that. The single-threaded performance benefits of the GIL cannot be ignored. That's why it still hasn't been removed.

However, Python 3 did see improvements for the existing GIL. Up to this point, the article discussed the impact of GIL on multi-threaded programs that either exclusively utilize the CPU or exclusively I/O. But what about programs where some threads are CPU-bound while others are I/O-bound?

In such programs, the GIL used to starve the I/O-bound threads, because they rarely got a chance to take the GIL from the CPU-bound threads. The cause was a mechanism built into older versions of Python that forced a thread to release the GIL after a fixed interval of continuous use. If no other thread acquired the GIL, the same thread could keep using it.

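import sys
# Python 2: the interval is measured in interpreter "ticks" and defaults to 100.
# sys.getcheckinterval() was deprecated in Python 3.2 and removed in Python 3.9.
sys.getcheckinterval()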

Result: 100

However, there was one problem with this mechanism. A CPU-bound thread would almost always reacquire the GIL before the other threads could get it, so they rarely got a chance to run. David Beazley studied this behavior, and you can see his visualization of it here.

Antoine Pitrou addressed the problem with a reworked GIL, written in 2009 and shipped in Python 3.2. He added a mechanism that tracks GIL acquisition requests from other threads that were dropped, and does not let the current thread reacquire the GIL before those threads get a chance to run.
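In Python 3.2 and later, the tick-based check interval gave way to a time-based switch interval, which can be read with sys.getswitchinterval(). A minimal sketch, assuming a Python 3 interpreter; the documented default is 0.005 seconds, though a program may have changed it with sys.setswitchinterval().

```python
import sys

# Time, in seconds, after which a running thread is asked to release the GIL
# if another thread is waiting for it.
interval = sys.getswitchinterval()
print(interval)
```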

How to deal with GIL?

If GIL is causing you problems, here are a few solutions you can try:

Multiprocessing instead of multithreading. This is a popular approach: each Python process has its own interpreter and its own memory space, so the GIL of one process does not block another. Python's standard library includes the multiprocessing module, which makes creating processes straightforward:

from multiprocessing import Pool
import time

COUNT = 50000000

def countdown(n):
    while n > 0:
        n -= 1

if __name__ == '__main__':
    pool = Pool(processes=2)
    start = time.time()
    r1 = pool.apply_async(countdown, [COUNT//2])
    r2 = pool.apply_async(countdown, [COUNT//2])
    pool.close()
    pool.join()
    end = time.time()
    print('Elapsed time in seconds -', end - start)

Running this gave the following result:

Elapsed time in seconds - 1.4896368980407715

This is a significant improvement over the multithreaded version. However, the time was not cut in half, because managing processes has its own overhead. Processes are also heavier than threads, since each one needs its own interpreter and memory, so use them with care.

Alternative Python interpreters. Python has several interpreter implementations, including CPython, Jython, IronPython, and PyPy, written in C, Java, C#, and Python, respectively. The GIL is a feature of CPython, the original implementation. PyPy also has a GIL, while Jython and IronPython do not.
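If it is unclear which implementation a script is running on, the standard library can report it. A small sketch using platform.python_implementation(), which returns a name such as 'CPython', 'PyPy', 'Jython' or 'IronPython':

```python
import platform

# Name of the interpreter implementation executing this script.
implementation = platform.python_implementation()
print(implementation)
```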

Just wait. You can keep enjoying the single-threaded performance benefits of the GIL while some of the brightest minds work on removing it from CPython. Here's one of the attempts.

Often, GIL is regarded as something complex and incomprehensible. But keep in mind that as a Python developer, you'll only encounter GIL if you're writing C extensions or multi-threaded CPU-bound programs.

Zakhar Kravchenko

Junior Python Developer
