Delving into memory inspection with Python



In this article, we investigate how Python's memory management affects what memory inspection tools report. The aim is to shed light on a few common measurements and why they can differ, rather than to unravel the full inner workings of the memory manager. Along the way we explain why the underlying mechanics can make memory inspection of Python applications deceptive.

Understanding RSS

RSS (Resident Set Size) is the portion of a process's memory that is held in RAM at a specific moment in time.

Getting the RSS Memory Consumption of a Program

The following snippet prints out the RSS memory consumption of the currently running process.
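One way to do this with only the standard library is sketched below. It assumes a Linux system where /proc is available, and the helper name rss_bytes is hypothetical, chosen for illustration.

```python
import os


def rss_bytes():
    """Return the current resident set size of this process in bytes (Linux only)."""
    # /proc/self/statm reports sizes in pages; the second field is the resident set.
    with open("/proc/self/statm") as f:
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE")


print(f"RSS: {rss_bytes() / 1024 ** 2:.1f} MB")
```

On other platforms a third-party library would be the more portable choice; the /proc approach is used here only to keep the sketch dependency-free.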

The popular Linux command-line tool htop will also do if we want to inspect the RSS memory of a process; htop shows it in the RES column.

htop -p <process-id>        

Delve Deeper Into RSS

Let's take this snippet as an example to understand RSS better.

It is just a simple snippet that reads a file and prints the memory consumption along the way. You can place time.sleep(10) between these lines and inspect the process with htop at each pause to get a better picture.
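A minimal sketch of such a program, assuming Linux and a throwaway 20 MB text file, could look like this. The helpers rss_mb and make_sample_file and the file size are assumptions made for illustration, and the numbers printed will vary by machine and Python version.

```python
import os
import tempfile


def rss_mb():
    """Current RSS of this process in MB (Linux only, read from /proc)."""
    with open("/proc/self/statm") as f:
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE") / 1024 ** 2


def make_sample_file(size_mb=20):
    """Write a throwaway text file of roughly size_mb megabytes and return its path."""
    chunk = "x" * (1024 * 1024)
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        for _ in range(size_mb):
            tmp.write(chunk)
        return tmp.name


def main(path):
    readings = [rss_mb()]        # before reading the file
    with open(path) as f:
        content = f.read()
    readings.append(rss_mb())    # file content is held in memory
    del content                  # drop the only reference to the content
    readings.append(rss_mb())    # after we are done with the content
    return readings


sample_path = make_sample_file()
try:
    readings = main(sample_path)
finally:
    os.remove(sample_path)

for label, value in zip(("before read", "after read", "after release"), readings):
    print(f"{label}: {value:.1f} MB")
```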

The output of the program looks something like this:

The output looks a bit confusing. It is totally understandable that the program's memory utilization increases while the file is in memory, but why is the memory utilization still 25 MB after we are done with it? Some of us might think the residual memory usage is just the interpreter's own allocations, and that is not necessarily wrong, but it is not the whole story. Let's modify the snippet a bit.

Logically, the following snippet does nothing different: it still prints the memory consumption before reading the file, after reading the file, and after it is done with the file's content. What do you think the output will be? The same as above, correct? Well, that's where the twist is: the answer is NO.
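A sketch of the modified version, under the same assumptions as before (Linux, hypothetical helper names, a throwaway 20 MB file), moves the file read into a second function so that the content is released when that function returns:

```python
import os
import tempfile


def rss_mb():
    """Current RSS of this process in MB (Linux only, read from /proc)."""
    with open("/proc/self/statm") as f:
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE") / 1024 ** 2


def make_sample_file(size_mb=20):
    """Write a throwaway text file of roughly size_mb megabytes and return its path."""
    chunk = "x" * (1024 * 1024)
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        for _ in range(size_mb):
            tmp.write(chunk)
        return tmp.name


def read_file(path):
    with open(path) as f:
        content = f.read()
    # Measure while the content is alive; it is released when this function returns.
    return rss_mb()


def main(path):
    before = rss_mb()            # before reading the file
    loaded = read_file(path)     # measured inside the helper, content in memory
    after = rss_mb()             # helper has returned, content is gone
    return [before, loaded, after]


sample_path = make_sample_file()
try:
    readings = main(sample_path)
finally:
    os.remove(sample_path)

for label, value in zip(("before read", "after read", "after release"), readings):
    print(f"{label}: {value:.1f} MB")
```

Whether the final reading differs from the single-function version, and by how much, depends on the interpreter version and the system allocator, so no output is shown here.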

What just happened? The RSS memory consumption seems to have decreased by 5 MB, although logically the program does nothing different. For most of us this makes little to no sense, and even if we were to say Python does something weird inside, the curiosity still persists: WHAT DOES IT DO? The answer lies in understanding how heap fragmentation works.

Understanding Heap Fragmentation

Here we look at what heap fragmentation is and why RSS inspection might not give an accurate picture of the memory a Python program is actually using.


The picture is fairly self-explanatory. Let's assume that each fragment filled with orange is a Python object. We allocated a small object, then one large object, then another small object, so the large object sits between the two small ones. Then we deleted the large object, or were simply done processing it, and it was garbage collected. The small object allocated after the large one, however, is still in use. The RSS memory consumption will still show 10 MB, although only 1.5 MB is in use by the process.

This explains what we saw in the script above: when we first read the file in the same function and then moved the read into a second function, the RSS consumption changed. The objects are allocated in a different order, which changes how much of the heap can be given back to the operating system. Freed memory is generally returned only from the end of the heap, so until the last live object above a freed region is collected, RSS keeps reporting the same usage, even though new objects can be allocated in the empty fragments. The point is simply that inspecting RSS alone may not tell you what you are actually after when monitoring the memory consumption of a Python program.
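The effect can be provoked deliberately. The sketch below, which assumes Linux and CPython, allocates many small objects, keeps only every hundredth one, and frees the rest. The helper rss_mb and the object counts are illustrative assumptions. On CPython the last RSS reading typically stays well above the live size, because the scattered survivors keep their surrounding memory regions pinned, but the exact numbers depend on the interpreter version and allocator.

```python
import os
import sys


def rss_mb():
    """Current RSS of this process in MB (Linux only, read from /proc)."""
    with open("/proc/self/statm") as f:
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE") / 1024 ** 2


baseline = rss_mb()

# Allocate many small objects so that they are spread across the heap.
blocks = [bytes(100) for _ in range(500_000)]
allocated = rss_mb()

# Keep 1% of the objects, scattered evenly, and release everything else.
survivors = blocks[::100]
del blocks
after_free = rss_mb()

# Memory actually needed by the objects that are still alive.
live_mb = sum(sys.getsizeof(b) for b in survivors) / 1024 ** 2

print(f"baseline:        {baseline:.1f} MB")
print(f"after allocate:  {allocated:.1f} MB")
print(f"after free:      {after_free:.1f} MB")
print(f"live objects:    {live_mb:.2f} MB")
```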

Delve Deeper Into getsizeof

Previously we discussed how RSS inspection might be deceptive. Staying on the same topic, let's look at a very popular function that most programmers use and see why it can be deceptive as well.

Let's take a simple example: we have two lists, one with 4 integers and one with 4 strings, and we use sys.getsizeof() to get the size of each list. A string takes more memory than an integer, so the size of list_2 should definitely be larger. Let's inspect the output.
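A minimal sketch of that example follows. The element values are assumptions chosen for illustration; only the names list_1 and list_2 and the use of sys.getsizeof() come from the description above.

```python
import sys

list_1 = [1, 2, 3, 4]                      # four integers
list_2 = ["one", "two", "three", "four"]   # four strings

print(f"size of list_1 is {sys.getsizeof(list_1)}")
print(f"size of list_2 is {sys.getsizeof(list_2)}")
```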

size of list_1 is 88
size of list_2 is 88
        

To our surprise, the sizes of both lists appear to be the same. Does that really mean both lists occupy the same amount of memory? The answer is NO, and this is exactly why a function like this tends to give deceptive information.

Understanding getsizeof

There's not much to understand, except that getsizeof returns the size of the object it is given, and in this case that object happens to be a list. So what we get is the size of the list itself, not the aggregate size of the objects in the list. The list stores only pointers (references) to the actual string or integer objects, and a pointer to a string has the same size as a pointer to an integer. That is why getsizeof returns the same size for both lists.
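To see the difference, the sizes of the referenced objects can be added to the size of the list itself. The helper below, total_size, is a hypothetical sketch that only handles flat lists and counts each distinct element once. It does not follow nested containers, and it attributes shared objects such as small cached integers entirely to the list, so treat its result as a rough estimate.

```python
import sys


def total_size(items):
    """Size of a flat list plus the size of each distinct object it references."""
    seen = set()
    total = sys.getsizeof(items)           # the list itself: header plus pointers
    for item in items:
        if id(item) not in seen:           # count an object referenced twice only once
            seen.add(id(item))
            total += sys.getsizeof(item)   # the object the pointer refers to
    return total


list_1 = [1, 2, 3, 4]
list_2 = ["one", "two", "three", "four"]

print(f"list only:     {sys.getsizeof(list_1)} vs {sys.getsizeof(list_2)}")
print(f"with elements: {total_size(list_1)} vs {total_size(list_2)}")
```

With the elements included, the list of strings comes out larger than the list of integers, which matches the intuition that getsizeof alone did not confirm.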

Conclusion

In conclusion, Python's memory management can be challenging and mysterious at times. We can better investigate memory-related issues in our Python programs by understanding the deceptive nature of some memory inspection techniques, in particular what RSS and sys.getsizeof() actually measure.






