Delving into Memory Inspection with Python
In this article, we set out to investigate Python's complex memory management mechanisms. Our aim is to shed light on some measurements and their potential differences rather than to attempt to unravel the inner workings of memory management. We'll provide explanations for why the underlying mechanics can make memory inspection of Python applications deceptive.
Understanding RSS
RSS (Resident Set Size) refers to the portion of a process's memory that is actually held in RAM at a specific moment in time.
Getting the RSS Memory Consumption of a Program
The following snippet prints the RSS memory consumption of the currently running process.
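As a minimal sketch of such a measurement (not necessarily the snippet originally used here), the helper below reads the VmRSS field from /proc/self/status using only the standard library. It assumes a Linux system; the function name current_rss_bytes is made up for this example, and the fallback branch reports the peak RSS rather than the current one.

```python
import sys


def current_rss_bytes():
    """Return the RSS of the running process in bytes (sketch, Linux-first)."""
    try:
        # On Linux, /proc/self/status has a line such as "VmRSS:  12345 kB".
        with open("/proc/self/status") as status:
            for line in status:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1]) * 1024
    except OSError:
        pass
    # Fallback (assumption): ru_maxrss is the *peak* RSS, not the current one.
    # It is reported in bytes on macOS and in kilobytes on Linux.
    import resource
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return peak if sys.platform == "darwin" else peak * 1024


if __name__ == "__main__":
    print(f"RSS: {current_rss_bytes() / (1024 * 1024):.1f} MB")
```

The value printed should be close to the RES column that htop shows for the same process.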
htop, a popular Linux command-line tool, can also be used to inspect the RSS memory of a process.
htop -p <process-id>
Delve Deeper Into RSS
Let's take this snippet as an example to understand it better.
It is just a simple snippet that reads a file and prints the memory consumption along the way. You can place time.sleep(10) between these lines and inspect the process with htop at each pause to get a better understanding.
The output of the program looks something like this:
The output looks a bit confusing. It is totally understandable that the program's memory utilization increases while the file is in memory, but why is the memory utilization still 25 MB after the content is no longer needed? Some of us might think the residual memory usage is just the interpreter allocating its own resources. That is not necessarily wrong, but it is not the whole story. Let's modify the snippet a bit.
Logically, the following snippet does nothing different: it still prints the memory consumption before reading the file, after reading the file, and after it is done with the content of the file. What do you think the output will be? The same as above, correct? Well, that's where the twist is: the answer is NO.
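The two variants described so far can be sketched side by side as follows. This is a reconstruction under assumptions, not the original code: the function names, the helper process(), the generated sample file and the Linux-only rss_mb() reader are all invented for illustration. Run each variant in a fresh process, since earlier allocations in the same process influence later readings.

```python
import os
import sys
import tempfile


def rss_mb():
    """Current RSS in MB, read from /proc (Linux only; an assumption here)."""
    with open("/proc/self/status") as status:
        for line in status:
            if line.startswith("VmRSS:"):
                return int(line.split()[1]) / 1024  # the field is in kB
    raise RuntimeError("VmRSS not found")


def variant_single_function(path):
    """Read the file and take all three readings inside one function."""
    readings = [rss_mb()]            # before reading the file
    with open(path, "rb") as handle:
        content = handle.read()
    readings.append(rss_mb())        # while the content is in memory
    del content
    readings.append(rss_mb())        # after the content is dropped
    return readings


def process(path):
    """Hypothetical helper: the content only lives inside this call."""
    with open(path, "rb") as handle:
        content = handle.read()
    in_memory = rss_mb()
    del content
    return in_memory


def variant_split_function(path):
    """Same three readings, but the read happens in a separate function."""
    readings = [rss_mb()]
    readings.append(process(path))
    readings.append(rss_mb())
    return readings


def make_sample_file(size_mb=20):
    """Create a throwaway file of roughly size_mb megabytes."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as handle:
        handle.write(b"x" * (size_mb * 1024 * 1024))
    return path


if __name__ == "__main__":
    sample = make_sample_file()
    use_split = sys.argv[1:] == ["split"]
    variant = variant_split_function if use_split else variant_single_function
    try:
        for label, value in zip(("before", "in memory", "after"), variant(sample)):
            print(f"{label}: {value:.1f} MB")
    finally:
        os.remove(sample)
```

The exact numbers depend on the Python version, the allocator and the operating system, so the difference reported in this article may not reproduce on every machine.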
What just happened? The RSS memory consumption seems to have decreased by 5 MB, although logically the program does nothing different. For most of us this makes little to no sense, and even if we were to say that Python does something weird inside, the question still persists: WHAT DOES IT DO? The answer lies in understanding how heap fragmentation works.
Understanding Heap Fragmentation
Here we look at what heap fragmentation is and why RSS inspection might not give accurate information about the memory consumption of a Python program.
This picture, although it was hard to find, is self-explanatory. Let's assume that each fragment filled with orange is a Python object and that initially there were 2 of them. We allocated one large object between the 2 small objects. Then we deleted the large object, or were simply done processing it, and it was garbage collected. However, the small object allocated after the large object is still in use. The RSS memory consumption will still show 10 MB, although only 1.5 MB is in use by the process.
In the context of our script above, we noticed a change in RSS consumption when the file was read in a second function instead of in the same function, and this explains why. The objects are allocated in a different order, which reduces the memory held by the process. Until the last object sitting above the fragmented region of the heap is collected, the RSS memory consumption will show the same usage, although new objects can still be allocated in the empty fragments. The point is simply that inspecting RSS consumption alone may not be what you are after when monitoring the memory consumption of a Python program.
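The scenario above can be mimicked with a toy model. The ToyHeap class below is purely hypothetical and is not how CPython or the system allocator is implemented; it only assumes that the heap can shrink from the top and that freed holes in the middle can be reused but not handed back to the operating system.

```python
class ToyHeap:
    """Toy heap: memory is only returned when the topmost blocks are free."""

    def __init__(self):
        # Each block is [owner, size_mb]; owner None marks a free hole.
        self.blocks = []

    def allocate(self, owner, size_mb):
        # First fit: reuse a hole that is large enough, splitting off the rest.
        for index, (current, size) in enumerate(self.blocks):
            if current is None and size >= size_mb:
                self.blocks[index] = [owner, size_mb]
                if size > size_mb:
                    self.blocks.insert(index + 1, [None, size - size_mb])
                return
        # No suitable hole: grow the heap at the top.
        self.blocks.append([owner, size_mb])

    def free(self, owner):
        for block in self.blocks:
            if block[0] == owner:
                block[0] = None
        # Only holes at the very top can be given back.
        while self.blocks and self.blocks[-1][0] is None:
            self.blocks.pop()

    def rss_mb(self):
        return sum(size for _, size in self.blocks)

    def in_use_mb(self):
        return sum(size for owner, size in self.blocks if owner is not None)


# Large object allocated between two small ones, then freed.
fragmented = ToyHeap()
fragmented.allocate("small_a", 0.75)
fragmented.allocate("large", 8.5)
fragmented.allocate("small_b", 0.75)
fragmented.free("large")
print(fragmented.rss_mb(), fragmented.in_use_mb())  # 10.0 1.5

# Same objects, but the large one is allocated last.
compact = ToyHeap()
compact.allocate("small_a", 0.75)
compact.allocate("small_b", 0.75)
compact.allocate("large", 8.5)
compact.free("large")
print(compact.rss_mb(), compact.in_use_mb())  # 1.5 1.5
```

In the first ordering the model keeps reporting 10 MB even though only 1.5 MB is in use, while a later allocation such as fragmented.allocate("new", 2.0) fits into the hole without raising the reported figure.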
Delve Deeper Into getsizeof
Previously we discussed how RSS inspection might be deceptive. Staying on the same topic, let's look at a very popular function that most programmers use, and see why it can be deceptive as well.
Let's take a simple example: we have 2 lists, one with 4 integers and one with 4 strings, and we use sys.getsizeof() to get the size of each list. Anyone familiar with how objects are laid out in memory would expect a string to take more memory than an integer, so the size of list_2 should definitely be larger. Let's inspect the output.
size of list_1 is 88
size of list_2 is 88
To our surprise, the sizes of both lists appear to be the same. Does that really mean both lists occupy the same amount of memory? The answer is NO, and this is exactly why a function like this tends to provide deceptive information.
Understanding getsizeof
There's not much to understand, except for the fact that getsizeof returns the size of the object itself, and in this case the object happens to be a list. So what we get is the size of the list structure, not the aggregate size of the objects in the list. What the list actually stores are only pointers (references) to the string or integer objects, and a pointer to a string has the same size as a pointer to an integer. That's why both calls return the same size.
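A minimal sketch of the difference, assuming CPython: the helper name shallow_and_deep_size and the list contents are made up for this example, and the "deep" figure only adds the sizes of the direct elements (it does not recurse into nested containers and may over-count objects that are shared, such as cached small integers).

```python
import sys


def shallow_and_deep_size(items):
    """Return (size of the list itself, list size plus its direct elements)."""
    shallow = sys.getsizeof(items)
    deep = shallow + sum(sys.getsizeof(item) for item in items)
    return shallow, deep


list_1 = [1, 2, 3, 4]
list_2 = ["alpha", "beta", "gamma", "delta"]

for name, values in (("list_1", list_1), ("list_2", list_2)):
    shallow, deep = shallow_and_deep_size(values)
    print(f"{name}: list only = {shallow} bytes, with elements = {deep} bytes")
```

The "list only" figures match for both lists because each holds four pointers, whereas the figure that includes the elements is larger for the list of strings.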
Conclusion
In conclusion, Python's memory management can be challenging and mysterious at times. We can inspect memory-related issues in our Python programs more effectively once we understand the deceptive nature of some memory inspection techniques and know what RSS and getsizeof actually measure.