Python3: Mutable, Immutable... everything is object!

As part of the "Python - Everything is object" project for my second trimester at Holberton School, I'm writing this article about what "id" and "type" are in the Python programming language and how we can use them. I'll also go into what mutable and immutable objects are, why the distinction matters, and how differently Python treats them. Last but not least, we'll take a look at how arguments are passed to functions and what that implies for mutable and immutable objects.

"id"

The "id" function returns the “identity” of an object. The identity is an integer that is guaranteed to be unique and constant for the object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value. In the CPython implementation, this is the address of the object in memory.

Python does not cache id() values themselves. What happens is that an implementation such as CPython may reuse a single object for some immutable values, such as small integers and certain strings and tuples. So you might find that multiple variables refer to the same object, and therefore have the same id() value, when their values are equal. Take a look at the example below.

# integers
a = 10
b = 10
c = 11
d = 12

print(id(a))
print(id(b))
print(id(c))
print(id(d))        

Output:

4317900064
4317900064
4317900096
4317900128        

Notice that ‘a’ and ‘b’ have the same id() value because both names refer to the same integer object. What happens if we try strings and tuples too?

# tuples
t = ('A', 'B')
print(id(t))

t1 = ('A', 'B')
print(id(t1))

# strings
s1 = 'ABC'
s2 = 'ABC'
print(id(s1))
print(id(s2))        

Output:

4320130056
4320130056
4320080816
4320080816        

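From the output, both tuple names refer to one object, and so do both string names. In CPython this happens because equal constants within the same script are compiled once and shared, and short identifier-like strings are interned. This is an implementation detail rather than a language guarantee, so the same lines typed one by one in an interactive session may give different ids.

Sharing is only safe for immutable objects; notice that integers, strings, and tuples are all immutable. Because such objects can never change, an implementation is free to reuse them to save memory and improve performance.

We know that a dictionary is mutable. Let’s see whether id() differs for two dictionaries even when their elements are the same.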

# dict
d1 = {"A": 1, "B": 2}
d2 = {"A": 1, "B": 2}
print(id(d1))
print(id(d2))        

Output:

4519884624
4519884768        

The two dict objects return different id() values; mutable objects are not shared this way.

Let’s see an example of getting id() value for a custom object.

class Emp:
    a = 0


e1 = Emp()
e2 = Emp()

print(id(e1))
print(id(e2))

Output:

4520251744
4520251856        

The id() value is guaranteed to be unique and constant for an object during its lifetime. We can use it to check whether two names refer to the same object in memory.
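Here is a small sketch of how that identity check looks in practice. The variable names are made up for illustration. The `is` operator compares identities, which gives the same answer as comparing id() values while both objects are alive.

```python
# Two names bound to one list, plus a separate list with equal contents.
first = [1, 2, 3]
alias = first          # same object, second name
twin = [1, 2, 3]       # equal value, different object

same_object = first is alias                 # identity comparison
same_ids = id(first) == id(alias)            # equivalent check via id()
equal_but_distinct = (first == twin) and (first is not twin)

print(same_object, same_ids, equal_but_distinct)  # True True True
```

Use == when you care about equal values and `is` when you care about one and the same object.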

"type"

We use the built-in type() function to identify the type of a Python object. It is a straightforward function and easy to understand.

Python type() function syntax is:

type(object)

type(name, bases, dict)        

When a single argument is passed to the type() function, it returns the type of the object. This is generally the same value as the object's __class__ attribute.

When three arguments are passed, it returns a new type object. This form creates a class dynamically at run time.

  • “name” string becomes the class name. It’s the same as the __name__ attribute of a class.
  • “bases” tuple specifies the base classes. It’s the same as the __bases__ attribute of the class.
  • “dict” dictionary helps create the class body. It’s the same as the __dict__ attribute of the class.

Let’s look into some examples of using the type() function.

1. Finding the type of a Python object

x = 10
print(type(x))

s = 'abc'
print(type(s))

from collections import OrderedDict

od = OrderedDict()
print(type(od))

class Data:
    pass

d = Data()
print(type(d))        

Output:

<class 'int'>
<class 'str'>
<class 'collections.OrderedDict'>
<class '__main__.Data'>        

Notice that the type() function returns the type of the object along with the module name. Since our Python script is run directly rather than imported, its module name is __main__.

2. Extracting Details from Python Classes

Let’s say we have the following classes. We’ll pull metadata about them using the __class__, __bases__, __dict__, and __doc__ attributes.

class Data:
    """Data Class"""
    d_id = 10


class SubData(Data):
    """SubData Class"""
    sd_id = 20        

Let’s print some of the properties of these classes.

print(Data.__class__)
print(Data.__bases__)
print(Data.__dict__)
print(Data.__doc__)

print(SubData.__class__)
print(SubData.__bases__)
print(SubData.__dict__)
print(SubData.__doc__)        

Output:


<class 'type'>
(<class 'object'>,)
{'__module__': '__main__', '__doc__': 'Data Class', 'd_id': 10, '__dict__': <attribute '__dict__' of 'Data' objects>, '__weakref__': <attribute '__weakref__' of 'Data' objects>}
Data Class

<class 'type'>
(<class '__main__.Data'>,)
{'__module__': '__main__', '__doc__': 'SubData Class', 'sd_id': 20}
SubData Class        

We can create similar classes using the type() function.

Data1 = type('Data1', (object,), {'__doc__': 'Data1 Class', 'd_id': 10})
SubData1 = type('SubData1', (Data1,), {'__doc__': 'SubData1 Class', 'sd_id': 20})

print(Data1.__class__)
print(Data1.__bases__)
print(Data1.__dict__)
print(Data1.__doc__)

print(SubData1.__class__)
print(SubData1.__bases__)
print(SubData1.__dict__)
print(SubData1.__doc__)        

Output:

<class 'type'>
(<class 'object'>,)
{'__doc__': 'Data1 Class', 'd_id': 10, '__module__': '__main__', '__dict__': <attribute '__dict__' of 'Data1' objects>, '__weakref__': <attribute '__weakref__' of 'Data1' objects>}
Data1 Class

<class 'type'>
(<class '__main__.Data1'>,)
{'__doc__': 'SubData1 Class', 'sd_id': 20, '__module__': '__main__'}
SubData1 Class        

Note that the class body is supplied as a dictionary, so methods have to be defined as ordinary functions (or lambdas) first and then placed in the “dict” argument.
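A dynamically created class can still have methods. The sketch below is a minimal illustration, assuming a hypothetical class name Data2 and a hypothetical method describe; a plain function stored in the dictionary passed to type() behaves like a method defined in a normal class body.

```python
# Hypothetical example: give a type()-built class a method.
def describe(self):
    """Return a short description of the instance."""
    return f"{type(self).__name__} with d_id={self.d_id}"

Data2 = type('Data2', (object,), {
    '__doc__': 'Data2 Class',
    'd_id': 10,
    'describe': describe,  # plain functions in the dict become methods
})

d = Data2()
print(d.describe())  # Data2 with d_id=10
```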

Real Life Usage of the type() function

Python is a dynamically typed language, so if we want to know the type of an argument at run time, we can use the type() function. If you want to make sure that your function works only on specific types of objects, use the isinstance() function.

Let’s say we want to create a function to calculate something on two integers. We can implement it in the following way.

def calculate(x, y, op='sum'):
    if not(isinstance(x, int) and isinstance(y, int)):
        print(f'Invalid Types of Arguments - x:{type(x)}, y:{type(y)}')
        raise TypeError('Incompatible types of arguments, must be integers')
    
    if op == 'difference':
        return x - y
    if op == 'multiply':
        return x * y
    # default is sum
    return x + y
        

The isinstance() function is used to validate the input argument type. The type() function is used to print the type of the parameters when validation fails.
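The two checks are not interchangeable. As a hedged illustration, the sketch below uses a hypothetical int subclass named Celsius to show that type() reports the exact class, while isinstance() also accepts subclasses. One consequence is that an isinstance(x, int) check like the one in calculate() also lets booleans through, because bool is a subclass of int.

```python
class Celsius(int):
    """Hypothetical int subclass used only for this illustration."""

temp = Celsius(21)

exact_match = type(temp) is int          # the exact type is Celsius, not int
subclass_ok = isinstance(temp, int)      # Celsius inherits from int
bool_is_int = isinstance(True, int)      # bool is a subclass of int

print(exact_match, subclass_ok, bool_is_int)  # False True True
```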

Mutable And Immutable Objects

A mutable object is an object whose state can be modified after it is defined. The opposite of a mutable object is an immutable object, whose state cannot be altered after it is initially defined.

Examples of immutable objects in Python include integers, floats, strings, and tuples. The two types of mutable objects you’ll likely deal with most often when programming in Python are lists and dictionaries.

Let’s look at a concrete example of the difference between mutability and immutability. Say I define a list of songs. If I want to change the list by replacing one of the songs, that’s perfectly fine, because lists are mutable:

>>> my_songs = ["Sweet Child O Mine", "Live And Let Die", "Knockin On Heavens Door"]
>>> my_songs[1] = "November Rain"
>>> my_songs
['Sweet Child O Mine', 'November Rain', 'Knockin On Heavens Door']

By contrast, suppose I define a string, which is an immutable object, and then try to change one of its characters in place. Python complains:

>>> my_song = "Sweet Child O Mine"
>>> my_song[9] = 'e'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment        

Once the string has been defined, its contents cannot be changed. To get different contents, such as a corrected spelling, I need to create a new string.
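A minimal sketch of what "a new string" means here, assuming a deliberately misspelled title that is not part of the session above. str.replace() builds and returns a new object, and the original string stays exactly as it was.

```python
# Hypothetical misspelled title, used only for illustration.
my_song = "Sweet Chald O Mine"
original_id = id(my_song)

# Build a corrected string; the original object is left untouched.
fixed_song = my_song.replace("Chald", "Child")

print(fixed_song)                     # Sweet Child O Mine
print(my_song)                        # Sweet Chald O Mine
print(id(fixed_song) == original_id)  # False: a different object
```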

If you think about the functionality of lists and dictionaries, you’ll realize you’ve been taking advantage of their mutability all along, even if you may not have noticed it. Methods such as append, extend, and update all modify the list or dictionary they are called on, and can be very useful for keeping track of data or accomplishing a certain programming task.
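To make the in-place behaviour concrete, here is a short sketch with made-up data. It records the id() of a list and a dictionary, calls the mutating methods, and checks that the identities have not changed.

```python
songs = ["Sweet Child O Mine"]
ratings = {"Sweet Child O Mine": 5}
songs_id, ratings_id = id(songs), id(ratings)

songs.append("November Rain")             # add one item
songs.extend(["Patience", "Estranged"])   # add several items
ratings.update({"November Rain": 5})      # add or overwrite keys

# Same objects as before, only their contents changed.
print(id(songs) == songs_id, id(ratings) == ratings_id)  # True True
print(len(songs), len(ratings))                          # 4 2
```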

Mutable and Immutable Data Types in Python

  • Some of the mutable data types in Python are list, dict, set, and, by default, instances of user-defined classes.
  • On the other hand, some of the immutable data types are int, float, decimal.Decimal, bool, str, tuple, and range.

Why Does It Matter And How Differently Does Python Treat Mutable And Immutable Objects

Python treats mutable and immutable objects differently. Immutable objects are generally quick to access but expensive to change, because every change involves creating a new object, whereas mutable objects are cheap to change in place.

If the size or content of an object needs to change often, a mutable object is usually the better choice.

Exception: immutability has a caveat. We know that a tuple in Python is immutable, but a tuple holds a fixed sequence of references to objects, and those objects can themselves be mutable.

tuple1 = ("klich", [1, 2, 3])

tuple1 itself is not mutable, but its element at index [1] is a list, and the list is mutable, so its content can change. As a rule of thumb, primitive-like types are probably immutable and customized container-like types are mostly mutable.
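The following sketch walks through that caveat. It mutates the inner list, confirms the tuple is still the same object, and then shows the two operations that fail: rebinding a slot of the tuple, and hashing a tuple that contains a list.

```python
tuple1 = ("klich", [1, 2, 3])
tuple_id = id(tuple1)
rebind_error = None
hash_error = None

tuple1[1].append(4)            # allowed: the inner list is mutable
print(tuple1)                  # ('klich', [1, 2, 3, 4])
print(id(tuple1) == tuple_id)  # True: still the same tuple object

try:
    tuple1[1] = [9, 9]         # not allowed: the tuple's slots are fixed
except TypeError as exc:
    rebind_error = str(exc)

try:
    hash(tuple1)               # a tuple holding a list is not hashable
except TypeError as exc:
    hash_error = str(exc)

print(rebind_error is not None, hash_error is not None)  # True True
```

Because of the second failure, such a tuple cannot be used as a dictionary key or set element even though the tuple type itself is immutable.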

How objects are passed on to functions

Now that we know the difference between mutable and immutable types, let's look at how they are treated when they are passed to functions. Memory efficiency is greatly affected by choosing the right kind of object.

We pass a list to a function and see how changes made inside the function affect the original list.


example call by reference
>>> def call_by_reference(l1):
...     l1 += [8]
... 
>>> list1 = [2, 4, 6]
>>> list1
[2, 4, 6]
>>> id(list1)
140494953940936
>>> call_by_reference(list1)
>>> list1
[2, 4, 6, 8]
>>> id(list1)
140494953940936
>>>        

As the example shows, when a mutable object is passed to a function, the function can change the caller's original object. To avoid this, pass a copy or have the function work on a copy. Immutable objects can be passed safely because their value cannot be changed anyway.
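One way to protect the caller's list is sketched below. The function names are hypothetical. The first function mutates its argument, while the second works on a shallow copy made with list.copy() and returns the result instead.

```python
def append_in_place(items):
    """Mutates the caller's list."""
    items += [8]

def append_to_copy(items):
    """Works on a shallow copy and leaves the caller's list alone."""
    local_items = items.copy()
    local_items += [8]
    return local_items

list1 = [2, 4, 6]
result = append_to_copy(list1)
print(list1, result)   # [2, 4, 6] [2, 4, 6, 8]

append_in_place(list1)
print(list1)           # [2, 4, 6, 8]
```

A shallow copy is enough for a flat list of numbers. For nested containers, copy.deepcopy() from the standard library copies the inner objects as well.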

Let's look at this other example.

example pass by value

>>> def pass_by_value(number):
...     print(id(number))
...     number += 8
...     print(number)
... 
>>> x = 2
>>> id(x)
10914528
>>> pass_by_value(x)
10914528
10
>>> x
2
>>>         


We observe that the same object is passed to the function (same id), yet the caller's variable does not change. This behaves like pass by value.

What exactly is going on?

When the function is called, the parameter number is bound to the same object that x refers to, which is why the ids match. Integers are immutable, so number += 8 cannot modify that object. Instead it creates a new integer, 10, and rebinds the local name number to it. The rebinding happens only within the scope of the function, so x still refers to 2 and the change is not reflected outside.
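The rebinding can be observed directly. This sketch uses a hypothetical helper, add_eight, that records the id of its parameter before and after the += and returns both, so the two identities can be compared from outside the function.

```python
def add_eight(number):
    before = id(number)      # identity of the object the caller passed in
    number += 8              # builds a new int and rebinds the local name
    after = id(number)
    return before, after, number

x = 2
before, after, result = add_eight(x)

print(before == id(x))   # True: the function received the same object
print(after == before)   # False: the local name now refers to another int
print(x, result)         # 2 10
```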

Assignment And Referencing

Python uses a mechanism known as “call by object reference” or “call by assignment”. If you pass arguments such as integers, strings, or tuples to a function, the passing behaves like call by value, because you cannot change the immutable objects being passed. Passing mutable objects, on the other hand, behaves like call by reference as long as the function modifies the object in place: those changes are also visible outside the function. Rebinding the parameter name to a new object never affects the caller.
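The difference between modifying an object and rebinding a name also applies to mutable arguments. In this sketch (function names are hypothetical), both functions receive the same list, but only the one that mutates it in place has an effect the caller can see.

```python
def mutate(items):
    items.append(99)        # changes the object the caller also sees

def rebind(items):
    items = items + [99]    # creates a new list; only the local name changes
    return items

numbers = [1, 2, 3]

rebind(numbers)
after_rebind = list(numbers)   # snapshot for comparison
print(after_rebind)            # [1, 2, 3]

mutate(numbers)
print(numbers)                 # [1, 2, 3, 99]
```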

How Is Memory Stored In Python?

To store objects, we need dynamic memory allocation, since the number and size of objects change while the program runs. The Python interpreter allocates and deallocates memory on the heap automatically, which is what C/C++ programmers mostly do by hand. CPython frees an object as soon as its reference count drops to zero, and it also runs a garbage collector that finds and removes groups of objects that refer to each other but are no longer reachable.

To optimize memory allocation, Python performs a process called “interning”. For some objects, Python stores only one copy on the heap and makes every variable that uses that value point to the same memory address. In CPython the interned objects include the integers in the range [-5, 256], the booleans, and some strings. Interning does not apply to other objects such as large integers, most strings, floats, lists, dictionaries, and tuples.
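Automatic cleanup can be observed with a weak reference, which lets us ask whether an object still exists without keeping it alive. The sketch below assumes CPython and a hypothetical Node class. It covers a plain object and a pair of objects that refer to each other; the explicit gc.collect() calls make sure the collector has run before we check.

```python
import gc
import weakref

class Node:
    """Hypothetical class; instances support weak references."""
    def __init__(self):
        self.partner = None

# Case 1: no cycle. Dropping the last reference frees the object.
node = Node()
watcher = weakref.ref(node)
del node
gc.collect()                 # harmless here; makes the check deterministic
freed_without_cycle = watcher() is None

# Case 2: a reference cycle. The cyclic garbage collector reclaims it.
a, b = Node(), Node()
a.partner, b.partner = b, a
cycle_watcher = weakref.ref(a)
del a, b
gc.collect()
freed_cycle = cycle_watcher() is None

print(freed_without_cycle, freed_cycle)  # True True
```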
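String interning can also be requested explicitly with sys.intern() from the standard library. In this sketch the two strings are built at run time, so the compiler has no chance to merge them as constants. After interning, both names lead to one shared object.

```python
import sys

# Build two equal strings at run time so the compiler cannot merge them.
part = "everything is"
s1 = part + " object"
s2 = part + " object"

equal_values = s1 == s2                             # same characters
interned_same = sys.intern(s1) is sys.intern(s2)    # one shared object

print(equal_values, interned_same)  # True True
```

Whether s1 and s2 are the same object before interning is an implementation detail, so the example does not rely on it.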

Memory Schema

In Apache Arrow (the pyarrow library), a schema is a named collection of types. A schema defines the column names and types in a record batch or table data structure, and it can also carry metadata about the columns. For example, schemas converted from Pandas contain metadata about their original Pandas types so they can be converted back to the same types. (Do not call this class’s constructor directly. Instead, use the pyarrow.schema() factory function, which makes a new Arrow Schema object.)

Examples:

Create a new Arrow Schema object:

>>> import pyarrow as pa
>>> pa.schema([
...     ('some_int', pa.int32()),
...     ('some_string', pa.string())
... ])
some_int: int32
some_string: string

Create Arrow Schema with metadata:

>>> pa.schema([
...     pa.field('n_legs', pa.int64()),
...     pa.field('animals', pa.string())],
...     metadata={"n_legs": "Number of legs per animal"})
n_legs: int64
animals: string
-- schema metadata --
n_legs: 'Number of legs per animal'        


Integer Pre-Allocation

The implementors of CPython decided that [-5, 256] is a good range of integers to preallocate for performance reasons, as it covers the most commonly used integer values. There's nothing magical about the range. The few negative numbers are probably included for common error codes and negative indexing of lists, and the upper limit was just set to a nice, round power of two.

Comment from the CPython source code:

/* Small integers are preallocated in this array so that they
   can be shared.
   The integers that are preallocated are those in the range
   -NSMALLNEGINTS (inclusive) to NSMALLPOSINTS (not inclusive).
*/        


ALIAS

In Python programming, a second name given to a piece of data is known as an alias. Aliasing happens when one variable is assigned to another variable, because variables are just names that store references to the actual values.

Consider the following example:


first_variable = "PYTHON"
print("Value of first:", first_variable)
print("Reference of first:", id(first_variable))

print("--------------")

second_variable = first_variable # making an alias
print("Value of second:", second_variable)
print("Reference of second:", id(second_variable))
        

In the example above, first_variable is created first and the string ‘PYTHON’ is assigned to it. The statement second_variable = first_variable creates an alias of first_variable, because it copies the reference held by first_variable into second_variable.

To verify, let’s have look at the output of the above program:

Value of first: PYTHON
Reference of first: 2904215383152
--------------
Value of second: PYTHON
Reference of second: 2904215383152

From the output of the above program, it is clear that first_variable and second_variable have the same reference id in memory.

So, both variables point to the same string object ‘PYTHON’.

So in Python, when one variable is assigned to another, aliasing occurs: the reference is copied rather than the actual value.
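Aliasing is harmless with an immutable string, but it matters with mutable objects. The sketch below, with made-up variable names, shows a change made through the alias appearing under the original name, and contrasts it with an independent copy created by list().

```python
first_list = ["P", "Y"]
second_list = first_list          # alias: no copy is made
second_list.append("T")           # visible through both names

independent = list(first_list)    # a new list with the same items
independent.append("H")           # does not touch first_list

print(first_list)                 # ['P', 'Y', 'T']
print(second_list is first_list)  # True
print(independent)                # ['P', 'Y', 'T', 'H']
```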


NSMALLPOSINTS, NSMALLNEGINTS

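The small-integer cache is an array of 262 integer objects (the most commonly used values: 5 negative numbers, zero, and 1 through 256). It exists so that these integers can be accessed quickly. The objects are allocated when the interpreter starts up, and the array is sized by the macros NSMALLPOSINTS and NSMALLNEGINTS.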

#define NSMALLPOSINTS 257

#define NSMALLNEGINTS 5

It turns out that CPython keeps an array of integer objects for all integers between -5 and 256. When we create an int in that range, we’re actually just getting a reference to the existing object in memory. If we set x = 42, no new object is created; x simply refers to the preallocated 42. An integer outside this range is a regular object: it is created when needed and destroyed once nothing refers to it, so two equal large integers can be entirely different objects. Creating a new integer object and destroying it right away wastes a lot of cycles, which is why Python preallocates a range of commonly used integers.
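The cache can be probed with a small helper. This is a sketch that assumes CPython; is_shared is a hypothetical function that builds the same integer twice from a string, so that compile-time constant sharing does not interfere, and then compares identities.

```python
def is_shared(n):
    """Return True if two ints built separately at run time are one object."""
    a = int(str(n))   # built from a string to sidestep compile-time constants
    b = int(str(n))
    return a is b

# Probe values around the edges of the documented range and far outside it.
for value in (-6, -5, 0, 256, 257, 10**9):
    print(value, is_shared(value))
```

On a CPython build that uses the [-5, 256] range described above, the values -5, 0, and 256 should report True and the others False. The exact boundaries are an implementation detail, so code should never depend on them.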

Frozensets

Tuples are roughly immutable lists, and frozensets are immutable sets. Tuples are ordered collections that can contain duplicates and unhashable objects, and they support indexing and slicing.

Frozensets aren't indexed, but you have the functionality of sets - O(1) element lookups, and functionality such as unions and intersections. They also can't contain duplicates, like their mutable counterparts.

The frozenset() function returns an immutable frozenset object initialized with elements from the given iterable.

A frozen set is just an immutable version of a Python set object. While elements can be added to or removed from a set at any time, the elements of a frozen set remain the same after creation.

Because of this, frozen sets are hashable and can be used as keys in a dictionary or as elements of another set. Like sets, they are unordered, so their elements have no fixed position and cannot be accessed by index.
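A short sketch of these properties, using made-up letter groups: frozensets serve as dictionary keys and set elements, and the usual set operations work on them and return new frozensets.

```python
vowels = frozenset("aeiou")
consonants = frozenset("bcd")

# Hashable, so usable as a dictionary key or as a set element.
labels = {vowels: "vowels", consonants: "some consonants"}
groups = {vowels, consonants}

# Set algebra returns new frozensets and leaves the operands unchanged.
letters = vowels | consonants
common = vowels & frozenset("abc")

print(labels[frozenset("uoiea")])   # vowels
print("a" in letters, len(groups))  # True 2
print(common)                       # frozenset({'a'})
```

The lookup with frozenset("uoiea") succeeds because equality and hashing depend only on the elements, not on the order in which they were supplied.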

Example:

# tuple of vowels
vowels = ('a', 'e', 'i', 'o', 'u')

fSet = frozenset(vowels)
print('The frozen set is:', fSet)
print('The empty frozen set is:', frozenset())

# frozensets are immutable
fSet.add('v')        

Output:

The frozen set is: frozenset({'a', 'o', 'u', 'i', 'e'})
The empty frozen set is: frozenset()
Traceback (most recent call last):
  File "<string>", line 8, in <module>
    fSet.add('v')
AttributeError: 'frozenset' object has no attribute 'add'        

Finally, that's it for the article. We covered a lot of information about objects in Python, and I hope you enjoyed the read and learned something from it. Any questions? Feel free to hit me up through LinkedIn. Happy coding!
