Everything is an Object!
Mateo Garcia
Data Engineer | Python | Databricks | Spark | Azure Synapse Analytics | AWS
In this article, a simple explanation of Python's Objects will be given!
It's often said that in Python everything is an object. Well, what does that mean? This blog explains why that is indeed the case. In Python, a variable does not need to be declared before a value is assigned to it. There is no need to tell the program that it is an integer, a pointer, etc. This is because, in Python, everything is an object: a variable is just a name that refers to one, and the object itself knows its type.
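As a small illustration of the point above, the sketch below binds one name to objects of three different types without any declaration. The names value, first_type, second_type, third_type, and everything_is_object are made up for this example, and it uses the built-in type function that the next section describes.

```python
# The same name can refer to objects of different types over time.
value = 7
first_type = type(value).__name__    # the object is an int

value = "seven"                      # same name, now a str object
second_type = type(value).__name__

value = [7, "seven"]                 # same name, now a list object
third_type = type(value).__name__

print(first_type, second_type, third_type)

# Numbers, strings, lists, functions, types, and None are all objects.
candidates = (7, "seven", [7], len, int, None)
everything_is_object = all(isinstance(obj, object) for obj in candidates)
print(everything_is_object)
```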
Id and Type
"id" is a built-in Python function that returns the identity of an object, which in CPython is its memory address. This means that it shows where the object lives in memory, and it will be useful later on to know whether Python creates a new object or not!
"type" is another built-in function. It tells what type of object is being passed into it. Some of the built-in types in Python are Boolean, integer, float, string, tuple, list, and dictionary. These are all objects, but they are of different types, and the type determines which operations can be done with them.
Immutable Objects
Mutable can be a strange word for a reader new to the subject. It refers to whether an object's value or values can be changed while still referring to the same object. An immutable object is one whose value cannot be changed once it has been created.
Integer Example
>>> a = 7
>>> id(a)
140690973473264
Now rebind "a" to the integer 10:
>>> a = 10
>>> id(a)
140690973473360
So "a" refers to a different object once its value is changed.
This is because the integer itself is immutable: the number 7 is the number 7 and cannot change or mutate into another object. What can change is "a", which stops referencing the object 7 and starts referencing another number.
Integer Aliasing!
Now, we are going to make an alias of "a":
>>> a = 7
>>> b = a
>>> id(a)
140690973473264
>>> id(b)
140690973473264
As expected, they both point to the same memory address. But what happens when "a" is changed? Remember, these objects are immutable, so the numbers per se do not change. Only the reference changes.
>>> a = 9
>>> b
7
>>> id(a)
140690973473328
>>> id(b)
140690973473264
Variable "a" and variable "b" now point to different places. This will be relevant later on when analyzing how Python manages the values passed into functions.
String Example
Another example of immutable objects is strings. Strings in Python behave much like integers in this respect. A string is a sequence of characters, and (for readers who know a little C) each character corresponds to an integer code under the hood.
a = "hello"
This type of object is iterable, has a length, and its individual characters can be accessed. However, a character cannot be replaced by another, deleted, or added while still referencing the same object.
>>> a = "hello"
>>> a[1]
'e'
>>> a[1] = 'e'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment
What would an operation such as addition or concatenation look like for immutable objects?
>>> a = 7
>>> b = 10
>>> id(a)
140690973473264
>>> id(b)
140690973473360
>>> a = a + b
>>> a
17
>>> id(a)
140690973473584
Before the assignment a = a + b, "a" refers to the integer 7 and "b" refers to the integer 10.
One would be tempted to think that "a" keeps pointing to the same id, 140690973473264, where the integer 7 is, and that the value stored there changes to the sum of 7 + 10. However, the id of "a" after the assignment has changed: "a" now refers to a new object, the integer 17. The same thing happens with strings:
>>> a = "hello "
>>> b = "world"
>>> id(a)
140690972080560
>>> a += b
>>> a
'hello world'
>>> id(a)
140690962481072
As seen, "a" refers to another memory address after the assignment.
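Since a string cannot be modified in place, "changing" one of its characters really means building a new string. The following is a minimal sketch of that idea. The name original is only there for illustration, to keep a second reference to the first string.

```python
a = "hello"
original = a                  # second name for the same string object

# Build a new string from pieces of the old one instead of assigning to a[1].
a = a[:1] + "a" + a[2:]

print(a)                      # hallo
print(original)               # hello: the first string never changed
print(a is original)          # False: "a" now refers to a new object
```

The original string is untouched. Only the name "a" was pointed at the newly built object.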
Mutable Objects
Mutable objects are different from immutable ones since, as one can start to infer, they can mutate, or change. Take for example a list of cities. We will create an object called "cities", which will hold a list of strings with the names of some cities, and print out its id and type.
>>> cities = ["New York", "Miami", "San Francisco"]
>>> print(id(cities))
140427874094536
>>> type(cities)
<class 'list'>
We can access its elements by index, just as with a string. However, this time we can in fact change a city.
>>> cities = ["New York", "Miami", "San Francisco"]
>>> cities
['New York', 'Miami', 'San Francisco']
>>> id(cities)
140690953332800
>>> cities[0] = "Boston"
>>> cities
['Boston', 'Miami', 'San Francisco']
>>> id(cities)
140690953332800
Even though its element at index 0 changed, the variable "cities" still refers to the same memory address!
This is important because it can lead to unexpected behavior if users are not aware of it, for example when making an alias to the same object.
>>> alias_cities = cities
>>> id(alias_cities)
140690953332800
Wait! They have the same memory address! This assignment does not create a copy of the list "cities". Instead, it creates an alias: another variable that references the same object at the same memory address.
So what happens if we do something like appending a new city to cities and then printing the alias:
>>> cities
['Boston', 'Miami', 'San Francisco']
>>> alias_cities
['Boston', 'Miami', 'San Francisco']
>>> id(cities)
140690953332800
>>> id(alias_cities)
140690953332800
>>> cities.append("Orlando")
>>> cities
['Boston', 'Miami', 'San Francisco', 'Orlando']
>>> alias_cities
['Boston', 'Miami', 'San Francisco', 'Orlando']
Even though the append was done to variable "cities", the variable "alias_cities" also reflects the change. This is because they both point to the same object, so when mutating from any of the aliases, the change is done to the object itself, reflecting the change to any of the aliases the object may have.
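The contrast with immutable objects can be seen with the very same operator. The sketch below, with illustrative names text and text_alias, applies += to a string and to a list. For the string a new object is created, as in the earlier concatenation example, while for the list the existing object is extended in place.

```python
# Immutable: += on a str creates a new object and rebinds the name.
text = "hello "
text_alias = text
text += "world"
print(text_alias)                 # still "hello "
print(text is text_alias)         # False

# Mutable: += on a list extends the same object in place.
cities = ["Boston", "Miami"]
alias_cities = cities
cities += ["Orlando"]
print(alias_cities)               # ['Boston', 'Miami', 'Orlando']
print(cities is alias_cities)     # True
```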
This is important because if a real copy is needed, it has to be made through another method, one of which is slicing.
>>> copy_cities = cities[:]
>>> copy_cities
['Boston', 'Miami', 'San Francisco', 'Orlando']
>>> id(cities)
140690953332800
>>> id(copy_cities)
140690961761920
Now let's use some other operations, the equality comparison "==" and the identity comparison "is":
>>> cities == copy_cities
True
>>> cities is copy_cities
False
They are in fact both equal in content, but as seen through memory address, they are NOT pointing to the same object.
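Slicing is not the only way to copy a list. The sketch below shows a few alternatives from the standard library and also hints at a caveat: these copies are shallow, so nested mutable objects are still shared. All variable names here (by_slice, nested, shallow, deep, and so on) and the city "Tampa" are made up for illustration.

```python
import copy

cities = ["Boston", "Miami", "San Francisco", "Orlando"]

by_slice = cities[:]            # slicing, as in the example above
by_method = cities.copy()       # list.copy()
by_constructor = list(cities)   # list() constructor
by_module = copy.copy(cities)   # copy.copy() from the standard library

copies = [by_slice, by_method, by_constructor, by_module]
all_equal = all(c == cities for c in copies)            # same content
none_identical = all(c is not cities for c in copies)   # different objects
print(all_equal, none_identical)

# A shallow copy shares nested objects; copy.deepcopy() duplicates them too.
nested = [["Boston", "Miami"], ["Orlando"]]
shallow = nested[:]
deep = copy.deepcopy(nested)
nested[0].append("Tampa")
print(shallow[0])   # ['Boston', 'Miami', 'Tampa']
print(deep[0])      # ['Boston', 'Miami']
```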
Why is this important?
This is important because users need to know when they are creating an alias to the same object, in which case a change will affect every other alias, and when they are making a real copy. This is especially critical when passing objects into a function. What is passed: the reference or the value?
A good start is to understand that when an argument is passed into a function, what actually happens is an assignment, just like the ones we have been doing.
Arguments passed to functions
Integer example:
It is important to remember how aliasing works for immutable objects. Since immutable objects always remain the same, operations on them generate new objects, and the corresponding variable is then made to reference the new object.
Using Python Tutor, we can visualize how a simple power function works. The function power multiplies its argument by itself, assigns the result back to the parameter, and then returns that value. First, 5 is assigned to a. Then a is assigned to b, and the function power is called with a as its argument.
Notice how at the end, a and b remain the same. Why?
Because what was passed into the variable x, inside the function power, behaves as if it had been passed by value, since integers are immutable. Strictly speaking, Python passes a reference to the object: a, b, and x all point to 5, and 5 cannot change. What can change is which object a name points to. So when power(a) is executed, x is reassigned to the result, but the returned result is not assigned to anything in the main program, so the reference to the object holding 25 is lost. Also, notice that x changed its reference from 5 to 25, but only x did, not a or b.
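The Python Tutor illustration is not reproduced here, so the following is a minimal sketch of what the power function described above might look like. The function body and the extra name result are assumptions based on the description, not necessarily the exact code behind the illustration.

```python
def power(x):
    # x starts out referring to the same object as the caller's argument.
    x = x * x      # builds a new int and rebinds only the local name x
    return x

a = 5
b = a
power(a)           # the returned 25 is not assigned to anything
print(a, b)        # 5 5

result = power(a)  # keeping the return value is the way to use it
print(result)      # 25
```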
List Example
Now an example with a mutable object.
With the list cities and its alias alias_cities again, the list will be passed into a function that inserts another city into it.
Just before the new city is appended, cities, alias_cities, and the function's parameter current_list all refer to one and the same list object.
Once the append is done, that single list object contains the new city.
Note that cities, alias_cities, and current_list all point to the same object, which has now been changed. Even though the append was done through the variable called current_list, the other two aliases still point to the same object that has just mutated. When the function terminates, the variable current_list is gone, but the effects of the function remain visible through the aliases cities and alias_cities.
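Here is a minimal sketch of the scenario just described. The signature of add_list, its second parameter new_city, and the city "Chicago" are assumptions made for illustration. The article only names the function and the list it receives.

```python
def add_list(current_list, new_city):
    # append mutates the very list object the caller passed in
    current_list.append(new_city)

cities = ["Boston", "Miami", "San Francisco", "Orlando"]
alias_cities = cities

add_list(cities, "Chicago")

print(cities)                  # the new city is visible through cities
print(alias_cities)            # and through the alias as well
print(alias_cities is cities)  # True: there was only ever one list
```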
This is different from the previous example, where calling power on the variable a affected neither its value nor its aliases once the function terminated.
This is because integers behave as if they were passed by value. The reference to the immutable value 5 was passed to the function power and received by the variable x. When x was changed, it was simply rebound to a new object, 25, leaving a and b untouched.
On the other hand, in the function add_list, the reference held by the variable cities to the list object was passed to the variable current_list. As described above, they all point to the same list object, because the value passed was a reference to the memory address of the object. The object itself was mutated, therefore impacting all the variables that reference this same object, this same memory address.
Conclusions:
Variables in Python are names that refer to objects, and every object has a type. Depending on its type, an object has different properties, such as being mutable or immutable. When an immutable object is aliased or passed into a function, the result behaves like a pass by value. Since the object is immutable, any change implies a new object, and if an assignment is made, the variable changes where it points, therefore pointing to the new object.
On the contrary, when a mutable object is aliased or passed as a function argument, the result behaves like a pass by reference. This means that the argument variable points to the same memory address, and if a mutating operation is called through this alias, it will not create a new object. It will change, or mutate, that same object, impacting all the aliases that point to it.
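To tie both conclusions together, the sketch below passes the same list to two hypothetical functions, mutate and rebind (names invented for this example). In both calls the function receives a reference to the caller's object. What differs is whether the function mutates that object or merely points its local name at a new one.

```python
def mutate(items):
    # Changes the shared object, so the caller sees the effect.
    items.append("Orlando")

def rebind(items):
    # Builds a new list and rebinds only the local name items.
    items = items + ["Orlando"]
    return items

cities = ["Boston", "Miami"]

rebind(cities)
after_rebind = list(cities)   # snapshot: caller's list is unchanged
print(after_rebind)           # ['Boston', 'Miami']

mutate(cities)
after_mutate = list(cities)   # snapshot: caller's list was mutated
print(after_mutate)           # ['Boston', 'Miami', 'Orlando']
```

When a function should not affect the caller's list, passing a copy such as cities[:] is one simple option.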