Python3: Mutable, Immutable... everything is object!
What is Python?
Python is a programming language created by Guido van Rossum. It has a very clean syntax and is widely used to teach people how to program well. It is an interpreted language, often also described as a scripting language.
To run a Python program you need an interpreter. A Python interpreter is a program that reads a Python program and then executes the statements found in it. Once the program is written and you are ready to test it, you tell the Python interpreter to run your program so you can see what it does. For this to work, you must have a Python interpreter installed on your computer.
Python is free and can be downloaded from the Internet. In recent years there have been some changes in the Python programming language between Python 2 and Python 3.
When learning to program and even as an experienced professional, it can be advantageous to run the program using a tool called a debugger. A debugger allows you to run the program, stop at any point, and inspect the state of the program to help you better understand what is happening while the program is running.
There are IDEs that can be used; some examples of IDEs for Python development are NetBeans, PyCharm, and Eclipse.
The Python Interpreter
id() function in Python:
id() is a built-in function in Python.
Syntax:
id(object)
As we can see, the function accepts a single parameter and returns the identity of an object. This identity is unique and constant for the object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value. To relate this to C: in CPython the identity is the object's memory address, while Python in general only guarantees that it is a unique integer. This function is generally used internally in Python.
Examples:
The output is the identity of the object passed. The value is arbitrary and varies between runs, but within a single run it stays the same for a given object.
Input : id(1025)
Output : 140365829447504
Input : id("geek")
Output : 139793848214784
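As a minimal sketch of how identities behave, the snippet below compares the ids of two names bound to one list and of a second, equal list. The variable names are made up for illustration, and no concrete id values are shown because they differ from run to run.

```python
# Two names bound to one object share an identity.
a = [1, 2, 3]
b = a            # b is another name for the same list
c = [1, 2, 3]    # equal contents, but a separate object

same_identity = id(a) == id(b)       # one object, two names
distinct_identity = id(a) != id(c)   # both lists are alive, so their ids differ

print(same_identity, a is b)         # `is` compares identities directly
print(distinct_identity, a == c)     # `==` compares values
```

The `is` operator is usually the more readable way to ask the same question that comparing id() values answers.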
Mutable and Immutable Objects:
A mutable object can change its state or contents after it is created; an immutable object cannot.
- Mutable objects:
- list, dict, set, bytearray
- Immutable objects:
- int, float, complex, str, tuple, frozenset [note: immutable version of set], bytes
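One way to check these lists for yourself is to try an in-place change and see whether Python allows it. The following is a small sketch with illustrative variable names, not an exhaustive test of every type.

```python
# Mutable: in-place item assignment succeeds and the identity is unchanged.
numbers = [1, 2, 3]
before = id(numbers)
numbers[0] = 99
list_kept_identity = id(numbers) == before

# Immutable: the same operation on a tuple raises TypeError.
point = (1, 2, 3)
try:
    point[0] = 99
    tuple_rejected_change = False
except TypeError:
    tuple_rejected_change = True

# frozenset does not offer add() at all, unlike set.
frozen_has_add = hasattr(frozenset(), "add")

print(numbers, list_kept_identity, tuple_rejected_change, frozen_has_add)
```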
A practical example to find out the mutability of object types
x = 20
y = x
We create an object of type int. The identifiers x and y point to the same object.
id(x) == id(y)
id(y) == id(20)
If we do a simple operation:
x = x + 1
Now
id(x) != id(y)
id(x) != id(20)
The object that x is bound to has changed: x now refers to a new int object with the value 21. The object 20 was never modified. Immutable objects don't allow modification after creation.
In the case of mutable objects:
m = list([1, 2, 3])
n = m
We create an object of type list. The identifiers m and n are bound to the same list object, which is a collection of 3 immutable int objects.
id(m) == id(n)
Now popping an item from the list object does change the object:
m.pop()
The object id will not change:
id(m) == id(n)
m and n will be pointing to the same list object after the modification. The list object will now contain [1, 2].
So what have we seen so far from the above examples?
- Python handles mutable and immutable objects differently.
- Immutable objects can be quicker to access than mutable objects, although the difference is usually small.
- Mutable objects are great to use when you need to change the size of the object, for example list, dict, etc. Immutable objects are used when you need to ensure that the object you made will always stay the same.
- Immutable objects are fundamentally expensive to “change”, because doing so involves creating a copy. Changing mutable objects is cheap.
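To make the last point concrete, here is a short sketch (with made-up variable names) that applies += to a tuple and to a list while a second name keeps hold of the original object. For the tuple a new object is built; for the list the existing object is extended.

```python
label = (1, 2)
alias = label            # keep a second name on the original tuple
label += (3,)            # builds a new tuple and rebinds `label`

items = [1, 2]
items_alias = items
items += [3]             # for lists, += extends the existing object in place

print(alias, label, label is alias)
print(items_alias, items, items is items_alias)
```

The tuple case copies every element into the new object, which is why repeated "changes" to a large immutable object can add up.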
How are arguments passed to functions and what does that imply for mutable and immutable objects?
The way Python handles function arguments depends on whether the objects passed as arguments are mutable or immutable. If a mutable object is passed to a function, the function can change it in place, and the caller sees that change. If you want to avoid changing the original object, pass a copy instead.
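A minimal sketch of the copying advice follows. The function add_total is a hypothetical helper invented for this example; list() makes a shallow copy, and copy.deepcopy from the standard library is one option when the elements are themselves mutable.

```python
import copy

def add_total(values):
    """Hypothetical helper: appends the sum of `values` to the list it receives."""
    values.append(sum(values))
    return values

original = [1, 2, 3]
result = add_total(list(original))   # pass a shallow copy; `original` is untouched

nested = [[1, 2], [3, 4]]
deep = copy.deepcopy(nested)         # also copies the inner lists
deep[0].append(99)

print(original, result)
print(nested, deep)
```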
When an immutable object is passed to a function, its value cannot be changed. Let’s look at this Python script and guess what it will print:
def increment(n):
    n += 1

b = 9
increment(b)
print(b)
Think about it and then continue reading for the answer. The variable b refers to the object with value 9. When we pass b as a function argument to increment(n) function, the local variable n refers to the same object. However, integers are immutable so we need to create a new object with the value 10 and assign it to the variable n. The variable n is pointing to a different object from what b is pointing. Now, n refers to an object with value 10, but b still refers to an object with value 9. When we print(b), we get the answer 9.
The answer: 9
Let’s look at another Python script and guess what it will print:
def increment(n):
    n.append(4)

my_list = [1, 2, 3]
increment(my_list)
print(my_list)
Think about it, perhaps draw a visualization, and then continue reading for the answer.
The variable my_list refers to a list object that contains references to three integers. Lists are mutable but integers are immutable. When we pass my_list as an argument to the increment(n) function, the local variable n refers to the same object that my_list refers to.
Since lists are mutable, the .append() method is able to modify the list in place. No new object is created and when we print my_list, we get the answer [1, 2, 3, 4].
The answer: [1, 2, 3, 4]
Let’s look at another Python script to understand more about function parameters and why mutability and immutability matter.
def assign_value(n, v):
    n = v

list1 = [1, 2, 3]
list2 = [4, 5, 6]
assign_value(list1, list2)
print(list1)
Think about it and then continue reading for the answer.
We pass both lists as arguments to the assign_value(n, v) function. The local variable n refers to the same object that list1 refers to, and the local variable v refers to the same object that list2 refers to.
The function body rebinds n to the object v refers to. Now n and v refer to the same object.
The variables n, v, and list2 all point to the list object [4, 5, 6], while list1 still points to the list object [1, 2, 3]. This is why when we print list1, we get the answer: [1, 2, 3]
The answer: [1, 2, 3]
What is a Memory-Mapped File in Python:
A memory-mapped file object behaves like both a bytearray and a file object. Unlike bytes or str objects, however, it is mutable.
Basically, a memory-mapped file object (created with Python's mmap module) maps a normal file into memory. This allows you to modify the file's content directly in memory. Since a memory-mapped file object also behaves like a mutable sequence of bytes, you can modify the content of the file much as you would modify the items of a list:
- obj[1] = 97 - Assigns the byte value 97 (the character 'a') to the second byte of the file's content.
- obj[1:4] = b'abc' - Assigns the bytes 'abc' to a range of three bytes starting at the second byte of the file's content.
In a nutshell, memory-mapping a file with Python's mmap module uses the operating system's virtual memory to access the data on the filesystem directly. Instead of making repeated system calls such as read and lseek to manipulate a file, memory-mapping maps the file's data into memory, which allows you to manipulate the file directly in memory. This can greatly improve I/O performance.
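The following sketch shows what modifying a file through a memory map can look like in Python 3. It writes to a scratch file created with tempfile so that no real data is touched; the file contents and variable names are assumptions made for illustration.

```python
import mmap
import os
import tempfile

# Create a small scratch file so the example does not touch real data.
fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "r+b") as f:
        f.write(b"hello world")
        f.flush()
        # Length 0 maps the whole file; the default access allows writes.
        with mmap.mmap(f.fileno(), 0) as mm:
            first_five = mm[:5]      # slicing returns bytes
            mm[0:5] = b"HELLO"       # slice assignment must keep the same length
            mm[5] = ord("_")         # a single index takes an integer byte value
            mm.flush()               # push the changes back to the file
    with open(path, "rb") as f:
        contents = f.read()
finally:
    os.remove(path)

print(first_five, contents)
```

Note that a memory map cannot grow or shrink the file through slice assignment, so the replacement bytes have to match the length of the slice.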
Comparison of Memory-Mapped Files vs Normal Files with Python
Assume we have a binary file test.out that is larger than 10 MB, and an algorithm that requires us to process the file's data by repeating the following steps:
- From the current position, seek 64 bytes forward and process the data at the new current position.
- From the current position, seek 32 bytes backward and process the data at the new current position.
The actual processing of the data is replaced by a pass statement, since it does not affect the relative performance comparison between mmap and normal file access.
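A comparison along these lines could be sketched as follows. This is not the original benchmark: the function names are hypothetical, and a 1 MiB scratch file stands in for test.out so the script runs quickly and cleans up after itself. The timings depend on the machine and operating system, so no results are claimed here.

```python
import mmap
import os
import tempfile
import time

SCRATCH_SIZE = 1024 * 1024  # assumption: 1 MiB scratch file instead of test.out

def walk_with_file(path):
    """Repeat 'seek +64, seek -32' using ordinary file calls."""
    steps = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        while f.tell() + 64 < size:
            f.seek(64, os.SEEK_CUR)
            pass  # process the data here
            f.seek(-32, os.SEEK_CUR)
            pass  # process the data here
            steps += 1
    return steps

def walk_with_mmap(path):
    """Repeat the same access pattern on a read-only memory map."""
    steps = 0
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            size = len(mm)
            while mm.tell() + 64 < size:
                mm.seek(64, os.SEEK_CUR)
                pass  # process the data here
                mm.seek(-32, os.SEEK_CUR)
                pass  # process the data here
                steps += 1
    return steps

def timed(func, path):
    """Return (steps, elapsed seconds) for one run of func."""
    start = time.perf_counter()
    steps = func(path)
    return steps, time.perf_counter() - start

fd, scratch_path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as f:
        f.write(b"\0" * SCRATCH_SIZE)
    file_steps, file_seconds = timed(walk_with_file, scratch_path)
    mmap_steps, mmap_seconds = timed(walk_with_mmap, scratch_path)
finally:
    os.remove(scratch_path)

print(f"file: {file_steps} steps in {file_seconds:.4f} s")
print(f"mmap: {mmap_steps} steps in {mmap_seconds:.4f} s")
```

Both functions perform the same number of steps, so any difference in elapsed time comes from the access mechanism rather than the amount of work.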
Parameters and Arguments
A function or procedure usually needs some information about the environment in which it has been called. The interface between the calling environment and the function body consists of special variables, which are called parameters. By using these parameters, it's possible to use all kinds of objects from "outside" inside of a function. The syntax for how parameters are declared and the semantics for how the arguments are passed to the parameters of the function or procedure depend on the programming language.
Very often the terms parameter and argument are used synonymously, but there is a clear difference. Parameters are inside functions or procedures, while arguments are used in procedure calls, i.e. the values passed to the function at run-time.
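The distinction can be illustrated with a tiny sketch. The function rectangle_area and its names are invented for this example.

```python
def rectangle_area(width, height):   # width and height are parameters
    return width * height

side = 4
area = rectangle_area(side, 2 + 3)   # `side` and `2 + 3` are arguments
print(area)  # 20
```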
"call by value" and "call by name"
The evaluation strategy for arguments, i.e. how the arguments from a function call are passed to the parameters of the function, differs between programming languages. The most common evaluation strategies are "call by value" and "call by reference":
- Call by Value: The most common strategy is the call-by-value evaluation, sometimes also called pass-by-value. This strategy is used in C and C++, for example. In call-by-value, the argument expression is evaluated, and the result of this evaluation is bound to the corresponding variable in the function. So, if the expression is a variable, its value will be assigned (copied) to the corresponding parameter. This ensures that the variable in the caller's scope will stay unchanged when the function returns.
- Call by Reference: In call-by-reference evaluation, which is also known as pass-by-reference, a function gets an implicit reference to the argument, rather than a copy of its value. As a consequence, the function can modify the argument, i.e. the value of the variable in the caller's scope can be changed. By using call-by-reference we save both computation time and memory space, because arguments do not need to be copied. On the other hand, this has the disadvantage that variables can be "accidentally" changed in a function call. So, special care has to be taken to "protect" the values which shouldn't be changed. Many programming languages support call-by-reference, like C++ (C achieves the same effect by passing pointers), but Perl uses it as default.
ALGOL 60 and COBOL had a different concept called call-by-name, which is no longer used in modern languages.
and what about Python?
There are some books which call the strategy of Python call-by-value, and some call it call-by-reference. You may ask yourself which is right.
Humpty Dumpty supplies the explanation:
--- "When I use a word," Humpty Dumpty said, in rather a scornful tone, "it means just what I choose it to mean - neither more nor less."
--- "The question is," said Alice, "whether you can make words mean so many different things."
--- "The question is," said Humpty Dumpty, "which is to be master - that's all."
Lewis Carroll, Through the Looking-Glass
To come back to our initial question of what evaluation strategy is used in Python: the authors who call the mechanism call-by-value and those who call it call-by-reference are stretching the definitions until they fit.
Correctly speaking, Python uses a mechanism, which is known as "Call-by-Object", sometimes also called "Call by Object Reference" or "Call by Sharing".
If you pass immutable arguments like integers, strings or tuples to a function, the passing acts like call-by-value. The object reference is passed to the function parameters. They can't be changed within the function, because they can't be changed at all, i.e. they are immutable. It's different if we pass mutable arguments. They are also passed by object reference, but they can be changed in place within the function. If we pass a list to a function, we have to consider two cases. Elements of a list can be changed in place, i.e. the list will be changed even in the caller's scope. If a new list is assigned to the name, the old list will not be affected, i.e. the list in the caller's scope will remain untouched.
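The two list cases can be sketched side by side. The function names below are hypothetical and exist only to contrast in-place change with rebinding.

```python
def change_in_place(items):
    items[0] = "changed"        # mutates the shared list object

def rebind_name(items):
    items = ["new", "list"]     # rebinds the local name only
    return items

shared = [1, 2, 3]
change_in_place(shared)
after_mutation = list(shared)   # snapshot of what the caller sees

returned = rebind_name(shared)
print(after_mutation, shared, returned)
```

After both calls the caller's list still reflects only the in-place change; the list built inside rebind_name reaches the caller solely through the return value.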
Defining a Function
You can define functions to provide the required functionality. Here are simple rules to define a function in Python.
- Function blocks begin with the keyword def followed by the function name and parentheses ( ( ) ).
- Any input parameters should be placed within these parentheses.
- The first statement of a function can be an optional statement - the documentation string of the function or docstring.
- The code block within every function starts after a colon (:) and is indented.
- The statement return [expression] exits a function, optionally passing back an expression to the caller. A return statement with no arguments is the same as return None.
Syntax
def functionname( parameters ):
    "function_docstring"
    function_suite
    return [expression]
By default, parameters have a positional behavior and you need to pass the arguments in the same order in which the parameters were defined.
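A short sketch of why the order matters is given below. The function describe and its arguments are invented for illustration.

```python
def describe(name, age):
    return f"{name} is {age} years old"

in_order = describe("Ada", 36)
swapped = describe(36, "Ada")   # still runs, but the values land in the wrong parameters

print(in_order)
print(swapped)
```

Python does not check that the values "make sense" for each position, so a swapped call like the second one fails silently rather than raising an error.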
Examples:
The following function takes a string as input parameter and prints it on standard output.
def printme( str ):
    "This prints a passed string into this function"
    print(str)
    return
Calling a Function
Defining a function only gives it a name, specifies the parameters that are to be included in the function and structures the blocks of code.
Once the basic structure of a function is finalized, you can execute it by calling it from another function or directly from the Python prompt. Following is an example that calls the printme() function:
#!/usr/bin/python3

# Function definition is here
def printme( str ):
    "This prints a passed string into this function"
    print(str)
    return

# Now you can call printme function
printme("I'm first call to user defined function!")
printme("Again second call to the same function")
When the above code is executed, it produces the following result:
I'm first call to user defined function!
Again second call to the same function
Pass by reference vs value
All parameters (arguments) in the Python language are passed by object reference. This means that if you change the object a parameter refers to within a function, the change is also visible in the calling function. For example:
#!/usr/bin/python3

# Function definition is here
def changeme( mylist ):
    "This changes a passed list into this function"
    mylist.append([1,2,3,4])
    print("Values inside the function:", mylist)
    return

# Now you can call changeme function
mylist = [10,20,30]
changeme( mylist )
print("Values outside the function:", mylist)
Here, we are maintaining the reference to the passed object and appending values to the same object. So, this would produce the following result:
Values inside the function: [10, 20, 30, [1, 2, 3, 4]]
Values outside the function: [10, 20, 30, [1, 2, 3, 4]]
Here is one more example, where the argument is passed by reference and the reference is overwritten inside the called function.
#!/usr/bin/python3

# Function definition is here
def changeme( mylist ):
    "This changes a passed list into this function"
    mylist = [1,2,3,4]  # This assigns a new reference to the local name mylist
    print("Values inside the function:", mylist)
    return

# Now you can call changeme function
mylist = [10,20,30]
changeme( mylist )
print("Values outside the function:", mylist)
The parameter mylist is local to the function changeme. Rebinding mylist within the function does not affect the caller's mylist. The function accomplishes nothing, and finally this would produce the following result:
Values inside the function: [1, 2, 3, 4]
Values outside the function: [10, 20, 30]
Function Arguments
You can call a function by using the following types of formal arguments:
- Required arguments
- Keyword arguments
- Default arguments
- Variable-length arguments
Required arguments
Required arguments are the arguments passed to a function in correct positional order. Here, the number of arguments in the function call should match exactly with the function definition.
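As a small sketch of this rule, the snippet below calls a one-parameter function once correctly and once without its argument. The parameter is named text here rather than str, an illustrative choice that avoids shadowing the built-in type.

```python
def printme(text):
    """Print the string passed in."""
    print(text)

printme("hello")          # the one required argument is supplied

try:
    printme()             # the required argument is missing
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The call without an argument never reaches the function body; Python raises TypeError while binding the arguments to the parameters.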