Python object weirdness
Please note, this post is a required task for a Holberton school project. I do not assert myself as a subject matter expert, and don’t expect anyone to gain value from my ramblings.
After spending the past few months learning a statically typed, procedural, lower-level language like C, it has been frustrating at times to find my balance as I learn a new (to me), object-oriented, interpreted, higher-level language like Python. Under the hood, Python does some weird things to streamline object creation and the passing of values between names. It can be quite disorienting at first, especially the heavy reliance on aliasing, and perhaps you feel the same. I'll discuss some of the major points that have stood out to me so far as I've worked through the Python projects in Holberton's curriculum. Most of the weird magic involves the mutability, or lack of it, of Python objects. Understanding it may help explain unexpected behavior, and how to avoid it in the future!
Python is dynamically typed: we don't specify a variable's type when we create it (and on that note, we also don't need to declare variables before assigning to them!). The type belongs to the object, not to the name, and Python determines it from how the value is written, i.e. strings are wrapped in quotes, integers are digits without quotes, floats are numbers with a decimal point, lists are wrapped in square brackets, tuples in parentheses, and dictionaries in curly braces. We can also convert a value explicitly by passing it to a type's constructor, i.e. int("3") converts the string "3" into the integer 3.
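As a quick illustration of the paragraph above, here is a minimal sketch. The names samples, type_names, converted, and x are just ones I made up for the demo.

```python
# Python infers each object's type from how the literal is written.
samples = ["hello", 3, 3.0, [1, 2], (1, 2), {"key": "value"}]
type_names = [type(value).__name__ for value in samples]
print(type_names)  # ['str', 'int', 'float', 'list', 'tuple', 'dict']

# Explicit conversion with a type's constructor.
converted = int("3")
print(converted + 1)  # 4

# A name can be rebound to an object of a different type at any time.
x = 5
x = "five"
print(type(x).__name__)  # str
```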
If we need to check what type a variable is, before acting on it for example, we can use the type() function (isinstance() works too, and also accepts subclasses of the given type). When given an object as a single argument, type() returns the object's type, simple as that. If we combine type() with the is keyword, we can determine whether the returned type matches what we need, i.e. if type(string_var) is str: ... Note, the is keyword and the double equals sign == are not synonymous in Python; is checks the identity of the objects (in CPython, effectively their memory addresses), whereas == checks whether the values are equal. Be careful using those interchangeably!
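Here is a small sketch of the difference between type() and isinstance(), and between is and ==. The variable names are hypothetical, picked only for this demo.

```python
string_var = "3"

# Exact type check versus isinstance(), which also accepts subclasses.
exact_str = type(string_var) is str
bool_is_exactly_int = type(True) is int      # False: bool is its own type
bool_isinstance_int = isinstance(True, int)  # True: bool subclasses int
print(exact_str, bool_is_exactly_int, bool_isinstance_int)  # True False True

# == compares values; is compares identity.
first = [1, 2, 3]
second = [1, 2, 3]   # a separate list with equal contents
alias = first        # another name for the same list

print(first == second, first is second)  # True False
print(first == alias, first is alias)    # True True
```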
I mentioned a few Python data types above. Now may be a good time to mention the mutability of these objects. Python's mutable objects include lists, dictionaries, and sets. Integers, floats, strings, and tuples are all immutable, or unchangeable, in Python. This matters because of the weird referencing magic happening under the hood. Basically, Python creates an object for each value we assign, and the variable is just a name that references, or aliases, that object. This can be seen using a few simple tests, which I'll do after the following explanation of the id function.
Using the id() function, we can retrieve the identity value of an object, which in CPython is its address in memory. This has been useful in wrapping my head around Python's automatic creation and use of objects. In C, if I were to assign a value to a variable, then assign a new value to it, the memory address of the variable would stay the same. Expecting the same behavior in Python? Nope. In Python the name doesn't own a fixed slot in memory; it simply points at whichever object it was last assigned. For example, if I assign the value 5 to a variable a, then reassign 6 to a, id(a) changes, because a now references a different object. Equally weird, if I assign 5 to a, and 5 to b, they both reference the same object in memory, and id() returns the same value for each. That sharing is a CPython optimization (small integers are cached) rather than a language guarantee. The same thing happens with short strings, as can be seen in the following example:
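The original screenshot isn't reproduced here, so this is only a rough sketch of the same kind of test. The actual id() values will differ on every machine and run, and the sharing of small integers and string literals is CPython behavior, not something the language promises. The helper names (same_small_int, id_of_five, a_moved, b_stayed, old_id) are mine.

```python
a = 5
b = 5
same_small_int = a is b         # True on CPython, which caches small integers
print(id(a), id(b), same_small_int)

id_of_five = id(a)
a = 6                           # rebinds a to a different object
a_moved = id(a) != id_of_five   # 5 and 6 are distinct live objects
b_stayed = id(b) == id_of_five  # b still references 5
print(a_moved, b_stayed)        # True True

# Strings: equal literals often share one object too (implementation detail).
a = "HBTN"
b = "HBTN"
print(a is b)
old_id = id(a)
del a, b
c = "HBTN"
print(id(c) == old_id)          # commonly True on CPython, but not guaranteed
```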
Notice that a and b both reference addresses based on the object value they hold. When a is 5, the address is 10105216. When b is 5, the address is 10105216. When a is 6, its memory address changes to the address of the object 6, and the same occurs with b. When both a and b hold the same value, 'a is b' is True, because they are both aliases of the same object in memory. Weird, right? The same thing happens with strings. Even after deleting a and b, both holding "HBTN", and then assigning that same value to c, c printed with the same memory address that a and b were previously referencing, even though no variable held "HBTN" in between. Either the object stayed stored under Python's magic hood or its memory was reused right away; id() alone can't tell those apart. Keep in mind this sharing only applies to values the interpreter chooses to cache, such as small integers and short, identifier-like strings; larger values may well get separate objects.
With the mutable data types, like lists for example, two different objects with the same values will each have their own identity, with unique memory addresses. However, if you assign one list to another name, like b = a, no copy is made: b becomes another alias for the same list, and any change made through a will be reflected in b. If you then rebind b, like b = c, b simply stops referencing a's list, and a is unaffected. But while b still aliases a, if you update index values through b, the changes will show up in a. Note, however, that some ways of adding to a list build an entirely new list, after which changes made to either one won't affect the other. Note the following example:
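A minimal sketch of that behavior, assuming the list names l1 and l2 used later in this post (l3 is an extra name I added for comparison):

```python
l1 = [1, 2, 3]
l2 = l1                    # alias: no copy is made
l3 = [1, 2, 3]             # equal contents, separate object

print(l1 == l3, l1 is l3)  # True False
print(l1 is l2)            # True

l2[0] = 99                 # a change through one alias...
print(l1)                  # [99, 2, 3]  ...is visible through the other

l1 = l1 + [4]              # builds a NEW list and rebinds l1 to it
print(l1 is l2)            # False
print(l1)                  # [99, 2, 3, 4]
print(l2)                  # [99, 2, 3]
```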
Using a slightly different method, l1 += [4], we end up with a completely different result. For lists, += extends the existing object in place, so the memory address stays the same, and the change is visible through both l1 and l2. Gosh, Python is weird.
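Here is a short sketch of the += case, starting again from a fresh pair of aliases (id_before is just a name I picked to hold the original identity):

```python
l1 = [1, 2, 3]
l2 = l1
id_before = id(l1)

l1 += [4]                    # in-place extend, like l1.extend([4])
print(id(l1) == id_before)   # True
print(l1 is l2)              # True
print(l2)                    # [1, 2, 3, 4]
```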
One last thing to note is how Python passes arguments into functions. Python uses a pass-by-object-reference approach: the function's parameter becomes another name for the same object the caller passed, not a copy of it. Reassigning the parameter inside the function only rebinds the local name, so the caller's variable is unaffected unless the new value is returned and assigned. Mutating a mutable argument in place (appending to a list, for example) is a different story: that change is visible to the caller, because both names reference the same object.
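To make the two cases concrete, here is a small sketch. The functions rebind, mutate, and increment are hypothetical helpers I wrote for this demo, not anything from the Holberton projects.

```python
def rebind(numbers):
    # Rebinding the local name does not touch the caller's list.
    numbers = numbers + [4]
    return numbers

def mutate(numbers):
    # In-place mutation changes the object the caller also references.
    numbers.append(4)

def increment(n):
    n += 1  # ints are immutable, so this only rebinds the local n
    return n

original = [1, 2, 3]
returned = rebind(original)
print(original)   # [1, 2, 3]
print(returned)   # [1, 2, 3, 4]

mutate(original)
print(original)   # [1, 2, 3, 4]

count = 5
print(increment(count), count)  # 6 5
```

If a function shouldn't change the caller's list, one option is to copy it first (for example with list(numbers)) and work on the copy.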
https://ianculp.tech
https://github.com/icculp
https://twitter.com/IanCSU