Common operations with NumPy

Here we walk through common operations on arrays with the NumPy library. Along the way we compare NumPy with list comprehensions and lambda functions, and measure the difference in performance between these approaches.

Study notebook

Go to the Jupyter Notebook to see the concepts covered here about operations with arrays in NumPy. Note: important functions, outputs, and terms are in bold to facilitate understanding (at least mine).

View versions

Let’s start by viewing the Python version and the NumPy package version. This applies to almost any package: always check which version you are working with.

import sys
import numpy as np
print(sys.version.split()[0])  # first token of the full version string
np.__version__
3.8.1    # Python version
'1.19.5' # NumPy version

Create an array in NumPy

Now we create an array with NumPy’s arange function rather than the built-in range. The difference is subtle: arange returns a NumPy array (ndarray), while range returns a lazy range object.

array1 = np.arange(15); array1

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])
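The difference between arange and range can be sketched as follows. This is a minimal illustration; the variable names r, a, doubled, and range_supports_multiply are chosen here and are not part of the original notebook.

```python
import numpy as np

r = range(15)        # built-in: a lazy sequence of Python ints
a = np.arange(15)    # NumPy: an ndarray holding all 15 values

print(type(r).__name__)  # range
print(type(a).__name__)  # ndarray

# Arithmetic on the ndarray is applied element by element.
doubled = a * 2
print(doubled[:5])       # [0 2 4 6 8]

# The same expression is not defined for a range object.
try:
    r * 2
    range_supports_multiply = True
except TypeError:
    range_supports_multiply = False
print(range_supports_multiply)  # False
```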

Help?

When we have a question about any object, we can call help by typing ? followed by the name of the object. This ? syntax is an IPython/Jupyter feature, not part of plain Python.

?array1

The complete help for that object opens: the full description of what the object is, how to use it, the parameters we can pass, its attributes and methods, other packages capable of generating that same type of object, and so on.

Mathematical Methods directly in arrays

Once we have the collection created with NumPy, we can call its mathematical methods directly. Typing the array name followed by a dot and pressing TAB lists every available method.

Average

array1.mean()
7.0

Sum

array1.sum()
105

Minimum value

array1.min()
0

Maximum value

array1.max()
14

Standard deviation

array1.std()
4.320493798938574
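As a hedged aside, the methods above also exist as NumPy functions, and std() uses ddof=0 (the population standard deviation) unless told otherwise. The sketch below checks this by hand; manual_std and sample_std are illustrative names, not part of the original notebook.

```python
import numpy as np

array1 = np.arange(15)

# Method form and function form return the same values.
same_mean = array1.mean() == np.mean(array1)
same_sum = array1.sum() == np.sum(array1)

# std() defaults to ddof=0: the square root of the mean squared deviation.
population_std = array1.std()
manual_std = np.sqrt(((array1 - array1.mean()) ** 2).mean())

# ddof=1 divides by n - 1 instead of n (sample standard deviation).
sample_std = array1.std(ddof=1)

print(same_mean, same_sum)   # True True
print(population_std)        # about 4.32
print(sample_std)            # about 4.47
```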

List Comprehension in NumPy Arrays

Let’s use the NumPy array1 in a list comprehension, just as we would use any other iterable.

Translating: for each value of x in array1, multiply x by itself:

[x * x for x in array1]

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196]

Translating: for each value of x in array1, if the remainder of x divided by 2 is zero, return x multiplied by itself. That is, the operation applies only to even numbers.

[x * x for x in array1 if x % 2 == 0]

[0, 4, 16, 36, 64, 100, 144, 196]

Therefore, when working with NumPy, remember that you can also work with List Comprehension.
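For comparison, here is a minimal sketch of the same squaring done with NumPy arithmetic instead of a comprehension. The names squares_list and squares_array are illustrative only.

```python
import numpy as np

array1 = np.arange(15)

squares_list = [x * x for x in array1]   # Python list, built one element at a time
squares_array = array1 * array1          # ndarray, computed element-wise by NumPy

print(type(squares_list).__name__)   # list
print(type(squares_array).__name__)  # ndarray
print(np.array_equal(squares_list, squares_array))  # True
```

The values are identical; what changes is the container you get back and where the loop runs.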

Lambda Function in NumPy Arrays

Here we use a lambda that returns True only when the remainder of x divided by 2 equals 0. We pass it to filter, which applies it to each element of array1, and wrap the result in list to print the values that pass:

list(filter(lambda x: x % 2 == 0, array1))

[0, 2, 4, 6, 8, 10, 12, 14]

The condition we used above can be written more simply:

array1 % 2 == 0

array([ True, False, True, False, True, False, True, False, True, False, True, False, True, False, True])

Here we take array1 and ask whether the remainder of its division by 2 equals 0. The result is a Boolean array: with this simple notation we get what the map function would give us, a True or False for each value of array1 depending on whether it meets the condition.
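The equivalence with map can be checked directly. This is a small sketch; flags_map and flags_vectorized are names introduced here for illustration.

```python
import numpy as np

array1 = np.arange(15)

flags_map = list(map(lambda x: x % 2 == 0, array1))  # list of NumPy booleans
flags_vectorized = array1 % 2 == 0                   # Boolean ndarray

print(flags_vectorized.dtype)  # bool
print(flags_vectorized.tolist() == [bool(f) for f in flags_map])  # True
```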

Slicing notation: Powerful!

What if we put the above condition as a slicing notation within array1?

array1[array1 % 2 == 0]

array([0, 2, 4, 6, 8, 10, 12, 14])

NumPy opens up a sea of possibilities. Everything done so far can be replaced by this simple notation, which the NumPy documentation calls Boolean indexing (or masking). It works on NumPy arrays, not on plain Python lists, and it is very important for Data Science work in Python.
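A short sketch of what the Boolean array can do once it is stored in a variable. The names is_even, evens, odds, and n_even are illustrative and do not appear in the original notebook.

```python
import numpy as np

array1 = np.arange(15)

is_even = array1 % 2 == 0      # Boolean mask, one entry per element
evens = array1[is_even]        # elements where the mask is True
odds = array1[~is_even]        # ~ inverts the mask
n_even = int(is_even.sum())    # each True counts as 1

print(evens)   # [ 0  2  4  6  8 10 12 14]
print(odds)    # [ 1  3  5  7  9 11 13]
print(n_even)  # 8

# Indexing with a mask returns a copy, so editing it leaves array1 unchanged.
evens[0] = -1
print(array1[0])  # 0
```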

Evaluating Performance

We will use the %timeit magic command, which measures the execution time of a statement. It is provided by IPython, so it is available in Jupyter Notebook but not in a plain Python script.

Translating: for each value of x in array1, keep x if the remainder of x divided by 2 is 0, that is, if x is even. And, of course, measure the time of the operation.

%timeit [x for x in array1 if x % 2 == 0]

6.03 μs ± 135 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

The list comprehension in pure Python took 6.03 microseconds. With the slicing notation, NumPy is almost 3x faster:

%timeit array1[array1 % 2 == 0]

2.16 μs ± 65.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
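Outside Jupyter, a comparable measurement can be sketched with the standard library's timeit module. The function names and the repetition count n below are assumptions made for this example, and the timings will differ from machine to machine, so no expected numbers are shown.

```python
import timeit

import numpy as np

array1 = np.arange(15)

def evens_comprehension():
    # Pure-Python loop over the NumPy array
    return [x for x in array1 if x % 2 == 0]

def evens_mask():
    # Boolean indexing, evaluated inside NumPy
    return array1[array1 % 2 == 0]

n = 10_000  # number of repetitions per measurement (arbitrary choice)
t_list = timeit.timeit(evens_comprehension, number=n) / n
t_mask = timeit.timeit(evens_mask, number=n) / n

print(f"list comprehension: {t_list * 1e6:.2f} microseconds per call")
print(f"Boolean indexing:   {t_mask * 1e6:.2f} microseconds per call")
```

With only 15 elements the fixed overhead of each call dominates, so it is worth repeating the measurement with a larger array before drawing conclusions.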

Logical Operators

We can also use logical operators for operations with NumPy arrays.

Ex.1

Is array1 greater than 8? The check is made for each element of the array:

array1 > 8

array([False, False, False, False, False, False, False, False, False, True, True, True, True, True, True])

Ex.2

This notation can be placed inside the index, in which case it returns the values that meet the condition rather than True or False:

array1[array1 > 8]

array([9, 10, 11, 12, 13, 14])

Ex.3

We can use the logical operator and, written & for NumPy arrays. We combine two conditions that must both be true, each wrapped in parentheses:

(array1 > 9) & (array1 < 12)

array([False, False, False, False, False, False, False, False, False, False, True, True, False, False, False])

Ex.4

We can use the logical operator or, written | for NumPy arrays. We combine two conditions of which at least one must be true:

(array1 > 13) | (array1 < 12)

array([ True, True, True, True, True, True, True, True, True, True, True, True, False, False, True])

Ex.5

We can also place the logical operator or within the Slicing Notation:

array1[(array1 > 13) | (array1 < 12)]

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 14])
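The examples above use & and | rather than the Python keywords and and or, and they wrap each comparison in parentheses because & and | bind more tightly than > and <. The sketch below shows what happens with the keyword form and the equivalent function form; the variable names are illustrative.

```python
import numpy as np

array1 = np.arange(15)

# The keyword `and` asks each array for a single truth value, which is ambiguous.
try:
    (array1 > 9) and (array1 < 12)
    keyword_and_works = True
except ValueError:
    keyword_and_works = False

both = (array1 > 9) & (array1 < 12)
both_fn = np.logical_and(array1 > 9, array1 < 12)   # function form of &
either_fn = np.logical_or(array1 > 13, array1 < 12)  # function form of |

print(keyword_and_works)              # False
print(array1[both])                   # [10 11]
print(np.array_equal(both, both_fn))  # True
print(array1[either_fn])              # 0 to 11, then 14
```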

Ex.6

We can create a NumPy Array with List Comprehension:

array2 = np.array([x ** 3 for x in range(15)]); array2

array([0, 1, 8, 27, 64, 125, 216, 343, 512, 729, 1000,
     1331, 1728, 2197, 2744])

Ex.7

A NumPy array built with a list comprehension that returns True for each value of x in the range:

array2 = np.array([True for x in range(15)]); array2

array([ True, True, True, True, True, True, True, True, True, True,  True, True, True, True, True])

Ex.8

Use this same array2 above as the index for array1. Because every entry is True, every element of array1 is returned:

array1[array2]

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])
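One detail worth checking when indexing with another array is its dtype: a Boolean array acts as a mask, while an integer array is read as a list of positions. A minimal sketch, with bool_index and int_index as illustrative names:

```python
import numpy as np

array1 = np.arange(15)

bool_index = np.array([True for x in range(15)])  # dtype bool: a mask
int_index = np.array([1 for x in range(15)])      # dtype int: positions

print(array1[bool_index])  # all 15 elements, 0 to 14
print(array1[int_index])   # the element at position 1, repeated 15 times
```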

Let’s continue from here with another Jupyter Notebook, which covers concatenating, joining, and splitting arrays in NumPy.

This relates to the split-apply-combine technique: we can split an array, apply a method, an analysis, or some calculation to its parts, and then combine the results. This is very useful in many data manipulation and organization tasks.
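A minimal sketch of split-apply-combine on a small array, assuming sum as the calculation to apply. The array and the names top, bottom, part_sums, and combined are made up for this illustration.

```python
import numpy as np

data = np.arange(16).reshape((4, 4))

top, bottom = np.vsplit(data, 2)        # split: rows 0-1 and rows 2-3
part_sums = [top.sum(), bottom.sum()]   # apply: one sum per part
combined = np.array(part_sums)          # combine: results in a single array

print(combined)  # [28 92]
```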

Concatenating Arrays

First, we create array1 with the ones function, which creates an array filled with the value 1:

array1 = np.ones(4)

We now create array2 with another function, arange, holding 15 elements from 0 to 14:

array2 = np.arange(15)

Next, we call NumPy’s concatenate function to join array1 and array2 end to end:

array_conc = np.concatenate((array1, array2)); array_conc

array([1., 1., 1., 1., 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11., 12., 13., 14.])

In the output above, we can quickly identify the first part, the four 1s generated by np.ones for array1, followed by the values 0 to 14 from array2.
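The trailing dots in that output appear because np.ones produces floats by default, so the concatenated result is promoted to float. A hedged sketch of how the dtype can be controlled (float_ones, int_ones, mixed, and ints_only are illustrative names):

```python
import numpy as np

float_ones = np.ones(4)             # default dtype is float64
int_ones = np.ones(4, dtype=int)    # ask for integers explicitly
array2 = np.arange(15)              # integer dtype

mixed = np.concatenate((float_ones, array2))
ints_only = np.concatenate((int_ones, array2))

print(mixed.dtype)      # float64
print(ints_only.dtype)  # the platform integer type, for example int64
print(ints_only[:6])    # [1 1 1 1 0 1]
```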

Joining Arrays

Let’s create two arrays, a and b:

a = np.ones((3,3))

b = np.zeros((3,3))
print(a)
[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
print(b)
[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]

vstack & hstack

Let’s apply the vstack and hstack functions. If we don’t know anything about a function, we call help!

# stack arrays in vertical sequence
?np.vstack

# stack of arrays in horizontal sequence
?np.hstack

vstack function

Stacks the arrays in the direction of the rows (vertically):

np.vstack((a,b))

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

hstack function

Stacks the arrays in the direction of the columns (horizontally):

np.hstack((a,b))

array([[1., 1., 1., 0., 0., 0.],
       [1., 1., 1., 0., 0., 0.],
       [1., 1., 1., 0., 0., 0.]])
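For two-dimensional inputs like these, vstack and hstack give the same result as concatenate along axis 0 and axis 1 respectively. A small sketch to confirm it (v and h are illustrative names):

```python
import numpy as np

a = np.ones((3, 3))
b = np.zeros((3, 3))

v = np.vstack((a, b))   # rows of b placed under rows of a
h = np.hstack((a, b))   # columns of b placed to the right of a

print(v.shape)  # (6, 3)
print(h.shape)  # (3, 6)
print(np.array_equal(v, np.concatenate((a, b), axis=0)))  # True
print(np.array_equal(h, np.concatenate((a, b), axis=1)))  # True
```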

Creating more arrays

We’ll create more arrays to run other stack examples:

a = np.array([0, 1, 2])
b = np.array([3, 4, 5])
c = np.array([6, 7, 8])

column_stack function

This function stacks one-dimensional arrays as columns, forming a two-dimensional array:

np.column_stack((a, b, c))

array([[0, 3, 6],
       [1, 4, 7],
       [2, 5, 8]])
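To see how column_stack differs from the other stacking functions on one-dimensional inputs, here is a brief sketch. The names cols, rows, and flat are introduced only for this example.

```python
import numpy as np

a = np.array([0, 1, 2])
b = np.array([3, 4, 5])
c = np.array([6, 7, 8])

cols = np.column_stack((a, b, c))   # each input becomes a column
rows = np.vstack((a, b, c))         # each input becomes a row
flat = np.hstack((a, b, c))         # inputs joined into one 1-D array

print(np.array_equal(cols, rows.T))  # True
print(flat)                          # [0 1 2 3 4 5 6 7 8]
```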

split Arrays

Below we use the arange function to create an array of 16 elements and apply the reshape to make the one-dimensional array two-dimensional:

array3 = np.arange(16).reshape((4,4)); array3

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

hsplit function

This function splits array3 horizontally, that is, column-wise. Passing 2 as the second argument divides array3 into two equal parts, p1 and p2:

[array3_p1, array3_p2] = np.hsplit(array3, 2)

array3_p1
array([[ 0,  1],
       [ 4,  5],
       [ 8,  9],
       [12, 13]])
array3_p2
array([[ 2,  3],
       [ 6,  7],
       [10, 11],
       [14, 15]])

vsplit function

This function splits array3 vertically, that is, row-wise. Passing 2 as the second argument divides array3 into two equal parts, p1 and p2:

[array3_p1, array3_p2] = np.vsplit(array3, 2)

array3_p1
array([[0, 1, 2, 3],
       [4, 5, 6, 7]])
array3_p2
array([[8, 9, 10, 11],
       [12, 13, 14, 15]])
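A few related points can be sketched here: splitting and stacking are inverse operations, np.split with an axis argument covers both directions, and an uneven split raises an error unless np.array_split is used. The variable names below are illustrative.

```python
import numpy as np

array3 = np.arange(16).reshape((4, 4))

# Splitting and then stacking the parts restores the original array.
left, right = np.hsplit(array3, 2)
print(np.array_equal(np.hstack((left, right)), array3))  # True

# np.split with axis=0 matches vsplit.
top, bottom = np.split(array3, 2, axis=0)
print(np.array_equal(top, np.vsplit(array3, 2)[0]))  # True

# Four columns cannot be divided into three equal parts.
try:
    np.hsplit(array3, 3)
    uneven_ok = True
except ValueError:
    uneven_ok = False
print(uneven_ok)  # False

# array_split accepts uneven divisions.
parts = np.array_split(array3, 3, axis=1)
print([p.shape for p in parts])  # [(4, 2), (4, 1), (4, 1)]
```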

Save and Load data with NumPy

We create the data object from array3:

data = array3

save

Use the save function to keep the array on disk at the end of the process, so you don’t have to redo the work next time. NumPy appends the .npy extension to the file name automatically:

np.save('data_saved_v1', data)

load

Use the load function to read the array back from the .npy file:

loaded_data = np.load('data_saved_v1.npy'); loaded_data

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])
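The round trip can be checked end to end. The sketch below writes to a temporary directory so it leaves no files behind; the use of tempfile and the names tmp, path, files, and loaded are choices made for this example.

```python
import os
import tempfile

import numpy as np

data = np.arange(16).reshape((4, 4))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data_saved_v1")
    np.save(path, data)                 # writes data_saved_v1.npy
    files = os.listdir(tmp)
    loaded = np.load(path + ".npy")     # read the array back into memory

print(files)                         # ['data_saved_v1.npy']
print(np.array_equal(loaded, data))  # True
print(loaded.dtype == data.dtype)    # True
```

The .npy format stores the shape and dtype along with the values, which is why the loaded array matches the original without any extra arguments.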

This material is a handy reference whenever you need to manipulate arrays with the NumPy library.

And there we have it. I hope you have found this helpful. Thank you for reading.

