NumPy for Data Science
Jagdish Chavan
Software Development & Data Science Educator: Android, Python, Java | Bridging Academia & Industry
Welcome to this article on NumPy for Data Science, where you will learn both the concepts behind the NumPy library and how to use it in Python.
My name is Jagdish Chavan. I am a working Data Scientist and Industry Trainer with a B.Sc. in Computer Science with Mathematics and Statistics. More than 1,000 students have enrolled in my courses, and I have experience building projects in Data Science, ML and DL applications, and in the consulting industry.
In my work, I have realized that many data analysts and beginners in the field of data science are overwhelmed by the power of NumPy. This article and the ones that follow are aimed at students who want to expand their skill set in the field of data science.
A practising data scientist needs to get from concept to code without becoming too mathematical about it. If you regularly put in one hour a day, within a week you will be able to understand how NumPy is used in practice.
The prerequisites for this article are core and advanced Python concepts. It will also help if you are familiar with arrays.
This article is ideal for an existing business analyst or data analyst who wants to expand their current skills, or for a student who wants a career in data science.
Q) What is NumPy?
Q) Advantages of NumPy?
NumPy arrays have many advantages over regular Python lists. Some of them are listed below:
Install NumPy
You can install the NumPy package in your Python installation using the command prompt via the following pip command.
pip install numpy
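To confirm that the installation worked, a quick check such as the following can be run in a Python session. This is a minimal sketch; the variable name installed_version is only illustrative, and the version printed depends on what pip installed on your machine.

```python
import numpy as np

# np.__version__ holds the version string of the installed NumPy package.
installed_version = np.__version__
print("NumPy version:", installed_version)
```

If the import fails with ModuleNotFoundError, NumPy was most likely installed into a different Python installation than the one you are running.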
Q) What is a NumPy array?
NumPy Data Types
Q) Explain the different data types in the NumPy library.
You can check the data type of a NumPy array using its dtype attribute.
Write a NumPy program to create a NumPy array and check the data type of the given array.
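import numpy as np
array1 = np.array([1,2,3,4,5,6,7,8,9])

print(array1)
print(array1.dtype)
print(array1.dtype.itemsize)
OUTPUT
[1 2 3 4 5 6 7 8 9]
int32
4
The integer type shown depends on the platform and NumPy version. On most 64-bit systems the result is int64 with an item size of 8.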
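Because the default integer width can differ between platforms, it can help to request the type explicitly. The following is a minimal sketch using the dtype argument of np.array(); the variable names small and large are only illustrative.

```python
import numpy as np

values = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Ask for a specific integer width instead of relying on the platform default.
small = np.array(values, dtype=np.int32)
large = np.array(values, dtype=np.int64)

print(small.dtype, small.dtype.itemsize)  # int32 4
print(large.dtype, large.dtype.itemsize)  # int64 8
```

Setting the type this way gives the same result on every machine, which is useful when memory use or file formats matter.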
The Python NumPy library supports the following data types, including the default Python types. Each one is identified by a one-character code:
i – integer
b – boolean
u – unsigned integer
f – float
c – complex float
m – timedelta
M – datetime
O – object
S – string (bytes)
U – Unicode string
V – a fixed chunk of memory for other types ( void )
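The codes above can be seen on real arrays through the kind attribute of a dtype. The snippet below is a small sketch with made-up sample arrays; the dictionary names samples and kinds are only illustrative.

```python
import numpy as np

# A few sample arrays, one per common kind of data.
samples = {
    "integers": np.array([1, 2, 3]),
    "booleans": np.array([True, False]),
    "floats": np.array([1.5, 2.5]),
    "complex": np.array([1 + 2j]),
    "text": np.array(["a", "bc"]),
    "dates": np.array(["2020-01-01"], dtype="datetime64[D]"),
}

# dtype.kind returns the one-character code for each array's data type.
kinds = {name: arr.dtype.kind for name, arr in samples.items()}
for name, kind in kinds.items():
    print(name, "->", kind)
```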
Write a program that creates a NumPy array with three text items and displays the data type and size of each item.
import numpy as np
array2 = np.array(["Red","Green","Orange"])

print(array2)
print(array2.dtype)
print(array2.dtype.itemsize)
OUTPUT
['Red' 'Green' 'Orange']
<U6
24
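The output shows that NumPy stores text as a Unicode string data type, denoted by U. The digit 6 is the length, in characters, of the longest item ("Orange"). Every item is stored at that fixed width, and since each character takes 4 bytes, the item size is 24 bytes.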
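The relationship between the longest item and the item size can be checked directly. This is a minimal sketch; the names words, longest and bytes_per_char are only illustrative. It also shows one practical consequence of the fixed width: a longer string assigned later is cut down to fit.

```python
import numpy as np

words = ["Red", "Green", "Orange"]
colors = np.array(words)

longest = max(len(w) for w in words)                # 6, from "Orange"
bytes_per_char = colors.dtype.itemsize // longest   # 24 // 6 = 4
print(colors.dtype, longest, bytes_per_char)

# The array keeps its fixed width of 6 characters, so a longer
# string assigned to an element is silently truncated.
colors[0] = "Turquoise"
print(colors[0])  # Turquo
```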
Although NumPy can usually infer the data type of the items stored in an array, the inferred type is not always the one you want. For instance, the following script stores some dates in a NumPy array. Because the dates are written as text (enclosed in double quotes), NumPy treats them as text by default. If you print the data type of the items, you will see that it is a Unicode string (U10).
import numpy as np
array3 = np.array(["1990-10-04","1989-05-06","1990-11-04"])

print(array3)
print(array3.dtype)
print(array3.dtype.itemsize)
OUTPUT
['1990-10-04' '1989-05-06' '1990-11-04']
<U10
40
To treat these values as dates, convert the array with astype(), passing the datetime type code "M":
array4 = array3.astype("M")
print(array4.dtype)
print(array4.dtype.itemsize)
OUTPUT
datetime64[D]
8
Alternatively, you can set the data type when the array is created by passing the dtype argument:
import numpy as np
array5 = np.array(["1990-10-04","1989-05-06","1990-11-04"],dtype="M")
print(array5)
print(array5.dtype)
print(array5.dtype.itemsize)
OUTPUT
['1990-10-04' '1989-05-06' '1990-11-04']
datetime64[D]
8
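One reason to store dates as datetime64 rather than text is that date arithmetic and comparisons then work directly. The following is a small sketch using the same three dates; the names gap and earliest are only illustrative.

```python
import numpy as np

dates = np.array(["1990-10-04", "1989-05-06", "1990-11-04"], dtype="datetime64[D]")

# Subtracting two dates gives a timedelta64 value measured in days.
gap = dates[2] - dates[0]
print(gap)  # 31 days

# Comparisons and reductions treat the values as real dates.
earliest = dates.min()
print(earliest)  # 1989-05-06
```

With the text version of the array, the subtraction would raise an error, because NumPy does not define subtraction for strings.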
Creating NumPy Arrays
Depending on the type of data you need inside your NumPy array, different methods can be used to create a NumPy array.
Using Array Method
To create a NumPy array, you can pass a list, tuple or any array-like object to the array() function of the NumPy module, as shown below.
import numpy as np
num_list = [1,2,3,4,5,6]
arr1 = np.array(num_list)
print(arr1)
print(type(arr1))
OUTPUT
[1 2 3 4 5 6]
<class 'numpy.ndarray'>
- - - - - - - - -- - - - - - - - - - - - - - - -
Example 2
num_tup = (8,9,10,11,12)
arr2 = np.array(num_tup)
print(arr2)
print(type(arr2))
OUTPUT
[ 8  9 10 11 12]
<class 'numpy.ndarray'>
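The same array() call also accepts nested lists, and it picks one common data type when the items are mixed. The sketch below illustrates both points; the names matrix and mixed are only illustrative.

```python
import numpy as np

# A list of lists becomes a two-dimensional array.
matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix.shape)  # (2, 3)
print(matrix.ndim)   # 2

# Mixing integers and floats makes NumPy store every item as a float.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)   # float64
```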
0-D Arrays: 0-D arrays, or scalars, are the individual elements of an array. Each value in an array is a 0-D array.
import numpy as np
arr = np.array(42)
print(arr)
OUTPUT
42
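A 0-D array can be told apart from a one-element 1-D array by its ndim and shape attributes. This is a minimal sketch; the names scalar, vector and value are only illustrative.

```python
import numpy as np

scalar = np.array(42)    # 0-D array: a single value, no axes
vector = np.array([42])  # 1-D array that happens to hold one item

print(scalar.ndim, scalar.shape)  # 0 ()
print(vector.ndim, vector.shape)  # 1 (1,)

# item() converts the 0-D array into a plain Python value.
value = scalar.item()
print(type(value))  # <class 'int'>
```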
To keep this article to a reasonable length, we will stop here. Further code and notes will be provided on my GitHub account.
If you have any doubts or queries during the series, you can post them in the comments. As soon as I get free time, I will personally answer them. The notebooks, code and data sets required for practice will be available in the dedicated GitHub repository.
The next article is dedicated to the most important library for data science: Pandas.
Finally, one request: if you liked what you read and learned something from it, feel free to share the article with your friends, colleagues, students or anyone you think will benefit from it. Sharing is caring.