NumPy For Data Science

Welcome to the article on NumPy for Data Science, where you will learn both the concepts behind the NumPy library and how to use it in Python.

My name is Jagdish Chavan. I am a working Data Scientist and Industry Trainer, and more than 1,000 students have enrolled in my courses. I hold a B.Sc. in Computer Science with Mathematics and Statistics, and I have experience building projects in Data Science, ML and DL applications, as well as in the consulting industry.

While doing my job, I realized that many data analysts and beginners in the field of data science are overwhelmed by the power of NumPy. This article and the ones that follow are aimed at students who want to expand their skill set in the field of data science.

A practising data scientist needs to get from concept to code without getting too mathematical about it. If you regularly put in one hour a day, within a week you will be able to understand how NumPy is used in practice.

The prerequisites for this article are core and advanced Python concepts. It will also help if you are familiar with arrays.

This article is ideal for an existing business analyst or data analyst who wants to expand on current skills, or for a student who wants a career in data science.

Q) What is NumPy?

  • NumPy is one of the most commonly used libraries for numeric and scientific computing.
  • The word NumPy is a portmanteau of two words: Numerical and Python.
  • NumPy is extremely fast and contains support for multiple mathematical domains such as linear algebra, geometry, etc. Therefore, it is extremely important to learn NumPy if you plan to make a career in data science and data preparation.
  • The NumPy library stores data in the form of NumPy arrays, which provide extremely fast and memory-efficient data storage.

Q) Advantages of NumPy?

NumPy arrays have many advantages over regular Python lists. Some of them are listed below:

  • NumPy arrays are much faster for reading, updating, and computing over large amounts of numeric data. Note that inserting or deleting elements creates a new array, so those operations are not where NumPy is fast.
  • NumPy arrays support broadcasting, which regular Python lists do not.
  • NumPy arrays come with a lot of methods that support advanced arithmetic and linear algebra operations.
  • NumPy provides advanced multi-dimensional array slicing capabilities.
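To make two of these advantages concrete, here is a minimal sketch of broadcasting and multi-dimensional slicing. The variable names (prices, grid and so on) are made up for illustration and are not part of NumPy.

```python
import numpy as np

prices = np.array([10.0, 20.0, 30.0])

# Broadcasting: the scalar 1.1 is applied to every element without a loop.
with_tax = prices * 1.1

# The same operation on a plain Python list needs a loop or comprehension.
with_tax_list = [p * 1.1 for p in [10.0, 20.0, 30.0]]

# Multi-dimensional slicing: rows 0-1 and columns 1-2 of a 3x3 grid.
grid = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])
block = grid[0:2, 1:3]

print(with_tax)
print(block)
```

The slice grid[0:2, 1:3] selects rows and columns in a single step, which a list of lists cannot do without a nested loop.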

Install NumPy

You can install the NumPy package in your Python installation using the command prompt via the following pip command.

pip install numpy

Q) What is a NumPy Array?

  • The main data structure in the NumPy library is the NumPy array, which is an extremely fast and memory-efficient data structure.
  • The NumPy array is much faster than the common Python list and provides vectorized and matrix operations.
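As a small illustration of what "vectorized" means, the sketch below applies arithmetic to whole arrays at once. The names a and b are only examples.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

total = a + b      # element-wise addition, no explicit loop
product = a * b    # element-wise multiplication
dot = a @ b        # dot product of the two vectors

print(total)    # [11 22 33 44]
print(product)  # [ 10  40  90 160]
print(dot)      # 300

# For comparison, + on plain Python lists concatenates instead of adding.
print([1, 2] + [3, 4])  # [1, 2, 3, 4]
```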

NumPy Data Types

Q) Explain the different data types in the NumPy library?

  • The NumPy library supports all the default Python data types in addition to some of its intrinsic data types.
  • This means that the default Python data types, e.g., strings, integers, floats, Booleans, and complex data types, can be stored in NumPy arrays.
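The following sketch stores each of these default Python types in a NumPy array and prints the dtype NumPy picks. The dictionary name samples and its labels are illustrative only, and the exact integer width you see depends on your platform.

```python
import numpy as np

samples = {
    "integers": np.array([1, 2, 3]),
    "floats": np.array([1.5, 2.5]),
    "booleans": np.array([True, False]),
    "complex": np.array([1 + 2j, 3 - 1j]),
    "strings": np.array(["a", "bc"]),
}

# Print the data type NumPy inferred for each array.
for label, arr in samples.items():
    print(label, arr.dtype)
```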

You can check the data type of a NumPy array using the dtype attribute.

Write a NumPy program to create a NumPy array and check the data type of the given array.

import numpy as np

array1 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

print(array1)
print(array1.dtype)
print(array1.dtype.itemsize)

OUTPUT

[1 2 3 4 5 6 7 8 9]
int32
4

  • The script above defines a NumPy array with nine integers.
  • Next, the array type is displayed via the dtype attribute.
  • Finally, the size of each item in the array (in bytes) is displayed via the itemsize attribute.
  • The output prints the array and the type of its items, here int32 (a 32-bit integer), followed by the size of each item, which is 4 bytes (32 bits). The default integer type depends on your platform and NumPy version, so you may see int64 and 8 instead.
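If you want a specific integer width rather than whatever your platform picks by default, one option is to pass dtype explicitly when creating the array. The sketch below uses illustrative variable names.

```python
import numpy as np

values = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Request the integer width explicitly instead of relying on the default.
as_int32 = np.array(values, dtype=np.int32)
as_int64 = np.array(values, dtype=np.int64)

print(as_int32.dtype, as_int32.dtype.itemsize)  # int32 4
print(as_int64.dtype, as_int64.dtype.itemsize)  # int64 8

# nbytes is the total memory used by the items: count * itemsize.
print(as_int32.nbytes, as_int64.nbytes)         # 36 72
```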

NumPy identifies each kind of data type by a single-character code. The following codes are supported, including those for the default Python types.

  • i – integer
  • b – boolean
  • u – unsigned integer
  • f – float
  • c – complex float
  • m – timedelta
  • M – datetime
  • O – object
  • S – string (bytes)
  • U – Unicode string
  • V – a fixed chunk of memory for other types (void)
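You can read these codes back from any array through dtype.kind. The sketch below builds one small array per code. The list name examples is illustrative, and the arrays are chosen only to trigger each kind.

```python
import numpy as np

examples = [
    np.array([1, 2]),                                 # i
    np.array([True, False]),                          # b
    np.array([1, 2], dtype=np.uint8),                 # u
    np.array([1.0, 2.0]),                             # f
    np.array([1 + 2j]),                               # c
    np.array([5], dtype="timedelta64[D]"),            # m
    np.array(["1990-10-04"], dtype="datetime64[D]"),  # M
    np.array([{"a": 1}], dtype=object),               # O
    np.array([b"abc"]),                               # S
    np.array(["abc"]),                                # U
    np.zeros(1, dtype="V4"),                          # V
]

# dtype.kind returns the single-character code for each array.
kinds = [arr.dtype.kind for arr in examples]
print(kinds)
```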

Write a program that creates a NumPy array with three text items and displays the data type and size of each item.

import numpy as np

array2 = np.array(["Red", "Green", "Orange"])

print(array2)
print(array2.dtype)
print(array2.dtype.itemsize)

OUTPUT

['Red' 'Green' 'Orange']
<U6
24

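The output shows that NumPy stores text as a Unicode string data type, denoted by U. The digit 6 is the number of characters in the longest item ("Orange"), and every item is stored at that width. Each character takes 4 bytes, which is why the item size is 24. The leading < indicates little-endian byte order.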
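One practical consequence of this fixed width is worth seeing in code: assigning a longer string into the array silently truncates it. The sketch below uses an illustrative array named colors and shows one way around it, widening the dtype with astype().

```python
import numpy as np

colors = np.array(["Red", "Green", "Orange"])  # dtype is <U6

# "Turquoise" has 9 characters, but each slot only holds 6.
colors[0] = "Turquoise"
print(colors[0])             # Turquo

# Widening the string dtype first makes room for longer values.
wider = colors.astype("U12")
wider[0] = "Turquoise"
print(wider[0])              # Turquoise
print(wider.dtype.itemsize)  # 48
```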

NumPy is usually good at guessing the data type of the items stored in an array, but the guess is not always the one you want. For instance, in the following script, you store some dates in a NumPy array. Since the dates are written as text (enclosed in double quotation marks), NumPy treats them as text by default. Hence, if you print the data type of the stored items, you will see that it is a Unicode string (U10).

import numpy as np

array3 = np.array(["1990-10-04", "1989-05-06", "1990-11-04"])

print(array3)
print(array3.dtype)
print(array3.dtype.itemsize)

OUTPUT

['1990-10-04' '1989-05-06' '1990-11-04']
<U10
40

  • You can convert the items in a NumPy array to another data type via the astype() method.
  • To do so, you pass the target data type to the astype() method.
  • For instance, the following script converts the array you created in the previous script to the datetime data type.
  • You can see that "M" is passed as the argument to the astype() method. "M" stands for the datetime data type, as mentioned above.

array4 = array3.astype("M")

print(array4.dtype)
print(array4.dtype.itemsize)

OUTPUT
datetime64[D]
8

  • In addition to converting arrays from one type to another, you can also specify the data type for a NumPy array at the time of definition via the dtype parameter.
  • For instance, in the following script, you specify "M" as the value for the dtype parameter, which tells NumPy that the items must be stored as datetime values.

import numpy as np

array5 = np.array(["1990-10-04","1989-05-06","1990-11-04"],dtype="M")

print(array5)
print(array5.dtype)
print(array5.dtype.itemsize)

OUTPUT

['1990-10-04' '1989-05-06' '1990-11-04']
datetime64[D]
8        
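One reason to prefer the datetime type over text is that date arithmetic and sorting then work directly on the array. The sketch below reuses the same three dates. The names gap and next_day are illustrative.

```python
import numpy as np

dates = np.array(["1990-10-04", "1989-05-06", "1990-11-04"], dtype="M")

# Subtracting two dates gives a timedelta (kind "m") measured in days.
gap = dates[2] - dates[0]
print(gap)             # 31 days

# Adding a timedelta shifts every date in the array by one day.
next_day = dates + np.timedelta64(1, "D")
print(next_day)

# Sorting is chronological because the items are real dates, not text.
print(np.sort(dates))
```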

Creating NumPy Arrays

Depending on the type of data you need inside your NumPy array, different methods can be used to create a NumPy array.

Using the array() Function

To create a NumPy array, you can pass a list, tuple or any array-like object to the array() function of the NumPy module, as shown below.

import numpy as np

num_list = [1, 2, 3, 4, 5, 6]
arr1 = np.array(num_list)

print(arr1)
print(type(arr1))

OUTPUT
[1 2 3 4 5 6]
<class 'numpy.ndarray'>

Example 2

num_tup = (8, 9, 10, 11, 12)
arr2 = np.array(num_tup)

print(arr2)
print(type(arr2))

OUTPUT
[ 8  9 10 11 12]
<class 'numpy.ndarray'>

  • You can also create a multi-dimensional NumPy array.
  • A dimension in an array is one level of array depth (nested arrays).
  • To create a two-dimensional array, you build a list of lists where each inner list corresponds to one row.
  • Passing such a list of lists to array() produces a two-dimensional array.
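A minimal sketch of a two-dimensional array built from a list of lists is shown here. The names rows and arr_2d are illustrative.

```python
import numpy as np

# Each inner list becomes one row of the two-dimensional array.
rows = [[1, 2, 3],
        [4, 5, 6]]
arr_2d = np.array(rows)

print(arr_2d)
print(arr_2d.ndim)   # 2
print(arr_2d.shape)  # (2, 3)
```

The ndim attribute gives the number of dimensions and shape gives the size along each one, here 2 rows and 3 columns.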

0-D Arrays: A 0-D array, or scalar, holds a single value. Each individual value in an array can be thought of as a 0-D array.

import numpy as np

arr = np.array(42)
print(arr)

OUTPUT
42        
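To confirm that such an array really has zero dimensions, you can inspect its ndim and shape attributes. The sketch below compares it with a one-element 1-D array. The names scalar and vector are illustrative.

```python
import numpy as np

scalar = np.array(42)    # 0-D array: a single value, no axes
vector = np.array([42])  # 1-D array with one element

print(scalar.ndim, scalar.shape)  # 0 ()
print(vector.ndim, vector.shape)  # 1 (1,)

# item() extracts the value as a plain Python object.
print(scalar.item())              # 42
```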

Keeping in mind the length of this article, we will stop here. Further code and notes will be provided on my GitHub account.

If you have any doubts or queries during the series, you can post them in the comments. As soon as I get free time, I will personally answer all of them. The notebooks, code and data sets required for practice will be available in the dedicated GitHub repository.

The next article is dedicated to the most important library for data science, Pandas.

Finally, one request: if you liked what you read and learned something from it, feel free to share the article with your friends, colleagues, students or anyone you think will benefit from it. Sharing is caring.

