Data Analysis In Python
Data has been an integral part of business in their day-to-day operations to keep track and derive insights from it. Can you imagine your company without data? — Impossible.
But where does this start? I mean, how do you start handling such complex data , interpret it and derive insights from it? — Data Analysis is the answer!
Python is the top most useful programming language to perform data analysis. Powerful libraries like Pandas and NumPy makes it easier to perform efficient data analysis.
In this article, we will export its features and see its usage using Python code.
Now let’s get started!
Prerequisites
An Introduction to NumPy
NumPy — which stands for Numerical Python is a an open source library for performing numerical computing and data manipulations including linear algebra, statistics and fourier transforms.
Key Features of NumPy
Implementing NumPy
Importing NumPy In Python Code
To import this library, simply use this import in Python and the name of the library you wish to import.
import numpy as np
Create A NumPy Array
In Python, NumPy array is created using array() of numpy library.
The array() takes an array of integers.
data = np.array([1, 2, 3, 4, 5])
Perform operations
It’s time to perform some operations! The numpy library offers methods like mean() to calculate the mean, std to calculate standard deviation etc. Let’s see this in action. Here is the full code.
We are simply passing the data` variable which has the array defined.
Output
See how simply it is, isn’t it? ??
Let’s move forward to see what are Pandas.
Introduction to Pandas
Many a times, we are in a situation where you want to analyse your data and perform manipulations based on business requirements — Pandas is the solution.
It provides two primary data structures:
Key Features of Pandas
Implementing Pandas
领英推荐
Importing Pandas in Your Python Code
Pretty much similar how we imported NumPy.
import pandas as pd
Creating A DataFrame
In order to create a DataFrame, we would first create a dictionary where keys represent column names with its corresponding data as values.
Next, pandas provide a method called DataFrame() that converts the dictionary into a structured table.
data = {'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, 30, 35],
'Salary': [50000, 60000, 70000]}
df = pd.DataFrame(data)
So in short, DataFrame means a structured table
Performing Operations
We would specifically access the Salary column and calculating the mean of the data provided, we would analyse our data using describe().
Output
This is how the mean would look like, followed by describe that describes all the parameters required.
Don’t get confused with the term mean. It literally means average. And we all know how to calculate average of numbers.
Combining NumPy and Pandas for Data Analysis
Having learnt the basics about NumPy and Pandas, it’s now time to see an example of combination of both.
NumPy is excellent for numerical computation. Pandas can be used for data manipulation and visualization.
Output
Common Use Cases Of These Libraries
These open source libraries in Python are widely used for various tasks which I have listed below:
This table summarizes some of the common use cases in day-to-day business operations.
Conclusion
Data Analysis is very vast topic in itself. I have provided you the basic explanation of how to implement it. Your minds would explode if I put everything into one — which I really don’t want.
I would further write more articles that demonstrates a real-world example, but I felt it is necessary to understand the basics first.
Pandas and Numpy are great tools to tackle a wide range of data-related tasks. Whether you are cleaning data, performing statistical analysis, or preparing data for visualization, these libraries will make your work more efficient and enjoyable.
Enjoyed This Article? ??
If you enjoyed this article and if you feel I was able to teach you some basics, please show your appreciation with a like.
Feel free to leave a comment below with your thoughts, experiences, or any additional tips on setting boundaries.