Data Analysis Using Pandas in Python
CodersArts
In this article we discuss data analysis using pandas in Python. Let's first see what pandas is and why it is used.
What is pandas and why is it used?
Pandas is an open source Python library for data analysis and data manipulation.
It is built on top of NumPy and provides functions for cleaning, analysing, and manipulating data, which help in extracting valuable insights from a dataset.
The main data structure in pandas is the two-dimensional dataframe. We can create a dataframe by importing data in formats such as CSV, XLSX, JSON, SQL, and Parquet. With a few lines of code we can delete or update rows and columns, compute statistics on the data, and identify and handle outliers and missing values.
Why Pandas?
Pandas Download
Pandas is quick and easy to install. If you install Anaconda, you do not need to install Python and pandas separately, because both are included. To install the latest version of pandas yourself, run this command at the command prompt or in a Jupyter notebook:
pip install pandas
After installing pandas, you need to import it before you can use it in the Jupyter notebook.
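As a minimal sketch of the import step, the snippet below uses the conventional alias pd and prints the installed version to confirm that the installation worked.

```python
# Import pandas under its conventional alias.
import pandas as pd

# Print the installed version to confirm the import succeeded.
print(pd.__version__)
```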
Data Analysis of survey_result_public data with Pandas
Let's explore the data and perform a practical data analysis on the survey_result_public dataset with pandas. This is an open source dataset; you can download it from this link.
We will load the data from the .csv file into a pandas dataframe and perform basic operations on it.
Read Data
Load data from CSV file
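Loading a CSV file could look roughly like the sketch below. To keep it runnable without the download, it reads a tiny in-memory sample instead of the real file; the column names (Respondent, Country, Age, Salary) and the file name in the comment are assumptions for illustration, not the actual schema of the dataset.

```python
import io

import pandas as pd

# Hypothetical three-row sample standing in for the real CSV file.
# The column names here are illustrative assumptions.
csv_text = """Respondent,Country,Age,Salary
1,India,25,12000
2,United States,31,95000
3,Germany,28,
"""

# read_csv accepts a file path or any file-like object.
data = pd.read_csv(io.StringIO(csv_text))

# With the downloaded file you would pass its path instead, e.g.:
# data = pd.read_csv("survey_result_public.csv")  # assumed file name
print(data)
```

The empty Salary field in the last row is read as a missing value (NaN), which is how pandas represents missing data by default.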
View Data
Let's see the first five and last five records of the data using the head() and tail() methods.
head(): Returns the first five records of the dataset by default. To see a different number of rows, pass that number as an argument; for example, head(20) returns the first 20 records.
tail(): Returns the last five records of the dataset by default.
data.shape: Returns a tuple with the dimensions of the dataframe. Index 0 of the tuple is the number of records (rows) and index 1 is the number of columns.
data.info(): Prints a summary of the dataframe, including the index dtype, the column dtypes, the non-null counts, and the memory usage.
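The methods above can be tried on a small dataframe. This sketch builds a hypothetical ten-row dataframe (the columns Respondent and Age are made up for illustration) so the results are easy to check by eye.

```python
import pandas as pd

# Hypothetical ten-row dataframe used only to demonstrate the methods.
data = pd.DataFrame({
    "Respondent": range(1, 11),
    "Age": [22, 25, 31, 28, 35, 40, 19, 27, 33, 30],
})

first_five = data.head()     # first five rows (the default)
first_three = data.head(3)   # first three rows
last_five = data.tail()      # last five rows (the default)

rows, cols = data.shape      # (number of rows, number of columns)
print(rows, cols)

data.info()                  # prints dtypes, non-null counts, memory usage
```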
pd.set_option: Pandas has options that customize aspects of its behaviour, including how dataframes are displayed. For example, the display.max_columns and display.max_rows options set the maximum number of columns and rows shown when a dataframe is displayed.
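A short sketch of the display options follows. The limit of 85 is an arbitrary value chosen for illustration; pick whatever suits the width and length of your own data.

```python
import pandas as pd

# Show up to 85 columns and 85 rows before pandas truncates the output.
# 85 is an illustrative value, not a recommendation.
pd.set_option("display.max_columns", 85)
pd.set_option("display.max_rows", 85)

# get_option reads a setting back.
print(pd.get_option("display.max_columns"))
```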
Create Dataframe Using Dictionary
Dictionaries store data values in key:value pairs. A dictionary is a collection that is ordered (as of Python 3.7), changeable, and does not allow duplicate keys. We can create a dataframe from a dictionary whose keys become the column names and whose values are lists of column data.
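Creating a dataframe from a dictionary might look like the sketch below. The names and email addresses are made-up sample values.

```python
import pandas as pd

# Each key becomes a column name; each list holds that column's values.
# All values below are made-up sample data.
people = {
    "first": ["Asha", "Ben", "Chen"],
    "last": ["Rao", "Smith", "Li"],
    "email": ["asha@example.com", "ben@example.com", "chen@example.com"],
}

df = pd.DataFrame(people)
print(df)
```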
Index
In Python, the index() method returns the position of the first occurrence of an item in a list, or of a substring in a string.
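For illustration, here is index() applied to a plain Python list and a string; the sample values are arbitrary.

```python
# index() on a list returns the position of the first matching item.
languages = ["Python", "SQL", "R", "SQL"]
position = languages.index("SQL")

# index() on a string returns the position of the first matching substring.
text = "pandas"
char_position = text.index("d")

print(position, char_position)
```

Positions are counted from 0, and index() raises a ValueError if the item is not found.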
We can access a single column just as we would access a key of a dictionary, for example df['email'].
We can also access a single column as an attribute, using df.email.
When we check the type of df['email'], it is a pandas Series (pandas.core.series.Series). A Series is a one-dimensional labeled array of data.
To access multiple columns, use bracket notation and pass a list of the column names.
We can see all the column names of the dataset using df.columns.
We can access columns by position using iloc and passing the integer position of the column. Here df.iloc[:, 1] extracts the second column.
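The column access patterns described above can be sketched together on a small dataframe. The column names first, last, and email and all values are made-up sample data.

```python
import pandas as pd

# Small sample dataframe with made-up values.
df = pd.DataFrame({
    "first": ["Asha", "Ben", "Chen"],
    "last": ["Rao", "Smith", "Li"],
    "email": ["asha@example.com", "ben@example.com", "chen@example.com"],
})

email_by_key = df["email"]        # dictionary-style access
email_by_attr = df.email          # attribute-style access
print(type(df["email"]))          # a pandas Series

subset = df[["last", "email"]]    # list of names -> a dataframe
print(df.columns)                 # all column names

second_col = df.iloc[:, 1]        # all rows, second column (position 1)
```

Attribute access only works when the column name is a valid Python identifier and does not clash with an existing dataframe attribute, so bracket notation is the safer general choice.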
Filtering
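Using the filter() method we can select specific columns or rows of a dataframe according to their labels along the specified axis.
To filter rows by value on two conditions at once, build a boolean mask for each condition and combine the masks with & (and) or | (or), wrapping each condition in parentheses.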
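Both kinds of filtering might look like the sketch below. The columns Country, Age, and Salary, the sample values, and the threshold of 50000 are assumptions chosen for illustration.

```python
import pandas as pd

# Made-up sample data.
df = pd.DataFrame({
    "Country": ["India", "India", "Germany", "United States"],
    "Age": [25, 31, 28, 40],
    "Salary": [40000, 70000, 65000, 90000],
})

# filter() selects by label: here, two columns by name.
cols = df.filter(items=["Country", "Salary"])

# With axis=0, filter() selects rows by index label.
rows = df.filter(items=[0, 2], axis=0)

# Two conditions combined with & (each wrapped in parentheses).
mask = (df["Country"] == "India") & (df["Salary"] > 50000)
result = df[mask]
print(result)
```

Note that filter() works on labels only; filtering on the values in the data is done with boolean masks as in the last step.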
Update
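We can update the data in a pandas dataframe, for example by assigning a new value to a single cell, transforming a whole column, or renaming columns.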
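A few common ways to update data are sketched below on made-up sample values; these are illustrative choices, not the only approaches pandas offers.

```python
import pandas as pd

# Made-up sample data.
df = pd.DataFrame({
    "first": ["Asha", "Ben"],
    "email": ["Asha@Example.com", "Ben@Example.com"],
})

# Update one cell: row label 1, column "email".
df.loc[1, "email"] = "Ben.Smith@Example.com"

# Update a whole column by assigning a transformed copy.
df["email"] = df["email"].str.lower()

# Rename a column; rename returns a new dataframe.
df = df.rename(columns={"first": "first_name"})
print(df)
```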
Remove
We can remove rows from a pandas dataframe with the drop() method, either by index label or by first selecting the rows that match a condition.
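Removing rows could be done as in this sketch, which uses made-up sample values. drop() returns a new dataframe and leaves the original unchanged.

```python
import pandas as pd

# Made-up sample data.
df = pd.DataFrame({
    "first": ["Asha", "Ben", "Chen"],
    "last": ["Rao", "Smith", "Li"],
})

# Remove the row with index label 1.
without_row_1 = df.drop(index=1)

# Remove every row that matches a condition.
to_remove = df[df["last"] == "Li"].index
without_li = df.drop(index=to_remove)

print(without_row_1)
print(without_li)
```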
Sorting
We can sort the data in a pandas dataframe with the sort_values() method, by one column or by several.
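Here is a brief sketch of sorting by one column and by two columns. The columns Country and Age and their values are made up for illustration.

```python
import pandas as pd

# Made-up sample data.
df = pd.DataFrame({
    "Country": ["India", "Germany", "India", "Germany"],
    "Age": [25, 31, 35, 28],
})

# Sort by a single column, ascending by default.
by_age = df.sort_values(by="Age")

# Sort by Country ascending, then by Age descending within each country.
by_country_age = df.sort_values(by=["Country", "Age"], ascending=[True, False])

print(by_age)
print(by_country_age)
```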
Grouping
Grouping splits the data into categories and applies a function to each category.
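As a small sketch of grouping, the snippet below computes the mean salary per country and counts the rows in each category. The columns Country and Salary and their values are assumptions for illustration.

```python
import pandas as pd

# Made-up sample data.
df = pd.DataFrame({
    "Country": ["India", "Germany", "India", "Germany"],
    "Salary": [40000, 60000, 50000, 80000],
})

# Group rows by Country, then apply mean() to the Salary of each group.
mean_salary = df.groupby("Country")["Salary"].mean()

# Count how many rows fall into each category.
counts = df["Country"].value_counts()

print(mean_salary)
print(counts)
```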
In this article we have discussed data analysis using pandas dataframes.
If you are looking for help with data analysis, please contact us at [email protected]
Thank you