Pandas/Python Library

Kishan Kumar

Senior Consultant CRD(Corporate function) at Huquo

Published: July 6, 2022

Pandas

Pandas is a Python library for data analysis. Started by Wes McKinney in 2008 out of a need for a powerful and flexible quantitative analysis tool, pandas has grown into one of the most popular Python libraries. It has an extremely active community of contributors.

Pandas is built on top of two core Python libraries: matplotlib for data visualization and NumPy for mathematical operations. Pandas acts as a wrapper over these libraries, allowing you to access many of matplotlib's and NumPy's methods with less code. For instance, pandas' .plot() combines multiple matplotlib methods into a single method, enabling you to plot a chart in a few lines.
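As a rough illustration of the wrapper idea, the sketch below plots a small DataFrame with a single .plot() call and applies a NumPy function directly to a column. The column names and numbers are made up for the example, and the non-interactive "Agg" backend is chosen only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available, so use a file-only backend

import numpy as np
import pandas as pd

# Hypothetical weekly figures, invented for illustration.
df = pd.DataFrame({
    "signups": [12, 18, 25, 31],
    "cancellations": [2, 3, 1, 4],
})

# One pandas call creates the figure, draws one line per column,
# and sets the title; it returns a matplotlib Axes object.
ax = df.plot(title="Weekly activity")
n_lines = len(ax.get_lines())

# NumPy functions accept pandas objects and return pandas objects.
log_signups = np.log(df["signups"])
```

Because .plot() returns a regular matplotlib Axes, you can still fall back to matplotlib methods (for example ax.set_ylabel) when the one-line default is not enough.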

Before pandas, most analysts used Python for data munging and preparation, and then switched to a more domain-specific language like R for the rest of their workflow. Pandas introduced two new types of objects for storing data that make analytical tasks easier and eliminate the need to switch tools: Series, which have a list-like structure, and DataFrames, which have a tabular structure.

Pandas tutorials

Here are some analysis-focused pandas tutorials that aren't riddled with technical jargon.

Pandas cookbook (Julia Evans) - This tutorial uses real-world data and presents a problem to solve or question to answer in every example. Great for putting pandas' capabilities in the context of the actual analytical workflow.
Practical Data Analysis with Python (Anita Raichand) - Provides code examples for four specific analytical tasks: data munging, aggregation, visualization, and time series analysis.
[VIDEO SERIES] Easier data analysis in Python with pandas (Data School) - A series of video tutorials for pandas newbies who know some Python. Each video answers a student-posed question using real-world data.
An Introduction to Pandas (Michael Hansen) - This tutorial covers the basics of pandas with a complete analysis of weather data, from reading in data to creating charts.
Modern Pandas (Tom Augspurger) - An intermediate tutorial for experienced Python users looking to stay sharp on pandas.

Pandas data structures

Series

You can think of a series as a single column of data. Each value in the series has a label, and these labels are collectively referred to as an index. When a five-element series with the default index is printed, the labels 0-4 appear on the left as the index, and the column of numbers to the right contains the values.
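A minimal sketch of such a series, assuming five made-up values and the default integer index that pandas assigns when none is given:

```python
import pandas as pd

# Hypothetical values; with no index supplied, pandas labels them 0-4.
s = pd.Series([10, 20, 30, 40, 50])
print(s)
# 0    10
# 1    20
# 2    30
# 3    40
# 4    50
# dtype: int64

index_labels = list(s.index)  # the labels on the left
values = s.tolist()           # the values on the right
```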

DataFrames

While series are useful, most analysts work with the majority of their data in DataFrames. DataFrames store data in the familiar table format of rows and columns, much like a spreadsheet or database. DataFrames make a lot of analytical tasks easier, such as finding the averages per column in a dataset.

You can also think of DataFrames as a collection of series—just as multiple columns combined make up a table, multiple series make up a DataFrame.
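To make the "collection of series" picture concrete, here is a small sketch that assembles a DataFrame from two series and computes the average of each column. The column names and readings are invented for the example.

```python
import pandas as pd

# Two hypothetical series that share the same default index (0-2).
temps = pd.Series([20.0, 22.0, 24.0])
rain = pd.Series([0.0, 5.0, 1.0])

# Each series becomes one column of the DataFrame.
df = pd.DataFrame({"temp_c": temps, "rain_mm": rain})

# Selecting a single column returns a Series again.
temp_column = df["temp_c"]

# mean() works column by column and returns one average per column.
col_means = df.mean()
```

Here col_means is itself a series, indexed by column name, so col_means["temp_c"] gives the average temperature.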

Note: In Mode, the results of your SQL queries are automatically converted into DataFrames and made available in the list variable "datasets." To describe or transform the results of Query 1, use datasets[0]; for the results of Query 2, use datasets[1]; and so on.
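Outside Mode there is no datasets variable, so the sketch below builds a stand-in list of two DataFrames purely to show the indexing pattern the note describes. The query results, column names, and values are all hypothetical.

```python
import pandas as pd

# Stand-in for the list Mode provides; here it is constructed by hand (assumption).
datasets = [
    pd.DataFrame({"user_id": [1, 2, 3], "spend": [9.5, 20.0, 3.25]}),  # "Query 1"
    pd.DataFrame({"region": ["east", "west"]}),                        # "Query 2"
]

# Results of the first query: describe its numeric columns.
query1 = datasets[0]
summary = query1.describe()

# Results of the second query: count its rows.
n_rows_query2 = len(datasets[1])
```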

For more on manipulating pandas data structures, check out Greg Reda's three-part tutorial, which approaches the topic from a SQL perspective.

Pandas features

Time series analysis

Time Series / Date functionality (Official Pandas Documentation)
Times series analysis with pandas (EarthPy)
Timeseries with pandas (Jupyter)
Complete guide to create a Time Series Forecast (with Codes in Python) (Analytics Vidhya)

Split-apply-combine

Split-apply-combine is a common strategy used during analysis to summarize data: you split data into logical subgroups, apply some function to each subgroup, and stick the results back together again. In pandas, this is accomplished using the groupby() method together with whatever functions you want to apply to the subgroups.
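The three steps can be sketched on a tiny, made-up sales table: groupby() does the split, sum() and mean() are the applied functions, and pandas combines the per-group results into a new series indexed by group.

```python
import pandas as pd

# Hypothetical sales records, invented for illustration.
sales = pd.DataFrame({
    "region": ["east", "west", "east", "west"],
    "revenue": [100, 150, 200, 50],
})

# Split by region, apply an aggregate to each group's revenue,
# and combine the results into one value per region.
grouped = sales.groupby("region")["revenue"]
totals = grouped.sum()
averages = grouped.mean()
```

Both results are indexed by region, so totals["east"] returns the combined revenue for the east group.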

Group By: split-apply-combine (Official Pandas Documentation)
Summarizing Data in Python with Pandas (Brian Connelly)
Using Pandas: Split-Apply-Combine (Duke University)


