Polars vs pandas

Polars is a fast DataFrame library, similar to Pandas, but optimized for performance on large datasets. It runs queries in parallel across multiple CPU cores, which makes it an attractive choice for processing large datasets efficiently.

Polars, Pandas, and DuckDB are often used for the same kinds of large-dataset workloads, but they are not identical tools: Polars and Pandas expose DataFrame APIs, while DuckDB is primarily an in-process SQL engine.

Much of Polars' advantage over Pandas comes from its Rust-based core. Unlike Java and other languages that rely on a garbage collector, Rust manages memory without one, which avoids garbage-collection pauses and the memory overhead that comes with them.

Rust, which began as a personal project of Graydon Hoare and was sponsored by Mozilla from 2009 and announced publicly in 2010, is designed around three core goals:

Performance

Safety

Memory management

Rust is used to build demanding applications such as game engines, operating systems, and browsers, all of which need to scale. Rust is similar to C++ in many ways, but it provides memory safety without relying on garbage collection. The language aims for performance comparable to C++ with stronger safety guarantees.

import polars as pl
import duckdb
import pandas as pd

For the comparison, I used airline data stored in a CSV file (1996.csv) with 5,351,983 rows and a size of 540 MB on disk. I measured execution time, CPU usage, and RAM consumption for each library. The results show a significant performance improvement with Polars, as seen in the accompanying image.

Airline data: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/HG7NV7

System configuration: 16 GB RAM, 5 cores, Ubuntu 24.04, SSD
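A minimal sketch of how the load-time part of such a comparison could be reproduced is shown here. It is not the exact script behind the results above: the helper name timed, the three load_* functions, and the CSV_PATH location are assumptions made for illustration, and it measures wall-clock time only, not CPU or RAM. Loader options such as null-value handling or column types may need adjusting for this particular file.

```python
import os
import time

CSV_PATH = "1996.csv"  # assumed local path to the downloaded airline CSV


def timed(label, fn, repeats=3):
    """Call fn() `repeats` times and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    print(f"{label:<8} {best:.3f} s")
    return best


# Each loader imports its library on first call; because the best of several
# runs is reported, the one-time import cost does not distort the result.
def load_pandas(path=CSV_PATH):
    import pandas as pd
    return pd.read_csv(path)


def load_polars(path=CSV_PATH):
    import polars as pl
    return pl.read_csv(path)


def load_duckdb(path=CSV_PATH):
    import duckdb
    # read_csv returns a lazy relation; .df() forces the full scan and
    # materializes the result as a pandas DataFrame.
    return duckdb.read_csv(path).df()


if __name__ == "__main__" and os.path.exists(CSV_PATH):
    timed("pandas", load_pandas)
    timed("polars", load_polars)
    timed("duckdb", load_duckdb)
```

Taking the best of several runs reduces noise from disk caching and background processes. Note that the DuckDB figure here includes the cost of converting the result to a pandas DataFrame, so it is not a pure measure of DuckDB's CSV reader.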

Feel free to contact me at cnsnoida@gmail.com.

Thanks for reading!

