Dask: From Scratch to Scalable Analytics in Python! :)

Josue Luzardo Gebrim

Platform and Data Engineer

Published: February 13, 2022

A Set of Practical, Powerful, and Sexy Libraries for Working with Machine and Deep Learning!

Dask provides advanced parallelism for analytics, enabling performance at scale for the tools you love.

Dask is a set of flexible libraries for parallel computing in Python, consisting of two parts:

Dynamic task scheduling: similar to Airflow, Luigi, Celery, or Make, but optimized for interactive computational workloads.
“Big Data” collections: parallel arrays, dataframes, and lists that extend common interfaces such as NumPy, Pandas, or Python iterators to larger-than-memory or distributed environments. These parallel collections run on top of the dynamic task schedulers.
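
To make the two parts concrete, here is a minimal sketch, assuming Dask and NumPy are installed (for example via `pip install "dask[array]"`). The helper functions `load` and `total` and the array sizes are hypothetical, chosen only for illustration. The first half builds a small task graph with `dask.delayed`, and the second half uses a chunked `dask.array` through a NumPy-like interface.

```python
import dask
import dask.array as da


@dask.delayed
def load(n):
    # Hypothetical stand-in for reading one chunk of data.
    return list(range(n))


@dask.delayed
def total(chunk):
    # Reduce one chunk to a single number.
    return sum(chunk)


# Part 1: dynamic task scheduling.
# Calling delayed functions does not run them; it only records a task graph.
partials = [total(load(n)) for n in (10, 100, 1000)]
grand_total = dask.delayed(sum)(partials)

# compute() hands the graph to a scheduler, which runs independent tasks in parallel.
delayed_result = grand_total.compute()

# Part 2: a parallel collection with a NumPy-like interface.
# The array is split into 500 x 500 blocks; each block is an ordinary NumPy array.
x = da.ones((2000, 2000), chunks=(500, 500))
array_result = (x + x.T).mean().compute()

print(delayed_result, array_result, x.numblocks)
```

In both halves nothing is evaluated until `compute()` is called, which is what lets the scheduler see the whole graph before deciding how to run it.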

Beyond these two parts, Dask integrates closely with other data science frameworks and libraries and offers customized interfaces that make it easier to use. It is also an open-source project with a large community of maintainers and a broad ecosystem of integrations and “daughter” libraries.

Find out more about Dask at:

Thanks for reading; share this post! :)

Data for Everyone!

Thiago Barbosa

AML/CFT & Data Analytics Analyst

3 years ago

Congratulations on the initiative, master!

