DATA Pill #075 - 5 Best Data Observability Platforms, to dbt or not to dbt

DATA Pill #075 - 5 Best Data Observability Platforms, to dbt or not to dbt


Hi,

Recover after the weekend with a DATA Pill.

Inhale, exhale… your newest dose is here.


ARTICLES

5 Best Data Observability Platforms | 7 min | Data Observability | Shawn Fergus | Shippyyard Blog

The pros and cons of the 5 best Data Observability Platforms. In this article, Shawn compares Datadog, Splunk, and what else? Check it out.


Accelerating Innovation at JetBlue Using Databricks | 8 min | DataOps | Sai Ravuru and Yared Gudeta | Databricks Blog

How JetBlue utilizes Azure, Databricks, and generative AI for customer experience. Shifting to Databricks Lakehouse architecture improves scalability and cost efficiency, while their BlueSky AI system boosts operational efficiency and satisfaction.

Data engineering at Meta: High-Level Overview of the internal tech stack | 12 min | Data Engineering | Alex M. | Meta Engineering Blog

This article provides an overview of the internal tech stack that we use on a daily basis as data engineers at Meta. The idea is to shed some light on the work we do, and how the tools and frameworks contribute to making our day-to-day data engineering work more efficiently, and to share some of the design decisions and technical tradeoffs that we made along the way.


To dbt or not to dbt | 10 min | Data Science | Pragun Bhutani | intercom-rad Blog

Delve into data management challenges, including SQL complexity, and present Intercom's use of dbt for data transformation. It highlights advantages like organized code and staging models, but acknowledges challenges like a learning curve and boilerplate code.



TUTORIALS

Deploying efficient Kedro pipelines on GCP Composer / Airflow with node grouping & MLflow | 7 min | Data Engineering | Artur Dobrogowski | GetInData | Part of Xebia Blog

Have you ever wondered if it’s possible to use Kedro and Airflow together? Dive into the step-by step tutorial on how to deploy Kedro pipelines on GCP Composer and Airflow.


Building a Real-Time Data Architecture with Apache Kafka, Flink, and Druid | 10 min | Real Time Analytics | David Wang | Data Engineering Things

This article explores how, when combined, Apache Kafka, Flink and Druid create a real-time data architecture that eliminates these waiting states and enables various real-time data applications, including alerting, monitoring, dashboards, analytics and personalized recommendations. These tools provide a purpose-built pipeline for real-time data applications and have been used by major companies like Lyft, Pinterest, Reddit and Paytm to achieve the data freshness, scale and reliability required for real-time use cases.


In MORE LINKS you will find setting up the new dbt Semantic Layer and testing with DBeaver

{ MORE LINKS }



NEWS

Manage your big data needs with HDInsight on AKS | 4 min | Balaji Sankaran | Cloud | Microsoft Blog

Microsoft is launching a public preview of HDInsight on the Azure Kubernetes Service (AKS), a rearchitected cloud-native big data service with Apache Spark, Apache Flink and Trino workloads. It offers seamless integration with Azure analytics services like Power BI and Azure Data Factory with easy use and robust security. It simplifies library management and resource handling while promoting cost-effective analytics setups.

In MORE LINKS you will find Docker with Neo4j, LangChain, and Ollama launches new GenAI stack for developers

{ MORE LINKS }



PODCASTS

Versioning and MLOps for Generative AI | 38 min | AI | Ben Lorica and Yucheng Low | The Data Exchange Podcast

In this talk, Yucheng Low addresses the challenges of managing large-scale machine learning assets and the need for version control, emphasizing their platform's collaborative versioning system for diverse data types and open-source integration. The discussion also touches on data deletion challenges and the value of flexibility and openness in data formats.


In MORE LINKS you will listen to about Do LLMs Make Ethical Choices

{ MORE LINKS }



CONFS, EVENTS, AND MEETUPS

Introduction to GenAI: How to get the most for your business from the latest AI revolution | Webinar | 9th November 2023

Join our upcoming webinar, where we dive into Language Model Models (LLMs) and GenAI. Discover the new way of interacting with LLM models, gain insights into the current LLM landscape, explore their novel possibilities, learn about GenAI’s branches and modalities, and address potential challenges. We’ll also showcase practical use cases, making this a must-attend event for AI enthusiasts and professionals.

________________________

Have any interesting content to share in the DATA Pill newsletter?

? Join us on GitHub

? Dig previous editions of DataPill?


Adam from the GetInData | Part of Xebia

要查看或添加评论,请登录

社区洞察

其他会员也浏览了