DATA Pill #089 - Looker, dbt, real-time streaming, Java and Kubernetes relationship


Hi,

How much do you know about Looker?

We’re pretty sure it's still not enough, so we have a dose of it for you this week.

These two and way more are waiting for you.


ARTICLES

A developer’s second brain: Reducing complexity through partnership with AI | 7 min | AI | Eirini Kalliamvakou | GitHub Engineering Blog

The article talks about how AI is changing the way software developers work. It's based on interviews with 25 developers by GitHub Next, aiming to get a real-world perspective on AI's role in their daily tasks and how it shapes their jobs. This feedback helps in figuring out the future of AI in software development.

Data Modelling in Looker: PDT vs DBT | 15 min | Data Analytics | Anna Wnuczko | GetInData | Part of Xebia Blog

An accurate data model is one of the essentials that help companies become more data-driven. When it comes to data modelling in Looker, there are two approaches: use Looker PDTs, or model the data in dbt first. Which approach is better, and when? Read about both ways of modelling data, applied to the same use case.

The Scary Thing About Automating Deploys | 14 min | DevOps | Sean McIlroy | Slack Engineering Blog

The article explains Slack's deployment strategy: quick, frequent updates that keep iteration responsive to users and reduce errors, managed efficiently despite the large scale of changes. It also covers the transition to automated deploys with ReleaseBot, addressing the technical side of deployment management, including anomaly detection and monitoring, as well as the benefits and challenges of automation.

In MORE LINKS you will read about: Warm up the relationship between Java and Kubernetes and Real-time data processing using Change Data Capture and event-driven architecture

{ MORE LINKS }



TUTORIALS

Unit testing with dbt | 7 min | Data Engineering | Matthieu Bonneviot | Teads Engineering

The article discusses Teads' shift from a Spark and Parquet-based BI system to a cutting-edge dbt and BigQuery framework. It highlights the author's journey in migrating a pipeline from the former system to the latter, emphasizing the critical role and methodology of unit testing within dbt.

Building real-time data views with Streamhouse | 7 min | Data streaming | Alexey Novakov | Ververica Blog

This blog post explores building a real-time data view with Apache Paimon on Streamhouse, focusing on efficient data analytics pipelines and low-latency solutions for data engineers. It shows the use of Apache Flink for real-time processing and Apache Paimon for cost-effective storage, demonstrating their combined power in modern data management.

In MORE LINKS you will read about: Towards AGI: Making LLMs better at Reasoning and Design a data mesh on AWS that reflects the envisioned organization

{ MORE LINKS }



TOOLS

FOCUS™ | FinOps

The FinOps Open Cost and Usage Specification (FOCUS™) standardizes cloud cost data, making it easier for companies to understand and manage their cloud expenses. It converts complex cloud billing data into a straightforward, standardized format. This simplification aids consistent reporting across multiple cloud vendors and reduces the complexity of financial operations like allocation, chargeback, budgeting, and forecasting.


SQL Assistant: Text-to-SQL Application in Streamlit | 7 min | Data Science | Romy Mendez | Personal Blog

This article explores Vanna.ai, a Python library designed for training a model that takes natural language questions and generates SQL queries in response. The implementation is integrated into a Streamlit application, creating a chatbot that lets you pose questions and explains the returned queries.



PODCAST

AI Roundtable | 51 min | AI | Kyle Polich, Pramit Choudhary, Frank Bell | Data Skeptic Podcast

Listen to a talk where Kyle, Pramit, and Frank discuss the impacts LLMs and machine learning have had on the industry in the past year and where things may go in the current year.



CONFS, EVENTS AND MEETUPS

Real-Time Data to Drive Business Growth and Innovation in 2024 | Data Streaming | Webinar | 31st January

During this webinar, you will explore practical examples and success stories that highlight the benefits realized by top companies through their implementation of data streaming strategies.


Big Data Technology Warsaw 2024 | On-site and Online event | 10-11th April

The Big Data Technology Warsaw Summit returns on April 10-11, 2024. This event is a prime gathering for data enthusiasts, experts, and innovators from across the globe. Take advantage of this opportunity to broaden your knowledge, connect with industry leaders, and shape your data strategy for success. Remember, the special promotional price is available for a limited time only!

________________________

Have any interesting content to share in the DATA Pill newsletter?

Join us on GitHub

Dig into previous editions of DATA Pill


Adam from GetInData | Part of Xebia
