DATA Pill #089 - Looker, dbt, real-time streaming, Java and Kubernetes relationship
Hi,
How much do you know about Looker?
We’re pretty sure it’s still not enough, so we have a dose of it for you this week.
That and a lot more is waiting for you.
ARTICLES
A developer’s second brain: Reducing complexity through partnership with AI | 7 min | AI | Eirini Kalliamvakou | GitHub Engineering Blog
The article talks about how AI is changing the way software developers work. It's based on interviews with 25 developers by GitHub Next, aiming to get a real-world perspective on AI's role in their daily tasks and how it shapes their jobs. This feedback helps in figuring out the future of AI in software development.
Data Modelling in Looker: PDT vs DBT | 15 min | Data Analytics | Anna Wnuczko | GetInData | Part of Xebia Blog
An accurate data model is essential for companies that want to become more data-driven. When it comes to data modelling in Looker, there are two approaches: use Looker PDTs (persistent derived tables) or model the data in dbt first. Which approach is better, and when? Read about both ways of modelling data, applied to the same use case.
The Scary Thing About Automating Deploys | 14 min | DevOps | Sean McIlroy | Slack Engineering Blog
The article explains Slack's deployment strategy: quick, frequent deploys that allow fast iteration on user feedback, reduce errors, and stay manageable at scale. It also covers the transition to automated deploys with ReleaseBot, including anomaly detection, monitoring, and the benefits and challenges of automation.
In MORE LINKS you will read about: Warm up the relationship between Java and Kubernetes and Real-time data processing using Change Data Capture and event-driven architecture
TUTORIALS
Unit testing with dbt | 7 min | Data Engineering | Matthieu Bonneviot | Teads Engineering
The article discusses Teads' shift from a Spark and Parquet-based BI system to a cutting-edge dbt and BigQuery framework. It highlights the author's journey in migrating a pipeline from the former system to the latter, emphasizing the critical role and methodology of unit testing within dbt.
Building real-time data views with Streamhouse | 7 min | Data streaming | Alexey Novakov | Ververica Blog
This blog post explores building a real-time data view with Apache Paimon on Streamhouse, focusing on efficient data analytics pipelines and low-latency solutions for data engineers. It shows the use of Apache Flink for real-time processing and Apache Paimon for cost-effective storage, demonstrating their combined power in modern data management.
In MORE LINKS you will read about: Towards AGI: Making LLMs better at Reasoning and Design a data mesh on AWS that reflects the envisioned organization
TOOLS
FOCUS™ | FinOps
The FinOps Open Cost and Usage Specification (FOCUS™) standardizes cloud cost data, making it easier for companies to understand and manage their cloud expenses. It converts complex cloud billing data into a straightforward, standardized format. This simplification aids consistent reporting across multiple cloud vendors and reduces the complexity of financial operations like allocation, chargeback, budgeting, and forecasting.
SQL Assistant: Text-to-SQL Application in Streamlit | 7 min | Data Science | Romy Mendez | Personal Blog
This article explores the application of Vanna.ai, a Python library specifically designed for training a model capable of processing natural language questions and generating SQL queries as responses. The implementation will be integrated into a Streamlit application, creating a chatbot that facilitates posing questions and explains the returned queries.
PODCAST
AI Roundtable | 51 min | AI | Kyle Polich, Pramit Choudhary, Frank Bell | Data Skeptic Podcast
Listen to a talk where Kyle, Pramit, and Frank discuss the impacts LLMs and machine learning have had on the industry in the past year and where things may go in the current year.
CONFS, EVENTS AND MEETUPS
Real-Time Data to Drive Business Growth and Innovation in 2024 | Data Streaming | Webinar | 31st January
During this webinar, you will explore practical examples and success stories that highlight the benefits realized by top companies through their implementation of data streaming strategies.
Big Data Technology Warsaw 2024 | On-site and Online event | 10-11th April
The Big Data Technology Warsaw Summit returns on April 10-11, 2024. This event is a prime gathering for data enthusiasts, experts, and innovators from across the globe. Take advantage of this opportunity to broaden your knowledge, connect with industry leaders, and shape your data strategy for success. Remember, the special promotional price is available for a limited time only!
________________________
Have any interesting content to share in the DATA Pill newsletter?
Join us on GitHub
Dig into previous editions of DATA Pill
Adam from GetInData | Part of Xebia