DATA Pill #003: Apache Airflow at Scale, One-stop MLOps portal and more
Hi everyone,
let’s start the third leg of our DATA marathon.
ARTICLES
Lessons Learned From Running Apache Airflow at Scale | 10 min read | Apache Airflow | Megan Parker | Shopify Blog
Challenges in running Airflow at scale + concrete solutions
One-stop MLOps portal at LinkedIn | 10 min read | MLOps | LinkedIn Blog
To visualize the entire ML lifecycle, an infrastructure is needed to automatically track every step of the machine learning process. We created a data schema to capture the complete, structured, and well-documented information detailing how machine learning models are produced.
Monitoring Large-Scale Apache Flink Applications, Part 1: Concepts & Continuous Monitoring | 12 min read | Apache Flink | Nico Kruber | Ververica Blog
This post introduces various useful metrics that can be set up with proper alerts to inform you about imminent failures and let you monitor cluster and application health and checkpointing progress. It also covers different ways to track latency and observe your application’s throughput for performance monitoring.
Real-time ingestion to Iceberg with Kafka Connect - Apache Iceberg Sink | 11 min read | Apache Iceberg Sink | Grzegorz Liter | GetInData Blog
GetInData created an Apache Iceberg sink that can be deployed on a Kafka Connect instance. The data format consumed by Apache Iceberg has to represent table-like data and its schema, so we used a format created by Debezium for change data capture.
{ MORE LINKS }
____________________
PODCAST
Dataflow Automation | 47 min | The Data Exchange
Jeremiah Lowin, CEO of Prefect, on designing tools that allow teams to build, run, and monitor data pipelines at scale. The episode also discusses data engineering challenges facing data and ML teams today and the implications of looming trends in machine learning and AI.
{ MORE LINKS }
____________________
DATAtube
Things I Wish I Knew When I Started As A Data Engineer | 15 min | Seattle Data Guy
Lessons and advice after 10 years in data. Don't try to learn all technologies at once - it’s gonna get you nowhere.
{ MORE LINKS }
If you have any feedback, please leave a comment below.
I want this newsletter to reach out to our tech community and its needs.
See you tomorrow!
Adam Kawa from GetInData