DATA Pill #053 - model segmentation by Meta, MLFlow models on BigQuery, Databricks Notebooks with Variable Explorer, and more…

Hi,


Get ready to dive headfirst into some of the hottest topics in the data world.

CI/CD in dbt Cloud, the Kedro-snowflake plugin, and Meta's image segmentation model.

Whether you're a certified data geek or just starting to dip your toes into the data pool, this edition has something for everyone.

So sit back, relax, and take your dose of DATA Pill no. 53.



ARTICLES

CI/CD in dbt Cloud with GitHub Actions: Automating multiple environments deployment | 10 min | dbt | Lucas Ortiz | Xebia Blog

Do you remember Lucas’s text from DATA Pill #49? Back then, he left the deployment pipeline setup for later. Good news for those who were waiting: his guide to setting up an automated deployment pipeline that continuously runs integration tests and delivers changes (CI/CD) is available now. It covers multiple environments and a CI build that is triggered as soon as a pull request is opened in the code repository. Check out how to set up your environment so you can release changes as often as you want and make your job easier.


How Zoom implemented streaming log ingestion and efficient GDPR deletes using Apache Hudi on Amazon EMR | 6 min | Streaming | Sekar Srinivasan, Amit Kumar Agrawal, Chandra Dhandapani, Viral Shah | AWS Blog

This article shares what Zoom and the AWS Data Lab team accomplished together to solve critical data pipeline challenges, and how Zoom has since extended the solution to optimize extract, transform, and load (ETL) jobs and resource efficiency.

Telecom’s Big Opportunity in the Data Economy | 5 min | Cloud | Jennifer Belisseni | Snowflake Blog

The article touches on the various opportunities that telecom companies can leverage in this data-driven economy. From insights into customer behavior to identifying new revenue streams, data is an essential tool that telecom companies can use to their advantage. Jennifer also examines the challenges of using this data, given its complexity and volume.


Writing Flink jobs using the Spring dependency injection framework | 13 min | Streaming | GetInData | Part of Xebia Blog

How can the popular Spring dependency injection framework be leveraged to write Flink jobs in a more structured and organized way?

By creating reusable components and defining them within a Spring context, you can easily manage and modify dependencies, leading to more efficient development processes.

In his blog post, Krzysztof explains how Spring can help create high-quality Flink applications and provide invaluable insights to businesses.


Build Your Data Skills with the Data Literacy Trail on Trailhead | 5 min | Data Literacy | Sue Kraemer | Tableau Blog

Have you ever wanted to start building your data literacy? The Data Literacy Trail covers topics such as data preparation, data analysis and data visualization, helping learners understand how to work with data efficiently. It features guided learning pathways, interactive quizzes and real-world scenarios that help you practice your skills.


Since we are talking about analytics, there is an interesting job offer available in that area.

In MORE LINKS you will read about:

  • introducing Segment Anything: Working toward the first foundation model for image segmentation
  • building and deploying MySQL Raft at Meta

{ MORE LINKS }



TUTORIAL

Deploy MLFlow models on BigQuery | 11 min | ML | Marcin Zabłocki | Personal Blog

It can be difficult to deploy machine learning models and make them accessible to other teams, particularly data and business analysts, but this tutorial can help. Marcin shows how to deploy MLflow models to the GCP Cloud Run service so that they can be consumed from BigQuery using SQL.



NEWS

Announcing new Jupyter contributions by AWS to democratize generative AI and scale ML workloads | 3 min | AI | Brian Granger | AWS Blog

New tools for Jupyter users to improve their experience and boost development productivity are available now. These extensions enable you to perform a wide range of development tasks using generative AI models in JupyterLab and Jupyter notebooks. All of these are open-source and can be used anywhere you are running Jupyter.


In MORE LINKS you will read about:

  • releasing Debezium 2.3.0.Alpha1
  • new debugging features for Databricks Notebooks with Variable Explorer

{ MORE LINKS }



DATA TUBE

Run machine learning pipelines on Snowflake using Kedro. MLOPS TUTORIAL | 20 min | MLOps | Marcin Zabłocki | GetInData | Part of Xebia

The next part of the MLOps tutorial series, in which we prove you can run Kedro pipelines… everywhere. Kedro-snowflake is the newest plugin from GetInData | Part of Xebia, and it allows you to run full Kedro pipelines in Snowflake. Thanks to this, you can build your ML pipelines in Kedro and execute them in a scalable Snowflake environment in three simple steps.



CONFS, EVENTS AND MEETUPS

MLOps at Dutch Unicorn Fintech Mollie | 23rd May | Online

In this webinar, Mollie will share their MLOps journey, from initial idea to current use, showcasing their custom platform built around Google's Vertex AI and other tools.

The presentation aims to deepen attendees' understanding of MLOps and provide actionable strategies for improving the model development process and achieving reliable, scalable, and maintainable deployments. It will equip participants with the knowledge to implement MLOps in their own ML projects and organizations.


Cloud Leadership Day | 21st June | Zurich

Join Xebia, Swisscom, Microsoft, Lufthansa Group and others at Cloud Leadership Day, where the brightest minds and industry leaders will converge to explore the latest trends, innovations, and strategies shaping the future of cloud computing. Gain invaluable insights and connect with like-minded professionals who are passionate about leveraging the power of the cloud.

You will dive into topics like:

  • How To Align AI Technology with Your Business Strategy
  • A Journey through FinOps and Multi-Cloud Strategy
  • Breaking the Data-Business Divide: Successful Data Strategy Execution
  • Generative AI in Finance

and many more.

________________________


Have any interesting content to share in the DATA Pill newsletter?

Join us on GitHub

Dig into previous editions of DATA Pill


Adam from GetInData | Part of Xebia
