DATA Pill #021 - serverless Lock-in, real-time AI and a lot from the open-source giants

Hi!

Do you really need MLOps?

Thankfully, a new, freshly extracted DATA Pill is already waiting for you, with answers to this and many more questions.


ARTICLES

Concerned about Serverless Lock-in? Consider Patterns! | 12 min | Cloud | Gregor Hohpe | The Architect Elevator Blog

Some lock-in to the cloud may be unavoidable, but the risk drops considerably if you introduce good architecture patterns as an abstraction layer. The author describes this concept clearly here.


Enabling real-time AI with Streaming Ingestion in Vertex AI | 7 min read | AI & ML | Erwin Huizenga & Kaz Sato | Google Cloud Blog

It’s difficult to set up the infrastructure needed to support high-throughput updates and low-latency retrieval of data.

Starting this month, the Vertex AI Matching Engine and Feature Store will support real-time Streaming Ingestion as Preview features. With Streaming Ingestion for Matching Engine, a fully managed vector database for vector similarity search, items in an index are updated continuously and reflected in the similarity search results immediately.

This blog post covers how these new features can improve predictions and enable near real-time use cases, such as recommendations, content personalization and cybersecurity monitoring.

BTW, last week I recommended the ebook about Building a Feature Store (with an introduction to Vertex AI). That was a coincidence, but… right on time ;)


Upgrading Data Warehouse Infrastructure at Airbnb | 10 min read | Cloud | Ronnie Zhu, Edgar Rodriguez, Jason Xu, Gustavo Torres, Kerim Oktay & Xu Zhang | Airbnb Tech Blog

Airbnb’s experience with upgrading their Data Warehouse infrastructure to Spark and Iceberg.

In our data ingestion framework, we found that we could take advantage of Iceberg’s flexibility to define multiple partition specs to consolidate ingested data over time. Ingested tables write new data with an hourly granularity (ds/hr), and a daily automated process compresses the files on a daily partition (ds), without losing the hourly granularity, which later can be applied to queries as a residual filter.


No, you don’t need MLOps | 5 min read | MLOps | Lak Lakshmanan | Personal Blog

The title is a bit provocative, but the post offers concrete keep-it-simple alternatives to complex MLOps solutions.

It also gives a rule of thumb for when the complexity is actually necessary, so it's not all hype.


Evolution of Streaming Pipelines in Lyft’s Marketplace | 6 min read | Streaming | Rakesh Kumar | Lyft Engineering Blog

Lyft’s journey of evolving its streaming platform and pipelines to scale better and support new use cases. Each iteration scaled better, but also exposed shortcomings.

{ MORE LINKS }


TUTORIALS

LoadBalancer Services using Kubernetes in Docker (kind) | 11 min read | Kubernetes | Owain Williams | Groupon Blog

A tutorial on creating a multi-node kind cluster that uses extraPortMappings to forward requests from your host to an NGINX ingress controller, which routes each request to the appropriate service based on its path, rewriting the target so the service can recognise the request.


NEWS

OpenTest: McDonald’s debut into open-source software | 4 min | Adrian Theodorescu | McDonald’s Technical Blog

A short insight into why McDonald's open-sourced OpenTest.

The open sourcing for us led to another significant benefit by reducing the unnecessary friction involved in getting the software onto people’s machines. No more approvals required and no more dependencies on other teams for the actual binaries and updates.

PODCAST

What Data Visualization Means for Data Literacy | 41 min | Data Visualization | Andy Cotgreave | DataFramed

  • how data visualization increases organizational data literacy
  • the best practices for visual storytelling


Synthetic data technologies can enable more capable and ethical AI | 40 min | AI | Host: Ben Lorica, Guest: Yashar Behzadi | The Data Exchange

Yashar Behzadi is the CEO & Founder of Synthesis AI, a startup that uses synthetic data technologies to enable teams to build AI applications, as well as gaming and metaverse applications.


CONFS AND MEETUPS

Data Driven Innovation | 12 October | Online

The third edition of the Big Data, AI, ML and Data Science conference organized by Computerworld Magazine.


Building Machine Learning pipelines with Kedro and Vertex AI on GCP | 25 October | MLOps | Michał Bryś | Free Webinar

Michał Bryś, Senior ML Engineer and Technical Product Owner, will cover:

  • Why we need a pipeline for machine learning models
  • Kedro, an open-source Python framework for creating reproducible, maintainable and modular data science code
  • Q&A session

{ MORE LINKS }


___________________________


See you next week

Adam Kawa from GetInData
