Feature Store for ML - enabling AI adoption at scale

Sasirekha Cota

AI Strategist, Generative AI, Enterprise Architect, Transformation Consultant, Content Writer

Published: August 19, 2021

We all know that AI models are ONLY as good as the data they are trained on. Data and application silos are one of the main challenges in implementing and scaling AI solutions in the Enterprise. Feature stores, both the concept and the associated products and capabilities coming into the market, are aimed at addressing this issue and enabling AI adoption at scale.

A machine learning model maps a set of data inputs, known as features, to a target (predicted) variable. A feature is an individual measurable property or characteristic of a phenomenon. In a relational dataset, features appear as columns and are typically referred to as attributes or variables.

In machine learning, the performance and quality of a model depend on the choice of algorithm, the quantity of data available, and the quality of the dataset: accuracy, reliability, completeness, consistency, granularity etc. While it sounds counter-intuitive, it is well established that using all the features (especially in their raw form, even if they meet all the basic quality requirements) will not result in the best prediction models. Feature selection and feature engineering can be used as levers for improving and/or optimizing model performance.

Feature selection involves limiting the data inputs used for training (say, by eliminating redundant, irrelevant or contradictory attributes in an available dataset) with the aim of increasing accuracy, reducing cost and producing an interpretable model (which is becoming more and more important with the focus on Explainable AI).
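As a rough illustration of eliminating irrelevant and redundant attributes, here is a minimal sketch in plain Python. The column names, the sample values and the 0.95 correlation threshold are all assumptions made up for this example; real projects usually rely on a library and on more careful criteria.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_features(columns, threshold=0.95):
    """Return the names of the columns worth keeping.

    columns maps a feature name to its list of values.
    The threshold is an illustrative assumption, not a recommended value.
    """
    kept = []
    for name, values in columns.items():
        if len(set(values)) <= 1:
            continue  # irrelevant: a constant column carries no information
        if any(abs(pearson(values, columns[k])) >= threshold for k in kept):
            continue  # redundant: almost a copy of a column already kept
        kept.append(name)
    return kept


# Hypothetical toy dataset
columns = {
    "age": [23, 35, 47, 51, 62],
    "age_in_months": [276, 420, 564, 612, 744],  # redundant with "age"
    "country_code": [1, 1, 1, 1, 1],             # constant, hence irrelevant
    "income": [30, 80, 45, 90, 60],
}

selected = select_features(columns)
print(selected)
```

Here only "age" and "income" survive: "age_in_months" is dropped because it duplicates "age", and "country_code" is dropped because it never varies.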

Feature engineering is about transforming existing features to create new features, using techniques such as imputation, extracting date/time parts, handling outliers, grouping, feature split etc. (One good reference is https://towardsdatascience.com/feature-engineering-for-machine-learning-3a5e293a5114).
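To make these techniques concrete, here is a small sketch in plain Python that applies imputation, date extraction, outlier capping, grouping and feature split to a few records. The records, the field names, the spend cap of 1000 and the "high"/"low" bands are hypothetical values chosen only for illustration.

```python
from datetime import date
from statistics import median

# Hypothetical raw records; None marks a missing value
rows = [
    {"name": "Ada Lovelace", "signup": "2021-08-19", "spend": 120.0},
    {"name": "Alan Turing", "signup": "2021-01-04", "spend": None},
    {"name": "Grace Hopper", "signup": "2020-12-25", "spend": 9000.0},
    {"name": "Edsger Dijkstra", "signup": "2021-03-14", "spend": 80.0},
]


def engineer(rows, spend_cap=1000.0):
    """Turn raw records into engineered features (illustrative only)."""
    known_spend = [r["spend"] for r in rows if r["spend"] is not None]
    fill_value = median(known_spend)  # imputation: fill gaps with the median

    features = []
    for r in rows:
        spend = fill_value if r["spend"] is None else r["spend"]
        spend = min(spend, spend_cap)           # outlier handling: cap extremes
        signup = date.fromisoformat(r["signup"])
        first, last = r["name"].split(" ", 1)   # feature split: one column into two
        features.append({
            "first_name": first,
            "last_name": last,
            "signup_month": signup.month,        # date extraction
            "signup_weekday": signup.weekday(),  # Monday is 0
            "spend": spend,
            "spend_band": "high" if spend >= 100 else "low",  # grouping
        })
    return features


features = engineer(rows)
for f in features:
    print(f)
```

Each step is simple on its own; the effort the article describes lies in finding which of these transformations actually help a given model.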

Arriving at a single new feature that is effective is a long, resource-intensive process that involves trial and error. What if these new features, and the transformation pipelines that are painstakingly created, were not limited to a specific project or model, but were made available to the entire data scientist population of the Enterprise? A "Feature Store" is aimed at doing exactly that, by publishing a catalog of available features. The Feature Store is the data management layer of ML. It is considered one of the missing pieces of the puzzle and a better answer to the issues arising out of the micro-services architecture in place.

The feature store is a data warehouse of features for machine learning (ML) with the data scientists as the end-user. It is typically implemented as a dual-database:

1. Online Feature store - Row-oriented database (returning a single row of features called a "feature vector") to be used as input for an online model for prediction. Mostly implemented as key-value stores to be able to provide millisecond latency.

2. Offline Feature store - Provides large batches of features used to create training/test datasets.
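The dual-database idea can be sketched with two in-memory structures: a dictionary standing in for the key-value online store and an append-only list standing in for the offline store. This is a toy model under stated assumptions; the class name, method names and sample data are hypothetical and do not correspond to the API of any real feature store product.

```python
class ToyFeatureStore:
    """Illustrative in-memory stand-in for a dual-database feature store."""

    def __init__(self):
        self._offline = []  # full history of feature rows, used for training
        self._online = {}   # entity_id -> latest feature row, used for serving

    def ingest(self, entity_id, timestamp, features):
        """Write one feature row to both stores."""
        self._offline.append(
            {"entity_id": entity_id, "timestamp": timestamp, **features}
        )
        latest = self._online.get(entity_id)
        if latest is None or timestamp >= latest["timestamp"]:
            self._online[entity_id] = {"timestamp": timestamp, **features}

    def get_online_features(self, entity_id, names):
        """Return a single feature vector for one entity (key lookup)."""
        row = self._online[entity_id]
        return [row[name] for name in names]

    def get_training_batch(self, names, start, end):
        """Return all historical rows with start <= timestamp < end."""
        return [
            [row[name] for name in names]
            for row in self._offline
            if start <= row["timestamp"] < end
        ]


store = ToyFeatureStore()
store.ingest("user_1", 1, {"trips_7d": 3, "avg_fare": 12.5})
store.ingest("user_2", 1, {"trips_7d": 0, "avg_fare": 0.0})
store.ingest("user_1", 2, {"trips_7d": 5, "avg_fare": 14.0})

names = ["trips_7d", "avg_fare"]
vector = store.get_online_features("user_1", names)        # latest values only
batch = store.get_training_batch(names, start=1, end=3)    # full history
print(vector)
print(batch)
```

Because both stores are fed by the same `ingest` call, the values served online and the values used for training come from one definition of each feature, which is the property the two-store design is meant to preserve.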

In other words, a feature store is a machine-learning-specific data system that stores and manages features, runs data pipelines that transform raw data into feature values, and serves features for both training and production models.

AI-powered products that are limited to the data available within its application are like jellyfish: its autonomic system makes it functional, but it lacks a brain. However, you can evolve your models with data enriched "brains" through the help of a feature store. -https://www.kdnuggets.com/2021/06/ai-with-feature-store.html

Uber's Michelangelo platform, which aimed to democratize machine learning and make scaling AI easy, seems to be the starting point of the feature store concept. Today Palette is Michelangelo's feature store: it is centralized (providing a single source of truth for features), catalogued (with features grouped into perspectives) and reduces training/serving skew (as the features used for training and serving are the same). In effect, feature stores are the Uberization of AI implementation in Enterprises.

Tecton Feature Store, Kaskada, Feast (an open source feature store), Hopsworks, Amazon SageMaker Feature Store (Dec 2020), Splice Machine Feature Store (Jan 2021), Databricks Feature Store, co-designed with MLOps (May 2021), and Google Vertex AI Feature Store (May 2021) are some of the options companies can explore right now.

Clearly the concept of the Feature Store has picked up and is expected to grow, with better products and more "features". As Enterprises move from experimentation to exploitation of AI, the feature store concept brings a host of advantages, including increased model accuracy, faster development, smoother deployment, better collaboration and improved compliance.
