Scaling and Accelerating your AI Journey - An Operational Example
An operational model built on the capabilities of a Lakehouse platform such as Databricks


Following up on my previous blog (Choosing the right Data & AI Technology for your Next Gen Capabilities - Five Guiding Principles), I want to share an example of how these principles can accelerate your AI journey, focusing specifically on:

1. Standardizing the data layer & transformations (the data foundation) to scale AI by democratizing persona-specific AI tools.

2. Governing centrally the models produced by different tools, approving or rejecting them while maintaining data lineage & related artifacts.

3. Productionalizing models (expediting the last-mile journey) through continuous & efficient operations to accelerate embedded intelligence everywhere.


Important Perspective:

People are a powerful asset. An organization truly scales its innovation through its workforce. Business and technical priorities can change, but inspiring people to rally around common company and individual goals, with proper empowerment, is a priceless investment in the company’s future and its intellectual, business and human capital.


1. Standardize (to scale) the foundational data layer for broader empowerment of data scientist (DS) & citizen data scientist (CDS) tools:

In a recent MIT report, 40% of respondents cited training and upskilling staff to use their Data & AI platforms as the number-one difficulty they face with them.

Trying to upskill everyone in every business unit to use a single AI tool is an impossible job. The domain knowledge and the tools are still new and complex, requiring specialized expertise. An alternative is to divide and conquer by democratizing selected (CDS) tools on the platform. We have seen this before (e.g., what Tableau did for democratizing dashboard building to the masses). These tools can range from UI-friendly & AutoML offerings to heavily coded notebook experiences.

But how do you avoid the catch-22 of a plethora of tools and models? You balance it with a standardized foundational data layer. By applying standardization to the lowest common denominator - the foundational data layer and its transformations (at minimum) - organizations can avoid repeating the negative effects of previous, similar data experiences, such as the proliferation of multiple versions of the truth and non-consolidatable data sources and pipelines across analysts' and end users' BI tools.
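As a minimal sketch of what that standardization can look like in practice (the table names, columns and cleaning rules below are hypothetical assumptions, not a prescribed design), every team writes to and reads from the same governed Delta tables through one shared curation routine rather than maintaining private copies and pipelines:

```python
# Illustrative sketch of a standardized foundational layer (hypothetical names).
# Raw data lands in a shared "bronze" table and one curation function publishes
# the agreed-upon "silver" version of the truth for every downstream tool.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

def publish_curated_customers(bronze_table: str, silver_table: str) -> None:
    """Apply the agreed-upon cleaning rules once, in one place."""
    df = (
        spark.read.table(bronze_table)
        .dropDuplicates(["customer_id"])
        .filter(F.col("customer_id").isNotNull())
        .withColumn("signup_date", F.to_date("signup_date"))
    )
    # Writing to one governed Delta table keeps a single version of the truth
    # that every DS/CDS tool reads from.
    df.write.format("delta").mode("overwrite").saveAsTable(silver_table)

# Usage (hypothetical catalog.schema.table names):
# publish_curated_customers("main.raw.customers_bronze", "main.curated.customers_silver")
```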

Standardizing here means facilitating a common Data & AI culture while taking a “Data-Centric AI” approach. It covers ingestion, prep, transformations, workflows, jobs, lineage, auditing, monitoring and more. The result is portable (data) tech assets (not tech debt), with interoperable "utility scaffold" tools and skill sets that transfer across teams and resource rotation. This allows the democratization of data science tools without sacrificing standardization of central data access and approach. Ideally, you combine this with standardizing on a model registry & tracking system (e.g., Databricks MLflow) within the deployed MLOps architecture of choice.
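To make the registry point concrete, here is a minimal sketch of logging and registering a model with MLflow. The experiment path, training data and registered-model name are hypothetical, and it assumes a Databricks workspace with a Unity Catalog-backed registry; elsewhere you would point MLflow at your own tracking and registry servers.

```python
# Minimal MLflow tracking + registry sketch (hypothetical names and data).
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=42)

mlflow.set_registry_uri("databricks-uc")         # assumed UC-backed registry
mlflow.set_experiment("/Shared/churn-baseline")  # assumed experiment path

with mlflow.start_run():
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X, y)
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    # Registering every candidate under one shared, governed name gives all
    # teams the same tracked lineage from data to model version.
    mlflow.sklearn.log_model(
        model,
        artifact_path="model",
        registered_model_name="main.models.churn_baseline",  # assumed name
    )
```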

The result? A scaled-up effect on the Data & AI journey, with a growing number of models and data science projects. In particular, the low-hanging-fruit projects that need “good enough” models with less scrutiny of model metrics. Note that many of these models and projects may not pass the governance SOPs (the next principle), but they will ignite Data & AI literacy and jumpstart the Data & AI culture.


2. Govern (to systemize) the models produced by different tools while maintaining data lineage & related artifacts:

The same technology chosen for data layer standardization has to allow comprehensive governance for systemizing and regulating in an efficient (non-tyrannical) way. This includes discoverability of Data & AI artifacts, access control, end-to-end lineage and auditing. Governing the associated AI serving compute resources, including their cost & performance, is a plus. It should also cover full Life Cycle Management (LCM) of the products produced & consumed, including tracking the value KPIs they impact.

To remove historical bottlenecks with centralization (which can be pseudo-centralized in a mesh architecture as well), the technology needs to facilitate a Data Scientist (DS) governance body or office that coaches, approves and rejects the models produced by the scaled-up effect of the first principle. This office of selected DS gives the stamp of approval - according to established standard operating procedures (SOPs) - to the models and artifacts produced, while leveraging the data standardization for fast reproducibility, efficacy and auditing. More importantly, it allows these hard-to-hire, highly skilled, expensive resources (Data Scientists) to focus on important or prioritized projects & critical innovations while having a secondary multiplier effect on the CDS teams across the organization.

An example? The leading example to date is Databricks Unity Catalog, which addresses this principle from day one across both Data & AI assets.
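As a minimal, illustrative sketch (the catalog, schema, model and group names are assumptions), the governance office's day-to-day actions, such as granting audited access to the curated layer and marking an approved model version, could look roughly like this with Unity Catalog and the MLflow registry:

```python
# Illustrative governance sketch (hypothetical catalog/schema/model/group names).
from mlflow import MlflowClient
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Discoverable, audited access to the standardized data layer: the governance
# office grants read access on the curated schema to the CDS teams' group.
spark.sql("GRANT SELECT ON SCHEMA main.curated TO `cds-analysts`")

# Model approval: after review against the SOPs, the DS governance body marks
# the approved version with an alias that downstream jobs consume.
client = MlflowClient(registry_uri="databricks-uc")  # assumed UC-backed registry
client.set_registered_model_alias(
    name="main.models.churn_baseline",  # assumed three-level UC model name
    alias="champion",
    version=3,  # the version that passed review (hypothetical)
)
```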


3. Productionalize (to accelerate) the last-mile journey of the models through continuous & efficient operations, resulting in embedded intelligence everywhere:

Better known as the famous “last mile“, productionalization here means the automated production of 100s to 1000s of models across different intelligent experiences, their maintenance (monitoring, updating and retraining), and tracking the cost, performance and impact of serving these models - all at scale and in a streamlined, repeatable manner. Take Databricks again as an example: it works well with CI/CD, offers MLOps Stacks as a blueprint productionalization approach, provides lakehouse monitoring and auditing capabilities, and automates the model life cycle, including end-to-end changes to data streams.
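To make the last-mile mechanics concrete, here is a minimal sketch of a scheduled batch-scoring step (the model name, alias, feature columns and tables are hypothetical). Because the job loads the model by its registry alias, promoting a newly approved version automatically flows into production without touching this code:

```python
# Minimal batch-scoring sketch for a scheduled job (hypothetical names).
import mlflow.pyfunc
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Loading by alias means the job always serves whichever version the
# governance office last approved as "champion" in the registry.
model = mlflow.pyfunc.load_model("models:/main.models.churn_baseline@champion")

feature_cols = ["tenure_months", "monthly_spend"]  # assumed model inputs
pdf = (
    spark.read.table("main.curated.customers_silver")
    .select("customer_id", *feature_cols)
    .toPandas()
)
pdf["churn_score"] = model.predict(pdf[feature_cols])

# Persist scores to a governed table so downstream apps and monitoring
# jobs all consume the same, lineage-tracked output.
spark.createDataFrame(pdf).write.format("delta").mode("overwrite") \
    .saveAsTable("main.curated.customer_churn_scores")
```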

Having strong productionalization capabilities within the same data platform technology of choice ensures the efficacy and continuity of the scaled-up (through resources) and accelerated (left to right in the diagram above) AI outcomes. For comparison, think of how we historically went from 1-10 dashboards per organization to 100s and 1000s of dashboards created and used by many users. This is what predictions and intelligent insights will be like.


To wrap up this blog: standardizing, governing and productionalizing on the same technology platform enables an organization to scale up (the number of products produced) and accelerate (the availability of these products for consumption) while establishing a pervasive Data & AI culture. Adapt these three principles to fit your culture and accelerate your transformative AI journey.

