Data and artificial intelligence (4th part)

Data is at the heart of the proper functioning of machine learning models, deep learning systems, LLMs, RAG pipelines, and so on. No model, none, can understand our world without first going through a training phase.

Some models can learn on their own from the data they are given; others need humans to label the data beforehand. Either way, it's invariable: an AI model is nothing without the data it learns from.

So it's easy to see that the quality of what the model learns depends on the quality of the data it learns from. In one sentence, we've said it all! And that's where the problem appears.

If I feed my model poor-quality data, it will predict or generate poor-quality results! It's not that difficult to understand.

So how do you go about it? In fact, everything is already in place; nothing is new. We just need to apply the best practices of data governance. Yes, indeed: deploying AI tools in production without data governance is as dangerous as driving without ever having passed your driving test!

Data governance has three facets: knowledge (i.e. the data catalog), the quality of the data used, and finally its compliance.

So whether the data feeds an AI or a dashboard, the stakes are the same.

First and foremost, knowledge. If you don't know which data feeds your AI models, you've got it all wrong. Or, to be more precise, you run the risk of using unsuitable data. So the first step is to map and catalog the data used by your models. Graph modeling is often used to connect the data, the algorithms that use it, and the people in charge.
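
As a minimal sketch of such a catalog graph, here is what it might look like with the networkx library; the dataset, model, and owner names are purely illustrative assumptions.

```python
import networkx as nx

# Minimal data-catalog graph: datasets, the models that consume them,
# and the people accountable for each node (all names hypothetical).
catalog = nx.DiGraph()

# Datasets and their owners
catalog.add_node("customer_transactions", kind="dataset", owner="Alice")
catalog.add_node("support_tickets", kind="dataset", owner="Bob")

# Models and their stewards
catalog.add_node("churn_model", kind="model", owner="Carol")

# Edges capture lineage: which dataset feeds which model
catalog.add_edge("customer_transactions", "churn_model", relation="feeds")
catalog.add_edge("support_tickets", "churn_model", relation="feeds")

# Impact analysis: if a dataset's quality degrades, which models are affected?
affected = list(catalog.successors("customer_transactions"))
print(affected)  # ['churn_model']
```

The point of the graph structure is exactly this kind of traversal: from any dataset, you can immediately see the models it feeds and the people to notify.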

The second step is quality. Second indeed, because how can you measure the quality of data you haven't first referenced? So measure, evaluate, quantify the non-quality. Just because we're used to hearing at the coffee machine that a given dataset is wrong doesn't mean it really is. And if it is, in what proportion? Is it still usable? You can't improve what you haven't measured. Once it's measured, look for the root causes of the non-quality: there's no point in correcting the data stock if you haven't plugged the leak first! At this stage, we assess whether the data can be used to feed algorithms, and we inform users of the actual state of its quality.
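
To make "quantify the non-quality" concrete, here is a small pandas sketch; the table, column names, and validity rules are assumptions chosen for illustration, not a prescribed standard.

```python
import pandas as pd

# Illustrative customer table with deliberate defects.
df = pd.DataFrame({
    "customer_id": [1, 2, 3, 4, None],
    "email": ["a@x.com", None, "not-an-email", "d@x.com", "e@x.com"],
    "age": [34, -5, 41, 27, 230],
})

# Measure non-quality instead of guessing at it over coffee.
completeness = df.notna().mean()              # share of non-missing values per column
valid_email = df["email"].str.contains("@", na=False).mean()
valid_age = df["age"].between(0, 120).mean()  # plausible age range

print(completeness)
print(f"valid emails: {valid_email:.0%}, plausible ages: {valid_age:.0%}")
```

With numbers like these in hand, you can answer "in what proportion?" and decide whether the dataset is still usable for a given model.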

And third, compliance. Doesn't it shock you to feed an algorithm with data you have no right to use? Whether for GDPR compliance, for ethical reasons, or for AI Act compliance, the data used by AI must be compliant, with no loopholes.
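
One way to make "no loopholes" concrete is a gate that blocks any dataset lacking a documented legal basis from reaching a production model. This is a hypothetical sketch: the DatasetRecord fields and the fit_for_training rules are assumptions meant to illustrate the idea, not a legal checklist.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical compliance record attached to each catalogued dataset.
@dataclass
class DatasetRecord:
    name: str
    legal_basis: Optional[str]  # e.g. "consent", "contract", or None if undocumented
    contains_pii: bool
    anonymized: bool

def fit_for_training(ds: DatasetRecord) -> bool:
    """Gate a dataset before it may feed a production model."""
    if ds.legal_basis is None:
        return False  # no documented right to use the data
    if ds.contains_pii and not ds.anonymized:
        return False  # personal data must be anonymized first
    return True

datasets = [
    DatasetRecord("customer_transactions", "contract", contains_pii=True, anonymized=True),
    DatasetRecord("scraped_profiles", None, contains_pii=True, anonymized=False),
]
for ds in datasets:
    print(ds.name, "->", "OK" if fit_for_training(ds) else "blocked")
```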

In short, prior to any production launch, the data used by artificial intelligence must be catalogued, its quality measured, and its compliance validated.

I did say "prior to any production launch" deliberately. It is acceptable for data scientists to run some tests in sandbox mode on anonymized data, just "to see". But be warned: before going into production, the data governance behind the AI must be rigorously scrutinized.

From the point of view of corporate responsibilities, there is considerable overlap between the role of the person in charge of data governance and that of the person in charge of Artificial Intelligence governance. It is logical, then, that in some organizations the same person takes on both responsibilities.

#data #ai #aigovernance #dataquality #datagovernance
