ODDS: Getting Ready for AI.
Patrick Bangert
At a recent discussion session at The Data Standard (https://datastandard.io/), we asked the question “Are you ready for AI?” Josh Odmark and I hosted the session and introduced the topic. Here are the highlights of how to get ready for AI and improve the ODDS of your project delivering a great result.
First, the organizational framework must be set up to create a supporting structure. This starts with support and buy-in from top-level management who are clear on the objectives. A lack of management support usually means the project will either be abandoned when the going gets tough or never be deployed. Furthermore, the project needs all the usual bells and whistles of a project: a timeline, to-do items, stakeholders, a budget, and, most importantly, project management with enough spare time to see it through.
Second, the project should start with due diligence. In this context, that means making sure that the current situation is known, the problem is clearly defined, and the form of the solution is clear. In using the word “clear” twice in the prior sentence, I am speaking as a mathematician who wants an objective, numerical definition. The solution, in particular, needs to be defined as precisely as possible, including objective numerical success criteria that can be assessed at any stage and keep the project on track. An essential part is the desired accuracy of the model. Projects are often terminated either too early (“this will do”) or too late (“let’s see if we can do even better”), wasting resources in either case.
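To make this concrete, a success criterion can be written down as a small, machine-checkable test before any modeling starts. The sketch below is only an illustration: it assumes a regression model judged by mean absolute error against a threshold agreed with the stakeholders, and both the metric and the threshold value are placeholder assumptions, not recommendations.

```python
# A minimal sketch of an objective, numerical success criterion.
# The metric (mean absolute error) and the threshold are illustrative
# assumptions agreed upon with stakeholders during due diligence.

import numpy as np

TARGET_MAE = 2.5  # in the units of the target variable; a placeholder value


def meets_success_criterion(y_true, y_pred, target_mae=TARGET_MAE):
    """Return True if the model is accurate enough for the project to stop."""
    mae = np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred)))
    return mae <= target_mae


# Example: check candidate predictions against the pre-agreed threshold.
y_true = [10.0, 12.5, 9.8, 11.2]
y_pred = [10.4, 12.0, 10.1, 11.5]
print(meets_success_criterion(y_true, y_pred))  # True if MAE <= 2.5
```

Writing the criterion down this way gives every stage of the project the same yardstick, so the decision to stop or continue is no longer a matter of taste.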
Third, the data must be available. Data science, artificial intelligence, machine learning, or whatever you wish to call it, relies on having as much data as possible. Some data may already be in your possession. Other data could be generated, acquired, or bought. Some data cannot be obtained at all. What data is needed, or desired, must be determined at the very start of the project, and this must be done by domain experts, not the data scientists! Sometimes you have less data than you really want, and the question is whether it is good enough. Unfortunately, this cannot be answered until you try to build the model, so all stakeholders should be aware that the project is taking a risk.
Fourth, the data must be scrubbed. Most of the time the term is “clean”, but I have used “scrub” here for two reasons: of course, I want the acronym of this checklist to work nicely, but also because the process involves more than just cleaning. Before we clean the data, we must ask whether it is relevant to the question we are asking and whether it is representative of the underlying problem. Often, these two difficult steps are skipped, which leads to models that do model the situation but do not solve the problem that was posed. Note that relevance and representativeness are technical terms from statistics that can be made quite precise. Only then should the data be cleaned, which means removing outliers and bad data, filling in missing values, possibly smoothing over spikes, and so on.
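As an illustration of what the cleaning part of scrubbing can look like in practice, here is a minimal pandas sketch that removes outliers, fills in missing values, and smooths spikes. The column name, the three-standard-deviation outlier rule, and the rolling-window size are assumptions chosen purely for illustration; judging relevance and representativeness remains a job for domain experts and cannot be automated away.

```python
# A minimal sketch of the "scrub" step: outlier removal, missing-value
# filling, and spike smoothing. The column name, the 3-sigma outlier rule,
# and the rolling window size are illustrative assumptions.

import numpy as np
import pandas as pd


def scrub(df: pd.DataFrame, column: str = "sensor_value") -> pd.DataFrame:
    df = df.copy()

    # Mark values more than 3 standard deviations from the mean as bad data.
    mean, std = df[column].mean(), df[column].std()
    outliers = (df[column] - mean).abs() > 3 * std
    df.loc[outliers, column] = np.nan

    # Fill in missing values by interpolating between neighboring points.
    df[column] = df[column].interpolate(limit_direction="both")

    # Smooth over remaining spikes with a short rolling median.
    df[column] = df[column].rolling(window=5, center=True, min_periods=1).median()

    return df


# Example usage on a small synthetic series with a gap and a spike.
raw = pd.DataFrame({"sensor_value": [1.0, 1.1, np.nan, 1.2, 50.0, 1.3, 1.2]})
print(scrub(raw))
```

The point is not these particular rules but that every cleaning decision is explicit, repeatable, and open to review by the domain experts.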
Now you are ready to do AI.
Having done AI, you face the toughest task yet: convincing people to use your model in real life. Change management is the process of getting all the users to change their behavior from whatever they were doing to the new workflow that includes your model and its solution to the problem. This change may involve many elements that have nothing to do with AI, or even with the problem itself. Take industrial predictive maintenance as an example. The problem is that equipment sometimes fails catastrophically, which is why operators want an accurate failure forecast. The change is that maintenance personnel, having acted in a fire-fighting capacity all along, must now learn to plan their work and their spare-parts orders. This is a significant change in the way the company operates and may require additional software tools to realize, none of which has anything to do with making an accurate failure forecast.
Want some help with improving your ODDS? Feel free to reach out. Once you are ready to do AI, please consider giving the Brightics AI Accelerator a spin: https://trial.xcelerator.ai/
SPE Technical Director - Data Science and Engineering Analytics | Former Chief Data Officer (CDO) at Shell | Global Data, Digitalization and AI4Energy Leader
Patrick - I think you have touched the key points. Agree that data can be the “Achilles heel” of any company’s digitalization aspirations, especially the oil and gas majors. Accessing data and ensuring data reliability are key blockers, besides data accountability, a data-centric culture, global data standards for scaling solutions, and a lack of compliance culture. It all has to work together with innovative technology for automation and algorithms to succeed. Keen to hear comments from others. Regards, Sushma