
Predictive Maintenance - A look into The Machine Learning Side Of Things

Keyanoush Razavidinani

Trying to be the dumbest person in the room. #ai #digitaltransformation #machinelearning #strategy #datascience

Published: November 5, 2021

Managing a machine learning project can be daunting, time-consuming, and even career-ending. Managers and engineers alike have to answer many complicated questions along the way. In this article, I'll give managers who have been tasked with a machine learning project a basic understanding of what an ML project entails.

In our last article - How to Develop a Predictive Maintenance Model - The Data Side of things - we looked at data science, how to clean the data, find features and indicators, and use different models to process the data along the way.

This time we look at machine learning models: when machine learning is a viable option, what kinds of problems the models are best suited for, and what type of data they need to work correctly.

Introduction - How Machine Learning Works

Machine learning maps input data to output data. Intuitively, we describe this 'mapping' with a formula:

Y = f(X)

Where Y is the output, X is the input data, and f is the function that maps the input data to the output data. Machine learning is excellent for solving problems that are repeatable and have defined inputs and outputs.

The machine learning model creates the function that maps the input data to the output data. Some models, like a decision tree or logistic regression, provide an explainable function between the input and the output. Other models, like deep neural networks or large ensembles, are black boxes where we can't easily see how the model maps input to output. The machine learning model learns from training data how to map the input data to the output; separate test data is then used to check how well the learned mapping works on examples the model has not seen. If we could describe the relationship between input and output data with a handful of if/else statements, we wouldn't need a machine learning model.

Machine learning is imperfect, and creating this function always comes with an error and a tradeoff between accuracy, speed, and computational time.

Y = f(X) + e

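Here e is the irreducible error: the part of the output that no model can explain from the inputs, for example because of sensor noise. We always get this error with machine learning models, no matter how good the model is.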
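To make the error term concrete, here is a minimal sketch in plain Python. Everything in it is an assumption chosen for illustration: the "true" relationship (slope 2, intercept 1), the noise level of 0.5, and the helper name fit_line. The sketch generates noisy data, learns f with an ordinary least-squares line fit, and shows that the leftover error stays close to the noise variance even though the fitted line is nearly perfect.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Assumed "true" relationship Y = 2 * X + 1, plus random noise e (std. dev. 0.5).
rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [i / 10 for i in range(200)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 0.5) for x in xs]

# The "learning" step: estimate f from the data.
slope, intercept = fit_line(xs, ys)

# The error that remains after fitting (mean squared residual).
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
mse = sum(r * r for r in residuals) / len(residuals)

print(f"learned f(X) = {slope:.2f} * X + {intercept:.2f}")
print(f"remaining mean squared error: {mse:.3f}")
```

The learned slope and intercept land close to the assumed values, but the mean squared error does not shrink toward zero. It stays near the variance of the noise (0.5 squared, so about 0.25), which is the e in the formula above.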

Machine learning models perform differently on different kinds of data sets. There needs to be harmony between the machine learning model and the data. That's why machine learning engineers are often involved in the whole project, from data acquisition and data processing to testing, integrating, and deploying the machine learning model.

Project managers and executives should know that the hardware, software, data acquisition, data science, and machine learning models are deeply tangled in machine learning projects. Splitting these tasks into separate business units makes the information exchange unnecessarily difficult and could inhibit the project's success.

The basic questions

The project lead and engineer need to ask a few basic questions about the project. It makes sense to bring in a consultant specializing in machine learning to help answer these questions.

What's the goal of collecting the data?
What's the size, quality, and nature of the data?
How much computational time do we have?

There aren't many standard problems; thus, no standard solutions. Companies have different machines, different ways of collecting data, and different goals on what to do with the data.

Using a cheat sheet can help classify the goal of the machine learning project and provide a framework of what kind of data the company needs to collect.

Source: https://blogs.sas.com/content/subconsciousmusings/2020/12/09/machine-learning-algorithm-use/

Like any cheat sheet, it's a simplification. This article aims to provide the reader with an overview of what models exist, what data they need, and their best use.

It helps managers follow the engineers' thought-process and have some references to understand how the ML engineer works.

What's the goal of collecting the data?

There are two basic ways that machine learning helps companies - increase revenue or decrease costs.

Companies collect data to answer questions that come up during regular business operations, evaluate customers' behavior and trends, and make predictions of future outcomes.


Source: Author

For example, the goal of collecting data could be to improve the maintenance of machines, detect manufacturing patterns, predict customer retention rates, improve delivery routes, or automate recurrent tasks.

If the goal is unclear, the data that the company collects could lack significance and make the whole data collection process useless.

What's the size, quality, and nature of the data?

The data could be in the form of video or picture, text, alpha-numerical data, or time-series data.

Machine learning generally requires a large quantity of data. For a PoC, a few gigabytes of data could be sufficient, but for a production-ready state, we need hundreds of gigabytes, terabytes, or in some cases even petabytes of data.

The manager's goal with machine learning should always be to create a production-ready model. A PoC (proof of concept) is a first step and a simplified version of the final project. It requires limited data and processing work to generate first results, but moving from the PoC to production and the full scope of the data requires much more data and work. The basic framework to apply machine learning should be in place before starting with a PoC.

You could have petabytes of data, but if the quality of the data is bad, it could make the data useless for machine learning applications.

Data that have good quality show patterns (features) that help distinguish the properties of a problem.

The phrase is quite abstract, so let's help with an example.

For a machine learning model to learn the difference between an apple and an orange, the texture of the surface is a good feature. We know that most apples have a smooth surface, while most oranges have a bumpy surface. Weight, by contrast, is a bad feature: the weight of an apple or an orange varies strongly depending on the variety, so the two ranges overlap. Another good feature could be the color. Most oranges are orange, and most apples are not orange.

Good quality data means that the data needs good features that describe what we're trying to solve. In our little example, we provide our model input data like color and texture and expect the output "orange" or "apple."
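Our little example can be sketched in a few lines of plain Python. The eight fruits below are made-up values for illustration, not measurements, and the helper best_split_accuracy is a hypothetical name. For each feature, the helper tries every possible cut-off value and reports how well the best single cut-off separates apples from oranges.

```python
# Hypothetical toy data: (label, bumpy_surface, orange_color, weight_in_grams)
fruits = [
    ("apple", 0, 0, 150),
    ("apple", 0, 0, 190),
    ("apple", 0, 0, 120),
    ("apple", 0, 0, 170),
    ("orange", 1, 1, 140),
    ("orange", 1, 1, 200),
    ("orange", 1, 1, 130),
    ("orange", 1, 1, 180),
]


def best_split_accuracy(rows, feature_index):
    """Accuracy of the best single-threshold rule on one feature.

    For every observed value t, try the rule "orange if value >= t" and its
    inverse, and keep the highest share of correctly labeled fruits.
    """
    best = 0.0
    for _, *features in rows:
        threshold = features[feature_index]
        correct = sum(
            (row[1 + feature_index] >= threshold) == (row[0] == "orange")
            for row in rows
        )
        accuracy = correct / len(rows)
        best = max(best, accuracy, 1.0 - accuracy)  # 1 - accuracy = inverse rule
    return best


texture_accuracy = best_split_accuracy(fruits, 0)
color_accuracy = best_split_accuracy(fruits, 1)
weight_accuracy = best_split_accuracy(fruits, 2)

print("texture:", texture_accuracy)
print("color:  ", color_accuracy)
print("weight: ", weight_accuracy)
```

On this toy data, texture and color each separate the two fruits perfectly, while the best cut-off on weight is only slightly better than guessing. That is what we mean by a good and a bad feature.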

We discussed feature-extraction in our previous article - How to Develop a Predictive Maintenance Model - The Data Side of things. If the problem has no descriptive features describing its behavior over time, machine learning can't map the input to the output data.

Often, it's more complicated than in our little apple and orange example. I'll discuss feature engineering in an upcoming article.

How much computational time do we have?

Two basic properties of machine learning are speed and accuracy.

For real-time decision-making, choose machine learning models that trade some accuracy for speed. Autonomous driving is a good example, because decisions have to be made rapidly. Predictive maintenance is the opposite case: the deterioration of a machine is not a real-time issue.

Ask yourself: how mission-critical is the task at hand, and how quickly does the answer have to arrive?

If you need quick on-site decision-making, you should opt for faster and typically less accurate machine learning models like Naive Bayes or Linear SVM. On the other hand, if real-time decision-making is not an issue, as in predictive maintenance, you can opt for a typically more accurate but more compute-intensive model like Random Forest or a Neural Network.
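The trade-off can be written down as a simple selection rule: among the models that are fast enough for the task, take the most accurate one. The sketch below is only an illustration. The latency and accuracy figures are placeholders I made up, not benchmark results, and the names Candidate and pick_model are hypothetical. In a real project, these numbers come from measuring each model on your own data and hardware.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    latency_ms: float  # placeholder: time needed for one prediction
    accuracy: float    # placeholder: accuracy on validation data


def pick_model(candidates, latency_budget_ms):
    """Return the most accurate candidate within the latency budget, or None."""
    feasible = [c for c in candidates if c.latency_ms <= latency_budget_ms]
    if not feasible:
        return None
    return max(feasible, key=lambda c: c.accuracy)


# Placeholder figures for illustration only.
candidates = [
    Candidate("naive_bayes", latency_ms=0.2, accuracy=0.86),
    Candidate("linear_svm", latency_ms=0.5, accuracy=0.89),
    Candidate("random_forest", latency_ms=8.0, accuracy=0.93),
    Candidate("neural_network", latency_ms=25.0, accuracy=0.95),
]

# Real-time task: the answer must arrive within 1 ms.
realtime_choice = pick_model(candidates, latency_budget_ms=1.0)

# Predictive maintenance: a full second per prediction is acceptable.
offline_choice = pick_model(candidates, latency_budget_ms=1000.0)

print("real-time:", realtime_choice.name)
print("offline:  ", offline_choice.name)
```

With these placeholder numbers, the tight budget leaves only the two fast models to choose from, while the relaxed budget allows the slowest and most accurate one. If no model fits the budget, the function returns None, which is a signal to revisit either the budget or the hardware.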

Summary

We discussed the fundamental questions that any project manager and engineer involved in a machine learning project should ask before investing large sums.

Machine learning is not a one-size-fits-all solution, even though many companies and the industry depict it that way. For example, machine learning is not so great for problems that require a creative approach, or where the same input can lead to different outputs that depend on human interpretation.

The basic idea of machine learning is to map inputs to outputs. For this mapping to be successful, a company needs to do the groundwork. This applies especially to the data that the company collects: a machine learning consultant or engineer should determine whether the data is meaningful enough and contains enough features for machine learning models.

Before starting with a machine learning project, the people involved should ask themselves a few basic questions discussed in this article. The first proof of concept should determine if the basic framework of the company is good enough to deploy machine learning throughout the whole business.

