Applying Machine Learning in your Company: 10 criteria for the best AI solutions

Dr. Thomas Vanck

Machine Learning Expert, Consultant

Published: February 21, 2019

By reading this third part of the blog series, you will get an understanding of what Machine Learning models are.

Reading this article will enable you to communicate competently and at eye level with your data scientists.

Here is what you will learn:

· Which kinds of prediction methods exist and what benefits they offer.

· The criteria for selecting the right method.

· Common terms in the context of machine learning.

This is part 3 of 5. This part is split into two blog posts: Part 3-A and Part 3-B.

Part 1: Machine Learning Projects: 5 Steps to Success!
Part 2: Data are the new oil: 10 essential Machine Learning tricks to maximize exploitation
Part 3 - First Part: 10 criteria for the best Machine Learning models (this part)
Part 3 - Second Part: 10 criteria for the best Machine Learning models
Part 4 & 5: Coming soon! Follow me on LinkedIn so you won’t miss it.

Read both posts, A and B, to the end to fully understand all aspects.

PART 3A

Machine Learning Models

In part 2, you gained basic knowledge about the data itself and how to deal with it.

Applying an appropriate data preparation method yields much better results or makes the application of machine learning methods possible in the first place.

With this, you are ready for the next step, finding the best machine learning algorithm, and applying it in your production environment to predict new data.

Thus, the machine learning model is, to some extent, the actual core of your AI project.

Its predictions will become the foundation for every other process that builds on top of it.

Algorithm and Model – Concrete and abstract

Machine Learning methods are usually described with the help of mathematical terms and formulations.

These represent an abstract description.

The advantage of this approach is a compact and clear representation of a machine learning method that can be further analyzed and improved by applying mathematical tools.

This is precisely what scientists in the research area of machine learning are doing.

This kind of mathematical formulation is often referred to as a model.

Hence, a model is the abstract description of a machine learning method.

The real world, however, requires something more concrete. The concrete realization of the model is achieved by an actual (computer) algorithm or computer program.

Implementing the model as a particular algorithm runs into some uncomfortable truths of reality, for instance the finite precision and finite computing time of a CPU.

Thus, the concrete algorithm will always only be an approximation of the abstract model.
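A small illustration of this gap, as a minimal sketch that is not part of the original article: on paper, adding one tenth ten times gives exactly 1. A computer stores 0.1 as a binary floating-point number, so the straightforward loop ends up slightly off. The variable names are chosen for this example only.

```python
import math

# Abstract model: 0.1 added ten times is exactly 1.
# Concrete algorithm: 0.1 has no exact binary floating-point representation,
# so every addition introduces a tiny rounding error.
naive_total = 0.0
for _ in range(10):
    naive_total += 0.1

# math.fsum keeps track of the rounding errors and returns the
# correctly rounded sum of the same ten values.
careful_total = math.fsum([0.1] * 10)

print(naive_total == 1.0)               # False on IEEE 754 doubles
print(careful_total == 1.0)             # True
print(math.isclose(naive_total, 1.0))   # True: compare with a tolerance instead
```

The error is tiny, which is why the difference between model and algorithm rarely matters in practice. It does explain why numerical code compares results with a tolerance rather than for exact equality.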

Strictly speaking, one has to differentiate between a model and an algorithm.

However, in practice, this difference is irrelevant.

Therefore, for reasons of simplicity, in the context of this article, you can consider model and algorithm as synonyms.

The Human Factor

You now know the difference between model and algorithm.

However, besides all the theoretical aspects, you must not forget the essential aspect: In order to generate the desired business value, it is indispensable that everyone involved keeps the overall business goal in mind.

As a consequence, you should always assess your chosen model concerning this goal.

This sounds easier than it is.

Step 3, finding the right model, usually produces several options that often yield equally good results.

Because the results look alike, project members might be tempted to pick one without checking it against the business goal.

This neglect happens quickly, usually through a lack of awareness, and can lead to the choice of a less suitable model.

Therefore, ask yourself regularly if the chosen model best serves your business goals.

Besides these human aspects, the technical aspects are of course also of relevance.

Most of the time, your target variable (i.e., the quantity you wish to predict) is directly related to the business goal.

An elementary criterion, therefore, is which machine learning model best accounts for the target variable.

Classification – the standard

Target variables can be very different.

Your target variable might be a decision, or you might only want to distinguish between two values such as cheap or expensive (price estimation), harmless or dubious (fraud), urgent or insignificant (client request), etc.

In that case, we are talking about classes. Is your target variable a decision? Then, you have to apply a classification model.

In the most straightforward case, classification is about the prediction of two classes.

If there are multiple classes, e.g., red, green, blue, or yellow, the task is a little more complicated, but the approach remains essentially the same.

When doing classification, one is usually interested in an absolute statement about an event, so that a definitive decision can be made.

However, please bear in mind that classification algorithms are not a crystal ball, and you have to accept a certain error rate. The keywords here are false positives and false negatives.

Side note:

What are False Positives and False Negatives?

As the name suggests, false positives are data points that are mistakenly predicted as positive although, in reality, they are negative.

Similarly, false negatives are data points that are mistakenly predicted as negative although, in reality, they are positive.

False negative and false positive both denote an error regarding the prediction. On the other hand, true positives and true negatives are correctly predicted as positive and negative respectively.

Consider the following example:

You want to predict the winner of the next Champions League cup.

Let’s say you are a fan of Bayern Munich and the positive event for you is Bayern Munich wins.

As a consequence, any other team winning the cup is a negative event.

You now become the machine learning algorithm yourself. There are two predictions you can make: the first is “Bayern Munich wins” (positive outcome) and the second is “Bayern Munich loses” (negative outcome).

Now, there are four possible combinations of outcomes:

1. Your prediction is “Bayern Munich wins,” and Bayern Munich does win the cup. So you correctly predicted the positive outcome, and you have a True Positive.

2. If you are pessimistic about your favorite club, your prediction might be “Bayern Munich loses” and Bayern Munich does lose the final. So you correctly predicted the negative outcome, and you have a True Negative.

So, 1 and 2 describe the situation when your prediction was correct in regard to the negative and positive outcome. On the other hand, there are the following other two possible outcomes:

3. Your prediction is “Bayern Munich wins” but Bayern Munich loses the final. So you incorrectly predicted the actual result. Since “Bayern Munich wins” is considered the positive outcome, you mistakenly predicted the positive outcome which makes this prediction a False Positive.

4. Similarly, if you say “Bayern Munich loses” but Bayern Munich does win the cup, you mistakenly predicted the negative outcome, which makes this a False Negative.

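The following overview summarizes all possible outcomes:

· Prediction “Bayern Munich wins”, Bayern Munich wins: True Positive

· Prediction “Bayern Munich wins”, Bayern Munich loses: False Positive

· Prediction “Bayern Munich loses”, Bayern Munich wins: False Negative

· Prediction “Bayern Munich loses”, Bayern Munich loses: True Negative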

Usually, you will predict more than just one data point.

Therefore, for each prediction, you will be at risk of making a False Positive or False Negative error.

Fundamentally speaking: You will best reach your goals if you produce as few False Negatives and False Positives as possible.

So, when you only predict True Positives and True Negatives, your prediction will be perfect.

However, you should consider such a perfect result a purely theoretical ideal. On real data, it is practically unattainable. If you do encounter it, you have almost certainly made a mistake somewhere in your implementation.
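To make the four outcome types concrete over several predictions, here is a minimal sketch in plain Python. The function name `outcome` and the two lists of predictions and actual results are made up for illustration; they are not data from a real season.

```python
from collections import Counter


def outcome(predicted_win: bool, actual_win: bool) -> str:
    """Name the outcome type for one prediction ("win" is the positive event)."""
    if predicted_win and actual_win:
        return "True Positive"
    if predicted_win and not actual_win:
        return "False Positive"
    if not predicted_win and actual_win:
        return "False Negative"
    return "True Negative"


# Made-up example: six seasons, True means "Bayern Munich wins".
predictions = [True, True, False, False, True, False]
actuals = [True, False, False, True, True, False]

# Count how often each outcome type occurs.
counts = Counter(outcome(p, a) for p, a in zip(predictions, actuals))

for name in ("True Positive", "False Positive", "False Negative", "True Negative"):
    print(name, counts[name])
```

With these made-up lists, two predictions are True Positives, two are True Negatives, one is a False Positive and one is a False Negative. The two errors are exactly the numbers you want to keep as small as possible.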

Besides a definite class membership, one is often also interested in membership probabilities.

For instance, a prediction like “this image shows a car with 83% probability.”

Probabilities lie in the range of 0% to 100% or, equivalently, between 0 and 1. Values outside this range, such as 2, 3, or 48, do not make sense as probabilities.

If probabilities are of importance, you have to apply a probabilistic classification algorithm. This is an algorithm that generates probabilities in addition to labels. Classic examples are Naïve Bayes or Logistic Regression.

Probabilistic classification algorithms are a subclass of all classification algorithms. An example of a classification algorithm that is not probabilistic is the Support Vector Machine, which in its standard form outputs only a class label.
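As a rough sketch of how a probabilistic classifier such as Logistic Regression turns inputs into a probability: it computes a weighted score and squashes it into the range between 0 and 1 with the logistic function. The weights, the bias, and the feature values below are hand-picked assumptions for illustration; a real model would learn them from data.

```python
import math


def logistic(score: float) -> float:
    """Map any real-valued score to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical parameters, hand-picked for this example only.
WEIGHTS = [0.8, -0.5]
BIAS = 0.1


def predict_probability(features):
    """Probability of the positive class for one data point."""
    score = BIAS + sum(w * x for w, x in zip(WEIGHTS, features))
    return logistic(score)


def predict_label(features, threshold=0.5):
    """Turn the probability into a plain label by applying a threshold."""
    return "positive" if predict_probability(features) >= threshold else "negative"


probability = predict_probability([2.0, 1.0])
print(round(probability, 2), predict_label([2.0, 1.0]))
```

The label is derived from the probability, not the other way round. That is why a probabilistic model can always give you a plain decision as well, while a label-only model cannot give you the probability.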

Probabilities offer a nice additional benefit: They can be ranked.

Consider an online shop. You want to know how likely a customer is to respond to a special offer.

Based on user behavior, you can use a probabilistic model to estimate the probability that a customer will respond to this offer.

The higher the probability, the more likely the customer is to buy or convert.

This ranking can be used to focus your marketing efforts only on the most relevant or most promising customers.

Thus, you will waste significantly less money and time.

This kind of ranking is also often referred to as Recommendation (a famous example would be the Amazon product recommendation algorithm).
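The ranking idea from the online shop example can be sketched in a few lines of Python. The customer names and probabilities are made up for illustration, and `top_customers` is a hypothetical helper, not part of any library.

```python
# Made-up response probabilities, as a probabilistic model might produce them.
response_probability = {
    "customer_a": 0.12,
    "customer_b": 0.83,
    "customer_c": 0.47,
    "customer_d": 0.91,
}


def top_customers(probabilities, k):
    """Return the k customers with the highest response probability."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Focus the campaign on the two most promising customers.
targets = top_customers(response_probability, 2)
print(targets)  # ['customer_d', 'customer_b']
```

How many customers to target (the value of `k`) is a business decision, for example the size of the marketing budget, and not something the model decides for you.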

This was part A of the article. To continue reading part B, please click here.
