Applying Machine Learning in your Company: 10 criteria for the best AI solutions


By reading this third part of the blog series, you will get an understanding of what Machine Learning models are.

Reading this article will enable you to communicate competently and at eye level with your data scientist.

Here is what you will learn:

·     Which kinds of prediction methods exist and what benefits they offer.

·     Which criteria help you select the right method.

·     What common terms in the context of machine learning mean.

This is part 3 of 5 of the blog series. This part is itself split into two posts: Part 3-A and Part 3-B.

Read both posts to the end to fully understand all aspects.

PART 3A

Machine Learning Models

In the previous part 2, you gained the basic knowledge about the data itself and how to deal with it.

Applying an appropriate data preparation method yields much better results or makes the application of machine learning methods possible in the first place. 

With this, you are ready for the next step: finding the best machine learning algorithm and applying it in your production environment to predict new data.

Thus, the machine learning model is, to some extent, the actual core of your AI project.

Its predictions will become the foundation for every other process that builds on top of it.

Algorithm and Model – Concrete and abstract

Machine Learning methods are usually described with the help of mathematical terms and formulations.

These represent an abstract description.

The advantage of this approach is a compact and clear representation of a machine learning method that can be further analyzed and improved by applying mathematical tools.

This is precisely what scientists in the research area of machine learning are doing.

This kind of mathematical formulation is often referred to as a model.

Hence, a model is the abstract description of a machine learning method.

The real world, however, requires something more concrete. The concrete realization of the model is achieved by an actual (computer) algorithm or computer program.

Implementing the model as a particular algorithm runs into some uncomfortable truths of reality, such as the finite nature of CPU computation.

Thus, the concrete algorithm will always only be an approximation of the abstract model.
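A small illustration of this gap between model and algorithm, as a sketch under the assumption of Python's standard 64-bit floating-point numbers. The logistic function used here is only an example of a mathematical formulation and is not taken from the article:

```python
import math

# Abstract model: the logistic function sigma(z) = 1 / (1 + exp(-z)).
# Mathematically, its value lies strictly between 0 and 1 for every real z.
def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

# In exact arithmetic 0.1 + 0.2 equals 0.3; with finite-precision floats it does not.
print(0.1 + 0.2 == 0.3)              # False
print(math.isclose(0.1 + 0.2, 0.3))  # True

# Mathematically sigmoid(40) < 1, but the computed float is rounded to exactly 1.0.
print(sigmoid(40.0) == 1.0)          # True
```

The computed values are close enough for practical purposes, which is why the approximation rarely matters in day-to-day work.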

Strictly speaking, one has to differentiate between a model and an algorithm.

However, in practice, this difference is irrelevant.

Therefore, for reasons of simplicity, in the context of this article, you can consider model and algorithm as synonyms.

The Human Factor 

You now know the difference between a model and an algorithm.

However, besides all the theoretical aspects, you must not forget the essential point: to generate the desired business value, everyone involved must keep the overall business goal in mind.

As a consequence, you should always assess your chosen model against this goal.

This sounds easier than it is.

Step 3, finding the right model, usually offers several options that often yield similarly good results.

Because of that, project members might be tempted to choose a model that serves the business goal less well.

This neglect of the business goal usually creeps in quickly through a lack of awareness.

Therefore, ask yourself regularly if the chosen model best serves your business goals. 

Besides these human aspects, the technical aspects are of course also of relevance.

Most of the time your target variable (i.e., the quantity one wishes to predict) is in direct relationship to the business goal.

An elementary criterion, therefore, is which machine learning model best accounts for the target variable.

Classification – the standard

Target variables can be very different.

Your target variable might be a decision, or you might only want to distinguish between two values such as cheap or expensive (price estimation), harmless or dubious (fraud), urgent or insignificant (client request), etc.

In that case, we are talking about classes. Is your target variable a decision? Then, you have to apply a classification model.

In the most straightforward case, classification is about the prediction of two classes.

If there are multiple classes, e.g., red, green, blue, or yellow, it is a little more complicated, but the approach remains essentially the same.
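To make the idea of predicting a class more tangible, here is a minimal sketch of a nearest-centroid classifier. This is just one of many possible classification models, chosen here only because it is short. The function names, the two features, and the training data are made up for illustration:

```python
from collections import defaultdict

def fit_centroids(samples, labels):
    """Learn one mean feature vector (centroid) per class."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in grouped.items()
    }

def predict(centroids, x):
    """Return the class whose centroid is closest to x."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

# Made-up training data: (living area in m^2, number of rooms) -> price class
samples = [(40, 1), (55, 2), (60, 2), (120, 4), (150, 5), (135, 5)]
labels = ["cheap", "cheap", "cheap", "expensive", "expensive", "expensive"]

centroids = fit_centroids(samples, labels)
print(predict(centroids, (50, 2)))   # cheap
print(predict(centroids, (140, 5)))  # expensive
```

The same code handles more than two classes without changes: every additional label in the training data simply gets its own centroid.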

When doing classification, one is usually interested in a definite statement about an event so that a clear decision can be made.

However, please bear in mind that classification algorithms are not a crystal ball, and you have to accept a certain error rate. The keywords here are false positives and false negatives.

Side note:

What are False Positives and False Negatives?

As the name suggests, false positives are data points that are mistakenly predicted as positive although, in reality, they are negative.

Similarly, false negatives are data points that are mistakenly predicted as negative although, in reality, they are positive.

False negatives and false positives both denote a prediction error. True positives and true negatives, on the other hand, are data points correctly predicted as positive and negative, respectively.

Consider the following example:

You want to predict the winner of the next Champions League cup.

Let’s say you are a fan of Bayern Munich and the positive event for you is Bayern Munich wins. 

As a consequence, any other team winning the cup is a negative event. 

You will now become the machine learning algorithm yourself. There are two predictions you can make: the first is “Bayern Munich wins” (positive outcome), and the second is “Bayern Munich loses” (negative outcome).

Now, there are four possible combinations of outcomes:

1.    Your prediction is “Bayern Munich wins,” and Bayern Munich does win the cup. So you correctly predicted the positive outcome, and you have a True Positive.

2.    If you are pessimistic about your favorite club, your prediction might be “Bayern Munich loses” and Bayern Munich does lose the final. So you correctly predicted the negative outcome, and you have a True Negative. 

So, 1 and 2 describe the situations in which your prediction was correct, for the positive and the negative outcome respectively. On the other hand, there are two further possible outcomes:

3.    Your prediction is “Bayern Munich wins” but Bayern Munich loses the final. So you incorrectly predicted the actual result. Since “Bayern Munich wins” is considered the positive outcome, you mistakenly predicted the positive outcome which makes this prediction a False Positive.

4.    Similarly, if you say “Bayern Munich loses” but Bayern Munich does win the cup, you mistakenly predicted the negative outcome, which makes this a False Negative.

The following diagram visualizes all possible outcomes:

Usually, you will predict more than just one data point.

Therefore, for each prediction, you will be at risk of making a False Positive or False Negative error. 

Fundamentally speaking: you will reach your goals best if you make as few False Negative and False Positive predictions as possible.

So, if you only predict True Positives and True Negatives, your prediction is perfect.

However, you should treat such a perfect result as a purely theoretical example. In practice, it is virtually never achieved. If you do encounter it, you have most likely made a mistake somewhere in your implementation.
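To see how the four outcomes add up over several predictions, the following sketch counts them for a handful of finals. The function name `confusion_counts` and the results are made up for illustration:

```python
def confusion_counts(actual, predicted, positive):
    """Count true/false positives and negatives for one positive class."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            # Predicted positive: correct only if the actual outcome is positive.
            counts["TP" if a == positive else "FP"] += 1
        else:
            # Predicted negative: correct only if the actual outcome is negative.
            counts["TN" if a != positive else "FN"] += 1
    return counts

# Made-up results for five finals; "win" is the positive outcome.
actual    = ["win", "lose", "lose", "win",  "lose"]
predicted = ["win", "lose", "win",  "lose", "lose"]

counts = confusion_counts(actual, predicted, positive="win")
print(counts)  # {'TP': 1, 'TN': 2, 'FP': 1, 'FN': 1}
```

Here three of the five predictions are correct (one True Positive and two True Negatives), while one False Positive and one False Negative are the errors.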

Besides a definite class membership, one is often also interested in membership probabilities.

An example is a prediction such as: this image shows a car with 83% probability.

Probabilities lie in the range of 0% to 100%, that is, between 0 and 1. Other values like 2, 3, or 48 do not make any sense.

If probabilities are important, you have to apply a probabilistic classification algorithm, that is, an algorithm that generates probabilities in addition to labels. Classic examples are Naïve Bayes and Logistic Regression.

Probabilistic classification algorithms are a subclass of all classification algorithms. An example of a classification algorithm that is not probabilistic, at least in its standard form, is the Support Vector Machine.
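The difference between a probability and a label can be sketched with the prediction step of Logistic Regression. The weights, bias, and feature values below are hypothetical and hand-picked; a real model would learn its parameters from training data:

```python
import math

def predict_proba(features, weights, bias):
    """Logistic regression: squash a weighted sum into a probability in (0, 1)."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))

def predict_label(features, weights, bias, threshold=0.5):
    """Turn the probability into a hard class label."""
    return "car" if predict_proba(features, weights, bias) >= threshold else "not car"

# Hypothetical, hand-picked parameters (assumptions for illustration only).
weights = [1.2, -0.7]
bias = 0.3
image_features = [1.5, 0.4]

p = predict_proba(image_features, weights, bias)
label = predict_label(image_features, weights, bias)
print(f"{p:.0%} probability of 'car' -> label: {label}")
```

The label alone says "car"; the probability additionally tells you how confident the model is, which is the extra information a purely label-based classifier does not provide.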

Probabilities offer a nice additional benefit: They can be ranked.

Consider an online shop. You want to know how likely a customer is to respond to a special offer.

Based on user behavior, you can use a probabilistic model to calculate the probability that a customer will respond to this offer.

The higher the probability, and thus the relevance, the more likely the customer is to buy or convert.

This ranking can be used to focus your marketing efforts only on the most relevant or most promising customers.

Thus, you will waste significantly less money and time.
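Such a ranking can be sketched in a few lines. The customer names and probabilities are made up and stand in for the output of a probabilistic model; `top_prospects` is a hypothetical helper:

```python
# Hypothetical customers with response probabilities from a probabilistic model.
predicted = {
    "customer_a": 0.12,
    "customer_b": 0.83,
    "customer_c": 0.47,
    "customer_d": 0.91,
    "customer_e": 0.05,
}

def top_prospects(probabilities, k):
    """Return the k customers with the highest predicted probability."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]

print(top_prospects(predicted, k=2))  # ['customer_d', 'customer_b']
```

With a marketing budget for only two customers, the offer would go to the two most promising ones instead of being spread across all five.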

This kind of ranking is also often referred to as Recommendation (a famous example would be the Amazon product recommendation algorithm). 

This was part A of the article. To continue reading, please see part B.
