Machine Learning for Humans: Overview
[Image: Neural Network Flow Diagram]


Introduction:

Machine Learning has been increasingly in the news lately. This series of papers will describe some basic concepts and how those concepts may be applied to business. Machine intelligence was referenced by Rene Descartes in his "Discourse on Method" in 1637. Another significant paper was published in 1950 by Alan Turing ("Computing Machinery and Intelligence"), in which he describes an "imitation game" involving a machine interacting with a human (and from which the movie "The Imitation Game" got its name). In short, Machine Learning refers to the science of getting computers to learn without being explicitly programmed (Arthur Samuel, 1959).

Work has continued along this path ever since, and with the success of widely publicized events such as the IBM Deep Blue vs. Kasparov chess matches (1996 and 1997) and IBM Watson's Jeopardy! win (2011), among others, it has moved into the mainstream. In addition, other technologies such as high-speed networks and specialized hardware have made it available to the masses.

These papers will describe some of the more common categories under Artificial Intelligence, such as Machine Learning, Neural Networks and Deep Learning, citing basic examples and discussing how they may (or may not) be suitable for a particular project.

A word about words.

The term 'Artificial Intelligence' has many different definitions. I will be using it as the broad umbrella term, with others such as Machine Learning, Neural Networks and Deep Learning as further subsets. Each of these terms, such as Neural Networks, will have multiple variants underneath it (RNN, CNN, GAN, etc.). The goal of this series is not to discuss each variant in detail, but to discuss the categories, their general approach and how they apply to business issues.

Don't forget about the Math.

Keep in mind, however, that all of these techniques are rooted in sound mathematical principles and make extensive use of linear algebra, probability, statistics, and calculus. Blindly grabbing a technique and coding away is ill-advised. A high-level understanding of these approaches is recommended in order to build a suitable model. In addition, sound business understanding is needed to use these models effectively. As you will see, it is easy to create a model, but it is much more difficult to create a proper model, and to determine whether it makes business sense to use it.

The Data Foundation:

Data is extremely important, as it is the raw input a data scientist uses to create and evaluate a model. There have been estimates that up to 80% of a data scientist's time is spent working through data issues such as obtaining, cleaning, filtering, organizing and understanding the data. Though data is available everywhere, it is most likely not in the format or definition that you need for your project. It will also have inherent errors that need to be acknowledged and minimized as you work through the modeling.
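As an illustration of the kind of routine cleanup this involves, here is a minimal sketch using pandas. The table, column names and values are made up for illustration; they are not from any real dataset.

```python
# A minimal data-cleaning sketch with pandas; the data is hypothetical.
import pandas as pd

raw = pd.DataFrame({
    "customer_id": [101, 102, 102, 103, 104],
    "balance":     ["1200", "850", "850", "n/a", "2300"],   # numbers stored as text
    "student":     ["Yes", "no", "no", "YES", None],        # inconsistent labels
})

clean = (
    raw.drop_duplicates(subset="customer_id")                # remove repeated customers
       .assign(
           balance=lambda d: pd.to_numeric(d["balance"], errors="coerce"),  # text -> numbers
           student=lambda d: d["student"].str.lower().map({"yes": True, "no": False}),
       )
       .dropna()                                             # drop rows we could not repair
)

print(clean)
```

Even this toy example shows the pattern: before any modeling can start, duplicates, mis-typed fields and inconsistent labels all have to be dealt with, and some records will simply be unusable.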

Approach to Modeling:

As with many other things in life, there are always compromises. Such is the case with modeling. A data scientist has many models at their disposal, with new techniques being introduced continually. Some models assume a linear relationship between the inputs and outputs. These can work well, if your data is, well, linear. In reality, the vast majority of data is not linear; however, in certain situations we can approximate reality with a linear model. One advantage of a linear model is that it tends to be easier to explain: "If I increase advertising by 10%, I can expect a 7% increase in sales." Other models work by creating groupings in the data, so they are shaped more by the inherent distribution of the data than by the assumption of a linear fit.
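To make the "easier to explain" point concrete, here is a minimal sketch of fitting a linear model. The advertising and sales figures, and the use of scikit-learn, are illustrative assumptions rather than anything from the example above.

```python
# A minimal linear-model sketch using scikit-learn; all figures are made up.
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical monthly advertising spend and sales, both in thousands of dollars
ad_spend = np.array([[10], [15], [20], [25], [30], [35]])
sales = np.array([120, 150, 185, 210, 245, 270])

model = LinearRegression()
model.fit(ad_spend, sales)

# The fitted slope is the explainable part: "each extra $1k of advertising
# is associated with roughly this many extra $1k of sales."
print(f"slope: {model.coef_[0]:.1f}, intercept: {model.intercept_:.1f}")
print(f"predicted sales at $40k of advertising: {model.predict([[40]])[0]:.1f}")
```

A grouping-based model (a decision tree or a clustering method, say) would instead carve the data into regions, which can fit non-linear patterns better but is usually harder to summarize in a single sentence for a business audience.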

Not all models are appropriate for all situations. It is the responsibility of the data scientist to understand the pros and cons when analyzing and proposing a modeling approach.

From the business perspective, careful thought needs to be applied to determine how the model will be used and what type of error is being minimized. Let's say a company wants to predict whether a customer will default on their credit card based on their balance and student status. Based on historical data of 10,000 customers, 333 defaulted (a rate of 3.33%). A model built by the company achieves an overall error rate of 2.75%, and it correctly predicts 99.8% of the non-defaulters. This looks like a pretty solid model!

With a second look, we see some of the cracks. For instance, a naive "model" that simply predicts no one will default is wrong only 3.33% of the time, so 2.75% is not much of an improvement. Also, if we look at the people who did default, the model missed roughly 75% of them, correctly flagging only about one in four, which is most likely not acceptable to the company. Instead of focusing on the business's most important error rate (the prediction of the defaulters), the model focused on the overall error rate (predictions for defaulters and non-defaulters combined). A business person may be more inclined to accept a larger error rate on misclassifying the non-defaulters in exchange for more accuracy on the defaulters (as this group is more costly to the company). The arithmetic behind these rates is sketched below.
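Here is that arithmetic as a small sketch. The individual counts are reconstructed approximately from the rates quoted above (10,000 customers, 333 defaulters, about 99.8% accuracy on non-defaulters, about 2.75% overall error), so treat them as illustrative rather than exact.

```python
# A small sketch of why overall accuracy can hide a poor model.
# Counts are reconstructed (approximately) from the rates in the text.
true_negatives  = 9648   # non-defaulters correctly predicted as non-defaulters
false_positives = 19     # non-defaulters incorrectly flagged as defaulters
false_negatives = 256    # defaulters the model missed (the costly error)
true_positives  = 77     # defaulters correctly flagged

total = true_negatives + false_positives + false_negatives + true_positives

overall_error = (false_positives + false_negatives) / total
non_defaulter_accuracy = true_negatives / (true_negatives + false_positives)
defaulter_recall = true_positives / (true_positives + false_negatives)

print(f"overall error rate:     {overall_error:.2%}")           # about 2.75%
print(f"non-defaulter accuracy: {non_defaulter_accuracy:.2%}")  # about 99.8%
print(f"defaulter recall:       {defaulter_recall:.2%}")        # about 23% -- the crack
```

The headline numbers (2.75% error, 99.8% on non-defaulters) look strong, but the metric the business cares most about, how many actual defaulters get caught, is buried in the last line.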

One prediction is clear.

The use of Machine Learning is accelerating. Examples are everywhere, from identifying faces on Facebook, to serving up search results in Google, to predictive typing on keyboards, to FaceID on Apple devices. The list goes on and on. The next paper will focus on Linear Regression before we tackle more involved topics such as Neural Networks and Deep Learning.
