Machine Learning
ROBERTO PALACIOS
DevOps Engineer | Cloud Computing | AWS Cloud | 2x AWS Certified | Certiprof Cybersecurity Certified |
Thanks to the increased capacity and lower cost of information technologies and sensors, we can produce, store, and transmit more data than ever before in history. In fact, it is estimated that 90% of the data currently available on the planet was created in the last two years, and that around 2.5 quintillion (2.5 × 10¹⁸) bytes are produced every day, following a strongly increasing trend. This data feeds Machine Learning models and is the main driver of the boom this field has experienced in recent years.
Machine Learning offers an efficient way to capture knowledge from the information contained in data, to gradually improve the performance of predictive models, and to make decisions based on that data. It has become a widespread technology, present today in email spam filters, autonomous driving, and voice and image recognition.
Basic Terminology and Notations
In Machine Learning we usually use matrix and vector notation to refer to the data, as follows:
- Each row of the matrix is a sample, observation, or data point.
- Each column is a feature (or attribute) of the observations described in the previous point.
- In the most general case there will also be a column, which we will call the target, label, or response, holding the value we want to predict, as illustrated in the sketch after this list.
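A minimal sketch of this convention, using scikit-learn's built-in Iris data set (any tabular data set would do):

```python
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data    # matrix: each row is a sample, each column a feature
y = iris.target  # vector: the label (target) of each sample

print(X.shape)   # (150, 4) -> 150 observations, 4 features each
print(X[0])      # the 4 feature values of the first observation
print(y[0])      # the label of the first observation
```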
There are specific algorithms whose purpose is to "train" Machine Learning models. These algorithms are fed training data from which the models learn.
Machine Learning algorithms also have certain configuration parameters that are set before training. For example, in decision trees there are parameters such as the maximum tree depth, the number of nodes, or the number of leaves; these parameters are called "hyperparameters".
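As an illustration (a sketch assuming scikit-learn; the values chosen are arbitrary), the hyperparameters of a decision tree are set by us up front, while the splits the tree finds during fitting are what it learns from the data:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# max_depth and max_leaf_nodes are hyperparameters: we choose them up front
model = DecisionTreeClassifier(max_depth=3, max_leaf_nodes=8)

# the split thresholds found during fit() are the parameters learned from the data
model.fit(X, y)
print(model.get_depth())  # never exceeds the max_depth we set
```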
We call "generalization" the ability of the model to make predictions using new data.
Types of Machine Learning
The types of Machine Learning that will be addressed in this series are:
- Supervised learning
- Unsupervised learning
- Deep Learning
We will explore and study these three types, focusing in particular on a class of deep learning techniques known as "reinforcement learning".
Supervised learning
This refers to a type of Machine Learning model that is trained on a set of examples for which the output values are already known. The models learn from these known results and adjust their internal parameters to fit the input data. Once the model is properly trained, and its internal parameters are consistent with the inputs and outputs of the training set, it will be able to make good predictions on new data it has not processed before.
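A minimal supervised-learning sketch, assuming scikit-learn and its built-in Iris data set: the model is trained on labelled examples and then evaluated on data it has never seen.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)         # learn from examples with known outputs
print(model.score(X_test, y_test))  # accuracy on previously unseen data
```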
Unsupervised learning
In unsupervised learning we deal with unlabelled data whose structure is unknown. The objective is to extract meaningful information, without reference to any known output variable, by exploring the structure of that unlabelled data.
There are two main categories: clustering and dimensionality reduction.
1. Clustering:
Clustering is an exploratory data analysis technique used to organize information into meaningful groups without prior knowledge of its structure. Each group is a set of similar objects that differ from the objects in other groups. The goal is to obtain groups whose members share similar characteristics.
An example application of this type of algorithm is segmenting consumers according to their purchasing habits, in order to run effective and "personalized" marketing campaigns.
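A small clustering sketch with k-means (scikit-learn); the "customers" and their two purchasing features below are invented purely for illustration:

```python
import numpy as np
from sklearn.cluster import KMeans

# each row is a customer: [purchases per month, average basket value]
customers = np.array([[2, 15], [3, 18], [20, 90], [22, 85], [10, 40], [11, 45]])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(customers)
print(labels)  # the group assigned to each customer
```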
2. Dimensionality reduction:
It is common to work with data in which each observation comes with a large number of features, in other words, with high dimensionality. This is a challenge for the processing capacity and computational performance of Machine Learning algorithms, and dimensionality reduction is one of the techniques used to mitigate it.
Dimensionality reduction works by finding correlations between features, which imply that there is redundant information, since some features can be partially explained by others (for example, there may be linear dependence between them). These techniques remove "noise" from the data (which can also degrade the model's behavior) and compress the data into a smaller subspace.
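A short dimensionality-reduction sketch with PCA (scikit-learn), compressing the four Iris features into a two-dimensional subspace:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)      # 4 features per observation

pca = PCA(n_components=2)              # keep only 2 components
X_reduced = pca.fit_transform(X)

print(X.shape, "->", X_reduced.shape)  # (150, 4) -> (150, 2)
print(pca.explained_variance_ratio_)   # variance captured by each component
```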
Deep Learning
Deep Learning is a subfield of Machine Learning that uses a hierarchical structure of artificial neural networks, built in a way loosely inspired by the neuronal structure of the human brain, with nodes connected like a spider web. This architecture makes it possible to approach data analysis in a non-linear way.
The first layer of the neural network takes the raw data as input, processes it, extracts information, and passes it to the next layer as output. This process is repeated in the subsequent layers: each layer processes the information provided by the previous one, and so on, until the data reaches the final layer, where the prediction is produced.
This prediction is compared with the known result and, by working backwards through the network (backpropagation), the model learns which factors lead to correct outputs.
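A toy sketch of this layered flow in plain NumPy. The weights here are random, purely to show how data passes from layer to layer; in a real network they would be learned through training:

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, n_out):
    """One dense layer with random weights and a non-linear activation."""
    w = rng.normal(size=(x.shape[0], n_out))
    return np.tanh(x @ w)

x = rng.normal(size=4)     # raw input features
h1 = layer(x, 8)           # first hidden layer processes the raw input
h2 = layer(h1, 8)          # second layer processes the first layer's output
prediction = layer(h2, 1)  # final layer produces the prediction
print(prediction)
```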
Reinforcement learning
Reinforcement learning is one of the most important branches of deep learning. The goal is to build a model with an agent that improves its performance based on the reward obtained from the environment with each interaction it performs. The reward is a measure of how appropriate an action was for achieving a certain goal. The agent uses this reward to adjust its future behavior, aiming to obtain the maximum reward.
A common example is a chess engine, where the agent decides among a series of possible actions depending on the layout of the board (which is the state of the environment), and the reward is received according to the result of the game.
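A bare-bones sketch of this reward loop, using a tiny epsilon-greedy "bandit" agent rather than a full chess-playing system; the three actions and their reward probabilities are made up for illustration:

```python
import random

true_rewards = [0.2, 0.5, 0.8]  # unknown expected reward of each action
estimates = [0.0, 0.0, 0.0]     # the agent's current estimate per action
counts = [0, 0, 0]

for step in range(1000):
    # epsilon-greedy: mostly exploit the best-known action, sometimes explore
    if random.random() < 0.1:
        action = random.randrange(3)
    else:
        action = estimates.index(max(estimates))
    reward = 1.0 if random.random() < true_rewards[action] else 0.0
    counts[action] += 1
    # the reward nudges the agent's estimate, adjusting its future behaviour
    estimates[action] += (reward - estimates[action]) / counts[action]

print(estimates)  # the agent learns that the third action pays off most
```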
Preprocessing
This is one of the most important steps in any Machine Learning application. Data usually comes in formats that are not optimal (or are even unsuitable) for the model to process. In these cases, data preprocessing is a mandatory task.
Many algorithms require the features to be on the same scale (for example, in the range [0, 1]) to perform optimally, which is usually achieved by applying normalization or standardization techniques.
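For example (a sketch assuming scikit-learn), MinMaxScaler maps each feature to the [0, 1] range and StandardScaler rescales each feature to zero mean and unit variance:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 400.0]])

print(MinMaxScaler().fit_transform(X))    # every column now lies in [0, 1]
print(StandardScaler().fit_transform(X))  # every column: mean 0, std 1
```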
We may also find that some of the selected features are correlated with each other and therefore redundant for extracting meaningful information. In that case we will have to use dimensionality reduction techniques to compress the features into a lower-dimensional subspace.
Finally, we will randomly split our original data set into training and test subsets.
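A typical way to do this split, assuming scikit-learn:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)  # 70% training, 30% testing

print(X_train.shape, X_test.shape)         # (105, 4) (45, 4)
```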
Summary
In this article we have given a broad-brush picture of what Machine Learning means: its nature, purpose, and applications.
We have also learned some basic notation and terminology, and the different kinds of Machine Learning algorithms:
- Supervised learning, with classification and regression techniques.
- Unsupervised learning, with clustering and dimensionality reduction.
- Reinforcement learning, in which the agent learns from the environment.
- Deep learning and its artificial neural networks.
Finally, we introduced the typical methodology for building Machine Learning models and described its main tasks:
- Preprocessing.
- Training and testing.
- Selection of the model.
- Evaluation.
As discussed at the beginning of the article, this is the first in a series and is intended to serve as a general introduction. The aim is for the series to be a stimulating journey as you are shown how to apply different and powerful techniques.
Given the technical nature of the series, concepts from calculus, linear algebra, statistics, and Python will appear, as they are needed to understand the main ideas and how the algorithms work. But do not worry if you do not have specific training in these areas, as we will take a gentle approach to all of them.