What is machine learning?
William Dyrland-Marquis
Data Engineer II at GrowthLoop | Software Solutions | Cloud Computing (GCP), Python, Linux, SQL, Cryptography, System Troubleshooting, Generalist, T-shaped
One of the first searches I ever made for machine learning turned up this quote, which I feel sums up the whole thing pretty well:
"Machine learning is a method of data analysis that automates analytical model building." (quoted from https://www.sas.com/en_us/insights/analytics/machine-learning.html)
An alternative description of machine learning that I found is this:
“Field of study that gives computers the capability to learn without being explicitly programmed” (quoted from https://www.geeksforgeeks.org/ml-machine-learning/)
A Brief History of machine learning:
(paraphrased from https://www.doc.ic.ac.uk/~jce317/history-machine-learning.html)
Interest in machine learning and neural networks dates back to the 1940s, but only recently have they become buzzwords in tech once again. In 1943, a paper was written about how neurons in the brain work, and its authors modeled a simple neural network using electrical circuits. The world-famous Turing test was proposed in 1950. The first self-learning program, which played checkers, was written in 1952, and in 1958 and 1959, two neural-network-based programs used for different types of pattern recognition were pioneered.
Probably due to the popularity of the Von Neumann architecture, there was a pause in interest until the 1970s, when it picked back up again.
After that, one year in particular is notable. In 1997, IBM's chess computer Deep Blue beat the reigning world chess champion, Garry Kasparov, which could serve as the first widely known instance of a computer beating a top human expert at a complex problem-solving task.
But the real explosion of interest in machine learning has come about mostly in recent years. This is due to a combination of factors. Large amounts of data, which machine learning needs, have become available. In addition, parallel computing keeps improving, which neural networks can naturally take advantage of. Also, storing large amounts of data is vastly easier and cheaper today than it was only a few decades ago.
How does machine learning work?
In terms of how the models are "automatically" built, there are actually a few different ways. In any of these methods, however, building the predictive model follows roughly the same steps every time:
First, ask a question that you are trying to get the model to answer. Then, collect data related to the question. After that, use an already-chosen algorithm to "train" your model. Next, you are ready to test the model with new data. The results of the model are then analyzed. And finally, the training algorithm is updated based on the performance of the model on the new data.
This cycle repeats until a model is created that approximates reality closely enough to be useful.
Additionally, machine learning can be separated into four general categories.
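The cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "reality" being modeled (y = 2x + 1 plus noise), the helper names (`make_data`, `train`, `mean_squared_error`), the error target, and the choice to double the number of training passes each round are all made up for this example.

```python
import random

def make_data(n, rng):
    # Hypothetical "reality" for this sketch: y = 2x + 1 plus a little noise.
    xs = [rng.uniform(0, 1) for _ in range(n)]
    return [(x, 2 * x + 1 + rng.gauss(0, 0.05)) for x in xs]

def train(data, epochs, lr=0.1):
    # Fit a straight line y = w*x + b with plain gradient descent.
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = sum((w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum((w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def mean_squared_error(model, data):
    w, b = model
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

rng = random.Random(0)           # fixed seed so the run is repeatable
train_set = make_data(80, rng)   # data used to build the model
test_set = make_data(20, rng)    # new data the model has never seen

epochs = 10
target_error = 0.01              # "close enough to be useful" for this sketch
history = []
while True:
    model = train(train_set, epochs)              # train
    error = mean_squared_error(model, test_set)   # test on new data
    history.append((epochs, error))               # analyze the results
    if error <= target_error or epochs >= 10_000:
        break
    epochs *= 2                                   # adjust the training and repeat

for used_epochs, test_error in history:
    print(f"{used_epochs:5d} epochs -> test error {test_error:.4f}")
```

Each pass through the loop is one turn of the cycle: train, test on data the model has not seen, look at the error, then adjust and try again.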
Supervised Learning:
In algorithms that use supervised learning, the data is tagged with "labels" that contain the correct answer for each data point being examined. When the results are analyzed at the end, the output of the model is compared to the labels that already exist on the data. This makes it fairly easy to see how well the model is performing, but it has the drawback that all the data used to build and test the model needs to be labeled, which can be a fairly labor-intensive task.
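As a rough illustration of comparing a model's output with labels that already exist, here is a minimal Python sketch. The pass/fail data, the single numeric feature, and the helper names (`fit_threshold`, `accuracy`) are all invented for this example; the "model" is just a single cut-off value.

```python
# Each example is (feature, label). The labels are what make this "supervised".
train_set = [(1.0, "fail"), (2.0, "fail"), (3.0, "fail"),
             (4.5, "pass"), (5.0, "pass"), (6.5, "pass")]
test_set = [(1.5, "fail"), (3.5, "fail"), (5.5, "pass"), (7.0, "pass")]

def predict(threshold, x):
    return "pass" if x >= threshold else "fail"

def accuracy(threshold, data):
    # Compare each prediction with the label already attached to the data.
    correct = sum(predict(threshold, x) == label for x, label in data)
    return correct / len(data)

def fit_threshold(data):
    # Candidate cut-offs: midpoints between consecutive sorted feature values.
    xs = sorted(x for x, _ in data)
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    return max(candidates, key=lambda t: accuracy(t, data))

threshold = fit_threshold(train_set)
test_accuracy = accuracy(threshold, test_set)
print(threshold, test_accuracy)  # prints 3.75 1.0
```

Note that both the training and the testing step depend on labels, which is exactly where the labeling cost comes from.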
Unsupervised Learning:
In algorithms that use unsupervised learning, the data is not labeled. Instead, the algorithm groups the data into broader categories on its own, based on the similarities it finds between data points. Because the categories are broad and are not defined in advance, this can sometimes result in discovering hidden connections between similar data points in the same category. Caution must be used, however; if the data set is not large or diverse enough, coincidental relationships or unintentional bias can be introduced into the model. The big advantage of unsupervised learning is not needing to label every piece of data being used, along with the possibility of discovering previously unknown relationships.
Semisupervised Learning:
Semisupervised learning is a balance between supervised and unsupervised learning. The data in semisupervised learning is partially labeled and partially unlabeled. The disadvantage is a possibly less accurate model than a fully supervised one, but the advantage is a much lower cost of labeling data.
Reinforcement Learning:
Reinforcement learning involves presenting the problem as a simulation and having the model "play" the simulation, receiving rewards or penalties for its actions and learning from any mistakes it makes. The big advantage here is that you can usually see the progression of your model more clearly, but one disadvantage might be the time and effort required to put the problem into simulation form. There is, however, a large advantage for things that are easily translatable to simulations (such as driving a car), because it then becomes possible to generate valid data sets from simple knowledge and logic of the real world.
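To make the idea of "playing" a simulation concrete, here is a minimal sketch using tabular Q-learning, one common reinforcement learning algorithm (not necessarily the one any particular system uses). The environment is an assumption made up for this example: a track of five positions where the model starts at position 0 and is rewarded only when it reaches position 4. The constants and the `step` function are likewise illustrative.

```python
import random

N_STATES = 5          # positions 0..4 on a line; position 4 is the goal
ACTIONS = (-1, +1)    # step left or step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # learning rate, discount, exploration rate

def step(state, action):
    """Simulated environment: move, stay on the track, reward 1 at the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

rng = random.Random(0)
# q[state][action] estimates how good it is to take that action in that state.
q = {s: {a: 0.0 for a in ACTIONS} for s in range(N_STATES)}

for _ in range(500):                  # 500 simulated "games"
    state = 0
    for _ in range(1000):             # safety cap on the length of one game
        if rng.random() < EPSILON:
            action = rng.choice(ACTIONS)   # explore: try something random
        else:
            # Exploit the current estimates, breaking ties at random.
            action = max(ACTIONS, key=lambda a: (q[state][a], rng.random()))
        next_state, reward = step(state, action)
        best_next = max(q[next_state].values())
        # Q-learning update: nudge the estimate toward reward + discounted future value.
        q[state][action] += ALPHA * (reward + GAMMA * best_next - q[state][action])
        state = next_state
        if state == N_STATES - 1:
            break

# The learned behavior: the best-looking action in each non-goal position.
policy = {s: max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)}
print(policy)
```

No labeled data set is involved here; all of the training data comes from the model's own attempts inside the simulation.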
What is deep learning?
Deep learning is a subset of machine learning that typically uses multi-layered neural networks. The multiple layers make it possible to find non-linear or complex relationships between the input and output of the model that is created. Neural networks are loosely modeled on how neurons in the human brain work, and they are particularly good at human-like tasks, such as natural language processing or image recognition. When most people hear the term "Machine Learning", they usually visualize deep learning and neural networks.
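As a small illustration of why layers matter, the sketch below is a tiny two-layer network that computes XOR, a relationship that no single straight-line (linear) model can represent. The weights here are picked by hand purely for illustration; in real deep learning they would be learned from data, and the function names are made up for this example.

```python
def relu(x):
    # A common non-linear activation: negative values become zero.
    return max(0.0, x)

def tiny_network(x1, x2):
    # Hidden layer: two neurons with hand-picked weights and biases.
    h1 = relu(1.0 * x1 + 1.0 * x2 + 0.0)
    h2 = relu(1.0 * x1 + 1.0 * x2 - 1.0)
    # Output layer: a weighted sum of the hidden activations.
    return 1.0 * h1 - 2.0 * h2

outputs = {(a, b): tiny_network(a, b) for a in (0, 1) for b in (0, 1)}
print(outputs)  # {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 0.0}
```

Without the `relu` step in the middle, the two layers would collapse into a single linear formula and XOR would be out of reach.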
What is the advantage of using neural networks?
The following quote explains it better than I can without going deeply into detail:
"Neural networks are also ideally suited to help people solve complex problems in real-life situations. They can learn and model the relationships between inputs and outputs that are nonlinear and complex; make generalizations and inferences; reveal hidden relationships, patterns and predictions; and model highly volatile data (such as financial time series data) and variances needed to predict rare events (such as fraud detection)." (quoted from https://www.sas.com/en_us/insights/analytics/neural-networks.html)
A few algorithms:
k Nearest Neighbors algorithm:
This is an algorithm that looks at the labels of the k data points nearest to a new data point and assigns the new point the most common label among them. This algorithm can be fine-tuned by weighting each neighbor's vote based on how near it is.
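A minimal sketch of this idea in Python, including the optional distance weighting. The points, the labels, and the function name `knn_predict` are made up for illustration, and weighting each vote by 1/distance is just one possible weighting scheme.

```python
import math
from collections import defaultdict

def knn_predict(train, query, k=3, weighted=False):
    # Sort the labeled points by Euclidean distance to the query; keep the k closest.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = defaultdict(float)
    for point, label in neighbors:
        distance = math.dist(point, query)
        # Unweighted: one vote per neighbor. Weighted: closer neighbors count for more.
        votes[label] += 1.0 / (distance + 1e-9) if weighted else 1.0
    return max(votes, key=votes.get)

train = [((0, 0), "red"), ((3, 0), "blue"), ((3, 1), "blue"),
         ((10, 10), "blue"), ((-8, 9), "red")]
query = (1, 0)

print(knn_predict(train, query, k=3))                 # prints blue
print(knn_predict(train, query, k=3, weighted=True))  # prints red
```

In this example the three nearest neighbors are one red point at distance 1 and two blue points at distance 2 or more, so a plain vote says blue while the distance-weighted vote says red.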
Random Decision Forests:
This is an algorithm which constructs a multitude of decision trees (a forest), with each decision tree being somewhat randomly more sensitive to certain aspects of the input data. To make a prediction, it takes the mode of the predictions of all the trees in the forest (for classification), or the mean of their predictions (for regression), as the final answer. This algorithm is mostly concerned with not overfitting the data, and randomly varies the sensitivity to different input dimensions in order to minimize the variability of the result while keeping bias reasonably low.
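The sketch below shows the voting idea in a heavily simplified form. As assumptions for this example only, each "tree" is a one-level decision stump, each stump is trained on a random resampling of the data and looks at one randomly chosen input dimension, and the data and helper names are invented. Real random forests grow much deeper trees.

```python
import random
from collections import Counter

def majority(labels):
    """Most common label (the mode)."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(sample, feature):
    """One-level tree: the best single cut-off on one feature, by training accuracy."""
    labels = [y for _, y in sample]
    fallback = majority(labels)
    values = sorted({x[feature] for x, _ in sample})
    # Start from "always predict the most common label" and try to beat it.
    best = (labels.count(fallback), values[0], fallback, fallback)
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in sample if x[feature] <= threshold]
        right = [y for x, y in sample if x[feature] > threshold]
        left_label, right_label = majority(left), majority(right)
        correct = left.count(left_label) + right.count(right_label)
        if correct > best[0]:
            best = (correct, threshold, left_label, right_label)
    return (feature,) + best[1:]

def fit_forest(data, n_trees=25, seed=0):
    rng = random.Random(seed)
    n_features = len(data[0][0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(data) for _ in data]   # resample with replacement
        feature = rng.randrange(n_features)         # random input dimension
        forest.append(fit_stump(sample, feature))
    return forest

def predict(forest, x):
    # Every tree votes; the mode of the votes is the forest's answer.
    votes = [left if x[f] <= t else right for f, t, left, right in forest]
    return majority(votes)

data = [((1.0, 1.2), "A"), ((1.5, 0.8), "A"), ((2.0, 1.9), "A"),
        ((0.7, 1.6), "A"), ((1.2, 2.1), "A"),
        ((4.0, 3.8), "B"), ((4.5, 4.4), "B"), ((3.6, 4.9), "B"),
        ((5.0, 3.5), "B"), ((4.2, 4.1), "B")]

forest = fit_forest(data)
print(predict(forest, (1.0, 1.5)), predict(forest, (4.0, 4.0)))  # prints A B
```

Because every tree sees slightly different data and a different input dimension, an odd result from any single tree tends to be outvoted by the rest.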
k-means Clustering:
This algorithm seeks to partition the data into k clusters, with every data point assigned to the cluster whose "mean" (the average of the points in that cluster) is nearest to it.
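A minimal sketch of the usual iterative procedure: assign each point to the nearest mean, recompute the means, and repeat until nothing changes. As a simplifying assumption, the starting means here are simply the first k points; real implementations usually pick them randomly or more carefully. The points and the function name `k_means` are made up for this example.

```python
import math

def k_means(points, k, max_iters=100):
    # Simplification for this sketch: start from the first k points.
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point joins the cluster with the nearest mean.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break                      # nothing moved, so we are done
        labels = new_labels
        # Update step: each mean moves to the average of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:                # keep the old mean if a cluster is empty
                centroids[c] = tuple(sum(axis) / len(members)
                                     for axis in zip(*members))
    return centroids, labels

points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]
centroids, labels = k_means(points, k=2)
print(labels)  # prints [0, 0, 0, 1, 1, 1]
```

Note that no labels are supplied anywhere; the two groups come purely from how close the points are to each other.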
Locally Weighted Learning:
Like learning from past experience, this algorithm works by picking out the most similar data points to the one being considered from past experience. Then a (possibly weighted) combination of the similar points is constructed and used as the probable future data point for the current situation. This is how people naturally tend to make predictions about unknown situations they encounter, so it turns out to be very intuitive to use.
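One simple way to sketch this is a similarity-weighted average, where past data points closer to the current situation get a larger say in the prediction. The Gaussian weighting, the `bandwidth` parameter, the function name, and the data below are assumptions chosen for illustration, not the only way to do locally weighted learning.

```python
import math

def locally_weighted_predict(history, query, bandwidth=1.0):
    # Weight each past (x, y) pair by how similar x is to the query.
    weights = [math.exp(-((x - query) ** 2) / (2 * bandwidth ** 2))
               for x, _ in history]
    # The prediction is the weighted average of the past outcomes.
    return sum(w * y for w, (_, y) in zip(weights, history)) / sum(weights)

# Hypothetical past experience: (situation, outcome) pairs.
history = [(1, 10), (2, 20), (3, 30), (10, 100)]

near_two = locally_weighted_predict(history, query=2)
near_nine = locally_weighted_predict(history, query=9)
print(round(near_two, 3), round(near_nine, 3))
```

For a query of 2, the distant point at 10 has almost no influence, so the prediction lands very close to 20; for a query of 9, that same point dominates.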
Naive Bayes:
Naive Bayes is an algorithm that applies Bayes' theorem to a data set while assuming a high level of independence among the features of each data point. This is often not a realistic assumption to make, which is where the "naive" in the name comes from.
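A minimal sketch of the idea for categorical data. The independence assumption shows up in `predict`, where the probability of each feature value is simply multiplied in (added, in log form) without regard to the other features. The weather-style data, the function names, and the use of add-one smoothing to avoid zero probabilities are assumptions made for this example.

```python
import math
from collections import Counter, defaultdict

def fit_naive_bayes(rows):
    class_counts = Counter(label for _, label in rows)
    feature_counts = defaultdict(Counter)   # (feature index, label) -> value counts
    vocab = defaultdict(set)                # feature index -> values seen
    for features, label in rows:
        for i, value in enumerate(features):
            feature_counts[(i, label)][value] += 1
            vocab[i].add(value)
    return class_counts, feature_counts, vocab

def predict(model, features):
    class_counts, feature_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        log_prob = math.log(count / total)            # prior: P(label)
        for i, value in enumerate(features):
            # P(value | label) with add-one smoothing, treated as independent
            # of every other feature (the "naive" part).
            numerator = feature_counts[(i, label)][value] + 1
            denominator = count + len(vocab[i])
            log_prob += math.log(numerator / denominator)
        scores[label] = log_prob
    return max(scores, key=scores.get)

rows = [(("sunny", "calm"), "yes"), (("sunny", "calm"), "yes"),
        (("sunny", "windy"), "yes"), (("rainy", "windy"), "no"),
        (("rainy", "windy"), "no"), (("rainy", "calm"), "no")]

model = fit_naive_bayes(rows)
print(predict(model, ("sunny", "windy")), predict(model, ("rainy", "calm")))  # prints yes no
```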
The future and machine learning:
What will come next for machine learning?
Self-driving cars are already being developed and tested. They are on the cusp of being a part of everyday life. Image recognition and AR are becoming so developed that soon it may be hard to distinguish them from reality. The internet of things becoming prominent at the same time as machine learning suggests a way to gather very granular data sets, already prefiltered, and maybe anonymized to some degree. There is an abundance of data and parallelism today, and it only seems to be increasing as time goes on, so a safe bet would be that the field of machine learning will expand as well. This leads to a lot of philosophical questions.
Just as social media websites have changed social interaction for a whole generation, surely machine learning will have an impact on the next generation, but in what areas of life? Will education be necessary? Will we still retain the ability to draw our own conclusions from data? What about everyday life? Will we be targeted by ads that have already calculated a 90% probability that we will buy a product? If so, will this improve our quality of life? In this new territory, will we be more akin to Pavlov, or to his dogs? Up to this point in time, technology has been a tool for us to use, but who will be ringing the bell 20 years from now?