Artificial intelligence: it all comes from data
Data, artificial intelligence, machine and deep learning: what is the (in)visible link between these buzzwords of the 21st century?
Data economy is one of the factors behind the emergence of artificial intelligence. The term refers to how much data has grown over the last few years and how much it can grow in the coming years. Although there is no “universal” definition of data economy, it can be defined as an ecosystem of organizations for whom data is the main source (or goal) of their business (Jani Koskinen, researcher at the University of Turku, Finland).
Data is growing at a meteoric rate. The total amount of data generated is set to grow exponentially, reaching 175 zettabytes by 2025. Over the next two years, enterprise data is expected to increase at a 42% annual growth rate (source: MIT Technology Review Insights). Embedded within these vast volumes of data are insights into consumer behavior, emerging market trends, and even predictors of the future.
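To make the 42% annual growth rate concrete, here is a minimal sketch of compound growth. The starting volume of 100 units is a made-up figure chosen only for illustration, and `project_volume` is a hypothetical helper, not something taken from the cited report.

```python
def project_volume(initial, annual_rate, years):
    """Return the projected volume at year 0, 1, ..., years under compound growth."""
    return [initial * (1 + annual_rate) ** year for year in range(years + 1)]


# Assumption: 100 arbitrary units of enterprise data today, growing 42% per year.
volumes = project_volume(100.0, 0.42, 2)

for year, volume in enumerate(volumes):
    print(f"year {year}: {volume:.2f} units")
```

At this rate the volume roughly doubles in two years, since 1.42 squared is about 2.02.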
The explosion of data has given rise to a new economy, and companies constantly battle over data ownership to gain an advantage from it.
From data to big data
The increase in data volume has given rise to big data. Simply put, big data is a combination of structured, semi-structured and unstructured data collected by organizations that can be mined for information and used in advanced analytics applications.
Big data is often characterized by the 3 Vs: volume, velocity and variety.
The numbers are also very interesting. The big data market is poised to grow by $247.30 bn during 2021-2025, progressing at a CAGR of almost 18% during the forecast period (source: ReportLinker, July 2021).
Data science is the discipline that helps to analyze this data. In other words, it is the in-depth study of data: extracting useful insights from it and processing that information using different tools, statistical models, and machine learning algorithms. Applied to big data, it covers data cleaning, data preparation, data analysis, and data visualization.
A data scientist collects the raw data from various sources, prepares and pre-processes the data, and applies machine learning algorithms and predictive analysis to extract useful insights from the collected data.
The science of data has thus created a new paradigm in which it is possible to teach machines to learn from data and derive a great number of useful insights, giving rise to artificial intelligence.
What is artificial intelligence?
Artificial intelligence is a constellation of many different technologies working together to enable machines to sense, comprehend, act, and learn with human-like levels of intelligence. Artificial intelligence is a science, like mathematics or biology. It studies ways to build intelligent programs (software) and machines (hardware) that can creatively solve problems, an ability that has always been considered a human prerogative.
It involves autonomous entities called intelligent agents. But what is an agent? An agent can be anything that perceives its environment through sensors and acts upon that environment through actuators. An agent runs in a cycle of perceiving, thinking, and acting.
The world around us is full of agents: a thermostat, a smartphone and a camera are all examples, and even we humans are agents!
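The perceive, think, act cycle can be sketched with the thermostat mentioned above. This is a toy illustration under stated assumptions: the `ThermostatAgent` class, the dictionary used as the environment, and the one-degree-per-step heating model are all invented for the example.

```python
class ThermostatAgent:
    """A toy agent: the sensor reads a temperature, the actuator heats or cools."""

    def __init__(self, target, tolerance=0.5):
        self.target = target
        self.tolerance = tolerance

    def perceive(self, environment):
        # Sensor: read the current temperature from the environment.
        return environment["temperature"]

    def think(self, temperature):
        # Decide what to do based on the reading.
        if temperature < self.target - self.tolerance:
            return "heat"
        if temperature > self.target + self.tolerance:
            return "cool"
        return "idle"

    def act(self, action, environment):
        # Actuator: change the environment (assumed one degree per step).
        if action == "heat":
            environment["temperature"] += 1.0
        elif action == "cool":
            environment["temperature"] -= 1.0

    def step(self, environment):
        reading = self.perceive(environment)
        action = self.think(reading)
        self.act(action, environment)
        return action


room = {"temperature": 17.0}
agent = ThermostatAgent(target=20.0)
actions = [agent.step(room) for _ in range(5)]
print(actions, room["temperature"])
```

The agent heats the room until the reading falls within the tolerance band around the target, then stays idle.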
What is machine learning?
Machine learning (the term was first introduced by Arthur Samuel in 1959) is a subset of AI that gives systems the ability to automatically learn and improve from experience, and to make predictions without being explicitly programmed.
With the help of sample historical data, known as training data, machine learning algorithms build a mathematical model that helps in making predictions or decisions. Suppose you have a complex problem that requires predictions: instead of writing code for it, you feed the data to generic algorithms, and the machine builds the logic from the data and predicts the output. In short, machine learning is a set of methods, tools, and computer algorithms used to train machines to analyze, understand, and find hidden patterns in data and make predictions. The ultimate goal of machine learning is to use data for self-learning, eliminating the need to program machines explicitly. Once trained on datasets, machines can apply the learned patterns to new data and so make better predictions.
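As a minimal sketch of the idea of building a mathematical model from training data, the snippet below fits a straight line by ordinary least squares and uses it to predict an unseen value. Simple linear regression is just one possible model, and the study-hours data is invented for illustration.

```python
def fit_line(xs, ys):
    """Learn slope and intercept from training data by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned model to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: hours studied and exam score.
hours = [1.0, 2.0, 3.0, 4.0]
scores = [52.0, 54.0, 56.0, 58.0]

model = fit_line(hours, scores)
print(predict(model, 5.0))
```

Nobody wrote the rule "score = 2 x hours + 50" by hand: the algorithm recovered it from the examples, which is the point of the paragraph above.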
Examples of machine learning
Image recognition is a well-known and widespread example of machine learning in the real world. It can identify an object in a digital image based on the intensity of the pixels, in black and white or color images. Assigning a name to a photographed face (aka “tagging” on social media) is a very popular use case. Another example is speech recognition: machine learning can translate speech into text. Certain software applications can convert live voice and recorded speech into a text file. The speech can also be segmented by intensities on time-frequency bands. Some of the most common uses of speech recognition software are devices like Google Home or Amazon Alexa.
Another classic application is spam email filtering. If you open the spam folder in your email account, you may find all kinds of messy and annoying messages. Spam detection systems help filter out irrelevant messages from those that are important to users. The systems analyze the content of emails and classify it using machine learning algorithms. The task of such ML-based models is to determine whether an incoming message is “spam” or “not spam” (or “ham”). As spam detection is a supervised machine learning task, the model is first trained with labeled datasets: examples of spam and ham emails carefully labeled by a human expert.
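The spam-versus-ham setup can be sketched with a tiny naive Bayes classifier. Naive Bayes is a common choice for this task but is an assumption here, since the article does not name a specific algorithm; the four training messages are invented, and real filters use far more data and features.

```python
import math
from collections import Counter


def train(messages):
    """messages: list of (text, label) pairs with label 'spam' or 'ham'."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocabulary = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, label_counts, vocabulary


def classify(model, text):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocabulary = model
    total_messages = sum(label_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        # Start from the prior: how common this label is in the training set.
        score = math.log(label_counts[label] / total_messages)
        words_in_label = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen during training
            # Laplace (add-one) smoothing avoids zero probabilities.
            likelihood = (word_counts[label][word] + 1) / (words_in_label + len(vocabulary))
            score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical labeled examples, as a human expert might prepare them.
training_data = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]

spam_model = train(training_data)
print(classify(spam_model, "claim your free money"))
print(classify(spam_model, "agenda for the team meeting"))
```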
Other applications include medical diagnosis, statistical arbitrage, predictive analytics and the extraction of structured information from unstructured data.
What is deep learning?
Simply put, deep learning is a subset of machine learning (we can consider it a special kind of machine learning) that works technically in the same way as machine learning does, but with different capabilities and approaches. It is essentially a neural network with three or more layers. These neural networks attempt to simulate the behavior of the human brain, allowing them to learn from large amounts of data. While a neural network with a single layer can still make approximate predictions, additional hidden layers can help to optimize and refine for accuracy. If deep learning is a subset of machine learning, how do they differ? Deep learning distinguishes itself from classical machine learning by the type of data that it works with and the methods with which it learns.
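To show what "layers" means in practice, here is a minimal sketch of a forward pass through a small network with two hidden layers and an output layer. The weights are hand-picked, untrained numbers used only for illustration, and the sigmoid activation is an assumption; training (adjusting the weights from data) is deliberately left out.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases):
    """One fully connected layer: weighted sum plus bias, then sigmoid, per neuron."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Feed the inputs through each layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = dense(activations, weights, biases)
    return activations


# Hypothetical, untrained weights: each entry is (weight rows, biases).
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),  # hidden layer 1: 2 inputs -> 3 neurons
    ([[0.2, -0.5, 0.3], [0.7, 0.1, -0.4]], [0.05, -0.05]),       # hidden layer 2: 3 -> 2 neurons
    ([[0.6, -0.9]], [0.0]),                                      # output layer: 2 -> 1 neuron
]

output = forward([1.0, 2.0], layers)
print(output)
```

Each layer transforms the output of the previous one, which is how deeper networks can build progressively more refined representations of the input.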
Machine learning algorithms leverage structured, labeled data to make predictions—meaning that specific features are defined from the input data for the model and organized into tables. This doesn’t necessarily mean that it doesn’t use unstructured data; it just means that if it does, it generally goes through some pre-processing to organize it into a structured format.
Deep learning eliminates some of the data pre-processing that is typically involved in machine learning. These algorithms can ingest and process unstructured data, like text and images, and automate feature extraction, removing some of the dependency on human experts. For example, let’s say that we had a set of photos of different pets, and we wanted to categorize them by “cat”, “dog”, etc. Deep learning algorithms can determine which features (e.g. ears) are most important to distinguish each animal from another. In machine learning, this hierarchy of features is established manually by a human expert.
Relationship between artificial intelligence, machine learning and data science
Let's go ahead and try to understand the difference between artificial intelligence, machine learning and data science. Even though these terms fall within the same domain and are connected to each other, they have specific applications and meanings.
Caption: Neither ML nor AI is a subset of data science, and data science is a subset of neither. Data science uses ML techniques to perform particular tasks and solve specific problems, and it also employs AI concepts that are not ML techniques.
As mentioned before, artificial intelligence systems (try to) mimic or replicate human intelligence.
Machine learning gives systems the ability to automatically learn and improve from experience without being explicitly programmed.
Data science is an umbrella term that encompasses data analytics, data mining, machine learning, artificial intelligence and several other related disciplines.
So, what is the relationship between these three disciplines? To understand it, we can consider the development process, or life cycle, of data science.
In brief, the first step is data gathering and data transformation. This step belongs to the data science domain. Data transformation is the process of converting the format, structure or values of data from a source into the format, structure or values required by a destination. Data transformation is important to activities like data management and data integration.
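A small sketch of data transformation: source records arrive as loosely formatted strings and are converted to the format, structure and values a destination expects. The field names, the day/month/year source date format and the `transform` helper are all assumptions made up for this example.

```python
from datetime import datetime


def transform(record):
    """Convert one source record into the (hypothetical) destination schema."""
    return {
        # Structure: rename the field and convert the value from text to integer.
        "customer_id": int(record["id"]),
        # Format: source dates are assumed to be day/month/year; destination uses ISO 8601.
        "signup_date": datetime.strptime(record["signup"], "%d/%m/%Y").date().isoformat(),
        # Values: normalize whitespace and letter case.
        "country": record["country"].strip().upper(),
    }


# Hypothetical raw rows gathered from a source system.
source_rows = [
    {"id": "42", "signup": "03/07/2021", "country": " fi "},
    {"id": "7", "signup": "15/01/2020", "country": "It"},
]

clean_rows = [transform(row) for row in source_rows]
print(clean_rows)
```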
After data gathering, you want to use the data to make predictions and derive insights. To get predictions out of a data set, we use machine learning techniques such as supervised learning and unsupervised learning. Now you may ask where deep learning comes into the picture. As mentioned before, deep learning uses artificial neural networks, which are modeled on the structure and behavior of the neurons of the human brain. Deep learning is an advanced form of machine learning, generally used for solving more complex tasks in the presence of large amounts of unstructured data.
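Supervised learning relies on labeled examples, whereas unsupervised learning finds structure in data that has no labels. As a sketch of the unsupervised case, the snippet below groups one-dimensional values into clusters with a basic k-means loop. K-means is only one such technique, chosen here for brevity, and the data and starting centroids are invented.

```python
def k_means_1d(values, centroids, iterations=10):
    """Group unlabeled numbers around the given number of centroids."""
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical unlabeled measurements with two visible groups.
values = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
centroids, clusters = k_means_1d(values, centroids=[0.0, 5.0])
print(centroids, clusters)
```

No labels were supplied: the algorithm discovers the two groups from the values alone.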
The next step is to get insights from the predictions that have been made. To do so, you need data analysis, which is itself a process within the data science domain. At this point, you want your data to drive some actions. This is where AI comes into the picture. Artificial intelligence combines predictions and insights to perform actions based on human decisions and automated decisions.
In conclusion, the fields of artificial intelligence, machine learning and data science have a great deal of overlap, but they are not interchangeable. There are some nuances between them. To summarize: data science produces insights from the data, machine learning produces predictions and artificial intelligence produces actions.