AI - How, What, Why
"Artificial Intelligence" typically describes machines mimicking cognitive functions such as learning and problem solving. This translation makes AI a moving target giving rise to what’s popularly called the “AI effect” wherein new milestones are so quickly adopted that they seem commonplace, like we aren’t progressing.
Before we get into how robots learn, let's look at how we learn. As babies, a caregiver breaks a task down into simple steps and then we repeat, repeat, and repeat some more. Eating with a spoon, stacking blocks, reading, writing – the same set of steps, repeated for mastery. Now weave in experience. A spoon works well for soup, not so much for steak. The taller the stack, the more precarious it becomes. Now how about intuition – the spoon worked well for the soup, so if it’s liquid(ish), ask for a spoon. Blocks start tipping at five, so keep stacks to four or fewer. These deductions may not always be accurate: you could use spoons for rice or straws for liquids. Similarly, you could stack taller with wider bases or Crazy Glue, but that is learning that comes from the experience of failed attempts.
In AI terms, the steps are your base algorithm. The addition of experience is machine learning: instead of writing routines for every possible outcome, you train the machine on massive amounts of data and tweak your routine so it “learns” from it. The intuition (and its refinement) comes from adaptive learning. To enable adaptive learning, we use neural networks, modeled loosely on the biological neural networks in our brains. Artificial Neural Networks (ANNs) model data using graphs of artificial neurons – mathematical models that simulate how a biological neuron works.
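To make that concrete, here is a minimal sketch of a single artificial neuron in Python. The inputs, weights, and sigmoid activation are illustrative choices, not a prescribed design: a neuron simply computes a weighted sum of its inputs and passes it through a nonlinear activation. An ANN wires many of these together into a graph, with the outputs of one neuron feeding the inputs of the next.

```python
import math

def artificial_neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of its inputs,
    passed through a nonlinear activation (here, a sigmoid)."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # squashes the output into (0, 1)

# Illustrative values: three input signals and hand-picked weights.
print(artificial_neuron([0.5, 0.9, 0.1], [0.8, -0.4, 0.3], bias=0.1))
```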
The main categories of networks are acyclic or feedforward neural networks (where the signal passes in only one direction) and recurrent neural networks (which allow feedback and retain short-term memories of previous input events). Neural networks can be trained using techniques like Hebbian learning, GMDH, or competitive learning.
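The difference between the two categories is easiest to see in code. Below is a toy sketch, with small random weights standing in for anything learned: the feedforward step depends only on the current input, while the recurrent step also carries a hidden state h that acts as a short-term memory of previous inputs.

```python
import numpy as np

rng = np.random.default_rng(0)
W_in = rng.normal(size=(4, 3))   # input-to-hidden weights (illustrative)
W_rec = rng.normal(size=(4, 4))  # hidden-to-hidden (recurrent) weights

def feedforward_step(x):
    """Acyclic: the output depends only on the current input."""
    return np.tanh(W_in @ x)

def recurrent_step(x, h):
    """Recurrent: the hidden state h feeds back in, carrying a
    short-term memory of previous input events."""
    return np.tanh(W_in @ x + W_rec @ h)

h = np.zeros(4)
for x in rng.normal(size=(5, 3)):  # a short sequence of inputs
    h = recurrent_step(x, h)       # h now summarizes the whole sequence
print(h)
```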
Traditional computer programs are sequences of commands executed largely one after another. Deep learning is a fundamentally new software model in which billions of software neurons and trillions of connections are trained in parallel. Using Deep Neural Network (DNN) algorithms and tuning, the computer is essentially writing its own software.
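A minimal sketch of that idea, assuming plain gradient descent on a single linear neuron with synthetic data: nobody writes the rule "y = 2*x1 - x2" into the program; the weights drift toward it as the training loop processes all the examples at once.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 2))
y_true = X @ np.array([2.0, -1.0])   # the hidden "rule" to be learned

w = np.zeros(2)                      # start knowing nothing
for _ in range(200):
    y_pred = X @ w                               # score all examples in parallel
    grad = 2 * X.T @ (y_pred - y_true) / len(X)  # direction of the error
    w -= 0.1 * grad                              # nudge the weights toward the data
print(w)  # close to [2.0, -1.0] -- learned from data, not hand-coded
```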
For neural network-based deep learning models, as the name implies, the number of layers is greater than in shallow learning algorithms. Shallow algorithms tend to be less complex and require more up-front knowledge. In contrast, deep learning algorithms rely more on optimal model selection and optimization through model tuning. They are better suited to problems where prior knowledge of features is less desirable or necessary, and where labeled data is unavailable or not required for the primary use case.

For example, image recognition is enabled through deep learning. If an AI needed to be trained to recognize a dog, there is no finite set of rules we could prescribe for the task. There are different breeds of dogs, different sizes, different settings in terms of location, weather, props – the possibilities are endless. So how would the machine make that call? First, we need a copious amount of data: millions of pictures. We run these pictures through the first layer of the neural network, where individual neurons process and pass the data to a second layer, and so on, until the final output is produced. Each neuron assigns a weighting to its input – how correct or incorrect it is relative to the task being performed. Each task here examines a specific attribute: perhaps the shape of the face, the number of ears or legs, or the type of tail. Based on the weightings, the network comes up with a “probability vector”: the system might be 92% confident the image is a dog, 6% confident it’s a bear, and 2% confident it’s a cat. During training, the network is told whether its answer was right or wrong, and the weightings are adjusted accordingly. This is the model tuning or training phase. By the n-millionth picture, the weightings of the neuron inputs are tuned so accurately that the network has taught itself what a dog looks like. Just as neural connections and pathways are built in our brains based on perception and outside stimulus, given enough data the neurons in the ANN are able to fine-tune their perception.
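A toy version of that final step, assuming a tiny two-layer network with random, untrained weights (real networks have millions of tuned weights across many layers): the last layer’s scores are turned into a probability vector over the classes dog, bear, and cat. Training would adjust W1 and W2 until the “dog” entry dominates for dog pictures.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z):
    """Turn raw scores into a probability vector that sums to 1."""
    e = np.exp(z - z.max())
    return e / e.sum()

# Illustrative, untrained weights:
# 64 image features -> 16 hidden neurons -> 3 class scores.
W1 = rng.normal(size=(16, 64))
W2 = rng.normal(size=(3, 16))

def forward(image_features):
    h = np.maximum(0, W1 @ image_features)  # layer 1: ReLU neurons
    return softmax(W2 @ h)                  # layer 2: probability vector

probs = forward(rng.normal(size=64))        # stand-in for real image features
print(dict(zip(["dog", "bear", "cat"], probs.round(3))))
```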
The algorithms within deep learning are used for supervised, unsupervised, and semi-supervised learning problems. Supervised learning uses labeled datasets: you have an input vector with a defined output. Unsupervised learning uses unlabeled data and lets the network draw inferences entirely by itself. Semi-supervised learning uses mostly unlabeled data with some labeled data. Unsupervised learning reduces labeling bias, since humans do not hand-label the training data – though bias can still creep in through how the data is collected and selected.
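A minimal illustration of the supervised/unsupervised split, using scikit-learn on a synthetic two-blob dataset (all names and values here are illustrative): the supervised model is handed the labels, while the unsupervised one has to infer the two groups on its own. Semi-supervised learning sits in between, with a handful of labels guiding a model trained mostly on unlabeled data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 1, (50, 2)),   # blob A
               rng.normal(4, 1, (50, 2))])  # blob B

# Supervised: every input comes with a labeled output to learn from.
y = np.array([0] * 50 + [1] * 50)
clf = LogisticRegression().fit(X, y)

# Unsupervised: same inputs, no labels; the model infers structure itself.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(clf.predict(X[:3]), clusters[:3])
```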
An exciting space I see is Mind Simulation, which combines epistemology with self-defining heuristics. Heuristic-based methods are not guaranteed to find the optimal solution to a problem, but will do a satisfactory job the majority of the time. Self-defining heuristics are the DNN version of this: the network refines its own rules of thumb. Think of it as an efficient jack of all trades – it may not be the subject-matter expert, but it instinctively gets the job done well. Epistemology is the study of how we know and learn. To design search engines and many other narrow AI tasks, knowing how a human brain works is not a significant factor. Situational problems that require a great deal of intuition fall in the domain of Artificial General Intelligence (AGI). To get to a true AGI, epistemology could be the linchpin: how do you build a copy (enhanced or not) if you don’t know how the original works?
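For a feel of what “satisfactory, not optimal” means, here is a classic hand-written heuristic (a hypothetical routing example, not anything DNN-specific): greedily visiting the nearest unvisited point produces a good route quickly, but sometimes a noticeably longer one than the true optimum. A self-defining system would, in effect, discover rules of thumb like this on its own.

```python
import math

def nearest_neighbor_route(points):
    """Greedy heuristic: always visit the closest unvisited point next.
    Fast and usually reasonable, but not guaranteed to be optimal."""
    route, remaining = [points[0]], list(points[1:])
    while remaining:
        last = route[-1]
        nxt = min(remaining, key=lambda p: math.dist(last, p))
        route.append(nxt)
        remaining.remove(nxt)
    return route

print(nearest_neighbor_route([(0, 0), (5, 1), (1, 1), (4, 4)]))
```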
Over the past few years AI has exploded. This surge is largely attributed to the wide availability of GPUs that make parallel processing ever faster, cheaper, and more powerful, combined with effectively unlimited storage and a deluge of data. Deep learning in artificial neural networks with many layers has transformed important subfields of artificial intelligence, including computer vision, speech recognition, and natural language processing. The success stories are incredible: Deep Speech 2 learned both English and Mandarin with a single algorithm, DNNs have reportedly scored at a post-grad level on IQ tests, and image recognition can identify indicators of cancer in blood and tumors in MRI scans. Fear of losing jobs or of sentient AGIs taking over is part of the technological revolution we’re in. Resistance to change is normal. Do these fears need to be addressed? Yes. But let’s not let them stand in the way of evolution.