Machines That Teach Themselves: The Next Evolution in Artificial Intelligence and Industrial Automation?
As I study Artificial Intelligence (AI) I become more and more amazed at the innovations that Data Scientists, Industry and Academia have come up with to make machines act more like humans. One that I am particularly fascinated with is the concept of Reinforcement Learning and its application in Industrial Automation and Industrial IoT. Essentially this is the art of creating machines that learn from their environment themselves, better than their human masters can teach them.
Let me explain the idea a bit further and illustrate the concept through practical applications in industry that demonstrate its significant benefits. But first, a bit of background for those new to AI...
Narrow AI vs. General AI - Reality vs. Science Fiction
Those of you who have read my previous articles will have heard me talk about the concept of General AI vs. Narrow AI. Narrow AI is Artificial Intelligence applied to a specific process or system, whereas General AI is an intelligence that can handle a multitude of situations. Examples of General AI in science fiction include C-3PO from Star Wars and the android Data from Star Trek. The reality is that General AI is probably still 30 years off, whereas Narrow AI is here and now. Narrow AI has already found a multitude of applications in the last few years, ranging from Online Search and Chat-bots to Google Home and Driverless cars. In industry, Narrow AI has found application in many manufacturing, mining and utilities processes.
The primary reason why General AI is still many years off compared to Narrow AI is that the algorithms data scientists use to make machines see like humans (i.e. Machine Vision) are different from those used to recognise the written word (i.e. Natural Language Processing), which are different yet again from those used to manage manufacturing processes (e.g. Deep Learning). Physicists have spent many years looking for one unifying equation that describes the universe and haven't got there. In the same way, data scientists have yet to find one algorithm capable of modelling all aspects of human behaviour.
So in my view the discussions around people being replaced by machines are overhyped and reflect a lack of understanding of the technology. But there are undoubtedly huge opportunities to have AI make humans more efficient in what they do now, today! The benefits of Narrow AI are significant.
Types of Narrow AI: Unsupervised vs. Supervised vs. Reinforcement Machine Learning
Broadly speaking there are three types of Machine Learning that Data Scientists will talk about. The first is Unsupervised Learning. This is the scenario where the machine is allowed, of its own accord, to identify patterns or commonalities in existing data. Examples include Spotify or Apple Music creating playlists of certain types of music, and algorithms that identify anomalies in consumer purchasing patterns and thereby flag credit card fraud.
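For readers who like to see an idea in code, the fraud-flagging example can be sketched in a few lines of Python. This is only a minimal illustration, not how any real card issuer works: the purchase amounts are invented, and the function name flag_anomalies, the threshold of 3.5 and the use of a "robust z-score" (distance from the median, scaled by the median absolute deviation) are my own assumptions. The point is that nobody tells the machine which purchase is fraudulent; it simply notices what does not fit the pattern.

```python
from statistics import median

def flag_anomalies(amounts, threshold=3.5):
    """Return the positions of amounts that sit far from the typical value.

    Uses a robust z-score: distance from the median, scaled by the
    median absolute deviation (MAD). No labels are supplied, so this
    is unsupervised. The threshold of 3.5 is an assumed rule of thumb.
    """
    med = median(amounts)
    mad = median(abs(x - med) for x in amounts)
    if mad == 0:
        # All values are (almost) identical, so nothing stands out.
        return []
    return [
        i for i, x in enumerate(amounts)
        if 0.6745 * abs(x - med) / mad > threshold
    ]

# Invented purchase history for one customer, in dollars.
purchases = [42.0, 38.5, 45.2, 40.1, 39.9, 41.3, 980.0, 43.7]
suspicious = flag_anomalies(purchases)
print(suspicious)  # positions of purchases worth a closer look
```

Here the only flagged purchase is the 980.0 one, because every other amount sits within a few dollars of the median.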
The second and probably most common form of AI is Supervised Learning. This is where the machine is given huge amounts of inputs and the associated outputs and looks for correlations in the data. Think of the experienced manufacturing plant operator who knows that making certain adjustments to machine settings will cause certain things to happen further down the line, but doesn't know exactly why. In the same way, Supervised Learning looks for correlations between inputs and outputs in the data without necessarily identifying or understanding the causality behind them. But of course the machine is much better at seeing and acting on these correlations than a human is. Artificial Intelligence based on Supervised Learning is already in wide application in industrial settings. Practical applications include predictive maintenance and process optimisation in manufacturing and general industrial settings.
The challenge with Supervised Learning, however, is that the AI never understands its environment better than its human designers, simply because it is the data scientists and engineers who "Supervise the Learning of the AI" and provide it with the data that it learns from. The AI, being a machine, is of course capable of handling more data, more quickly and more efficiently than its human masters, and is therefore better and faster at decision making. It is nonetheless limited by its human masters' understanding of the manufacturing processes and by the biases in the data they collect and feed it.
The next and probably most interesting form of Artificial Intelligence, and the focus of this article, is what is referred to as Reinforcement Learning. This is where the machine teaches itself. The term Reinforcement Learning has its origins in behavioural psychology, notably the theories of Ivan Pavlov and the famous Pavlov's Dog experiments, in which the researcher was able to reinforce certain behaviours in a dog so that they occurred when desired. The dog learned through reinforcement.
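The plant operator analogy can be made concrete with a very small Python sketch. Everything in it is assumed for illustration: the oven setpoints and moisture readings are invented, and fit_line is just ordinary least squares for a straight line, the simplest possible Supervised Learning model. Real industrial models use far more inputs and far richer algorithms, but the principle is the same: learn the relationship between inputs and outputs from labelled history, without knowing why it holds.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of inputs and outputs, and variance of the inputs.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented labelled history: oven setpoint (deg C) -> product moisture (%).
setpoints = [150, 160, 170, 180, 190]
moisture = [12.0, 11.0, 10.0, 9.0, 8.0]

slope, intercept = fit_line(setpoints, moisture)

# Predict the outcome of a setting the plant has never actually run.
predicted = slope * 175 + intercept
print(round(predicted, 2))
```

Note that the model can only ever be as good as the history it is given, which is exactly the limitation of Supervised Learning.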
To the chagrin of many of my colleagues in academia and technology, I often explain complex AI concepts using the analogy of children, a concept that most of us can relate to. In this case I use the analogy of a toddler crawling around at home. When a toddler crawls around its environment it often picks things up and sticks them in its mouth (because taste is an important human sense), then either continues to chew on the object or spits it out when it tastes bad. Over time a toddler will develop his or her favourite toys to chew on, but will still occasionally try out new things in the environment with a view to finding the best tasting option.
AIs based on Reinforcement Learning follow this same concept: taking action to maximise benefit whilst occasionally doing something to test and learn from their environment, in the hope of finding a better solution than their current one. This is what is known in the industry as the Exploit vs. Explore decision of Reinforcement Learning. Unlike the toddler, who for the most part chooses randomly when to Explore its environment vs. Exploit known options that work, the algorithms in Reinforcement Learning can make this choice much more efficiently, deciding when it would be most beneficial to experiment and gain knowledge that could lead to a better solution. So not only can the machine learn faster than a human, it can also surpass its human masters in its knowledge of its environment, and therefore in the solutions it picks, because it is not limited to the data provided to it by its human masters.
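The Exploit vs. Explore decision is easiest to see in what data scientists call a "multi-armed bandit": a row of options with unknown payoffs that the machine must learn by trying them. The Python sketch below is only an illustration under assumed numbers (three options that pay off 20%, 50% and 80% of the time, 5000 tries). It compares two well-known rules. The first, epsilon-greedy, behaves like the toddler: it mostly exploits its favourite and explores at random 10% of the time. The second, UCB1, decides when to explore based on how uncertain it still is about each option.

```python
import math
import random

def run_bandit(true_rates, steps, choose, seed=0):
    """Play `steps` rounds; `choose` picks which option to try each round."""
    rng = random.Random(seed)
    counts = [0] * len(true_rates)   # how often each option was tried
    means = [0.0] * len(true_rates)  # running average reward per option
    total = 0
    for t in range(1, steps + 1):
        arm = choose(counts, means, t, rng)
        reward = 1 if rng.random() < true_rates[arm] else 0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
        total += reward
    return counts, means, total

def epsilon_greedy(counts, means, t, rng, epsilon=0.1):
    """Toddler-style rule: explore at random with probability epsilon."""
    if rng.random() < epsilon:
        return rng.randrange(len(means))                  # Explore
    return max(range(len(means)), key=lambda a: means[a])  # Exploit

def ucb1(counts, means, t, rng):
    """Explore where uncertainty is highest (upper confidence bound)."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm  # try every option once first
    return max(
        range(len(means)),
        key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]),
    )

# Assumed payoff rates, unknown to the learner.
rates = [0.2, 0.5, 0.8]
eg_counts, eg_means, eg_total = run_bandit(rates, 5000, epsilon_greedy)
ucb_counts, ucb_means, ucb_total = run_bandit(rates, 5000, ucb1)
print("epsilon-greedy tries per option:", eg_counts)
print("UCB1 tries per option:", ucb_counts)
```

Both rules end up spending most of their tries on the best option without ever being told which one it is. The difference lies in how they spend their exploration: epsilon-greedy keeps sampling poor options at a fixed rate forever, whereas UCB1 tries an option less and less often as it becomes confident the option is inferior.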
The advantages of Reinforcement Learning were demonstrated by DeepMind, the Google-owned AI company, with its AlphaGo programs for the Chinese game of Go. For the original AlphaGo, a large database of human-played games was fed into the AI in a Supervised Learning fashion to teach it how to play the game. That AI then went on to beat top human professionals. Then the engineers at DeepMind built a new version, AlphaGo Zero, purely in a Reinforcement Learning fashion. The AI was allowed to play games against itself and learned from the games it played, experimenting with different strategies (or in AI speak, Policies) rather than being "taught" by its human masters. After several days of learning, the Reinforcement Learning based AI became better at playing the game than its predecessor that had learned from human games. Not just slightly better: it won 100 games to 0 against the version that had beaten world champion Lee Sedol. The new AI was able to identify successful game play strategies that hadn't been present in the human data provided to its cousin.
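To give a feel for how a machine can teach itself a game purely by playing, here is a toy Python sketch. It is emphatically not the method used for Go, which relies on deep neural networks and tree search. Instead I have assumed a much simpler game (a pile of stones, each player takes 1, 2 or 3, and whoever takes the last stone wins) and a simple lookup table of move values. The function names, the pile size and the learning settings are all my own choices for illustration. The AI plays both sides, is never shown a human game, and is only told who won.

```python
import random

def legal_moves(pile):
    """A player may take 1, 2 or 3 stones, but no more than remain."""
    return [take for take in (1, 2, 3) if take <= pile]

def train_by_self_play(max_pile=12, episodes=20000,
                       alpha=0.5, epsilon=0.3, seed=0):
    """Learn move values for the toy stone game purely from self-play."""
    rng = random.Random(seed)
    q = {}  # (pile, take) -> estimated value for the player about to move

    def value(pile, take):
        return q.get((pile, take), 0.0)

    for _ in range(episodes):
        pile = rng.randint(1, max_pile)
        while pile > 0:
            moves = legal_moves(pile)
            if rng.random() < epsilon:
                take = rng.choice(moves)                         # Explore
            else:
                take = max(moves, key=lambda m: value(pile, m))  # Exploit
            remaining = pile - take
            if remaining == 0:
                target = 1.0  # took the last stone: a win
            else:
                # Whatever is good for the opponent is bad for me.
                target = -max(value(remaining, m)
                              for m in legal_moves(remaining))
            old = value(pile, take)
            q[(pile, take)] = old + alpha * (target - old)
            pile = remaining  # the same table now plays the other side
    return q

def best_move(q, pile):
    """The move the trained table currently rates highest."""
    return max(legal_moves(pile), key=lambda m: q.get((pile, m), 0.0))

q = train_by_self_play()
learned = {pile: best_move(q, pile) for pile in range(1, 12)}
print(learned)
```

The known winning strategy for this game is to always leave your opponent a multiple of four stones. Nobody programmed that rule in; the table discovers it from wins and losses alone, which is the same principle, at a vastly smaller scale, as a Policy learned through self-play.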
Current practical examples of Reinforcement Learning based AI in the consumer world include the song and movie suggestions made by Spotify, Netflix and Apple Music. Most of the time they suggest things you like, but occasionally the AI intentionally makes a suggestion you may not like, experimenting to better understand your preferences. This is the AI learning in a Reinforcement Learning fashion.
The Next Step in Industrial Automation: Reinforcement Learning and Machines That Teach Themselves?
You might be saying that it is a big stretch to go from playing a board game to managing a manufacturing process. But the fact is that Reinforcement Learning based AI has already been successfully applied, globally and here in Australia, to quite large manufacturing processes. The results have been significant, including substantial cost savings. Just as in the game of Go, the Reinforcement Learning based AI was not only better at running the machine than its human masters but significantly better than its Supervised Learning AI cousins. Illustrating the importance of Reinforcement Learning in AI, Microsoft has recently made several important acquisitions in this space in its bid to become the predominant force in AI technology globally.
Conclusions
The conclusion is that, whilst not a new concept, Reinforcement Learning based AI (essentially, machines that teach themselves) is a technology that is maturing, and most industrial organisations should be considering how to use it to their benefit. It is not science fiction but a technology that is saving many organisations millions of dollars here and now.
David Goad has more than 20 years of consulting and business strategy experience and is currently a Postgraduate Fellow at the University of Sydney studying IoT and AI. He advises enterprises on their AI and IoT Strategies. If you would like more information on AI or IoT Strategy feel free to contact David at [email protected].