The Journey of Back-Propagation: The Revolution in Neural Networks
Parth Sharma
Data Analyst, PwC Australia | Transforming Data into Strategic Insights | CSPO | Exploring AI with a Curious Mind
In the late 1960s, artificial intelligence (AI) reached a crossroads. In their influential 1969 book Perceptrons, researchers Marvin Minsky and Seymour Papert cast doubt on whether multilayer neural networks could ever be trained successfully. Their scepticism contributed to a sharp drop in funding and interest in neural network research. Yet determined researchers persisted, and what followed changed AI’s course forever: the creation of the back-propagation algorithm.
The Birth of Back-Propagation
The back-propagation algorithm defied Minsky and Papert’s predictions. But what is back-propagation? Picture a neural network misclassifying an image. Back-propagation measures the error at the output and traces it backward through the network, layer by layer, working out how much each weight contributed to that error and nudging each weight in the direction that reduces it. Over many such updates, the network refines itself, reducing errors and improving accuracy. This iterative learning process is the foundation of modern neural networks and has driven many of today’s AI advancements.
A Simple Example of Back-Propagation
Imagine teaching a child to throw a basketball into a hoop. On the first attempt, the child misses the hoop. You then give feedback: "Throw a bit harder." The child adjusts their throw slightly and tries again, gradually improving until they consistently score.
In a neural network, the same feedback loop plays out numerically. Suppose a network is learning to recognise cats: it makes a prediction for an image, the error between that prediction and the correct label is measured at the output, and back-propagation nudges every weight in the direction that shrinks the error.
Over thousands of iterations, the network "learns" to identify cats by minimising the error.
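The feedback loop above can be sketched in a few lines of code. The example below is a minimal illustration I’ve put together (it is not from Mitchell’s book): a tiny two-layer network with one hidden unit, trained by gradient descent. The forward pass makes a prediction; the backward pass traces the error back through the layers, assigning blame to each weight.

```python
import math

def sigmoid(z):
    """Squashing activation for the hidden unit."""
    return 1 / (1 + math.exp(-z))

def train(x, target, w1=0.5, w2=0.5, lr=0.5, steps=2000):
    """Train a tiny network y = w2 * sigmoid(w1 * x) toward `target`."""
    for _ in range(steps):
        # Forward pass: make a prediction.
        h = sigmoid(w1 * x)
        y = w2 * h
        # Backward pass: trace the error back, layer by layer.
        d_y = 2 * (y - target)        # error signal at the output
        d_w2 = d_y * h                # blame assigned to the output weight
        d_h = d_y * w2                # error handed back to the hidden unit
        d_w1 = d_h * h * (1 - h) * x  # blame assigned to the input weight
        # Adjust each weight a little in the direction that reduces error.
        w1 -= lr * d_w1
        w2 -= lr * d_w2
    return w1, w2

w1, w2 = train(x=1.0, target=0.8)
print(round(w2 * sigmoid(w1), 3))  # prediction has converged to ~0.8
```

Real networks have millions of weights and process batches of labelled images, but the mechanism is the same: forward pass, error at the output, gradients flowing backward, small weight updates.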
Connectionism: The Rise of Subsymbolic Thinking
In the 1980s, the term "connectionist networks" gained popularity to describe neural networks. Connectionism emphasises the idea that knowledge is stored in the weighted connections between network units, unlike symbolic AI, which relies on explicit rules and logic.
Psychologists David Rumelhart and James McClelland were key figures in this movement. Their 1986 book, Parallel Distributed Processing, argued that the brain’s structure—made up of interconnected neurons—is better suited for tasks like perception, language understanding, and memory retrieval than symbolic AI systems. This subsymbolic approach marked a shift in AI thinking.
The Fall of Symbolic AI and the AI Winter
Symbolic AI initially showed promise. Systems like MYCIN, which used around 600 rules to diagnose bacterial blood infections, demonstrated the power of rule-based reasoning. MYCIN could even explain its decision-making process, showcasing symbolic AI’s transparency.
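To make the idea of rule-based reasoning concrete, here is a simplified sketch of forward-chaining inference in the style of an expert system. The toy facts and rules below are invented for illustration; they are not MYCIN’s actual medical rules (which also used certainty factors rather than hard conclusions).

```python
# Each rule: (conditions that must all hold, conclusion to add).
# These facts and rules are made up for illustration only.
RULES = [
    ({"gram_negative", "rod_shaped"}, "possible_enteric_bacterium"),
    ({"possible_enteric_bacterium", "hospital_acquired"}, "flag_for_culture"),
]

def infer(facts):
    """Forward-chain: keep firing rules until no new facts appear."""
    facts = set(facts)
    derivation = []  # record each firing, so the system can explain itself
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in RULES:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                derivation.append(f"{sorted(conditions)} -> {conclusion}")
                changed = True
    return facts, derivation

facts, why = infer({"gram_negative", "rod_shaped", "hospital_acquired"})
print("flag_for_culture" in facts)  # True
for step in why:                    # the "explanation" MYCIN was known for
    print(step)
```

The recorded derivation is what makes such systems transparent: every conclusion can be traced back to the exact rules and facts that produced it.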
However, by the mid-1980s, symbolic AI faced significant challenges. These systems were brittle, prone to errors, and struggled to adapt to new situations. The core problem was their reliance on explicitly programmed human knowledge: real expertise draws on subconscious insight and common sense, which are very difficult to capture in rigid rules.
This led to a decline in symbolic AI and ushered in another AI Winter, a period of reduced funding and interest in AI research. Yet the debate between symbolic and subsymbolic approaches continued, shaping AI’s evolution.
Symbolic vs. Subsymbolic: A Tale of Two Paradigms
Symbolic systems excel in tasks requiring clear logic and reasoning. For example, MYCIN’s rule-based design allowed it to diagnose diseases while explaining its reasoning. However, these systems falter when the rules are unclear or hard to define.
On the other hand, subsymbolic systems, like neural networks, shine in areas where humans struggle to define explicit rules. Tasks like recognising handwriting, identifying voices, or catching a ball come naturally to these systems. Philosopher Andy Clark summed it up perfectly: subsymbolic systems are "bad at logic, good at Frisbee."
So, Why Not Use Both?
This raises an important question: Why not use symbolic systems for logic-based tasks and subsymbolic systems for perceptual tasks?
This idea has led to hybrid systems, where the strengths of both paradigms are combined. These systems integrate symbolic reasoning with the perceptual power of neural networks, enabling them to reason and perceive simultaneously. For instance, neuro-symbolic AI combines symbolic logic with neural architectures, allowing systems to learn and reason in complementary ways.
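The division of labour in a hybrid system can be sketched very roughly as a pipeline: a perceptual component turns raw input into symbols, and a symbolic layer reasons over those symbols. Both components below are invented stand-ins (a real system would use a trained neural network and a proper rule engine), but the sketch shows how the two paradigms can slot together.

```python
def perceive(pixels):
    """Stand-in for a neural classifier: maps raw input to a symbol.

    A real hybrid system would run a trained network here; this
    threshold rule is a placeholder for illustration only.
    """
    return "cat" if sum(pixels) > 10 else "empty"

def reason(symbol):
    """Symbolic layer: explicit, explainable rules over the symbol."""
    rules = {"cat": "close the door", "empty": "do nothing"}
    return rules.get(symbol, "ask a human")

label = perceive([3, 4, 5])   # "perception" produces the symbol "cat"
print(reason(label))          # the rule layer decides: close the door
```

The appeal of this split is that the neural part handles the fuzzy, hard-to-specify mapping from raw data to concepts, while the symbolic part stays inspectable: you can read exactly which rule produced each decision.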
Hybrid models are already making progress in fields like autonomous driving and medical diagnosis. While there’s still work to do before AI becomes fully reliable and transparent, these approaches bring us closer to systems that excel at both logic and perception, bridging the gap between human and machine intelligence.
Acknowledgements
The views expressed in this article are taken from the book Artificial Intelligence: A Guide for Thinking Humans by Melanie Mitchell. All credit for this content goes to the author—I am simply sharing my summarized notes and key takeaways as I work through different sections of the book. Also, a big thanks to ChatGPT and NotebookLM for turning my scribbled notes into something that actually makes sense! Since English isn’t my first language and I’m still learning how to write like a pro, ChatGPT is my trusty sidekick in making these insights clear and readable.