The ABR Story: From Curiosity About Brains to Edge-AI 'Brain' Building Chips and Software.
There has been growing interest in ABR and neuromorphics of late, so it seems like a good time to tell our story a bit more for those new to this space. Many thanks to CIO Bulletin and their affiliated CIOs for the wonderful recognition of Applied Brain Research Inc. (ABR - www.AppliedBrainResearch.com ) as one of their Top 30 Innovative Companies Award recipients (https://www.ciobulletin.com/magazine/applied-brain-research-integrated-ai-systems-for-real-world-applications), on the heels of ABR's appearance at VentureBeat's Transform 2019 AI Showcase (https://venturebeat.com/2019/07/10/first-transform-2019-ai-showcase-highlights-practicality-and-privacy/). The whole team at ABR is very grateful for the recognition of all our work. The co-founders of ABR - Dr. Travis DeWolf, Dr. Trevor Bekolay, Dr. Xuan Choo, Dr. Dan Rasmussen, Dr. Terry Stewart, Dr. Chris Eliasmith and I - are especially honored.
So for those of you who might be reading about us for the first time and wondering what the heck this is all about, here is a bit of our story, and my journey in it. It is written at an introductory level, so don't expect the rigor of a research paper (for that, please see ABR's publications at https://www.nengo.ai/publications/ and Chris' lab at https://compneuro.uwaterloo.ca/publications.html). Stories help us sort out complex things and give us a map across time and the concepts and events they are composed of, so I hope this one helps you understand us a bit better.
The field we work in is called neuromorphics. It is so named because neuromorphic researchers study how the properties of single neurons, neural circuits and whole brains represent, learn, control and transform information to perform the adaptive computations that give rise to cognitive phenomena and, ultimately, behavior.
Neuromorphics as a model for computing AI applications is pretty much just getting started commercially but, as with many teams with 'new' developments, it has been a long road for us at ABR to this point. My co-CEO and the scientific and technical leader of our company, Dr. Chris Eliasmith (https://appliedbrainresearch.com/about-us/eliasmith/ ), was a Waterloo engineer who became fascinated with the brain and how it might work. His focus became how the brain represents information, and thereby meaning. Chris enrolled in a PhD program in psychology and neuroscience and, while doing his doctorate, met up with a brilliant physicist, Dr. Charles (Charlie) Anderson. Chris and Charlie released a book together entitled "Neural Engineering: Computation, Representation, and Dynamics in Neurobiological Systems" in 2003 (https://mitpress.mit.edu/books/neural-engineering).
The book details the Neural Engineering Framework (NEF), a set of mathematical methods that can be used to emulate three major principles of neural computation: (1) how populations of spiking neurons and their connections represent information from the world and from each other, (2) how those representations are transformed during neural processing, including learning, and (3) how the dynamics that arise as spiking neurons drive one another in networks can be computed and controlled. The Nengo (Neural ENGineering Objects) software package, which Chris and his team built to simulate brain circuits, can be used to implement NEF networks as well as many other forms of neural networks, both non-spiking and spiking (rate, timing and phase coded), allowing the rapid visual design, development, execution and debugging of multi-network, dynamic, real-time neural systems. Nengo embodies Chris' and his team's belief in "proof by construction" (more on this below).
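To make those principles a bit more concrete, here is a minimal sketch of a Nengo model in Python (my own illustrative example, not one from the book): one ensemble of spiking neurons represents a sine-wave input, and a connection to a second ensemble computes a transformation (here, squaring) of that representation.

```python
import numpy as np
import nengo

with nengo.Network() as model:
    # A node supplies a time-varying input signal.
    stim = nengo.Node(lambda t: np.sin(2 * np.pi * t))

    # An ensemble of 100 spiking neurons represents the 1-D signal (principle 1).
    a = nengo.Ensemble(n_neurons=100, dimensions=1)
    nengo.Connection(stim, a)

    # A second ensemble decodes a transformation of the signal (principle 2):
    # the connection weights are solved so the decoded value approximates x**2.
    b = nengo.Ensemble(n_neurons=100, dimensions=1)
    nengo.Connection(a, b, function=lambda x: x ** 2)

    # A probe records the decoded, filtered output for inspection.
    p = nengo.Probe(b, synapse=0.01)

with nengo.Simulator(model) as sim:  # the CPU reference backend
    sim.run(1.0)

print(sim.data[p][-5:])  # decoded samples should approach sin(2*pi*t)**2
```

The same model object can later be handed to other backends (FPGA, Loihi, SpiNNaker), which is the point of building against Nengo's API rather than against a particular chip.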
In 2011, the weight of reading Chris' published papers, which showed strong statistical correlations between actual biological circuits and spiking software models, drove me to contact him. My prior reading across many sub-fields of neuroscience had left me asking why no one was taking all these models of neural circuits and trying to fit them together. Maybe by trying you would learn a great deal, rather like fitting jigsaw puzzle pieces together to constrain the number of possible ways you can build the picture, instead of simply observing the pieces and imagining the picture. Little did I imagine back then that the puzzle of how brains work was so large and had so many as-yet-unknown dimensions. But it was a start. To compound matters, I was in search of both a business success and an understanding of the brain. Entrepreneurs have a painful defect of seeing possibilities where more rational folks might see lifetimes of exploration. Excited, I contacted Chris; he foolishly took that fateful meeting in his office at the University of Waterloo in the fall of 2011, and we spoke about brains, AI, creating companies and neuroscience research. I asked him why no one was trying to fit the puzzle pieces of known neural circuits together, and he told me about Spaun. "We already did a first version of that and we are about to publish it," he said, or some such remarkable statement. I was stunned. I felt like I had won the lottery of being in the right place at the right time to get a front-row seat to a sea-change event. As important as it was, it was drowned out in 2012 by the global deluge of attention for the 'discovery' of deep learning and its more immediate commercial viability.
Many have been inspired by the famed Dr. Richard Feynman's edict "What I cannot create, I do not understand", including Chris and his research colleagues. For many years researchers have used Nengo to emulate brain circuits and compare the spiking dynamics of the Nengo model to the actual spiking patterns recorded from the organisms in which the neural circuits were observed. Remarkably, researchers were often able to reproduce in Nengo the spiking patterns found in nature. Over the years, Chris' lab has published many simulations of brain circuits using the NEF and other spiking network types. Extensions to the NEF since then have included dendritic processing, delay networks and semantic representations using the Semantic Pointer Architecture (SPA). The SPA is particularly interesting: it is a key component of the large brain model called Spaun (https://www.popsci.com/science/article/2012-11/meet-spaun-first-computer-model-complex-brain-behavior/) that Chris and ABR's founders built and published in Science in 2012 (https://science.sciencemag.org/content/338/6111/1202).
Spaun is what an old 4th-generation programmer like me might call a 'thin vertical slice' of the full system, a really thin slice of the mammalian brain's architecture in this case. It is not a brain like yours or mine, but it does show what many networks of spiking neurons can do when built and run in parallel, each network performing a function and integrating with the others, similar in some ways to how brains are thought to work. Spaun's step forward was in making all those smaller networks spike away and not produce garbage (random output), but actually replicate higher-level behavioral tasks. Limited as it is, try imagining coordinating 6.5 million asynchronous spiking neurons to do anything useful, let alone having this group of networks shed light on how brains solve small pattern-matching IQ tests. It was controversial because everyone wants to say it is not a brain. It isn't. But it is a model of how some dynamic phenomena of brains might work. An advance for spiking networks and constructionism, it certainly was. To see the latest version, Spaun 2, see Dr. Xuan Choo's (lead on Spaun) PhD thesis at https://pdfs.semanticscholar.org/90e0/2c6febad1fe6b1768a34294938e221b8e379.pdf?_ga=2.31027322.415586952.1564410234-782906329.1564410234.
The SPA methods used in Spaun are a set of mathematical techniques that show how spiking neurons could store semantic concepts in their activities. The SPA is a compact representation of the sparse, distributed coding, embodied in the web of connections between neurons, that is thought to be used in the brain, especially in the cerebral cortex. Spaun uses a large collection of dynamically connected NEF and SPA networks to show how the brain might take in visual sensory information, store it in the spiking activities of its neurons, bind those sensory activities to semantic representations of the concepts the visual system is seeing, perform reasoning over those semantic representations, and finally compute a result (such as the missing digit that completes a pattern across the symbols Spaun was shown previously) and write it out using a neural emulation of motor cortex.
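As a toy illustration of the kind of vector binding the SPA builds on (plain NumPy here rather than Nengo's spiking SPA networks, and the 'concepts' are just random vectors I made up for the example), circular convolution binds two high-dimensional vectors into a single vector, and an approximate inverse recovers a noisy version of either component.

```python
import numpy as np

def bind(a, b):
    # Circular convolution: combines two high-dimensional vectors into one.
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=len(a))

def unbind(c, b):
    # Approximate unbinding: bind with the involution of b.
    b_inv = np.concatenate(([b[0]], b[:0:-1]))
    return bind(c, b_inv)

d = 512  # dimensionality of the semantic pointers
rng = np.random.RandomState(0)
shape = rng.randn(d) / np.sqrt(d)    # stands in for the concept "SQUARE"
colour = rng.randn(d) / np.sqrt(d)   # stands in for the concept "BLUE"

bound = bind(shape, colour)          # one vector representing "BLUE SQUARE"
recovered = unbind(bound, colour)    # noisy reconstruction of "SQUARE"

# Cosine similarity shows the recovered vector matches the original concept.
cos = np.dot(recovered, shape) / (np.linalg.norm(recovered) * np.linalg.norm(shape))
print(round(float(cos), 2))  # well above chance for random vectors
```

In Spaun, the analogous binding and unbinding are carried out by populations of spiking neurons, which is what lets symbol-like manipulation coexist with neural representation.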
Spaun executes a loop from sensing the environment, to representing, to conceptualizing, to pattern processing, to deciding, to taking a motor action back in the same environment where the sensory inputs originated. Spaun has only 6.5 million or so neurons, while human brains have roughly 100 billion, but it is significant because it shows how high-level behaviors, like completing a trivial pattern of digits, might be performed by brains. Commercially, Spaun shows how millions of neurons, organized into many nuclei that each perform unique tasks, can contribute the results of their processing in parallel toward the successful construction of the high-level behaviors we associate with thinking. The brain is so much more complex, but Spaun was a start and a manifestation of ideas about how neurons might work together to integrate all these parts dynamically. If you want to read about exactly how Spaun works, Chris published a very readable book called "How to Build a Brain" detailing the methods used to develop Spaun and how they relate to his and his team's theories on how brains work (https://www.oxfordscholarship.com/view/10.1093/acprof:oso/9780199794546.001.0001/acprof-9780199794546).
For my own part, I ambitiously and very over-enthusiastically attempted to replicate an important set of circuits theorized to be used in the brain, the cortico-thalamo-cortical loops thought to be used by the visual system to represent concepts at various levels from visual inputs. Taking theories from Rodriguez, Whitson and Granger (https://www.brainengineering.org/publications/2018/4/3/the-differential-geometry-of-perceptual-similarity-5n8aw-4a5kf-bpfw4-mxnbp-89anl-948e9-h8zkm-twhth-kkxla-398ym-zjga4-7pc9k-xb7er) of Dr. Granger's Brain Engineering Lab at Dartmouth ( https://www.brainengineering.org/ ) and combining them with Chris' methods for building Nengo models of brain circuits, I built a model of information flows from the visual stream into layer 4 of the 6-layer neocortex, then through other cortical layers (and across cortical areas) into the thalamus' circuits and back to the cortex ( https://compneuro.uwaterloo.ca/publications/suma2018.html ). I also used my time at Chris' lab to survey the most prominent models of how the brain works at a circuit level. My own feeling is that we know far more about the brain than the vast majority of us believe. It is a fascinating machine composed of dynamically connected, spiking neural circuits which implement layered, spatio-temporal maps of the world, of the world's dimensions and of our internal representations of it: intersecting maps of maps, built from dynamically formed, integrated circuits of spiking neurons, with lots of other methods of computation thrown in at many different levels, from genes on up. I personally suspect we will uncover some of the foundational computing principles of the brain in the not-too-distant future.
I also believe we can build more efficient computing, as well as very functional robots and information systems, using just a subset of the principles we are discovering in the brain. If real-time AI is to advance, surely one way is to work to emulate brains, which are dynamic computing systems intimately tied to their environments. The brain is the only working model we have of intelligence, after all. Think about that for a minute. Really think about it. It is the only working model of the kind of intelligence we are seeking, so why not look there to advance AI? There are many intelligent organisms on this amazing planet, but the performance we are trying to achieve from our autonomous cars, robots and information systems is measured by the yardstick of our own performance. Of course we may yet develop theories of what intelligence is in theoretical information-processing terms, generalize them, and ultimately find that many systematic embodiments of intelligence are possible. But at the least it is fair to observe that more than a hundred years of trying has not yet produced that answer. So it seems very reasonable to pursue the construction of brain-circuit models, slowly eking out the computational principles of neural circuits and of the dynamic systems built from them. This seems especially worthwhile when combined with all the incredible progress being made in statistical learning and the broader field of AI. The two fields seem to be converging more and more, if the recent statements and papers from the leading deep learning labs and organizations are any guide.
Some of the brain's lower-level computing principles are now being seen, still skeptically of course, in a new light. We know the mammalian brain uses spiking and non-spiking neurons to compute. After more than a hundred years, we are only now coming to appreciate that spiking is useful not just for sending a signal along a long neural axon without loss of fidelity; it is also energy efficient at a system level (see Intel's Mike Davies' series of posts entitled "Why Spikes?" for a good summary https://www.intel.ai/exploring-neuromorphic-computing-for-ai-why-spikes-part-one/#gs.spfxn2 ). If we want to compute all these AI networks, whether deep learning or spiking, one of the most efficient ways surely must lie in creating a field of hardware neurons directly on a chip to minimize the amount of power used. Biological neurons compute independently and simultaneously, embedded in a circuit of inhibitory and excitatory influences. Each neuron processes the signals it receives on a continuous basis. As our AI networks become more temporal, processing data continuously from real-time sensors or temporal data such as video, the compute loads rise. Biological neurons often focus on processing just the difference between what is being sensed now and what was sensed last. This "temporal sparsity", combined with the threshold behavior of spiking neurons (simply put, neurons spike only when they reach a threshold of evidence that they should), saves time and energy, making AI systems more responsive and economical.
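A back-of-the-envelope sketch of the temporal-sparsity idea (my own toy delta encoder, not any particular chip's scheme): emit an event only when the input has changed by more than a threshold since the last event, so a slowly changing sensor generates very little work downstream.

```python
import numpy as np

def delta_events(signal, threshold=0.05):
    """Emit (index, value) events only when the signal moves past a threshold."""
    events = []
    last = signal[0]
    for i, x in enumerate(signal):
        if abs(x - last) >= threshold:
            events.append((i, x))
            last = x
    return events

t = np.linspace(0, 1, 1000)
sensor = 0.5 + 0.1 * np.sin(2 * np.pi * t)  # a slowly varying sensor reading

events = delta_events(sensor)
print(len(sensor), "samples ->", len(events), "events")  # far fewer events than samples
```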
Nengo runs spiking neural networks on neuromorphic computing chips such as Intel's Loihi and the SpiNNaker chip. ABR has been working for and with Intel on the Loihi project since 2015. ABR built the very first working deep learning, reinforcement learning and adaptive networks on Loihi and presented them with Intel at the NICE Conference in 2018 (https://appliedbrainresearch.com/press/2018-03-01-nice-loihi/). With regard to SpiNNaker2, ABR is commercializing the next generation of the chip in partnership with its creators, led by Dr. Christian Mayr at TU Dresden in Germany; the revised chip, whose working name is SpiNNakerNeo, is set for release in 2021. A good article discussing Chris, ABR, Spaun, Loihi, SpiNNaker and neuromorphics in general was published in May 2019 in The Scientist and is very much worth a read (https://www.the-scientist.com/features/building-a-silicon-brain-65738) to contextualize the space and our role in it.
Neuromorphic chips exploit temporal sparsity in addition to the weight sparsity (basically, keeping only the connections that make the most difference to the network's performance) commonly used in AI networks. Combined with Nengo, spiking chips yield fast, dynamic, low-power AI networks for processing video, speech, control and semantic models, enabling cars, phones, drones, robots and controllers to sense, react and control on a continuous, real-time basis with less power, which means potentially better control, longer battery life and lower TCO. It is still early days in commercial spiking neural computing, but the neuron counts are growing fast, bringing more and more problems within range of being processed on a single chip or a few chips. ABR's FPGA bitstream (https://appliedbrainresearch.com/products/brainboard/) supports between 16,000 and 32,000 neurons and costs $99 USD (https://store.appliedbrainresearch.com/collections/nengo-fpga) plus the cost of an FPGA board from either Intel/Altera or Xilinx (https://www.nengo.ai/nengo-fpga/getting_started.html#things-you-need); Intel's Loihi has 128,000 neurons per chip; SpiNNaker2 can process roughly 300,000 neurons on a chip; Intel's Pohoiki Beach board has over 8,000,000 neurons and bigger boards are rumored (https://www.allaboutcircuits.com/news/intel-introduces-pohoiki-beach-its-64-chip-neuromorphic-system/); and SpiNNaker1's largest system can theoretically simulate hundreds of millions of neurons on its million ARM cores ( https://www.manchester.ac.uk/discover/news/human-brain-supercomputer-with-1million-processors-switched-on-for-first-time/).
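For intuition about the weight-sparsity idea mentioned above, here is a toy magnitude-pruning sketch (illustrative only, not the specific sparsification method any of these chips or tools uses): keep only the largest-magnitude weights and zero out the rest.

```python
import numpy as np

def prune_by_magnitude(weights, keep_fraction=0.1):
    """Zero out all but the largest-magnitude weights."""
    flat = np.abs(weights).ravel()
    k = max(1, int(keep_fraction * flat.size))
    cutoff = np.partition(flat, -k)[-k]  # k-th largest magnitude
    return np.where(np.abs(weights) >= cutoff, weights, 0.0)

rng = np.random.RandomState(0)
w = rng.randn(256, 256)                       # a dense weight matrix
w_sparse = prune_by_magnitude(w, keep_fraction=0.1)
print("nonzero fraction:", np.count_nonzero(w_sparse) / w.size)  # ~0.1
```

In practice pruned networks are usually fine-tuned afterwards to recover accuracy, but the basic point stands: most connections contribute little and can be dropped, which is exactly what sparse neuromorphic hardware is built to exploit.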
Neuromorphic systems can also learn, using methods inspired by how biological neural circuits implement learning. In one neural learning method, the neurons in a circuit modulate their connectivity to other neurons as a function of which neurons spike, or not, in relation to their own spiking behavior. This effect was first proposed and published by a Canadian, Dr. Donald Hebb, in 1949. For commercial AI, this means that a neuromorphic network could be deployed on a neuromorphic processor and continue to learn after being deployed. Normally, edge-AI networks (networks on devices at the 'edge' of the network: cars, drones, phones, cameras, controllers, IoT devices, etc.) are trained on large computers employing many GPUs and/or CPUs; once trained and sparsified, they are deployed to infer (be run) on whatever processor, CPU, GPU or matrix multiplier (new chips made to improve AI inference performance), is in the edge device the network runs on. For many problems that works fine. AI networks, though, need a lot of training data to become better and better at whatever classification and/or categorization task they were created for. Supervised learning in deep learning networks (the most common type of AI network and learning method used commercially today) is a process of running over all the training cases (an example of the input paired with the answer the network is to produce on seeing that example) and adjusting the weights of the connections in the network by working backward from the network's answer-layer nodes (nodes are the analog of neurons in a deep learning network) down the layers toward the input layers, adjusting all the weights such that the right answer is a little more likely next time. This training process is called back-propagation. The results it can achieve are downright amazing. Watching AI networks translate speech, classify images and generate faces for the first time is as fascinating as it can be scary. Back-propagation through many layers was a core part of the amazing work done in Canada at Dr. Hinton's lab at the University of Toronto and published in 2012, which revolutionized AI forever. Our country and the technology industry the world over will never be the same.
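To contrast that local, Hebbian style of learning with back-propagation's global error signal, here is a toy Hebbian update (illustrative only, not the rule any particular neuromorphic chip implements): a weight grows whenever its presynaptic and postsynaptic neurons are active together, using only information local to that connection.

```python
import numpy as np

def hebbian_update(w, pre, post, learning_rate=0.01):
    """Strengthen w[i, j] when presynaptic unit j and postsynaptic unit i co-fire."""
    return w + learning_rate * np.outer(post, pre)

rng = np.random.RandomState(0)
w = np.zeros((4, 8))  # 8 presynaptic units feeding 4 postsynaptic units
for _ in range(100):
    pre = (rng.rand(8) > 0.5).astype(float)    # presynaptic activity (spikes as 0/1)
    post = (rng.rand(4) > 0.5).astype(float)   # postsynaptic activity
    w = hebbian_update(w, pre, post)

print(w.round(2))  # weights grow where pre/post activity coincided
```

Note that no error signal or backward pass is needed, which is part of why Hebbian-style rules map so naturally onto on-chip, always-on learning.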
There is more to be done though. Once we deploy a network for inference, if we encounter an unknown case, an image we did not train on, anything the patterns learned from the training data do not adequately capture in the network's learned weights, we will get less accurate answers. So just adjust the weights on the fly, one might say. But we cannot do it that way. For any such case we could turn on the back-propagation process and adjust the weights right then and there, assuming a human or other system told us that the answer our network produced was wrong and what the right answer should be. We could do that, but the effect of the weight changes would be a very small move in the direction of this new case, because we must change the weights only a little toward each new training case or the network will fixate on a subset of cases, and more so on the last cases learned. If we turn up the learning rate a lot (the measure of how much to move the weights for each training case), we would swing the network too far toward that new case and jeopardize the network's already-learned balance of weights over all the training cases. So we would have to re-run the whole training set, then test the network, then sparsify it, then republish it to account for any new cases seen after the network was deployed into inference mode. As a result of this batch nature of the process needed to produce a properly trained network, AI networks are most often batch-trained and then re-published with any new training cases captured at the edge. The new cases are therefore not used to train the network until the next batch training is run. Learning new cases on the fly at the edge would help make our networks more accurate over time. There is much more to this massively simplistic characterization, but the general point that most commercial networks are batch-trained in practice is accurate. Many researchers are working on methods of online learning, as well as learning generalizations, transferring learning from one case to another, rapid re-training, federating learning across many edge AIs and many other improvements to AI performance, but neuromorphics has something to offer here as well.
So can neuromorphics help networks learn continually? Neurons learn in many ways, and while it is not clear what all of those ways are, nor whether they include methods akin to back-propagation (there are experts on both sides of that debate; most say no, but others argue biology has processes that are effectively similar), it is clear that neurons do use Hebbian learning in places. In neuromorphic systems, we at ABR have found that real-time changes to the learned weights of neuromorphic AI networks are very helpful for smoothly learning mappings that change over time. One example we have built is an adaptive robotic arm controller. The controller includes a neuromorphic network whose weights are adjusted on the fly to learn to control a robot arm when it encounters objects that weigh more or less than it was trained to handle, or when the robot itself wears out, resulting in motors that need more energy to achieve the same movements as before. One of our co-founders, Dr. Travis DeWolf, built this adaptive neuromorphic AI system by simulating the human motor cortex's process for performing real-time adaptation, and we have now run it on Intel's Loihi chip. The system demonstrates real-time, post-training, dynamic adaptation of robotic reaching (see videos on ABR's YouTube channel https://www.youtube.com/channel/UCYvrmvMo5hdT8ckOqz0wJ3Q/videos ). We are now working on grasping with real-time adaptation, like humans and animals do every day with every movement.
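This is not Travis' arm controller, but a minimal sketch of the kind of online, error-driven adaptation Nengo supports through its PES learning rule: a connection starts out computing the wrong function, and its decoders are adjusted while the simulation runs so the decoded output converges on a target signal, with no offline retraining.

```python
import numpy as np
import nengo

with nengo.Network() as model:
    stim = nengo.Node(lambda t: np.sin(2 * np.pi * t))
    pre = nengo.Ensemble(100, dimensions=1)
    post = nengo.Ensemble(100, dimensions=1)
    nengo.Connection(stim, pre)

    # Start with a connection that computes the wrong function (always zero),
    # and attach the PES learning rule so it can adapt online.
    conn = nengo.Connection(pre, post, function=lambda x: np.zeros(1),
                            learning_rule_type=nengo.PES(learning_rate=1e-4))

    # Error signal = actual output minus desired output (here the stimulus itself).
    error = nengo.Ensemble(100, dimensions=1)
    nengo.Connection(post, error)
    nengo.Connection(stim, error, transform=-1)
    nengo.Connection(error, conn.learning_rule)

    p = nengo.Probe(post, synapse=0.01)

with nengo.Simulator(model) as sim:
    sim.run(10.0)

# After a few simulated seconds, post's decoded output tracks the stimulus.
print(sim.data[p][-5:])
```

The adaptive arm controller uses the same basic ingredient, an error signal driving online weight changes, but applied to motor adaptation and run on Loihi.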
For all those existing TensorFlow deep learning networks, we have also implemented deep learning on spiking networks with our patent-pending Nengo spiking deep learning (NengoDL) implementation ( https://www.nengo.ai/nengo-dl/ ), led by Dr. Dan Rasmussen. You can build spiking deep learning networks from scratch, or migrate your already-built models to Nengo to run them on spiking hardware. You can wait until neuromorphic chips arrive from ABR, Intel and others or, for smaller projects, run them right now on our ABR FPGA implementation. If you are familiar with deep nets, NengoDL gives you the low-power and responsiveness benefits of spiking implementations with the least amount of work and a smaller learning curve. As an added benefit, for the right networks, Nengo networks running on neuromorphic chips like Loihi can be among the lowest-power, most scalable platforms for edge AI. See the benchmarks here https://arxiv.org/abs/1812.01739: 109x better than an NVIDIA Quadro 4000 and 5x better than the best other device we tested in that study, the Movidius.
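Here is a sketch of what that migration path can look like with NengoDL (exact API details vary by NengoDL version, and the tiny Keras model below is just a stand-in for your own network): a trained TensorFlow/Keras model is converted into a Nengo network, with its ReLU activations swapped for spiking neurons, and then run in the NengoDL simulator.

```python
import numpy as np
import tensorflow as tf
import nengo
import nengo_dl

# A placeholder Keras model standing in for an existing deep network.
inp = tf.keras.Input(shape=(784,))
hidden = tf.keras.layers.Dense(128, activation="relu")(inp)
out = tf.keras.layers.Dense(10)(hidden)
keras_model = tf.keras.Model(inputs=inp, outputs=out)

# Convert to a Nengo network; swap_activations replaces ReLUs with spiking neurons.
converter = nengo_dl.Converter(
    keras_model,
    swap_activations={tf.nn.relu: nengo.SpikingRectifiedLinear()},
)

# Run the converted (spiking) network on some dummy data.
with nengo_dl.Simulator(converter.net, minibatch_size=1) as sim:
    data = {converter.inputs[inp]: np.zeros((1, 1, 784), dtype=np.float32)}
    output = sim.predict(data)
    print(output[converter.outputs[out]].shape)
```

In a real workflow you would train the Keras model first (or fine-tune after conversion), add synaptic filtering, and run the network for multiple timesteps so the spiking neurons have time to accumulate evidence; the NengoDL documentation walks through those steps.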
One thing to keep in mind when thinking about neuromorphics is that the next phases of AI are not a neuromorphics-versus-deep-learning debate. As an old friend and a very bright Microsoft technical expert told me repeatedly years ago, when we worked on a very large distributed-systems project together, "Do the right thing in the right place". One possible commercial scenario would have cloud-based deep learning AI networks running in the data center, with spiking AI networks on edge devices (especially the always-on ones), all working together. The neuromorphic chips would do what they do best: provide real-time, always-on, dynamic AI, matching their power use to the computing needed when the sensors or devices they control or use need intelligent processing, and contributing to the system's ability to adapt to unexpected cases as they arise. Devices that need specific, fully optimized, fixed networks will likely implement those in special-purpose ASICs, but where general computing is needed, with the flexibility to update networks at any time, change them on the fly, or have them learn online, neuromorphic solutions are becoming a very attractive option. As far as we can tell, brains use many neurons, networks, nuclei and algorithms to achieve their purposes; we should too. Use the right tool for the right job in the right place.
With neuromorphic systems we have a new addition to the AI toolkit for edge AI. Adaptive, real-time, continuous, dynamic AI applications that are power efficient can now be built with neuromorphic chips and software tools. Of course, these applications all need to be programmed. That is where we at ABR come in. Nengo is the only visual design, development and debugging platform supporting all the major spiking neural chips today, with more coming. You build your neural model once, and then you can run it on whichever chip or board suits your deployment purpose. You therefore invest in your model once, and can run a small version on an IoT device while running the same AI network on a GPU or neural board where more performance is needed. In the coming years neuromorphics needs many, many developers to take up the challenge and start learning to program these dynamical systems, taking advantage of the near-term power-efficiency wins these new chips can offer for edge-AI uses, to grow use cases and companies.
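A sketch of that 'build your model once' idea (reusing the toy network from earlier; the commented-out backend assumes the nengo-loihi package is installed): the same Nengo model object is handed to whichever Simulator class matches the target hardware.

```python
import numpy as np
import nengo

with nengo.Network() as model:
    stim = nengo.Node(lambda t: np.sin(2 * np.pi * t))
    ens = nengo.Ensemble(100, dimensions=1)
    nengo.Connection(stim, ens)
    probe = nengo.Probe(ens, synapse=0.01)

# Reference backend: emulate the spiking neurons on the local CPU.
with nengo.Simulator(model) as sim:
    sim.run(0.5)

# The very same model can be handed to a hardware backend instead, e.g.:
#   import nengo_loihi
#   with nengo_loihi.Simulator(model) as sim:  # Loihi chip or its emulator
#       sim.run(0.5)
```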
For all who enter neuromorphics, the entry investment offers a chance to work on the grand challenges beyond: to find out what these dynamical computing platforms can do. It is a whole new world, one that leans heavily on the explosion of AI advances and tools of the last few years while offering new models to explore for building real-time, always-on, edge-AI systems ("brains"). It is a most fascinating time right now. While deep learning has grown everyone's confidence that progress can be made after the long, painful AI winters of past decades, the critical vulnerabilities of current deep nets (lacking an ability to truly generalize, to transfer knowledge and to deal with time, ...) are urging us all to consider new ways to move forward, including looking back to the thing that inspired us in the first place: the brain.
If you wish, you can get started with www.Nengo.ai today by downloading Nengo from https://www.nengo.ai/download/. You can run your Nengo models on your desktop CPU or GPU (Nengo will use it to emulate the asynchronous, parallel spiking neurons in your networks). If you want to start programming spiking hardware, you can get started with single-population networks of up to 32,000 neurons using the Nengo FPGA bitstream ( https://appliedbrainresearch.com/products/brainboard/), use the Nengo Loihi software simulator (https://www.nengo.ai/nengo-loihi/) to start programming the Intel Loihi before it is even generally available, or start programming with Nengo for SpiNNaker1 in anticipation of ABR's release of the SpiNNakerNeo chip in 2021, based on SpiNNaker2 from the team at TU Dresden and ABR. For Nengo's documentation please see https://www.nengo.ai/documentation/.
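For example, a minimal way to try the Loihi emulator (assuming the nengo-loihi package is installed; as I understand the current API, target="sim" selects the software emulator, but check the nengo-loihi docs for your version):

```python
import numpy as np
import nengo
import nengo_loihi  # pip install nengo-loihi

with nengo.Network() as model:
    stim = nengo.Node(lambda t: np.sin(2 * np.pi * t))
    ens = nengo.Ensemble(100, dimensions=1)
    nengo.Connection(stim, ens)
    probe = nengo.Probe(ens, synapse=0.01)

# target="sim" runs the Loihi emulator in software, so you can develop and
# debug Loihi models before you have access to the physical chip.
with nengo_loihi.Simulator(model, target="sim") as sim:
    sim.run(0.5)

print(sim.data[probe][-3:])
```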
Note: For customers in Israel, if you are interested in purchasing ABR's FPGA bitstream for your FPGA board, please contact ABR's exclusive distributor in Israel at NBEL Innovation Labs about purchasing the bitstream ( https://nbel-lab.com/neuromorphic-engineering ). NBEL is a wonderful company, filled with technical depth and enthusiasm for neuromorphic engineering. They offer classes and workshops and many other services in the field.
It has been a winding road to date. The recognition these awards bring to the work it has taken to lift this new concept, whether Chris' and his team's pioneering research or our collective commercial efforts, is very welcome. Awards such as these are nice stops along this fascinating, continuous learning journey.