Vygotsky Meets Backpropagation: Alternative neural learning models and the development of higher forms of thought
A new working paper. Please let me know if you are interested in commenting.
keywords: conceptual development, machine learning, Inception-v3, Adaptive Resonance Theory, Sakharov experiment, syncretic thinking, thinking in complexes
Abstract
In this paper we revisit Vygotsky’s developmental model of concept formation and use it to discuss alternative algorithmic approaches to learning in artificial neural networks. We study learning in these networks from a learning sciences point of view, asking whether it is possible to construct systems whose developmental patterns align with empirical studies on concept formation.
The Vygotskian model of cognitive development highlights important limitations in currently popular neural network algorithms and puts neural AI in the context of the post-behavioristic science of learning. At the same time, the Vygotskian model of the development of thought suggests new architectural principles for developing AI, machine learning, and systems that support human learning. Using the Vygotskian framework, we can ask what it would take for machines to learn, and what they could learn from research on learning.
Introduction
In recent years, the wide availability of big data and low-cost parallel computational architectures has led to rapidly growing interest in the capabilities of artificial neural networks. Although neural network models have been developed and studied since the 1930s, and many currently popular network models reflect ideas that were well established several decades ago, only recently has the convergence of computational capability and big data started to create visible breakthroughs in neural AI. The remarkable successes of “deep learning” now suggest that learning theorists may learn something important from neural network research and its algorithms.
For a learning scientist, however, an attempt to review research on neural AI can be quite confusing. More than a superficial understanding of the domain requires multidisciplinary competences in theoretical physics, computer programming, probability theory, linear algebra, and neurobiology (Hagan et al. 2014; Anderson and Rosenfeld 1988). Successful application of neural AI methods often requires detailed understanding of the problem domain, computational experimentation with large numbers of model parameters, and exotic hardware and software platforms. Somewhat surprisingly, however, very little knowledge about human learning is needed. From a learning sciences point of view, “deep learning” verges on being an oxymoron. Most often, the implicit models of learning in neural AI resemble associationist, neo-behavioristic, and reflexological models from around the turn of the twentieth century.
In this paper, we ask whether neural AI algorithms can learn concepts. We therefore start from learning theory, asking to what extent current neural AI algorithms can simulate the different stages of development in conceptual thinking. As a starting point, we use the empirical and theoretical studies on concept formation conducted by Vygotsky and his colleagues in the late 1920s and early 1930s. Vygotsky’s main claim was that advanced forms of adult thought use culturally and historically developed conceptual systems. In the ontogenetic development of a child, different types of pre-conceptual thinking gradually evolve to a point where the child becomes able to internalize socio-culturally accumulated word meanings and use them in his or her thinking. This, according to Vygotsky, leads to qualitatively new types of thought that are not available to animals or young children.
Although Vygotsky’s stage model of conceptual development does not directly map to the developmental age of a child, using this conceptual framework we may roughly locate the capabilities of different learning algorithms among the various developmental stages. Using this approach, we may characterize the different types of conceptual thinking that current neural AI architectures use, and highlight those processes of concept formation that these architectures currently lack.
Due to space limitations, we focus in this paper on two influential types of networks: feedforward convolutional networks and adaptive resonance networks. Feedforward convolutional networks are now widely used, and they implement supervised learning. Adaptive resonance networks are conceptually more interesting from a learning sciences point of view, as they use unsupervised learning and expectations. As a prototypical convolutional network we use the Inception-v3 network, which represents the current state of the art in image recognition (Szegedy et al. 2016). A similar analysis could be extended to recurrent networks that learn from sequential data, unsupervised self-organizing feature maps, or, for example, networks that use reinforcement learning.
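To make the contrast concrete, the sketch below illustrates, in highly simplified form, how an ART-1-style network categorizes binary inputs without any labels: stored prototypes act as top-down expectations, and a vigilance parameter decides whether an input "resonates" with an existing category or recruits a new one. This is an illustrative simplification of the fast-learning variant of Carpenter and Grossberg's (1987) algorithm, not their full differential-equation model; the function and variable names are our own. The supervised convolutional case, by contrast, follows the familiar backpropagation recipe of minimizing a loss over labeled examples.

```python
import numpy as np

def art1_train(inputs, rho=0.7, beta=1.0):
    """Schematic fast-learning ART-1 (unsupervised, binary inputs).

    Each category j stores a binary prototype that acts as a top-down
    expectation. An input resonates with a category only if the overlap
    |I AND w| / |I| reaches the vigilance level rho; otherwise the
    category is reset and the search continues, possibly ending with a
    brand-new category. Illustration only; rho and beta follow common
    ART-1 notation.
    """
    prototypes = []          # learned category templates (expectations)
    assignments = []         # category index chosen for each input
    for I in inputs:
        I = np.asarray(I, dtype=bool)
        # Bottom-up choice: rank existing categories by the choice function
        scores = [np.sum(I & w) / (beta + np.sum(w)) for w in prototypes]
        order = np.argsort(scores)[::-1]
        chosen = None
        for j in order:
            w = prototypes[j]
            match = np.sum(I & w) / max(np.sum(I), 1)
            if match >= rho:             # expectation confirmed: resonance
                prototypes[j] = I & w    # fast learning: prune the template
                chosen = j
                break
            # otherwise: mismatch reset, continue search with the next category
        if chosen is None:               # nothing matches: recruit a new category
            prototypes.append(I.copy())
            chosen = len(prototypes) - 1
        assignments.append(chosen)
    return prototypes, assignments

# Toy usage: three binary "images"; the third is novel enough to
# trigger a new category at this vigilance level.
patterns = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1]]
protos, labels = art1_train(patterns, rho=0.6)
print(labels)   # e.g. [0, 0, 1]; the grouping depends on rho
```

The vigilance parameter is the conceptually interesting part for our purposes: raising it makes the network's expectations harder to satisfy, so it forms more and finer-grained categories, whereas a supervised convolutional network can only adjust its weights toward categories fixed in advance by its labels.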
The paper is organized as follows. In the next section, we briefly describe the empirical basis that informed Vygotsky’s work on conceptual development in children, based on Leonid Sakharov’s (1994) laboratory experiments, and outline the resulting stages of conceptual development. The following sections then use this conceptual framework to discuss learning in neural AI systems: first convolutional networks, using the Inception-v3 network as an example, and then, more briefly, adaptive resonance networks, originally developed by Carpenter and Grossberg (1987). The paper ends with some concluding comments.