Collective Intelligence for Deep Learning: A Survey of Recent Developments
Abstract
In the past decade, we have witnessed the rise of deep learning to dominate the field of artificial intelligence. Advances in artificial neural networks alongside corresponding advances in hardware accelerators with large memory capacity, together with the availability of large datasets enabled practitioners to train and deploy sophisticated neural network models that achieve state-of-the-art performance on tasks across several fields spanning computer vision, natural language processing, and reinforcement learning. However, as these neural networks become bigger, more complex, and more widely used, fundamental problems with current deep learning models become more apparent. State-of-the-art deep learning models are known to suffer from issues that range from poor robustness, inability to adapt to novel task settings, to requiring rigid and inflexible configuration assumptions. Collective behavior, commonly observed in nature, tends to produce systems that are robust, adaptable, and have less rigid assumptions about the environment configuration. Collective intelligence, as a field, studies the group intelligence that emerges from the interactions of many individuals. Within this field, ideas such as self-organization, emergent behavior, swarm optimization, and cellular automata were developed to model and explain complex systems. It is therefore natural to see these ideas incorporated into newer deep learning methods. In this review, we will provide a historical context of neural network research’s involvement with complex systems, and highlight several active areas in modern deep learning research that incorporate the principles of collective intelligence to advance its current capabilities. We hope this review can serve as a bridge between the complex systems and deep learning communities.
keywords:
Deep Learning, Reinforcement Learning, Cellular Automata, Self-Organization, Complex Systems

1 Introduction
Deep learning (DL) is a class of machine learning methods that uses multi-layer (“deep”) neural networks for representation learning. While artificial neural networks, trained with the backpropagation algorithm, first appeared in the 1980s Schmidhuber (2014), deep neural networks did not receive widespread attention until 2012, when a deep artificial neural network solution trained on GPUs Krizhevsky et al. (2012) won an annual image recognition competition Deng et al. (2009) by a significant margin over the non-DL runner-up methods. This success demonstrated that DL, when combined with fast hardware-accelerated implementations and the availability of large datasets, is capable of achieving far better results on non-trivial tasks than conventional methods. Practitioners soon incorporated DL to address long-standing problems in several other fields. In computer vision (CV), deep learning models are used in image recognition Simonyan and Zisserman (2014); He et al. (2016); Radford et al. (2021) and image generation Wang et al. (2021); Jabbar et al. (2021). In natural language processing (NLP), deep language models can generate text Radford et al. (2018, 2019); Brown et al. (2020) and perform machine translation Stahlberg (2020). Deep learning has also been incorporated into reinforcement learning (RL) to tackle vision-based computer games such as Doom Ha and Schmidhuber (2018) and Atari Mnih et al. (2015), and to play games with large search spaces such as Go Silver et al. (2016) and Starcraft Vinyals et al. (2019). Deep learning models are also deployed in mobile applications like speech recognition Alam et al. (2020) and speech synthesis Tan et al. (2021), demonstrating their wide applicability.
However, DL is not an elixir without side effects. While we are witnessing many successes and a growing adoption of deep neural networks, fundamental problems with DL are also revealing themselves more clearly as our models and training algorithms become bigger and more complex. DL models are not robust in some cases. For example, it is now known that by simply modifying several pixels on the screen of a video game (a modification that is not even noticeable to humans), an agent trained on unmodified screens that originally surpassed human performance can be made to fail Qu et al. (2020). Also, CV models trained without special treatment may fail to recognize rotated or similarly transformed examples; in other words, our current models and training methods do not lend themselves to generalization to novel task settings. Last but not least, most DL models do not adapt to changes. They make assumptions about their inputs and expect rigid configurations and stationarity of the environment, what statisticians think of as the data generating process. For instance, they may expect a fixed number of inputs, in a predetermined order. We cannot expect agents to act capably beyond the skills learned during training, but once these rigid configurations are violated, the models do not perform well unless we retrain them or manually process the inputs to be consistent with the expectations of their initial training configurations.
Furthermore, with all these advances, the impressive feats in deep learning involve sophisticated engineering efforts. For instance, the famous AlexNet Krizhevsky et al. (2012) (See Figure 2), which put deep learning into the spotlight in the computer vision community after winning ImageNet in 2012, presented a carefully designed network architecture with a well-calibrated training procedure. Modern neural networks are often even more sophisticated, and require a pipeline that spans network architecture to delicate training schemes. Like many engineering projects, much labor and fine-tuning went into producing each result.
We believe that many of the limitations and side effects of deep learning stem from the fact that the current practice of deep learning resembles the practice of engineering. The way we build modern neural network systems is similar to the way we build bridges and buildings, which are designs that are not adaptive. To quote Pickering, author of The Cybernetic Brain Pickering (2010): “Most of the examples of engineering that come to mind are not adaptive. Bridges and buildings, lathes and power presses, cars, televisions, computers, are all designed to be indifferent to their environment, to withstand fluctuations, not to adapt to them. The best bridge is one that just stands there, whatever the weather.”
In natural systems, where collective intelligence plays a big role, we see adaptive designs that emerge due to self-organization, and such designs are very sensitive and responsive to changes in the world around them. Natural systems adapt, and become part of their environment (See Figure 3 for an analogy).
As exemplified by the example of army ants collectively forming a bridge that adapts to its environment, collective behavior, commonly seen in nature, tends to produce systems that are adaptable, robust, and have less rigid assumptions about the environment configuration. Collective intelligence, as a field, studies the shared intelligence that emerges from the interactions (such as collaboration, collective efforts, and competition) of many individuals. Within this field, ideas such as self-organization, emergent behavior, swarm optimization, and cellular automata were developed to model and explain complex systems. It is therefore natural to see these ideas incorporated into newer deep learning methods.
We do not believe that deep learning models have to be built in the same vein as bridges. As we will discuss later on, it didn’t have to be this way; the course the deep learning field took could just be an accident of history. In fact, several recent works have addressed the limitations of deep learning by combining it with ideas from collective intelligence, from applying cellular automata to neural network-based image processing models Mordvintsev et al. (2020); Randazzo et al. (2020), to re-defining how problems in reinforcement learning can be approached using self-organizing agents Pathak et al. (2019); Huang et al. (2020); Tang and Ha (2021). As we witness continuous technological advances in parallel-computation hardware (which is naturally suited to simulating collective behavior, see Figure 1 for an example), we can expect more works that incorporate collective intelligence into problems traditionally approached with deep learning.
The goal of this review is to provide a high-level survey of how ideas, tools, and insights central to the field of collective intelligence, most notably self-organization, emergence, and swarm models, have impacted different areas of deep learning, ranging from image processing and reinforcement learning to meta-learning. We hope this review will provide some insights on future synergies between deep learning and collective intelligence, which we believe will lead to meaningful breakthroughs in both fields.
2 Background: Collective Intelligence
Collective intelligence (CI) is a term widely used in areas like sociology, business, communication, and computer science. The definition of CI can be summarized as a form of distributed intelligence that is constantly enhanced and coordinated, with the goal of achieving better results than any individual of the group, through the mutual recognition and enrichment of individuals Lévy (1997); Leimeister (2010). The better results from CI are attributed to three factors: diversity, independence, and decentralization Surowiecki (2005); Tapscott and Williams (2008).
For our purposes, we view collective intelligence, as a field, to be the study of the group intelligence that emerges from interactions (which can be collaborative or competitive) between many individuals. This group intelligence is a product of emergence, which occurs when the group is observed to have properties that the individuals composing the group do not have on their own, properties that appear only when those individuals interact as part of a wider whole.
Examples of such systems abound in nature, where complex global behaviors toward mutual goals emerge from simple local interactions and collaborations between individuals Deneubourg and Goss (1989); Toner et al. (2005); Sumpter (2010); Lajad et al. (2021). In this review, we confine ourselves to the simulation of collective intelligence, rather than the analysis of CI observed in nature and society. Decades of earlier work explored the simulation of collective behavior and gathered insights from such simulations. Mataric (1993) investigated the use of physical mobile robots for studying social interactions leading to group behavior. They proposed a set of basic interactions (e.g., collision avoidance, following, flocking) with the hope that these primitives would enable a group of autonomous agents to accomplish a common goal or to learn from each other. Inspired by group behaviors observed in real ant colonies, Dorigo et al. (2000) posed stigmergy (a particular form of indirect communication used by social insects) as a distributed communication paradigm and showed how it inspired novel algorithms for solving distributed optimization and control problems. Moreover, Schweitzer and Farmer (2003) applied Brownian agent models in many different contexts. Combined with multi-agent systems and statistical approaches, the authors laid out a vision for a coherent framework for understanding complex systems.
While some of these earlier works led to the discovery of algorithms that are applicable to optimization problems (such as ant colony optimization for tackling the traveling salesman problem), many of these works aim to use these simulation models to understand the emergent phenomenon of collective intelligence. This points to a fundamental difference between the goals of the collective intelligence and artificial intelligence fields. In collective intelligence, the goal is to build models of complex systems that can help us explain and understand emergent phenomena, which may have applications to understanding real systems in nature and society. Artificial intelligence (in particular, the field of machine learning), on the other hand, is concerned with optimization, classification, prediction, and solving problems.
The early works we mentioned did not fully leverage the modeling power of DL or the advancement of hardware development, but nonetheless consistently demonstrated the incredible effects of CI. Namely, these systems are self-organizing, are capable of optimization via swarm intelligence, and present emergent behavior. They suggest that concepts from CI are promising ideas that can be applied to DL to produce solutions that are robust, adaptable, and have less rigid assumptions about the environment configuration, which is the focus of this review.
3 Historical Background: Cellular Neural Networks
Ideas from complex systems, such as self-organization, that were used to model and understand emergent and collective behavior have a long and interesting historical relationship with the development of artificial neural networks. While connectionism and artificial neural networks came about in the 1950s with the birth of artificial intelligence as a research field, our story begins in the 1970s, when a group of electrical engineers led by pioneer Leon Chua started developing nonlinear circuit theory and applying it to computation. Chua is known for conceptualizing the Memristor in the 1970s (a device that has been implemented only recently), and for devising the Chua circuit, one of the first circuits to exhibit chaotic behavior. In the 1980s, his group developed Cellular Neural Networks, which are computational systems that resemble cellular automata (CA), but use neural networks in place of the algorithmic cells typically seen in CA systems such as Conway’s Game of Life Conway et al. (1970) or elementary cellular automata rules Wolfram (2002).
Cellular Neural Networks (CeNNs) Chua and Yang (1988b, a) are artificial neural networks where each neuron, or cell, can only interact with its immediate neighbors. In the most basic setting, the state of each cell is continuously updated using a nonlinear function of the states of its neighbors and itself. Unlike modern deep learning approaches, which rely on digital, discrete-time computation, CeNNs are continuous-time systems that are usually implemented with non-linear analog electronic components (See Figure 4, left), making them very fast. The dynamics of CeNNs rely on independent local processing of information and interaction between processing units, and like CAs, they also exhibit emergent behavior and can be made to be Universal Turing Machines. However, they are vastly more general than discrete CAs and digital computers. Due to their continuous state space, CeNNs exhibit emergent behavior never seen before Goraş et al. (1995).
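To make these dynamics concrete, the sketch below simulates a discretized version of the Chua-Yang cell equation in Python: each cell's state evolves according to a feedback template A applied to its neighbors' outputs, a control template B applied to a static input image, and a bias z. The particular templates shown (a simple edge-detection-like configuration) and the Euler integration step are illustrative assumptions; real CeNNs compute these dynamics in continuous time with analog circuitry.

```python
import numpy as np

# Discrete-time sketch of the continuous Chua-Yang CeNN dynamics:
#   dx/dt = -x + sum_N A*y + sum_N B*u + z,   y = 0.5*(|x+1| - |x-1|)
A = np.zeros((3, 3)); A[1, 1] = 2.0             # feedback template (self only)
B = np.array([[-1., -1., -1.],
              [-1.,  8., -1.],
              [-1., -1., -1.]])                 # control template on the input
z = -0.5                                        # cell bias

def y(x):
    # piecewise-linear output: saturates the cell state into [-1, 1]
    return 0.5 * (np.abs(x + 1.0) - np.abs(x - 1.0))

def neighborhood_sum(field, template):
    # weighted sum over each cell's 3x3 neighborhood (wrap-around borders)
    acc = np.zeros_like(field)
    for dy in range(3):
        for dx in range(3):
            acc += template[dy, dx] * np.roll(field, (1 - dy, 1 - dx), axis=(0, 1))
    return acc

def cenn_step(x, u, dt=0.05):
    dx = -x + neighborhood_sum(y(x), A) + neighborhood_sum(u, B) + z
    return x + dt * dx                          # one Euler integration step

rng = np.random.default_rng(0)
u = np.sign(rng.random((32, 32)) - 0.5)         # static +/-1 input "image"
x = np.zeros((32, 32))                          # cell states
for _ in range(400):                            # relax toward a steady state
    x = cenn_step(x, u)
```

Note how every cell runs the same local rule; the templates A, B, and z fully determine the global behavior of the system.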
From the 1990s to the mid 2000s, CeNNs became an entire subfield of AI research. Due to their powerful and efficient distributed computation, they found applications in image processing and texture analysis, and their inherent analog computation was applied to solving PDEs and even modeling biological systems and organs Chua and Roska (2002). There were thousands of peer-reviewed papers, textbooks, and an IEEE conference devoted to CeNNs, with many proposals to scale them up, stack them, combine them with digital circuits, and investigate different methods of training them (just like what we are currently seeing in deep learning). At least two hardware startups were formed to produce CeNN hardware and devices.
But in the latter half of the 2000s, they suddenly disappeared from the scene! There is hardly any mention of Cellular Neural Networks in the AI community after 2006. And from the 2010s, GPUs took over as the predominant platform for neural network research, which led to the rebranding of artificial neural networks as deep learning. See Figure 4 (right) for a comparison of the trends over time.
No one can really pinpoint the exact reason for the demise of Cellular Neural Networks in AI research. Like the Memristor, perhaps CeNNs were ahead of their time. Or perhaps the eventual rise of consumer GPUs made them a compelling platform for deep learning. One can only imagine a parallel universe where CeNN’s analog computer chips had won the Hardware Lottery Hooker (2020); the state of AI might be very different, with the world and all of our devices embedded with powerful distributed analog cellular automata.
However, one key difference between CeNNs and deep learning is accessibility, and in our opinion, this is the main reason they did not catch on. In the current deep learning paradigm, there is an entire ecosystem of tools designed to make it easy to train and deploy neural network models. It is also relatively straightforward to train the parameters of a neural network with deep learning frameworks by providing them with a dataset Chollet et al. (2015) or a simulated task environment Hill et al. (2018). Deep learning tools are designed to be used by anyone with a basic programming background. CeNNs, on the other hand, were designed for electrical engineers at a time when most EE students knew more about analog circuits than programming languages.
To illustrate this difficulty, “training” a CeNN requires solving a system of at least nine ODEs to determine the coefficients that govern the analog circuits defining the behavior of the system! In practice, many practitioners needed to rely on a cookbook Chua and Roska (2002) of known solutions to problems and then manually adjust the solutions for new problems. Eventually, genetic algorithms (and early versions of backpropagation) were proposed to train CeNNs Kozek et al. (1993), but they required simulation software to train and test the circuits before deploying on actual (and highly customized) CeNN hardware.
There are likely more lessons to be learned from Cellular Neural Networks. They were an immensely powerful hybrid of analog and digital computation that truly synthesized cellular automata with neural networks. Unfortunately, we probably only witnessed the very beginning of its full potential, before its demise. Ultimately, commodity GPUs and software tools that abstracted neural networks into simple Python code enabled deep learning to take over. Although CeNNs have faded away, concepts and ideas from complex systems, like CAs, self-organization and emergent behavior have not. Despite being limited to digital hardware, we are witnessing a resurgence of Collective Intelligence concepts in many areas of deep learning, from image generation, deep reinforcement learning, to collective and distributed learning algorithms. As we will see, these concepts are advancing the state of deep learning research by providing solutions to some limitations and restrictions of traditional artificial neural networks.
4 Collective Intelligence for Deep Learning
Collective intelligence naturally arises from the interaction of multiple individuals in a network, and it is no surprise to also see self-organizing behaviors naturally emerge from artificial neural networks. This is especially true when we employ repeated computation of identical modules with identical weight parameters across the network. For example, Gilpin Gilpin (2019) observed the close connection between cellular automata and convolutional neural networks (CNNs), a type of neural network often used in image processing that applies the same weights (or filters) to all of its inputs. In fact, they showed that any CA can be represented with a certain kind of CNN, and demonstrated this elegantly by implementing Conway’s Game of Life Conway et al. (1970) as a CNN, illustrating that in certain settings, CNNs can exhibit interesting self-organizing behaviors. Recently, several works such as Mordvintsev et al. Mordvintsev et al. (2020), which we will discuss later, have exploited the self-organizing properties of CNNs and developed neural network-based cellular automata for applications such as image regeneration.
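The correspondence is easy to see in code: one step of the Game of Life is a single 3x3 convolution (counting live neighbors) followed by a pointwise nonlinearity (the birth/survival rule). Below is a minimal sketch using SciPy's 2D convolution; the grid size, wrap-around boundary, and random initialization are arbitrary choices for illustration.

```python
import numpy as np
from scipy.signal import convolve2d

KERNEL = np.array([[1, 1, 1],
                   [1, 0, 1],
                   [1, 1, 1]])          # counts the 8 surrounding cells

def life_step(grid):
    n = convolve2d(grid, KERNEL, mode="same", boundary="wrap")
    # A cell is alive next step iff it has exactly 3 live neighbors,
    # or it is currently alive and has exactly 2.
    return ((n == 3) | ((grid == 1) & (n == 2))).astype(np.uint8)

rng = np.random.default_rng(0)
grid = rng.integers(0, 2, size=(32, 32), dtype=np.uint8)
for _ in range(100):                    # run the CA as repeated convolutions
    grid = life_step(grid)
```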
Other types of neural network architectures, such as Graph Neural Networks (GNNs) Wu et al. (2020); Sanchez-Lengeling et al. (2021); Daigavane et al. (2021), explicitly target self-organization as a central feature, modeling the behavior of each node of a graph as identical neural network modules that pass messages to their neighbors defined by the edges of the graph. GNNs have traditionally been used to analyze graph domains such as social networks and molecular structures. Recent work Grattarola et al. (2021) has also demonstrated the ability of GNNs to learn rules for established CA systems such as Voronoi diagrams, or the flocking behavior of swarms Schoenholz and Cubuk (2020). As we will discuss later, the self-organizing properties of GNNs have recently been applied to the deep reinforcement learning domain, creating agents with far superior generalization capabilities.
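The message-passing computation at the heart of a GNN can be sketched in a few lines: every node applies the same pair of weight matrices, first to produce messages for its neighbors and then to update its own state from the aggregated messages. The dimensions, mean aggregator, and tanh nonlinearities below are illustrative choices, not those of any particular GNN architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8
W_msg = rng.normal(0, 0.1, (D, D))      # shared message function
W_upd = rng.normal(0, 0.1, (2 * D, D))  # shared node-update function

def gnn_step(h, edges):
    """h: (n_nodes, D) node states; edges: list of directed (src, dst) pairs."""
    msgs = np.zeros_like(h)
    deg = np.zeros(len(h))
    for src, dst in edges:               # each node hears only its neighbors
        msgs[dst] += np.tanh(h[src] @ W_msg)
        deg[dst] += 1
    msgs /= np.maximum(deg, 1)[:, None]  # mean aggregation over neighbors
    # every node applies the identical update rule to (own state, messages)
    return np.tanh(np.concatenate([h, msgs], axis=-1) @ W_upd)

h = rng.normal(size=(5, D))
edges = [(0, 1), (1, 0), (1, 2), (2, 1), (3, 4), (4, 3)]
h = gnn_step(h, edges)                   # one round of local, decentralized updates
```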
We have identified four areas of deep learning that have started to incorporate ideas related to collective intelligence: (1) Image Processing, (2) Deep Reinforcement Learning, (3) Multi-agent Learning, and (4) Meta-Learning. We will discuss each area in detail and provide examples in this section.
4.1 Image Processing
Implicit relationships and recurring patterns in nature (such as texture and scenery) can benefit from employing approaches from cellular automata to learn alternative representations of natural images. Like CeNNs, the Neural Cellular Automata (neural CA) model proposed by Mordvintsev et al. Mordvintsev et al. (2020) treats each individual pixel of an image as a single neural network cell. Each cell is trained to predict its color based on the states of its immediate neighbors, thereby developing a model of morphogenesis for image generation. They demonstrated that it is possible to train neural networks to reconstruct entire images this way, even when each cell lacks information about its location and relies only on local information from its neighbors. This approach makes the generation algorithm resistant to noise, and moreover, allows images to regenerate when damaged. An extension of neural CA Randazzo et al. (2020) enabled individual cells to perform image classification tasks, such as handwritten digit classification (MNIST), by only examining the contents of a single pixel and passing messages on to the cell’s immediate neighbors (See Figure 5). Over time, a consensus forms as to which digit the pixels most likely represent, but interestingly, disagreements may arise depending on the location of the pixel, especially if the image is intentionally drawn to represent different digits.
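A single update step of such a neural CA can be sketched as follows: each cell perceives its 3x3 neighborhood through fixed filters, feeds the result through a small neural network whose weights are shared by every cell, and applies the resulting residual update stochastically. The channel count, filter choice (identity plus Sobel gradients), and two-layer update network below follow the general recipe of Mordvintsev et al. (2020), but the exact shapes and initialization are illustrative assumptions rather than the reference implementation.

```python
import numpy as np

C = 16                                        # channels per cell (RGBA + hidden)
SOBEL_X = np.array([[-1., 0., 1.],
                    [-2., 0., 2.],
                    [-1., 0., 1.]]) / 8.0
IDENTITY = np.zeros((3, 3)); IDENTITY[1, 1] = 1.0
FILTERS = [IDENTITY, SOBEL_X, SOBEL_X.T]      # identity + x/y gradient filters

def perceive(state, kernel):
    # depthwise 3x3 convolution with wrap-around padding
    out = np.zeros_like(state)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * np.roll(state, (1 - dy, 1 - dx), axis=(0, 1))
    return out

def nca_step(state, W1, b1, W2, rng, fire_rate=0.5):
    # 1. perception: filtered views of each cell's 3x3 neighborhood
    p = np.concatenate([perceive(state, k) for k in FILTERS], axis=-1)
    # 2. shared per-cell update network (identical weights at every cell)
    hidden = np.maximum(p @ W1 + b1, 0.0)     # ReLU
    delta = hidden @ W2                       # residual state update
    # 3. stochastic update: each cell fires independently of the others
    mask = rng.random(state.shape[:2] + (1,)) < fire_rate
    return state + delta * mask

HIDDEN = 64
rng = np.random.default_rng(0)
W1 = rng.normal(0.0, 0.1, (3 * C, HIDDEN))
b1 = np.zeros(HIDDEN)
W2 = np.zeros((HIDDEN, C))                    # zero-init: untrained CA is a no-op
state = np.zeros((32, 32, C)); state[16, 16, 3:] = 1.0   # single seed cell
for _ in range(50):
    state = nca_step(state, W1, b1, W2, rng)
```

In training, the weights W1 and W2 would be optimized so that repeated application of this purely local rule grows (and regrows) the target image from the seed.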
Regeneration with neural CA has been explored beyond 2D images. In a later work, Zhang et al. Zhang et al. (2021) employed a similar approach for 3D voxel generation. This is particularly useful for high-resolution 3D scanning, where 3D shape data is often described with sparse and incomplete points. Using generative cellular automata, they can recover full 3D shapes from only a partial set of points. The approach is also applicable outside of purely generative domains, such as the construction of artificial agents in active environments like Minecraft. Sudhakaran et al. Sudhakaran et al. (2021) trained neural CAs to grow complex entities in Minecraft such as castles, apartment blocks, and trees, some of which are composed of thousands of blocks. Aside from regeneration, their system is able to regrow parts of simple functional machines (such as a virtual creature in the game), and they demonstrate a morphogenetic creature growing into two distinct creatures when cut in half in the virtual world (See Figure 6).
Cellular automata are also naturally suited to providing visual interpretations of images. Qin et al. Qin et al. (2018) examined the use of a hierarchical CA model for visual saliency, identifying items in an image that stand out. By getting a CA to operate on visual features extracted from a deep neural network, they were able to iteratively construct multi-scale saliency maps of the image, with the final map highlighting the target items. Sandler et al. Sandler et al. (2020) later investigated the use of CA for the task of image segmentation, an area where deep learning enjoys tremendous success. They demonstrated the viability of performing complex segmentation tasks using CAs with relatively simple rules (with as few as 10K neural network parameters), with the advantage that the approach can scale up to incredibly large image sizes, a challenge for traditional deep learning models with millions or even billions of model parameters, which are bounded by GPU memory.
4.2 Deep Reinforcement Learning
The rise of deep learning gave birth to the use of deep neural networks for reinforcement learning, or deep reinforcement learning (Deep RL), equipping reinforcement learning agents with modern neural network architectures that can address more complex problems, such as high-dimensional continuous control or vision-based tasks from pixel observations. Deep RL shares with deep learning the characteristic that employing sufficient computational resources will generally lead to a solution of the target training task being found. But like deep learning, Deep RL has its share of limitations. Agents trained to perform a particular task often fail when the task is slightly altered. Furthermore, neural network solutions generally only work for a specific morphology with well-defined input and output mappings. For instance, a locomotion policy trained for a 4-legged ant might not work for a 6-legged one, and a controller that expects to receive 10 inputs won’t work if you give it 5, or 20 inputs.
The evolutionary computation community started approaching some of these challenges earlier on, by incorporating modularity Schilling (2000); Schilling and Steensma (2001) in the evolutionary process that governs the design of artificial agents. Having agents composed of identical but independent modules fosters self-organization via local interactions between the modules, enabling systems that are robust to changes in the agent’s morphology, an essential requirement in evolutionary systems. These ideas have been presented in the literature on soft-bodied robotics Cheney et al. (2014), where robots consist of a grid of voxel cells, each controlled by an independent neural network with a local sensory function that can produce a localized action. Through message passing, the group of cells that make up the robot are able to self-organize and perform a range of locomotion tasks (See Figure 7). Later work Joachimczak et al. (2016) even proposed incorporating metamorphosis in the evolution of the placement of the cells to produce configurations robust to a range of environments.
Recently, soft-bodied robots have even been combined with the neural CA approach discussed earlier to enable these robots to regenerate themselves Horibe et al. (2021). To bridge the gap between policy optimization (where the goal is to find the best parameters of the policy neural network) usually done in the Deep RL community and the type of morphology-policy co-evolution (where both the morphology and the policy neural network are optimized together) seen in the soft-bodied robotics literature, Bhatia et al. Bhatia et al. (2021) recently developed an OpenAI Gym-like Brockman et al. (2016) environment called Evolution Gym, a benchmark for developing and comparing algorithms for co-optimizing design and control, which provides an efficient soft-bodied robot simulator written in C++ with a Python interface.
Modular, decentralized self-organizing controllers have also started to be explored in the Deep RL community. Wang et al. Wang et al. (2018) and Huang et al. Huang et al. (2020) explored the use of modular neural networks to control each individual actuator of a simulated robot for continuous control. They expressed a global locomotion policy as a collection of modular neural networks (in the case of Huang et al. Huang et al. (2020), identical networks) that correspond to each of the agent’s actuators, and trained the system using RL. As in soft-bodied robots, every module is only responsible for controlling its corresponding actuator and receives information only from its local sensors (See Figure 8). Messages are passed between neighboring modules, propagating information to distant modules. They show that a single modular policy can generate locomotion behaviors for several distinct robot morphologies, and that the policies generalize to variations of the morphologies not seen during training, such as creatures with extra legs. As in the case of soft-bodied robots, these results also demonstrate the emergence of centralized coordination via message passing between decentralized modules that are collectively optimizing for a shared reward.
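A minimal sketch of this idea is given below: one weight-tied module per actuator, exchanging messages with its neighbors along a chain-structured morphology. Because every module shares the same weights and communicates only locally, the same policy runs unchanged on bodies with different numbers of limbs. All sizes, the two message-passing rounds, and the chain topology are illustrative assumptions, not the exact architecture of Huang et al. (2020).

```python
import numpy as np

OBS, MSG, ACT, H = 8, 4, 1, 32
rng = np.random.default_rng(0)
W_in = rng.normal(0, 0.1, (OBS + 2 * MSG, H))   # local obs + msgs from both neighbors
W_msg = rng.normal(0, 0.1, (H, MSG))            # outgoing message head
W_act = rng.normal(0, 0.1, (H, ACT))            # per-actuator action head

def modular_policy(local_obs):
    """local_obs: list of per-actuator observation vectors (any length)."""
    n = len(local_obs)
    msgs = [np.zeros(MSG) for _ in range(n)]
    for _ in range(2):                           # a couple of propagation rounds
        h = []
        for i in range(n):
            left = msgs[i - 1] if i > 0 else np.zeros(MSG)
            right = msgs[i + 1] if i < n - 1 else np.zeros(MSG)
            x = np.concatenate([local_obs[i], left, right])
            h.append(np.tanh(x @ W_in))          # identical weights at every module
        msgs = [hi @ W_msg for hi in h]
    return np.array([hi @ W_act for hi in h])    # one action per actuator

# The same weights control a 6-limb body here, but any limb count works.
actions = modular_policy([rng.normal(size=OBS) for _ in range(6)])
```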
The aforementioned work hints at the power of embodied cognition, which emphasizes the role of the agent’s body in generating behavior. Although much of the work in Deep RL focuses on learning neural network policies for an agent with a fixed design (e.g., a bipedal robot, humanoid, or robot arm), embodied intelligence is an area that is gathering interest in the sub-field Ha (2018); Pathak et al. (2019). Inspired by previous work on self-configuring modular robots Stoy et al. (2010); Rubenstein et al. (2014); Hamann (2018), Pathak et al. Pathak et al. (2019) investigate a collection of primitive agents that learn to self-assemble into a complex body while also learning a local policy to control the body without an explicit centralized control unit. Each primitive agent (which consists of a limb and a motor) can link up with nearby agents, allowing complex morphologies to emerge. Their results show that these dynamic and modular agents are robust to changes in conditions, and that their policies generalize not only to unseen environments, but also to unseen morphologies consisting of a greater number of modules. We note that these ideas can be used to allow general DL systems (not confined to RL) to have more flexible architectures that can even learn machine learning algorithms, as we will discuss later in the Meta-Learning section.
Aside from adapting to changing morphologies and environments, self-organizing systems can also adapt to changes in their sensory inputs. Sensory substitution refers to the brain’s ability to use one sensory modality (e.g., touch) to supply environmental information normally gathered by another sense (e.g., vision). Most neural networks, however, are not able to adapt to sensory substitutions. For instance, most RL agents require their inputs to be in an exact, pre-specified rigid format, otherwise they will fail. In a recent work, Tang and Ha Tang and Ha (2021) explored permutation-invariant neural network agents that require each of their sensory neurons (receptors that receive sensory inputs from the environment) to deduce the meaning and context of its input signal, rather than assume a fixed meaning. They demonstrate that these sensory networks can be trained to integrate information received locally, and that through communication between them using an attention mechanism, they can collectively produce a globally coherent policy. Moreover, the system can still perform its task even if the ordering of its sensory inputs (represented as real-valued numbers) is randomly permuted several times during an episode. Their experiments show that such agents are robust to observations that contain additional redundant or noisy information, as well as to observations that are corrupt or incomplete.
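The key mechanism, attention pooling over a set of sensory neurons, can be sketched compactly: each sensory neuron applies the same shared weights to its own input channel, and a fixed set of learned queries attends over the resulting keys and values, producing a representation that is invariant to how the inputs are ordered. The dimensions and single attention layer below are illustrative simplifications of Tang and Ha (2021), whose agents also feed previous actions back into each sensory neuron.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

D_IN, D_FEAT, N_QUERY = 1, 16, 8
rng = np.random.default_rng(0)
W_k = rng.normal(0, 0.1, (D_IN, D_FEAT))     # shared by every sensory neuron
W_v = rng.normal(0, 0.1, (D_IN, D_FEAT))
Q = rng.normal(0, 0.1, (N_QUERY, D_FEAT))    # fixed set of learned queries

def encode(obs):
    """obs: (n_sensors, D_IN); the output does not depend on sensor order."""
    K, V = obs @ W_k, obs @ W_v
    attn = softmax(Q @ K.T / np.sqrt(D_FEAT), axis=-1)  # (N_QUERY, n_sensors)
    return attn @ V                                     # fixed-size summary

obs = rng.normal(size=(20, D_IN))
perm = rng.permutation(20)
# Shuffling the sensory inputs leaves the pooled representation unchanged.
assert np.allclose(encode(obs), encode(obs[perm]))
```

Because the pooled summary has a fixed size regardless of how many sensors are present, the downstream policy also tolerates extra or missing input channels.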
4.3 Multi-agent Learning
Collective intelligence can be viewed at several different scales. The brain can be viewed as a network of individual neurons functioning collectively. Each organ can be viewed as a collection of cells performing a collective function. Individual animals can be viewed as a collection of organs working together. Zooming out further, we can look at human intelligence beyond biology and see human civilization as a collective intelligence solving (and producing) problems that are beyond the capabilities of a single person. In the previous section, we discussed several works that leverage the power of collective intelligence to essentially decompose a single RL agent into a collection of smaller RL agents working together towards a collective goal, resembling a model of collective intelligence at the biological level. We can also view multi-agent problems as a model of collective intelligence at the societal level.
A major focus of the collective intelligence field is to study the group intelligence and behaviors that emerge from a large collection of individuals, whether in humans Tapscott and Williams (2008), animals Sumpter (2010), insects Dorigo et al. (2000); Seeley (2010), or artificial swarm robots Hamann (2018); Rubenstein et al. (2014). This focus has clearly been missing in the Deep RL field. While multi-agent reinforcement learning (MARL) is a well-established branch of Deep RL, most learning algorithms and environments proposed have targeted a relatively small number of agents Foerster et al. (2016); OroojlooyJadid and Hajinezhad (2019), and are thus not sufficient for studying the emergent properties of large populations. In the most common MARL environments Resnick et al. (2018); Baker et al. (2019); Jaderberg et al. (2019); Terry et al. (2020), “multi-agent” simply means 2 or 4 agents trained to perform a task by means of self-play Bansal et al. (2017); Liu et al. (2019); Ha (2020). Collective intelligence observed in nature or in society, however, relies on a much larger number of individuals than typically studied in MARL, involving population sizes from thousands to millions. In this section, we will discuss recent works from the MARL sub-field of Deep RL that have been inspired by collective intelligence (as their authors have noted in their publications). Unlike most MARL works, these works have started to employ large populations of agents (each enabled by a neural network), from thousands to millions, in order to truly study their emergent properties at the macro level (1000+ agents), rather than at the micro level (2-4 agents).
Recent advances in Deep RL have demonstrated the capability of simulating thousands of agents in complex 3D simulation environments using only a single GPU Heiden et al. (2021); Rudin et al. (2021). A key challenge is to approach the problem of multi-agent learning at a much larger scale, leveraging such advances in parallel computing hardware and distributed computation, with the goal of training millions of agents. In this section, we will examine recent attempts at training a massive number of agents that interact in a collective setting.
Rather than focusing on realistic physics or environment realism, Zheng et al. Zheng et al. (2018) developed a platform called MAgent, a simple grid-world environment that can host millions of neural network agents. Their focus is on scalability, and they demonstrated that MAgent could host up to a million agents on a single GPU (in 2017). Their platform supports interactions among the population of agents, and facilitates not only the study of learning algorithms for policy optimization, but more critically, the study of social phenomena emerging from the millions of agents in an AI society, including the emergence of languages and societal hierarchies. Environments can be built using scripting, and they have provided examples such as predator-prey simulations, battlefields, and adversarial pursuit, supporting different species of distinct agents that may exhibit different behaviors.
MAgent inspired many recent applications, including multi-agent driving Peng et al. (2021), which looks at the emergent behavior of entire populations of driving agents to optimize driving policies that not only affect a single car, but aim to improve the safety of the population as a whole. These directions are good examples that demonstrate the difference between problems framed for deep learning (finding a driving policy for a single car) and problems in collective intelligence (finding a driving policy for the entire population).
Inspired by the game genre of MMORPGs (Massively Multiplayer Online Role-Playing Games, a.k.a. MMOs), Neural MMO Suarez et al. (2021) is an AI research environment that supports a large number of artificial agents that must compete for finite resources in order to survive. As such, the environment enables large-scale simulation of multi-agent interactions that require agents to learn combat and navigation policies alongside other agents in a large population all attempting to do the same. Unlike most MARL environments, each agent is allowed to have its own distinct set of neural network weights, which has been a technical challenge in terms of memory consumption. Preliminary experimental results in early versions of the platform Suarez et al. (2019) demonstrated that agents with distinct neural network weight parameters developed skills to fill different niches in order to avoid competition within a large population of agents.
As of writing, this project is under active development within the NeurIPS machine learning community Suarez et al. (2021), working towards studies of large agent populations, long time horizons, open-ended tasks, and modular game systems. The developers provide active support and documentation, and are developing additional training, logging, and visualization tools to enable this line of large-scale multi-agent research. This work is still in its early stages, and only time will tell whether platforms that enable the study of large populations, such as Neural MMO or MAgent, gain further traction within the Deep RL community.
4.4 Meta-Learning
In the previous sections, we described works that express the solution to problems in terms of a collection of independent neural network agents acting together to achieve a common goal. The parameters of these neural network models are optimized for the collective performance of the population. While these systems have been shown to be robust and to adapt to changes in their environment, they are ultimately hardwired to perform a certain task, and cannot perform another task unless retrained from scratch.
Meta-learning is an active area of research within deep learning where the goal is to train a system to learn how to learn. It is a large sub-field of ML, including areas such as simple transfer learning from one training set to another. For our purposes, we follow the line of work from Schmidhuber Schmidhuber (2020), where meta-learning is viewed as the problem of ML algorithms that can learn better ML algorithms, which he believes is required to build truly self-improving AI systems.
Unlike traditional training, where the weight parameters of a neural network are optimized to perform one task with a gradient descent algorithm or with evolution strategies Tang et al. (2022), the goal of meta-learning is to train a meta-learner (which can be another neural network-based system) to learn a learning algorithm. This is a particularly challenging task with a long history (see Schmidhuber (2020) for a review). In this section, we will highlight recent promising works that make use of collective agents that learn to learn, rather than learn to perform only a particular task (which we covered in the previous sections).
Concepts from self-organization can be naturally applied to train neural networks to meta-learn by extending the basic building blocks that compose artificial neural networks. As we know, artificial neural networks consist of identical neurons, which are modeled as non-linear activation functions. These neurons are connected by synapses, weight parameters that are normally trained with a learning algorithm such as gradient descent. But one can imagine extending the abstraction of neurons and synapses beyond static activation functions and floating-point parameters. Indeed, recent work Ohsawa et al. (2018); Ott (2020) has explored modeling each neuron of a neural network as an individual reinforcement learning agent. In the terminology of RL, each neuron’s observations are its current state, which changes as information is transmitted through the network, and each neuron’s actions modify its connections with other neurons in the system. The problem of learning to learn is thus treated as a multi-agent RL problem, where each agent is part of the collection of neurons in a neural network. While this approach is elegant, the aforementioned works are only capable of learning to solve toy problems and are not yet competitive with existing learning algorithms.
Recent methods have gone beyond using simple scalar weights to transmit scalar signals between neurons. Sandler et al. Sandler et al. (2021) introduce a new type of generalized artificial neural network where both neurons and synapses have multiple states. Traditional artificial neural networks can be viewed as a special case of their framework with two states, where one is used for activations and the other for gradients produced by the backpropagation learning rule. In the general framework, the backpropagation procedure is not required to compute any gradients; instead, the states of the synapses and neurons are updated by a shared local learning rule. This Hebbian-style bi-directional local update rule requires each synapse and neuron to access only the state information of its neighboring synapses and neurons, similar to cellular automata. The rule is parameterized as a low-dimensional genome vector, and is consistent across the system. They employed either evolution strategies or conventional optimization techniques to meta-learn this genome vector, and their main result is that the update rules meta-learned on the training tasks generalize to unseen novel test tasks. Furthermore, the update rules learn faster than gradient-descent-based learning algorithms on several standard classification tasks.
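To illustrate the flavor of such a shared local rule, the sketch below implements a generic Hebbian update in which every synapse adjusts itself using only its own pre- and post-synaptic activity, with a small coefficient vector (the "genome") shared across the entire network; in a full meta-learning setup this genome would be optimized in an outer loop over many tasks. This is a deliberate simplification of Sandler et al. (2021), which uses multiple states per neuron and synapse rather than the scalar activities shown here.

```python
import numpy as np

def hebbian_update(W, pre, post, genome, lr=0.01):
    """genome = (A, B, C, D): coefficients of a generic Hebbian local rule."""
    A, B, C, D = genome
    # Every synapse W[i, j] sees only its own pre/post activity pair:
    # correlation term, presynaptic term, postsynaptic term, constant term.
    dW = A * np.outer(post, pre) + B * pre[None, :] + C * post[:, None] + D
    return W + lr * dW

rng = np.random.default_rng(0)
W = rng.normal(0, 0.1, (4, 8))        # 8 input neurons -> 4 output neurons
pre = rng.normal(size=8)              # presynaptic activity
post = np.tanh(W @ pre)               # postsynaptic activity (forward pass)
# Apply the shared local rule; no gradients or backward pass required.
W = hebbian_update(W, pre, post, genome=(1.0, 0.0, 0.0, 0.0))
```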
A similar direction has been taken by Kirsch et al. Kirsch and Schmidhuber (2020), where the neurons and synapses of a neural network are also generalized to a higher-dimensional message-passing system, but in their case each synapse is replaced by a recurrent neural network (RNN) with the same shared parameters. These RNN synapses are bi-directional and govern the flow of information across the network. As in Sandler et al. Sandler et al. (2021), the bi-directional property allows the network to be used for both inference and learning at the same time by running the system in forward-pass mode. The weights of this system are essentially stored in the hidden states of the RNNs, so by simply running the system, the networks can train themselves using error signals as feedback. Since RNNs are general-purpose computers, the authors were able to demonstrate that the system can encode the gradient-based backpropagation algorithm by training the system to simply emulate backpropagation, rather than explicitly calculating gradients via hand-engineering. Their system is much more general than backpropagation, however, and is thus capable of learning new learning algorithms that are much more efficient (See Figure 13).
The two works mentioned above were only recently published at the time of writing, and we believe that these decentralized local meta-learning approaches have the potential to revolutionize the way neural networks are used, in a way that challenges the current paradigm separating model training from model deployment. There is still much work to be done in demonstrating that these approaches can scale to larger datasets, owing to the inherently much larger memory requirements that come with the much larger internal states of the system. Furthermore, while the algorithms are able to produce learning algorithms that are vastly more sample-efficient than gradient descent, this efficiency is only apparent in the early stages of learning, and performance tends to peak very early on. Gradient descent, while less efficient, is less biased towards few-shot learning, and can continue to run for many more cycles to refine the weight parameters, ultimately producing networks that achieve higher performance.
5 Discussion
In this survey, we first gave a brief historical background to describe the intertwined development of deep learning and collective intelligence research. The two research areas were born at roughly the same time, and we can spot positive correlations between the rises and falls of the two areas throughout their history. This is no coincidence, since advances and breakthroughs in one of the two areas can usually inspire new ideas or complement the solutions to problems in the other. For example, introducing deep neural networks and related training algorithms to cellular automata allowed us to develop image generation algorithms that are resistant to noise and have “self-healing” properties. This survey explored several works in deep learning that were also inspired by concepts in collective intelligence. At a macro level, collective intelligence in multi-agent deep RL has led to interesting works that can exceed human performance through collective self-play, and to decentralized self-organizing robot controllers. At a micro level, collective intelligence is also embedded inside advanced methods that simulate each neuron, synapse, or other object at a finer granularity within a system of deep models.
Despite the progress made in the works described in this survey, many challenges lie ahead. While neural CA techniques have been applied to image processing, their application has so far been limited to relatively small and simple datasets, and their image generation quality is still far below the state-of-the-art on more sophisticated datasets such as ImageNet or Celebrity Faces Palm et al. (2022). For Deep RL, while the surveyed works have demonstrated that a global policy can be replaced by a collection of smaller individual policies, we have yet to transfer these experiments to real physical robots. Finally, we have witnessed self-organization guide meta-learning algorithms. While this line of work is extremely promising, it is currently confined to small-scale experiments due to the large computational requirements that come with replacing every single neural connection with an entire artificial neural network. We believe many of these challenges will be solved in due time, as their trajectories are already in motion.
Looking at their respective development trajectories, DL has accomplished notable achievements in developing novel architectures and training algorithms that lead to efficient learning and better performance. The research and development cycle of DL is more engineering-focused, and as such its advances are more benchmark-based (such as classification accuracy for image recognition problems, or related quantitative metrics for language modeling and machine translation problems). DL advances are generally more incremental and predictable in nature, while CI focuses more on problem formulations and environmental mechanisms that motivate novel emergent group behavior. As we have shown in this survey, CI-based techniques enable new capabilities that were simply not possible before. For instance, it is impossible to incrementally improve a fixed robot into one capable of self-assembly, and gain all the benefits of such modularity. Naturally, the two areas can complement each other, and we are confident that this hand-in-hand style of co-development will continue.
6 Glossary of Terms and Definitions
Deep Learning Related

Term | Definition
---|---
Deep Learning (field) | A class of machine learning methods that uses multi-layer ("deep") artificial neural networks for representation learning.
Supervised Learning | Training a model on labeled input-output pairs so that it learns to predict the correct output for a given input.
Transfer Learning | Reusing a model trained on one task or dataset as the starting point for learning a related task.
Meta-Learning | Training a system to learn how to learn; for example, ML algorithms that learn better ML algorithms.
Agent / Controller | In reinforcement learning, the entity that observes its environment and selects actions according to its policy.
Policy | The mapping from an agent's observations (or states) to its actions.
Self-Play | A training scheme in which agents improve by playing against copies or past versions of themselves.
MNIST | A standard dataset of handwritten digit images, commonly used as a benchmark for image classification.

Collective Intelligence Related Concepts

Term | Definition
---|---
Self-Organization | A process in which global order arises from local interactions between the components of a system, without centralized control.

Other Concepts

Term | Definition
---|---
Complex Systems | Systems composed of many interacting components whose collective behavior cannot easily be deduced from the behavior of the individual components.
Cellular Automaton | A grid of cells, each in one of a finite number of states, updated in discrete steps by a fixed rule that depends on the states of neighboring cells.
Embodied Cognition | The view that an agent's body and its interactions with the environment play an integral role in generating intelligent behavior.
References
- Alam et al. (2020) Alam M, Samad MD, Vidyaratne L, Glandon A and Iftekharuddin KM (2020) Survey on deep neural networks in speech and vision systems. Neurocomputing 417: 302–321.
- Authors (2022) Authors W (2022) Trajan’s bridge at Alcántara. Wikipedia.
- Baker et al. (2019) Baker B, Kanitscheider I, Markov T, Wu Y, Powell G, McGrew B and Mordatch I (2019) Emergent tool use from multi-agent autocurricula. arXiv preprint arXiv:1909.07528 .
- Bansal et al. (2017) Bansal T, Pachocki J, Sidor S, Sutskever I and Mordatch I (2017) Emergent complexity via multi-agent competition. arXiv preprint arXiv:1710.03748 .
- Bhatia et al. (2021) Bhatia J, Jackson H, Tian Y, Xu J and Matusik W (2021) Evolution gym: A large-scale benchmark for evolving soft robots. In: Advances in Neural Information Processing Systems. Curran Associates, Inc. URL https://sites.google.com/corp/view/evolution-gym-benchmark/.
- Brockman et al. (2016) Brockman G, Cheung V, Pettersson L, Schneider J, Schulman J, Tang J and Zaremba W (2016) Openai gym. arXiv preprint arXiv:1606.01540 .
- Brown et al. (2020) Brown TB, Mann B, Ryder N, Subbiah M, Kaplan J, Dhariwal P, Neelakantan A, Shyam P, Sastry G, Askell A et al. (2020) Language models are few-shot learners. arXiv preprint arXiv:2005.14165 .
- Cheney et al. (2014) Cheney N, MacCurdy R, Clune J and Lipson H (2014) Unshackling evolution: evolving soft robots with multiple materials and a powerful generative encoding. ACM SIGEVOlution 7(1): 11–23.
- Chollet et al. (2015) Chollet F et al. (2015) keras.
- Chua and Roska (2002) Chua LO and Roska T (2002) Cellular neural networks and visual computing: foundations and applications. Cambridge university press.
- Chua and Yang (1988a) Chua LO and Yang L (1988a) Cellular neural networks: Applications. IEEE Transactions on circuits and systems 35(10): 1273–1290.
- Chua and Yang (1988b) Chua LO and Yang L (1988b) Cellular neural networks: Theory. IEEE Transactions on circuits and systems 35(10): 1257–1272.
- Conway et al. (1970) Conway J et al. (1970) The game of life. Scientific American 223(4): 4.
- Daigavane et al. (2021) Daigavane A, Ravindran B and Aggarwal G (2021) Understanding convolutions on graphs. Distill 10.23915/distill.00032. Https://distill.pub/2021/understanding-gnns.
- Deneubourg and Goss (1989) Deneubourg JL and Goss S (1989) Collective patterns and decision-making. Ethology Ecology & Evolution 1(4): 295–311.
- Deng et al. (2009) Deng J, Dong W, Socher R, Li LJ, Li K and Fei-Fei L (2009) Imagenet: A large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition. Ieee, pp. 248–255.
- Dorigo et al. (2000) Dorigo M, Bonabeau E and Theraulaz G (2000) Ant algorithms and stigmergy. Future Generation Computer Systems 16(8): 851–871.
- Foerster et al. (2016) Foerster JN, Assael YM, De Freitas N and Whiteson S (2016) Learning to communicate with deep multi-agent reinforcement learning. arXiv preprint arXiv:1605.06676 .
- Freeman et al. (2019) Freeman CD, Metz L and Ha D (2019) Learning to predict without looking ahead: World models without forward prediction URL https://learningtopredict.github.io.
- Gilpin (2019) Gilpin W (2019) Cellular automata as convolutional neural networks. Physical Review E 100(3): 032402.
- Goraş et al. (1995) Goraş L, Chua LO and Leenaerts D (1995) Turing patterns in CNNs. I. Once over lightly. IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications 42(10): 602–611.
- Grattarola et al. (2021) Grattarola D, Livi L and Alippi C (2021) Learning graph cellular automata.
- Ha (2018) Ha D (2018) Reinforcement learning for improving agent design URL https://designrl.github.io.
- Ha (2020) Ha D (2020) Slime volleyball gym environment. https://github.com/hardmaru/slimevolleygym.
- Ha and Schmidhuber (2018) Ha D and Schmidhuber J (2018) Recurrent world models facilitate policy evolution. In: Advances in Neural Information Processing Systems 31. Curran Associates, Inc., pp. 2451–2463. URL https://papers.nips.cc/paper/7512-recurrent-world-models-facilitate-policy-evolution. https://worldmodels.github.io.
- Hamann (2018) Hamann H (2018) Swarm robotics: A formal approach. Springer.
- He et al. (2016) He K, Zhang X, Ren S and Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 770–778.
- Heiden et al. (2021) Heiden E, Millard D, Coumans E, Sheng Y and Sukhatme GS (2021) NeuralSim: Augmenting differentiable simulators with neural networks. In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). URL https://github.com/google-research/tiny-differentiable-simulator.
- Hill et al. (2018) Hill A, Raffin A, Ernestus M, Gleave A, Kanervisto A, Traore R, Dhariwal P, Hesse C, Klimov O, Nichol A et al. (2018) Stable baselines.
- Hooker (2020) Hooker S (2020) The hardware lottery. arXiv preprint arXiv:2009.06489 URL https://hardwarelottery.github.io/.
- Horibe et al. (2021) Horibe K, Walker K and Risi S (2021) Regenerating soft robots through neural cellular automata. In: EuroGP. pp. 36–50.
- Huang et al. (2020) Huang W, Mordatch I and Pathak D (2020) One policy to control them all: Shared modular policies for agent-agnostic control. In: International Conference on Machine Learning. PMLR, pp. 4455–4464.
- Jabbar et al. (2021) Jabbar A, Li X and Omar B (2021) A survey on generative adversarial networks: Variants, applications, and training. ACM Computing Surveys (CSUR) 54(8): 1–49.
- Jaderberg et al. (2019) Jaderberg M, Czarnecki WM, Dunning I, Marris L, Lever G, Castaneda AG, Beattie C, Rabinowitz NC, Morcos AS, Ruderman A et al. (2019) Human-level performance in 3d multiplayer games with population-based reinforcement learning. Science 364(6443): 859–865.
- Jenal (2011) Jenal M (2011) What ants can teach us about the market. URL https://www.jenal.org/what-ants-can-teach-us-about-the-market/.
- Joachimczak et al. (2016) Joachimczak M, Suzuki R and Arita T (2016) Artificial metamorphosis: Evolutionary design of transforming, soft-bodied robots. Artificial life 22(3): 271–298.
- Kirsch and Schmidhuber (2020) Kirsch L and Schmidhuber J (2020) Meta learning backpropagation and improving it. arXiv preprint arXiv:2012.14905 .
- Kozek et al. (1993) Kozek T, Roska T and Chua LO (1993) Genetic algorithm for cnn template learning. IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications 40(6): 392–402.
- Krizhevsky et al. (2012) Krizhevsky A, Sutskever I and Hinton GE (2012) Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems 25: 1097–1105.
- Lajad et al. (2021) Lajad R, Moreno E and Arenas A (2021) Young honeybees show learned preferences after experiencing adulterated pollen. Scientific reports 11(1): 1–11.
- Leimeister (2010) Leimeister JM (2010) Collective intelligence. Business & Information Systems Engineering 2(4): 245–248.
- Lévy (1997) Lévy P (1997) Collective intelligence.
- Liu et al. (2020) Liu JB, Raza Z and Javaid M (2020) Zagreb connection numbers for cellular neural networks. Discrete Dynamics in Nature and Society 2020.
- Liu et al. (2019) Liu S, Lever G, Merel J, Tunyasuvunakool S, Heess N and Graepel T (2019) Emergent coordination through competition. arXiv preprint arXiv:1902.07151 .
- Mataric (1993) Mataric MJ (1993) Designing emergent behaviors: From local interactions to collective intelligence. In: Proceedings of the Second International Conference on Simulation of Adaptive Behavior. pp. 432–441.
- Mnih et al. (2015) Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, Bellemare MG, Graves A, Riedmiller M, Fidjeland AK, Ostrovski G et al. (2015) Human-level control through deep reinforcement learning. nature 518(7540): 529–533.
- Mordvintsev et al. (2020) Mordvintsev A, Randazzo E, Niklasson E and Levin M (2020) Growing neural cellular automata. Distill 10.23915/distill.00023. URL https://distill.pub/2020/growing-ca.
- Ohsawa et al. (2018) Ohsawa S, Akuzawa K, Matsushima T, Bezerra G, Iwasawa Y, Kajino H, Takenaka S and Matsuo Y (2018) Neuron as an agent. URL https://openreview.net/forum?id=BkfEzz-0-.
- OroojlooyJadid and Hajinezhad (2019) OroojlooyJadid A and Hajinezhad D (2019) A review of cooperative multi-agent deep reinforcement learning. arXiv preprint arXiv:1908.03963 .
- Ott (2020) Ott J (2020) Giving up control: Neurons as reinforcement learning agents. arXiv preprint arXiv:2003.11642 .
- Palm et al. (2022) Palm RB, Duque MG, Sudhakaran S and Risi S (2022) Variational neural cellular automata. In: International Conference on Learning Representations. URL https://openreview.net/forum?id=7fFO4cMBx_9.
- Pathak et al. (2019) Pathak D, Lu C, Darrell T, Isola P and Efros AA (2019) Learning to control self-assembling morphologies: a study of generalization via modularity. arXiv preprint arXiv:1902.05546.
- Peng et al. (2021) Peng Z, Hui KM, Liu C, Zhou B et al. (2021) Learning to simulate self-driven particles system with coordinated policy optimization. Advances in Neural Information Processing Systems 34.
- Pickering (2010) Pickering A (2010) The cybernetic brain. University of Chicago Press.
- Qin et al. (2018) Qin Y, Feng M, Lu H and Cottrell GW (2018) Hierarchical cellular automata for visual saliency. International Journal of Computer Vision 126(7): 751–770.
- Qu et al. (2020) Qu X, Sun Z, Ong YS, Gupta A and Wei P (2020) Minimalistic attacks: How little it takes to fool deep reinforcement learning policies. IEEE Transactions on Cognitive and Developmental Systems.
- Radford et al. (2021) Radford A, Kim JW, Hallacy C, Ramesh A, Goh G, Agarwal S, Sastry G, Askell A, Mishkin P, Clark J et al. (2021) Learning transferable visual models from natural language supervision. arXiv preprint arXiv:2103.00020 .
- Radford et al. (2018) Radford A, Narasimhan K, Salimans T and Sutskever I (2018) Improving language understanding by generative pre-training.
- Radford et al. (2019) Radford A, Wu J, Child R, Luan D, Amodei D, Sutskever I et al. (2019) Language models are unsupervised multitask learners. OpenAI blog 1(8): 9.
- Randazzo et al. (2020) Randazzo E, Mordvintsev A, Niklasson E, Levin M and Greydanus S (2020) Self-classifying mnist digits. Distill 10.23915/distill.00027.002. URL https://distill.pub/2020/selforg/mnist.
- Resnick et al. (2018) Resnick C, Eldridge W, Ha D, Britz D, Foerster J, Togelius J, Cho K and Bruna J (2018) Pommerman: A multi-agent playground. arXiv preprint arXiv:1809.07124 .
- Rubenstein et al. (2014) Rubenstein M, Cornejo A and Nagpal R (2014) Programmable self-assembly in a thousand-robot swarm. Science 345(6198): 795–799.
- Rudin et al. (2021) Rudin N, Hoeller D, Reist P and Hutter M (2021) Learning to walk in minutes using massively parallel deep reinforcement learning. arXiv preprint arXiv:2109.11978 .
- Sanchez-Lengeling et al. (2021) Sanchez-Lengeling B, Reif E, Pearce A and Wiltschko AB (2021) A gentle introduction to graph neural networks. Distill 6(9): e33.
- Sandler et al. (2021) Sandler M, Vladymyrov M, Zhmoginov A, Miller N, Madams T, Jackson A and Arcas BAY (2021) Meta-learning bidirectional update rules. In: International Conference on Machine Learning. PMLR, pp. 9288–9300.
- Sandler et al. (2020) Sandler M, Zhmoginov A, Luo L, Mordvintsev A, Randazzo E et al. (2020) Image segmentation via cellular automata. arXiv preprint arXiv:2008.04965 .
- Schilling (2000) Schilling MA (2000) Toward a general modular systems theory and its application to interfirm product modularity. Academy of Management Review 25(2): 312–334.
- Schilling and Steensma (2001) Schilling MA and Steensma HK (2001) The use of modular organizational forms: An industry-level analysis. Academy of Management Journal 44(6): 1149–1168.
- Schmidhuber (2014) Schmidhuber J (2014) Who invented backpropagation? URL https://people.idsia.ch/~juergen/who-invented-backpropagation.html.
- Schmidhuber (2020) Schmidhuber J (2020) Metalearning machines learn to learn (1987-). URL https://people.idsia.ch/~juergen/metalearning.html.
- Schoenholz and Cubuk (2020) Schoenholz S and Cubuk ED (2020) JAX MD: A framework for differentiable physics. Advances in Neural Information Processing Systems 33.
- Schweitzer and Farmer (2003) Schweitzer F and Farmer JD (2003) Brownian agents and active particles: collective dynamics in the natural and social sciences, volume 1. Springer.
- Seeley (2010) Seeley TD (2010) Honeybee democracy. Princeton University Press.
- Silver et al. (2016) Silver D, Huang A, Maddison CJ, Guez A, Sifre L, Van Den Driessche G, Schrittwieser J, Antonoglou I, Panneershelvam V, Lanctot M et al. (2016) Mastering the game of Go with deep neural networks and tree search. Nature 529(7587): 484–489.
- Simonyan and Zisserman (2014) Simonyan K and Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 .
- Stahlberg (2020) Stahlberg F (2020) Neural machine translation: A review. Journal of Artificial Intelligence Research 69: 343–418.
- Stoy et al. (2010) Stoy K, Brandt D and Christensen DJ (2010) Self-reconfigurable robots: An introduction. MIT Press.
- Suarez et al. (2019) Suarez J, Du Y, Isola P and Mordatch I (2019) Neural MMO: A massively multiagent game environment for training and evaluating intelligent agents. arXiv preprint arXiv:1903.00784.
- Suarez et al. (2021) Suarez J, Du Y, Zhu C, Mordatch I and Isola P (2021) The Neural MMO platform for massively multiagent research. In: Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track. URL https://openreview.net/forum?id=J0d-I8yFtP.
- Sudhakaran et al. (2021) Sudhakaran S, Grbic D, Li S, Katona A, Najarro E, Glanois C and Risi S (2021) Growing 3D artefacts and functional machines with neural cellular automata. arXiv preprint arXiv:2103.08737.
- Sumpter (2010) Sumpter DJ (2010) Collective animal behavior. Princeton University Press.
- Surowiecki (2005) Surowiecki J (2005) The wisdom of crowds. Anchor.
- Tan et al. (2021) Tan X, Qin T, Soong F and Liu TY (2021) A survey on neural speech synthesis. arXiv preprint arXiv:2106.15561 .
- Tang and Ha (2021) Tang Y and Ha D (2021) The sensory neuron as a transformer: Permutation-invariant neural networks for reinforcement learning. In: Thirty-Fifth Conference on Neural Information Processing Systems. URL https://openreview.net/forum?id=wtLW-Amuds. Project page: https://attentionneuron.github.io.
- Tang et al. (2020) Tang Y, Nguyen D and Ha D (2020) Neuroevolution of self-interpretable agents. In: Proceedings of the Genetic and Evolutionary Computation Conference. URL https://attentionagent.github.io.
- Tang et al. (2022) Tang Y, Tian Y and Ha D (2022) Evojax: Hardware-accelerated neuroevolution. arXiv preprint arXiv:2202.05008 .
- Tapscott and Williams (2008) Tapscott D and Williams AD (2008) Wikinomics: How mass collaboration changes everything. Penguin.
- Terry et al. (2020) Terry JK, Black B, Jayakumar M, Hari A, Sullivan R, Santos L, Dieffendahl C, Williams NL, Lokesh Y, Horsch C et al. (2020) PettingZoo: Gym for multi-agent reinforcement learning. arXiv preprint arXiv:2009.14471.
- Toner et al. (2005) Toner J, Tu Y and Ramaswamy S (2005) Hydrodynamics and phases of flocks. Annals of Physics 318(1): 170–244.
- Vinyals et al. (2019) Vinyals O, Babuschkin I, Czarnecki WM, Mathieu M, Dudzik A, Chung J, Choi DH, Powell R, Ewalds T, Georgiev P et al. (2019) Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature 575(7782): 350–354.
- Wang et al. (2018) Wang T, Liao R, Ba J and Fidler S (2018) NerveNet: Learning structured policy with graph neural networks. In: International Conference on Learning Representations.
- Wang et al. (2021) Wang Z, She Q and Ward TE (2021) Generative adversarial networks in computer vision: A survey and taxonomy. ACM Computing Surveys (CSUR) 54(2): 1–38.
- Wolfram (2002) Wolfram S (2002) A new kind of science. Wolfram Media, Champaign, IL.
- Wu et al. (2020) Wu Z, Pan S, Chen F, Long G, Zhang C and Philip SY (2020) A comprehensive survey on graph neural networks. IEEE Transactions on Neural Networks and Learning Systems 32(1): 4–24.
- Zhang et al. (2021) Zhang D, Choi C, Kim J and Kim YM (2021) Learning to generate 3D shapes with generative cellular automata. In: International Conference on Learning Representations. URL https://openreview.net/forum?id=rABUmU3ulQh.
- Zheng et al. (2018) Zheng L, Yang J, Cai H, Zhou M, Zhang W, Wang J and Yu Y (2018) MAgent: A many-agent reinforcement learning platform for artificial collective intelligence. In: Proceedings of the AAAI Conference on Artificial Intelligence, volume 32.