Autonomous and Ubiquitous In-node Learning Algorithms of Active Directed Graphs and Its Storage Behavior

Hui Wei^a∗, Weihua Miao^a, Fushun Li^a
^a Laboratory of Algorithms for Cognitive Models, School of Computer Science, Fudan University, No.2005 Songhu Road, Shanghai and 200438, China

Abstract

Memory is an important cognitive function for humans. How a brain with such a small power can complete such a complex memory function, the working mechanism behind this is undoubtedly fascinating. Engram theory views memory as the co-activation of specific neuronal clusters. From the perspective of graph theory, nodes represent neurons, and directed edges represent synapses. Then the memory engram is the connected subgraph formed between the activated nodes. In this paper, we use subgraphs as physical carriers of information and propose a parallel distributed information storage algorithm based on node scale in active-directed graphs. An active-directed graph is defined as a graph in which each node has autonomous and independent behavior and relies only on information obtained within the local field of view to make decisions. Unlike static-directed graphs used for recording facts, active-directed graphs are decentralized like biological neuron networks and do not have a super manager who has a global view and can control the behavior of each node. Distinct from traditional algorithms with a global field of view, this algorithm is characterized by nodes collaborating globally on resource usage through their limited local field of view. While this strategy may not achieve global optimality as well as algorithms with a global field of view, it offers better robustness, concurrency, decentralization, and bio-viability. Finally, it was tested in network capacity, fault tolerance, and robustness. It was found that the algorithm exhibits a larger network capacity in a more sparse network structure because the subgraph generated by a single sample is not a whole but consists of multiple weakly connected components. In this case, the network capacity can be understood as the number of permutations of several weakly connected components in the network. The algorithm maintains high recall accuracy and completeness when facing error-containing sample inputs or in the presence of node corruption within the network.

Index Terms:

Directed Graph Storage, Connected Subgraphs, Decentralization.

I Introduction

The traditional theoretical memory models based on psychology and cognitive science are the result of the summary and analysis of many experimental phenomena. It decomposes memory activities into the collaboration and interaction of several functional components, which are usually highly abstract and lack the underlying implementation details. However, most of the computational memory models based on artificial neuron networks, represented by Hopfield network [1] and BAM network [2], lack biological authenticity. Therefore, it is necessary to establish an intermediate theoretical model that satisfies biological constraints between the macroscopic models of psychology and the molecular mechanism of memory in neurobiology, so as to help us understand the working mechanism of memory.

Marr [3] proposed a hierarchical framework for analyzing information processing systems, which involves three levels: the theory of computation, representations and algorithms, and hardware implementation. Based on this framework, we analyzed the memory system of the brain. The theory of computation needs to specify the specific requirements of the task, i.e., how the memory storage and retrieval functions are performed by the brain. In the second level, it is necessary to specify the form of memory representation and design algorithms for storing and retrieving memories. The last level is the detailed architecture of connections and communication between neurons in the brain, and since neurobiological studies in this area are more detailed, this paper focuses only on the first two levels. Psychology and cognitive science define and classify memory from a macroscopic point of view, it gives a rough division of memory stages [4]. While neurobiology shows how neurons connect and communicate with each other from a microscopic perspective. It gives the details of implementing the memory system in the hardware dimension [5]. However, these perspectives alone are insufficient for a comprehensive understanding of the memory process as they lack detailed representations and algorithms. To gain a more complete understanding of memory, it is necessary to establish reasonable representations and algorithms at an intermediate level. Computer memory, hard disk, and brain memory have similar functions, and their implementation details are compared in Table I. From the comparison, it can be found that the implementation process of computer storage is very transparent, but there are still systematic gaps in the implementation of human brain memory. It is difficult to provide a logically consistent, sufficiently detailed, and complete theoretical understanding of the entire memory system.

TABLE I: Comparison of computer and brain function implementation details

Function implementation details	Computer	Brain
Minimum functional units	Data is stored in binary format, such as using the polarity of magnetic particles on a disk to represent 0 and 1.	Neuronal membrane potential has two types: resting potential and action potential. Neurons typically maintain a resting potential, but they generate an action potential when activated.
Rapid memory devices	Memory is used to temporarily store the data for CPU calculations and the data for interaction with external devices.	Memory is usually stored in a distributed form in multiple brain regions such as the hippocampus and neocortex. From the neural circuit perspective, we know little about how these functional regions collaborate with each other to achieve information localization and response.
Memory Management	Memory is usually managed in a segment-page approach. The operating system uses segments to distinguish different logical information and fixed-size pages to store the actual content.	We have very limited knowledge about how individual memory items are stored in neural circuits and how multiple memory items are allocated to different neuronal groups.

Directed graphs are a classical data structure used in computer science to model the connections between information, such as citation relationships between papers, interaction relationships between proteins, and acquaintance relationships between people in social networks. In recent years, graph databases [6] and knowledge graphs [7] have been developed to store information in graphs. These databases avoid the need for multi-table unions during traditional relational database queries, providing a more efficient way to access and manage data. However, the directed graphs mentioned above are static and serve as a record of factual information and relationships or a graphical representation of a set of first-order logic propositions. These graphs do not exhibit dynamic behavior independently but act as a data structure required to implement global view-based algorithms. For instance, consider the classical Dijkstra’s shortest path algorithm [8]. In each update, a node in the queue that satisfies the conditions is greedily selected. However, this selection operation is not a behavior of the nodes themselves but rather a task of a super manager with a global view. This manager can access information about all nodes, such as the adjacency matrix, enabling it to select the optimal node. In this case, the directed graph is just an information carrier to ensure the algorithm can run efficiently and correctly.

Looking at directed graphs from the perspective of a Multi-Agent System [9], each node in the graph can be viewed as an independent and fully autonomous agent. Its field of view is limited to the upstream and downstream nodes connected to it, and information is exchanged only between neighboring nodes via directed edges. Every decision made by a node is based on local information, and it can continuously learn and optimize. Now, the directed graph is no longer static but a dynamic system in which neighboring nodes can influence each other. The intrinsic behavior of each node and the external upstream and downstream connections are unique, leading to a rich and complex dynamic behavior of the whole directed graph. In this paper, we term such a directed graph an active-directed graph, which is no longer a manipulated data structure but a cluster of agents with parallel distributed behaviors. An example of an active-directed graph is a biological neural network, where neurons receive and integrate stimulus signals from upstream neurons via dendrites, and subsequently transmit stimulus signals to downstream neurons via axons. Neurons can self-regulate through synaptic plasticity mechanisms like spike-timing-dependent plasticity(STDP) [10]. Since there is no super-manager with a global view to control the behavior of each neuron or node, the behavior of such active-directed graphs is much more complex than static-directed graphs that only record facts.

A directed graph consists of many nodes and directed edges that can be arranged to form numerous connected subgraphs. These connected subgraphs can be seen as a resource, where a subgraph characterizes a state in which several nodes cooperate or relate to each other. Therefore, the entire directed graph can store or remember much content. In a static-directed graph, the number of connected subgraphs obtained by a path-walking algorithm based on a global view is predictable because there is no uncertainty. However, in an active-directed graph, each node has autonomous behavior and relies only on the information obtained within the local field of view for decision-making. This leads to unpredictable functional connected subgraphs’ structure and number. Moreover, since each node can perform incremental adaptive learning, the number of functional subgraphs in the active-directed graph may be much larger and more diverse. To achieve this, corresponding behavior criteria for nodes must be set, which is not required in static-directed graphs.

In a vast active-directed graph, the challenge lies in how to form functional connected subgraphs, consolidate them, efficiently use limited node and path resources, and make multiple connected subgraphs compatible or less interfering with each other. In short, we aim to study how to achieve storage in an active-directed graph. The contribution of this paper is to propose a node-scale-based parallel distributed storage algorithm in active-directed graphs. Unlike traditional algorithms with a global view, the design challenge of such algorithms is that nodes have to collaborate globally on resource usage through their very limited local field of view. This strategy may not achieve global optimality compared to algorithms with global views, but it offers better robustness, concurrency, decentralization, and bio-viability.

II Related Work

Directed graphs have various application modes in storage, and one common mode is the abstract modeling of neuronal networks. One such example is the Hopfield network proposed by John Hopfield in 1982 [1]. It is a fully connected binary recurrent neural network that characterizes the network state by an energy function. Each iteration of the network proceeds towards energy reduction until it reaches a steady state, also known as an attractor. The number of attractors represents the network capacity, which is approximately 0.14N, where N represents the number of nodes in the network. When implementing the associative memory function, the Hopfield network enables complete content retrieval by only part of the sample. However, the capacity of the Hopfield network increases linearly with its network size, making it difficult to preserve too many samples. In 2016, Krotov and Hopfield introduced the discrete modern Hopfield network [11], which allows network capacity to be extended by changing the network energy function and the update rule, but at the corresponding cost of requiring a large number of hidden layer nodes. Demircigil et al. [12] further extended the energy function by introducing exponential interaction functions, increasing the network capacity. In 2021, Ramsauer, Hubert, et al. [13] extended the energy function of modern Hopfield networks from discrete to continuous states while maintaining exponential storage capacity and fast convergence. Hopfield networks, as classical self-associative computational models, enable mapping between vectors of the same dimension. The bidirectional associative memory(BAM) model proposed by Bart Kosko in 1988 [2] can realize both self-association and hetero-association, i.e., mapping between vectors of different dimensions. The model comprises two layers of neurons connected by a weight matrix, which encodes the mapping relationships of all samples. Activating any layer of neurons and iterating through the network results in a correlated output in the neuron on the other layer. In 2021, Bart Kosko [14] introduced a bidirectional backpropagation algorithm in BAM to update matrix parameters dynamically. The original structure was also extended to have any number of hidden layers. The network capacity is increased. In this mode, directed graphs simulate the structure of biological neuron networks, and the network structure is often fixed, such as fully connected or hierarchically connected. The impact of structural parameters such as connectivity, clustering coefficient, and average path length on network performance is not considered. The implementation of the storage function relies more on the weight parameters and update rules of the network, which remains the weight-centric theory. Moreover, all storage contents need to be determined in advance, and the weights are calculated and written at once, making them hard to update incrementally or partially. The local damage to the network will affect the global network, and the scalability of the network scale is not good.

Directed graphs are also commonly used to represent facts and relationships. One popular use case is in graph databases, which employ a graph structure for information queries [15]. In graph databases, nodes represent entities such as people, accounts, or other items, while edges represent connections, such as a friendship relationship. Compared to traditional relational databases, graph databases can perform complex queries more efficiently, as they do not require table join operations but instead search the graph directly. Therefore, they usually offer better performance, especially in the era of big data, where data organization and query complexity have become increasingly important. Graph databases are widely used in various fields, such as in the biomedical domain, for modeling proteins, metabolites, and their relationships, such as digestion and catalysis [16, 17]. In knowledge graphs [18], information is recorded as RDF triples and stored as graphs, laying the groundwork for achieving goals like semantic search and knowledge inference. In these applications, the role of the directed graph is to record facts, and there is no dynamic behavior. This approach does not address capacity issues or the impact of network structure on storage performance.

In contrast to artificial neural networks that rely on fixed connection patterns, weight parameters, and update rules, or static directed graphs that primarily function for fact recording, this paper presents a method for storing information in a directed network in a distributed manner. This approach leverages the autonomous and dynamic behaviors of numerous nodes in the network, such as resource acquisition and competition. The information content is differentiated based on the distinct combinations of nodes and edges. Subgraphs serve as the information storage carriers without relying on super nodes. Additionally, there is no need for a global view. Information is stored through nodes and edges’ local, limited, and adaptive dynamic behaviors. This subgraph-based computational storage model ensures that the stored information remains stable, distinguishable, and fault-tolerant. It also enables incremental storage of information. The performance of this storage method relies on the network’s structural characteristics and the nodes’ adaptive learning algorithm.

III Subgraph-based storage implementation

In the early 20th century, Richard Semon proposed the concept of engrams, which represents the neural substrate for memory function [19]. He believed that an engram is eventually formed when a group of neurons experiences a persistent physical or chemical change in response to an external stimulus. Subsequently, when the original external stimulus comes again, these engram cells are reactivated, enabling the retrieval of memories. If we try to understand it from the perspective of a directed graph. Nodes represent neurons and directed edges represent synaptic connections between neurons. Then the engram is the subgraph formed between the activated nodes, and the difference in the structure of the subgraph represents the different information contents.

Let $G=(V,E)$ be a directed graph, where $V$ denotes the set of nodes and $E$ denotes the set of edges. If $G^{\prime}=(V^{\prime},E^{\prime})$ is a subgraph of $G$ , then it follows that $V^{\prime}\subseteq V$ and $E^{\prime}\subseteq E$ , denoted as $G^{\prime}\subseteq G$ . In a subgraph-based storage implementation, information is recorded in the form of a series of active nodes and interconnected pathways between them, i.e., a subgraph in the network. For instance, consider the message to be stored as: ”While observing a red apple on a tree, I also saw a chirping robin.” In this case, nodes representing semantic elements like red, circle, branch, and chirping are activated simultaneously and propagate stimulus along the directed edges in the network. These nodes are referred to as the initial nodes of this sample. During the stimulus propagation process, the initial nodes activate some otherwise inactive nodes, called communication nodes, which are crucial for establishing the pathways. Not all nodes are directly connected by edges in non-fully connected networks, so communication nodes serve as bridges to establish pathways between the initial nodes. These pathways represent associations between semantic elements, such as recalling a robin when seeing an apple again. This occurs when the initial node representing the apple is activated, and the stimulus is passed along the stored pathway in the network, finally activating the node representing the robin. This implementation draws inspiration from cognitive psychology studies on long-term memory [20, 21].

However, the initial idea is insufficient and requires the design of specific implementation details. For example, a node may only characterize a fundamental physical feature, necessitating multiple nodes to represent a concept like an apple. The initial node may activate some communication nodes, which may activate others. Figure 1a shows 30 randomly selected nodes as initial nodes in a directed network. Figure 1b shows a stable subgraph obtained by propagating the stimulus of these 30 initial nodes through the network with continuous iterations.

Refer to caption — Figure 1: A sample can be stored as a subgraph in a directed graph. (a) Random activation of 30 nodes in the network. (b) Activated nodes propagate stimulus in the network to form a subgraph.

Another factor that makes subgraphs suitable for information storage is the vast number of potential subgraphs present in the network. Given $m=|E|$ , there can be up to $2^{m}$ subgraphs in graph $G$ . Thus, using subgraphs as information storage carriers is a promising idea. The challenge lies in ensuring that the subgraphs do not interfere with or confuse each other, effectively utilizing the node and edge resources of the entire directed graph, enabling incremental storage, reducing unfair resource occupation due to varying sample upload orders, and sharing resources among multiple samples. These technical aspects need to be solved by the parallel distributed network adaptive learning algorithm.

IV Subgraph generation, storage, and retrieval

The storage and retrieval of samples are processes of the initial nodes propagating the stimulus in the network and eventually forming a stable subgraph. Assuming the stimulus propagation time between nodes is constant. The subgraph eventually reaches a stable state through iterations. The formal definition of a stable subgraph is as follows: let $V_{t}$ represent the set of all active nodes in the network at time $t$ . There exists a minimum time $t^{\prime}$ such that $V_{t^{\prime}}\neq V_{t^{\prime}-1}$ and $V_{t^{\prime}}=V_{t^{\prime}+k}$ for any positive integer $k$ . At this point, the network is considered to be in a stable state at time $t^{\prime}$ , and the subgraph comprising all active nodes and edges is the stable subgraph. There are two primary concerns: first, how the subgraph is recorded in the network, and second, what rules nodes use for stimulus propagation. Section 4.1 describes the recording of subgraphs, while sections 4.2 to 4.4 outline the stimulus propagation rules. Section 4.5 presents the specific procedure for sample storage and retrieval.

IV-A The node internal index table records the local upstream and downstream connectivity traces

The storage of subgraph structures involves recording the connectivity paths between active nodes, which can only be accomplished by the nodes themselves based on their local perspectives. This necessitates that active nodes individually record activation information upstream and downstream of themselves. Define the activation trace of node $u$ as a path fragment consisting of active fan-in nodes and active fan-out nodes of node $u$ . Storing the activation traces of all active nodes during this sample storage process will complete the subgraph storage.

In this paper, we store node activation traces by introducing an index table in each node, a data structure with small capacity, easy access, and simple updating. Figure 2 shows the structure of the index table, which contains two columns: the first for active fan-in nodes and the second for active fan-out nodes. Figure 2a shows that node B’s index table contains no content before storing the samples. Once a sample is stored, its corresponding activation trace is saved in its index table. As shown in Figure 2b, during sample retrieval, if node B receives the same or similar input as the recorded activation traces, it generates the corresponding output based on the historical records in the index table.

Defining an index table inside each node that records upstream and downstream active path pairing relationships may appear straightforward and crude, but it is also biologically feasible. Biological neurons have dendrites that receive inputs from multiple directions and axonal that transmit outputs in different directions, creating a many-to-many connection. Actual physical signaling between upstream and downstream neurons relies on synapses, regulated by combinations of diverse neurotransmitters and ion pumps. These mechanisms precisely control the direction and intensity of positive and negative charge flow. Additionally, differences in synapse location, such as being distal or proximal to the axonal, on dendrites or axons, or on the main pathway or terminal, can precisely control the activation and deactivation of specific action potential transmission pathways. In conclusion, this highly precise and diverse molecular-level and subcellular-level modulation and their combinations equip biological neurons with various pathway control mechanisms at the microcircuit level [22, 23]. As a result, a biological neuron can achieve diverse pathway control of signaling within its small neighborhood, relying on a complex set of electrochemical processes [5]. This has inspired the design of directed graph nodes’ internal behavior, allowing them to function like network routers capable of differentially leading fan-outs based on fan-in variations. An index table with limited storage space is a simple, functional equivalent implementation.

IV-B Intra-node stimulus propagation algorithm

The creation of subgraphs depends on the propagation of stimulus between nodes. Stimulus propagation consists of two aspects: node activation rules, i.e., how nodes are activated, and stimulus propagation rules, i.e., determining the downstream nodes to which the stimulus is propagated. In this paper, the node activation rule employed is a fixed-probability activation model, where a node will be activated with a fixed probability upon receiving input. The node becomes active and begins delivering stimulus to downstream nodes if successfully activated. Two types of stimulus propagation rules are used in this paper: the first reuses similar historical activation traces, and the second employs a weighted random selection algorithm.

The specific process of stimulus propagation among nodes is as follows: each resting state node has a fixed probability of H to be activated after receiving the stimulus. Once a node is activated, if its index table is empty, it randomly selects several downstream nodes with equal probability for stimulus delivery. If the index table is not empty, the similarity between the input and each item in the node index table is calculated first. In this paper, the F1 score [24] is used as a metric to evaluate the similarity of two node sequences. The F1 score is a statistical measure of the accuracy of a binary classification model, which is the harmonic mean of precision and recall. When comparing similarity, either one of the node sequences can be treated as the predicted value and the other as the actual value, and the corresponding F1 score is calculated. Higher scores indicate greater similarity. The corresponding historical activation traces are reused if the maximum similarity exceeds the threshold. Otherwise, a weighted random selection algorithm is used to select downstream nodes for stimulus delivery.

The weighted random selection algorithm is based on the frequency of the downstream node appearing in all active fan-out nodes in the current node index table. The higher the frequency of occurrence, the lower the chance of being selected. The introduction of this algorithm allows each active node to distribute stimulus evenly, thus maximizing network resource utilization. The algorithm pseudo-code is shown in Algorithm 1. The value of $H$ affects the subgraph size. The larger $H$ is, the more nodes participate in subgraph formation and the better the connectivity. However, the corresponding cost of network resources is also larger. In this paper, $H$ is set at 60% for testing.

Data:

Table[u]

u

’s index table.

currentIn[u]

: Current input of

u

Threshold

: Similarity threshold.

H

: Probability of being activated.

G[u]

u

’s adjacency table.

eOutNum

: The expected value of output size.

freq[v]

: The occurrence frequency of

v

in the index table of

u

Input:

u

: Current node

Output:

uOut

: The fan-out nodes of

u

1 if $randomValue(0,1)>H$ then

/* Activation failed */

2 return NULL;

4 end if

mxF1\leftarrow 0

;

uOut\leftarrow NULL

;

7 for $item$ in $Table[u]$ do

/* Traverse the index table and find the item most similar to

currentIn[u]

f1Score\leftarrow CalculateF1Score(item.in,currentIn[u])

;

9 if $f1Score>Threshold$ and $mxF1<f1Score$ then

mxF1\leftarrow f1Score

;

uOut\leftarrow item.out

;

13 end if

15 end for

16if $uOut\neq NULL$ then

17 return

uOut

;

19 end if

20for $v$ in $G[u]$ do

freq[v]\leftarrow getFrequencyOfOccurrence(v,Table[u])

;

23 end for

uOut\leftarrow randomChoose(G[u],freq,eOutNum)

;

25 return

uOut

;

Algorithm 1

NodeStimulusSpreading(u)

IV-C Node Resource Grabbing Rules

Nodes are considered limited resources in a directed graph, adhering to the first-come, first-served preemption rule. Activating a node can be seen as the occupation of a node resource. When performing stimulus propagation, nodes can acquire the occupancy of downstream nodes to avoid passing stimulus to already occupied nodes. If an active node does not successfully activate any downstream nodes, it returns to a resting state. A change in the state of some active nodes may trigger a chain reaction that causes more active nodes to become resting. This situation is called the avalanche effect. As shown in Figure 3, at $t_{0}$ , node B receives a stimulus from node A, then subsequently activated at $t_{1}$ . However, because node C has been occupied, node B cannot transmit the stimulus to node C, causing node B to revert to the resting state at the $t_{2}$ moment. At this point, node A is not activating any nodes due to the change in the state of node B. Therefore, at $t_{3}$ , node A also becomes resting due to the avalanche effect.

IV-D Several problems are caused by insufficient resources in subgraph generation

As the number of samples stored in the network increases, new subgraphs may encounter some problems caused by insufficient resources during the generation process, preventing new samples from being stored. The causes of insufficient resources in the network can be broadly classified into four categories:

1. The node index table has a capacity limit. When the number of samples stored in the network reaches a certain level, it becomes hard to store new samples.

2. The network is poorly connected, and the stimulus propagation rule carries a certain level of randomness, as well as the existence of the avalanche effect, which leads to an inability to establish a pathway between the initial nodes.

3. The samples already stored in the network interfere with the samples currently about to be stored. This is because nodes may reuse historical activation traces when they are activated. Although this is an optimization strategy to increase network capacity, it somewhat affects the storage of current samples.

4. Some active nodes take up too many node resources during the current activation, resulting in no resources available for other nodes.

For the first case, the capacity of the index table can be defined as the maximum number of output types that a node can store, as there may be many different inputs corresponding to the same output. The capacity can also be expanded by reasonably discarding and merging the contents of the index table. Specifically, when a node’s index table capacity reaches its upper limit, the node will search for the two most similar activation traces to merge. The similarity here refers to the similarity of the active fan-in nodes in the two activation traces, and merging refers to taking the intersection of the active fan-out nodes of the two activation traces. If the differences between the activation traces are both large, the one with the lowest strength is discarded. The strength here refers to the number of samples that the activation trace has been involved in storing. The more involved, the higher the strength. By reasonably merging and discarding, it is possible to increase network capacity as much as possible at the expense of certain recall accuracy and completeness. The pseudo-code of the algorithm is given by Algorithm 2.

Data:

activeNodes

: active nodes set.

Outputs[u]

: A collection of different output in

Table[u]

K

: The upper limit of the output type.

1 for $u$ in $activePoints$ do

Table[u][in[u]]=out[u]

;

3 if $out[u]$ not in $Output[u]$ then

Output[u].add(out[u])

;

6 end if

7 if $Output[u].size()>K$ then

/* Find the two most similar outputs to merge */

out1,out2\leftarrow findTwoMostSimilarOutput(Output[u])

;

9 if $out1\neq NULL$ and $out2\neq NULL$ then

out3\leftarrow merge(out1,out2)

;

Output[u].remove(out1)

;

Output[u].remove(out2)

;

Output[u].add(out3)

;

/* Change the output of all items whose output is

out1

out2

Table[u]

out3

changeItemsInTable(Table[u],out1,out3)

;

changeItemsInTable(Table[u],out2,out3)

;

17 else

/* If not found, discard the output with the lowest strength */

out1\leftarrow findLowestIntensityOutput(Table[u])

;

Output[u].remove(out1)

;

/* Delete all items whose output is

out1

changeItemsInTable(Table[u],out1,NULL)

;

22 end if

24 end if

26 end for

Algorithm 2

ActivationTracesUpdating()

The second and third cases can be solved by introducing a re-pathfinding rule. The re-pathfinding rule allows the initial nodes to re-propagate the stimulus to find a path connecting other initial nodes when it does not successfully activate any downstream node. For the fourth case, the active node can release some occupied resources according to its situation. The resource release algorithm is introduced here to solve this problem. When the number of failed re-pathfindings of an initial node reaches a certain threshold, it will enter a dormant state, indicating that it is currently unable to communicate with other nodes. The node in the dormant state suspends pathfinding until the subgraph stabilizes. Active nodes will release some nodes after the subgraph is stabilized to make resources available to dormant nodes. For example, if an active node has three active fan-out nodes, it can actively release the occupation of two of them. After releasing the redundant resources, the nodes in the dormant state will resume pathfinding until the subgraph stabilizes again. If, after releasing the resources, the dormant node is still unable to establish path connections to other active nodes, the network connectivity is considered poor, or there is a conflict between the current sample and the samples already stored in the network. In this case, the subgraph can still be formed. However, there will be some isolated nodes that cannot establish connections with other nodes, leading to a decrease in the subgraph’s anti-interference ability and fault tolerance. The pseudo-code of the resource release algorithm is given by Algorithm 3. Figure 4 shows the transition relationship between the three node states. Figure 5 presents the flowchart of the algorithm from the node’s perspective, which includes how nodes process the received stimulus and how downstream nodes are selected for stimulus delivery.

Data:

dormantCnt

: The number of nodes in dormant state.

isDormant[u]

: Whether

u

is in dormant state.

repathCnt[u]

: Indicates the number of re-pathfinding.

Output:

True

: The resource is released successfully and needs to continue to iterate.

False

: No node resources can be released, the iteration is complete.

1 if $dormantCnt>0$ then

isReleased\leftarrow False

;

/* The resources will be released if the subgraph is stable and several nodes are dormant. */

3 for $u\leftarrow 0$ to $n$ do

4 if out degree of $u$ ¿ 1 then

/* Release redundant resources */

isReleased\leftarrow releaseResources(u)

;

7 end if

8 if $isDormant[u]==True$ then

isDormant[u]\leftarrow False

;

dormantCnt\leftarrow dormantCnt-1

;

repathCnt[u]\leftarrow 0

;

13 end if

15 end for

16 return

isReleased

;

18 end if

19return

False

;

Algorithm 3

ResourcesReleasing()

IV-E Sample Storage and Retrieval

The sample storage process consists of two stages: (1) Stimulus propagation stage: the initial nodes propagate stimulus to other nodes along the directed edges until a stable subgraph is formed. (2) Subgraph consolidation stage: All nodes in the subgraph update their internal index tables, recording the activation traces. Figure 6 shows the storage algorithm from the processor’s perspective and the flowchart of the algorithm from the task scheduling perspective. The corresponding pseudo-code is given by Algorithm 4.

Stimulus propagation stage: Table II demonstrates a complete process of generating a stable subgraph through continuous iteration of the initial nodes. At the time $t_{0}$ , the initial nodes are activated, and downstream nodes are chosen for stimulus propagation according to the weighted random selection algorithm. The subsequent $t_{1}$ and $t_{2}$ moments represent the continuous stimulus propagation in the network. At the time $t_{3}$ , since downstream nodes B and C of node H have been occupied by other nodes, node H cannot perform stimulus transfer. Therefore, according to the node resource-grabbing rules, node H transitions from the active state to the resting state. At the time $t_{4}$ , downstream node H, excited by node J, reverts to a resting state. At this point, node J does not activate any nodes, and due to the avalanche effect, its state also becomes a resting state. After the end of time $t_{4}$ , node D will no longer propagate stimulus. However, as the initial node, it will follow the re-pathfinding rules, searching for a new path and attempting to participate in the subgraph formation. When re-pathfinding reaches a certain number of attempts, the node will enter a dormant state. Here, it is assumed that node D has entered a dormant state and will halt pathfinding until the subgraph is stable. It can be observed that at time $t_{4}$ , the subgraph is already stable since there will be no change in node states. At this point, it is necessary for other active nodes in the network to release redundant resources, providing node D the opportunity to re-engage in the subgraph formation. At the time $t_{5}$ , node A, which originally occupied both node E and node G resources, can choose to release the occupation of either of the two nodes. Assuming that node E is released, node E will become resting, and the stimulus from node E to node B will also vanish. However, because node B is an initial node, its state will not change. After the resource is released, node D resumes pathfinding and node J is activated at time $t_{6}$ . At the time $t_{7}$ , node H is activated by node J, and the stimulus is passed to the initial node B, forming a path. At this moment, the connected subgraph between active nodes becomes stable, no dormant nodes are present in the network, and the stimulus propagation stage concludes.

[Uncaptioned image] — TABLE II: Example of a dynamic process for directed graph stimulus propagation

Subgraph Consolidation Stage: The primary task of this stage is to store a stable subgraph structure in the network. When the subgraph achieves a stable state, each active node will have corresponding active fan-in and active fan-out nodes, which are activation traces. Storing the subgraph is completed by updating the activation trace of each node in the node’s internal index table. The pseudo-code of the index table update algorithm is provided by Algorithm 2, and the pseudo-code of subgraph generation and preservation is given by Algorithm 4.

Data:

isActive[u]

: whether

u

is activated.

isInitialNode[u]

: Whether

u

is a initial node.

changed

: Whether the network has iterated to a stable state.

Input:

initialNodes

dormantCnt\leftarrow 0

;

4 for $u$ in $initialNodes$ do

isActive[u]\leftarrow True

;

isInitialNode[u]\leftarrow True

;

8 end for

9while $True$ do

changed\leftarrow False

;

11 for $u\leftarrow 0$ to $n$ do

u

receives stimulus from upstream nodes */

receiveStimulus(u)

;

14 end for

15 for $u\leftarrow 0$ to $n$ do

16 if $isDormant[u]==True$ then

17 continue;

19 end if

u

delivers stimulus to downstream nodes */

NodeStimulusSpreading(u)

;

22 end for

/* The network is not stable, continue to iterate */

23 if $changed==True$ then

24 continue;

26 end if

/* The network is stable and releases resources */

27 if $ResourcesReleasing()==False$ then

28 break;

30 end if

32 end while

ActivationTracesUpdating()

;

Saving(initialNodes)

Algorithm 4

S ​ u ​ b ​ g ​ r ​ a ​ p ​ h ​ G ​ e ​ n ​ e ​ r ​ a ​ t ​ i ​ n ​ g ​ A ​ n ​ d

Saving(initialNodes)

The process of sample retrieval closely resembles sample storage. However, the sample retrieval process is simpler, only including the stimulus propagation stage. During the stimulus propagation stage, it will not be activated when a node receives a stimulus transfer from an upstream node and cannot find a similar entry in its internal index table. The initial nodes will not enter the dormant state, and no nodes will release excessively occupied resources. In summary, the sample retrieval algorithm will not cause any changes to the existing network. However, it will only perform stimulus propagation based on the activation traces stored in the internal index table of the node. Since the sample retrieval algorithm process is very similar to the storage algorithm, only the specified part mentioned above needs to be omitted. Therefore, no separate pseudo-code is provided here.

V Experiments

The experiment is primarily divided into four aspects:

1. Capacity testing: This aims to investigate the number of samples the network can stably store and the factors influencing network capacity.

2. Fault tolerance testing: This mainly explores the effect of sample retrieval when the input is incomplete or has noise.

3. Robustness testing: This mainly explores the effect of sample retrieval when some nodes or edges are damaged.

4. Performance testing on different classical network structures: This mainly explores the algorithm’s performance on various classic network structures.

The network used in the experiment is ER random graphs [25]. This classic random network model, proposed by Paul Erdős and Alfréd Rényi in 1959, is defined by having a probability p of connection between any two nodes in the network. Extending this definition to directed graphs, two distinct directional edges can be between any two nodes. The probabilities of these two edges existing are independent of each other, and both are equal to $p$ .

V-A Capacity testing

Let $G_{i}=(V_{i},E_{i})$ represent the subgraph generated of the $i$ th sample, where $V_{i}$ denotes the set of nodes, and $E_{i}$ denotes the set of edges. Let $G^{\prime}_{i}=(V^{\prime}_{i},E^{\prime}_{i})$ represent the subgraph generated during the retrieval of the $i$ th sample. Define the accuracy $P_{i}=\frac{|E_{i}\cap E^{\prime}_{i}|}{|E^{\prime}_{i}|}$ . Define the completeness $C_{i}=\frac{|E_{i}\cap E^{\prime}_{i}|}{|E_{i}|}$ . Define an isolated node as an initial node in the subgraph with both in-degree and out-degree equal to zero. Define the sample representation quality, $Q$ , as the percentage of non-isolated nodes relative to the initial nodes. If the number of initial nodes in the sample is $s$ , and the number of isolated nodes in the subgraph generated by the sample is $l$ , then $Q=\frac{s-l}{s}$ . In capacity testing, each stored sample should satisfy a high sample representation quality and high completeness and accuracy during sample retrieval to be considered successfully stored by the network. Based on this, we can define the reliable capacity of the network. Let the reliable capacity, $T$ , represent the maximum number of samples that the network can successfully store, and for each stored sample, it satisfies $Q_{i}>0.9$ , $i\in[1,T]$ , $\bar{P}=\frac{\sum_{i=1}^{T}P_{i}}{T}>0.9$ , $\bar{C}=\frac{\sum_{i=1}^{T}C_{i}}{T}>0.9$ .

Assume the network has $n$ nodes, each node has an internal index table with a capacity of $K$ , and each subgraph contains, on average, $s$ initial nodes and $c$ communication nodes. For every subgraph stored in the network, there will be an increase or modification in the internal index table entries of the activated nodes. Considering this as a resource, the total number of resources in the network is $nK$ , and each subgraph occupies $s+c$ resources. Without considering resource reuse or optimization measures, the network will consume $s+c$ resources for every stored sample, so the network capacity can be roughly represented as $\frac{nK}{s+c}$ . If resource reuse is allowed, the calculation of network capacity becomes more complex. In an extreme case, where the resources occupied by the current subgraph are all reused, the upper bound of the network capacity can be roughly expressed as the combination number $\tbinom{nK}{s+c}$ . The network capacity obtained from these two different calculation methods differs greatly. In actual testing, there are many other influencing factors, such as different network connectivity and conflicts between samples. Therefore, a theoretical network capacity analysis is difficult, and a specific analysis should be conducted in conjunction with actual testing situations.

Table III shows the performance of sample retrieval after storing 1,000 samples in networks. The node index table size is set to $K=20$ . The scale of a single sample refers to the size of the initial node set. As shown in Table III, the capacity of sparse graphs is typically larger than that of dense graphs with the same node size. The primary distinction between sparse and dense graphs lies in the number of directed edges within the network, which directly influences network connectivity. This can also be observed from the average number of weakly connected components in the subgraphs presented in Table III. For networks with the same node size, the more edges they have, the fewer weakly connected components their subgraphs have on average. Generally, the subgraphs generated by samples are not necessarily connected but are composed of multiple connected components. Weakly connected components (WCCs) are defined as components where undirected edges replace all directed edges, and any two points within the component are reachable from one another. The number of connected components reflects the aggregation of the subgraph. A greater number of connected components indicates a more dispersed subgraph, while a smaller number of connected components signifies a more clustered subgraph.

TABLE III: Different-scale network retrieval performance after storing 1000 samples

Number of network nodes	Number of edges	Single sample scale	Subgraph average number of nodes	Subgraph average number of edges	Subgraph average number of WCCs	Average accuracy	Average completeness
500	3101(Sparse)	15	24.024	15.651	3.372	99.6%	98.4%
500	3101(Sparse)	60	85.343	70.763	10.312	97.5%	94.8%
500	12606(Dense)	15	28.407	27.774	2.760	98.1%	67.2%
500	12606(Dense)	60	66.590	88.406	1.720	99.0%	63.2%
2000	15037(Sparse)	15	30.428	20.664	3.391	99.5%	98.7%
2000	15037(Sparse)	60	101.564	71.571	14.048	99.4%	98.4%
2000	199452(Dense)	15	38.593	41.987	1.706	100%	82.5%
2000	199452(Dense)	60	68.729	104.784	1.356	100%	57.9%

The connectivity or structure of subgraphs is undeniably a crucial factor influencing network capacity, as it determines the resource usage of each subgraph. There are two main factors that impact subgraph connectivity: the scale of a single sample and network connectivity. Table III demonstrates that a larger scale of a single sample and better network connectivity will reduce network capacity. This observation is intuitive for the former but counterintuitive for the latter. However, when the scale of a single sample node is 0 or network connectivity is extremely poor, the network capacity tends to be 0. This suggests that the relationship between network capacity and subgraph connectivity is not linear.

Figure 7 shows the changes in network capacity and subgraph structure as the number of edges in a network increases. The node size of the network is 500, and the single sample scale is 60. It can be observed that the network capacity first rises and then declines, eventually stabilizing near the theoretical capacity value in the simple case, which is $\frac{nK}{s+c}$ . During the stage of network capacity growth, both the average number of WCCs and the average number of nodes in the subgraph decrease, suggesting that the subgraph progressively transitions from ”dispersed” to ”clustered.” Subsequently, there is a sharp decline in network capacity, and the average number of WCCs in the subgraph also drops dramatically. This indicates that the network connectivity has reached a critical point, with almost all nodes in the subgraph belonging to the same WCC. As a result, the ”agglomeration effect” emerges. It means most initial nodes can connect directly without passing through other communicating nodes. When the network capacity reaches its peak, the average number of nodes in the subgraphs is close to the scale of a single sample, and the number of WCCs is slightly above 1.

Erdős and Rényi [26] demonstrated that when $p>\frac{(1+\epsilon)\ln n}{n}$ , the ER random graph $G(n,p)$ is almost always connected. To ensure that the subgraphs generated by the samples have a high probability of only 1 WCC, it needs to guarantee that $p>\frac{(1+\epsilon)\ln s}{s}$ , where $s$ represents the scale of a single sample, which is 60. Let’s take $p=\frac{\ln s}{s}\approx 0.07$ . The generated subgraph in this scenario is shown in Figure 8a. If we take $p=0.04$ , corresponding to the $p$ when the network capacity reaches its peak, the generated subgraph is shown in Figure 8b. It can be found that the essence of large capacity is actually the permutation and combination of multiple WCCs. When $p$ is slightly less than $\frac{\ln s}{s}$ , the subgraphs generated by the samples are composed of a small number of WCCs. Assuming the subgraph is evenly divided into $t$ WCCs, the size of each WCC is $\frac{s+c}{t}$ . The network capacity can be perceived as selecting $t$ WCCs from all possible ones. This is essentially a Uniform disordered grouping problem. The calculation result is shown in formula 1. Although the actual capacity is significantly smaller than this value, it still demonstrates the huge storage potential of the network.

	$\displaystyle T$	$\displaystyle=$	$\displaystyle\frac{\tbinom{n}{\frac{s+c}{t}}\tbinom{n-\frac{s+c}{t}}{\frac{s+c}{t}}...\tbinom{n-(t-1)\frac{s+c}{t}}{\frac{s+c}{t}}}{t!}$		(1)
		$\displaystyle=$	$\displaystyle\frac{n!}{(t!)^{\frac{s+c+t}{t}}(n-s-c)!}$		(1)

Figure 9 demonstrates the capacity difference between a sparse graph and a dense graph, both with the same number of nodes. It can be observed that in the sparse graph, the average completeness of sample retrieval drops below 80% when the number of stored samples exceeds 8000. In contrast, for the dense graph, the average completeness of sample retrieval declines below 80% when the number of stored samples approaches 300. This capacity difference between the two networks further confirms that the arrangement and combination of WCCs are the essences of large capacity. Although the dense graph has more connections, its displayed capacity is not directly proportional to the number of resources owned by the network. Conversely, the sparse graph has only a small number of connections, but the network capacity achieved by the arrangement and combination of multiple connected subgraphs is several times that of the dense graph. This indicates that a sparse connection is a more reasonable mode, which can effectively save resources and obtain a larger network capacity. Moreover, the biological neuron network of the human brain also follows a sparse connection mode, implying that sparse connections are efficient and maximize the use of resources.

V-B Fault tolerance testing

The fault tolerance testing primarily explores the effect of sample retrieval when the input is incomplete or has noise. The network used in the experiment is an ER random graph with 500 nodes and 3101 (sparse graph) and 12606 (dense graph) edges, respectively. The experiment first stores 1000 samples in the network, then modifies the sample inputs during the retrieval process. Finally, compare how the average accuracy and completeness change when the sample input is incomplete or has noise. There are three categories of modifications to inputs:

1. Removing a part of the original sample input to explore the impact of incomplete inputs on sample retrieval performance.

2. Adding extra noisy nodes to the original sample to investigate the impact of noise on retrieval.

3. Removing part of the original sample input and replacing it with an equal number of noisy nodes to examine the impact of sample retrieval in this mixed scenario.

Figure 10 shows that as more sample inputs are missing, the accuracy and completeness of retrieval decrease to varying degrees in both sparse and dense graphs. The change trends of the two networks are generally similar, and the decline in accuracy is relatively gentle. When the proportion of missing parts reaches 80% of the input, the rate of accuracy decline increases significantly. Compared to Figure 10b, Figure 10a has higher completeness and accuracy when the proportion of missing inputs is between 0.0 and 0.1. This is because the network capacity of the dense graph is small, making it difficult to achieve high reading accuracy and completeness after storing 1000 samples. The overall decline rate of completeness is greater than that of accuracy, indicating that the erroneous content obtained during retrieval does not increase as the proportion of missing inputs increases. This suggests that the algorithm is relatively reliable when facing missing sample inputs, although the rapid decline in completeness represents a significant amount of correct content that cannot be read. However, even when the proportion of missing inputs is as high as 80%, the retrieval accuracy can still be maintained at around 40% to 50%, which means that even if there are numerous missing inputs, almost half of the read content is correct and reliable.

Figure 11 shows the impact of increasing noise nodes. It can be observed that these additional noise nodes have relatively little effect on the accuracy and completeness of sample retrieval. The decline rate of completeness is lower than that of accuracy because adding noise nodes generally do not directly disrupt the original subgraph structure but makes the final subgraph larger. This demonstrates that the network has a relatively strong resistance to noise and is sensitive to the absence of sample inputs.

Figure 12 presents the performance of both incomplete and contained noise nodes in the sample input under sparse and dense graphs, respectively. It can be seen that the decline rate of accuracy and completeness, in this case, is the fastest, indicating that the impact of missing inputs and the effect of noise nodes can be superimposed.

V-C Robustness testing

It is known that neurons in the biological brain may experience various functional failures. How does this affect memory? This section primarily examines how the performance of sample retrieval changes when the network is damaged to different degrees. The test includes two main aspects: damage to some nodes and damage to some directed edges. The experiment first stores 1000 samples in the network, then deletes a certain proportion of nodes or directed edges and subsequently attempts to retrieve these samples while recording average accuracy and completeness changes. The network used in the experiment is an ER random graph with 500 nodes and 3101 edges (sparse graph) or 12606 edges (dense graph).

After a node or a directed edge is deleted, the activation traces recorded in the node’s internal index table will be affected. Assume that an index table contains two items: A,B,C → X,Y,Z and B,C,D → U,X,Z. If nodes A, D, and Z are deleted, does it need to delete the corresponding node in the activation trace recorded in its index table? If deleted, then these two items will become: B,C → X,Y and B,C → U,X. It can be observed that the input parts of these two items are the same, so addressing the different output parts is a challenge. Usually, during the initial period after network damage, nodes are hard to respond, and at this time, the original traces stored in the node index table will not change. As the damage duration increases, nodes may gradually make corresponding adaptive adjustments to the damaged network. Given the above two different situations, this paper proposes four restoration schemes, as shown in Table IV, and compares these four solutions.

TABLE IV: Restoration schemes for the index table after damage

Restoration schemes	The modified content of the index table
Maintain the original trace	{A,B,C} $\rightarrow$ {X,Y,Z}, {B,C,D} $\rightarrow$ {U,X,Z}
Take the union of the outputs	{B,C} $\rightarrow$ {X,Y,U}
Take the intersection of the outputs	{B,C} $\rightarrow$ {X}
Take the output with the highest occurrence frequency	{B,C} $\rightarrow$ {X,Y} or {U,X}

Figure 13 demonstrates the impact of partial node damage on sample retrieval performance. Figure 13a and Figure 13b respectively display the changes in average accuracy and completeness of sample retrieval in the sparse graph for the four restoration schemes. In terms of average accuracy, the scheme that maintains the original traces performs the best, the scheme that takes the union performs the worst, and the other two schemes exhibit similar performance. Conversely, in terms of average completeness, the results are reversed. The scheme that takes the union performs the best, while the one that maintains the original traces performs the worst. This is because the union-taking scheme increases the number of activated nodes, which includes both incorrect and correct nodes. The former leads to a decrease in accuracy, while the latter leads to an increase in completeness. Figure 13c and Figure 13d present the results on the dense graph, revealing that when the network has a large number of edges, the differences between the four restoration schemes progressively diminish. This occurs because when network connectivity is high, the number of communication nodes in the subgraph generated by the sample is small, with most initial nodes being directly connected. Consequently, the accuracy remains consistently high. The frequency at which each node is shared by different samples is also reduced, so when a node is deleted, the number of samples it affects decreases, making the differences between the four solutions less noticeable.

Figure 14 shows the impact of partial directed edge damage on sample retrieval. Since a node only has a local view, it can receive and transmit the information of neighboring nodes solely through its fan-in and fan-out edges. Node damage can be understood as the interruption of all fan-in and fan-out connections, so the impact on neighboring nodes is essentially the same, whether node damage or directed edge damage. Consequently, the same restoration schemes can be used. Figure 14a and Figure 14b respectively display the changes in average accuracy and completeness of sample retrieval for the four restoration schemes in the sparse graph. Their trends are almost consistent with Figure 13a and Figure 13b. Regarding accuracy, the notable difference between the two is that in the interval [0.8,1.0], Figure 14a maintains relatively high accuracy. Regarding completeness, the curve of Figure 14b is comparatively flat. Figure 14c and Figure 14d are the results of dense graphs. The results are also consistent with those of Figure 13c and Figure 13d. The difference is the same as that observed in the sparse graph, which indicates that the network is significantly more tolerant of directed edge damage than node damage, as nodes hold information while edges do not.

V-D Performance testing on different classical network structures

The information storage and retrieval algorithm proposed in this paper is closely related to the network structure. Firstly, the algorithm utilizes the subgraph structure as the information storage carrier, and secondly, the subgraph formation depends on stimulus propagation. Both of these characteristics emphasize the importance of the network structure for the algorithm. Therefore, different network structure characteristics are key factors affecting the algorithm’s performance.

Figure 15 showcases six classic network structures. Figure 15a is the ER graph with $p=0.1$ . Figure 15b represents a globally coupled network, also known as a fully connected network. Figure 15c shows the nearest-neighbor coupled network, characterized by $N$ nodes arranged in a ring, with each node establishing connections to its left and right $L$ neighbors, respectively. Figure 15d illustrates a star coupled network, featuring a central node to which all other nodes are connected. This characteristic causes any path between two points in the network to include the central node, creating a bottleneck for the entire network capacity. Figure 15e presents a one-dimensional Kleinberg network [27], a small-world network [28]. The network is constructed by adding a few random edges to the nearest-neighbor coupled network. Figure 15f displays the Price network [29], which is a scale-free network. The network’s generation relies on the preferential attachment mechanism, where newly added nodes are more likely to connect to nodes with higher degrees. Since each newly added directed edge point from the new node to the old node, there are no loops in the network, leading to a significant decrease in network connectivity and capacity.

Evaluation parameters for different network structures typically include average path length and clustering coefficient.

Average Path Length: Defined as the average of the shortest path lengths between any two points in the network. The default shortest path length is usually positive infinity if the two nodes are disconnected. This special case is common in directed graphs. To prevent the calculation result from being positive infinity, this paper uses the harmonic mean [30] of the distance between any two nodes in the network to represent the average path length.

$N$ represents the number of network nodes, and $d(i,j)$ represents the shortest distance between node $i$ and node $j$ . $GE$ represents network communication efficiency, with its essential idea being that the closer the node path distance in the network, the higher the communication efficiency. The average path length calculated by the formula 2 [30] solves the problem of the value being positive infinity when the network is disconnected. Therefore, it is more suitable for evaluating directed graph network structure.

L=\frac{1}{GE},GE=\frac{1}{N(N-1)}\sum_{i\geq j}\frac{1}{d(i,j)}

(2)

Clustering coefficient: This metric is used to measure whether the nodes in the network exhibit aggregation characteristics. This paper adopts the calculation method of the clustering coefficient in directed graphs proposed by Fagiolo [31].

Table V displays the test results of six network models with different structures but similar scales. The number of nodes in all test networks is 1000, and the single sample scale is 60. The ER random graph exhibits a small clustering coefficient and average path length, while the nearest-neighbor coupled network has a large clustering coefficient and average path length. The Kleinberg directed small-world network has a large clustering coefficient and a small average path length. Kleinberg’s directed small-world and nearest-neighbor coupled network demonstrate relatively excellent network capacity among these six network types. Figure 16 illustrates examples of sample storage corresponding to the two networks. A common feature observed in both networks is the presence of numerous small WCCs. As previously mentioned in the network capacity analysis, the essence of large network capacity lies in the arrangement and combination of WCCs. Since both networks have relatively high clustering coefficients, it is quite easy to form small components locally. Each subgraph can be considered a combination of several small components, resulting in a higher network capacity. The reason for the higher capacity of the Kleinberg network is that, due to its lower average path length, it is easier to form some large WCCs. These large WCCs not only exhibit higher distinguishability but also have a higher resource reuse rate for their nodes, thus positively impacting the improvement of network capacity. In contrast, these characteristics are not present in the ER random graph. Due to its low clustering coefficient and average path length, most weakly connected components formed by the ER random graph are large in scale. This is the difference in capacity caused by different network structures.

TABLE V: Comparison of classic network models

Network Type	Number of edges	Clustering coefficient	Average path length	Maximum reliable capacity	Subgraph average number of nodes	Subgraph average number of edges	Subgraph average number of WCCs
ER random graph	6070	0.006	3.797	85	128.139	116.708	12.044
Globally coupled network	999000	1.000	1.000	341	60.000	180.000	1.000
Nearest-neighbor coupled	6000	0.600	29.235	576	167.234	241.087	18.768
Star coupled network	1998	0.000	1.996	21	60.900	62.900	1.000
Price network	5978	0.170	85.116	0	0	0	0
Kleinberg network	6995	0.440	4.613	6144	98.320	98.180	24.084

V-E Conclusion

In this paper, we employ subgraphs as physical carriers for information storage and leverage nodes’ autonomous adaptive learning behavior to achieve a large-capacity and stable directed graph storage model. The individual nodes’ learning behavior does not need a global view, meaning that the tiny algorithms operating within each node do not work under strong central control and are entirely decentralized. Both the learning behavior and the supporting hardware resources are fine-grained and distributed and can, in theory, be highly parallel in physical implementation.

The storage capacity of the network depends on factors such as connectivity and network structure. The dense graph has better connectivity, the subgraphs generated by the samples are usually gathered together, and the communication nodes are rarely used. The measured capacity at this time is low, approaching the theoretical capacity limit that disallows resource reuse. Sparse graphs exhibit poor connectivity, and the sample-generated subgraphs are generally more dispersed, often consisting of several weakly connected components. In this case, the sample-generated subgraphs can be viewed as a permutation of connected components, significantly increasing the network capacity. Tests have shown that a sparse random directed graph with 500 nodes and 3101 edges can store nearly 8000 memory samples with over 80% accuracy and completeness. In contrast, a dense graph with 500 nodes and 12606 edges can only store around 300 memory samples.

Sparse graphs have fewer resources than dense graphs, but the actual number of samples they can store is tens of times more than dense graphs. It demonstrates that resource abundance is not the sole factor determining network capacity. The network’s structural properties, such as connectivity, clustering coefficient, and average path length, are also crucial. Biological neuronal networks exhibit sparse connections and show large capacity and low power consumption characteristics. To some extent, this paper also provides a possible explanation for how biological neuronal networks can achieve memory functions.

References

[1] J. J. Hopfield, “Neural networks and physical systems with emergent collective computational abilities.” Proceedings of the National Academy of Sciences, vol. 79, no. 8, pp. 2554–2558, Apr. 1982.
[2] B. Kosko, “Bidirectional associative memories,” IEEE Trans. Syst. Man Cybern., vol. 18, no. 1, pp. 49–60, 1988.
[3] D. Marr, Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. Cambridge, MA, USA: MIT Press, Jul. 2010.
[4] G. Zlotnik and A. Vansintjan, “Memory: An Extended Definition,” Frontiers in Psychology, vol. 10, p. 2523, 2019.
[5] L. Luo, “Architectures of neuronal circuits,” Science (New York, N.Y.), vol. 373, no. 6559, p. eabg7285, Sep. 2021.
[6] N. Francis, A. Green, P. Guagliardo, L. Libkin, T. Lindaaker, V. Marsault, S. Plantikow, M. Rydberg, P. Selmer, and A. Taylor, “Cypher: An Evolving Query Language for Property Graphs,” in Proceedings of the 2018 International Conference on Management of Data, SIGMOD Conference 2018, Houston, TX, USA, June 10-15, 2018. ACM, 2018, pp. 1433–1445.
[7] S. Ji, S. Pan, E. Cambria, P. Marttinen, and P. S. Yu, “A Survey on Knowledge Graphs: Representation, Acquisition, and Applications,” IEEE Trans. Neural Networks Learn. Syst., vol. 33, no. 2, pp. 494–514, 2022.
[8] E. W. Dijkstra, “A note on two problems in connexion with graphs,” Numerische Mathematik, vol. 1, pp. 269–271, 1959.
[9] A. Dorri, S. S. Kanhere, and R. Jurdak, “Multi-Agent Systems: A Survey,” IEEE Access, vol. 6, pp. 28 573–28 593, 2018.
[10] C. D. Meliza and Y. Dan, “Receptive-field modification in rat visual cortex induced by paired visual stimulation and single-cell spiking,” Neuron, vol. 49, no. 2, pp. 183–189, Jan. 2006.
[11] D. Krotov and J. J. Hopfield, “Dense Associative Memory for Pattern Recognition,” in Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, December 5-10, 2016, Barcelona, Spain, 2016, pp. 1172–1180.
[12] M. Demircigil, J. Heusel, M. Löwe, S. Upgang, and F. Vermet, “On a Model of Associative Memory with Huge Storage Capacity,” Journal of Statistical Physics, vol. 168, no. 2, pp. 288–299, Jul. 2017.
[13] H. Ramsauer, B. Schäfl, J. Lehner, P. Seidl, M. Widrich, L. Gruber, M. Holzleitner, T. Adler, D. P. Kreil, M. K. Kopp, G. Klambauer, J. Brandstetter, and S. Hochreiter, “Hopfield Networks is All You Need,” in 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021. OpenReview.net, 2021.
[14] B. Kosko, “Bidirectional Associative Memories: Unsupervised Hebbian Learning to Bidirectional Backpropagation,” IEEE Trans. Syst. Man Cybern. Syst., vol. 51, no. 1, pp. 103–115, 2021.
[15] R. Angles and C. Gutierrez, “Survey of graph database models,” ACM Computing Surveys, vol. 40, no. 1, pp. 1:1–1:39, Feb. 2008.
[16] I. Thiele, N. Swainston, R. M. T. Fleming, A. Hoppe, S. Sahoo, M. K. Aurich, H. Haraldsdottir, M. L. Mo, O. Rolfsson, M. D. Stobbe, S. G. Thorleifsson, R. Agren, C. Bölling, S. Bordel, A. K. Chavali, P. Dobson, W. B. Dunn, L. Endler, D. Hala, M. Hucka, D. Hull, D. Jameson, N. Jamshidi, J. J. Jonsson, N. Juty, S. Keating, I. Nookaew, N. Le Novère, N. Malys, A. Mazein, J. A. Papin, N. D. Price, E. Selkov, M. I. Sigurdsson, E. Simeonidis, N. Sonnenschein, K. Smallbone, A. Sorokin, J. H. G. M. van Beek, D. Weichart, I. Goryanin, J. Nielsen, H. V. Westerhoff, D. B. Kell, P. Mendes, and B. Ø. Palsson, “A community-driven global reconstruction of human metabolism,” Nature Biotechnology, vol. 31, no. 5, pp. 419–425, May 2013.
[17] I. Vastrik, P. D’Eustachio, E. Schmidt, G. Joshi-Tope, G. Gopinath, D. Croft, B. de Bono, M. Gillespie, B. Jassal, S. Lewis, L. Matthews, G. Wu, E. Birney, and L. Stein, “Reactome: A knowledge base of biologic pathways and processes,” Genome Biology, vol. 8, no. 3, p. R39, Mar. 2007.
[18] HoganAidan, BlomqvistEva, CochezMichael, D’amatoClaudia, M. De, GutierrezClaudio, KirraneSabrina, G. E. Labra, NavigliRoberto, NeumaierSebastian, N.-C. Ngonga, PolleresAxel, R. M, RulaAnisa, SchmelzeisenLukas, SequedaJuan, StaabSteffen, and ZimmermannAntoine, “Knowledge Graphs,” ACM Computing Surveys (CSUR), Jul. 2021.
[19] S. A. Josselyn and S. Tonegawa, “Memory engrams: Recalling the past and imagining the future,” Science (New York, N.Y.), vol. 367, no. 6473, Jan. 2020.
[20] K. Abdou, M. Shehata, K. Choko, H. Nishizono, M. Matsuo, S.-I. Muramatsu, and K. Inokuchi, “Synapse-specific representation of the identity of overlapping memory engrams,” Science (New York, N.Y.), vol. 360, no. 6394, pp. 1227–1231, Jun. 2018.
[21] N. Ohkawa, Y. Saitoh, A. Suzuki, S. Tsujimura, E. Murayama, S. Kosugi, H. Nishizono, M. Matsuo, Y. Takahashi, M. Nagase, Y. K. Sugimura, A. M. Watabe, F. Kato, and K. Inokuchi, “Artificial association of pre-stored information to generate a qualitatively new memory,” Cell Reports, vol. 11, no. 2, pp. 261–269, Apr. 2015.
[22] N. X. Tritsch, A. J. Granger, and B. L. Sabatini, “Mechanisms and functions of GABA co-release,” Nature Reviews. Neuroscience, vol. 17, no. 3, pp. 139–145, Mar. 2016.
[23] L. Chen, K. A. Cummings, W. Mau, Y. Zaki, Z. Dong, S. Rabinowitz, R. L. Clem, T. Shuman, and D. J. Cai, “The role of intrinsic excitability in the evolution of memory: Significance in memory allocation, consolidation, and updating,” Neurobiology of Learning and Memory, vol. 173, p. 107266, Sep. 2020.
[24] C. D. Manning, Introduction to Information Retrieval. Syngress Publishing,, 2008.
[25] P. Erdös and A. Rényi, “On Random Graphs I,” Publicationes Mathematicae Debrecen, vol. 6, p. 290, 1959.
[26] ——, “On the evolution of random graphs,” in The Structure and Dynamics of Networks. Princeton University Press, 2006, pp. 38–82.
[27] D. Easley and J. Kleinberg, Networks, Crowds, and Markets: Reasoning about a Highly Connected World. Cambridge University Press, 2010.
[28] D. J. Watts and S. H. Strogatz, “Collective dynamics of ‘small-world’ networks,” Nature, vol. 393, no. 6684, pp. 440–442, Jun. 1998.
[29] D. J. d. S. Price, “Networks of Scientific Papers,” Science, vol. 149, no. 3683, pp. 510–515, Jul. 1965.
[30] X. Wang, Network Science:An Introduction. Higher Education Press, 2012.
[31] G. Fagiolo, “Clustering in complex directed networks,” Physical Review E, vol. 76, no. 2, p. 026107, 2007.

Graph
Nodes state change	$t_{0}$ : Initial nodes {A,B,C,D} are activated, which performs stimulus propagation according to a weighted random selection algorithm.	$t_{1}$ : {E,F,G,I,J} becomes active after receiving stimulus from the initial nodes.
Graph
Nodes state change	$t_{2}$ : {H,K} becomes active after receiving stimulus from active nodes.	$t_{3}$ : The downstream nodes {B,C} of H are all occupied, so stimulus cannot be propagated. H becomes resting again.
Graph
Nodes state change	$t_{4}$ : After H becomes resting state, according to the avalanche effect, J also becomes resting state.	$t_{5}$ : The subgraph iterates to a steady state. Start to release resources, and node A releases the occupancy of E.
Graph
Nodes state change	$t_{6}$ : After the resources are released, the dormant node D restarts pathfinding and successfully activates J.	$t_{7}$ : Node H is successfully activated after receiving the stimulus from J and passing the stimulus to B. The subgraph is iterated to a stable state.