
Towards Agentic AI Networking in 6G: A Generative Foundation Model-as-Agent Approach

Yong Xiao, Guangming Shi, and Ping Zhang *This work is currently under revision at IEEE Communications Magazine. Copyright may be transferred without notice, after which this version may no longer be accessible. Y. Xiao is with the School of Electronic Information and Communications, the Huazhong University of Science and Technology, Wuhan, China 430074, also with the Peng Cheng Laboratory, Shenzhen, China, and also with Pazhou Laboratory (Huangpu), Guangzhou, China (e-mail: [email protected]). G. Shi is with the Peng Cheng Laboratory, Shenzhen, China 518055, also with the School of Artificial Intelligence, the Xidian University, Xi’an, Shaanxi 710071, China, and also with Pazhou Laboratory (Huangpu), Guangzhou, China (e-mail: [email protected]). P. Zhang is with the State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China 100876 (email: [email protected]).
Abstract

The promising potential of AI and network convergence for improving networking performance and enabling new service capabilities has recently attracted significant interest. Existing network AI solutions, while powerful, are mainly built on closed-loop, passive learning frameworks, resulting in major limitations in autonomous solution finding and dynamic environmental adaptation. Agentic AI has recently been introduced as a promising solution to address these limitations and pave the way for genuinely general-purpose and beneficial AI systems. The key idea is to create a networking ecosystem that supports a diverse range of autonomous and embodied AI agents in fulfilling their goals. In this paper, we focus on the novel challenges and requirements of agentic AI networking. We propose AgentNet, a novel framework for supporting interaction, collaborative learning, and knowledge transfer among AI agents. We introduce a general architectural framework of AgentNet and then propose a generative foundation model (GFM)-based implementation in which multiple GFM-as-agents are created as an interactive knowledge base to bootstrap the development of embodied AI agents according to different task requirements and environmental features. We consider two application scenarios, digital-twin-based industrial automation and a metaverse-based infotainment system, to describe how AgentNet can be applied to support efficient task-driven collaboration and interaction among AI agents.

Index Terms:
Agentic AI networking, AI agent, 6G.

I Introduction

The seamless integration of artificial intelligence (AI) and communication networks is widely recognized as a pivotal trend shaping the future of networking systems, including 6G and beyond[1]. By harnessing the power of AI, high-level autonomous control and decision-making capabilities are expected to be enabled throughout networking systems, from Operations, Administration, and Maintenance (OAM) in the core network to experience enhancement at the user equipment (UE)[2]. High-performance networking systems will in turn catalyze the success of AI, enabling reliable and secure knowledge sharing, collaborative learning, and efficient multi-agent coordination[3].

Future networking systems will be dominated by a large volume of densely deployed AI models and applications in both the physical and virtual worlds, tailored to specific task objectives and requirements across various industries. Communication between AI models focuses primarily on collaborative model construction and coordination for specific task objectives, which imposes fundamentally different requirements compared to existing data-focused communication networks. Major standards development organizations (SDOs), including 3GPP and ITU-T, are actively working on solutions to improve existing communication networks to accommodate these new requirements. For example, AI capabilities and functions have been incorporated into the core network architecture since the very first version of 5G, i.e., 3GPP Release 15. More specifically, the Network Data Analytics Function (NWDAF) was introduced as a key functional component for data analysis and processing in 3GPP Release 15, and its scope and capability have been significantly extended since then. Release 18 emphasizes the integration of AI and ML into NWDAF, enabling more sophisticated analytics and automation capabilities[4], while NWDAF in Release 19 focuses on providing the insights and recommendations needed to support self-optimization and self-healing. The application of AI models in the radio access network (RAN) has also attracted significant interest from academia and industry. For example, the AI-RAN Alliance was recently established to accelerate the integration of AI into RAN technologies to improve spectrum utilization and energy efficiency[5].

Despite this progress, the current communication networking architecture was designed primarily to passively transport data packets. The inherent separation of data transportation and processing functions remains a significant impediment to supporting efficient, responsive, and secure AI networking[6]. More specifically, the increasing popularity of AI services and applications poses the following novel challenges to existing communication networks:

  • (1)

    Data traffic generation speed significantly exceeds network capacity: The proliferation of AI-powered applications, such as autonomous vehicles, virtual and augmented reality, and IoT devices, generates enormous volumes of data traffic. Network infrastructure deployment has long lagged behind the growth of mobile data traffic. With the increasing popularity of AI-based applications and services, the gap between worldwide data traffic volume and the capacity of the global network infrastructure will only grow larger.

  • (2)

    Limited flexibility and adaptability of network AI models cannot meet increasingly personalized and diversified needs: Smart devices and UEs are typically dispersed across geographically diverse regions, each operating in a unique and dynamic environment and engaging in a variety of tasks tailored to different personalized needs. Unfortunately, most existing network AI solutions are built on closed-loop, passive learning frameworks in which models are trained on historical datasets under the assumption that past statistical features or patterns of the data will persist after deployment. As a result, it is generally impossible for AI models hosted at edge or cloud locations to keep track of the interactions between UEs and the dynamic environments at the far edge of the network.

  • (3)

    Growing concerns about user data and model security: With the increasing integration of AI models and services, including e-personal assistance and e-healthcare, into human life, more and more sensitive data, such as medical records, financial transactions, and personal communications, is generated by smart devices and UEs. This data can be vulnerable when exposed to communication networks and edge or cloud service providers. Moreover, because existing AI models rely heavily on high-quality training data, any attack on or compromise of this data can significantly degrade model performance and, in turn, jeopardize service quality and user safety and security.

Recently, agentic AI, a novel AI paradigm focused on building systems capable of taking autonomous, independent actions and interacting with their environments, has been introduced as a promising solution to address the above challenges and pave the way for more flexible, generally intelligent, and secure AI services and applications in the future[7, 8]. Most existing works focus on implementing specific agentic AI models and learning frameworks for various application tasks and use cases, while ignoring the fact that an agentic AI system is, in essence, a communication networking ecosystem consisting of a large number of AI agents that interact, collaborate, and exchange the information needed to achieve their goals. Motivated by this observation, in this paper we investigate the novel challenges and requirements that communication networking systems must meet to support agentic AI ecosystems. In the rest of this paper, we first introduce the key building blocks of agentic AI, the AI agents, and then introduce the basic idea of agentic AI networking, which we refer to as AgentNet. We discuss the key challenges and requirements for building AgentNet. A general architectural framework of AgentNet, including its key components and main KPIs, is then introduced. Finally, we propose a generative foundation model (GFM)-based implementation in which multiple GFM-as-agents are created as an interactive knowledge base to bootstrap the development of embodied AI agents according to different task requirements and environmental features. We discuss two application scenarios, digital-twin-based industrial automation and a metaverse-based infotainment system, to verify the promising potential of AgentNet for supporting efficient task-driven communication, collaboration, and interaction among AI agents.

Note that the main scope of this paper is not to provide a detailed survey of agentic AI, but to investigate the requirements and potential implementation of communication networks for supporting collaboration, communication, and interaction among AI agents.

II Agentic AI Networking

As mentioned above, an agentic AI system constitutes an interconnected network of various AI agents[8]. As the building blocks of agentic AI systems, AI agents are autonomous or semi-autonomous intelligent entities, supported by the knowledge and resources needed to accomplish specific goals in targeted environments. They can be either logical or physical entities, each involving a single AI model or a set of collaborative AI models that perceive, make decisions, interact with virtual or physical surroundings, and respond accordingly. More specifically, they can be functional modules or smart apps, e.g., chatbot software or recommendation models, deployed on smart devices. They can also be physical entities, such as autonomous vehicles, robots, or clusters of collaborative robots, that perceive the local environment and make autonomous decisions.

Agentic AI represents a more advanced form of AI that aims to create systems capable of independent action and interaction with the real world. More specifically, compared to the traditional AI model-based solution, AI agents have the following unique features:

  • (1)

    Proactive interaction with environments: Compared to existing AI models, which passively wait for input, AI agents can actively seek information, explore possibilities, and initiate actions to interact with their environments, which may involve other agents and dynamic elements.

  • (2)

    Goal-oriented self-learning and self-correction: Compared to existing AI models, which passively learn from given training datasets, AI agents can leverage real-world knowledge and prior experience to self-learn, self-correct, and refine their decision-making processes, providing fair and balanced responses toward different goals while minimizing the influence of biases hidden in the training data.

  • (3)

    Life-long learning: Compared to existing AI models, which are closed-loop and remain fixed once trained, AI agents continuously adapt and improve their performance over time by acquiring new knowledge and skills throughout their operational lifespan.

There are many approaches to implementing agents for different goals and objectives. In the rest of this paper, we mainly focus on the following three types of AI agents:

  • (1)

    Foundation model-as-agent (F-agent): This corresponds to an agent that relies on pre-trained models or strategies built on broad data at scale and adaptable to a variety of tasks. A common use scenario for an F-agent is to serve as a special kind of interactive knowledge base that assists the construction or deployment of other agents. More specifically, F-agents developed from large foundation models, such as large language models (LLMs) or large vision models (LVMs), can not only store and access a broad range of knowledge, but also represent various skillsets, such as reasoning about and inferring implicit rules and rationales, and even creating novel content based on the inferred rules[9].

  • (2)

    Embodied model-as-agent (E-agent): This corresponds to an agent that directly interacts with the environment and autonomously learns novel solutions to challenging tasks. E-agents can collaborate with F-agents when interacting with unknown environments; in this case, the F-agents use their stored knowledge and rules to infer the potential consequences of interactions in different environments, bootstrapping the development of the E-agents.

  • (3)

    Hybrid model-as-agent (H-agent): As mentioned earlier, an AI agent can involve a set of different AI models that collaboratively work toward specific goals. An H-agent therefore corresponds to an agent composed of foundation models and embodied models collaborating on given tasks. For example, an H-agent can directly leverage the pre-trained knowledge of real-world scenarios embedded in its foundation models, while quickly adapting to, understanding, and responding to environmental dynamics through its embodied models. A minimal interface sketch of these three agent types follows this list.
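To make these agent types concrete, the following minimal Python sketch (an illustrative abstraction of our own, not a standardized interface) models an F-agent as a frozen knowledge source, an E-agent as an environment-interacting learner, and an H-agent as their composition; all class and method names are hypothetical.

```python
from abc import ABC, abstractmethod


class Agent(ABC):
    """Minimal shared interface for AgentNet agents (illustrative only)."""

    @abstractmethod
    def act(self, observation):
        """Return an action or response for the given observation."""


class FAgent(Agent):
    """Foundation model-as-agent: a pre-trained model used as an
    interactive knowledge base (e.g., an LLM queried for priors)."""

    def __init__(self, foundation_model):
        self.model = foundation_model  # assumed pre-trained, broad-knowledge model

    def act(self, observation):
        # Answer queries / generate knowledge; no weight updates here.
        return self.model(observation)


class EAgent(Agent):
    """Embodied model-as-agent: learns by interacting with an environment."""

    def __init__(self, policy, learner):
        self.policy = policy    # task-specific decision model
        self.learner = learner  # e.g., an RL or imitation-learning update rule

    def act(self, observation):
        return self.policy(observation)

    def update(self, transition):
        # Life-long learning from real or simulated experience.
        self.learner.step(self.policy, transition)


class HAgent(Agent):
    """Hybrid model-as-agent: combines an F-agent's prior knowledge with
    an E-agent's environment-adapted policy."""

    def __init__(self, f_agent, e_agent):
        self.f_agent = f_agent
        self.e_agent = e_agent

    def act(self, observation):
        prior = self.f_agent.act(observation)          # knowledge / inferred rules
        return self.e_agent.act((observation, prior))  # environment-adapted decision
```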

In this paper, we focus on the communication requirements and architecture for supporting the networking of AI agents.

More specifically, we define AgentNet as a specialized networking system designed to facilitate efficient information exchange, action coordination, and knowledge transfer among heterogeneous agents, including F-, E-, and H-agents, toward specific goals. Compared to existing data-oriented networks, AgentNet introduces the following three paradigm shifts in design goals, network management, and performance optimization and evaluation:

  • (1)

    From data-focused to goal-oriented: Whereas traditional data-oriented communication networks focus primarily on the transmission and accurate delivery of data packets, AgentNet prioritizes the needs of collaborative model construction and joint decision making for achieving specific goals. This results in fundamentally different requirements on information processing, transportation, and the evaluation of delivery results. More specifically, since modern AI agents are typically equipped with multiple sensors, the rate at which they generate local data often surpasses the data transmission rate supported by their assigned spectrum and the capabilities of their installed radio frequency units. For instance, intelligent robots equipped with multiple sensors, such as cameras, LiDAR, and radar, can generate vast amounts of data in real time. However, the capacity of their communication channels may limit the rate at which this data can be transmitted to a central processing unit or cloud server, potentially hindering real-time decision making and autonomous operations. In other words, AgentNet must shift the focus from raw data transportation to the intelligent processing and extraction of the most valuable insights from the local sensing data according to specific goal-oriented objectives[10]; a toy sketch following this list illustrates the contrast.

  • (2)

    From central control to decentralized autonomy: Since AgentNet interconnects a large number of autonomous agents, the existing centralized-management networking architecture, which relies on uploading local sensing data to edge or cloud servers for processing, suffers from inefficiencies, reliability issues, and security vulnerabilities. AgentNet should therefore be decentralized in nature, prioritizing edge coordination and knowledge transfer among relevant agents[11].

  • (3)

    From single-resource-focused to multi-resource-related performance optimization: While the performance of existing data-focused communication systems primarily depends on spectrum efficiency, the performance of AgentNet is directly or indirectly linked to a diverse set of resources, including the accuracy of local sensors, computational power, storage capacity, and algorithmic design. This means that the performance evaluation metrics of AgentNet should also include non-data-oriented metrics such as environment adaptability and autonomy level, as will be discussed in the next section.
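As a toy illustration of the data-focused versus goal-oriented contrast (our own example, not part of the paper's prototypes), the sketch below compares shipping a raw camera frame to the edge with transmitting only the goal-relevant insights extracted locally; the detector and message format are hypothetical placeholders.

```python
import json

import numpy as np


def transmit(payload: bytes) -> int:
    """Stand-in for the radio link; returns the number of bytes 'sent'."""
    return len(payload)


def detect_obstacles(frame: np.ndarray):
    """Hypothetical local perception model returning goal-relevant insights only."""
    ys, xs = np.where(frame.mean(axis=-1) > 200)  # placeholder: bright regions
    return [{"x": int(xs.mean()), "y": int(ys.mean())}] if xs.size else []


frame = np.random.randint(0, 256, size=(1080, 1920, 3), dtype=np.uint8)

# Data-focused networking: push the raw sensor stream to the edge/cloud.
raw_cost = transmit(frame.tobytes())  # roughly 6 MB per frame

# Goal-oriented networking: process locally, send only task-relevant insights.
insights = {"task": "collision_avoidance", "obstacles": detect_obstacles(frame)}
goal_cost = transmit(json.dumps(insights).encode())  # tens of bytes

print(f"raw: {raw_cost} bytes vs. goal-oriented: {goal_cost} bytes")
```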

As the number of agents continues to increase exponentially, there is a critical need to establish novel architectural frameworks to ensure the sustainable development and scalability of AgentNet. In the next section, we will introduce a general AgentNet architecture. We will consider the GFM-based AgentNet implementation and present the numerical results based on two specific application scenarios in Section IV.

Figure 1: A general architectural framework of AgentNet.

III Architectural Framework and KPIs of AgentNet

III-A Architectural Framework

A general AgentNet architecture, as presented in Fig. 1, consists of the following key components:

  • Infrastructure: includes the hardware infrastructure, such as cloud and edge computing and storage systems and the communication networks that interconnect agents and give them access to resources stored at the cloud and edge, as well as the agent infrastructure, i.e., the agents' software platforms and the sources of high-quality expert datasets, accumulated skillsets, and world-model knowledge accessible to the agents.

  • Environment: includes both the physical and virtual environments in which agents interact. Agents can develop various skillsets and learn how to behave to achieve their goals by following a standard trial-and-error approach in different environments. Generally speaking, interaction with physical environments can be expensive and sometimes infeasible; a more commonly adopted approach is therefore to develop skillsets and behavioral policies in virtual worlds before deploying them in physical environments.

  • Application Tasks: correspond to the objectives the agents want to achieve by collaborating with other agents and interacting with environments. Generally speaking, different tasks have different action spaces and requirements for agents to interact in different environments.

  • Agents: include various types of agents. Multiple agents often collaborate for one or more application tasks. We will discuss this in more detail later.

  • Agent Controller: coordinates the collaboration and task cognition actions among agents. It also coordinates resource orchestration and knowledge sharing among agents. More specifically, it consists of the following subfunctional modules: (1) Environment and task cognition: detects the implementation environment and types and requirements of tasks; (2) Agent adaptation: coordinates the deployment of agents to fit the needs of specific tasks; and (3) Resource orchestration: coordinates the allocations of different resources, including communication and computational resources for implementing agents in different environments according to various task requirements.

The development pipeline is described as follows. The agent controller first identifies the requested application tasks and the implementation environments. It then searches the library of existing models or agents to check whether any can be directly deployed to meet the task requirements. The selected models or agents can be further updated or fine-tuned based on the knowledge or data generated by the GF-agents (introduced in Section IV). Finally, the updated agents are deployed. The performance of the deployed agents is monitored or inferred as the task requirements and environment evolve. Once the deployed agents can no longer meet the new requirements and/or environmental dynamics, the above procedure is repeated. This pipeline is aligned with 3GPP's AI/ML development pipeline[4].
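A minimal sketch of this pipeline is given below; the controller, agent library, and GF-agent interfaces are placeholder names we introduce for illustration, loosely following the steps described above.

```python
import time


def agentnet_control_loop(controller, agent_library, gf_agents, poll_interval_s=60):
    """Illustrative AgentNet controller loop (placeholder APIs, not a standard)."""
    while True:
        # (1) Environment and task cognition.
        task, environment = controller.identify_task_and_environment()

        # (2) Agent adaptation: reuse an existing agent if one fits the task.
        agent = agent_library.best_match(task, environment)
        if agent is None:
            agent = controller.build_new_agent(task, environment)

        # Fine-tune with synthetic knowledge/data generated by GF-agents.
        synthetic_data = [gf.generate(task, environment) for gf in gf_agents]
        agent.fine_tune(synthetic_data)

        # (3) Resource orchestration and deployment.
        resources = controller.allocate_resources(agent, task, environment)
        controller.deploy(agent, resources)

        # Monitor; repeat the pipeline once requirements or the environment change.
        while controller.meets_requirements(agent, task, environment):
            time.sleep(poll_interval_s)
```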

The proposed architecture and development pipeline are general and can be applied in different application scenarios as illustrated in Section IV.

Figure 2: Requirements for two application scenarios of AgentNet, metaverse-based infotainment and digital twin-based industrial automation, on six key performance metrics, measured by model generalization error (Model), non-i.i.d. level of datasets (Data), number of data-associated domains (Knowledge), computational energy consumption (Resource), inception scores (QoE), and model bias under adversarial attacks (Security), respectively, all in normalized values.
TABLE I: Comparison of data-focused communication, existing network AI solutions, and AgentNet.

Basic Idea
  • Existing Comm. Network: Transporting data packets from one point to another.
  • Existing Network AI Solutions: Pre-selected models trained on given datasets to extract stationary patterns and insights, or to make predictions.
  • AgentNet: An AI networking paradigm focusing on intra-agent, inter-agent, and agent-environment interaction, collaboration, and knowledge transfer to achieve various task objectives.

Key Functional Components
  • Existing Comm. Network: Source and channel encoders and decoders, etc.
  • Existing Network AI Solutions: Cloud/edge servers, sets of model structures, training datasets.
  • AgentNet: Agents, infrastructure, agent controller, environment, tasks, etc.

Limitations
  • Existing Comm. Network: Task- and environment-agnostic, low efficiency.
  • Existing Network AI Solutions: Separation of data transportation and processing, low data processing efficiency, high decision delay, limited flexibility.
  • AgentNet: Still in the early stage of development.

KPIs
  • Existing Comm. Network: Data rate, bit/symbol-error rate, end-to-end communication latency, etc.
  • Existing Network AI Solutions: Communication and computational resource efficiency, model accuracy, etc.
  • AgentNet: Model generality, data heterogeneity, QoE, knowledge/domain generality, security, etc.

Use Scenarios
  • Existing Comm. Network: eMBB, URLLC, mMTC.
  • Existing Network AI Solutions: Pattern recognition, planning, and prediction in stationary network environments.
  • AgentNet: Interactive immersive communication, autonomous network management, autonomous smart factory, etc.

III-B Key Performance Metrics

Since the main focus of AgentNet is no longer maximizing the volume of data packets transported across the network, traditional metrics such as data rate and bit/symbol-error rate are insufficient for evaluating its performance. Instead, AgentNet needs to be evaluated against new performance metrics, some of which are listed as follows:

III-B1 Data-related metrics

In AgentNet, the raw data samples reside at the users or their closest edge servers. For local model development at each agent, the modality, feature sets, volume, bias, and statistics of the data directly affect the capabilities and performance of the agent. Likewise, for collaborative learning involving multiple agents, the inter-agent communication, interaction, and knowledge transfer schemes affect each individual agent's performance. For example, when multiple agents collaborate in training shared AI models, the data-related metrics for AgentNet may include the non-i.i.d. level, feature diversity, and imbalance of datasets across different agents, as well as other metrics that capture the influence of data quality on the agent's ability to fulfill its downstream tasks.
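As one concrete way such a metric could be computed (an assumption on our part rather than a metric defined in this paper), the snippet below scores the non-i.i.d. level of a group of agents as the average Jensen-Shannon divergence between each agent's label distribution and the global distribution; other divergences or feature-level statistics could be substituted.

```python
import numpy as np
from scipy.spatial.distance import jensenshannon


def non_iid_level(label_counts_per_agent: np.ndarray) -> float:
    """Average JS divergence between each agent's label distribution and the
    global distribution; 0 means perfectly i.i.d. data across agents."""
    per_agent = label_counts_per_agent / label_counts_per_agent.sum(axis=1, keepdims=True)
    global_dist = label_counts_per_agent.sum(axis=0) / label_counts_per_agent.sum()
    # jensenshannon() returns the JS distance; squaring gives the divergence.
    return float(np.mean([jensenshannon(p, global_dist) ** 2 for p in per_agent]))


# Three agents, four classes: agents 0 and 1 observe heavily skewed label mixes.
counts = np.array([[90, 5, 3, 2],
                   [2, 88, 6, 4],
                   [25, 25, 25, 25]])
print(f"non-i.i.d. level: {non_iid_level(counts):.3f}")
```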

III-B2 Model-related metrics

The selection of models is critical for delivering the desired performance of each agent on specific tasks. In AgentNet, each agent can choose a set of foundation models or F-agents for local adaptation to specific downstream tasks. Some models are easier than others to adapt to a wide range of tasks with diverse requirements; borrowing terminology from meta-learning, we refer to these as “models with good generalization ability”[12]. Other models adapt more readily to very specialized tasks, and we refer to these as specialized models. Additional metrics can evaluate resource consumption, e.g., computational and memory cost, and even the carbon emissions of deploying models.

III-B3 Knowledge-related metrics

Service tasks, data samples, and models can be associated with specific domains of knowledge. For instance, domain-specific services are associated with relatively narrow domain knowledge, e.g., services within a particular field such as the industrial automation of a very specific manufacturing process. For such services it is relatively easy to find simple, mature models to build agents for data selection, curation, and model training and adaptation, especially compared with domain-general services, which involve knowledge and data samples spanning a much wider range of domains.

III-B4 Resource-related metrics

The time required to train a satisfactory model (one reaching a given accuracy) is closely related to resource availability, including the computational and storage capacity of the users or edge servers. For collaborative model construction and transfer, model performance is also related to the communication performance between users. In AgentNet, these different resources need to be evaluated jointly, based on their tradeoffs under different requirements.

III-B5 Quality-of-experience (QoE)-related metrics

QoE is closely related to a user's subjective experience of a specific service in a given situation. It is affected not only by service-related hardware and software capacity but also by the user's age, gender, and personal preferences, as well as the time, location, and physical environment of the service. It is therefore of critical importance to develop a simple model formulation that accurately quantifies and evaluates subjective QoE when agents provide personalized services to human users.

III-B6 Security-related metrics

Recent studies suggest that model-related information, such as parameters and intermediate training results, can be exploited to infer personal information about users. Therefore, novel metrics that measure the security levels of different AgentNet configurations and solutions need to be developed and carefully evaluated. For example, some F-agents can be developed to simulate various attacks on models and data and the corresponding consequences. These F-agents can then be used during the development of E-agents to evaluate and choose appropriate security measures for deployment in different environments.
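As a simple example of such a metric, the sketch below measures the accuracy drop of a classifier under a fast gradient sign method (FGSM) attack, one standard attack a red-teaming F-agent could simulate; the choice of FGSM and the epsilon value are illustrative assumptions rather than part of the proposed framework.

```python
import torch
import torch.nn as nn


def accuracy_drop_under_fgsm(model, x, y, epsilon=0.03):
    """Illustrative security metric: accuracy loss of a classifier under an
    FGSM attack, one attack a red-teaming F-agent could simulate."""
    loss_fn = nn.CrossEntropyLoss()

    # Craft adversarial examples by perturbing inputs along the loss gradient sign.
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    loss.backward()
    x_adv = (x_adv + epsilon * x_adv.grad.sign()).detach()  # FGSM perturbation

    with torch.no_grad():
        clean_acc = (model(x).argmax(dim=1) == y).float().mean().item()
        adv_acc = (model(x_adv).argmax(dim=1) == y).float().mean().item()
    return clean_acc - adv_acc
```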

Figure 3: An implementation of AgentNet based on GF-agents.

IV A Generative Foundation Model-based Implementation Framework and Application Scenarios

IV-A GFM-based Implementation Framework

In this section, we introduce a generative foundation model (GFM)-based implementation of AgentNet. A GFM-as-agent (GF-agent) is a special F-agent that is pre-trained to generate synthesized data samples that follow the prior knowledge and inference rules of physical and virtual world environments. These synthesized data samples can be used to simulate the complex interactions and responses of agents in different environments under various seen and unseen scenarios. This alleviates the potential bias of human-labeled data samples and of the resulting models, and also enables the rapid adaptation of agents to new scenarios with limited or no training data[13].

The detailed implementation of our proposed framework is presented in Fig. 3. In each local environment, data samples are generated at each UE as it performs a set of different tasks under various environments. If the tasks and environments are unknown to the UE, it applies a classifier to identify the number of tasks present in the local data sources and then guides the construction of a set of generative models at the edge servers using its local discriminator model. In this way, each generative model at the edge server eventually produces synthesized data samples that match the features of a specific task in a local environment. If a generative model that fits the features of the relevant tasks already exists at the edge servers or cloud, it is used to initiate the construction of a GF-agent that generates synthesized data samples to simulate potential interaction results under various scenarios. The simulated results are then used to train an embodied model in an E-agent to be deployed for specific tasks and environments. The UE's discriminator model can also serve as a performance evaluator and monitor by continuously measuring and comparing the difference between the simulated data and the real data samples generated by the data sources. Our previous work has shown that the discriminator and classifier models of each UE can share the same parameters, differing only in the last layer.
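The PyTorch sketch below gives one possible realization of this design: a UE-side backbone shared between the real/fake discriminator head and the task classifier head (differing only in the last layer, as stated above), and an edge-side generator trained adversarially against that discriminator. Layer sizes, dimensions, and the training step are illustrative assumptions rather than the exact models used in our prototype.

```python
import torch
import torch.nn as nn


class SharedBackbone(nn.Module):
    """UE-side network: the discriminator and task classifier share all
    parameters except the last layer, as described in the text."""

    def __init__(self, feat_dim=64, hidden=128, num_tasks=4):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.disc_head = nn.Linear(hidden, 1)          # real vs. synthesized
        self.task_head = nn.Linear(hidden, num_tasks)  # which local task

    def forward(self, x):
        h = self.backbone(x)
        return self.disc_head(h), self.task_head(h)


class EdgeGenerator(nn.Module):
    """Edge-side generative model for one identified task."""

    def __init__(self, noise_dim=16, feat_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim, 128), nn.ReLU(),
            nn.Linear(128, feat_dim),
        )

    def forward(self, z):
        return self.net(z)


def adversarial_step(gen, ue_model, real_batch, opt_g, opt_d, noise_dim=16):
    """One illustrative update: the edge generator learns to match the UE's
    local data, as judged by the UE's shared discriminator head."""
    bce = nn.BCEWithLogitsLoss()
    z = torch.randn(real_batch.size(0), noise_dim)

    # Discriminator update (UE side): real samples vs. synthesized samples.
    d_real, _ = ue_model(real_batch)
    d_fake, _ = ue_model(gen(z).detach())
    d_loss = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator update (edge side): fool the UE's discriminator.
    d_fake, _ = ue_model(gen(z))
    g_loss = bce(d_fake, torch.ones_like(d_fake))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```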

IV-B Application Scenario 1: Digital Twins-based Industrial Automation

The digital twin is a key 6G use scenario focused on establishing a virtual world model that replicates a specific physical-world environment. It is considered one of the most powerful tools for accelerating the development, testing, and deployment of novel ideas and solutions in 6G-enabled industrial automation.

GF-agent-based AgentNet can be applied to enable dynamic task planning and adaptation to unforeseen events and conditions[14]. To verify the performance of AgentNet, we have implemented a hardware prototype based on the Franka Emika Panda FR5 robotic arm, as illustrated in Fig. 3. In this implementation, the physical environment, involving a robotic arm deployed at a smart factory, is captured by a camera and used to develop multiple GF-agents for creating the digital twin, i.e., a virtual reality replica of the physical environment, and for simulating the motion planning of the robotic arm. A set of E-agents is built on the simulated environment created by the GF-agents, including one E-agent built on the Robot Operating System (ROS) and MoveIt's planning pipeline for task-level motion planning, and another built on Unity's 3D rendering engine to generate the lighting and shading of the digital-twin environment. The constructed digital twin scene is streamed to a PICO 4 Pro virtual reality (VR) headset. Human operators can use the handheld controllers connected to the PICO 4 Pro to interact with the robotic arm in the digital twin environment. In this way, operators can test various motion planning results in the digital twin and deploy only the successfully tested results to the physical-world robotic arm.
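For readers unfamiliar with the ROS/MoveIt side of such a pipeline, the sketch below shows roughly how a pose target could be planned against the scene and executed on the physical arm only after operator approval in the twin. It is a hedged illustration using the standard moveit_commander Python interface with an assumed planning group name, not the exact code of our prototype.

```python
import sys

import moveit_commander
import rospy
from geometry_msgs.msg import Pose


def plan_in_twin_then_deploy(approved_by_operator: bool) -> None:
    """Plan a motion first (e.g., for inspection in the digital twin), then
    execute on the physical arm only after operator approval (illustrative)."""
    moveit_commander.roscpp_initialize(sys.argv)
    rospy.init_node("agentnet_twin_planner", anonymous=True)
    group = moveit_commander.MoveGroupCommander("panda_arm")  # assumed group name

    target = Pose()
    target.position.x, target.position.y, target.position.z = 0.4, 0.1, 0.4
    target.orientation.w = 1.0
    group.set_pose_target(target)

    # Plan without executing; the trajectory can be visualized in the Unity
    # digital twin for the operator to inspect through the VR headset.
    # Note: in ROS Noetic, plan() returns (success, trajectory, time, error_code);
    # earlier distributions return the trajectory alone.
    plan_result = group.plan()
    trajectory = plan_result[1] if isinstance(plan_result, tuple) else plan_result

    if approved_by_operator and trajectory.joint_trajectory.points:
        # Deploy the validated trajectory to the physical robotic arm.
        group.execute(trajectory, wait=True)
    group.stop()
    group.clear_pose_targets()
```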

Since digital twin-based industrial automation requires accurate decision outputs and places less weight on subjective QoE, it should adopt highly specialized models trained on more homogeneous datasets to ensure that the resulting model outputs are unbiased and have minimal variance.

IV-C Application Scenario 2: Metaverse-Based Infotainment System

These types of applications focus on providing highly personalized services to meet users' QoE requirements[15]. Since different users may have different backgrounds and personal preferences, the datasets for training the generative foundation models must be highly diverse. These applications may also require models with good generalization ability in order to fit the varied requirements and service needs of different users, and they impose high requirements on QoE, especially the users' subjective feelings when experiencing different services.

We evaluate the performance of AgentNet in this application scenario based on a 360-degree video interaction VR prototype. In this prototype, generative AI models and the corresponding GF-agents have been developed to convert 2D video captured by an ordinary computer camera into a 360-degree multi-view video that forms a metaverse-based virtual environment. The metaverse is rendered with the Unity engine and displayed on a PICO 4 Pro VR headset. E-agents are built on the metaverse created by the GF-agents, and human users interact with the metaverse via the VR headset's controllers.

V Conclusion

This article has proposed AgentNet, a novel framework for supporting the communication and networking of AI agents. A general architectural framework of AgentNet has been presented, and a GFM-based implementation has been discussed in which multiple GFM-as-agents are created as an interactive knowledge base to bootstrap the development of various embodied AI agents according to different task requirements and environmental features. Two application scenarios, digital-twin-based industrial automation and metaverse-based infotainment systems, have been discussed, and experimental results have been presented to verify the performance of AgentNet in supporting task-driven collaboration and interaction among AI agents.

References

  • [1] W. Saad, O. Hashash, C. K. Thomas, C. Chaccour, M. Debbah, N. Mandayam, and Z. Han, “Artificial general intelligence (AGI)-native wireless systems: A journey beyond 6G,” arXiv preprint, 2024. [Online]. Available: https://arxiv.org/abs/2405.02336
  • [2] Y. Yang et al., “6G network AI architecture for everyone-centric customized services,” IEEE Network, vol. 37, no. 5, pp. 71–80, Sep. 2023.
  • [3] Y. Xiao, G. Shi, Y. Li, W. Saad, and H. V. Poor, “Toward Self-Learning Edge Intelligence in 6G,” IEEE Communications Magazine, vol. 58, no. 12, pp. 34–40, Dec. 2020.
  • [4] X. Lin, “An overview of AI in 3GPP’s RAN Release 18: Enhancing next-generation connectivity?” IEEE ComSoc Technology News, vol. March-2024, Mar. 2024.
  • [5] M. A. Habibi, B. Han, M. Saimler, I. L. Pavon, and H. D. Schotten, “Towards an AI/ML-driven SMO framework in O-RAN: Scenarios, solutions, and challenges,” arXiv preprint arXiv:2409.05092, Sep. 2024.
  • [6] W. Tong and G. Y. Li, “Nine challenges in artificial intelligence and wireless communications for 6G,” IEEE Wireless Communications, vol. 29, no. 4, pp. 140–145, 2022.
  • [7] M. R. Morris, J. Sohl-Dickstein, N. Fiedel, T. Warkentin, A. Dafoe, A. Faust, C. Farabet, and S. Legg, “Position: Levels of AGI for operationalizing progress on the path to AGI,” in ICML, 2024.
  • [8] Z. Durante, Q. Huang, N. Wake, R. Gong, J. S. Park, B. Sarkar, R. Taori, Y. Noda, D. Terzopoulos, Y. Choi et al., “Agent AI: Surveying the horizons of multimodal interaction,” arXiv preprint arXiv:2401.03568, 2024.
  • [9] Z. Durante, B. Sarkar, R. Gong, R. Taori, Y. Noda, P. Tang, E. Adeli, S. K. Lakshmikanth, K. Schulman, A. Milstein et al., “An interactive agent foundation model,” arXiv preprint arXiv:2402.05929, 2024.
  • [10] Y. Xiao, Z. Sun, G. Shi, and D. Niyato, “Imitation learning-based implicit semantic-aware communication networks: Multi-layer representation and collaborative reasoning,” IEEE Journal on Selected Areas in Communications, vol. 41, no. 3, pp. 639–658, Mar. 2023.
  • [11] L. He, G. Sun, D. Niyato, H. Du, F. Mei, J. Kang, M. Debbah, and Z. Han, “Generative AI for game theory-based mobile networking,” 2024. [Online]. Available: https://arxiv.org/abs/2404.09699
  • [12] A. Fallah, A. Mokhtari, and A. Ozdaglar, “Generalization of model-agnostic meta-learning algorithms: Recurring and unseen tasks,” Advances in Neural Information Processing Systems, vol. 34, pp. 5469–5480, Virtual conference, Dec. 2021.
  • [13] Y. Xiao, R. Xia, Y. Li, G. Shi, D. N. Nguyen, D. T. Hoang, D. Niyato, and M. Krunz, “Distributed traffic synthesis and classification in edge networks: A federated self-supervised learning approach,” IEEE Transactions on Mobile Computing, vol. 23, no. 2, pp. 1815–1829, Feb. 2024.
  • [14] H. Zhu, Y. Xiao, Y. Li, G. Shi, and M. Krunz, “SANSee: A physical-layer semantic-aware networking framework for distributed wireless sensing,” IEEE Transactions on Mobile Computing, vol. 24, no. 3, pp. 1636–1653, Mar. 2025.
  • [15] O. Hashash, C. Chaccour, W. Saad, T. Yu, K. Sakaguchi, and M. Debbah, “The seven worlds and experiences of the wireless metaverse: Challenges and opportunities,” IEEE Communications Magazine, pp. 1–8, 2024.
Yong Xiao (Senior Member, IEEE) is a professor in the School of Electronic Information and Communications at the Huazhong University of Science and Technology (HUST), Wuhan, China. His research interests include agentic AI networking, semantic-aware communication, and green AI networking systems.
Guangming Shi (Fellow, IEEE) is the deputy director of Peng Cheng Laboratory, Shenzhen, China. He is also a professor with the School of Artificial Intelligence, Xidian University. His research interests include artificial intelligence, semantic communication, brain-inspired computing, and signal processing.
Ping Zhang (Fellow, IEEE) is currently a Professor with the School of Information and Communication Engineering, BUPT, and also the Director of the State Key Laboratory of Networking and Switching Technology. He is an Academician with the Chinese Academy of Engineering (CAE). His research is in the broad area of wireless communications.