
Topology-aware Microservice Architecture in Edge Networks: Deployment Optimization and Implementation

Yuang Chen, Chang Wu, Fangyu Zhang, Chengdi Lu, Yongsheng Huang, and Hancheng Lu are with the University of Science and Technology of China, Hefei, 230027, China (email: {yuangchen21, changwu, fv215b, lcd1999, ysh6}@mail.ustc.edu.cn; [email protected]). Hancheng Lu is also with the Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, 230088, China.
Abstract

As a ubiquitous deployment paradigm, integrating microservice architecture (MSA) into edge networks promises to enhance the flexibility and scalability of services. However, it also presents significant challenges stemming from dispersed node locations and intricate network topologies. In this paper, we propose a topology-aware MSA characterized by a three-tier network traffic model encompassing the service, microservice, and edge-node layers. This model meticulously characterizes the complex dependencies between edge network topologies and microservices, mapping microservice deployment onto link traffic to accurately estimate communication delay. Building upon this model, we formulate a weighted-sum communication delay optimization problem that accounts for different types of services. We then propose a novel topology-aware and individual-adaptive microservices deployment (TAIA-MD) scheme to solve the problem efficiently, which accurately senses the network topology and incorporates an individual-adaptive mechanism into a genetic algorithm to accelerate convergence and avoid local optima. Extensive simulations show that, compared to existing deployment schemes, TAIA-MD improves communication delay performance by approximately 30% to 60% and effectively enhances overall network performance. Furthermore, we implement the TAIA-MD scheme on a practical microservice physical platform. The experimental results demonstrate that TAIA-MD achieves superior robustness in withstanding link failures and network fluctuations.

Index Terms:
microservice, network topology, load balancing, complex dependency, communication delay, link failure, network fluctuation.

I Introduction

With the proliferation of smartphones and internet-of-things (IoT) devices [1], along with the widespread adoption of cloud and edge computing and the explosive growth of digital content and services, the scale of users and the resource demands for network services have surged dramatically [2]. For instance, ChatGPT reached over one billion monthly active users within less than two months of its launch [3]. Currently, mobile applications are increasingly focusing on being location-aware, compute-intensive, and latency-sensitive [1, 4, 5]. Combined with continuously growing user bases and increasing quality-of-service (QoS) requirements, the limitations of cloud computing, which centralizes service operations in the cloud, are gradually becoming apparent [6, 7]. Edge computing has emerged as an advanced computing paradigm that enables the deployment of computing and storage resources closer to users, significantly improving real-time performance and reducing data traffic costs [8, 9, 10, 11]. To fully utilize computing resources, edge computing must reallocate hardware and software resources via virtualization technologies to serve multiple users on the same hardware [12]. Nevertheless, compared to centralized cloud networks, edge networks face tougher challenges in terms of user mobility, device heterogeneity, limited resources, and geographically dispersed edge nodes [13, 14, 15]. Moreover, commonly used virtualization technologies based on virtual machines (VMs) are impractical for edge environments due to their high resource overhead, slow startup times, and complex deployment and migration processes [16, 17].

To alleviate the above issues, a more distributed and service-oriented application architecture [18, 19], microservice architecture (MSA), has recently emerged as a potential enabler for flexible service deployment in distributed scenarios [20, 21, 22, 14, 23, 24]. In MSA, complex monolithic applications are decomposed into multiple logically distinct, mutually complementary, and single-function microservices, which cooperate with each other to provide combined functionality [14, 25, 19]. This approach offers significant flexibility, robustness, and scalability for the deployment of distributed services. Each microservice in MSA can be independently developed, upgraded, and deployed using different programming languages, thus effectively addressing the scalability, maintenance, and team collaboration issues inherent in traditional monolithic applications [26]. Currently, many internet giants such as Amazon, Netflix, and eBay have extensively exploited MSA, and their research and development progress reports claim the effectiveness of MSA [27]. For instance, Amazon emphasized in [28] that MSA can achieve more controllable and faster production, easily expand computing resources, reduce unit complexity, and create more scalable development teams. In addition, microservices are typically deployed using lightweight virtualization technologies like containers [29], which leads to faster deployment and startup, lower resource consumption, and greater compatibility, and can be flexibly deployed on resource-constrained, platform-heterogeneous edge or fog devices [30]. Consequently, developing MSA-based service deployment approaches for edge networks is expected to enhance the flexibility and scalability of services, becoming one of the mainstream technologies for service deployment.

I-A Motivations and Challenges

Figure 1: Complex dependencies among microservices. (a) Microservice invocations for services. (b) Invocations among microservice instances.
Figure 2: The impacts of microservice deployment schemes on the QoS of services. (a) Microservice Invocations. (b) Deployment Scheme 1. (c) Deployment Scheme 2.

Although MSA is anticipated to provide finer-grained deployment schemes for edge networks, enhancing flexibility and scalability, and speeding up application development cycles, the development of microservices in edge networks still encounters numerous stringent challenges.

I-A1 Complex microservice dependencies in edge networks with geographically dispersed nodes

The geographical dispersion of edge nodes necessitates complex topology forwarding for inter-device communications, which is exacerbated by limited and unstable link bandwidth, resulting in significant differences in communication delay and bandwidth occupancy across nodes [13, 14]. In addition, microservices in MSA typically have complex dependencies [31], as illustrated in Fig. 1, where $M_i$ represents a type of microservice and $M_{i,j}$ denotes the $j$-th instance of $M_i$. Fig. 1 (a) shows that fulfilling a service requirement usually necessitates the collaboration of multiple microservices, with various services potentially sharing the same type of microservice. As shown in Fig. 1 (b), each type of microservice can have multiple instances with complex invocation relationships among them. Consequently, compared to traditional monolithic applications, MSA demands greater communication requirements and exhibits more intricate inter-instance relationships. This complicates traffic analysis between edge nodes, and existing work lacks effective analysis models [32, 22, 20]. In this regard, an effective topology-aware microservice deployment model is needed to map microservice deployments onto edge nodes' link traffic, thereby accurately characterizing the complex edge network topology as well as inter-microservice dependencies.

I-A2 Difficulties of implementing optimal microservice deployment in edge networks

The essence of microservice deployment involves shared resource allocation and communication overhead, which directly impact the QoS of microservice-based applications [20]. Fig. 2 illustrates the impacts of microservice deployment schemes on the QoS of services. Fig. 2 (a) depicts the microservice invocations required for a specific service, where node labels represent microservice types and arrow labels indicate the execution order of requests or responses. Fig. 2 (b) and Fig. 2 (c) show two different deployment schemes for the invocations described in Fig. 2 (a), where node labels represent the deployed microservice types and link labels denote the inter-node communication delay (in ms). Compared to deployment scheme 1, deployment scheme 2 significantly reduces communication delay by collocating frequently interacting microservice instances on the same or nearby computing nodes. In particular, based on the microservice invocations depicted in Fig. 2 (a), from the moment a user request is received at the user-access node $\mathbf{A}$ to the moment the response returns to node $\mathbf{A}$, the total communication delays generated by deployment schemes 1 and 2 are 46 ms and 22 ms, respectively. Moreover, deployment scheme 2 deploys instances of microservice $\mathbf{2}$ on both computing nodes $\mathbf{D}$ and $\mathbf{F}$, which allows microservice $\mathbf{1}$ on computing node $\mathbf{D}$ to invoke microservice $\mathbf{2}$ locally, thereby avoiding cross-node communication and reducing communication delay and overhead [33].

The aforementioned analysis highlights that a well-designed microservice deployment scheme in distributed edge networks can remarkably conserve network resources and reduce communication delay. Due to the limited computing capacity of edge nodes, multiple microservice instances are usually deployed across distributed nodes to meet the QoS requirements of applications and avoid single points of failure [34, 33], which also reduces cross-node communication and thus communication delay and overhead [33]. Moreover, deploying multiple instances on the same edge node may lead to resource contention, and allocating more resources to one microservice may slow down the operation of other microservices [20]. As a result, microservice deployment optimization must be dynamically adjusted according to service load, since microservices compete with each other for resources and the QoS of a service is affected by almost all of its microservices.

Figure 3: An example of bandwidth collision in edge networks.

I-A3 The prerequisite of optimal microservice deployment necessitates establishing an effective traffic analysis model

In edge networks with massive numbers of services and users, some links carry multiple service traffic flows concurrently, leading to competition for network resources [20]. Fig. 3 illustrates bandwidth collisions of service traffic in edge networks, where two types of services are realized through mutual invocation among various microservices. Both service 1 and service 2 need to invoke microservice 4 and microservice 5, resulting in bandwidth collisions as their traffic inevitably crosses the link marked with the red star in Fig. 3. Therefore, optimizing microservice deployment necessitates an effective traffic analysis model to accurately capture complex microservice dependencies and network topologies. However, this is extremely difficult. Existing studies usually simplify edge networks by treating the bandwidth and delay between nodes as fixed values, which fails to fully incorporate network topology and ignores the impact of deployment schemes on link bandwidth, resulting in lower-than-expected communication capacity between instances and even congestion [31, 20, 35]. Furthermore, most studies focus only on simulation validation and lack verification on actual physical platforms, overlooking the feasibility of deployment schemes in real systems and leading to inaccurate estimation of link bandwidth. Therefore, implementing and validating deployment schemes on actual physical platforms is essential for optimal microservice deployment in edge networks with intricate topologies [36, 13, 23, 24].

I-B Main Contributions

To effectively overcome the aforementioned challenges, we have proposed an innovative topology-aware MSA, which incorporates a three-tier network traffic analysis model. Then, we have formulated a weighted sum communication delay optimization problem that aims to improve delay performance by optimizing microservices deployment. To effectively tackle this problem, we have developed a novel topology-aware and individual-adaptive microservices deployment (TAIA-MD) scheme. Extensive simulations and physical platform validations demonstrate the superior performance of our proposed TAIA-MD scheme. The primary contributions of this paper are summarized as follows:

  • For the first time, we have proposed an innovative topology-aware MSA, which features a three-tier network traffic analysis model comprising the service, microservice, and edge node layers. This model meticulously captures the complex inter-microservice dependencies and edge network topologies by mapping microservice deployments onto edge nodes’ link traffic to accurately estimate the communication delay.

  • Building upon this model, we have formulated a weighted sum communication delay optimization problem that considers the load conditions of services, aiming to improve delay performance through optimizing microservices deployment. To address this intractable problem, we have developed a novel microservice deployment scheme named TAIA-MD, which customizes the deployment scheme by sensing network topologies and incorporates an individual-adaptive mechanism in the genetic algorithm (GA) to accelerate the convergence and avoid local optima.

  • Extensive simulations demonstrate that TAIA-MD can reduce communication delay by approximately 30% to 60% compared to existing deployment schemes in bandwidth-constrained and topology-complex edge networks. Moreover, physical platform experiments further show the superior robustness of the TAIA-MD in effectively combating link failures and network fluctuations.

The remainder of this paper is organized as follows. In Section II, we provide an in-depth literature review of relevant studies. The topology-aware MSA is introduced in Section III. Section IV formulates and solves the weighted sum communication delay optimization problem. The results and analysis of simulations and physical verifications are provided in Section V. Finally, Section VI concludes this paper and discusses future directions.

II RELATED WORKS

In this section, we first review earlier literature primarily focused on microservice deployments centered around cloud environments. Then, we comprehensively summarize and examine the latest microservice deployment schemes in edge networks. Finally, we identify limitations related to the aforementioned challenges from these literature reviews.

II-A Deployment Schemes Centered around Cloud Environments

Due to the highly decoupled nature of MSA, implementing a certain service usually requires close collaboration among numerous microservices [37, 38]. However, deploying hundreds or thousands of microservice instances on resource-constrained servers presents a formidable challenge [39, 38, 13, 23, 24]. Research on microservice deployment schemes is always of paramount importance. Early studies on microservice deployment primarily centered around cloud environments [40, 41, 42]. In [40], the authors have considered resource contention and deployment time among microservices, and proposed a parallel deployment scheme to aggregate microservices that compete for as diverse resources as possible, thereby minimizing interference in resource-constrained cloud systems. To alleviate the enormous pressure brought by the scale of microservices on cluster management in cloud computing, the authors in [41] have developed a topology-aware deployment framework that leverages a heuristic graph mapping algorithm to capture the topological structure of microservices and clusters, thereby minimizing network overhead for applications. In order to mitigate communication delays caused by intricately interdependent microservices, the authors in [42] have explored a novel deployment scheme that leverages the fine-grained and comprehensive modeling of microservice dependencies.

II-B Latest Deployment Schemes for Edge Networks

Microservice deployment in edge networks is receiving increasing attention due to the flourishing development of latency-sensitive services, as well as advancements in edge or fog computing techniques. In [20] and [43], the authors have proposed a novel MSA called Nautilus to deploy microservice instances in the cloud-edge continuum effectively. This MSA adjusts instance placement and achieves load balancing by migrating containers from busy nodes, thereby ensuring the required QoS under external shared resource contention. To mitigate the impact of network load and routing on service reliability, in [44], we have proposed a network-aware service reliability model that effectively captures the correlation between network state changes and reliability. In addition, we have designed a service reliability-aware deployment algorithm, which leverages shared path reliability calculation to improve service failure tolerance and bandwidth consumption. In [45], the authors have studied an optimal microservice deployment scheme to balance layer sharing and chain sharing in resource-constrained edge servers. This scheme effectively addresses the challenges of microservice image pull delay and communication overhead minimization.

II-C Limitations of Literature Reviews

Although current research reveals valuable insights into microservice deployment in edge networks, it seldom accounts for the tangible impact of edge environments and resource contention [37, 38, 39, 13, 23, 24, 40, 41, 42, 20, 43, 44, 45]. The wide distribution of edge devices and nodes, the multi-hop forwarding characteristics of inter-node communications, and the complex intrinsic dependencies among microservices make accurate traffic analysis modeling elusive [31, 20, 35]. Concurrently, the coexistence of multiple flows on numerous links sparks resource competition, leading to significant variations in link performance across deployment strategies. Moreover, improper deployment might overwhelm links, drastically degrading data transfer rates and increasing service delays. Nevertheless, prevailing studies typically simplify traffic analysis by disregarding the dynamic nature of network topology, node bandwidth, and delay [35].

III Topology-aware Microservice Architecture for Edge Networks

Figure 4: Topology-aware microservice deployment architecture.

As illustrated in Fig. 4, we propose an innovative topology-aware microservice architecture characterized by a three-tier network traffic analysis model, comprising the service-layer invocation model, the microservice-layer data model, and the edge-node-layer traffic model. This model maps microservice deployments onto edge nodes' link traffic to meticulously capture the complex inter-microservice dependencies and edge network topology. Let $W_k$ and $N_p$ indicate the $k$-th type of service and the $p$-th node, respectively, where $k \in \mathcal{K} \triangleq \{1,\cdots,K\}$ and $p \in \mathcal{P} \triangleq \{1,\cdots,P\}$. The $i$-th type of microservice is denoted as $M_i$, where $i \in \mathcal{I} \triangleq \{1,\cdots,I\}$. Let $M_{i,a}$ denote the $a$-th instance of $M_i$, where $a \in \mathcal{A}_i \triangleq \{1,\cdots,A_i\}$ and $\mathcal{A}_i$ is the set of instances of $M_i$.

For example, consider microservice instances deployed across five computing nodes $N_1 \sim N_5$. Notably, our proposed microservice architecture is not limited to this example; it is designed for broad applicability. As shown in Fig. 4, when $W_1$ is requested, microservices are invoked in the order $M_1 \rightarrow M_2 \rightarrow M_4 \rightarrow M_5$. One possible sequence of instance invocations at the microservice layer is $M_{1,1} \rightarrow M_{2,1} \rightarrow M_{4,1} \rightarrow M_{5,2}$, with these instances deployed on nodes $N_1$, $N_3$, and $N_4$. In this case, minimizing the communication delay of service execution at the edge-node layer rests on optimizing the deployment locations of microservices in the microservice layer.

III-A Service Layer Invocation Model

First, we introduce the service-layer invocation model, illustrated with a specific example for clarity. As shown in Fig. 5, we examine a particular invocation executing the service $W_k, k \in \mathcal{K}$. The labels inside each node indicate the type of microservice, while the labels on each arrow represent the execution order of requests or responses. In edge networks, users access the network through their nearby user-access nodes. In particular, user requests first arrive at the user-access nodes and are then forwarded to the appropriate computing nodes based on the deployment scheme. To describe the transition of user requests from the user-access node to the first microservice, we introduce a special microservice type $M_0$, called the virtual head microservice, which occupies no computational resources.

Let $F_{i,j}$ and $R_{i,j}$, $i,j \in \mathcal{I}, i \neq j$, indicate a request and a response from $M_i$ to $M_j$ during the execution of service $W_k$, respectively. In this case, the microservice invocation of service $W_k$ presented in Fig. 5 can be formulated as the following multiset:

$W_k \Longrightarrow \big\{F_{0,1}, F_{1,2}, R_{2,1}, F_{1,3}, F_{3,4}, F_{4,2}, R_{2,4}, F_{4,5}, F_{5,7}, R_{7,5}, F_{5,6}, R_{6,5}, R_{5,4}, F_{4,2}, R_{2,4}, R_{4,3}, R_{3,1}, R_{1,0}\big\} \Longrightarrow \big\{F_{0,1}, F_{1,2}, F_{1,3}, F_{3,4}, 2F_{4,2}, F_{4,5}, F_{5,6}, F_{5,7}, R_{2,1}, 2R_{2,4}, R_{3,1}, R_{4,3}, R_{5,4}, R_{6,5}, R_{7,5}, R_{1,0}\big\}. \quad (1)$

We define $\mathcal{F}_k$ as the request matrix for $W_k$, where $F_{i,j}^k \in \mathcal{F}_k$ represents the number of requests from $M_i$ to $M_j$ during the execution of service $W_k$. Thus, $\mathcal{F}_k$ can be given by

$\mathcal{F}_k = \begin{bmatrix} 0&1&0&0&0&0&0&0\\ 0&0&1&1&0&0&0&0\\ 0&0&0&0&0&0&0&0\\ 0&0&0&0&1&0&0&0\\ 0&0&2&0&0&1&0&0\\ 0&0&0&0&0&0&1&1\\ 0&0&0&0&0&0&0&0\\ 0&0&0&0&0&0&0&0 \end{bmatrix}. \quad (2)$

Similarly, the response matrix of service $W_k$ can be defined by $\mathcal{R}^k = (\mathcal{F}_k)^{\mathrm{T}}$.
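To make the mapping from invocation multiset to request matrix concrete, the following minimal Python sketch assembles the $\mathcal{F}_k$ of Eq. (2) from the request pairs listed in Eq. (1); the indices are those of this example, and the code is illustrative rather than part of the paper's implementation.

import numpy as np

# Build the request matrix F_k of Eq. (2) from the request pairs in Eq. (1).
# Types M_0..M_7; M_0 is the virtual head microservice; F_{4,2} occurs twice.
requests = [(0, 1), (1, 2), (1, 3), (3, 4), (4, 2), (4, 2),
            (4, 5), (5, 6), (5, 7)]
F_k = np.zeros((8, 8), dtype=int)
for i, j in requests:
    F_k[i, j] += 1
R_k = F_k.T  # responses mirror requests, so R^k = (F_k)^T
print(F_k)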

Figure 5: A particular service invocation based on virtual head microservice.

III-B Microservice Layer Data Model

Each type of microservice is usually equipped with multiple instances, and this paper schedules them using the Round Robin method. Thus, each instance $M_{i,a}$ of microservice $M_i$ is equally likely to be invoked:

$P_{M_{i,a}} = \frac{1}{|\mathcal{A}_i|}, \ \forall i \in \mathcal{I} \backslash \{0\}, a \in \mathcal{A}_i. \quad (3)$

For the case of $i=0$, the probability depends on the service arrival rate at each user-access node, which will be explained in detail later.

Let $P_{M_{i,a}}^k$ denote the probability of selecting instance $M_{i,a}$ when invoking microservice $M_i$ during the execution of service $W_k$, where $i \in \mathcal{I}, k \in \mathcal{K}, a \in \mathcal{A}_i$. Let $F_{M_{i,a},M_{j,b}}^k$ represent the average frequency with which instance $M_{i,a}$ requests instance $M_{j,b}$ during the execution of service $W_k$; then we have

$F_{M_{i,a},M_{j,b}}^k = F_{i,j}^k P_{M_{i,a}}^k P_{M_{j,b}}^k, \ a \in \mathcal{A}_i, b \in \mathcal{A}_j, i \neq j. \quad (4)$

Let $D_{i,j}^{req}$ indicate the average requested data size when $M_i$ invokes $M_j$, and let $\widetilde{F}_{M_{i,a},M_{j,b}}^k$ denote the average data size requested by instance $M_{i,a}$ from instance $M_{j,b}$ during the execution of service $W_k$. Then we obtain

$\widetilde{F}_{M_{i,a},M_{j,b}}^k = D_{i,j}^{req} F_{M_{i,a},M_{j,b}}^k. \quad (5)$

Similarly, let $R_{M_{i,a},M_{j,b}}^k$ indicate the average frequency with which instance $M_{i,a}$ responds to instance $M_{j,b}$ when $W_k$ is invoked, where $i,j \in \mathcal{I}, a \in \mathcal{A}_i, b \in \mathcal{A}_j$. The average response data size per invocation from $M_i$ to $M_j$ and from $M_{i,a}$ to $M_{j,b}$ can be expressed as $D_{i,j}^{res}$ and $\widetilde{R}_{M_{i,a},M_{j,b}}^k$, respectively. Since the request and response processes correspond to each other, we obtain

$R_{M_{i,a},M_{j,b}}^k = F_{M_{j,b},M_{i,a}}^k, \quad (6a)$
$\widetilde{R}_{M_{i,a},M_{j,b}}^k = D_{i,j}^{res} R_{M_{i,a},M_{j,b}}^k. \quad (6b)$
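As a concrete illustration of Eqs. (3)-(6) under Round Robin scheduling, the short sketch below computes a per-instance request frequency and the corresponding average data volume; the instance counts and the 50 kB request size are illustrative assumptions, not values from the paper.

def instance_request_freq(F_ij, n_inst_i, n_inst_j):
    # Eq. (4) with Eq. (3): each instance is picked with probability 1/|A|.
    return F_ij * (1.0 / n_inst_i) * (1.0 / n_inst_j)

# Example: F_{4,2} = 2 requests per execution, two instances on each side.
f = instance_request_freq(F_ij=2, n_inst_i=2, n_inst_j=2)  # 0.5 per pair
data_req_kb = 50 * f  # Eq. (5): average requested data, assuming 50 kB each
# Eq. (6a): the response frequency of the reverse instance pair equals f.
print(f, data_req_kb)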

III-C Edge-Node Layer Traffic Model

The edge-node layer comprises three types of nodes: user-access nodes for processing user requests, routing nodes for data forwarding, and computing nodes for deploying and executing microservices. All nodes are represented by the set $\mathcal{N}$, with user-access nodes as $\mathcal{N}_{acc}$, routing nodes as $\mathcal{N}_{rou}$, and computing nodes as $\mathcal{N}_{cmp}$; hence $\mathcal{N} \triangleq \mathcal{N}_{acc} \cup \mathcal{N}_{rou} \cup \mathcal{N}_{cmp}$. We consider service requests arriving at each user-access node to follow a Poisson distribution. If the Poisson parameter for $W_k$ at user-access node $N_p \in \mathcal{N}_{acc}$ is $\lambda_p^k$, then the Poisson parameter for requests of service $W_k$ in the edge-node layer can be expressed as

$\lambda^k = \sum_{N_p \in \mathcal{N}_{acc}} \lambda_p^k. \quad (7)$

The computing node hosting instance $M_{i,a}$ is represented as $L_{i,a} \in \mathcal{N}_{cmp}$. For instances of the virtual head microservice $M_0$, the probability of invoking instance $M_{0,c}$ during the execution of service $W_k$ is given by

$P_{M_{0,c}}^k = \frac{\lambda_{L_{0,c}}^k}{\lambda^k}, \ c \in \mathcal{A}_0. \quad (8)$

In this case, once microservice $M_i$ is invoked during the execution of $W_k$, the invocation probability of instance $M_{i,a}$ can be denoted as follows:

$P_{M_{i,a}}^k = \begin{cases} \frac{\lambda_{L_{0,a}}^k}{\lambda^k}, & \text{if } i = 0, a \in \mathcal{A}_0, \\ \frac{1}{|\mathcal{A}_i|}, & \text{if } i \neq 0, a \in \mathcal{A}_i. \end{cases} \quad (9)$

Given any two nodes $N_p, N_q \in \mathcal{N}$, we define the traffic that flows directly from $N_p$ to $N_q$ without any forwarding as the direct traffic $S_{p,q}$. If nodes $N_x$ and $N_y$ are adjacent, then the traffic of all types on this link, including forwarded traffic, is defined as the total traffic, represented by $\widetilde{S}_{x,y}$. Let $\mathcal{Q}_p$ denote the set of instances deployed on node $N_p, p \in \mathcal{N}$; then the direct traffic between any two nodes $N_p$ and $N_q$ during a single execution of $W_k$ can be represented as

$S_{p,q}^k = \sum_{M_{i,a} \in \mathcal{Q}_p} \sum_{M_{j,b} \in \mathcal{Q}_q} \left( \widetilde{F}_{M_{i,a},M_{j,b}}^k + \widetilde{R}_{M_{i,a},M_{j,b}}^k \right), \ \forall k \in \mathcal{K}. \quad (10)$

Then, the direct traffic from $N_p$ to $N_q$ per unit time can be represented by

$S_{p,q} = \sum_{k \in \mathcal{K}} \lambda^k S_{p,q}^k. \quad (11)$

We use $U_{p,q}$ to represent the routing path from $N_p$ to $N_q$, where $p,q \in \mathcal{N}$. Given any two adjacent nodes $N_x$ and $N_y$, if the directly connected link between them is one hop of the routing path $U_{p,q}$, we denote this as $\langle N_x, N_y \rangle \in U_{p,q}$. The total traffic from $N_x$ to its adjacent node $N_y$ can be expressed as

$\widetilde{S}_{x,y} = \sum_{p=1}^{|\mathcal{N}|} \sum_{q=1}^{|\mathcal{N}|} \Big\{ S_{p,q}, \ \text{if } \langle N_x, N_y \rangle \in U_{p,q} \ \text{else } 0 \Big\}, \quad (12)$

where (12) reveals that, when traversing the routing path $U_{p,q}$ whose direct traffic from $N_p$ to $N_q$ is $S_{p,q}$, the total traffic between $N_x$ and $N_y$ is incremented by $S_{p,q}$ whenever $\langle N_x, N_y \rangle$ is one hop of $U_{p,q}$.
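The following minimal sketch illustrates Eqs. (10)-(12): per-execution direct traffic is scaled by service arrival rates (Eq. (11)) and then accumulated onto every hop of the corresponding routing path (Eq. (12)). Node names, routing paths, and traffic figures are illustrative assumptions.

from collections import defaultdict

# Direct per-execution traffic S^k_{p,q} (kB) and arrival rates lambda^k.
S_k = {('N1', 'N3'): {'W1': 120.0}, ('N3', 'N4'): {'W1': 80.0}}
lam = {'W1': 40.0}

# Eq. (11): direct traffic per unit time.
S = {pq: sum(lam[k] * v for k, v in per_k.items()) for pq, per_k in S_k.items()}

# Hypothetical routing paths U_{p,q}, each a list of hops <N_x, N_y>.
U = {('N1', 'N3'): [('N1', 'R1'), ('R1', 'N3')],
     ('N3', 'N4'): [('N3', 'R1'), ('R1', 'N4')]}

# Eq. (12): each link accumulates the direct traffic of every path using it.
S_total = defaultdict(float)
for pq, hops in U.items():
    for hop in hops:
        S_total[hop] += S.get(pq, 0.0)
print(dict(S_total))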

III-D Communications Delay Analysis Model

To address the issue of multiple service traffic flows sharing link bandwidth, this paper leverages queuing theory to analyze the available link bandwidth. We exploit the M/M/1 queuing model to characterize the packet transmission process between any two adjacent nodes $N_x$ and $N_y$. Let the packet size be $s$ kB, the bandwidth from $N_x$ to $N_y$ be $Z_{x,y}$ Mbps, the packet arrival rate from $N_x$ to $N_y$ be $\lambda_{x,y}$ packets per unit time, and the service rate of the link be $\mu_{x,y}$ packets per unit time. The average sending time for each packet between $N_x$ and $N_y$ can be expressed as

$T_{x,y}^{pkg} = \frac{1}{\mu_{x,y} - \lambda_{x,y}}. \quad (13)$

The average available bandwidth of the directly connected link from $N_x$ to its adjacent node $N_y$ can be represented by

$Z'_{x,y} = \frac{s}{T_{x,y}^{pkg}} = s \times (\mu_{x,y} - \lambda_{x,y}) = s \times \left( \frac{Z_{x,y}}{s} - \frac{\widetilde{S}_{x,y}}{s} \right) = Z_{x,y} - \widetilde{S}_{x,y}. \quad (14)$
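A minimal sketch of Eqs. (13)-(14): with $\mu_{x,y} = Z_{x,y}/s$ and $\lambda_{x,y} = \widetilde{S}_{x,y}/s$, the packet size cancels and the average available bandwidth reduces to $Z_{x,y} - \widetilde{S}_{x,y}$. The bandwidth and traffic values below are illustrative.

def available_bandwidth(Z_xy_mbps, S_tilde_xy_mbps):
    # Eq. (14): Z' = Z - S~; valid only while the M/M/1 queue is stable.
    assert S_tilde_xy_mbps < Z_xy_mbps, "link saturated: arrivals >= service"
    return Z_xy_mbps - S_tilde_xy_mbps

print(available_bandwidth(100.0, 35.0))  # 65.0 Mbps remains on this link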

Given any two adjacent nodes $N_x$ and $N_y$, the propagation delay between them is represented as $V_{x,y}$. Assuming full-duplex communication among nodes, the total propagation delay between any two nodes $N_p$ and $N_q$ can be expressed as follows:

$\widetilde{V}_{p,q} = \sum_{\langle N_x, N_y \rangle \in U_{p,q}} V_{x,y}. \quad (15)$

Let $Z_{p,q}^{min}$ denote the minimum available bandwidth among all links on the path from $N_p$ to $N_q$. Ignoring the forwarding time at intermediate nodes, the average communication delay $T_{M_{i,a},M_{j,b}}$ for instance $M_{i,a}$ to send requests to instance $M_{j,b}$ can be calculated as

$T_{M_{i,a},M_{j,b}} = \begin{cases} \widetilde{V}_{L_{i,a},L_{j,b}} + \frac{D_{i,j}^{req}}{Z_{L_{i,a},L_{j,b}}^{min}}, & \text{if } L_{i,a} \neq L_{j,b}, \\ 0, & \text{if } L_{i,a} = L_{j,b}, \end{cases} \quad (16)$

where $L_{i,a}, a \in \mathcal{A}_i$ and $L_{j,b}, b \in \mathcal{A}_j$ represent the computing nodes where $M_{i,a}$ and $M_{j,b}$ are deployed, respectively. Similarly, the average communication delay for $M_{i,a}$ to send responses to $M_{j,b}$ is given as follows:

$T'_{M_{i,a},M_{j,b}} = \begin{cases} \widetilde{V}_{L_{i,a},L_{j,b}} + \frac{D_{i,j}^{res}}{Z_{L_{i,a},L_{j,b}}^{min}}, & \text{if } L_{i,a} \neq L_{j,b}, \\ 0, & \text{if } L_{i,a} = L_{j,b}. \end{cases} \quad (17)$

Therefore, the average communication delay for each execution of service $W_k$ can be expressed as

$T_k = \sum_{i=0}^{|\mathcal{I}|} \sum_{a=1}^{|\mathcal{A}_i|} \sum_{j=0}^{|\mathcal{I}|} \sum_{b=1}^{|\mathcal{A}_j|} \left( F_{M_{i,a},M_{j,b}}^k T_{M_{i,a},M_{j,b}} + R_{M_{i,a},M_{j,b}}^k T'_{M_{i,a},M_{j,b}} \right), \ \forall i \neq j, k \in \mathcal{K}. \quad (18)$

Due to the various delay requirements of different service types, we employ the weighted sum of the average communication delays of each service type to indicate the system's overall average communication delay, which can be expressed as follows:

$T = \sum_{k \in \mathcal{K}} \theta_k T_k, \quad (19)$

where $\theta_k \in \Theta \triangleq \{\theta_1, \cdots, \theta_K\}$ denotes the weight of service $W_k, k \in \mathcal{K}$.
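To tie Eqs. (16)-(19) together, the sketch below computes frequency-weighted pairwise delays and the weighted system delay; the propagation delays, data sizes, bottleneck bandwidths, and weights are illustrative placeholders.

def pair_delay_ms(v_pq_ms, data_kb, z_min_mbps, co_located):
    # Eq. (16)/(17): propagation delay plus transmission over the bottleneck
    # bandwidth; zero when both instances share the same computing node.
    if co_located:
        return 0.0
    return v_pq_ms + (data_kb * 8.0 / 1000.0) / z_min_mbps * 1000.0

# Eq. (18): frequency-weighted pair delays; Eq. (19): weighted service sum.
T_k = {'W1': 2 * pair_delay_ms(0.5, 50, 65.0, False),
       'W2': 1 * pair_delay_ms(0.3, 20, 80.0, False)}
theta = {'W1': 0.6, 'W2': 0.4}
T = sum(theta[k] * T_k[k] for k in T_k)
print(T)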

IV Problem Formulation and Solutions

In this section, we aim to minimize the weighted sum of average communication delays by optimizing the microservice deployment scheme $\mathcal{L} \triangleq \{L_{i,a}\}_{i \in \mathcal{I}, a \in \mathcal{A}_i}$, where $L_{i,a}$ indicates the computing node on which microservice instance $M_{i,a}$ is deployed. Based on the models in Sections III-A to III-D, we can formulate the minimization problem as follows:

$\mathcal{P}1: \ \min_{\{L_{i,a}\}_{i \in \mathcal{I}, a \in \mathcal{A}_i}} \ T = \sum_{k \in \mathcal{K}} \theta_k T_k, \quad (20a)$
$\text{s.t.} \ \sum_{M_{i,a} \in \mathcal{M}_p} C_i \leqslant C_p, \ \forall N_p \in \mathcal{N}, \quad (20b)$
$\sum_{M_{i,a} \in \mathcal{M}_p} B_i \leqslant B_p, \ \forall N_p \in \mathcal{N}, \quad (20c)$
$\widetilde{S}_{p,q} \leqslant Z_{p,q}, \ \forall q \in \mathcal{N}_p, \ \forall p \in \mathcal{N}, \quad (20d)$

where $\mathcal{M}_p$ and $\mathcal{N}_p$ represent the set of microservice instances deployed on node $N_p$ and the set of nodes adjacent to $N_p$, respectively, $p \in \mathcal{N}$. $C_i$ and $B_i$ indicate the computational and memory resources required to deploy an instance of microservice $M_i$ on node $N_p$, respectively. Constraints (20b) and (20c) guarantee that the total computational and memory resources required by the microservice instances on each node do not exceed the node's CPU count and memory capacity, respectively. Constraint (20d) ensures that the traffic on each link does not exceed its bandwidth limit.

IV-A Proposed Solution

Due to the numerous microservices and nodes, it is extremely challenging to obtain the optimal solution of the original problem $\mathcal{P}1$ efficiently using traditional convex optimization methods. To this end, we first introduce 0-1 variables to transform $\mathcal{P}1$ into a nonlinear 0-1 programming problem.

IV-A1 Problem Transformation

Generally, to guarantee the availability of services in edge networks, instances of each microservice should be deployed across multiple nodes. In this case, we introduce binary variables $G_{p,i} \in \mathcal{G}$ to characterize whether an instance of microservice $M_i, i \in \mathcal{I}$, is deployed on node $N_p$. Moreover, we use $M_{p,i}$ to denote the instance of microservice $M_i$ located on node $N_p$, thereby uniquely identifying whether an instance of $M_i$ resides on $N_p \in \mathcal{N}$. Then, (10) can be reformulated as follows:

$S_{p,q}^k = \sum_{M_i \in \mathcal{M}} \sum_{M_j \in \mathcal{M}} G_{p,i} G_{q,j} \left( \widetilde{F}_{M_{p,i},M_{q,j}}^k + \widetilde{R}_{M_{p,i},M_{q,j}}^k \right). \quad (21)$

For the topology relationships, we introduce a binary variable $I_{x,y}^{p,q} \in I_{p,q}$ to indicate whether the path from node $N_p$ to node $N_q$ includes the hop between adjacent nodes $N_x$ and $N_y$. Then, (12) can be reformulated as follows:

$\widetilde{S}_{x,y} = \sum_{p=1}^{|\mathcal{N}|} \sum_{q=1}^{|\mathcal{N}|} I_{x,y}^{p,q} S_{p,q}. \quad (22)$

Similarly, (15) can also be rewritten as follows:

$\widetilde{V}_{p,q} = \sum_{I_{x,y}^{p,q} \in I_{p,q}} I_{x,y}^{p,q} V_{x,y}. \quad (23)$

Furthermore, (16) and (17) can be reformulated as follows:

$T_{M_{p,i},M_{q,j}} = \begin{cases} \widetilde{V}_{p,q} + \frac{D_{i,j}^{req}}{Z_{p,q}^{min}}, & \text{if } p \neq q \wedge i \neq j, \\ 0, & \text{else}, \end{cases} \quad (24)$
$T'_{M_{p,i},M_{q,j}} = \begin{cases} \widetilde{V}_{p,q} + \frac{D_{i,j}^{res}}{Z_{p,q}^{min}}, & \text{if } p \neq q \wedge i \neq j, \\ 0, & \text{else}. \end{cases} \quad (25)$

Finally, (18) can be transformed as follows:

$T_k = \sum_{i=0}^{|\mathcal{I}|} \sum_{p=1}^{|\mathcal{N}|} \sum_{j=0}^{|\mathcal{I}|} \sum_{q=1}^{|\mathcal{N}|} G_{p,i} G_{q,j} \left( F_{M_{p,i},M_{q,j}}^k T_{M_{p,i},M_{q,j}} + R_{M_{p,i},M_{q,j}}^k T'_{M_{p,i},M_{q,j}} \right), \ i \neq j. \quad (26)$

IV-A2 Problem Reformulation

As a result, the original problem $\mathcal{P}1$ can be equivalently transformed into the following form:

$\mathcal{P}2: \ \min_{\mathcal{G}: \{G_{p,i}\}_{i \in \mathcal{I}, p \in \mathcal{N}}} \ T = \sum_{W_k \in \mathcal{W}} \theta_k T_k, \quad (27a)$
$\text{s.t.} \ \sum_{p=1}^{|\mathcal{N}|} G_{p,i} \leq |\mathcal{A}_i|, \ \forall i \in \mathcal{I}, \quad (27b)$
$\sum_{i=1}^{|\mathcal{I}|} G_{p,i} \leq 1, \ \forall N_p \in \mathcal{N}, \quad (27c)$
$\sum_{i=1}^{|\mathcal{I}|} G_{p,i} C_i \leqslant C_p, \ \forall p \in \mathcal{N}, \quad (27d)$
$\sum_{i=1}^{|\mathcal{I}|} G_{p,i} B_i \leqslant B_p, \ \forall p \in \mathcal{N}, \quad (27e)$
$\widetilde{S}_{p,q} \leqslant Z_{p,q}, \ \forall q \in \mathcal{N}_p, \ \forall p \in \mathcal{N}, \quad (27f)$

where (27b) and (27c) specify the constraints on the number of microservice instances. In particular, (27b) ensures that the total number of instances of microservice $M_i$ across all nodes does not exceed $|\mathcal{A}_i|$, while (27c) prevents multiple instances of $M_i$ from being deployed on the same node. Constraints (27d) and (27e) ensure that the computational and memory resources required by the microservices deployed on a node cannot exceed the node's total available computational and memory resources. Constraint (27f) enforces the bandwidth constraint between node $N_p$ and its adjacent nodes $N_q$.
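The sketch below gives a minimal feasibility check of a candidate binary deployment matrix $\mathcal{G}$ against constraints (27b)-(27e); the dimensions and resource figures are illustrative and unrelated to the simulation settings in Section V.

import numpy as np

P, I = 5, 4                      # nodes x microservice types (illustrative)
G = np.zeros((P, I), dtype=int)  # G[p, i] = 1 iff an instance of M_i is on N_p
G[0, 0] = G[2, 1] = G[3, 2] = G[4, 3] = 1

A = np.array([2, 2, 1, 1])                  # |A_i|: instance budget per type
C_req = np.array([0.2, 0.1, 0.3, 0.2]); C_cap = np.full(P, 8.0)    # CPU units
B_req = np.array([200, 100, 300, 150]);  B_cap = np.full(P, 8000)  # memory, MB

feasible = ((G.sum(axis=0) <= A).all()       # (27b): instance count budget
            and (G.sum(axis=1) <= 1).all()   # (27c): per-node instance limit
            and (G @ C_req <= C_cap).all()   # (27d): CPU capacity per node
            and (G @ B_req <= B_cap).all())  # (27e): memory capacity per node
print(feasible)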

IV-B Topology-aware and Individual-adaptive Microservice Deployment Scheme

Given the large number of microservices and nodes, optimally tackling the nonlinear 0-1 programming problem $\mathcal{P}2$ within a constrained timeframe is challenging. While the genetic algorithm (GA) offers a robust heuristic approach for combinatorial problems, its inherent limitations of slow convergence and suboptimal performance in specific scenarios necessitate refinement. To address these shortcomings, we propose a novel topology-aware and individual-adaptive microservice deployment (TAIA-MD) scheme. This scheme leverages the perceived network topology to design the deployment scheme. Moreover, it incorporates an individual-adaptive mechanism that enhances individual adaptability by strategically initializing a select group of "super individuals" within the GA population, thereby accelerating convergence and avoiding local optima. As described in Algorithm 1, the first part of our proposed TAIA-MD scheme encompasses four aspects, as follows:

  • Chromosome Encoding: We employ binary encoding, where each chromosome consists of $|\mathcal{I}|$ genes, representing the deployment of the $|\mathcal{I}|$ microservice types. The set of computing nodes available for deployment is $\mathcal{N}_{cmp}$. Each gene is a binary string of length $|\mathcal{N}_{cmp}|$, with a value of '1' or '0' indicating whether or not the microservice is deployed on the corresponding computing node. In addition, the number of '1's in each gene equals the number of instances of the corresponding microservice type.

  • Fitness Function: The fitness function measures the quality of an individual in solving $\mathcal{P}2$. To minimize the average communication delay $T$, the fitness function is defined as a sufficiently large number minus $T$. The calculation of $T$ is detailed in Algorithm 2.

  • Genetic Operators: Genetic operators generate new individuals through selection, crossover, and mutation (see the sketch after this list). For the selection operation, we implement the tournament method: two groups of individuals are randomly sampled from the population, each containing $Y_m$ individuals, and the fittest individual from each group is selected as a parent. Following selection, crossover operations are performed on the selected parents: each gene pair has a probability $Y_p$ of being exchanged, provided the chromosomes of both individuals remain feasible after the exchange. Subsequently, the offspring undergo mutation with probability $Y_q$, where bits are randomly flipped while maintaining chromosome feasibility. Selection, crossover, and mutation repeat until the offspring population reaches the desired population size $Y_n$.

  • Termination Conditions: The algorithm has a maximum iteration count of $Y_k$ to ensure timely problem-solving. Termination occurs upon reaching this threshold or when the best individual fitness shows no significant change for $Y_i$ consecutive iterations, conserving computational resources.
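The following minimal Python sketch (illustrative parameters; the fitness function is a dummy stand-in for Algorithm 2) shows the binary gene encoding, tournament selection, and feasibility-preserving uniform crossover described above.

import random

N_CMP, TYPES, INSTANCES = 10, 4, 2  # computing nodes, microservice types, copies

def random_gene():
    # one '1' per deployed instance of this microservice type
    gene = [0] * N_CMP
    for p in random.sample(range(N_CMP), INSTANCES):
        gene[p] = 1
    return gene

def fitness(chromosome):
    # dummy stand-in for "large constant minus T" (T comes from Algorithm 2)
    return 1e6 - sum(p for g in chromosome for p, bit in enumerate(g) if bit)

def tournament(population, group_size):
    # the fittest individual of a randomly sampled group becomes a parent
    return max(random.sample(population, group_size), key=fitness)

def uniform_crossover(pa, pb, p_swap):
    # swapping whole genes preserves each type's instance count (feasibility)
    ca, cb = [g[:] for g in pa], [g[:] for g in pb]
    for t in range(TYPES):
        if random.random() < p_swap:
            ca[t], cb[t] = cb[t], ca[t]
    return ca, cb

population = [[random_gene() for _ in range(TYPES)] for _ in range(20)]
parent_a, parent_b = tournament(population, 4), tournament(population, 4)
child_a, child_b = uniform_crossover(parent_a, parent_b, p_swap=0.5)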

Input: Service set $\mathcal{W}$; microservice set $\mathcal{M}$; node set $\mathcal{N}$; population size $Y_n$; number of individuals per group $Y_m$; gene crossover probability $Y_p$; gene mutation probability $Y_q$; maximum number of iterations $Y_k$; threshold $Y_i$ on consecutive non-improving iterations
Output: Optimal deployment scheme 𝒢best\mathcal{G}^{best}
1 Randomly initialize each individual $I$ in the population $\mathbb{P}$;
2 for $i \leftarrow 1$ to $Y_k$ do
3   foreach individual $I$ in $\mathbb{P}$ do
4     For the deployment scheme $\mathcal{G}$ of individual $I$, compute its delay $T$;
5     Update the fitness of individual $I$;
6   end foreach
7   Update the optimal scheme $\mathcal{G}^{best}$;
8   if the number of rounds without updating $\mathcal{G}^{best}$ exceeds $Y_i$ then
9     End the loop;
10  end if
11  Create a collection of child individuals $\mathbb{I}'$;
12  while $|\mathbb{I}'| < Y_n$ do
13    Randomly select two groups of individuals, each containing $Y_m$ individuals;
14    Select the fittest individual from each group, $I_A$ and $I_B$, as parents;
15    $I_A$ and $I_B$ undergo uniform crossover with probability $Y_p$, producing offspring $I'_A$ and $I'_B$;
16    Each gene of $I'_A$ and $I'_B$ undergoes mutation with probability $Y_q$;
17    Add $I'_A$ and $I'_B$ to $\mathbb{I}'$;
18  end while
19  Replace the population $\mathbb{P}$ with $\mathbb{I}'$;
20 end for
Algorithm 1 Topology-aware genetic algorithm
Input: Microservice deployment scheme $\mathcal{G}$; service set $\mathcal{W}$; microservice set $\mathcal{M}$; node set $\mathcal{N}$; service weight set $\Theta$
Output: System average communication delay TT
1 Compute the propagation delay $\widetilde{V}_{p,q}$ between any two nodes;
2 foreach service $W_k$ in $\mathcal{W}$ do
3   Compute $F_{M_{i,a},M_{j,b}}^{k}$ and $R_{M_{i,a},M_{j,b}}^{k}$ between any two instances during one execution of $W_k$;
4 end foreach
5 Compute the direct traffic $S_{p,q}$ between any two nodes;
6 Compute the total traffic $\widetilde{S}_{x,y}$ between any adjacent nodes;
7 Compute the actual available bandwidth $Z'_{x,y}$ between any adjacent nodes;
8 Compute the minimum available bandwidth $Z_{p,q}^{min}$ between any two nodes;
9 Initialize the system average communication delay $T$;
10 foreach service $W_k$ in $\mathcal{W}$ do
11   Initialize the average communication delay $T_k$ of service $W_k$;
12   foreach invocation pair $M_i, M_j$ during the execution of service $W_k$ do
13     Denote the sets of nodes hosting $M_i$ and $M_j$ as $\mathcal{N}^i$ and $\mathcal{N}^j$, respectively;
14     foreach $N_p \in \mathcal{N}^i$ do
15       foreach $N_q \in \mathcal{N}^j$ do
16         Compute $T_{M_{i,a},M_{j,b}}$ and $T'_{M_{i,a},M_{j,b}}$ between the corresponding instances on the two nodes;
17         $T_k \mathrel{+}= F_{M_{i,a},M_{j,b}}^{k} T_{M_{i,a},M_{j,b}} + R_{M_{i,a},M_{j,b}}^{k} T'_{M_{i,a},M_{j,b}}$;
18       end foreach
19     end foreach
20   end foreach
21   $T \mathrel{+}= \theta_k T_k$;
22 end foreach
Algorithm 2 Microservice Deployment Scheme Evaluation Algorithm

Moreover, we implement an individual-adaptive mechanism. This mechanism generates a select few "super individuals" during population initialization and endows them with adaptive capabilities to accelerate algorithm convergence. Algorithm 3 presents a low-complexity greedy-based deployment algorithm that fine-tunes the deployment strategies of these super individuals. By altering the traversal order of services in line 3 of Algorithm 3, we can generate diverse deployment schemes, ensuring the genetic diversity of super individuals. To prevent the algorithm from getting trapped in local optima, we strictly limit the number of super individuals. Coupled with the tournament selection scheme, non-super individuals still have a significant chance of being selected as parents.

IV-C Computational Complexity Analysis

We first denote the maximum number of instances per microservice type as $X$. Then the complexity of randomly generating a deployment scheme in the initialization phase is $\mathcal{O}(X|\mathcal{M}|)$. In Algorithm 3, the complexity of deploying each microservice instance based on the invocation relationships is $\mathcal{O}(X|\mathcal{M}||\mathcal{N}|)$. As illustrated in Algorithm 1, each iteration involves generating a new population and evaluating its fitness. The complexities of the selection, crossover, and mutation operations are $\mathcal{O}(Y_m)$, $\mathcal{O}(|\mathcal{M}||\mathcal{N}|)$, and $\mathcal{O}(1)$, respectively, resulting in a complexity of $\mathcal{O}(Y_m + |\mathcal{M}||\mathcal{N}|)$ for producing a new individual. The fitness evaluation in Algorithm 2 considers node-to-node forwarding paths and microservice instance invocation relationships, leading to a complexity of $\mathcal{O}(X^2|\mathcal{M}|^2)$. Algorithm 1 may require up to $Y_k$ iterations, each producing $Y_n$ individuals and evaluating their fitness. Compared to the iterative process, the initialization phase's complexity is negligible. As a result, the proposed TAIA-MD scheme has an overall computational complexity of $\mathcal{O}\big(Y_k Y_n (Y_m + X^2|\mathcal{M}|^2)\big)$, significantly improving solving efficiency compared to the exponential complexity $\mathcal{O}\big((X|\mathcal{M}|)^{|\mathcal{N}|}\big)$ of the enumeration algorithm.

Input: Service set $\mathcal{W}$; microservice set $\mathcal{M}$; node set $\mathcal{N}$.
Output: Microservice deployment scheme $\mathcal{G}$.
1 Initialize the microservice deployment scheme $\mathcal{G}$;
2 Compute the communication delay between any two nodes $N_p$ and $N_q$ in $\mathcal{N}$;
3 foreach $W_k$ in $\mathcal{W}$ do
4   if the first microservice $M_i$ of $W_k$ is not deployed then
5     Compute the average delay from each computing node to all user-access nodes;
6     Deploy instances of $M_i$ to the $|\mathcal{A}_i|$ nodes with the lowest average delay that meet the resource requirements of $M_i$;
7     Update the remaining resources of the corresponding computing nodes;
8   end if
9   foreach $M_i$ required by $W_k$ do
10    if $M_i$ is not deployed then
11      Denote the set of nodes hosting the frontend $M_j$ of $M_i$ as $\mathcal{N}'$;
12      Compute the average delay from each computing node to the nodes in $\mathcal{N}'$;
13      Deploy instances of $M_i$ to the $|\mathcal{A}_i|$ nodes with the lowest average delay that meet the resource requirements of $M_i$;
14      Update the remaining resources of the corresponding computing nodes;
15    end if
16  end foreach
17 end foreach
Algorithm 3 Greedy-based Microservice Deployment Algorithm
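For concreteness, a minimal sketch of the greedy placement idea in Algorithm 3 follows: each undeployed microservice is placed on the $|\mathcal{A}_i|$ resource-feasible nodes with the lowest average delay to the nodes hosting its frontend (the user-access nodes for the first microservice). All names, delays, and resource figures are illustrative assumptions.

def greedy_place(ms_order, n_inst, delay, nodes, cpu_free, cpu_need, frontends):
    # ms_order: microservices of one service in invocation order;
    # frontends[m]: reference nodes already hosting the caller of m.
    placement = {}
    for m in ms_order:
        if m in placement:
            continue
        refs = frontends[m]
        candidates = [p for p in nodes if cpu_free[p] >= cpu_need[m]]
        # rank feasible nodes by average delay to the reference nodes
        candidates.sort(key=lambda p: sum(delay[p][q] for q in refs) / len(refs))
        chosen = candidates[:n_inst[m]]
        for p in chosen:
            cpu_free[p] -= cpu_need[m]  # update remaining resources
        placement[m] = chosen
    return placement

# Illustrative usage: place two instances of 'M1' near access node 'A1'.
delay = {'N1': {'A1': 1.0}, 'N2': {'A1': 2.0}, 'N3': {'A1': 0.5}}
print(greedy_place(['M1'], {'M1': 2}, delay, ['N1', 'N2', 'N3'],
                   {'N1': 1.0, 'N2': 1.0, 'N3': 1.0}, {'M1': 0.5},
                   {'M1': ['A1']}))  # -> {'M1': ['N3', 'N1']}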

V Performance Evaluation

To comprehensively validate the superior performance of the proposed TAIA-MD scheme, we have conducted extensive simulations focusing on bandwidth resources, service arrival rates, computing resources, and network topology. In addition, to further demonstrate the effectiveness and practicality of the TAIA-MD scheme, we have carried out an actual deployment on our previously developed microservice physical verification platform, DMSA [46]. These evaluations collectively confirm the robustness and adaptability of the TAIA-MD scheme in both simulated and real-world environments. Notably, due to access restrictions on deployment permissions on commercial platforms such as Amazon and Netflix, our current performance evaluation, like most studies, is independent of actual commercial parameters.

V-A Simulation Parameter Settings

In our simulations, we consider 50 types of microservices, each with input and output data sizes ranging from 10 KB to 100 KB, requiring 0.1 to 0.3 CPU units and 100 MB to 300 MB of memory per instance, and typically deployed in 3 to 5 instances. There are 10 service types with 5~8 microservice invocations each and an arrival rate $\lambda^k$ of 30~50. The edge network topology comprises 50 nodes, including 35 computing nodes, 10 routing nodes, and 5 user-access nodes. Each link has a propagation delay of 10 μs and a bandwidth of 100 Mbps. Microservice instances are deployed on computing nodes, user requests arrive via user-access nodes, and routing nodes handle data forwarding. Each computing node has 8 CPU cores and 8000 MB of memory. Notably, the simulation parameters are randomly generated within reasonable ranges with reference to the works [47, 31, 22] as well as the characteristics of edge environments.

Furthermore, different network topologies can significantly impact the performance of edge networks; for example, links shared by multiple routing paths may suffer bandwidth shortages. To quantify this, we introduce the concept of link forwarding load (LFL), which indicates how often a link serves as part of a forwarding path in the network topology.

Definition 1.

Link forwarding load (LFL): For any given node $N_x$ and its adjacent node $N_y$, the link forwarding load between them is $O_{x,y} = \sum_{p=1}^{|\mathcal{N}|} \sum_{q=1}^{|\mathcal{N}|} I_{x,y}^{p,q}$. Let $\mathbb{E}$ denote the set of edges in the network topology; then the average link forwarding load of the network topology can be represented as

$O_{avg} = \frac{\sum_{x=1}^{|\mathcal{N}|} \sum_{y=1}^{|\mathcal{N}|} O_{x,y}}{|\mathbb{E}|}. \quad (28)$
Figure 6: Example of LFLs for two different topologies.

Fig. 6 depicts the LFLs across two different topologies, where edge weights signify LFLs calculated per Definition 1. In Topology 1, all adjacent node pairs have LFLs of 12, resulting in an average LFL of 12. In Topology 2, links such as $E_{2,4}$ and $E_{4,7}$ bear heavier routing duties, leading to higher LFLs. Although both topologies have identical node and edge counts, Topology 2 exhibits a higher average LFL, making it more prone to bandwidth bottlenecks. In our simulations' default topology, the connections between nodes in the edge network are configured based on the defined LFL concept; by adjusting the LFL, we can simulate various network topologies. The default topology used in our simulations is shown in Fig. 7.
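A minimal sketch of Definition 1 and Eq. (28): count how often each link appears as a hop across the all-pairs routing paths, then average over the edge set. The paths and edges below are illustrative.

from collections import Counter

# Hypothetical all-pairs routing paths, each a list of hops <N_x, N_y>.
paths = {('N1', 'N3'): [('N1', 'N2'), ('N2', 'N3')],
         ('N1', 'N4'): [('N1', 'N2'), ('N2', 'N4')],
         ('N3', 'N4'): [('N3', 'N2'), ('N2', 'N4')]}

O = Counter(hop for hops in paths.values() for hop in hops)  # O_{x,y}
edges = {('N1', 'N2'), ('N2', 'N3'), ('N2', 'N4'), ('N3', 'N2')}
O_avg = sum(O.values()) / len(edges)  # Eq. (28)
print(dict(O), O_avg)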

Figure 7: Default topology used in the simulations.

V-B The Setup of Physical Verification Platform DMSA

DMSA is a decentralized MSA platform; its biggest difference from traditional centralized MSAs is that it sinks the scheduling function from the control plane to the edge nodes [46]. In particular, DMSA redesigns three core modules for microservice discovery, monitoring, and scheduling, which respectively achieve precise awareness of instance deployments, low monitoring overhead and measurement error, and accurate dynamic scheduling. However, despite its comprehensive scheduling mechanism, DMSA has not yet integrated microservice deployment optimization [46]. A photograph of the real-world physical verification platform DMSA we constructed is shown in Fig. 8 (a).

In this paper, we practically deploy the TAIA-MD scheme on the DMSA platform and verify its effectiveness and practicality. In particular, we exploit 6 Raspberry Pi 4Bs, 5 Orange Pi 5Bs, and 6 Gigabit Ethernet switches to construct the edge network topology. This topology includes 17 nodes, namely 6 Gigabit Ethernet switches as communication nodes, 3 Raspberry Pi 4Bs as user-access nodes, and 9 computing nodes (4 Raspberry Pi 4Bs and 5 Orange Pi 5Bs). We design three typical services: video services, download services, and graphic-text services, implemented by 10 microservice instances. Notably, the three user-access nodes generate requests for the three types of services at specific arrival rates with loads evenly distributed. Video segments range from 1 to 3 MB with a maximum wait time of 10 seconds. Graphic-text pages are 0.5 to 1 MB with the same wait time, while download files are 10 to 20 MB with a 100-second wait time. If a request times out, the service execution is marked as failed. Each test lasts 40 minutes. To comprehensively evaluate the network performance of the TAIA-MD scheme deployed on the DMSA platform, we design two network emergencies, as shown in Fig. 8 (b). Link 1 between Switch 1 and Node 3 is disconnected at the 10th minute and restored at the 15th minute to simulate a computing node suddenly going offline. Link 3 between Switch 3 and Switch 5 is restricted to 100 Mbps at the 30th minute and restored to 1 Gbps at the 35th minute to emulate network fluctuations. As shown in Table I, we also test the performance of TAIA-MD under high, medium, and low load conditions on the DMSA platform.

(a) Practical platform of DMSA
(b) Logical topology of DMSA
Figure 8: The real-world physical verification platform of DMSA, (a) practical platform of DMSA, and (b) logical topology of DMSA.
TABLE I: Arrival Rates of Different Services under Various Loads.
Load Condition | Video Service | Download Service | Graphic-text Service
High Load      | 5             | 1                | 10
Medium Load    | 3             | 0.6              | 6
Low Load       | 1.5           | 0.3              | 3
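To illustrate how the test traffic in Table I could be generated, the following sketch draws Poisson arrivals at the tabulated rates over a 40-minute run and marks a request as failed when its response exceeds the service's maximum wait time. The time unit (assumed here to be requests per second) is not stated in the text, and the response-time model is a placeholder, not the platform's measured behavior.

```python
# Sketch of the test-traffic model: Poisson arrivals at the Table I rates
# (high-load column; units assumed to be requests/s), with timeout marking.
import random

RATES_HIGH = {"video": 5.0, "download": 1.0, "graphic_text": 10.0}   # Table I
MAX_WAIT = {"video": 10.0, "download": 100.0, "graphic_text": 10.0}  # seconds
RUN_SECONDS = 40 * 60                                                # 40-minute test

def poisson_arrivals(rate, horizon, rng):
    """Exponential inter-arrival times yield a Poisson arrival process."""
    t = 0.0
    while True:
        t += rng.expovariate(rate)
        if t > horizon:
            return
        yield t

rng = random.Random(42)
for service, rate in RATES_HIGH.items():
    arrivals = list(poisson_arrivals(rate, RUN_SECONDS, rng))
    # Placeholder response time (mean 1 s); the real value comes from the platform.
    failed = sum(1 for _ in arrivals if rng.expovariate(1.0) > MAX_WAIT[service])
    print(f"{service}: {len(arrivals)} requests, {failed} timed out (toy model)")
```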

V-C Comparison Schemes in Extensive Simulations

To comprehensively validate the effectiveness of the proposed TAIA-MD scheme, we compare it with four different baseline schemes in simulations, as follows:

  • Random Deployment Scheme (Random) [48]: Microservice instances are deployed randomly on nodes that meet the resource requirements, yielding the deployment scheme (a minimal sketch of this selection logic appears after this list).

  • Greedy-based Deployment Scheme (Greedy) [31]: This scheme considers the complex dependencies among microservices, modeled as a chain structure, but ignores the network topology. When deploying each instance, it considers only the previous and next microservices in the invocation chain, selecting the best-performing node after evaluating all candidate nodes.

  • GA-based Deployment Scheme (GA) [49]: This scheme takes into account the dependencies between microservices and the bandwidth capacity of edge nodes, but overlooks the network topology. Since it also uses a GA, comparing it with our proposed TAIA-MD scheme highlights the performance gains from incorporating topology awareness.

  • Non-Individual-Adaptive TAIA-MD Scheme ((w/o IA) TAIA-MD): This variant of the proposed TAIA-MD scheme omits the individual-adaptive mechanism, so as to validate the effectiveness of that mechanism in accelerating convergence and avoiding local optima.
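As referenced in the first item above, the following minimal sketch illustrates the selection logic of the Random baseline: each instance is placed uniformly at random on a node whose remaining resources can host it. The two-dimensional CPU/memory resource model and all identifiers are illustrative, not the cited scheme's actual implementation.

```python
# Sketch of the Random baseline's node selection (simplified resource model).
import random

def random_deployment(instances, nodes, rng=None):
    """instances: {name: (cpu, mem)}; nodes: {name: [cpu_cap, mem_cap]} (mutated)."""
    rng = rng or random.Random(0)
    placement = {}
    for inst, (cpu, mem) in instances.items():
        feasible = [n for n, cap in nodes.items() if cap[0] >= cpu and cap[1] >= mem]
        if not feasible:
            raise RuntimeError(f"no node can host {inst}")
        choice = rng.choice(feasible)   # uniform choice among feasible nodes
        nodes[choice][0] -= cpu         # consume the node's remaining capacity
        nodes[choice][1] -= mem
        placement[inst] = choice
    return placement

print(random_deployment({"ms1": (1, 256), "ms2": (2, 512)},
                        {"n1": [2, 1024], "n2": [4, 2048]}))
```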

V-D Extensive Simulation Results

As illustrated in Fig. 9, we analyze the convergence performance of the traditional GA, (w/o IA) TAIA-MD, and TAIA-MD schemes. The traditional GA converges quickly with few iterations, primarily because it overlooks the edge network topology and individual adaptability; with fewer influencing factors to consider, it requires fewer iterations to converge. Conversely, (w/o IA) TAIA-MD, which accounts for network topology and bandwidth contention, is more sensitive to variations in the deployment scheme and thus requires a longer optimization period. Addressing these issues, our proposed TAIA-MD scheme generates relatively superior individuals and gene fragments during genetic initialization, facilitating rapid convergence. Numerical results show that, compared to the (w/o IA) TAIA-MD scheme, the proposed TAIA-MD reduces the number of iterations by approximately 50%, typically converging within 300 iterations. The TAIA-MD scheme thus tackles the tardy convergence of (w/o IA) TAIA-MD and significantly enhances algorithm efficiency. Notably, although the traditional GA converges slightly faster, its deployment schemes are far inferior to those of (w/o IA) TAIA-MD and TAIA-MD, as detailed later.
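For intuition, the following skeleton shows where the individual-adaptive mechanism plugs into a standard GA loop: part of the initial population is seeded with superior, topology-aware individuals rather than fully random chromosomes. This is only a sketch of the idea; the construction in Algorithm 3 is not reproduced here, and all parameters (population size, generation count, mutation rate) are illustrative.

```python
# GA skeleton with individual-adaptive initialization (sketch only).
# Individuals are equal-length lists of genes (e.g., node indices per instance),
# assumed to have at least two genes; `fitness` maps an individual to its
# weighted communication delay (lower is better).
import random

def evolve(fitness, seed_individuals, random_individual,
           pop_size=50, generations=300, p_mut=0.1, rng=random.Random(1)):
    # Individual-adaptive initialization: seed with topology-aware individuals,
    # then top up the population with random ones.
    pop = list(seed_individuals)[:pop_size]
    pop += [random_individual(rng) for _ in range(pop_size - len(pop))]
    for _ in range(generations):
        pop.sort(key=fitness)                 # elitist selection: keep best half
        survivors = pop[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, len(a))    # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < p_mut:          # point mutation at a random gene
                i = rng.randrange(len(child))
                child = child[:i] + [random_individual(rng)[i]] + child[i + 1:]
            children.append(child)
        pop = survivors + children
    return min(pop, key=fitness)
```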

Figure 9: Comparison of convergence performance among the proposed TAIA-MD scheme and baseline schemes.

As depicted in Fig. 10, we compare the performance of TAIA-MD against the baseline schemes across different link bandwidths. In low-bandwidth scenarios, where the traffic on certain links exceeds the available bandwidth, the Random, Greedy, and GA schemes experience severe congestion and significant delays. In contrast, (w/o IA) TAIA-MD and TAIA-MD reduce the delay by over 90% and effectively prevent congestion through optimized microservice deployment. As the bandwidth scales up to 100 Mbps, the congestion and delay of the Random, Greedy, and GA schemes are significantly alleviated. Nonetheless, the (w/o IA) TAIA-MD and TAIA-MD schemes still demonstrate remarkable performance advantages, primarily because they account for the impact of microservice deployment on available bandwidth by incorporating network topology and individual adaptation. Compared to Greedy and GA, (w/o IA) TAIA-MD reduces delay by 73.1% and 39.3%, respectively. As the link bandwidth increases above 150 Mbps, the available bandwidth far exceeds the system requirement: the actual available bandwidth between nodes approaches the link bandwidth, and the impact of the deployment scheme on available bandwidth gradually weakens. Except for the Random scheme, the performance gap between the other schemes and (w/o IA) TAIA-MD and TAIA-MD narrows to within 20%. Moreover, across diverse link bandwidths, TAIA-MD improves delay performance by about 5% and convergence performance by approximately 50% compared to the (w/o IA) TAIA-MD scheme, demonstrating the effectiveness of the individual-adaptive mechanism, i.e., Algorithm 3.

Figure 10: Performance comparison between TAIA-MD and baseline schemes under different bandwidths.

Fig. 11 illustrates the performance of the five schemes across diverse service arrival rates. At lower service arrival rates, i.e., under low-load conditions, the performance differences among the schemes (except random deployment) are marginal; even so, (w/o IA) TAIA-MD and TAIA-MD consistently perform best. As the service arrival rates increase, the bandwidth requirements of the system grow accordingly, and the advantages of (w/o IA) TAIA-MD and TAIA-MD over the Random, Greedy, and GA schemes become increasingly pronounced, primarily because they account for the impact of microservice deployments on available bandwidth. As the service arrival rates increase, the burden borne by the links of the edge network intensifies, heightening network complexity. The proposed TAIA-MD scheme, which integrates topology awareness and individual adaptability, can therefore fully exploit its advantages in such dynamic networks.

Figure 11: Performance comparison between TAIA-MD and baseline schemes under different service arrival rates.

Given sufficient memory resources, we adjust the computational resources on each node to evaluate the performance of the algorithms. As shown in Fig. 12, limited CPU resources constrain the flexibility of microservice deployment options, which hampers the crossover and mutation process for GA and (w/o IA) TAIA-MD. In this case, Greedy and TAIA-MD, the latter with individual adaptability, demonstrate superior performance. As computational resources increase, the flexibility of microservice deployment expands, enhancing the performance of GA and (w/o IA) TAIA-MD. When each node has more than four CPU units, the resources exceed the deployment requirements, and the performance of all schemes no longer changes significantly. This suggests that dynamically adjusting computational resources based on the deployment results can reduce resource overhead.

To thoroughly investigate the impact of network topology on microservice deployment performance, we analyze the delay performance of TAIA-MD and the baseline schemes under different average link forwarding loads, as shown in Fig. 13. As the average link forwarding load increases, competition for bandwidth resources between microservices intensifies, leading to higher system delay. Under high average link forwarding loads, (w/o IA) TAIA-MD and TAIA-MD exhibit significantly better delay performance than Random, GA, and Greedy, demonstrating that they effectively mitigate bandwidth collisions in edge networks. Furthermore, although (w/o IA) TAIA-MD and TAIA-MD achieve similar delay performance, Fig. 9 has already demonstrated TAIA-MD's superior convergence performance.

Figure 12: Performance comparison between TAIA-MD and baseline schemes under different computational resources.
Figure 13: Performance comparison between TAIA-MD and baseline schemes under different network topologies.
Figure 14: Performance improvement percentage between TAIA-MD and baseline schemes under different (a) bandwidths, (b) service arrival rates, (c) computational resources, and (d) network topologies.

Assume the delays of Scheme A and Scheme B are $T_A$ and $T_B$, respectively. Then, the performance improvement percentage of Scheme A over Scheme B is $\frac{T_B-T_A}{T_B}\times 100\%$. As shown in Fig. 14, we present the performance improvement percentage of TAIA-MD over the baseline schemes under varied bandwidths, service arrival rates, computational resources, and average link forwarding loads. In most cases, considering the effects of topology and traffic on available bandwidth when computing the deployment scheme effectively reduces the system delay. Assuming fixed link bandwidth, TAIA-MD decreases system delay by approximately 30% to 60% compared to the other baseline schemes. Remarkably, in scenarios with low bandwidth, high service arrival rates, and complex topologies, TAIA-MD effectively prevents link congestion, providing substantial performance improvements. Furthermore, as illustrated in Fig. 14(c), TAIA-MD generally delivers a performance gain of about 5% over (w/o IA) TAIA-MD. Importantly, with limited CPU resources, TAIA-MD performs markedly better than (w/o IA) TAIA-MD, underscoring the strength of its topology-aware and individual-adaptive mechanisms for edge networks.
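For concreteness, the metric can be computed with a one-line helper; the 200 ms and 120 ms figures in the worked example below are illustrative.

```python
# Improvement metric from the text: if a baseline takes 200 ms and TAIA-MD
# takes 120 ms, the improvement is (200 - 120) / 200 = 40%.
def improvement(t_a: float, t_b: float) -> float:
    """Percentage improvement of scheme A (delay t_a) over scheme B (delay t_b)."""
    return (t_b - t_a) / t_b * 100.0

assert abs(improvement(120.0, 200.0) - 40.0) < 1e-9
```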

V-E DMSA Platform Validation Results

For latency-sensitive services such as video and graphic-text services, we focus on the average delay performance of the TAIA-MD scheme deployed on the DMSA platform. As shown in Fig. 15, the average delay under various load conditions reveals that the DMSA platform with the TAIA-MD scheme (i.e., the (DMSA, TAIA-MD) scheme) significantly outperforms the standalone DMSA platform. Specifically, for graphic-text services, the average delay improvement of the (DMSA, TAIA-MD) scheme over the standalone DMSA platform is substantial, ranging from 31.13% under low load to 38.51% under high load. For video services, the average delay improvement of the (DMSA, TAIA-MD) scheme under low, medium, and high loads is 25.20%, 26.48%, and 37.92%, respectively. Evidently, as the load increases, the delay advantage of the (DMSA, TAIA-MD) scheme becomes increasingly significant. In addition, the average delay of the standalone DMSA platform deteriorates with increasing load for both graphic-text and video services, while the average delay of the (DMSA, TAIA-MD) scheme is less affected. This is primarily attributed to the microservice deployment optimization of the (DMSA, TAIA-MD) scheme, which significantly reduces the overhead of instance scheduling on the DMSA platform and thus achieves lower average delay.

Figure 15: The average response delay of the TAIA-MD scheme for (a) graphic-text and (b) video services across different load conditions on the DMSA platform.

As illustrated in Fig. 16, we present the dynamic variations in delay performance over run time under various load conditions, taking video services as an example. Fig. 16(a), (b), and (c) demonstrate that the (DMSA, TAIA-MD) scheme consistently outperforms the standalone DMSA platform under different loads. In particular, during the 10-15 min interval, a link disruption occurs between Switch 1 and Node 3, causing a sharp increase in delay for both the (DMSA, TAIA-MD) scheme and DMSA. However, the delay performance quickly recovers from the link disconnection, and the (DMSA, TAIA-MD) scheme restores network performance more rapidly. This is primarily because the TAIA-MD scheme optimizes instance deployment on top of the DMSA platform's effective sensing of network load and edge node status, and dynamically adjusts the scheduling strategies. During the 30-35 min interval, the link bandwidth between Switch 3 and Switch 5 fluctuates and drops sharply to 100 Mbps. During this period, user requests are still routed through the constrained link, resulting in a significant delay increase for both the (DMSA, TAIA-MD) scheme and the DMSA platform. However, the delay performance recovers shortly after; the (DMSA, TAIA-MD) scheme is less affected by the fluctuation and recovers faster. This demonstrates the enhanced robustness and adaptability of the DMSA platform with the TAIA-MD scheme in handling network emergencies.

Figure 16: Dynamic response delay of video services under (a) low, (b) medium, and (c) high load conditions on the DMSA platform with the TAIA-MD scheme.

Next, we analyze the robustness of the (DMSA, TAIA-MD) scheme and the DMSA platform. As shown in Fig. 17(a), (b), and (c), we examine the average service execution success rates of graphic-text, video, and file download services under various load conditions. Compared to the standalone DMSA platform, the (DMSA, TAIA-MD) scheme improves the overall service execution success rates in edge networks and significantly enhances robustness. Fig. 17(a)-(c) demonstrate that the performance advantages of the (DMSA, TAIA-MD) scheme become more pronounced as service data volume and load increase. For graphic-text and video services, the (DMSA, TAIA-MD) scheme stably maintains service execution success rates above 98% and 97%, respectively. For large file downloads, the improvement in service execution success rate with the (DMSA, TAIA-MD) scheme over the standalone DMSA platform grows notably more significant as the load increases.

Figure 17: The impact of the TAIA-MD scheme on the service execution success rates of (a) graphic-text, (b) video, and (c) file download services on the DMSA platform.

VI CONCLUSION AND FUTURE OUTLOOK

In this paper, we have introduced an innovative microservice deployment architecture that integrates a three-tier network traffic model encompassing the service, microservice, and edge node layers. This traffic model meticulously characterizes the complex dependencies between edge network topology and microservices, and maps microservice deployments onto link traffic to accurately estimate communication delay. On this basis, we have formulated a weighted sum communication delay optimization problem to minimize communication delay by optimizing microservice deployment. To solve this problem effectively, we have proposed a novel deployment scheme called TAIA-MD, which accurately senses the network topology and incorporates an individual-adaptive mechanism into the GA to accelerate convergence and avoid local optima. Extensive simulations show that, in bandwidth-constrained edge networks with complex topologies, TAIA-MD improves delay performance by approximately 30% to 60% compared to existing deployment schemes. Moreover, through real-world deployment on the DMSA platform, we have demonstrated the robustness of the TAIA-MD scheme in withstanding link failures and network fluctuations and validated its practicality for MSA in edge networks.

This study still has limitations. In particular, our analysis of the microservice deployment problem focuses on static routing in edge networks. In practice, however, multiple reachable paths may exist between nodes, and changes in the routing scheme can alter link loads and consequently affect the communication performance of services. In future work, we therefore intend to explore dynamic routing to develop more comprehensive link traffic analysis models, and to treat routing strategies as decision variables for the joint optimization of routing and microservice deployment.

References

  • [1] Y. Chen et al., “Performance optimization in RSMA-assisted uplink xURLLC IIoT networks with statistical QoS provisioning,” arXiv preprint arXiv:2405.16471, 2024.
  • [2] A. Hannousse and S. Yahiouche, “Securing microservices and microservice architectures: A systematic mapping study,” Computer Science Review, vol. 41, p. 100415, 2021.
  • [3] T. Wu et al., “A brief overview of ChatGPT: The history, status quo and potential future development,” IEEE/CAA J. Autom. Sinica, vol. 10, no. 5, pp. 1122–1136, 2023.
  • [4] Y. Chen et al., “Statistical QoS provisioning analysis and performance optimization in xURLLC-enabled massive MU-MIMO networks: A stochastic network calculus perspective,” IEEE Trans. Wireless Commun., pp. 1–1, 2024.
  • [5] Y. Chen et al., “When xURLLC meets NOMA: A stochastic network calculus perspective,” IEEE Commun. Mag., vol. 62, no. 6, pp. 90–96, 2024.
  • [6] A. K. Sandhu, “Big data with cloud computing: Discussions and challenges,” Big Data Mining and Analytics, vol. 5, no. 1, pp. 32–40, 2021.
  • [7] F. Jauro et al., “Deep learning architectures in emerging cloud computing architectures: Recent development, challenges and next research trend,” Applied Soft Computing, vol. 96, p. 106582, 2020.
  • [8] L. Qin et al., “Joint transmission and resource optimization in NOMA-assisted IoVT with mobile edge computing,” IEEE Trans. Veh. Technol., pp. 1–16, 2024.
  • [9] L. Qin et al., “Energy-efficient blockchain-enabled user-centric mobile edge computing,” IEEE Trans. Cogn. Commun. Netw., pp. 1–1, 2024.
  • [10] L. Qin et al., “Towards decentralized task offloading and resource allocation in user-centric MEC,” IEEE Trans. Mobile Comput., pp. 1–17, 2024.
  • [11] M. S. Elbamby et al., “Wireless edge computing with latency and reliability guarantees,” Proc. IEEE, vol. 107, no. 8, pp. 1717–1737, 2019.
  • [12] W. Zhang et al., “Energy-efficient workload allocation and computation resource configuration in distributed cloud/edge computing systems with stochastic workloads,” IEEE J. Sel. Areas Commun., vol. 38, no. 6, pp. 1118–1132, 2020.
  • [13] S. Pallewatta, V. Kostakos, and R. Buyya, “Placement of microservices-based IoT applications in Fog computing: A taxonomy and future directions,” ACM Computing Surveys, vol. 55, no. 14s, pp. 1–43, 2023.
  • [14] L. Wang et al., “Microservice-oriented service placement for mobile edge computing in sustainable internet of vehicles,” IEEE Trans. Intell. Transp. Syst., vol. 24, no. 9, pp. 10012–10026, 2023.
  • [15] Y. Mansouri and M. A. Babar, “A review of edge computing: Features and resource virtualization,” J. Parallel Distrib. Comput., vol. 150, pp. 155–183, 2021.
  • [16] F. Xu et al., “Managing performance overhead of virtual machines in cloud computing: A survey, state of the art, and future directions,” Proc. IEEE, vol. 102, no. 1, pp. 11–31, 2013.
  • [17] B. Wan et al., “Modeling analysis and cost-performance ratio optimization of virtual machine scheduling in cloud computing,” IEEE Trans. Parallel Distrib. Syst., vol. 31, no. 7, pp. 1518–1532, 2020.
  • [18] C. Pahl and P. Jamshidi, “Microservices: A systematic mapping study.” CLOSER (1), pp. 137–146, 2016.
  • [19] N. Cruz Coulson et al., “Adaptive microservice scaling for elastic applications,” IEEE Internet Things J., vol. 7, no. 5, pp. 4195–4202, 2020.
  • [20] K. Fu et al., “Adaptive resource efficient microservice deployment in cloud-edge continuum,” IEEE Trans. Parallel Distrib. Syst., vol. 33, no. 8, pp. 1825–1840, 2022.
  • [21] S. Luo et al., “An in-depth study of microservice call graph and runtime performance,” IEEE Trans. Parallel Distrib. Syst., vol. 33, no. 12, pp. 3901–3914, 2022.
  • [22] H. Zhao et al., “Distributed redundant placement for microservice-based applications at the edge,” IEEE Trans. Serv. Comput., vol. 15, no. 3, pp. 1732–1745, 2022.
  • [23] L. Gu et al., “Layer-aware collaborative microservice deployment toward maximal edge throughput,” in IEEE INFOCOM 2022-IEEE Conference on Computer Communications.   IEEE, 2022, pp. 71–79.
  • [24] D. Zeng, et al., “Layered structure aware dependent microservice placement toward cost efficient edge clouds,” in IEEE INFOCOM 2023-IEEE Conference on Computer Communications.   IEEE, 2023, pp. 1–9.
  • [25] G. Yu et al., “Microscaler: Cost-effective scaling for microservice applications in the cloud with an online learning approach,” IEEE Trans. Cloud Comput., vol. 10, no. 2, pp. 1100–1116, 2022.
  • [26] Y. Wang, H. Kadiyala, and J. Rubin, “Promises and challenges of microservices: an exploratory study,” Empirical Software Engineering, vol. 26, no. 4, p. 63, 2021.
  • [27] I. K. Aksakalli, T. Çelik, A. B. Can, and B. Tekinerdoğan, “Deployment and communication patterns in microservice architectures: A systematic literature review,” Journal of Systems and Software, vol. 180, p. 111014, 2021.
  • [28] “AWS Microservice Extractor for .NET,” https://aws.amazon.com/tw/microservice-extractor/, accessed on November 30, 2024.
  • [29] Z. Li et al., “Noah: Reinforcement-learning-based rate limiter for microservices in large-scale e-commerce services,” IEEE Trans. Neural Netw. Learn. Syst., vol. 34, no. 9, pp. 5403–5417, 2023.
  • [30] S. Pallewatta et al., “QoS-aware placement of microservices-based iot applications in Fog computing environments,” Future Gener. Comput. Syst., vol. 131, pp. 121–136, 2022.
  • [31] X. He et al., “Online deployment algorithms for microservice systems with complex dependencies,” IEEE Trans. Cloud Comput., vol. 11, no. 2, pp. 1746–1763, 2023.
  • [32] W. Lv et al., “Graph-reinforcement-learning-based dependency-aware microservice deployment in edge computing,” IEEE Internet Things J., vol. 11, no. 1, pp. 1604–1615, 2024.
  • [33] A. Samanta and J. Tang, “Dyme: Dynamic microservice scheduling in edge computing enabled IoT,” IEEE Internet of Things J., vol. 7, no. 7, pp. 6164–6174, 2020.
  • [34] X. Chen et al., “Dynamic service migration and request routing for microservice in multicell mobile-edge computing,” IEEE Internet of Things J., vol. 9, no. 15, pp. 13126–13143, 2022.
  • [35] J. Zhu et al., “QoS-aware co-scheduling for distributed long-running applications on shared clusters,” IEEE Trans. Parallel Distrib. Syst., vol. 33, no. 12, pp. 4818–4834, 2022.
  • [36] N. Bugshan et al., “Intrusion detection-based ensemble learning and microservices for zero touch networks,” IEEE Commun. Mag., vol. 61, no. 6, pp. 86–92, 2023.
  • [37] R. Heinrich et al., “Performance engineering for microservices: research challenges and directions,” in Proceedings of the 8th ACM/SPEC on international conference on performance engineering companion, 2017, pp. 223–226.
  • [38] L. Carvalho et al., “On the usefulness of automatically generated microservice architectures,” IEEE Trans. Software Eng., vol. 50, no. 3, pp. 651–667, 2024.
  • [39] J. Shi et al., “Adaptive QoS-aware microservice deployment with excessive loads via intra- and inter-datacenter scheduling,” IEEE Trans. Parallel Distrib. Syst., vol. 35, no. 9, pp. 1565–1582, 2024.
  • [40] M. Adeppady et al., “Reducing microservices interference and deployment time in resource-constrained cloud systems,” IEEE Trans. Netw. Serv. Manag., vol. 20, no. 3, pp. 3135–3147, 2023.
  • [41] X. Li et al., “Topology-aware scheduling framework for microservice applications in cloud,” IEEE Trans. Parallel Distrib. Syst., vol. 34, no. 5, pp. 1635–1649, 2023.
  • [42] R. Xie et al., “Towards minimum latency in cloud-native applications via service-characteristic-aware microservice deployment,” in 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), 2024, pp. 35–46.
  • [43] K. Fu et al., “QoS-aware and resource efficient microservice deployment in cloud-edge continuum,” in 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2021, pp. 932–941.
  • [44] F. Zhang et al., “Network-aware reliability modeling and optimization for microservice placement,” 2024. [Online]. Available: https://arxiv.org/abs/2405.18001
  • [45] Y. Liu et al., “How to share: Balancing layer and chain sharing in industrial microservice deployment,” IEEE Trans. Services Comput., vol. 16, no. 4, pp. 2685–2698, 2023.
  • [46] Y. Chen, C. Lu, Y. Huang, C. Wu, F. Guo, H. Lu, and C. W. Chen, “DMSA: A decentralized microservice architecture for edge networks,” 2025. [Online]. Available: https://arxiv.org/abs/2501.00883
  • [47] D. Alencar et al., “Dynamic microservice allocation for virtual reality distribution with qoe support,” IEEE Trans. Network Serv. Manag., vol. 19, no. 1, pp. 729–740, 2022.
  • [48] K. Peng et al., “Joint optimization of service deployment and request routing for microservices in mobile edge computing,” IEEE Trans. Services Comput., vol. 17, no. 3, pp. 1016–1028, 2024.
  • [49] L. Chen et al., “IoT microservice deployment in edge-cloud hybrid environment using reinforcement learning,” IEEE Internet of Things J., vol. 8, no. 16, pp. 12610–12622, 2021.
Yuang Chen (Graduate Student Member, IEEE) received the B.S. degree from the Hefei University of Technology (HFUT), Hefei, China, in 2021. He is currently pursuing the master’s degree with the Department of Electronic Engineering and Information Science, University of Science and Technology of China (USTC), Hefei, China. His research interests include 5G/6G wireless network technologies, particularly analyzing key performance indicators such as reliability, latency, jitter, and AoI of URLLC services using stochastic network calculus theory, extreme value theory, and large deviation theory.
Chang Wu received the B.S. degree from Dalian Maritime University (DLMU), Dalian, China, in 2021. He is currently pursuing the Ph.D. degree in communication and information systems with the Department of Electronic Engineering and Information Science, University of Science and Technology of China (USTC), Hefei, China. His research interests include 5G/6G wireless network technologies such as traffic engineering, radio resource management and multiple access networks.
Fangyu Zhang received the B.S. degree from the University of Science and Technology of China (USTC), Hefei, China, in 2020. He is currently pursuing the Ph.D. degree with the Department of Electronic Engineering and Information Science, USTC, Hefei, China. His research interests include machine learning, network function virtualization, and network resource allocation in heterogeneous networks.
Chengdi Lu received the B.S. degree from the Department of Electronic Engineering and Information Science, USTC, in July 2020. He is currently working toward the Ph.D. degree with the same department. His research interests include data center networks and programmable switches.
Yongsheng Huang received the B.S. degree in information engineering from the University of Electronic Science and Technology of China (UESTC), Chengdu, China, in 2021. He received the master’s degree from the Department of Electronic Engineering and Information Science, University of Science and Technology of China (USTC), Hefei, China. His research interests include microservices.
Hancheng Lu (Senior Member, IEEE) received the Ph.D. degree in communication and information systems from the University of Science and Technology of China (USTC), Hefei, China, in 2005. He is currently a tenured professor with the Department of Electronic Engineering and Information Science, USTC. He is also with the Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China. He has extensive research experience in multimedia communication, wireless edge networks, future network architectures and protocols, and machine learning algorithms for network communication, covering scheduling, resource management, routing, and transmission. In the past five years, he has published more than 80 papers in top journals such as IEEE Transactions and flagship conferences such as IEEE INFOCOM, and has won the Best Paper Award of IEEE GLOBECOM 2021 as well as the Best Paper Awards of WCSP 2019 and WCSP 2016. He currently serves on the editorial boards of several journals, including IEEE Internet of Things Journal, China Communications, and IET Communications.