
Decentralized Network Topology Design for Task Offloading in Mobile Edge Computing

Ke Ma Department of Electrical and Computer Engineering
University of California, San Diego and San Diego State University
La Jolla, USA
[email protected]
   Junfei Xie Department of Electrical and Computer Engineering
San Diego State University
San Diego, USA
[email protected]
Abstract

The rise of delay-sensitive yet computing-intensive Internet of Things (IoT) applications poses challenges due to the limited processing power of IoT devices. Mobile Edge Computing (MEC) offers a promising solution to address these challenges by placing computing servers close to end users. Despite extensive research on MEC, optimizing network topology to improve computational efficiency remains underexplored. Recognizing the critical role of network topology, we introduce a novel decentralized network topology design strategy for task offloading (DNTD-TO) that jointly considers topology design and task allocation. Inspired by communication and sensor networks, DNTD-TO efficiently constructs three-layered network structures for task offloading and generates optimal task allocations for these structures. Comparisons with existing topology design methods demonstrate the promising performance of our approach.

Index Terms:
MEC, Task offloading, Network topology design

I Introduction

With the advancement of Internet of Things (IoT) technology, many delay-sensitive yet computing-intensive applications have emerged, such as autonomous driving, face recognition, and virtual reality [1]. However, IoT devices typically have limited computing power, making it difficult to meet the demands of these tasks. Centralized cloud computing is traditionally used to process such tasks, but offloading them to the remote cloud can cause significant transmission delays that degrade the user Quality of Service (QoS). To address this issue, Mobile Edge Computing (MEC) was introduced [2], which places servers closer to end users at the network edge, such as at base stations, to reduce transmission delays.

To better serve end users, extensive research has been conducted in the field of MEC, addressing various design aspects such as system deployment, task offloading, resource allocation, mobility management, and privacy and security [3]. Nevertheless, little attention has been given to optimizing network topology to enhance computational efficiency. Most existing studies primarily focus on offloading tasks from users to one or more nearby MEC servers within communication range [4, 5]. A few studies [6, 7] have explored offloading tasks to servers multiple hops away, but these designs did not consider the impact of network topology. In our prior work [8], we proposed a multi-layered task offloading framework and demonstrated that computational efficiency can be improved by leveraging layered network structures. We also showed that computational efficiency is influenced not only by the computing and communication characteristics of the servers but also by their network topology. In this paper, we aim to further investigate the joint design of layered network structure and task allocation.

Layered structures have been widely used in communication and practical sensor networks due to energy efficiency and network management simplicity [9]. In these networks, a base station is typically present alongside several clusters of sensors or communication devices, with each cluster comprising a Cluster Head (CH) and multiple Cluster Members (CMs). The CH collects data from its CMs and transmits it to the base station. Methods for selecting CHs and CMs can be broadly categorized into two types: cluster-based [10, 11, 12, 13] and grid-based [14, 15, 16, 17]. In cluster-based methods, CHs are selected directly based on certain criteria. In contrast, grid-based methods first divide the network into grids, and then select CHs within each grid. Despite the widespread use of layered structures in communication and sensor networks, their design for task offloading remains underdeveloped.

In this paper, we introduce a decentralized network topology design strategy for task offloading (DNTD-TO). We explore three-layered network structures, similar to those commonly used in communication and sensor networks, to facilitate computing. In this setup, tasks are offloaded from the root node (referred to as master) to servers in the second layer (CHs), which then distribute the tasks to their child nodes in the third layer (CMs). To select CHs and CMs, our strategy iterates through two nested phases: a local cluster formation phase, where each server within master’s communication range selects CMs in a decentralized manner, and a cluster selection phase where the master selects CHs and their associated CMs. The selection of CHs and CMs is based on servers’ task processing capacities and their potential to enhance computational efficiency. Optimal task allocation is integrated into every step of the selection process.

The rest of the paper is organized as follows. Sec. II covers system modeling and problem formulation. Our approach and simulation studies are presented in Sec. III and Sec. IV, respectively. Sec. V concludes the paper.

II System Modeling and Problem Formulation

Consider an MEC system with $N$ servers scattered in an open area, each communicating wirelessly only with nearby servers within a uniform communication range $\xi$. Suppose one of the servers, referred to as the master (e.g., a server located at or near a base station), receives computation task processing requests from end users with a total task size of $Y$. To process tasks efficiently, the master allocates tasks, which are assumed to be arbitrarily decomposable, to its neighbors, referred to as CHs. The CHs then decide whether to further offload these tasks to their own neighbors, referred to as CMs. Since simply offloading tasks to all neighbors may not yield optimal performance, this study investigates the joint optimization of network topology and task allocation.

II-A System Modeling

The MEC system is modeled as a graph $G=\{\mathcal{V},\mathcal{E}\}$, where $\mathcal{V}=\{0,1,\ldots,N-1\}$ denotes the set of servers, with the master server labeled as 0, and $\mathcal{E}$ represents the set of edges. An edge between two servers $i$ and $j$ is established when they are within each other's communication range, meaning their Euclidean distance, denoted as $d_{ij}$, satisfies $d_{ij}\leq\xi$. The connectivity is described by the adjacency matrix $A$, where each element $a_{ij}$ equals 1 if there is an edge between server $i$ and server $j$, and 0 otherwise.

Additionally, define $\mathcal{N}_{i}=\{j\,|\,a_{ij}=1,\forall j\in\mathcal{V}\}$, where $i\in\mathcal{V}$, as the set of neighbors of server $i$ that are within its communication range. By strategically selecting CHs from the set $\mathcal{N}_{0}$ to receive tasks offloaded from the master, and assigning CMs from the set $\mathcal{N}_{i}$ to each CH $i$ to handle tasks offloaded from CH $i$, the master, CHs, and CMs form a three-layer tree topology, denoted as $\mathcal{T}$. We assume that each CM can belong to only one CH.
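For illustration, the graph model above maps directly to code. The following is a minimal sketch (the function name and array layout are our own assumptions, not from the paper) that builds $A$, the distances $d_{ij}$, and the neighbor sets $\mathcal{N}_{i}$ from server coordinates:

```python
import numpy as np

def build_topology(positions: np.ndarray, xi: float):
    """Return adjacency matrix A, distance matrix, and neighbor sets N_i.

    positions: (N, 2) array of server coordinates; row 0 is the master.
    xi: uniform communication range.
    """
    dist = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    A = ((dist <= xi) & (dist > 0)).astype(int)   # a_ij = 1 iff 0 < d_ij <= xi
    neighbors = {i: set(np.flatnonzero(A[i])) for i in range(len(positions))}
    return A, dist, neighbors
```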

To describe data transmission between two servers, we adopt the model in [18]. Specifically, suppose the servers communicate using the Orthogonal Frequency Division Multiplexing protocol [19], where the bandwidth is equally divided when a server transmits data simultaneously to its connected servers via orthogonal channels. For simplicity of analysis, we assume the channels are free from interference. If a server $i$ (e.g., a CH) transmits data to its $m$ neighboring servers (e.g., associated CMs) simultaneously, the data transmission rate for the link between server $i$ and its $j$-th neighbor is given by $R_{ij}=\frac{B}{m}\log_{2}(1+\frac{\beta_{ij}}{d_{ij}^{2}})$, where $B$ (MHz) denotes the total bandwidth, and $\beta_{ij}$ represents the signal-to-noise ratio (SNR).

The computation model from [19] is used in this study to describe the task processing time. Let $b$ represent the number of CPU cycles required to compute 1 bit of data, and $f_{i}$ (MHz) denote the computation capacity of server $i$. Given a task of size $y$ (Gbits), the time taken by server $i$ to process the task is $T_{i}^{comp}=y\gamma_{i}$, where $\gamma_{i}=\frac{b}{f_{i}}$ (s/Gbit) indicates the time taken by server $i$ to process one Gbit of data.
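Both delay models are direct one-liners in code; the sketch below (parameter names are ours) is reused by the later examples:

```python
import numpy as np

def link_rate(B, m, beta, d):
    """R_ij = (B/m) log2(1 + beta_ij / d_ij^2): the sender splits its
    bandwidth B equally over m simultaneous orthogonal links."""
    return (B / m) * np.log2(1.0 + beta / d**2)

def compute_time(y, b, f):
    """T_i^comp = y * gamma_i, with gamma_i = b / f_i the per-unit time."""
    return y * (b / f)
```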

II-B Problem Formulation

In this subsection, we present the mathematical formulation of the problem.

The master selects a set of CHs from the set $\mathcal{N}_{0}$ for task offloading. To describe the selection of CHs, we introduce binary decision variables $o_{i}$, where $i\in\mathcal{N}_{0}$. The variable $o_{i}$ equals 1 if server $i$ is selected as a CH, and 0 otherwise.

After receiving tasks from the master, the CHs select their CMs for further task offloading. To describe the selection of CMs, we introduce decision variables $\boldsymbol{x}_{i}=[x_{i1},x_{i2},\ldots,x_{i|\mathcal{N}_{i}|}]$, where $i\in\mathcal{N}_{0}$ and $|\mathcal{N}_{i}|$ denotes the cardinality of the set $\mathcal{N}_{i}$. $\boldsymbol{x}_{i}\in\mathbb{R}^{|\mathcal{N}_{i}|}$ is a binary vector whose $j$-th element, $x_{ij}$, equals 1 if server $j$ is selected to join cluster $i$, and 0 otherwise. The resulting cluster with CH $i\in\mathcal{N}_{0}$, denoted as $C_{i}$, is given by $C_{i}=\{j\,|\,x_{ij}=1,\forall j\in\mathcal{N}_{i}\}$.

Additionally, we introduce continuous decision variables $y_{i}\in\mathbb{R}_{\geq 0}$ to represent the size of the tasks offloaded to server $i$, where $i\in\mathcal{V}$. In the case when $i=0$, $y_{0}$ indicates the size of the tasks processed at the master.

The time required for each server $i$ to receive its assigned tasks can then be expressed as:

T_{i}^{tran}=\begin{cases}o_{i}y_{i}\frac{1}{R_{0i}},&\text{if }i\in\mathcal{N}_{0}\\ o_{j}x_{ji}y_{i}\left(\frac{1}{R_{ji}}+\frac{1}{R_{0j}}\right),&\text{if }i\in\mathcal{N}_{j},\ j\in\mathcal{N}_{0}\\ 0,&\text{otherwise}\end{cases} \quad (1)

Notably, if server $i$ is not selected for task offloading, the associated transmission delay is 0.

The total time required for each server $i$ to receive and process the assigned computation tasks is given by $\mathcal{J}_{i}=T_{i}^{tran}+T_{i}^{comp}$, $i\in\mathcal{V}$, where $T_{i}^{comp}=y_{i}\gamma_{i}$. Here, we assume the generated results are small in size, making the transmission delay for sending them back to the master negligible, as is often assumed in existing studies (e.g., [20]). The total task completion time can then be written as $\mathcal{J}=\max_{i\in\mathcal{V}}\mathcal{J}_{i}$.
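To make the objective concrete, here is a small evaluator (a sketch under our own naming conventions; it reads $y_{i}$ as the share server $i$ processes itself, matching the per-server delays in (1)):

```python
def completion_time(y, clusters, R0, R, gamma):
    """J = max_i (T_i^tran + T_i^comp) over a three-layer tree.

    y:        {server: task share it processes itself}; y[0] is the master's
    clusters: {ch: [cm, ...]} for the selected CHs
    R0[i]:    rate of the master->i link; R[ch][cm]: rate of the ch->cm link
              (both assumed to already account for bandwidth sharing)
    """
    J = y[0] * gamma[0]                                   # master: local only
    for ch, cms in clusters.items():
        J = max(J, y[ch] / R0[ch] + y[ch] * gamma[ch])    # one-hop CH
        for cm in cms:                                    # two-hop CM: both links
            J = max(J, y[cm] * (1.0 / R[ch][cm] + 1.0 / R0[ch]) + y[cm] * gamma[cm])
    return J
```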

The objective of this study is to minimize the total task completion time by jointly optimizing the selection of CHs and CMs, and the task allocation, which is formulated as follows:

\mathcal{P}_{\text{main}}:\quad \min_{\{y_{i}\}_{i\in\mathcal{V}},\,\{\boldsymbol{x}_{i},o_{i}\}_{i\in\mathcal{N}_{0}}}\ \mathcal{J}
s.t.\quad o_{i}\in\{0,1\},\ \forall i\in\mathcal{N}_{0} \quad (C1)
\qquad\ x_{ij}\in\{0,1\},\ \forall i\in\mathcal{N}_{0},\ j\in\mathcal{N}_{i} \quad (C2)
\qquad\ 0\leq y_{i}\leq Y,\ \forall i\in\mathcal{V} \quad (C3)
\qquad\ C_{i}\cap C_{j}=\emptyset,\ \forall i,j\in\mathcal{N}_{0},\ i\neq j \quad (C4)
\qquad\ y_{i}=\sum_{j\in\mathcal{N}_{i}}x_{ij}y_{j},\ i\in\mathcal{N}_{0} \quad (C5)
\qquad\ y_{0}+\sum_{i\in\mathcal{N}_{0}}o_{i}y_{i}=Y \quad (C6)

Constraints (C1)-(C3) ensure that the decision variables take valid values. Constraint (C4) ensures that clusters do not overlap. Constraints (C5)-(C6) maintain the integrity of task sizes. It should be noted that solving this problem directly is challenging due to the nonlinear constraint (C4).

III Decentralized Network Topology Design for Task Offloading (DNTD-TO)

In this section, we introduce our approach, DNTD-TO, for solving problem $\mathcal{P}_{\text{main}}$. This method consists of two nested phases: (1) the Local Cluster Formation (LCF) phase, and (2) the Cluster Selection phase.

III-A Local Cluster Formation (LCF)

In this phase, each server $i\in\mathcal{N}_{0}$ within the master's communication range selects its CMs in a decentralized manner, forming a candidate cluster with itself as the CH. These candidate clusters will then be examined by the master in the cluster selection phase, as detailed in the next subsection. Notably, this phase allows a server to join multiple clusters.

To select CMs, each server $i\in\mathcal{N}_{0}$ employs a forward selection mechanism, picking CMs from its neighbors $j\in\mathcal{N}_{i}$ one by one until either no further performance gain can be achieved or all neighbors have been evaluated. In each iteration, the CH adds the neighboring server that maximally reduces the task processing time, assuming tasks of any size $y$ are assigned to server $i$. This is done by identifying the neighboring server with the highest processing capacity, where capacity is measured by the time to receive and process a unit-sized task (the smaller this time, the higher the capacity), and determining whether its addition would reduce the overall task processing time. This is challenging, as a neighboring server's processing capacity depends on the number of servers in the cluster due to shared bandwidth. Additionally, determining the time savings from adding a server requires solving an optimization problem for task allocation. In the following, we first derive the processing capacity, denoted as $\alpha_{i}$.

Initially, the cluster $C_{i}$, with server $i\in\mathcal{N}_{0}$ as the CH, is empty. Therefore, when a neighboring server is added to the cluster, this server can utilize the entire bandwidth resource. Then, for each server $j\in\mathcal{F}=\mathcal{N}_{i}$ within $i$'s communication range, its processing capacity $\alpha_{j}$ can be derived as $\alpha_{j}=\gamma_{j}+\frac{1}{B\log_{2}(1+\beta_{ij}d_{ij}^{-2})}$, $j\in\mathcal{F}$, where $\mathcal{F}$ denotes the set of neighboring servers that have not been added to the cluster. Nevertheless, in subsequent iterations, as the cluster $C_{i}$ becomes nonempty, a neighboring server $j\in\mathcal{F}$ can no longer utilize the entire bandwidth resource, and its processing capacity $\alpha_{j}$ is given by:

\alpha_{j}=\gamma_{j}+\frac{|C_{i}|+1}{B\log_{2}(1+\beta_{ij}d_{ij}^{-2})},\ j\in\mathcal{F} \quad (2)

Notably, the processing capacity of the CH $i$ is always $\alpha_{i}=\gamma_{i}$ due to local computing.

To determine whether adding a neighboring server $j$ would reduce the task processing time, we define a performance indicator $\mathtt{I}$ that compares the minimum task processing time before and after the server is added. Specifically, consider cluster $C_{i}$ at the $k$-th iteration. For a task $y$ assigned to server $i\in\mathcal{N}_{0}$, the minimum task processing time and the optimal task allocation can be derived by solving the following optimization problem:

\mathcal{P}_{1}:\quad \min_{y_{l},\forall l\in C_{i}\cup\{i\}} J
\qquad s.t.\quad \sum_{l\in C_{i}\cup\{i\}}y_{l}=y

where $J=\max_{l\in C_{i}\cup\{i\}}J_{l}$ is the task processing time and $J_{l}=y_{l}\alpha_{l}$ is the time required for server $l$ to receive and process its assigned task $y_{l}$. The performance indicator is then defined as $\mathtt{I}=\frac{J^{*}_{C_{i}}}{J^{*}_{C_{i}\cup\{j\}}}$, where $J^{*}_{C_{i}}$ and $J^{*}_{C_{i}\cup\{j\}}$ represent the minimum task processing times obtained before and after adding server $j\in\mathcal{F}$ to cluster $C_{i}$, respectively. Therefore, if $\mathtt{I}>1$, adding the server to the cluster will improve performance; otherwise, it will not.

To derive the formula for $\mathtt{I}$, we solve the above optimization problem, which leads to the following lemma, with the proof provided in the Appendix.

Lemma 1.

Consider problem $\mathcal{P}_{1}$. The optimal task allocation, denoted as $y^{*}_{l}$, $\forall l\in C_{i}\cup\{i\}$, satisfies $\alpha_{i}y^{*}_{i}=\alpha_{l}y^{*}_{l}$, $\forall l\neq i$, $l\in C_{i}$, when $C_{i}\neq\emptyset$. Consequently, the minimum task processing time is given by $J^{*}=\alpha_{i}y^{*}_{i}=\alpha_{l}y^{*}_{l}$, $l\in C_{i}$. In the special case where $C_{i}=\emptyset$, we have $J^{*}=\alpha_{i}y$.
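Although not stated explicitly in the paper, Lemma 1 together with the constraint of $\mathcal{P}_{1}$ yields the allocation in closed form:

$y_{l}^{*}=\frac{y/\alpha_{l}}{\sum_{m\in C_{i}\cup\{i\}}1/\alpha_{m}},\qquad J^{*}=\frac{y}{\sum_{m\in C_{i}\cup\{i\}}1/\alpha_{m}}$

As a quick numeric check with illustrative capacities $\alpha_{i}=1$, $\alpha_{1}=2$, $\alpha_{2}=4$ (s/Gbit) and $y=7$ Gbits: the harmonic sum is $7/4$, giving $y^{*}_{i}=4$, $y^{*}_{1}=2$, $y^{*}_{2}=1$ Gbits and $J^{*}=4$ s, so all three servers finish at exactly the same time.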

From the above lemma, we can derive that $J^{*}_{C_{i}}=y_{l}^{*}\alpha_{l}$ and $J^{*}_{C_{i}\cup\{j\}}=\bar{y}_{l}^{*}\bar{\alpha}_{l}=\bar{y}_{j}^{*}\bar{\alpha}_{j}$ for $l\in C_{i}\cup\{i\}$. Here, $\alpha_{l}$ and $\bar{\alpha}_{l}$ represent the processing capacities of server $l$ before and after adding server $j$ to cluster $C_{i}$, respectively, while $y^{*}_{l}$ and $\bar{y}^{*}_{l}$ denote the corresponding optimal task allocations. Therefore, we have $\mathtt{I}=\frac{y_{l}^{*}\alpha_{l}}{\bar{y}_{l}^{*}\bar{\alpha}_{l}}$. In this equation, the processing capacities $\alpha_{l}$ and $\bar{\alpha}_{l}$ can be readily computed using (2). Specifically, if $l\in C_{i}$, $\alpha_{l}=\gamma_{l}+\frac{|C_{i}|}{B\log_{2}(1+\beta_{il}d_{il}^{-2})}$ and $\bar{\alpha}_{l}=\gamma_{l}+\frac{|C_{i}|+1}{B\log_{2}(1+\beta_{il}d_{il}^{-2})}$; otherwise, if $l=i$, $\alpha_{i}=\bar{\alpha}_{i}=\gamma_{i}$. However, determining the optimal task allocations $y_{l}^{*}$ and $\bar{y}_{l}^{*}$ by solving $\mathcal{P}_{1}$ at each iteration is time-consuming. To address this issue, we introduce an iterative method to efficiently compute the values of $\mathtt{I}$, $y_{l}^{*}$ and $\bar{y}_{l}^{*}$.

Initially, $C_{i}$ is empty. Hence, we can easily obtain that $y_{i}^{*}=y$ before adding server $j$, and $\bar{y}^{*}_{i}=\frac{\bar{\alpha}_{j}y}{\bar{\alpha}_{i}+\bar{\alpha}_{j}}$, $\bar{y}^{*}_{j}=\frac{\bar{\alpha}_{i}y}{\bar{\alpha}_{i}+\bar{\alpha}_{j}}$ after adding server $j$, based on Lemma 1. Thus, $\mathtt{I}=\frac{\bar{\alpha}_{i}+\bar{\alpha}_{j}}{\bar{\alpha}_{i}}$. In subsequent iterations, to derive $\mathtt{I}$, we note the existence of the following relationships:

\mathtt{I}=\frac{y_{l}^{*}\alpha_{l}}{\bar{y}_{l}^{*}\bar{\alpha}_{l}},\ l\in C_{i}\cup\{i\} \quad (3a)
\bar{y}_{j}^{*}+\sum_{l\in C_{i}\cup\{i\}}\bar{y}_{l}^{*}=y \quad (3b)
\sum_{l\in C_{i}\cup\{i\}}y_{l}^{*}=y \quad (3c)

which yields

\bar{y}_{j}^{*}=\sum_{l\in C_{i}\cup\{i\}}\left(1-\frac{\alpha_{l}}{\mathtt{I}\bar{\alpha}_{l}}\right)y_{l}^{*} \quad (4)

Since $\bar{y}_{j}^{*}\bar{\alpha}_{j}=\bar{y}_{l}^{*}\bar{\alpha}_{l}$, $\mathtt{I}=\frac{y_{l}^{*}\alpha_{l}}{\bar{y}_{l}^{*}\bar{\alpha}_{l}}$, and $y_{l}^{*}\alpha_{l}=y_{i}^{*}\alpha_{i}$, we have

\bar{y}_{j}^{*}=\frac{y_{l}^{*}\alpha_{l}}{\bar{\alpha}_{j}\mathtt{I}}=\frac{y_{i}^{*}\alpha_{i}}{\bar{\alpha}_{j}\mathtt{I}} \quad (5)

Combining (3c), (4), and (5), we can derive that

\mathtt{I}=\begin{cases}\frac{\bar{\alpha}_{i}+\bar{\alpha}_{j}}{\bar{\alpha}_{i}},&\text{if }k=0\\ \frac{\bar{\alpha}_{j}\sum_{l\in C_{i}\cup\{i\}}\frac{y_{l}^{*}\alpha_{l}}{\bar{\alpha}_{l}}+\alpha_{i}y_{i}^{*}}{\bar{\alpha}_{j}y},&\text{if }k\geq 1\end{cases} \quad (6)

Since $y^{*}_{l}$ was obtained in the previous iteration, i.e., $y^{*(k)}_{l}=\bar{y}^{*(k-1)}_{l}$, where the superscript $(k)$ indicates the iteration index, (6) can be readily computed. Moreover, once $\mathtt{I}$ is obtained, we can derive $\bar{y}_{l}^{*}$ by $\bar{y}_{l}^{*}=\frac{y_{l}^{*}\alpha_{l}}{\bar{\alpha}_{l}\mathtt{I}}$, $\forall l\in C_{i}\cup\{i\}$, and $\bar{y}_{j}^{*}$ can be obtained by (4). Algorithm 1 summarizes the procedure.

Algorithm 1 LCF($i,\mathcal{N}_{i},y$)
1:  $C_{i}\leftarrow\emptyset$, $\mathcal{F}\leftarrow\mathcal{N}_{i}$, $\alpha_{i}\leftarrow\gamma_{i}$, $\bar{\alpha}_{i}\leftarrow\gamma_{i}$, $y^{*}_{i}\leftarrow y$;
2:  for $k=0$ to $|\mathcal{N}_{i}|-1$ do
3:     Compute $\bar{\alpha}_{l}$ using (2), $\forall l\in\mathcal{F}$;
4:     $j\leftarrow\operatorname{arg\,min}_{l\in\mathcal{F}}\bar{\alpha}_{l}$;
5:     Compute $\mathtt{I}$ by (6);
6:     if $\mathtt{I}>1$ then
7:        Compute $\bar{y}_{j}^{*}$ by (4);
8:        $y_{l}^{*}\leftarrow\frac{y_{l}^{*}\alpha_{l}}{\bar{\alpha}_{l}\mathtt{I}}$, $\forall l\in C_{i}\cup\{i\}$; $y_{j}^{*}\leftarrow\bar{y}_{j}^{*}$;
9:        $C_{i}\leftarrow C_{i}\cup\{j\}$; $\mathcal{F}\leftarrow\mathcal{F}\setminus\{j\}$;
10:        $\alpha_{l}\leftarrow\bar{\alpha}_{l}$, $\forall l\in C_{i}$;
11:     else
12:        Break;
13:     end if
14:  end for
15:  return $C_{i}$, $\{y^{*}_{l}\}_{l\in C_{i}\cup\{i\}}$
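For concreteness, the following is a minimal Python sketch of Algorithm 1. It is not the paper's implementation: for clarity it scores each candidate cluster with the closed-form optimum of $\mathcal{P}_{1}$ from Lemma 1 ($J^{*}=y/\sum_{l}1/\alpha_{l}$) rather than the incremental indicator update (6), which is equivalent but more efficient; the data structures (dense gamma, beta, dist arrays) are our assumptions.

```python
import numpy as np

def capacities(i, members, gamma, B, beta, dist):
    """alpha_l per Eq. (2): the CH i computes locally (alpha_i = gamma_i);
    each CM shares the CH's bandwidth B over |C_i| simultaneous links."""
    a = {i: gamma[i]}
    for l in members:
        a[l] = gamma[l] + len(members) / (B * np.log2(1 + beta[i][l] / dist[i][l] ** 2))
    return a

def lcf(i, candidates, y, gamma, B, beta, dist):
    """Greedy forward selection of CMs for CH i; returns (C_i, optimal shares)."""
    C = []
    J_best = y * gamma[i]                       # CH processes everything alone
    for _ in range(len(candidates)):
        rest = [l for l in candidates if l not in C]
        if not rest:
            break
        # highest-capacity candidate = smallest alpha after it would join
        j = min(rest, key=lambda l: gamma[l] + (len(C) + 1)
                / (B * np.log2(1 + beta[i][l] / dist[i][l] ** 2)))
        a_new = capacities(i, C + [j], gamma, B, beta, dist)
        J_new = y / sum(1.0 / v for v in a_new.values())   # Lemma 1 closed form
        if J_new < J_best:                      # equivalent to indicator I > 1
            C.append(j)
            J_best = J_new
        else:
            break
    a = capacities(i, C, gamma, B, beta, dist)
    return C, {l: J_best / a[l] for l in a}     # alpha_l * y_l^* = J^*
```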
Remark 1.

Inspired by Algorithm 1, we can solve the optimization problem $\mathcal{P}_{1}$ efficiently using an iterative procedure, as outlined in Algorithm 2, which generates the optimal solution.

Algorithm 2 OptiSolver-$\mathcal{P}_{1}$($i,C_{i},y$)
1:  $\boldsymbol{y}^{*}\leftarrow\emptyset$, $k\leftarrow 0$;
2:  for $\forall j\in C_{i}$ do
3:     Compute $\bar{\alpha}_{j}$ and $\mathtt{I}$ using (2) and (6), respectively;
4:     Calculate $\bar{y}_{j}^{*}$ by (4);
5:     $y_{l}^{*}\leftarrow\frac{y_{l}^{*}\alpha_{l}}{\bar{\alpha}_{l}\mathtt{I}}$, $\forall l\in C_{i}\cup\{i\}$; $y_{j}^{*}\leftarrow\bar{y}_{j}^{*}$; $\boldsymbol{y}^{*}\leftarrow\boldsymbol{y}^{*}\cup\{y_{j}^{*}\}$;
6:     $C_{i}\leftarrow C_{i}\cup\{j\}$; $\alpha_{l}\leftarrow\bar{\alpha}_{l}$, $\forall l\in C_{i}$; $k\leftarrow k+1$;
7:  end for
8:  return $\boldsymbol{y}^{*}$

III-B Cluster Selection

In this phase, the master examines the candidate clusters formed in the LCF phase, selects the CHs and their associated CMs, and resolves any overlaps between clusters.

To select CHs, the master evaluates each server $i\in\mathcal{N}_{0}$ within its communication range, following a procedure similar to that of LCF. In particular, in each iteration, the master identifies the neighboring server $i\in\mathcal{N}_{0}$ with the highest processing capacity and adds it to the set of CHs, denoted as $C_{0}$, if doing so would reduce the task processing time. The iteration stops when no further reduction in processing time is achievable or when all neighboring servers of the master have been examined.

Unlike the processing capacity defined for individual servers in the LCF phase, the processing capacity of each candidate CH $i\in\mathcal{N}_{0}$ here is defined as the time required for it and its CMs, as a team, to receive and process a unit-sized task. Specifically, for server $i$, the time to receive a unit-sized task from the master is $\frac{1}{R_{0i}}$. Moreover, once server $i$ receives this task, according to Lemma 1, the minimum time required for it and its CMs to process it is $\frac{y_{i}^{*}\gamma_{i}}{y}$, where $y_{i}^{*}$ is the output of Algorithm 2. Therefore, the processing capacity of server $i\in\mathcal{N}_{0}$ in the cluster selection phase, denoted as $\eta_{i}$, is given by $\eta_{i}=\frac{y_{i}^{*}\gamma_{i}}{y}+\frac{1}{R_{0i}}$. Notably, the processing capacity of the master is $\eta_{0}=\gamma_{0}$.

Moreover, to assess whether adding server $i$ to the set of CHs, $C_{0}$, would improve performance, we use the same performance indicator $\mathtt{I}$, which can be computed similarly, as detailed in the previous subsection. In particular, at the $k$-th iteration, we have:

\mathtt{I}=\begin{cases}\frac{\bar{\eta}_{0}+\bar{\eta}_{i}}{\bar{\eta}_{0}},&\text{if }k=0\\ \frac{\bar{\eta}_{i}\sum_{l\in C_{0}\cup\{0\}}\frac{y_{l}^{*}\eta_{l}}{\bar{\eta}_{l}}+\eta_{0}y_{0}^{*}}{\bar{\eta}_{i}Y},&\text{if }k\geq 1\end{cases} \quad (7)

where $\eta_{l}$ and $\bar{\eta}_{l}$ denote the processing capacities of server $l\in C_{0}\cup\{0\}$ before and after adding server $i$ to the cluster $C_{0}$, and we have

\bar{\eta}_{l}=\frac{\bar{y}_{l}^{*}\gamma_{l}}{y}+\frac{|C_{0}|+1}{B\log_{2}(1+\beta_{0l}d_{0l}^{-2})} \quad (8)

when $l\in C_{0}\cup\{0\}$. $y^{*}_{l}$ and $\bar{y}^{*}_{l}$ denote the optimal task allocations before and after adding server $i$. Similarly, the updating equation for $\bar{y}_{l}^{*}$ is $\bar{y}_{l}^{*}=\frac{y_{l}^{*}\eta_{l}}{\bar{\eta}_{l}\mathtt{I}}$, $\forall l\in C_{0}\cup\{0\}$, and we have:

\bar{y}_{i}^{*}=\sum_{l\in C_{0}\cup\{0\}}\left(1-\frac{\eta_{l}}{\mathtt{I}\bar{\eta}_{l}}\right)y_{l}^{*} \quad (9)

To resolve any overlaps between clusters, the selected CH $i$ and its CMs are "removed" from the network at the end of each iteration and do not participate in subsequent iterations. At the start of each new iteration, the unselected neighbors of the master undergo the LCF phase to update their CMs, ensuring that clusters remain distinct and non-overlapping. Algorithm 3 outlines the core procedure of our approach, which constructs the three-layer tree topology $\mathcal{T}$ and determines the associated optimal task allocation $\{y^{*}_{l}\}_{l\in\mathcal{T}}$.

Remark 2.

The task allocation $\{y^{*}_{l}\}_{l\in\mathcal{T}}$ generated by our approach in Algorithm 3 is optimal for the topology $\mathcal{T}$.

Algorithm 3 DNTD-TO($Y$)
1:  $\mathcal{T}\leftarrow\{0\}$, $y\leftarrow Y$, $\mathcal{F}\leftarrow\mathcal{N}_{0}$, $\eta_{0}\leftarrow\gamma_{0}$, $\bar{\eta}_{0}\leftarrow\gamma_{0}$, $y_{0}^{*}\leftarrow Y$;
2:  for $k=0$ to $|\mathcal{N}_{0}|-1$ do
3:     $\{C_{l},\{y^{*}_{h}\}_{h\in C_{l}\cup\{l\}}\}\leftarrow$ LCF($l,\mathcal{N}_{l},y$), $\forall l\in\mathcal{F}$;
4:     Compute $\bar{\eta}_{l}$ using (8), $\forall l\in\mathcal{F}$;
5:     $i\leftarrow\operatorname{arg\,min}_{l\in\mathcal{F}}\bar{\eta}_{l}$;
6:     Compute $\mathtt{I}$ using (7);
7:     if $\mathtt{I}>1$ then
8:        Compute $\bar{y}_{i}^{*}$ by (9);
9:        $y_{l}^{*}\leftarrow\frac{y_{l}^{*}\eta_{l}}{\bar{\eta}_{l}\mathtt{I}}$, $\forall l\in C_{0}\cup\{0\}$; $y_{i}^{*}\leftarrow\bar{y}_{i}^{*}$;
10:        $\mathcal{T}\leftarrow\mathcal{T}\cup\{i\}\cup C_{i}$; $C_{0}\leftarrow C_{0}\cup\{i\}$; $\mathcal{F}\leftarrow\mathcal{F}\setminus\{i\}$;
11:        $\eta_{l}\leftarrow\bar{\eta}_{l}$, $\forall l\in C_{0}$;
12:        $\mathcal{N}_{l}\leftarrow\{h\,|\,a_{lh}=1,h\in\mathcal{V}\setminus\mathcal{T}\}$, $\forall l\in\mathcal{F}$;
13:     else
14:        Break;
15:     end if
16:  end for
17:  return $\mathcal{T}$, $\{y^{*}_{l}\}_{l\in\mathcal{T}}$
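Continuing the sketch above, the following is a compact (and again assumption-laden) rendering of the cluster-selection loop: it recomputes the team capacities $\eta_{l}$ of (8) from scratch rather than maintaining the incremental updates (7)-(9), and it re-runs lcf() on the pruned network each round so clusters stay disjoint.

```python
def cluster_eta(l, C_l, k, gamma, B, beta, dist):
    """eta_l per Eq. (8): per-unit time for CH l and its cluster C_l to
    receive a task (master's bandwidth shared over k links) and process it."""
    a = capacities(l, C_l, gamma, B, beta, dist)
    share_l = (1.0 / a[l]) / sum(1.0 / v for v in a.values())  # y_l^*/y, Lemma 1
    return share_l * gamma[l] + k / (B * np.log2(1 + beta[0][l] / dist[0][l] ** 2))

def dntd_to(Y, neighbors, gamma, B, beta, dist):
    """Greedy CH selection (Algorithm 3); returns ({ch: [cm, ...]}, J)."""
    clusters, removed = {}, {0}
    J_best = Y * gamma[0]                        # master computes everything
    while True:
        rest = [l for l in neighbors[0] if l not in removed]
        if not rest:
            break
        k = len(clusters) + 1                    # master splits B over k CH links
        # candidate clusters are re-formed on the pruned network each round
        cand = {l: lcf(l, [n for n in neighbors[l] if n not in removed],
                       1.0, gamma, B, beta, dist)[0] for l in rest}
        i = min(rest, key=lambda l: cluster_eta(l, cand[l], k, gamma, B, beta, dist))
        etas = {0: gamma[0], i: cluster_eta(i, cand[i], k, gamma, B, beta, dist)}
        etas.update({l: cluster_eta(l, C, k, gamma, B, beta, dist)
                     for l, C in clusters.items()})
        J_new = Y / sum(1.0 / v for v in etas.values())      # Lemma 1 at layer two
        if J_new < J_best:                       # indicator I > 1
            clusters[i] = cand[i]
            removed |= {i, *cand[i]}
            J_best = J_new
        else:
            break
    return clusters, J_best
```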

IV Simulation studies

In this section, we conduct simulation studies to evaluate the performance of our approach.

IV-A Experiment Setup

We randomly generate and distribute servers within a $100\times 100$ m area, with the exception of the master, which is fixed at the location $(20,20)$ m. The computing power $f_{i}$ of each server is sampled from a uniform distribution over $[0.1,10]$ MHz, with $b$ set to 1. The SNR $\beta_{ij}$ is sampled from a uniform distribution over $[30,40]$ dBm. The task size $Y$ is set to 100 Gbits, and the bandwidth $B$ is set to 50 MHz. To evaluate the performance of our method, we compare it with the following four benchmarks (a script instantiating this setup is sketched at the end of this subsection):

  • Unequal Cluster (Unequal) [14]: This method selects CHs by comparing servers' computing power; if two CHs are within each other's communication range, the server with higher computing power is selected as the CH. CMs join the nearest CH.

  • Leach-C [21]: A centralized approach that calculates each server’s probability of being a CH based on computing power (originally, energy level was used), with the top servers selected as CHs. CMs join the CH with the highest received signal strength (originally, minimal communication energy was used).

  • LBAS [22]: CHs are selected using an iterative approach that considers distance, number of neighbors, and computing power, with higher-scoring servers selected as CHs. A similar procedure is applied to select CMs.

  • Dijkstra's [8]: A tree topology is constructed by identifying the shortest routes with the least transmission delay from the master to each server using Dijkstra's algorithm. Servers more than two hops away are pruned.

Notably, these benchmarks only generate the tree topology $\mathcal{T}$. The associated optimal task allocation is derived using the same approach as ours.
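To make the setup concrete, here is a minimal script that generates one random instance and runs the DNTD-TO sketch from Sec. III. The seed, the symmetrization of $\beta_{ij}$, and the direct use of the sampled SNR values in the rate formula are our assumptions, not details stated in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N, xi, B, Y, b = 20, 50.0, 50.0, 100.0, 1.0

pos = rng.uniform(0.0, 100.0, size=(N, 2))
pos[0] = (20.0, 20.0)                          # master fixed at (20, 20) m
f = rng.uniform(0.1, 10.0, size=N)             # computing power f_i (MHz)
gamma = b / f                                  # per-unit processing times
beta = rng.uniform(30.0, 40.0, size=(N, N))    # SNR draws
beta = (beta + beta.T) / 2.0                   # make links symmetric

A, dist, neighbors = build_topology(pos, xi)
clusters, J = dntd_to(Y, neighbors, gamma, B, beta, dist)
print(f"CHs: {sorted(clusters)}; task completion time J = {J:.2f} s")
```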

IV-B Experiment Results

In the first experiment, we compare the performance of different approaches across networks of varying sizes and topologies. Specifically, we examine a small-scale network with $N=20$ servers and a large-scale network with $N=100$ servers. For each network size, we randomly generate 10 different topologies by varying server locations, with each server's communication range set to 50 m. As shown in Fig. 1, our approach outperforms all benchmarks in both small-scale (Fig. 1(a)) and large-scale networks (Fig. 1(b)). The performance advantage is more pronounced in large-scale networks, where more servers can potentially slow down processing if included without careful selection. Comparing the two figures, we can see that the task completion time generally decreases as the number of servers increases, since more servers participate in the computation.

In the second experiment, we compare the performance of different approaches under varying server communication ranges. Specifically, we consider $N=20$ servers distributed over the simulation area, as illustrated in Fig. 2(a). The communication range $\xi$ is configured to be the same for each server, and we vary $\xi$ from 10 m to 130 m. From Fig. 2(b), we can see that our approach outperforms all benchmarks and continues to improve as the communication range $\xi$ increases. This improvement occurs because, as $\xi$ grows, network connectivity strengthens as more servers fall within each other's communication range. This expanded connectivity enables more servers to contribute to task processing, which, when properly selected, further reduces the task completion time.

Figure 1: Performance comparison across different topologies when (a) $N=20$ and (b) $N=100$.
Figure 2: (a) Illustration of the network with $\xi=10$ m. "$\times$" marks the master and the dashed circles indicate the communication range; (b) Performance comparison for different values of $\xi$.

V Conclusion and Future Works

In this paper, we investigated the design of network topology to improve computational efficiency for task offloading in MEC. The proposed approach, DNTD-TO, draws inspiration from communication and sensor networks and builds three-layered network structures for task offloading in an iterative, decentralized manner. Additionally, it generates the optimal task allocation for the designed topology. To evaluate its performance, we conducted various comparison studies. The simulation results show that DNTD-TO significantly outperforms existing topology design methods. In the future, we will explore network structures with more than three layers and consider networks with mobile servers.

Acknowledgment

We would like to thank the National Science Foundation for supporting this work under Grants CAREER-2048266 and CCRI-1730675.

Appendix: Proof of Lemma 1

It is straightforward that when $C_{i}=\emptyset$, we have $J^{*}=\alpha_{i}y$. When $C_{i}\neq\emptyset$, we can solve problem $\mathcal{P}_{1}$ by relaxing it to a linear programming problem as follows:

\mathcal{P}_{2}:\quad \min z
\qquad s.t.\quad z\geq y_{l}\alpha_{l},\ l\in C_{i}\cup\{i\}
\qquad\qquad\ \ y_{i}+\sum_{l\in C_{i}}y_{l}=y

To solve problem $\mathcal{P}_{2}$, the Lagrangian multiplier method [8] can be used, with the Lagrangian function given by $\mathcal{L}=z+\sum_{l\in C_{i}\cup\{i\}}\lambda_{l}(y_{l}\alpha_{l}-z)+\mu(\sum_{l\in C_{i}\cup\{i\}}y_{l}-y)$, where $\boldsymbol{\lambda}=\{\lambda_{l}\geq 0\,|\,l\in C_{i}\cup\{i\}\}$ and $\mu$ are Lagrangian multipliers. As the problem fulfills Slater's condition [23], we can resort to the Karush-Kuhn-Tucker (KKT) conditions [24] to solve it:

\frac{\partial\mathcal{L}}{\partial y_{l}}=0,\ \forall l\in C_{i}\cup\{i\} \quad (10a)
\frac{\partial\mathcal{L}}{\partial z}=0 \quad (10b)
\lambda_{l}(y_{l}\alpha_{l}-z)=0,\ \forall l\in C_{i}\cup\{i\} \quad (10c)
\sum_{l\in C_{i}\cup\{i\}}y_{l}=y \quad (10d)
y_{l}\alpha_{l}-z\leq 0,\ \forall l\in C_{i}\cup\{i\} \quad (10e)
\lambda_{l}\geq 0,\ \forall l\in C_{i}\cup\{i\} \quad (10f)

Suppose, for the sake of contradiction, that $\alpha_{l}y^{*}_{l}\neq\alpha_{i}y^{*}_{i}$ for some $l$, and let $j=\operatorname{arg\,max}_{l\in C_{i}\cup\{i\}}\alpha_{l}y^{*}_{l}$. Then (10a)-(10c) can be rewritten as:

\frac{\partial\mathcal{L}}{\partial y_{j}}=\lambda_{j}\alpha_{j}+\mu=0 \quad (11a)
\frac{\partial\mathcal{L}}{\partial z}=1-\sum_{l\in C_{i}\cup\{i\}}\lambda_{l}=0 \quad (11b)
\lambda_{l}(\alpha_{l}y_{l}-z)=0,\ \forall l\neq j,\ l\in C_{i}\cup\{i\} \quad (11c)

Since $z=\alpha_{j}y^{*}_{j}$, we can derive from (11c) that $\lambda_{l}=0$, $\forall l\neq j$, $l\in C_{i}\cup\{i\}$, from (11b) that $\lambda_{j}=1$, and then from (11a) that $\mu=-\alpha_{j}<0$. However, for $\forall l\neq j$, (10a) can also be written as $\frac{\partial\mathcal{L}}{\partial y_{l}}=\lambda_{l}\alpha_{l}+\mu=0=\mu$. This produces a conflicting value for $\mu$. Therefore, the optimal task allocation has to satisfy $\alpha_{l}y^{*}_{l}=\alpha_{i}y^{*}_{i}$, $\forall l\in C_{i}$, and $J^{*}=\alpha_{l}y^{*}_{l}=\alpha_{i}y^{*}_{i}$.

References

  • [1] X. Yang, Z. Chen, K. Li, Y. Sun, N. Liu, W. Xie, and Y. Zhao, “Communication-constrained mobile edge computing systems for wireless virtual reality: Scheduling and tradeoff,” IEEE Access, vol. 6, pp. 16665–16677, 2018.
  • [2] Y. C. Hu, M. Patel, D. Sabella, N. Sprecher, and V. Young, “Mobile edge computing—a key technology towards 5g,” ETSI white paper, vol. 11, no. 11, pp. 1–16, 2015.
  • [3] J. Yang, A. A. Shah, and D. Pezaros, “A survey of energy optimization approaches for computational task offloading and resource allocation in mec networks,” Electronics, vol. 12, no. 17, p. 3548, 2023.
  • [4] J. Linderoth, M. Yoder et al., “Metacomputing and the master-worker paradigm,” Mathematics and Computer Science Division, Argonne National Laboratory, Tech. Rep. ANL/MCS-P792–0200, 2000.
  • [5] B. Wang, J. Xie, K. Lu, Y. Wan, and S. Fu, “Learning and batch-processing based coded computation with mobility awareness for networked airborne computing,” IEEE Transactions on Vehicular Technology, vol. 72, no. 5, pp. 6503–6517, 2023.
  • [6] H. Qi, M. Liwang, X. Wang, L. Li, W. Gong, J. Jin, and Z. Jiao, “Bridge the present and future: A cross-layer matching game in dynamic cloud-aided mobile edge networks,” IEEE Transactions on Mobile Computing, 2024.
  • [7] F. Liu, J. Huang, and X. Wang, “Joint task offloading and resource allocation for device-edge-cloud collaboration with subtask dependencies,” IEEE Transactions on Cloud Computing, vol. 11, no. 3, pp. 3027–3039, 2023.
  • [8] K. Ma and J. Xie, “Joint task allocation and scheduling for multi-hop distributed computing,” in ICC 2024 - IEEE International Conference on Communications, 2024, pp. 2664–2669.
  • [9] R. Priyadarshi, “Energy-efficient routing in wireless sensor networks: a meta-heuristic and artificial intelligence-based approach: a comprehensive review,” Archives of Computational Methods in Engineering, vol. 31, no. 4, pp. 2109–2137, 2024.
  • [10] W. R. Heinzelman, A. Chandrakasan, and H. Balakrishnan, “Energy-efficient communication protocol for wireless microsensor networks,” in Proceedings of the 33rd annual Hawaii international conference on system sciences.   IEEE, 2000, pp. 10–pp.
  • [11] W. Heinzelman, A. Chandrakasan, and H. Balakrishnan, “An application-specific protocol architecture for wireless microsensor networks,” IEEE Transactions on Wireless Communications, vol. 1, no. 4, pp. 660–670, 2002.
  • [12] G. Koltsidas and F.-N. Pavlidou, “A game theoretical approach to clustering of ad-hoc and sensor networks,” Telecommunication Systems, vol. 47, pp. 81–93, 2011.
  • [13] D. Xie, Q. Sun, Q. Zhou, Y. Qiu, and X. Yuan, “An efficient clustering protocol for wireless sensor networks based on localized game theoretical approach,” International Journal of Distributed Sensor Networks, vol. 9, no. 8, p. 476313, 2013.
  • [14] G. Chen, C. Li, M. Ye, and J. Wu, “An unequal cluster-based routing protocol in wireless sensor networks,” Wireless Networks, vol. 15, pp. 193–207, 2009.
  • [15] R. Logambigai, S. Ganapathy, and A. Kannan, “Energy–efficient grid–based routing algorithm using intelligent fuzzy rules for wireless sensor networks,” Computers & Electrical Engineering, vol. 68, pp. 62–75, 2018.
  • [16] Y.-K. Chiang, N.-C. Wang, and C.-H. Hsieh, “A cycle-based data aggregation scheme for grid-based wireless sensor networks,” Sensors, vol. 14, no. 5, pp. 8447–8464, 2014.
  • [17] H. Farman, H. Javed, J. Ahmad, B. Jan, and M. Zeeshan, “Grid-based hybrid network deployment approach for energy efficient wireless sensor networks,” Journal of Sensors, vol. 2016, no. 1, p. 2326917, 2016.
  • [18] Y. Wang, Z.-Y. Ru, K. Wang, and P.-Q. Huang, “Joint deployment and task scheduling optimization for large-scale mobile users in multi-uav-enabled mobile edge computing,” IEEE transactions on cybernetics, vol. 50, no. 9, pp. 3984–3997, 2019.
  • [19] C. You, K. Huang, H. Chae, and B.-H. Kim, “Energy-efficient resource allocation for mobile-edge computation offloading,” IEEE Transactions on Wireless Communications, vol. 16, no. 3, pp. 1397–1411, 2016.
  • [20] F. Zhou, Y. Wu, R. Q. Hu, and Y. Qian, “Computation rate maximization in uav-enabled wireless-powered mobile-edge computing systems,” IEEE Journal on Selected Areas in Communications, vol. 36, no. 9, pp. 1927–1941, 2018.
  • [21] W. B. Heinzelman, A. P. Chandrakasan, and H. Balakrishnan, “An application-specific protocol architecture for wireless microsensor networks,” IEEE Transactions on wireless communications, vol. 1, no. 4, pp. 660–670, 2002.
  • [22] W. Osamy, B. Alwasel, A. Salim, A. M. Khedr, and A. Aziz, “Lbas: Load balancing aware clustering scheme for iot-based heterogeneous wireless sensor networks,” IEEE Sensors Journal, 2024.
  • [23] A. Auslender and M. Teboulle, “Lagrangian duality and related multiplier methods for variational inequality problems,” SIAM Journal on Optimization, vol. 10, no. 4, pp. 1097–1115, 2000.
  • [24] Z.-Q. Luo and W. Yu, “An introduction to convex optimization for communications and signal processing,” IEEE Journal on Selected Areas in Communications, vol. 24, no. 8, pp. 1426–1438, 2006.