
Dynamic Human Digital Twin Deployment at the Edge for Task Execution: A Two-Timescale Accuracy-Aware Online Optimization

Yuye Yang, You Shi, Changyan Yi, Jun Cai, Jiawen Kang,
Dusit Niyato, and Xuemin (Sherman) Shen
Y. Yang, Y. Shi and C. Yi are with the College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China. (E-mail: {mryyy, shyou, changyan.yi}@nuaa.edu.cn). J. Cai is with the Department of Electrical and Computer Engineering, Concordia University, Montréal, QC, H3G 1M8, Canada. (E-mail: [email protected]). J. Kang is with the School of Automation, Guangdong University of Technology, Guangzhou 510062, China (E-mail: [email protected]). D. Niyato is with the School of Computer Science and Engineering, Nanyang Technological University, Singapore. (E-mail: [email protected]). X. (Sherman) Shen is with the Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada. (E-mail: [email protected]).
Abstract

Human digital twin (HDT) is an emerging paradigm that bridges physical twins (PTs) with powerful virtual twins (VTs) for assisting complex task executions in human-centric services. In this paper, we study a two-timescale online optimization for building HDT under an end-edge-cloud collaborative framework. As a unique feature of HDT, we consider that PTs’ corresponding VTs are deployed on edge servers, consisting of not only generic models placed by downloading experiential knowledge from the cloud but also customized models updated by collecting personalized data from end devices. To maximize the task execution accuracy under stringent energy and delay constraints, and by taking into account HDT’s inherent mobility and status variation uncertainties, we jointly and dynamically optimize VTs’ construction and PTs’ task offloading, along with communication and computation resource allocations. Observing that the decision variables are asynchronous with different triggers, we propose a novel two-timescale accuracy-aware online optimization approach (TACO). Specifically, TACO utilizes an improved Lyapunov method to decompose the problem into multiple instant ones, and then leverages piecewise McCormick envelopes and block coordinate descent based algorithms to address the two timescales alternately. Theoretical analyses and simulations show that the proposed approach achieves an asymptotically optimal solution with polynomial-time complexity, and demonstrate its superiority over counterparts.

Index Terms:
HDT, end-edge-cloud collaboration, placement and update, accuracy awareness, two-timescale online optimization

1 Introduction

Human digital twin (HDT) is defined as a paradigm that vividly characterizes the replication of each individual human in the virtual space while reflecting, in real time, his/her actual physical and mental status in the physical space [1], [2]. With the personalized status information maintained in a high-fidelity virtual environment, HDT can be regarded as a “sandbox”, where complex tasks for human-centric services (e.g., activity recognition [3] and vital signal measurement [4]) can be repeatedly simulated and tested, guiding the practical implementation. Because of its large potential in assisting complex task execution with human-centric concerns, HDT has been envisioned as a key enabler for the Metaverse, Healthcare 5.0, Society 5.0, etc., attracting significant attention recently [5].

Essentially, the HDT system consists of a number of physical twin (PT) and virtual twin (VT) pairs, where a PT stands for the physical entity (i.e., a human) and a VT represents the corresponding virtual model [6]. Obviously, the successful realization of HDT largely depends on the proper construction and management of VTs, so as to provide fast-response interactions and highly accurate task execution for each paired PT. These requirements prompt the adoption of the end-edge-cloud collaborative framework [7], by which HDT can be built and operated at the network edge (while supported by both end devices and the cloud center), guaranteeing pervasive connectivity, customized services and low-delay feedback. Although some preliminary efforts have been dedicated to studying similar problems, such as industrial digital twin construction at the edge [8], [9], [10] and service application deployment across edges [11], [12], [13], establishing HDT at the edge for assisting task execution involves some fundamentally different and unique issues that remain unexplored but are of great importance. On one hand, different from position-fixed industrial plants, PTs in the HDT system are highly mobile with unpredictable mobility patterns (including positional and postural variations), leading to potential instability of PT-VT connectivity [1]. Therefore, to guarantee seamless PT-VT interactions, it is necessary to dynamically place the associated VT of each PT on the edge server (ES) that this PT may switch its access to (caused by its random mobility). On the other hand, unlike generalized applications requesting encapsulated services, PTs in the HDT system are extremely personalized, and their status may vary frequently due to uncertain external factors or physiological state changes, resulting in potential inconsistency between each PT and its associated VT [14].
Hence, to maintain up-to-date, high-fidelity VTs, it is necessary to keep the associated VT on the ES updated in a real-time manner (especially its customized part).

Nevertheless, meeting all the aforementioned requirements is very challenging because of the following reasons.

  • 1)

    To enhance the accuracy of complex task execution assisted by HDT, it is required to construct fine-grained VTs on ESs (by both dynamically placing generic models via the cloud and updating customized components via sensors worn on PTs). However, the data brought by VT constructions can be massive [15], inherently increasing the service delay and energy consumption, and thus the data sizes of generic model placement and customized model update should be carefully optimized to strike a balance between accuracy and cost. On top of this, considering that the ultimate goal is to assist task execution, how to offload tasks from PTs to ESs (hosting the associated VTs) in a timely and efficient manner should also be jointly considered with VT constructions, because both share the same communication and computation resources.

  • 2)

    Since HDT evolves over time with uncertain PT-VT mobility and status variations, the system optimization has to be conducted online, while the statistics of future information (related to human activities) may be hard, if not impossible, to obtain [16]. Moreover, the dynamic placement of generic VT models is triggered by PTs’ mobility and access handover among different ESs, which usually happens over a long time period. In contrast, the dynamic update of customized VT models and the complex task offloading are triggered by PTs’ status variations, which may need to be adapted at a much higher frequency. These indicate that such actions for system optimization should be performed asynchronously in different timescales.

To tackle the aforementioned difficulties, in this paper, we propose a novel two-timescale accuracy-aware online optimization approach for building HDT in assisting complex task execution under an end-edge-cloud collaborative framework. Specifically, we consider that each PT’s associated VT (deployed on the ES) consists of a generic model placed by downloading experiential knowledge (e.g., feature parameters and weights) from the cloud center and a customized model updated by uploading personalized data (e.g., behavior characteristics) from sensors worn on the PT. With the objective of maximizing the average accuracy of complex task execution assisted by HDT under stringent energy and delay constraints, and by taking into account the system uncertainties (e.g., random mobility and status variations), we formulate a two-timescale online optimization problem. Particularly, we aim to dynamically optimize i) large-timescale decisions, including the granularity of each PT’s experiential knowledge for placing its generic VT model and the ES access selection of each PT, and ii) small-timescale decisions, including the amount of each PT’s personalized data for updating its customized VT model, task offloading decision (i.e., local computing or edge computing) for each task, and communication and computation resource allocations of each ES. To this end, we develop a novel two-timescale accuracy-aware online optimization approach (TACO) based on the improved Lyapunov optimization. Specifically, the long-term problem is first decomposed into a series of short-term deterministic subproblems with different timescales, and then an alternating algorithm is proposed, integrating piecewise McCormick envelopes (PME) and block coordinate descent (BCD) based methods, for iteratively solving these subproblems. Theoretical analyses show that the proposed approach can produce an asymptotically optimal outcome with a polynomial-time complexity.

The main contributions of this paper are summarized in the following.

  • To the best of our knowledge, we are the first to study the HDT deployment at the network edge for assisting human-centric task execution by formulating a two-timescale accuracy-aware online optimization problem, which jointly optimizes VTs’ construction (including dynamic generic model placement and customized model update) and PTs’ task offloading together with the management of PT-ES access selection and corresponding communication and computation resource allocations.

  • We propose a novel approach, called TACO, which first decomposes the long-term optimization problem into multiple instant ones. Then, we leverage PME and BCD based algorithms to alternately solve the decoupled subproblems in the large and small timescales, respectively.

  • We theoretically analyze the performance of the proposed TACO approach by rigorously deriving, in closed form, its gap to the optimum and its computational complexity. Furthermore, extensive simulations show that the proposed TACO approach outperforms counterparts in terms of improving the HDT-assisted task execution accuracy while reducing the service response delay and overall system energy consumption.

The rest of this paper is organized as follows. Section 2 reviews the recent related work and highlights the novelties of this paper. Section 3 describes the considered system model and the problem formulation. In Section 4, the two-timescale accuracy-aware online optimization approach, i.e., TACO, is proposed and analyzed theoretically. Simulation results are presented in Section 5, followed by the conclusion in Section 6.

2 Related Work

As one of the key enablers for emerging applications, HDT has recently drawn a lot of research attention from both academia and industry. For example, Lee et al. in [17] proposed a large-scale HDT construction framework on the cloud server, integrated with a synchronization mechanism to reduce the overall data transmission cost of the system. Zhong et al. in [18] introduced a bidirectional long short-term memory based algorithm for designing a high-fidelity HDT model with multimodal data on the cloud platform. In [19], Liu et al. developed a cloud-HDT based healthcare system by optimizing a patient-data processing workflow to improve the quality of personal health management. However, these papers were mainly restricted to constructing HDT solely on the cloud server, ignoring the potential of utilizing network edge resources to empower HDT with the capability of providing pervasive, customized and low-delay services.

While deploying HDT at the network edge has rarely been investigated, some researchers have dedicated efforts to studying general DT construction at the edge. Dong et al. in [9] proposed a deep learning algorithm for constructing DTs of the mobile edge network, aiming to minimize the normalized energy consumption through the optimization of user associations, resource allocations and offloading probabilities. Zhang et al. in [10] formulated a DT adaptive placement and transfer problem to minimize the DT synchronization delay, which was then solved by double auction based optimization. Nevertheless, these papers considered that DTs were constructed at fixed locations or placed following pre-known mobility patterns, making them unsuitable for HDT with human-centric features, where PTs are highly mobile and unpredictable. Another stream of related work has been conducted on general mobile service application deployment across edges. For instance, Wang et al. in [11] developed a user-centric learning-driven method for deploying and migrating delay-aware service applications to minimize the total service delay of mobile users. Ouyang et al. [12] formulated a dynamic service deployment problem with the objective of minimizing the user-perceived delay under uncertain user mobility. However, in these papers, service applications were assumed to have limited and encapsulated types, meaning that they are not customized and do not need to be updated, which largely differs from HDT (where on-demand evolution is essential).

To guarantee long-term performance in online problems, the Lyapunov optimization method has been widely recognized as an efficient approach [20], [21], [22], yet most existing solutions were restricted to problems with decisions in a single timescale only. Recently, some preliminary studies [23], [24] have delved into designing two-timescale Lyapunov methods, by which the original problem was decomposed and further decoupled into subproblems in two different timescales and then optimized separately. Besides these, in [16], [25], alternating algorithms were developed on top of the Lyapunov framework to tackle subproblems in two timescales that are coupled but both convex (or can be easily converted into convex forms). However, these solutions cannot be directly applied in this paper because the considered subproblems (after the decomposition) are not only tightly coupled but also highly non-convex.

In summary, different from all the existing works, this paper proposes a novel two-timescale accuracy-aware online optimization approach to jointly optimize the HDT deployment (i.e., generic placement and customized update of VT model) and task offloading under an end-edge-cloud collaborative framework, where the novelty lies in not only the system model but also the solution.

3 System Model and Problem Formulation

In this section, we first provide a system overview on how HDT is deployed and functions on ESs. Then, to be more specific, the generic VT model placement in the large timescale and the customized VT model update together with the task offloading in the small timescale are described. After that, aiming to enhance the accuracy of complex task execution assisted by HDT, a two-timescale online optimization problem is formulated.

3.1 System Overview

Figure 1: The end-edge-cloud collaborative HDT system.

Consider an HDT system built upon an end-edge-cloud collaborative framework, as illustrated in Fig. 1, consisting of a set of end users (regarded as PTs) $\mathcal{I}$ with cardinality $|\mathcal{I}|=I$, multiple geographically distributed ESs denoted as $\mathcal{M}$ with $|\mathcal{M}|=M$, and a cloud center (acting as the central controller). PTs (roaming around) generate streams of complex tasks which require the construction of exclusive VT models (forming one-to-one PT-VT pairs) at the edge to assist their task executions. Each VT should be deployed on the ES that its associated PT may access, so as to offer high-quality and low-delay services. Note that one ES can host multiple VT models for different PTs, and the corresponding communication and computation resources are shared among them. Furthermore, the construction of a high-fidelity VT on the ES consists of two main procedures, i.e., generic model placement and customized model update [26]. For each VT, the generic part of the model is obtained by downloading the experiential knowledge with a selected granularity from the cloud center (selecting a large granularity may increase the fidelity of the VT model, but also introduces a large amount of data to be transferred, resulting in high service delay and energy consumption, and vice versa), and the target ES for its placement is determined following the access selection of the associated PT. By contrast, the customized part of each VT model is updated by uploading the personalized data, with an optimized data size, obtained from sensors worn on the associated PT (the required size of personalized data for customized model update affects not only the fidelity of the VT model, but also the uplink communication and computation resource allocations among different PT-VT pairs and between their data transmission and task offloading).
After VT establishment, PTs’ tasks can be either transmitted (offloaded) to the VTs deployed on the ES or processed locally, depending on the demanded task execution accuracy versus the requirements on service delay and energy consumption. It is worth noting that although VT models cannot be built locally, PTs’ tasks can still be executed by running offline service applications pre-installed on PTs, which are much less powerful but do not require real-time updates.

Figure 2: Two-timescale online optimization framework.

In practice, the dynamic placement of the generic VT model and the ES access handover of each PT commonly happen at a low frequency (i.e., over a large timescale), since frequently downloading experiential knowledge and switching access points can lead to large configuration costs, including a dramatic increase of delay and energy consumption [5]. In contrast, the update of customized VT models and the task offloading, together with the corresponding communication and computation resource allocations, require immediate and frequent adaptation to accommodate the status variation of each PT and its task generations [2]. To this end, we define that, in the considered online optimization framework, the access selection of each PT and the granularity of experiential knowledge for its generic VT model placement are decided in the large timescale, while the amount of personalized data for its customized VT model update, the task offloading, and the communication and computation resource allocations are decided in the small timescale, as shown in Fig. 2. Specifically, the timeline is segmented into $T\in\mathbb{N}^{+}$ coarse-grained time frames, and each frame is further divided into $K\in\mathbb{N}^{+}$ fine-grained time slots. Let $t\in\mathcal{T}=\{0,1,\ldots,T-1\}$ be the index of the $t$-th time frame, and define $\tau\in\mathcal{T}_{t}=\{tK,tK+1,\ldots,tK+K-1\}$ as the index of the $\tau$-th time slot in the $t$-th time frame.
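The frame/slot indexing above can be sketched with a few lines of Python; the values of $T$ and $K$ here are purely illustrative.

```python
# Minimal sketch of the two-timescale timeline: T frames, each containing
# K slots, with slot index set T_t = {tK, ..., tK+K-1}.

def slots_of_frame(t, K):
    """Return the slot indices tau belonging to frame t."""
    return list(range(t * K, t * K + K))

T, K = 3, 4  # illustrative values
timeline = [(t, tau) for t in range(T) for tau in slots_of_frame(t, K)]

assert len(timeline) == T * K
assert slots_of_frame(2, K) == [8, 9, 10, 11]
assert timeline[0] == (0, 0) and timeline[-1] == (2, 11)
```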

Overall, we aim to optimize the long-term system performance (i.e., the average accuracy of complex task execution assisted by HDT under stringent delay and energy consumption constraints) by determining, in an online manner, i) which ES each PT should access and what granularity of experiential knowledge should be chosen for placing its generic VT model on that ES in each time frame, and ii) how much personalized data is required for updating the customized part of each VT model and whether each PT’s task should be processed locally or on the ES (hosting its associated VT), together with the communication and computation resource allocations, in each time slot. For convenience, all important notations in this paper are listed in Table I.

TABLE I: IMPORTANT NOTATIONS IN THIS PAPER
Symbol Meaning
$A_{i}(\tau)$ task execution accuracy for PT $i$ in time slot $\tau$
$a_{i,m}(t)$ access selection of PT $i$ in time frame $t$
$B_{m}$ total bandwidth resource for accessing ES $m$
$b_{i}(\tau)$ bandwidth resource allocation of PT $i$ in time slot $\tau$
$D_{i}(t)$ total experiential knowledge of PT $i$ in time frame $t$
$E_{i}^{dl}(t)$ energy consumption of downloading the experiential knowledge of PT $i$ in time frame $t$
$E_{i,m}^{pl}(t)$ energy consumption of placing generic VT model $i$ on ES $m$ in time frame $t$
$E_{i,m}^{ul}(\tau)$ energy consumption of uploading the personalized data from PT $i$ to ES $m$ in time slot $\tau$
$E_{i,m}^{ud}(\tau)$ energy consumption of updating customized VT model $i$ on ES $m$ in time slot $\tau$
$E_{i,m}^{ofld}(\tau)$ energy consumption of transmitting task data of PT $i$ to ES $m$ in time slot $\tau$
$E_{i}^{exec}(\tau)$ energy consumption of executing PT $i$’s task
$F_{i}$ total computation resource of PT $i$’s local device
$F_{m}$ total computation resource of ES $m$ for all VTs
$f_{i}(\tau)$ computation resource allocation of PT $i$ in time slot $\tau$
$\mathcal{I}$ set of end users (PTs) / corresponding VTs
$\mathcal{M}$ set of ESs
$r_{i,m}(\tau)$ transmission rate between PT $i$ and ES $m$ in time slot $\tau$
$S_{i}(\tau)$ total personalized data generated by PT $i$ in time slot $\tau$
$T_{i}^{dl}(t)$ delay of downloading the experiential knowledge of PT $i$ in time frame $t$
$T_{i,m}^{pl}(t)$ delay of placing generic VT model $i$ on ES $m$
$T_{i,m}^{ul}(\tau)$ delay of uploading the personalized data from PT $i$ to ES $m$ in time slot $\tau$
$T_{i,m}^{ud}(\tau)$ delay of updating customized VT model $i$ on ES $m$
$T_{i,m}^{ofld}(\tau)$ delay of transmitting task data of PT $i$ to ES $m$
$T_{i}^{exec}(\tau)$ delay of task execution of PT $i$ in time slot $\tau$
$V$ Lyapunov control parameter
$x_{i}(t)$ experiential knowledge granularity for generic model placement of PT $i$ in time frame $t$
$y_{i}(\tau)$ personalized data size for customized model update of PT $i$ in time slot $\tau$
$z_{i}(\tau)$ task offloading decision of PT $i$ in time slot $\tau$

3.2 Generic VT Model Placement

Since PTs are mobile, to construct a VT for each of them at the network edge and enable seamless PT-VT interactions, the generic VT model should be re-placed on the ES that its associated PT switches its access to in each time frame $t\in\mathcal{T}$. Denote $a_{i,m}(t)\in\{0,1\}$ as the access selection decision in the large timescale, indicating whether PT $i\in\mathcal{I}$ selects to access ES $m\in\mathcal{M}$ in time frame $t\in\mathcal{T}$, i.e., $a_{i,m}(t)=1$ if PT $i$ connects to ES $m$, and $a_{i,m}(t)=0$ otherwise. Obviously, we should have $\sum_{m\in\mathcal{M}}a_{i,m}(t)\leq 1$, meaning that PT $i$ cannot access multiple ESs simultaneously [16]. For each PT $i\in\mathcal{I}$, its generic VT model is placed on the accessed ES by downloading a certain granularity of experiential knowledge from the cloud center. We define that the full experiential knowledge of each PT $i\in\mathcal{I}$ for its generic VT model placement has a total size of $D_{i}(t)$, and denote $x_{i}(t)\in[0,1]$ as its granularity decision in time frame $t\in\mathcal{T}$. Then, the data size of the experiential knowledge downloaded for placing PT $i$’s VT model in time frame $t\in\mathcal{T}$ can be represented as $x_{i}(t)D_{i}(t)$. Note that even in the special case where PT $i$’s access selection remains unchanged, it is still necessary to periodically re-place its generic VT model on the same ES, because the experiential knowledge may experience “data drift” over time [27]. Based on these, the corresponding delay and energy consumption of downloading such experiential knowledge can be respectively expressed as

T_{i}^{dl}(t)=\sum_{m\in\mathcal{M}}a_{i,m}(t)\frac{x_{i}(t)D_{i}(t)}{r^{c}},\quad\forall i\in\mathcal{I},\forall t\in\mathcal{T}, (1)
E_{i}^{dl}(t)=T_{i}^{dl}(t)\,p^{c},\quad\forall i\in\mathcal{I},\forall t\in\mathcal{T}, (2)

where $r^{c}$ and $p^{c}$ stand for the downlink transmission rate and the unit transmission power from the cloud center to each ES, respectively, both considered as constants [25], [28].
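Eqs. (1)-(2) can be illustrated with a small numerical sketch; all parameter values below (data size, rate, power) are illustrative assumptions, not values from the paper.

```python
# Sketch of Eqs. (1)-(2): delay and energy of downloading the experiential
# knowledge for one PT's generic VT model.

def download_cost(a_im, x_i, D_i, r_c, p_c):
    """a_im: access indicators over ESs (at most one equals 1);
    x_i: granularity in [0,1]; D_i: full knowledge size (bits);
    r_c: cloud-to-ES rate (bit/s); p_c: unit transmission power (W)."""
    T_dl = sum(a_im) * x_i * D_i / r_c   # Eq. (1)
    E_dl = T_dl * p_c                    # Eq. (2)
    return T_dl, E_dl

T_dl, E_dl = download_cost(a_im=[0, 1, 0], x_i=0.5, D_i=8e6, r_c=2e6, p_c=0.8)
assert abs(T_dl - 2.0) < 1e-9   # (0.5 * 8e6) / 2e6 = 2 s
assert abs(E_dl - 1.6) < 1e-9   # 2 s * 0.8 W = 1.6 J
```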

To exploit this experiential knowledge, each ES has to allocate a proportion of its computation resource for completing the generic VT model placement at the beginning of each time frame. The delay of doing this for PT $i\in\mathcal{I}$ on ES $m\in\mathcal{M}$ in time frame $t\in\mathcal{T}$ can be calculated as

T_{i,m}^{pl}(t)=\frac{a_{i,m}(t)x_{i}(t)D_{i}(t)C_{m}}{f_{i}(tK)F_{m}}, (3)

where $C_{m}$ is the number of CPU cycles required for ES $m$ to process a unit of data, $F_{m}$ is the CPU speed (measured in cycles/s) of ES $m$, and $f_{i}(tK)\in(0,1]$ represents the ratio of computation resource allocated to PT $i$ for its VT construction at the beginning of time frame $t$ (i.e., the first time slot of the frame, with $\tau=tK$). Referring to the energy model widely used in CMOS circuits [29], the energy consumption of constructing the generic VT model of the associated PT $i\in\mathcal{I}$ on ES $m\in\mathcal{M}$ in each time frame $t\in\mathcal{T}$ can be calculated as

E_{i,m}^{pl}(t)=\rho_{m}f_{i}(tK)(F_{m})^{3}T_{i,m}^{pl}(t), (4)

where $\rho_{m}$ is the effective switched capacitance of ES $m$, depending on its chip architecture.
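Eqs. (3)-(4) can likewise be checked numerically; all parameter values below are illustrative assumptions.

```python
# Sketch of Eqs. (3)-(4): delay and energy of placing the generic VT model
# on ES m, given the fraction f_i of the ES's CPU speed allocated to PT i.

def placement_cost(a_im, x_i, D_i, C_m, f_i, F_m, rho_m):
    """C_m: cycles per unit data; F_m: ES CPU speed (cycles/s);
    f_i: allocated fraction in (0,1]; rho_m: effective switched capacitance."""
    T_pl = a_im * x_i * D_i * C_m / (f_i * F_m)   # Eq. (3)
    E_pl = rho_m * f_i * (F_m ** 3) * T_pl        # Eq. (4)
    return T_pl, E_pl

T_pl, E_pl = placement_cost(a_im=1, x_i=0.5, D_i=8e6, C_m=100,
                            f_i=0.2, F_m=2e9, rho_m=1e-27)
assert abs(T_pl - 1.0) < 1e-9   # (0.5*8e6*100) / (0.2*2e9) = 1 s
assert abs(E_pl - 1.6) < 1e-9   # 1e-27 * 0.2 * (2e9)^3 * 1 = 1.6 J
```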

3.3 Customized VT Model Update

Since PTs are personalized and their status may vary frequently due to uncertain external or internal factors, to guarantee the timeliness and high fidelity of VTs on ESs, the customized VT model of each PT should be updated in each time slot $\tau\in\mathcal{T}_{t}$. Let $S_{i}(\tau)$ be the total amount of personalized data generated by PT $i\in\mathcal{I}$, and define $y_{i}(\tau)\in[0,1]$ as the percentage of personalized data chosen to be uploaded in time slot $\tau\in\mathcal{T}_{t}$. Then, the size of the personalized data uploaded for updating PT $i$’s customized VT model in time slot $\tau\in\mathcal{T}_{t}$ can be expressed as $y_{i}(\tau)S_{i}(\tau)$.

Within each time slot $\tau\in\mathcal{T}_{t}$, we denote the location of PT $i\in\mathcal{I}$ as $(x_{i}(\tau),y_{i}(\tau))$, which is state information following its random mobility pattern, and let $(x_{m},y_{m})$ be the fixed location of each ES $m\in\mathcal{M}$. The distance between any PT $i\in\mathcal{I}$ and ES $m\in\mathcal{M}$ can then be calculated as $S_{i,m}(\tau)=\sqrt{(x_{i}(\tau)-x_{m})^{2}+(y_{i}(\tau)-y_{m})^{2}}$, and according to the Shannon-Hartley formula, the transmission rate from PT $i\in\mathcal{I}$ to its accessed ES $m\in\mathcal{M}$ is written as

r_{i,m}(\tau)=a_{i,m}(t)b_{i}(\tau)B_{m}\log\left(1+\frac{(S_{i,m}(\tau))^{-\theta}p_{i}|h_{i,m}(\tau)|^{2}}{N_{0}b_{i}(\tau)B_{m}}\right), (5)

where $b_{i}(\tau)\in(0,1]$ is the proportion of communication resource allocated to PT $i\in\mathcal{I}$ in time slot $\tau\in\mathcal{T}_{t}$, $h_{i,m}(\tau)$ is the fading amplitude between PT $i\in\mathcal{I}$ and ES $m\in\mathcal{M}$ in time slot $\tau\in\mathcal{T}_{t}$ (modeled as a circularly symmetric complex Gaussian random variable [30]), $B_{m}$ is the communication bandwidth of ES $m\in\mathcal{M}$, $N_{0}$ is the spectral density of the channel noise power, $p_{i}$ is the pre-determined transmission power of PT $i\in\mathcal{I}$, and $\theta\geq 2$ is the path loss exponent [30]. Correspondingly, the delay and energy consumption of PT $i\in\mathcal{I}$ in uploading the personalized data of size $y_{i}(\tau)S_{i}(\tau)$ for updating its VT on ES $m\in\mathcal{M}$ in each time slot $\tau\in\mathcal{T}_{t}$ can be respectively expressed as

T_{i,m}^{ul}(\tau)=\frac{y_{i}(\tau)S_{i}(\tau)}{r_{i,m}(\tau)},\quad\forall i\in\mathcal{I},\forall m\in\mathcal{M},\forall\tau\in\mathcal{T}_{t}, (6)
E_{i,m}^{ul}(\tau)=T_{i,m}^{ul}(\tau)\,p_{i},\quad\forall i\in\mathcal{I},\forall m\in\mathcal{M},\forall\tau\in\mathcal{T}_{t}. (7)
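Eqs. (5)-(7) can be sketched as follows; the base-2 logarithm, the $d^{-\theta}$ path loss form, and all parameter values are illustrative assumptions.

```python
# Sketch of Eqs. (5)-(7): uplink rate via Shannon-Hartley, then the upload
# delay/energy for the customized-model data.
import math

def uplink_rate(a_im, b_i, B_m, d_im, theta, p_i, h2, N0):
    """d_im: PT-ES distance (m); h2: |h|^2 fading power; N0: noise density."""
    snr = (d_im ** -theta) * p_i * h2 / (N0 * b_i * B_m)
    return a_im * b_i * B_m * math.log2(1 + snr)          # Eq. (5)

def upload_cost(y_i, S_i, rate, p_i):
    T_ul = y_i * S_i / rate   # Eq. (6)
    E_ul = T_ul * p_i         # Eq. (7)
    return T_ul, E_ul

r = uplink_rate(a_im=1, b_i=0.25, B_m=20e6, d_im=50.0, theta=2,
                p_i=0.2, h2=1.0, N0=4e-15)
T_ul, E_ul = upload_cost(y_i=0.8, S_i=1e6, rate=r, p_i=0.2)

assert r > 0 and T_ul > 0
# rate should decrease with distance, consistent with d^{-theta} path loss
assert uplink_rate(1, 0.25, 20e6, 100.0, 2, 0.2, 1.0, 4e-15) < r
```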

To utilize this personalized data, each ES has to allocate a proportion of its computation resource for completing the customized VT model update in each time slot. The delay of doing this for PT $i\in\mathcal{I}$ on ES $m\in\mathcal{M}$ in time slot $\tau\in\mathcal{T}_{t}$ can be calculated as

T_{i,m}^{ud}(\tau)=\frac{a_{i,m}(t)y_{i}(\tau)S_{i}(\tau)C_{m}}{f_{i}(\tau)F_{m}}, (8)

where $f_{i}(\tau)\in(0,1]$ indicates the proportion of computation resource allocated to PT $i$ for its VT update in each time slot $\tau\in[tK,tK+K-1]$. Besides, the corresponding energy consumption can be calculated as

E_{i,m}^{ud}(\tau)=\rho_{m}f_{i}(\tau)(F_{m})^{3}T_{i,m}^{ud}(\tau). (9)
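Eqs. (8)-(9) mirror the placement cost model of Eqs. (3)-(4) at the slot level; a sketch with illustrative parameter values:

```python
# Sketch of Eqs. (8)-(9): per-slot delay and energy of updating the
# customized VT model on ES m.

def update_cost(a_im, y_i, S_i, C_m, f_i, F_m, rho_m):
    """y_i*S_i: uploaded personalized data; f_i: allocated CPU fraction."""
    T_ud = a_im * y_i * S_i * C_m / (f_i * F_m)   # Eq. (8)
    E_ud = rho_m * f_i * (F_m ** 3) * T_ud        # Eq. (9)
    return T_ud, E_ud

T_ud, E_ud = update_cost(a_im=1, y_i=0.8, S_i=1e6, C_m=100,
                         f_i=0.1, F_m=2e9, rho_m=1e-27)
assert abs(T_ud - 0.4) < 1e-9    # (0.8e6*100) / (0.1*2e9) = 0.4 s
assert abs(E_ud - 0.32) < 1e-9   # 1e-27 * 0.1 * (2e9)^3 * 0.4 = 0.32 J
```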

3.4 HDT-Assisted Task Execution

Let $\lambda_{i}(\tau)$ be the data size of the complex task produced by PT $i\in\mathcal{I}$ in each time slot $\tau\in\mathcal{T}_{t}$, which is allowed to follow a general random distribution. Denote the task offloading decision of PT $i\in\mathcal{I}$ in time slot $\tau\in\mathcal{T}_{t}$ as $z_{i}(\tau)\in\{0,1\}$, i.e., $z_{i}(\tau)=1$ if PT $i$ offloads the task to its VT on the ES for assistance, and $z_{i}(\tau)=0$ if PT $i$ processes it locally. The delay and energy consumption of offloading/transmitting such a task from PT $i\in\mathcal{I}$ to its associated VT deployed on ES $m\in\mathcal{M}$ in time slot $\tau\in\mathcal{T}_{t}$ can be respectively expressed as

T_{i,m}^{ofld}(\tau)=\frac{\lambda_{i}(\tau)}{r_{i,m}(\tau)}, (10)
E_{i,m}^{ofld}(\tau)=T_{i,m}^{ofld}(\tau)\,p_{i}. (11)

Considering the possibility of both edge and local processing, the delay of HDT-assisted task execution of PT $i\in\mathcal{I}$ in time slot $\tau\in\mathcal{T}_{t}$ can be calculated as

T_{i}^{exec}(\tau)=\sum_{m\in\mathcal{M}}a_{i,m}(t)z_{i}(\tau)\frac{\lambda_{i}(\tau)C_{m}}{f_{i}(\tau)F_{m}}+(1-z_{i}(\tau))\frac{\lambda_{i}(\tau)C_{i}}{F_{i}}, (12)

where $C_{i}$ is the number of CPU cycles required for PT $i$ to locally process a unit of data, and $F_{i}$ denotes its CPU speed (measured in cycles/s). Besides, the corresponding energy consumption can be calculated as

E_{i}^{exec}(\tau)=\sum_{m\in\mathcal{M}}a_{i,m}(t)z_{i}(\tau)\rho_{m}(F_{m})^{2}\lambda_{i}(\tau)C_{m}+(1-z_{i}(\tau))\rho_{i}(F_{i})^{2}\lambda_{i}(\tau)C_{i}. (13)
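The binary branch in Eqs. (12)-(13) can be sketched as follows; all parameter values are illustrative assumptions chosen so that edge and local delays coincide.

```python
# Sketch of Eqs. (12)-(13): task-execution delay and energy under the
# binary offloading decision z_i (1 = edge, 0 = local), for one accessed ES.

def exec_cost(z_i, a_im, lam, C_m, f_i, F_m, rho_m, C_i, F_i, rho_i):
    T = (z_i * a_im * lam * C_m / (f_i * F_m)
         + (1 - z_i) * lam * C_i / F_i)                    # Eq. (12)
    E = (z_i * a_im * rho_m * (F_m ** 2) * lam * C_m
         + (1 - z_i) * rho_i * (F_i ** 2) * lam * C_i)     # Eq. (13)
    return T, E

params = dict(a_im=1, lam=1e6, C_m=100, f_i=0.1, F_m=2e9,
              rho_m=1e-27, C_i=500, F_i=1e9, rho_i=1e-27)
edge = exec_cost(z_i=1, **params)
local = exec_cost(z_i=0, **params)

assert abs(edge[0] - 0.5) < 1e-9    # (1e6*100) / (0.1*2e9) = 0.5 s
assert abs(local[0] - 0.5) < 1e-9   # (1e6*500) / 1e9 = 0.5 s
assert edge[1] < local[1]           # edge cheaper in energy here
```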

With the help of HDT at the edge, the accuracy of executing each task from PT $i\in\mathcal{I}$ in each time slot $\tau\in\mathcal{T}_{t}$ can be defined as

A_{i}(\tau)=z_{i}(\tau)g_{i}^{edge}(d_{i}(\tau))+(1-z_{i}(\tau))g_{i}^{local}, (14)

where $g_{i}^{local}$ represents the task execution accuracy of local processing on PT $i$ itself, and $g_{i}^{edge}(d_{i}(\tau))$ stands for the task execution accuracy of edge computing for PT $i$, depending on the total size of data used for its VT construction, i.e., $d_{i}(\tau)=x_{i}(t)D_{i}(t)+y_{i}(\tau)S_{i}(\tau),\forall\tau\in\mathcal{T}_{t},\forall t\in\mathcal{T}$, consisting of both the experiential knowledge for generic VT model placement in the large timescale and the personalized data for customized VT model update in the small timescale. Note that $g_{i}^{edge}(d_{i}(\tau))$ is a mapping function that can be obtained via empirical studies or experiments [31].
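Eq. (14) can be sketched as below; the saturating mapping `g_edge` and all constants (`g_local`, `a_max`, `k`) are purely hypothetical stand-ins for the empirically obtained function in [31].

```python
# Sketch of Eq. (14): task-execution accuracy under the offloading decision.
import math

def accuracy(z_i, d_i, g_local=0.7, a_max=0.98, k=1e-6):
    """d_i = x_i*D_i + y_i*S_i: total VT construction data (bits).
    g_edge is a hypothetical saturating mapping of d_i, not the paper's."""
    g_edge = a_max * (1.0 - math.exp(-k * d_i))
    return z_i * g_edge + (1 - z_i) * g_local   # Eq. (14)

d_i = 0.5 * 8e6 + 0.8 * 1e6   # x_i*D_i + y_i*S_i, illustrative values

assert accuracy(0, d_i) == 0.7          # local processing: fixed accuracy
assert 0.7 < accuracy(1, d_i) <= 0.98   # edge: improves with more VT data
```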

Similar to [29], [32], we ignore the overheads induced by feeding back computing outcomes and by system control signals, because the sizes of computing outcomes and control signals are much smaller than that of the input data. Technically, these overheads can be treated as small constants [33] that do not affect our analyses.

3.5 Problem Formulation

In summary, the total response delay for all tasks of PT i\in\mathcal{I} in each time frame t\in\mathcal{T} can be derived as

T_{i}^{tol}(t)=T_{i}^{dl}(t)+\sum_{m\in\mathcal{M}}T_{i,m}^{pl}(t)+\sum_{\tau\in\mathcal{T}_{t}}[z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau)+T_{i,m}^{ofld}(\tau))+T_{i}^{exec}(\tau)], (15)

and the overall system energy consumption in each time frame t\in\mathcal{T} can be derived as

E^{tol}(t)=\sum_{i\in\mathcal{I}}(E_{i}^{dl}(t)+\sum_{m\in\mathcal{M}}E_{i,m}^{pl}(t))+\sum_{i\in\mathcal{I}}\sum_{\tau\in\mathcal{T}_{t}}[z_{i}(\tau)\sum_{m\in\mathcal{M}}(E_{i,m}^{ul}(\tau)+E_{i,m}^{ud}(\tau)+E_{i,m}^{ofld}(\tau))+E_{i}^{exec}(\tau)]. (16)

To evaluate the core value of building HDT at the network edge, we take as the performance measure the long-term average accuracy of complex task execution, assisted by the considered end-edge-cloud collaborative HDT system, over all time frames, which can be expressed as

\mathcal{A}=\frac{1}{TK}\sum_{t\in\mathcal{T}}\sum_{\tau\in\mathcal{T}_{t}}\sum_{i\in\mathcal{I}}A_{i}(\tau). (17)

With the objective of maximizing \mathcal{A} while ensuring that the response delay for all tasks of each PT i\in\mathcal{I} and the overall system energy consumption do not exceed certain thresholds, we formulate a two-timescale online optimization problem by jointly optimizing a_{i,m}(t) and x_{i}(t), denoted in short by a set of large-timescale decision variables \mathcal{J}_{i}^{A}(t)=\{a_{i,m}(t),x_{i}(t)\}, for each PT i in any time frame t\in\mathcal{T}, and y_{i}(\tau), b_{i}(\tau), f_{i}(\tau) and z_{i}(\tau), denoted in short by a set of small-timescale decision variables \mathcal{J}_{i}^{B}(\tau)=\{b_{i}(\tau),y_{i}(\tau),f_{i}(\tau),z_{i}(\tau)\}, for each PT i in any time slot \tau\in\mathcal{T}_{t}.

Mathematically, such a two-timescale online optimization problem can be formulated as

\mathcal{P}_{1}:\max_{\mathcal{J}_{i}^{A}(t),\mathcal{J}_{i}^{B}(\tau)}\lim_{t\rightarrow\infty}\mathcal{A}
 s.t. \sum_{m\in\mathcal{M}}a_{i,m}(t)\leq 1,\forall i\in\mathcal{I},\forall t\in\mathcal{T}, (18a)
\sum_{i\in\mathcal{I}}a_{i,m}(t)b_{i}(\tau)\leq 1,\forall m\in\mathcal{M},\forall t\in\mathcal{T},\forall\tau\in\mathcal{T}_{t}, (18b)
\sum_{i\in\mathcal{I}}a_{i,m}(t)f_{i}(\tau)\leq 1,\forall m\in\mathcal{M},\forall t\in\mathcal{T},\forall\tau\in\mathcal{T}_{t}, (18c)
\lim_{T\rightarrow\infty}\frac{1}{T}\sum_{t\in\mathcal{T}}T_{i}^{tol}(t)\leq T_{i}^{max},\forall i\in\mathcal{I}, (18d)
\lim_{T\rightarrow\infty}\frac{1}{T}\sum_{t\in\mathcal{T}}E^{tol}(t)\leq E^{max}, (18e)

where constraint (18a) guarantees that each PT can connect to at most one ES in each time frame t\in\mathcal{T}; constraints (18b) and (18c) restrict the communication and computation resource allocations to be no more than the total capacities of each ES in any time slot \tau\in\mathcal{T}_{t}; constraints (18d) and (18e) are the long-term average task response delay and overall system energy consumption constraints (in which T_{i}^{max} and E^{max} represent the pre-determined response delay threshold of each PT i\in\mathcal{I} and the overall system energy consumption threshold, respectively).

Obviously, solving problem \mathcal{P}_{1} directly is very challenging because i) PTs' mobilities and status variations, due to human-related characteristics, are highly unpredictable, meaning that it is extremely hard, if not impossible, to obtain system statistics in advance, which necessitates the design of an online optimization algorithm; ii) although Lyapunov optimization [34] is well known as an effective method for solving such online problems in general, decision variables on different timescales (i.e., \mathcal{J}_{i}^{A}(t) and \mathcal{J}_{i}^{B}(\tau)) are tightly coupled in not only the objective function but also constraints (18d) and (18e); and iii) all constraints include discrete decision variables (i.e., a_{i,m}(t) or z_{i}(\tau)), and constraints (18d) and (18e) are both non-convex. These indicate that \mathcal{P}_{1} is a two-timescale online non-convex mixed-integer programming problem, which is NP-hard.

4 A Two-Timescale Online Optimization Approach (TACO)

In this section, we propose a novel approach, namely TACO, for jointly optimizing VT construction and task offloading in the considered end-edge-cloud collaborative HDT system. Specifically, we first reformulate the two-timescale problem by distributing the task response delay and overall system energy consumption of each time frame t\in\mathcal{T} over its constituent time slots \tau\in\mathcal{T}_{t}. Then, we decompose the problem into multiple short-term deterministic subproblems on different timescales with the help of the Lyapunov optimization method, extended to two timescales. After that, we introduce an alternating algorithm integrating PME and BCD methods to solve the large-timescale and small-timescale subproblems, respectively.

4.1 Problem Reformulation

Observe from problem \mathcal{P}_{1} that the delay and energy consumption caused by generic VT model placement are on the large timescale, while those caused by customized VT model update and task execution are on the small timescale. To facilitate the solution, we evenly distribute the task response delay and overall system energy consumption in each time frame t\in\mathcal{T} over all \mid\mathcal{T}_{t}\mid=K time slots within this frame, which yields

T_{i}^{tol}(\tau)=(T_{i}^{dl}(t)+\sum_{m\in\mathcal{M}}T_{i,m}^{pl}(t))/K+z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau)+T_{i,m}^{ofld}(\tau))+T_{i}^{exec}(\tau), (19)
E^{tol}(\tau)=\sum_{i\in\mathcal{I}}(E_{i}^{dl}(t)+\sum_{m\in\mathcal{M}}E_{i,m}^{pl}(t))/K+\sum_{i\in\mathcal{I}}[z_{i}(\tau)\sum_{m\in\mathcal{M}}(E_{i,m}^{ul}(\tau)+E_{i,m}^{ud}(\tau)+E_{i,m}^{ofld}(\tau))+E_{i}^{exec}(\tau)]. (20)

Substituting (19) and (20) into (18d) and (18e) of problem \mathcal{P}_{1}, we have

\mathcal{P}_{2}:\max_{\mathcal{J}_{i}^{A}(t),\mathcal{J}_{i}^{B}(\tau)}\lim_{t\rightarrow\infty}\mathcal{A}
 s.t. (18a), (18b), (18c),
\lim_{T\rightarrow\infty}\frac{1}{TK}\sum_{t\in\mathcal{T}}\sum_{\tau\in\mathcal{T}_{t}}T_{i}^{tol}(\tau)\leq T_{i}^{max}/K,\forall i\in\mathcal{I}, (21a)
\lim_{T\rightarrow\infty}\frac{1}{TK}\sum_{t\in\mathcal{T}}\sum_{\tau\in\mathcal{T}_{t}}E^{tol}(\tau)\leq E^{max}/K. (21b)

Note that the reformulated problem \mathcal{P}_{2} is equivalent to the original problem \mathcal{P}_{1}: exactly the same decision variables remain on their two different timescales, while all long-term constraints have been unified into a single timescale (i.e., expressed per time slot only) without affecting the optimization performance.

Obviously, \mathcal{P}_{2} is still a long-term optimization problem, and the major difficulties in solving it are i) how to address the long-term average delay and energy consumption constraints; and ii) how to optimize two-timescale decision variables simultaneously. To this end, in the next subsection, we employ the Lyapunov method [34] and reformulate \mathcal{P}_{2} to accommodate the two-timescale features.

4.2 Problem Decomposition

We first define a delay overflow queue and an energy deficit queue to describe how the task response delay T_{i}^{tol}(\tau) of each PT i\in\mathcal{I} and the overall system energy consumption E^{tol}(\tau) in time slot \tau\in\mathcal{T}_{t} may deviate from their long-term budgets T_{i}^{max}/K and E^{max}/K, respectively. The dynamic evolution of these two queues can be expressed as

H_{i}(\tau+1)=\max[H_{i}(\tau)+T_{i}^{tol}(\tau)-T_{i}^{max}/K,0], (22)
E(\tau+1)=\max[E(\tau)+E^{tol}(\tau)-E^{max}/K,0]. (23)
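The virtual-queue dynamics (22)-(23) amount to a one-line update per queue. The sketch below traces a few slots with made-up per-slot delays, energies and budgets: the backlog grows by each slot's budget overrun and is floored at zero.

```python
def queue_update(backlog, actual, budget):
    # Eqs. (22)-(23): add the per-slot budget overrun, floored at 0.
    return max(backlog + actual - budget, 0.0)

H, E = 0.0, 0.0                      # delay overflow / energy deficit backlogs
T_budget, E_budget = 0.1, 0.5        # per-slot budgets T_i^max/K, E^max/K (made up)
for t_tol, e_tol in [(0.15, 0.4), (0.05, 0.7), (0.2, 0.6)]:
    H = queue_update(H, t_tol, T_budget)
    E = queue_update(E, e_tol, E_budget)
```

Keeping these backlogs bounded is exactly what forces the time-averaged delay and energy to respect (21a)-(21b).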

After that, we collect the delay overflow queues H_{i}(\tau) of all PTs and the energy deficit queue E(\tau) in a vector \bm{\Theta}(\tau)=[\bm{H}(\tau),E(\tau)], and introduce a quadratic Lyapunov function as [34]:

L(\bm{\Theta}(\tau))\triangleq\frac{1}{2}[\sum_{i\in\mathcal{I}}H_{i}(\tau)^{2}+E(\tau)^{2}], (24)

which quantitatively reflects the congestion of all queues and should be persistently pushed towards a minimum value to keep the queues stable. Referring to [35], the conditional Lyapunov drift is given by

\Delta(\bm{\Theta}(\tau))=\mathbb{E}[L(\bm{\Theta}(\tau+K))-L(\bm{\Theta}(\tau))|\bm{\Theta}(\tau)], (25)

where \mathbb{E}[\cdot] denotes the expectation, and \Delta(\bm{\Theta}(\tau)) measures the change of the Lyapunov function across K consecutive time slots. Intuitively, by minimizing the Lyapunov drift in (25), we can prevent the queue backlogs from growing unboundedly, and thus guarantee that the desired delay and energy consumption constraints are met.

Accordingly, the Lyapunov drift-plus-penalty function becomes

\Delta(\bm{\Theta}(\tau))-V\mathbb{E}[\sum_{i\in\mathcal{I}}A_{i}(\tau)\mid\bm{\Theta}(\tau)], (26)

where a control parameter V>0 is introduced, representing the weight placed on maximizing the HDT-assisted task execution accuracy versus strictly satisfying the delay and energy consumption constraints.

Theorem 1

For any V>0, the drift-plus-penalty is bounded under any possible decisions in any time slot \tau\in\mathcal{T}_{t}, i.e.,

\Delta(\bm{\Theta}(\tau))-V\mathbb{E}[\sum_{i\in\mathcal{I}}A_{i}(\tau)\mid\bm{\Theta}(\tau)]\leq G+\sum_{i\in\mathcal{I}}\mathbb{E}[H_{i}(\tau)(T_{i}^{tol}(\tau)-T_{i}^{max}/K)\mid\bm{\Theta}(\tau)]+\mathbb{E}[E(\tau)(E^{tol}(\tau)-E^{max}/K)\mid\bm{\Theta}(\tau)]-V\mathbb{E}[\sum_{i\in\mathcal{I}}A_{i}(\tau)\mid\bm{\Theta}(\tau)], (27)

where

G=\frac{1}{2}\sum_{i\in\mathcal{I}}[T_{i}^{tol}(max)-T_{i}^{max}]^{2}+\frac{1}{2}[E^{tol}(max)-E^{max}]^{2}

is a positive constant, with T_{i}^{tol}(max) and E^{tol}(max) denoting the maximum possible per-slot task response delay and system energy consumption, respectively.

Proof:

Please see Appendix A. ∎

Theorem 1 shows that the drift-plus-penalty is deterministically upper bounded in each time slot \tau\in\mathcal{T}_{t}. Then, summing both sides of (27) over all time slots within time frame t, we have

\sum_{\tau\in\mathcal{T}_{t}}\Delta(\bm{\Theta}(\tau))-V\sum_{\tau\in\mathcal{T}_{t}}\mathbb{E}[\sum_{i\in\mathcal{I}}A_{i}(\tau)\mid\bm{\Theta}(\tau)]\leq GK+\sum_{\tau\in\mathcal{T}_{t}}\sum_{i\in\mathcal{I}}\mathbb{E}[H_{i}(\tau)(T_{i}^{tol}(\tau)-T_{i}^{max}/K)\mid\bm{\Theta}(\tau)]+\sum_{\tau\in\mathcal{T}_{t}}\mathbb{E}[E(\tau)(E^{tol}(\tau)-E^{max}/K)\mid\bm{\Theta}(\tau)]-V\sum_{\tau\in\mathcal{T}_{t}}\mathbb{E}[\sum_{i\in\mathcal{I}}A_{i}(\tau)\mid\bm{\Theta}(\tau)]. (28)

Then, \mathcal{P}_{2} can be decomposed into multiple subproblems, each of which opportunistically minimizes the right-hand side of (28) in one time frame t\in\mathcal{T}, i.e.,

\mathcal{P}_{3}:\min_{\mathcal{J}_{i}^{A}(t),\mathcal{J}_{i}^{B}(\tau)}\sum_{\tau\in\mathcal{T}_{t}}\sum_{i\in\mathcal{I}}\mathbb{E}[H_{i}(\tau)(T_{i}^{tol}(\tau)-T_{i}^{max}/K)\mid\bm{\Theta}(\tau)]+\sum_{\tau\in\mathcal{T}_{t}}\mathbb{E}[E(\tau)(E^{tol}(\tau)-E^{max}/K)\mid\bm{\Theta}(\tau)]-V\sum_{\tau\in\mathcal{T}_{t}}\mathbb{E}[\sum_{i\in\mathcal{I}}A_{i}(\tau)\mid\bm{\Theta}(\tau)]
 s.t. (18a), (18b), (18c).

Note that, in problem \mathcal{P}_{3}, decisions \mathcal{J}_{i}^{A}(t)=\{a_{i,m}(t),x_{i}(t)\} and \mathcal{J}_{i}^{B}(\tau)=\{b_{i}(\tau),y_{i}(\tau),f_{i}(\tau),z_{i}(\tau)\} remain unchanged from those in \mathcal{P}_{2}. This means that, although \mathcal{P}_{3} focuses on the optimization within a single time frame, it still includes two-timescale variables.

4.3 Alternating Algorithm between Two Timescales

4.3.1 Two-Timescale Decoupling and Alternation

To solve problem \mathcal{P}_{3}, we decouple it into two subproblems (one for the large timescale and the other for the small timescale), and then solve them alternately until convergence.

Large-timescale Problem: Given the small-timescale decisions \mathcal{J}_{i}^{B}(\tau), together with the current backlogs of the delay overflow queues H_{i}(\tau) of all PTs and the energy deficit queue E(\tau), the large-timescale subproblem aims to jointly optimize the granularity of experiential knowledge x_{i}(t) and the access selection a_{i,m}(t) at the beginning of each time frame t\in\mathcal{T}, which can be formulated as

\mathcal{P}_{4}:\min_{\mathcal{J}_{i}^{A}(t)}\sum_{i\in\mathcal{I}}H_{i}(\tau)[(T_{i}^{dl}(t)+\sum_{m\in\mathcal{M}}T_{i,m}^{pl}(t))/K+z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau)+T_{i,m}^{ofld}(\tau))+T_{i}^{exec}(\tau)]+E(\tau)\sum_{i\in\mathcal{I}}[(E_{i}^{dl}(t)+\sum_{m\in\mathcal{M}}E_{i,m}^{pl}(t))/K+z_{i}(\tau)\sum_{m\in\mathcal{M}}(E_{i,m}^{ul}(\tau)+E_{i,m}^{ud}(\tau)+E_{i,m}^{ofld}(\tau))+E_{i}^{exec}(\tau)]-V\sum_{i\in\mathcal{I}}A_{i}(\tau)
 s.t. (18a), (18b), (18c).

Small-timescale Problem: Given the large-timescale decisions \mathcal{J}_{i}^{A}(t), the small-timescale subproblem aims to jointly optimize the personalized data size y_{i}(\tau), the bandwidth resource allocation b_{i}(\tau), the computation resource allocation f_{i}(\tau) and the task offloading decision z_{i}(\tau) in each time slot \tau\in\mathcal{T}_{t}, which can be formulated as

\mathcal{P}_{5}:\min_{\mathcal{J}_{i}^{B}(\tau)}\sum_{i\in\mathcal{I}}H_{i}(\tau)[\sum_{m\in\mathcal{M}}T_{i,m}^{pl}(t)/K+z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau)+T_{i,m}^{ofld}(\tau))+T_{i}^{exec}(\tau)]+E(\tau)\sum_{i\in\mathcal{I}}[\sum_{m\in\mathcal{M}}E_{i,m}^{pl}(t)/K+\sum_{m\in\mathcal{M}}z_{i}(\tau)(E_{i,m}^{ul}(\tau)+E_{i,m}^{ud}(\tau)+E_{i,m}^{ofld}(\tau))+E_{i}^{exec}(\tau)]-V\sum_{i\in\mathcal{I}}A_{i}(\tau)
 s.t. (18b), (18c).

Alternating Process: For \tau=tK (i.e., the first slot of each time frame), we iteratively optimize subproblems \mathcal{P}_{4} and \mathcal{P}_{5} until the objective of \mathcal{P}_{3} converges. Specifically, by fixing the small-timescale decisions as \mathcal{J}_{i}^{B}(tK)=\mathcal{J}_{i}^{B}(tK-1) (inherited from the last time slot \tau=tK-1 of the previous time frame \mathcal{T}_{t-1}), large-timescale subproblem \mathcal{P}_{4} is first solved to optimize \mathcal{J}_{i}^{A}(tK). Then, given the large-timescale decisions \mathcal{J}_{i}^{A}(tK), small-timescale subproblem \mathcal{P}_{5} is solved to update \mathcal{J}_{i}^{B}(tK), which is fed back to \mathcal{P}_{4}. For each \tau\in[tK+1,tK+K-1] (i.e., the remaining slots of the time frame), with the converged large-timescale decisions \mathcal{J}_{i}^{A}(tK), we repeatedly solve \mathcal{P}_{5} to obtain \mathcal{J}_{i}^{B}(\tau) in each time slot.
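The alternating process above can be sketched as a fixed-point loop. The two "solvers" below are toy placeholders for the PME- and BCD-based algorithms, and the objective is a stand-in for the \mathcal{P}_{3} objective, chosen only so that the alternation has a visible fixed point.

```python
def solve_P4(JB):
    # Placeholder for the large-timescale (PME-based) solver: J^A given J^B.
    return {"x": min(1.0, 0.5 + 0.5 * JB["y"]), "a": 1}

def solve_P5(JA):
    # Placeholder for the small-timescale (BCD-based) solver: J^B given J^A.
    return {"y": 0.5 * JA["x"], "b": 0.3, "f": 0.4, "z": 1}

def objective(JA, JB):
    # Stand-in for the P3 objective, monitored for convergence.
    return (JA["x"] - 0.8) ** 2 + (JB["y"] - 0.4) ** 2

JB = {"y": 0.2, "b": 0.3, "f": 0.4, "z": 1}   # inherited from slot tK - 1
prev = float("inf")
for _ in range(100):
    JA = solve_P4(JB)      # large-timescale step
    JB = solve_P5(JA)      # small-timescale step
    cur = objective(JA, JB)
    if abs(prev - cur) <= 1e-9:
        break
    prev = cur
```

With these toy maps the alternation contracts to the fixed point x = 2/3, y = 1/3 within a handful of iterations, mirroring how the real loop stops once the \mathcal{P}_{3} objective stabilizes.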

Input: N partitions, and an initial feasible solution z^{\prime}
1  Linear relaxation: a_{i,m}(t)\in\{0,1\}\rightarrow\tilde{a}_{i,m}(t)\in[0,1];
2  for x_{i}(t)\in\bm{x} do
3      for \tilde{a}_{i,m}(t)\in\bm{\tilde{a}} do
4          for n\in\mathcal{N} do
5              x_{i}^{L}(t)=x_{i,n}^{L}(t); x_{i}^{U}(t)=x_{i,n}^{U}(t);
6              if (42) is feasible then
7                  Tighten bound [\tilde{a}_{i,m,n}^{L}(t),\tilde{a}_{i,m,n}^{U}(t)] from (42);
8              else
9                  Remove partition [x_{i,n}^{L}(t),x_{i,n}^{U}(t)];
10         Update lower bound as \tilde{a}_{i,m}^{L}(t)=\min_{n}\tilde{a}_{i,m,n}^{L}(t);
11         Update upper bound as \tilde{a}_{i,m}^{U}(t)=\max_{n}\tilde{a}_{i,m,n}^{U}(t);
12     Update lower bound as x_{i}^{L}(t)=\min_{n}x_{i,n}^{L}(t);
13     Update upper bound as x_{i}^{U}(t)=\max_{n}x_{i,n}^{U}(t);
14 Solve \mathcal{P}_{4-3} over the pruned partitions to get solution (\bm{x}^{R},\bm{\tilde{a}}^{R});
15 Round \tilde{a}_{i,m^{*}}(t) to 1, where m^{*}=\arg\max_{m\in\mathcal{M}}\tilde{a}_{i,m}(t), and set the others to 0;
16 Get solution (\bm{x}^{R},\bm{a}^{R}) of \mathcal{P}_{4};
Output: (\bm{x}^{R},\bm{a}^{R})
Algorithm 1 PME-Based Algorithm

4.3.2 Solution for Large-Timescale Decisions

Large-timescale subproblem \mathcal{P}_{4} is non-convex in general, but it can be regarded as a bilinear optimization problem (i.e., the problem is linear if we fix one decision variable within \mathcal{J}_{i}^{A}(t) and optimize the other). This motivates us to design a PME-based algorithm [36], which first constructs convex envelopes for the bilinear terms, transforming the problem into a piecewise linear form, and then solves it by partitioning and pruning.

Let u_{i,m}(t)=a_{i,m}(t)x_{i}(t) be an auxiliary variable for the bilinear term a_{i,m}(t)x_{i}(t). Then, \mathcal{P}_{4} can be relaxed into a convex optimization problem as

\mathcal{P}_{4-1}:\min_{\mathcal{J}_{i}^{A}(t),u_{i,m}(t)}\sum_{i\in\mathcal{I}}H_{i}(\tau)[(T_{i}^{dl}(t)+\sum_{m\in\mathcal{M}}u_{i,m}(t)\frac{D_{i}(t)C_{m}}{f_{i}(tK)F_{m}})/K+z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau)+T_{i,m}^{ofld}(\tau))+T_{i}^{exec}(\tau)]+E(\tau)\sum_{i\in\mathcal{I}}[(E_{i}^{dl}(t)+\sum_{m\in\mathcal{M}}u_{i,m}(t)\rho_{m}(F_{m})^{2}D_{i}(t)C_{m})/K+z_{i}(\tau)\sum_{m\in\mathcal{M}}(E_{i,m}^{ul}(\tau)+E_{i,m}^{ud}(\tau)+E_{i,m}^{ofld}(\tau))+E_{i}^{exec}(\tau)]-V\sum_{i\in\mathcal{I}}A_{i}(\tau)
 s.t. (18a), (18b), (18c),
u_{i,m}(t)\geq 0,\forall i\in\mathcal{I},\forall m\in\mathcal{M},\forall t\in\mathcal{T}, (29a)
u_{i,m}(t)\geq x_{i}(t)+a_{i,m}(t)-1,\forall i\in\mathcal{I},\forall m\in\mathcal{M},\forall t\in\mathcal{T}, (29b)
u_{i,m}(t)\leq a_{i,m}(t),\forall i\in\mathcal{I},\forall m\in\mathcal{M},\forall t\in\mathcal{T}, (29c)
u_{i,m}(t)\leq x_{i}(t),\forall i\in\mathcal{I},\forall m\in\mathcal{M},\forall t\in\mathcal{T}, (29d)

where constraints (29a)-(29d) are the corresponding McCormick relaxations of the bilinear term a_{i,m}(t)x_{i}(t).
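For a\in[0,1] and x\in[0,1], the general McCormick envelope specializes exactly to (29a)-(29d). The check below verifies, on a grid over the unit box, that the true bilinear product u=ax always satisfies the four envelope inequalities, which is what makes \mathcal{P}_{4-1} a valid relaxation of \mathcal{P}_{4}.

```python
def mccormick_holds(a, x, aL=0.0, aU=1.0, xL=0.0, xU=1.0, eps=1e-12):
    # The four McCormick inequalities for u = a * x over [aL, aU] x [xL, xU];
    # with the unit box they reduce to (29a)-(29d): u>=0, u>=x+a-1, u<=a, u<=x.
    u = a * x
    return (u >= aL * x + xL * a - aL * xL - eps
            and u >= aU * x + xU * a - aU * xU - eps
            and u <= aU * x + xL * a - aU * xL + eps
            and u <= aL * x + xU * a - aL * xU + eps)

# Every point of an 11x11 grid in the unit box satisfies the envelope.
checks = [mccormick_holds(a / 10, x / 10) for a in range(11) for x in range(11)]
```

Outside the declared box the envelope need not hold, which is precisely why partitioning the box (as done next) tightens the relaxation.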

Next, to improve the solution quality, we divide \bm{x}=\{x_{i}(t),\forall i\in\mathcal{I},\forall t\in\mathcal{T}\} and \bm{a}=\{a_{i,m}(t),\forall i\in\mathcal{I},\forall m\in\mathcal{M},\forall t\in\mathcal{T}\} into \mid\mathcal{N}\mid=N partitions. Specifically, let x_{i,n}(t)\in[x_{i,n}^{L}(t),x_{i,n}^{U}(t)] be the range of x_{i}(t) in partition n\in\mathcal{N}, where x_{i,n}^{L}(t) and x_{i,n}^{U}(t) are its lower and upper bounds, respectively. Besides, a new auxiliary variable y_{i,n}(t)\in\{0,1\} is introduced, where y_{i,n}(t)=1 if the value of x_{i}(t) belongs to partition n, and y_{i,n}(t)=0 otherwise. Similarly, the binary variable a_{i,m}(t) is first relaxed into a continuous variable \tilde{a}_{i,m}(t)\in[0,1] and then divided into N piecewise areas, where the range of partition n\in\mathcal{N} is \tilde{a}_{i,m,n}(t)\in[\tilde{a}_{i,m,n}^{L}(t),\tilde{a}_{i,m,n}^{U}(t)]. Based on these, \mathcal{P}_{4-1} can be converted into a generalized disjunctive programming problem as

\mathcal{P}_{4-2}:\min_{\mathcal{J}_{i}^{A}(t),u_{i,m}(t)}\sum_{i\in\mathcal{I}}H_{i}(\tau)[(T_{i}^{dl}(t)+\sum_{m\in\mathcal{M}}u_{i,m}(t)\frac{D_{i}(t)C_{m}}{f_{i}(tK)F_{m}})/K+z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau)+T_{i,m}^{ofld}(\tau))+T_{i}^{exec}(\tau)]+E(\tau)\sum_{i\in\mathcal{I}}[(E_{i}^{dl}(t)+\sum_{m\in\mathcal{M}}u_{i,m}(t)\rho_{m}(F_{m})^{2}D_{i}(t)C_{m})/K+z_{i}(\tau)\sum_{m\in\mathcal{M}}(E_{i,m}^{ul}(\tau)+E_{i,m}^{ud}(\tau)+E_{i,m}^{ofld}(\tau))+E_{i}^{exec}(\tau)]-V\sum_{i\in\mathcal{I}}A_{i}(\tau)
 s.t. (18a), (18b), (18c),
\bigvee_{n\in\mathcal{N}}\left[\begin{array}{l}u_{i,m}(t)\geqslant\tilde{a}_{i,m}(t)\cdot x_{i,n}^{L}(t)+\tilde{a}_{i,m,n}^{L}(t)\cdot x_{i}(t)-\tilde{a}_{i,m,n}^{L}(t)\cdot x_{i,n}^{L}(t)\\ u_{i,m}(t)\geqslant\tilde{a}_{i,m}(t)\cdot x_{i,n}^{U}(t)+\tilde{a}_{i,m,n}^{U}(t)\cdot x_{i}(t)-\tilde{a}_{i,m,n}^{U}(t)\cdot x_{i,n}^{U}(t)\\ u_{i,m}(t)\leqslant\tilde{a}_{i,m}(t)\cdot x_{i,n}^{L}(t)+\tilde{a}_{i,m,n}^{U}(t)\cdot x_{i}(t)-\tilde{a}_{i,m,n}^{U}(t)\cdot x_{i,n}^{L}(t)\\ u_{i,m}(t)\leqslant\tilde{a}_{i,m}(t)\cdot x_{i,n}^{U}(t)+\tilde{a}_{i,m,n}^{L}(t)\cdot x_{i}(t)-\tilde{a}_{i,m,n}^{L}(t)\cdot x_{i,n}^{U}(t)\\ \tilde{a}_{i,m,n}^{L}(t)\leqslant\tilde{a}_{i,m}(t)\leqslant\tilde{a}_{i,m,n}^{U}(t)\\ x_{i,n}^{L}(t)\leqslant x_{i}(t)\leqslant x_{i,n}^{U}(t)\\ y_{i,n}(t)\end{array}\right], (41)
x_{i,n}^{L}(t)=x_{i}^{L}(t)+\frac{(x_{i}^{U}(t)-x_{i}^{L}(t))\cdot(n-1)}{N},\forall i,\forall n,\forall t,
x_{i,n}^{U}(t)=x_{i}^{L}(t)+\frac{(x_{i}^{U}(t)-x_{i}^{L}(t))\cdot n}{N},\forall i,\forall n,\forall t,
y_{i,n}(t)\in\{0,1\},\forall n,\forall t.

Then, by applying the convex hull relaxation [37], [38], we can further transform problem \mathcal{P}_{4-2} into a piecewise linear form as

\mathcal{P}_{4-3}:\min_{\mathcal{J}_{i}^{A}(t),u_{i,m}(t)}\sum_{i\in\mathcal{I}}H_{i}(\tau)[(T_{i}^{dl}(t)+\sum_{m\in\mathcal{M}}u_{i,m}(t)\frac{D_{i}(t)C_{m}}{f_{i}(tK)F_{m}})/K+z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau)+T_{i,m}^{ofld}(\tau))+T_{i}^{exec}(\tau)]+E(\tau)\sum_{i\in\mathcal{I}}[(E_{i}^{dl}(t)+\sum_{m\in\mathcal{M}}u_{i,m}(t)\rho_{m}(F_{m})^{2}D_{i}(t)C_{m})/K+z_{i}(\tau)\sum_{m\in\mathcal{M}}(E_{i,m}^{ul}(\tau)+E_{i,m}^{ud}(\tau)+E_{i,m}^{ofld}(\tau))+E_{i}^{exec}(\tau)]-V\sum_{i\in\mathcal{I}}A_{i}(\tau)
 s.t. (18a), (18b), (18c),
u_{i,m}(t)\geqslant\hat{a}_{i,m,n}(t)\cdot x_{i,n}^{L}(t)+\tilde{a}_{i,m,n}^{L}(t)\cdot\hat{x}_{i,n}(t)-\tilde{a}_{i,m,n}^{L}(t)\cdot x_{i,n}^{L}(t)\cdot y_{i,n}(t),\forall i,\forall m,\forall n,\forall t,
u_{i,m}(t)\geqslant\hat{a}_{i,m,n}(t)\cdot x_{i,n}^{U}(t)+\tilde{a}_{i,m,n}^{U}(t)\cdot\hat{x}_{i,n}(t)-\tilde{a}_{i,m,n}^{U}(t)\cdot x_{i,n}^{U}(t)\cdot y_{i,n}(t),\forall i,\forall m,\forall n,\forall t,
u_{i,m}(t)\leqslant\hat{a}_{i,m,n}(t)\cdot x_{i,n}^{L}(t)+\tilde{a}_{i,m,n}^{U}(t)\cdot\hat{x}_{i,n}(t)-\tilde{a}_{i,m,n}^{U}(t)\cdot x_{i,n}^{L}(t)\cdot y_{i,n}(t),\forall i,\forall m,\forall n,\forall t,
u_{i,m}(t)\leqslant\hat{a}_{i,m,n}(t)\cdot x_{i,n}^{U}(t)+\tilde{a}_{i,m,n}^{L}(t)\cdot\hat{x}_{i,n}(t)-\tilde{a}_{i,m,n}^{L}(t)\cdot x_{i,n}^{U}(t)\cdot y_{i,n}(t),\forall i,\forall m,\forall n,\forall t,
\tilde{a}_{i,m}(t)=\sum_{n\in\mathcal{N}}\hat{a}_{i,m,n}(t),\forall i,\forall m,\forall t,
x_{i}(t)=\sum_{n\in\mathcal{N}}\hat{x}_{i,n}(t),\forall i,\forall t,
\sum_{n\in\mathcal{N}}y_{i,n}(t)=1,\forall i,\forall t,
x_{i,n}^{L}(t)=x_{i}^{L}(t)+\frac{(x_{i}^{U}(t)-x_{i}^{L}(t))\cdot(n-1)}{N},\forall i,\forall n,\forall t,
x_{i,n}^{U}(t)=x_{i}^{L}(t)+\frac{(x_{i}^{U}(t)-x_{i}^{L}(t))\cdot n}{N},\forall i,\forall n,\forall t,
x_{i,n}^{L}(t)y_{i,n}(t)\leqslant\hat{x}_{i,n}(t)\leqslant x_{i,n}^{U}(t)y_{i,n}(t),\forall i,\forall n,\forall t,
\tilde{a}_{i,m,n}^{L}(t)y_{i,n}(t)\leqslant\hat{a}_{i,m,n}(t)\leqslant\tilde{a}_{i,m,n}^{U}(t)y_{i,n}(t),\forall i,\forall m,\forall n,\forall t,
y_{i,n}(t)\in\{0,1\},\forall n,\forall t.

To this end, we prune x_{i}(t) and \tilde{a}_{i,m}(t) to tighten their relaxed bounds. By traversing each partition n\in\mathcal{N} of x_{i}(t), we first determine the lower bound \tilde{a}_{i,m,n}^{L}(t) and upper bound \tilde{a}_{i,m,n}^{U}(t) of \tilde{a}_{i,m}(t) by solving the following linear programming problem:

\tilde{a}_{i,m,n}^{L}(t)=\min\tilde{a}_{i,m}(t)\text{ or }\tilde{a}_{i,m,n}^{U}(t)=\max\tilde{a}_{i,m}(t) (42)
 s.t. (18a), (18b), (18c), (29a)-(29d),
obj(\mathcal{P}_{4-3})\leqslant z^{\prime},
x_{i}^{L}(t)=x_{i,n}^{L}(t)\leqslant x_{i}(t)\leqslant x_{i,n}^{U}(t)=x_{i}^{U}(t).

Besides, the range of x_{i}(t) is updated as x_{i}^{L}(t)=\min_{n}x_{i,n}^{L}(t) and x_{i}^{U}(t)=\max_{n}x_{i,n}^{U}(t) after traversing all of its partitions. Then, after pruning all x_{i}(t) and \tilde{a}_{i,m}(t), we can solve problem \mathcal{P}_{4-3} over the pruned partitions with software-based optimization solvers (e.g., CVX [39]), and obtain its solution (\bm{x}^{R},\bm{\tilde{a}}^{R}). Lastly, we round the continuous variables \bm{\tilde{a}}^{R} to binary form to obtain integer solutions. Note that all constraints are automatically satisfied under (\bm{x}^{R},\bm{a}^{R}) because they have been taken into account in the pruning process of (42). The detailed steps of the designed PME-based algorithm for solving the large-timescale problem are presented in Algorithm 1.
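The final rounding step of Algorithm 1 can be sketched as follows; the relaxed matrix a_tilde is a made-up stand-in for \bm{\tilde{a}}^{R}, with one row per PT and one column per ES.

```python
def round_association(a_tilde_row):
    # Keep only the ES with the largest relaxed value: set a_{i,m*} to 1,
    # all other entries of the row to 0.
    m_star = max(range(len(a_tilde_row)), key=lambda m: a_tilde_row[m])
    return [1 if m == m_star else 0 for m in range(len(a_tilde_row))]

a_tilde = [[0.10, 0.70, 0.20],   # hypothetical relaxed solution (row = PT, col = ES)
           [0.55, 0.05, 0.40]]
a_bin = [round_association(row) for row in a_tilde]
```

By construction, every rounded row has exactly one nonzero entry, so constraint (18a) (each PT connects to at most one ES) is satisfied automatically.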

4.3.3 Solution for Small-Timescale Decisions

Small-timescale subproblem \mathcal{P}_{5} is also non-convex in general, but by relaxing the integer decision z_{i}(\tau) (i.e., z_{i}(\tau)\in\{0,1\}\rightarrow z_{i}(\tau)\in[0,1]), it becomes a block multi-convex problem (i.e., the problem is convex in one block of decision variables when all other blocks are fixed). This motivates us to design a BCD-based algorithm by further dividing problem \mathcal{P}_{5} into four subproblems and solving them alternately until the objective function of \mathcal{P}_{5} converges.

Personalized Data Size Determination: Given b_{i}(\tau), f_{i}(\tau) and z_{i}(\tau), we have

\mathcal{P}_{5-1}:\min_{y_{i}(\tau)}\sum_{i\in\mathcal{I}}H_{i}(\tau)z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau))+E(\tau)\sum_{i\in\mathcal{I}}\sum_{m\in\mathcal{M}}z_{i}(\tau)(E_{i,m}^{ul}(\tau)+E_{i,m}^{ud}(\tau))-V\sum_{i\in\mathcal{I}}A_{i}(\tau)
 s.t. y_{i}(\tau)\in[0,1].

Bandwidth Resource Allocation: Given y_{i}(\tau), f_{i}(\tau) and z_{i}(\tau), we have

\mathcal{P}_{5-2}:\min_{b_{i}(\tau)}\sum_{i\in\mathcal{I}}H_{i}(\tau)z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ofld}(\tau))+E(\tau)\sum_{i\in\mathcal{I}}\sum_{m\in\mathcal{M}}z_{i}(\tau)(E_{i,m}^{ul}(\tau)+E_{i,m}^{ofld}(\tau))
 s.t. (18b), b_{i}(\tau)\in(0,1].

Computation Resource Allocation: Given y_{i}(\tau), b_{i}(\tau) and z_{i}(\tau), we have

𝒫53:minfi(τ)iHi(τ)(mTi,mpl(t)/K+zi(τ)m\displaystyle\mathcal{P}_{5-3}:\min_{f_{i}(\tau)}\sum_{i\in\mathcal{I}}H_{i}(\tau)(\sum_{m\in\mathcal{M}}T_{i,m}^{pl}(t)/K+z_{i}(\tau)\sum_{m\in\mathcal{M}}
Ti,mud(τ)+Tiexec(τ))+E(τ)i(mEi,mpl(t)/K\displaystyle T_{i,m}^{ud}(\tau)+T_{i}^{exec}(\tau))+E(\tau)\sum_{i\in\mathcal{I}}(\sum_{m\in\mathcal{M}}E_{i,m}^{pl}(t)/K
+mzi(τ)Ei,mud(τ)+Eiexec(τ))\displaystyle+\sum_{m\in\mathcal{M}}z_{i}(\tau)E_{i,m}^{ud}(\tau)+E_{i}^{exec}(\tau))
s.t. (18c),fi(τ)(0,1].\displaystyle\text{ s.t. }(\ref{C4}),f_{i}(\tau)\in(0,1].
Input: Iteration index l=0l=0, iterative convergence threshold ϵ\epsilon, and initial feasible solutions (𝒚(0),𝒃(0),𝒇(0),𝒛(0)\bm{y}^{(0)},\bm{b}^{(0)},\bm{f}^{(0)},\bm{z}^{(0)}).
1 Linear relaxation: zi(τ){0,1}zi(τ)[0,1]z_{i}(\tau)\in\{0,1\}\rightarrow z_{i}(\tau)\in[0,1];
2 repeat
3       Solve 𝒚(l)\bm{y}^{(l)} from 𝒫51\mathcal{P}_{5-1} with given 𝒃(l1),𝒇(l1),𝒛(l1)\bm{b}^{(l-1)},\bm{f}^{(l-1)},\bm{z}^{(l-1)};
4       Solve 𝒃(l)\bm{b}^{(l)} from 𝒫52\mathcal{P}_{5-2} with given 𝒚(l1),𝒇(l1),𝒛(l1)\bm{y}^{(l-1)},\bm{f}^{(l-1)},\bm{z}^{(l-1)};
5       Solve 𝒇(l)\bm{f}^{(l)} from 𝒫53\mathcal{P}_{5-3} with given 𝒚(l1),𝒃(l1),𝒛(l1)\bm{y}^{(l-1)},\bm{b}^{(l-1)},\bm{z}^{(l-1)};
6       Solve 𝒛(l)\bm{z}^{(l)} from 𝒫54\mathcal{P}_{5-4} with given 𝒚(l1),𝒃(l1),𝒇(l1)\bm{y}^{(l-1)},\bm{b}^{(l-1)},\bm{f}^{(l-1)};
7       l=l+1l=l+1;
8until obj(l)(𝒫5)obj(l1)(𝒫5)ϵ\mid obj^{(l)}(\mathcal{P}_{5})-obj^{(l-1)}(\mathcal{P}_{5})\mid\leq\epsilon;
9Round zi(τ)z_{i}(\tau) to 1 with probability zi(τ)z_{i}(\tau);
Output: solution of 𝒫5\mathcal{P}_{5}: (𝒚(l),𝒃(l),𝒇(l),𝒛(l)\bm{y}^{(l)},\bm{b}^{(l)},\bm{f}^{(l)},\bm{z}^{(l)}).
Algorithm 2 BCD-Based Algorithm
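Line 9 of Algorithm 2 recovers a binary offloading decision from the relaxed one by randomized rounding; a minimal sketch of that step (function name and sampling source are illustrative) is:

```python
import random

# Round each relaxed decision z_i(τ) ∈ [0,1] to 1 with probability z_i(τ).
# The rounding is unbiased: E[round(z_i)] = z_i, so z_i = 1.0 always maps
# to 1 and z_i = 0.0 always maps to 0.

def randomized_round(z, rng=random):
    return [1 if rng.random() < zi else 0 for zi in z]
```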

Task Offloading Decision: Given yi(τ)y_{i}(\tau), bi(τ)b_{i}(\tau) and fi(τ)f_{i}(\tau), we have

𝒫54:minzi(τ)iHi(τ)[zi(τ)m(Ti,mul(τ)+Ti,mud(τ)\displaystyle\mathcal{P}_{5-4}:\min_{z_{i}(\tau)}\sum_{i\in\mathcal{I}}H_{i}(\tau)[z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau)
+Ti,mofld(τ))+Tiexec(τ)]+E(τ)i[mzi(τ)(Ei,mul(τ)\displaystyle+T_{i,m}^{ofld}(\tau))+T_{i}^{exec}(\tau)]+E(\tau)\sum_{i\in\mathcal{I}}[\sum_{m\in\mathcal{M}}z_{i}(\tau)(E_{i,m}^{ul}(\tau)
+Ei,mud(τ)+Ei,mofld(τ))+Eiexec(τ)]ViAi(τ)\displaystyle+E_{i,m}^{ud}(\tau)+E_{i,m}^{ofld}(\tau))+E_{i}^{exec}(\tau)]-V\sum_{i\in\mathcal{I}}A_{i}(\tau)
s.t. zi(τ)[0,1].\displaystyle\text{ s.t. }z_{i}(\tau)\in[0,1].
Theorem 2

Problems 𝒫51\mathcal{P}_{5-1}, 𝒫52\mathcal{P}_{5-2}, 𝒫53\mathcal{P}_{5-3} and 𝒫54\mathcal{P}_{5-4} are all convex.

Proof:

Please see Appendix B. ∎

Thanks to Theorem 2, we can easily solve problems 𝒫51\mathcal{P}_{5-1}, 𝒫52\mathcal{P}_{5-2}, 𝒫53\mathcal{P}_{5-3} and 𝒫54\mathcal{P}_{5-4} by leveraging existing software-based optimization solvers (e.g., CVX [39]). Note that all these problems have to be solved iteratively, and the iteration terminates once the objective of problem 𝒫5\mathcal{P}_{5} can no longer be improved. The detailed steps of the designed BCD-based algorithm for solving the small-timescale problem are illustrated in Algorithm 2.

Refer to caption
Figure 3: Flowchart of the proposed TACO approach.

4.4 Analysis of Proposed TACO Approach

In summary, the proposed two-timescale accuracy-aware online optimization approach (TACO) consists of problem reformulation, decomposition and alternation between two timescales. The flowchart of TACO is shown in Fig. 3.

Theorem 3

The proposed TACO approach converges within a finite number of alternations and iterations.

Proof:

The convergence of TACO depends on that of the designed BCD-based algorithm for problem 𝒫5\mathcal{P}_{5} and that of the employed alternating algorithm for problem 𝒫3\mathcal{P}_{3}.

First, to prove the convergence of the BCD-based algorithm, we derive the partial derivatives of the objective function of problem 𝒫5\mathcal{P}_{5} as follows:

𝒚obj(𝒫5)=iHi(τ)zi(τ)m(Si(τ)ri,m(τ)\displaystyle\nabla_{\bm{y}}obj(\mathcal{P}_{5})=\sum_{i\in\mathcal{I}}H_{i}(\tau)z_{i}(\tau)\sum_{m\in\mathcal{M}}(\frac{S_{i}(\tau)}{r_{i,m}(\tau)} (43)
+ai,m(t)Si(τ)Cmfi(τ)Fm)+E(τ)izi(τ)m(Si(τ)piri,m(τ)\displaystyle+\frac{a_{i,m}(t)S_{i}(\tau)C_{m}}{f_{i}(\tau)F_{m}})+E(\tau)\sum_{i\in\mathcal{I}}z_{i}(\tau)\sum_{m\in\mathcal{M}}(\frac{S_{i}(\tau)p_{i}}{r_{i,m}(\tau)}
+ai,m(t)ρm(Fm)2Si(τ)Cm)2Vizi(τ)Si(τ)Di(t)+Si(τ),\displaystyle+a_{i,m}(t)\rho_{m}(F_{m})^{2}S_{i}(\tau)C_{m})-2V\sum_{i\in\mathcal{I}}\frac{z_{i}(\tau)S_{i}(\tau)}{D_{i}(t)+S_{i}(\tau)},
𝒃obj(𝒫5)=[iHi(τ)zi(τ)(yi(τ)Si(τ)\displaystyle\nabla_{\bm{b}}obj(\mathcal{P}_{5})=[\sum_{i\in\mathcal{I}}H_{i}(\tau)z_{i}(\tau)(y_{i}(\tau)S_{i}(\tau)
+λi(τ))+E(τ)izi(τ)pi(yi(τ)Si(τ)+λi(τ))]\displaystyle+\lambda_{i}(\tau))+E(\tau)\sum_{i\in\mathcal{I}}z_{i}(\tau)p_{i}(y_{i}(\tau)S_{i}(\tau)+\lambda_{i}(\tau))]
[pi|hi,m(τ)|2ai,m(t)(Si,m(τ))θN0(τ)(Bm)2ln2\displaystyle[\frac{p_{i}|h_{i,m}(\tau)|^{2}}{a_{i,m}(t)(S_{i,m}(\tau))^{\theta}N_{0}(\tau)(B_{m})^{2}\ln 2} (44)
(Si,m(τ))θN0(τ)Bm((bi(τ))3+(bi(τ))2pi|hi,m(τ)|2)\displaystyle\cdot\frac{(S_{i,m}(\tau))^{\theta}N_{0}(\tau)B_{m}}{((b_{i}(\tau))^{3}+(b_{i}(\tau))^{2}p_{i}|h_{i,m}(\tau)|^{2})}
1log22(1+pi|hi,m(τ)|2(Si,m(τ))θN0bi(τ)Bm)\displaystyle\cdot\frac{1}{\log_{2}^{2}(1+\frac{p_{i}|h_{i,m}(\tau)|^{2}}{(S_{i,m}(\tau))^{\theta}N_{0}b_{i}(\tau)B_{m}})}
1ai,m(t)(bi(τ))2Bmlog2(1+pi|hi,m(τ)|2(Si,m(τ))θN0bi(τ)Bm)],\displaystyle-\frac{1}{a_{i,m}(t)(b_{i}(\tau))^{2}B_{m}\log_{2}(1+\frac{p_{i}|h_{i,m}(\tau)|^{2}}{(S_{i,m}(\tau))^{\theta}N_{0}b_{i}(\tau)B_{m}})}],
𝒇obj(𝒫5)=iHi(τ)[mai,m(t)xi(t)Di(t)Cm(fi(tK))2FmK\displaystyle\nabla_{\bm{f}}obj(\mathcal{P}_{5})=-\sum_{i\in\mathcal{I}}H_{i}(\tau)[\sum_{m\in\mathcal{M}}\frac{a_{i,m}(t)x_{i}(t)D_{i}(t)C_{m}}{(f_{i}(tK))^{2}F_{m}K} (45)
+zi(τ)mai,m(t)yi(τ)Si(τ)Cm(fi(τ))2Fm\displaystyle+z_{i}(\tau)\sum_{m\in\mathcal{M}}\frac{a_{i,m}(t)y_{i}(\tau)S_{i}(\tau)C_{m}}{(f_{i}(\tau))^{2}F_{m}}
+mai,m(t)zi(τ)λi(τ)Cm(fi(τ))2Fm],\displaystyle+\sum_{m\in\mathcal{M}}a_{i,m}(t)z_{i}(\tau)\frac{\lambda_{i}(\tau)C_{m}}{(f_{i}(\tau))^{2}F_{m}}],
𝒛obj(𝒫5)=iHi(τ)[mM(Ti,mul(τ)+Ti,mud(τ)\displaystyle\nabla_{\bm{z}}obj(\mathcal{P}_{5})=\sum_{i\in\mathcal{I}}H_{i}(\tau)[\sum_{m\in M}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau)
+Ti,mofld(τ))+mai,m(t)λi(τ)Cmfi(τ)Fmλi(τ)CiFi]\displaystyle+T_{i,m}^{ofld}(\tau))+\sum_{m\in\mathcal{M}}a_{i,m}(t)\frac{\lambda_{i}(\tau)C_{m}}{f_{i}(\tau)F_{m}}-\frac{\lambda_{i}(\tau)C_{i}}{F_{i}}]
+E(τ)i[m(Ei,mul(τ)+Ei,mud(τ)+Ei,mofld(τ))\displaystyle+E(\tau)\sum_{i\in\mathcal{I}}[\sum_{m\in\mathcal{M}}(E_{i,m}^{ul}(\tau)+E_{i,m}^{ud}(\tau)+E_{i,m}^{ofld}(\tau)) (46)
+mai,m(t)ρm(Fm)2λi(τ)Cmρi(Fi)2λi(τ)Ci]\displaystyle+\sum_{m\in\mathcal{M}}a_{i,m}(t)\rho_{m}(F_{m})^{2}\lambda_{i}(\tau)C_{m}-\rho_{i}(F_{i})^{2}\lambda_{i}(\tau)C_{i}]
Vi[1(1xi(t)Di(t)+yi(τ)Si(τ)Di(t)+Si(τ))2gilocal].\displaystyle-V\sum_{i\in\mathcal{I}}[1-(1-\frac{x_{i}(t)D_{i}(t)+y_{i}(\tau)S_{i}(\tau)}{D_{i}(t)+S_{i}(\tau)})^{2}-g_{i}^{local}].

Evidently, (43) and (46) are constants, and (44) and (45) are linear, meaning that all derived partial derivatives are L-Lipschitz continuous according to [40], and hence the BCD-based algorithm converges within a finite number of iterations.

Then, to prove the convergence of the alternating algorithm, we scale the objective function of 𝒫3\mathcal{P}_{3} as

objl1(𝒫3)=obj(𝒫3)(𝒂l1,𝒙l1,𝒚l1,𝒃l1,𝒇l1,𝒛l1)\displaystyle obj^{l-1}(\mathcal{P}_{3})=obj(\mathcal{P}_{3})(\bm{a}^{l-1},\bm{x}^{l-1},\bm{y}^{l-1},\bm{b}^{l-1},\bm{f}^{l-1},\bm{z}^{l-1})
obj(𝒫3)(𝒂l,𝒙l,𝒚l1,𝒃l1,𝒇l1,𝒛l1)\displaystyle\geq obj(\mathcal{P}_{3})(\bm{a}^{l},\bm{x}^{l},\bm{y}^{l-1},\bm{b}^{l-1},\bm{f}^{l-1},\bm{z}^{l-1}) (47)
obj(𝒫3)(𝒂l,𝒙l,𝒚l,𝒃l,𝒇l,𝒛l)=objl(𝒫3),\displaystyle\geq obj(\mathcal{P}_{3})(\bm{a}^{l},\bm{x}^{l},\bm{y}^{l},\bm{b}^{l},\bm{f}^{l},\bm{z}^{l})=obj^{l}(\mathcal{P}_{3}),

where ll is the number of alternations. The first inequality holds due to the sub-optimality of {𝒂l,𝒙l}\{\bm{a}^{l},\bm{x}^{l}\} given by the PME-based algorithm, and the second inequality holds because of the sub-optimality of {𝒚l,𝒃l,𝒇l,𝒛l}\{\bm{y}^{l},\bm{b}^{l},\bm{f}^{l},\bm{z}^{l}\} given by the BCD-based algorithm. This indicates that the objective function of 𝒫3\mathcal{P}_{3} decreases monotonically over the alternating process; since it is lower-bounded by VKI-VKI (i.e., by setting the two queue backlogs to 0 and all Ai(τ)A_{i}(\tau) to 11), the alternation terminates within a finite number of rounds. ∎

Theorem 4

The computational complexity of the proposed TACO approach is O(TRmax((I+M)log2(I+M)+I3M3N3+IM2.055+4WmaxI3+22.055I))O(TR^{max}((I+M)log_{2}(I+M)+I^{3}M^{3}N^{3}+IM^{2.055}+4W^{max}I^{3}+2^{2.055}I)), where II is the number of PTs, MM is the number of ESs, NN is the number of partitions in the PME-based algorithm, TT is the number of time frames, RmaxR^{max} is the number of iterations in the alternating algorithm between two timescales, and WmaxW^{max} is the number of iterations in the BCD-based algorithm.

Proof:

The complexity of TACO mainly depends on the alternating algorithm, which integrates the PME-based and BCD-based algorithms.

For the PME-based algorithm, as stated in [41], the computational complexity for obtaining an initial feasible solution is O((I+M)log2(I+M))O((I+M)log_{2}(I+M)), and that of solving (42) is O(I3M3N3)O(I^{3}M^{3}N^{3}) with the interior point method [42] in the CVX solver. Besides, the linear relaxation and linear programming for solving ai,m(t)a_{i,m}(t) has an asymptotic computational complexity of O(IM2.055)O(IM^{2.055}) [43]. For the BCD-based algorithm, the total computational complexity for solving all subproblems is O(4I3)O(4I^{3}) with the interior point method [42] in the CVX solver, and that of the linear relaxation and linear programming for solving zi(τ)z_{i}(\tau) is O(22.055I)O(2^{2.055}I).

To sum up, the computational complexity of TACO can be expressed as O(TRmax((I+M)log2(I+M)+I3M3N3+IM2.055+4WmaxI3+22.055I))O(TR^{max}((I+M)log_{2}(I+M)+I^{3}M^{3}N^{3}+IM^{2.055}+4W^{max}I^{3}+2^{2.055}I)). ∎
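To see which term drives the bound in Theorem 4 in practice, one can evaluate the per-frame terms for sample sizes (the values of N and W^max below are assumed for illustration; I and M match the simulation settings of Section 5.1):

```python
from math import log2

# Per-frame complexity terms of Theorem 4; for even moderate partition
# numbers N, the PME term I^3 * M^3 * N^3 dominates all the others.

def per_frame_terms(I, M, N, W_max):
    return {
        "init": (I + M) * log2(I + M),   # initial feasible solution [41]
        "pme":  I ** 3 * M ** 3 * N ** 3,  # solving (42) via interior point
        "lp_a": I * M ** 2.055,            # LP rounding of a_{i,m}(t)
        "bcd":  4 * W_max * I ** 3,        # four convex subproblems per BCD pass
        "lp_z": 2 ** 2.055 * I,            # LP rounding of z_i(τ)
    }

terms = per_frame_terms(I=40, M=10, N=4, W_max=20)
dominant = max(terms, key=terms.get)   # → "pme"
```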

Theorem 5

Given Lyapunov control parameter VV, the optimality gap between the solution obtained by the proposed TACO approach and the theoretically optimal solution to problem 𝒫1\mathcal{P}_{1} can be expressed as

t𝒯τ𝒯t𝔼[iAi(τ)𝚯(τ)]/KT𝒪G/V+(Λ+Γ)/VT,\sum_{t\in\mathcal{T}}\sum_{\tau\in\mathcal{T}_{t}}\mathbb{E}[\sum_{i\in\mathcal{I}}A_{i}(\tau)\mid\bm{\Theta}(\tau)]/KT-\mathcal{O}\leq G/V+(\Lambda+\Gamma)/VT, (48)

where 𝒪\mathcal{O} stands for the theoretically optimal solution, Λ\Lambda is the optimality gap of the PME-based algorithm, Γ\Gamma represents the optimality gap of the BCD-based algorithm, and GG is defined in (27).

Proof:

First, the drift-plus-penalty inequality derived in Section 4.2 can be expanded and rewritten as

τ𝒯tΔ(𝚯(τ))Vτ𝒯t𝔼[iAi(τ)𝚯(τ)]\displaystyle\sum_{\tau\in\mathcal{T}_{t}}\Delta(\bm{\Theta}(\tau))-V\cdot\sum_{\tau\in\mathcal{T}_{t}}\mathbb{E}[\sum_{i\in\mathcal{I}}A_{i}(\tau)\mid\bm{\Theta}(\tau)] (49)
GK+τ𝒯t{i𝔼[Hi(τ)[Titol(τ)Timax]𝚯(τ)]\displaystyle\leq GK+\sum_{\tau\in\mathcal{T}_{t}}\{\sum_{i\in\mathcal{I}}\mathbb{E}[H_{i}(\tau)[T_{i}^{tol}(\tau)-T_{i}^{max}]\mid\bm{\Theta}(\tau)]
+𝔼{E(τ)[Etol(τ)Emax]𝚯(τ)}\displaystyle+\mathbb{E}\{E(\tau)\cdot[E^{tol}(\tau)-E^{max}]\mid\bm{\Theta}(\tau)\}
V𝔼[iAi(τ)𝚯(τ)]}\displaystyle-V\cdot\mathbb{E}[\sum_{i\in\mathcal{I}}A_{i}(\tau)\mid\bm{\Theta}(\tau)]\}
GK+VK𝒪+Λ+Γ.\displaystyle\leq GK+V\cdot K\mathcal{O}+\Lambda+\Gamma.

Then, by summing up (49) over TT time frames, we have

(G+V𝒪+1K(Λ+Γ))KT\displaystyle(G+V\cdot\mathcal{O}+\frac{1}{K}(\Lambda+\Gamma))\cdot KT (50)
tT{τ𝒯tΔ(𝚯(τ))Vτ𝒯t𝔼[iAi(τ)𝚯(τ)]}\displaystyle\geq\sum_{t\in T}\{\sum_{\tau\in\mathcal{T}_{t}}\Delta(\bm{\Theta}(\tau))-V\cdot\sum_{\tau\in\mathcal{T}_{t}}\mathbb{E}[\sum_{i\in\mathcal{I}}A_{i}(\tau)\mid\bm{\Theta}(\tau)]\}
=𝔼[L(𝚯(KT))L(𝚯(0))]\displaystyle=\mathbb{E}[L(\bm{\Theta}(KT))-L(\bm{\Theta}(0))]
+VtTτ𝒯t𝔼[iAi(τ)𝚯(τ)]\displaystyle+V\cdot\sum_{t\in T}\sum_{\tau\in\mathcal{T}_{t}}\mathbb{E}[\sum_{i\in\mathcal{I}}A_{i}(\tau)\mid\bm{\Theta}(\tau)]
=𝔼[L(𝚯(KT))]𝔼[L(𝚯(0))]\displaystyle=\mathbb{E}[L(\bm{\Theta}(KT))]-\mathbb{E}[L(\bm{\Theta}(0))]
+VtTτ𝒯t𝔼[iAi(τ)𝚯(τ)]\displaystyle+V\cdot\sum_{t\in T}\sum_{\tau\in\mathcal{T}_{t}}\mathbb{E}[\sum_{i\in\mathcal{I}}A_{i}(\tau)\mid\bm{\Theta}(\tau)]

Afterwards, by moving 𝔼[L(𝚯(0))]\mathbb{E}[L(\bm{\Theta}(0))] to the left-hand-side of (50), and then dividing both sides by TVTV, inequality (48) can be obtained.

Next, we further analyze the optimality gaps Λ\Lambda and Γ\Gamma as follows. For Λ\Lambda, according to [36], its value is theoretically bounded within the range Λ[0,0.12]\Lambda\in[0,0.12]. For Γ\Gamma, by letting 𝒫~5\tilde{\mathcal{P}}_{5} and 𝒫5\mathcal{P}_{5}^{*} denote the objective values achieved by the BCD-based algorithm and the optimal solution, respectively, and taking the difference between them, we have

𝒫~5𝒫5\displaystyle\tilde{\mathcal{P}}_{5}-\mathcal{P}_{5}^{*}
Γ\displaystyle\leq\Gamma
=iHi(τ){z~i(τ)m(Si(τ)+λi(τ)+ai,m(t)Si(τ)Cm)\displaystyle=\sum_{i\in\mathcal{I}}H_{i}(\tau)\{\tilde{z}_{i}(\tau)\sum_{m\in\mathcal{M}}(S_{i}(\tau)+\lambda_{i}(\tau)+a_{i,m}(t)S_{i}(\tau)C_{m})
+zi(τ)ai,m(t)Si(τ)Cm}+E(τ)im[z~i(τ)\displaystyle+z_{i}^{*}(\tau)a_{i,m}(t)S_{i}(\tau)C_{m}\}+E(\tau)\sum_{i\in\mathcal{I}}\sum_{m\in\mathcal{M}}[\tilde{z}_{i}(\tau)
[ai,m(t)ρm(Fm)2Si(τ)Cm+Si(τ)pi+zi(τ)Si(τ)pi]\displaystyle[a_{i,m}(t)\rho_{m}(F_{m})^{2}S_{i}(\tau)C_{m}+S_{i}(\tau)p_{i}+z_{i}^{*}(\tau)S_{i}(\tau)p_{i}]
+Vi(Ai(τ)+A~i(τ))=iHi(τ)(z~i(τ)ωi(τ)+zi(τ)ϕi(τ))\displaystyle+V\sum_{i\in\mathcal{I}}(A_{i}^{*}(\tau)+\tilde{A}_{i}(\tau))=\sum_{i\in\mathcal{I}}H_{i}(\tau)(\tilde{z}_{i}(\tau)\omega_{i}(\tau)+z_{i}^{*}(\tau)\phi_{i}(\tau))
+E(τ)im[z~i(τ)ωi(τ)+zi(τ)ϕi(τ)]\displaystyle+E(\tau)\sum_{i\in\mathcal{I}}\sum_{m\in\mathcal{M}}[\tilde{z}_{i}(\tau)\omega_{i}^{\prime}(\tau)+z_{i}^{*}(\tau)\phi_{i}^{\prime}(\tau)]
+Vi(Ai(τ)+A~i(τ)),\displaystyle+V\sum_{i\in\mathcal{I}}(A_{i}^{*}(\tau)+\tilde{A}_{i}(\tau)), (51)

where ωi(τ)=m(Si(τ)+λi(τ)+ai,m(t)Si(τ)Cm)\omega_{i}(\tau)=\sum_{m\in\mathcal{M}}(S_{i}(\tau)+\lambda_{i}(\tau)+a_{i,m}(t)S_{i}(\tau)C_{m}), ϕi(τ)=ai,m(t)Si(τ)Cm\phi_{i}(\tau)=a_{i,m}(t)S_{i}(\tau)C_{m}, ωi(τ)=ai,m(t)ρm(Fm)2Si(τ)Cm\omega_{i}^{\prime}(\tau)=a_{i,m}(t)\rho_{m}(F_{m})^{2}S_{i}(\tau)C_{m} and ϕi(τ)=Si(τ)pi\phi_{i}^{\prime}(\tau)=S_{i}(\tau)p_{i}, all of which are constants in each time slot τ𝒯t\tau\in\mathcal{T}_{t}. ∎

5 Simulation Results

In this section, simulations are conducted to evaluate the performance of the proposed TACO approach for jointly optimizing the HDT deployment (i.e., generic placement and customized update of the VT model) and task offloading in an end-edge-cloud collaborative framework. All simulation results are obtained based on real-world datasets (including a human activity dataset [44] and an ES distribution dataset [45]), by taking averages over 1000 runs under various parameter settings.

5.1 Simulation Settings

Consider an HDT system in a 1km×1km1km\times 1km square area with M=10M=10 ESs and I=40I=40 PTs (following a Random-Waypoint (RWP) mobility model under the same settings as those in [46]). According to [31], for the HDT-assisted complex task execution, the average accuracies of edge execution and local execution for any PT ii’s tasks are approximated as giedge(di(τ))=1[1di(τ)/(Di(t)+Si(τ))]2g_{i}^{edge}(d_{i}(\tau))=1-[1-d_{i}(\tau)/(D_{i}(t)+S_{i}(\tau))]^{2} (which is a function of the total data size for PT ii’s corresponding VT construction at the edge, i.e., di(τ)d_{i}(\tau)) and gilocal=0.5g_{i}^{local}=0.5, respectively. Table II lists the values of the main simulation parameters, most of which have also been widely employed in the literature [11], [12], [20]. Furthermore, to show the superiority of the proposed TACO approach, the following schemes are simulated as benchmarks. Note that, since the original objectives of these benchmark schemes are different from ours, for the fairness of comparison, we have modified them to adapt to the considered settings and particularly changed their optimization objectives to align with ours.

  • LOT [20]: Generic VT model placement from the cloud and PTs’ task offloading are jointly determined by adopting a contract theory-based incentive mechanism. However, this scheme ignores the customized VT update by collecting personalized data from end devices, and it optimizes all decisions synchronously in a single-timescale only.

  • CRO [10]: Both generic VT model placement from the cloud and customized VT update from end devices along with PTs’ task offloading are jointly determined by adopting a double auction based optimization, while all decisions are optimized in a single-timescale for simplicity.

TABLE II: Simulation Parameters
Parameter Value Parameter Value
pip_{i} 500 mW BmB_{m} 5 MHz
pcp^{c} 5 W FmF_{m} 20 GHz
θ\theta 4 Cm,CiC_{m},C_{i} 300 cycles/bit
N0N_{0} -174 dBm/Hz ρm,ρi\rho_{m},\rho_{i} 102710^{-27}
Si(τ)S_{i}(\tau) [6.1, 12.2] Mbits rcr^{c} 50 Mbps
Di(t)D_{i}(t) [73.2, 97.6] Mbits KK 10
λi(τ)\lambda_{i}(\tau) [10, 20] Mbits MM 10
FiF_{i} 1 GHz II 40
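The accuracy model described above can be transcribed directly; the sample values of D_i(t) and S_i(τ) below are illustrative picks from within the ranges in Table II:

```python
# Accuracy models used in the simulations (Section 5.1): edge-execution
# accuracy grows with the total data size d used for VT construction,
# reaching 1 when the full generic + personalized data (D + S) is used,
# while local execution is fixed at 0.5.

def g_edge(d, D, S):
    """1 - [1 - d/(D + S)]^2, for 0 <= d <= D + S (all in Mbits)."""
    return 1 - (1 - d / (D + S)) ** 2

G_LOCAL = 0.5

# With D = 80 and S = 10 Mbits, d = 0 gives accuracy 0, d = 45 gives 0.75
# (already above G_LOCAL), and d = 90 gives 1.0.
```

The concavity of g_edge means early data contributes more accuracy per bit than later data, which is why partial VT construction can already beat local execution.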

5.2 Performance Evaluations

Refer to caption
Figure 4: Convergence of the proposed TACO approach.

Fig. 4 examines the convergence of the proposed TACO approach in solving problem 𝒫1\mathcal{P}_{1} by showing the average HDT-assisted task execution accuracy under different values of parameter VV ranging from 10610^{6} to 8×1068\times 10^{6}. The timeline is divided into T=200T=200 time frames, and each frame has K=10K=10 time slots. It is shown that, for all three cases, within 5050 time frames, the task execution accuracy decreases at the beginning but quickly converges over time, which verifies the convergence property of TACO in solving problem 𝒫1\mathcal{P}_{1}. Besides, from this figure, it can also be observed that the average task execution accuracy increases with VV. The reason is that the Lyapunov parameter VV controls the relative weight placed on maximizing the HDT-assisted task execution accuracy versus enforcing the delay and energy consumption constraints, and a larger VV indicates more emphasis on the task execution accuracy.

Refer to caption
(a) Queue backlogs w.r.t VV.
Refer to caption
(b) Queue backlogs w.r.t KK.
Figure 5: Stability of two queue backlogs by varying VV and KK.
Refer to caption
Figure 6: Comparison on updating delay of customized VT models with different bandwidth resources of ESs BmB_{m}.
Refer to caption
Figure 7: Comparison on placement delay of generic VT models with different transmission rate rcr^{c}.

Fig. 5 demonstrates the stability of the proposed TACO approach by showing the performance of the delay overflow queue backlog Hi(τ)H_{i}(\tau) and the energy consumption deficit queue backlog E(τ)E(\tau) brought by the Lyapunov decomposition under varying VV and KK. From this figure, we can see that the two queue backlogs decrease and quickly stabilize over time, because TACO focuses on controlling the system costs (i.e., service response delay and system energy consumption) to minimize the objective function of 𝒫3\mathcal{P}_{3}, thereby shrinking both queue backlogs, and can eventually achieve a good balance between the task execution accuracy and system costs, leading to a stable outcome. Besides, in Fig. 5(a), it is intuitive that both queue backlogs stabilize at higher values as VV increases, since more emphasis is placed on maximizing the task execution accuracy, resulting in the growth of delay and energy consumption. Meanwhile, Fig. 5(b) shows that both queue backlogs stabilize at lower values as KK increases. This is because a larger value of KK means that generic VT models are placed less frequently, which greatly reduces the generic VT model placement delay (i.e., Ti,mdl(τ)+Ti,mpl(τ)T_{i,m}^{dl}(\tau)+T_{i,m}^{pl}(\tau)) and energy consumption (i.e., Ei,mdl(τ)+Ei,mpl(τ)E_{i,m}^{dl}(\tau)+E_{i,m}^{pl}(\tau)).
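The stabilizing behavior in Fig. 5 follows the standard Lyapunov virtual-queue dynamics; a minimal sketch (illustrative of the mechanism, not the paper's exact update rule) is:

```python
# Virtual-queue sketch: the delay-overflow backlog H grows by the per-slot
# constraint violation T_tol(τ) - T_max and is truncated at zero, so it
# rises while the delay constraint is violated and drains back to zero
# once the controller keeps the long-term delay below T_max.

def queue_trajectory(violations, H0=0.0):
    """violations[τ] = T_tol(τ) - T_max; returns the backlog over time."""
    H, traj = H0, []
    for v in violations:
        H = max(H + v, 0.0)
        traj.append(H)
    return traj
```

Feeding a burst of violations followed by sustained slack reproduces the rise-then-stabilize shape seen in the figure.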

Refer to caption
Figure 8: Comparison on the service response delay with different numbers of PTs II.
Refer to caption
Figure 9: Comparison on the system energy consumption with different numbers of PTs II.
Refer to caption
(a) Accuracy w.r.t BmB_{m}.
Refer to caption
(b) Accuracy w.r.t rcr^{c}.
Refer to caption
(c) Accuracy w.r.t II.
Figure 10: Comparison on the average task execution accuracy by varying bandwidth resources of ESs BmB_{m}, transmission rate rcr^{c} and numbers of PTs II.

Fig. 6 illustrates the average updating delay of customized VT models (i.e., t𝒯τ𝒯ti(Ti,mul(τ)+Ti,mud(τ))/TKI\sum_{t\in\mathcal{T}}\sum_{\tau\in\mathcal{T}_{t}}\sum_{i\in\mathcal{I}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau))/TKI) with different ES bandwidth BmB_{m} under LOT, CRO and the proposed TACO approach. Since LOT ignores the customized VT model update, meaning that PTs’ personalized data does not need to be uploaded and processed, its average updating delay remains zero regardless of the ES bandwidth BmB_{m}. In contrast, the average updating delay decreases with the increase of BmB_{m} for both CRO and the proposed TACO approach due to the reduction in the uploading delay Ti,mul(τ)T_{i,m}^{ul}(\tau), as more uplink bandwidth resource is provided. Besides, we can also see that TACO outperforms CRO because, with the help of generic VT model placement, TACO needs to upload much less personalized data than CRO does.
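The bandwidth effect in Fig. 6 follows from the Shannon-rate model implied by (44): more bandwidth raises the uplink rate and thus lowers the uploading delay. The sketch below uses illustrative numbers (power, channel gain, distance and noise density are assumptions, roughly consistent with Table II), not the paper's exact settings:

```python
from math import log2

# Uplink delay T_ul = S / r with r = b*B*log2(1 + p|h|^2 / (dist^θ * N0 * b*B)),
# the rate form appearing inside (44). The Shannon rate is increasing in the
# allocated bandwidth b*B, so the delay decreases as B_m grows.

def uplink_delay(S_bits, b, B_hz, p_w, h_gain, dist, theta, N0_w_per_hz):
    bw = b * B_hz                                   # allocated bandwidth (Hz)
    snr = p_w * h_gain / (dist ** theta * N0_w_per_hz * bw)
    rate = bw * log2(1 + snr)                       # bits per second
    return S_bits / rate
```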

Fig. 7 shows the average placement delay of generic VT models (i.e., t𝒯τ𝒯ti(Ti,mdl(t)+Ti,mpl(t))/TKI\sum_{t\in\mathcal{T}}\sum_{\tau\in\mathcal{T}_{t}}\sum_{i\in\mathcal{I}}(T_{i,m}^{dl}(t)+T_{i,m}^{pl}(t))/TKI) with different cloud-ES transmission rates rcr^{c} under LOT, CRO and the proposed TACO approach. It is obvious that the placement delay decreases logarithmically as rcr^{c} increases for all schemes. Furthermore, from this figure, we can see that LOT has the worst performance because it ignores the customized model update and can only download as much experiential knowledge as possible to improve the average task execution accuracy, resulting in the highest average placement delay. In contrast, both CRO and TACO outperform LOT due to the introduction of the customized VT update, alleviating the burden on generic VT model placement. Moreover, the proposed TACO approach achieves the best performance (with the placement delay reduced by 14.3%14.3\% and 25.5%25.5\% on average compared to CRO and LOT, respectively). This is because TACO builds VTs in two timescales (with large-timescale generic model placement and small-timescale customized model update), which only requires downloading experiential knowledge at the beginning of each time frame, significantly reducing the total data size in the process of model placement.

Figs. 8 and 9 investigate the service response delay and system energy consumption achieved by LOT, CRO and the proposed TACO approach with different numbers of PTs II. Both figures show that the service response delay and system energy consumption increase exponentially with II for all schemes, because a larger II implies more demands for VT construction with a potentially larger amount of data competing for limited communication and computation resources. Besides, LOT performs worst in these two metrics because its task execution accuracy relies only on the experiential knowledge downloaded in VT construction, and thus it has to download a much larger amount of experiential knowledge to guarantee a desired task execution accuracy. Meanwhile, CRO and TACO outperform LOT, especially when II becomes larger, which reveals the necessity of the customized VT model update in addition to the generic VT model placement. Moreover, the proposed TACO approach achieves the best performance, for the same reason as explained for Fig. 7.

Fig. 10 compares the average task execution accuracy achieved by LOT, CRO and the proposed TACO approach by varying the ES bandwidth BmB_{m}, transmission rate rcr^{c} and number of PTs II. It is shown that, i) in Fig. 10(a), the average task execution accuracy increases with BmB_{m} for all schemes except LOT; ii) in Fig. 10(b), the average task execution accuracy increases with rcr^{c} for all three schemes; and iii) in Fig. 10(c), the average task execution accuracy decreases with II for all three schemes. The main reason is that, when more resource is supplied (i.e., BmB_{m} and rcr^{c} are large) or the total resource demand is less competitive (i.e., II is small), more experiential knowledge and personalized data can be transmitted for dynamic VT construction and more tasks can be offloaded for HDT-assisted edge execution, all contributing to the enhancement of the task execution accuracy. Furthermore, from this figure, we can see that LOT has the worst performance because it only considers the generic VT model placement, while CRO is better than LOT thanks to the joint consideration of both the generic VT model placement and the customized VT model update. Moreover, TACO achieves the best performance because it further strikes a balance between generic VT model placement and customized VT model update by conducting these two processes asynchronously in two different timescales.

6 Conclusion

In this paper, the optimization of HDT deployment at the network edge has been studied. Particularly, aiming to maximize the accuracy of complex task execution assisted by HDT under various system uncertainties (e.g., random mobility and status variations), a two-timescale online optimization problem has been formulated for jointly determining VTs’ construction (including dynamic generic model placement and customized model update) and PTs’ task offloading, together with the management of access selection and the corresponding communication and computation resource allocations. A novel solution approach, called TACO, has been proposed, which decomposes the online problem into a series of deterministic ones and then leverages PME-based and BCD-based algorithms for solving different subproblems in different timescales alternately. Theoretical analyses and simulations show that TACO is superior to its counterparts in the optimization of HDT deployment at the edge, not only improving the HDT-assisted task execution accuracy but also reducing the service response delay and the overall system energy consumption.

References

  • [1] S. D. Okegbile, J. Cai, C. Yi, and D. Niyato, “Human digital twin for personalized healthcare: Vision, architecture and future directions,” IEEE Netw., 2022.
  • [2] Y. Lin, L. Chen, A. Ali, C. Nugent, C. Ian, R. Li, D. Gao, H. Wang, Y. Wang, and H. Ning, “Human digital twin: A survey,” arXiv preprint arXiv:2212.05937, 2022.
  • [3] P. Thamotharan, S. Srinivasan et al., “Human digital twin for personalized elderly type 2 diabetes management,” J. Clin. Med., vol. 12, no. 6, p. 2094, 2023.
  • [4] B. Björnsson, C. Borrebaeck et al., “Digital twins to personalize medicine,” Genome Med., vol. 12, pp. 1–4, 2020.
  • [5] J. Chen, C. Yi et al., “Networking architecture and key supporting technologies for human digital twin in personalized healthcare: A comprehensive survey,” IEEE Commun. Surv. Tutor., 2023.
  • [6] S. D. Okegbile and J. Cai, “Edge-assisted human-to-virtual twin connectivity scheme for human digital twin frameworks,” in Proc. IEEE VTC, 2022, pp. 1–6.
  • [7] J. Ren, D. Zhang, S. He et al., “A survey on end-edge-cloud orchestrated network computing paradigms: Transparent computing, mobile edge computing, fog computing, and cloudlet,” ACM Comput. Surv. (CSUR), vol. 52, no. 6, pp. 1–36, 2019.
  • [8] P. Jia, X. Wang, and X. Shen, “Accurate and efficient digital twin construction using concurrent end-to-end synchronization and multi-attribute data resampling,” IEEE Internet Things J., vol. 10, no. 6, pp. 4857–4870, 2022.
  • [9] R. Dong, C. She et al., “Deep learning for hybrid 5G services in mobile edge computing systems: Learn from a digital twin,” IEEE Trans. Wirel. Commun., vol. 18, no. 10, pp. 4692–4707, 2019.
  • [10] Y. Zhang, H. Zhang, Y. Lu, W. Sun, L. Wei, Y. Zhang, and B. Wang, “Adaptive digital twin placement and transfer in wireless computing power network,” IEEE Internet Things J., pp. 1–1, 2023.
  • [11] J. Wang, J. Hu et al., “Online service migration in mobile edge with incomplete system information: A deep recurrent actor-critic learning approach,” IEEE Trans. Mob. Comput., 2022.
  • [12] T. Ouyang, Z. Zhou et al., “Follow me at the edge: Mobility-aware dynamic service placement for mobile edge computing,” IEEE J. Sel. Areas Commun., vol. 36, no. 10, pp. 2333–2345, 2018.
  • [13] D. Harutyunyan, N. Shahriar et al., “Latency and mobility–aware service function chain placement in 5G networks,” IEEE Trans. Mob. Comput., vol. 21, no. 5, pp. 1697–1709, 2020.
  • [14] B. Wang, H. Zhou et al., “Human digital twin in the context of Industry 5.0,” Robot Comput. Integr. Manuf., vol. 85, p. 102626, 2024.
  • [15] M. Lauer-Schmaltz, P. Cash et al., “Designing human digital twins for behaviour-changing therapy and rehabilitation: a systematic review,” Proc. Design Soc., vol. 2, pp. 1303–1312, 2022.
  • [16] Y. Shi, C. Yi, R. Wang, Q. Wu, B. Chen, and J. Cai, “Service migration or task rerouting: A two-timescale online resource optimization for MEC,” IEEE Trans. Wirel. Commun., 2023.
  • [17] D. Lee, J. Cho, and J. Kim, “Meta-human synchronization framework for large-scale digital twin,” in Proc. IEEE MetaCom, 2023, pp. 738–741.
  • [18] R. Zhong, B. Hu et al., “Construction of human digital twin model based on multimodal data and its application in locomotion mode identification,” Chin. J. Mech. Eng., vol. 36, no. 1, p. 126, 2023.
  • [19] Y. Liu, L. Zhang et al., “A novel cloud-based framework for the elderly healthcare services using digital twin,” IEEE Access, vol. 7, pp. 49 088–49 101, 2019.
  • [20] X. Lin, J. Wu, J. Li, W. Yang, and M. Guizani, “Stochastic digital-twin service demand with edge response: An incentive-based congestion control approach,” IEEE Trans. Mob. Comput., 2021.
  • [21] Y. Ding, K. Li, C. Liu, and K. Li, “A potential game theoretic approach to computation offloading strategy optimization in end-edge-cloud computing,” IEEE Trans. Parallel Distrib. Syst., vol. 33, no. 6, pp. 1503–1519, 2021.
  • [22] Y. Shi, C. Yi, B. Chen, C. Yang, K. Zhu, and J. Cai, “Joint online optimization of data sampling rate and preprocessing mode for edge–cloud collaboration-enabled industrial IoT,” IEEE Internet Things J., vol. 9, no. 17, pp. 16 402–16 417, 2022.
  • [23] Y. He, Y. Ren et al., “Two-timescale resource allocation for automated networks in IIoT,” IEEE Trans. Wirel. Commun., vol. 21, no. 10, pp. 7881–7896, 2022.
  • [24] H. Ma, Z. Zhou, and X. Chen, “Leveraging the power of prediction: Predictive service placement for latency-sensitive mobile edge computing,” IEEE Trans. Wirel. Commun., vol. 19, no. 10, pp. 6454–6468, 2020.
  • [25] R. Zhou, X. Wu, H. Tan, and R. Zhang, “Two time-scale joint service caching and task offloading for UAV-assisted mobile edge computing,” in Proc. IEEE INFOCOM, 2022, pp. 1189–1198.
  • [26] W. Shengli, “Is human digital twin possible?” Comput. Methods and Programs in Biomed. Update, vol. 1, p. 100014, 2021.
  • [27] J. Xie, R. Yang et al., “Dual digital twin: Cloud–edge collaboration with lyapunov-based incremental learning in EV batteries,” Appl. Energy, vol. 355, p. 122237, 2024.
  • [28] M. Mohammadi, H. A. Suraweera, and C. Tellambura, “Uplink/downlink rate analysis and impact of power allocation for full-duplex cloud-RANs,” IEEE Trans. Wirel. Commun., vol. 17, no. 9, pp. 5774–5788, 2018.
  • [29] C. Yi, J. Cai et al., “A queueing game based management framework for fog computing with strategic computing speed control,” IEEE Trans. Mob. Comput., vol. 21, no. 5, pp. 1537–1551, 2020.
  • [30] A. Goldsmith, Wireless communications.   Cambridge university press, 2005.
  • [31] W. Wu, P. Yang et al., “Accuracy-guaranteed collaborative DNN inference in industrial IoT via deep reinforcement learning,” IEEE Trans. Ind. Inform., vol. 17, no. 7, pp. 4988–4998, 2020.
  • [32] K.-C. Wu, W.-Y. Liu, and S.-Y. Wu, “Dynamic deployment and cost-sensitive provisioning for elastic mobile cloud services,” IEEE Trans. Mob. Comput., vol. 17, no. 6, pp. 1326–1338, 2017.
  • [33] C. Yi, J. Cai et al., “Workload re-allocation for edge computing with server collaboration: A cooperative queueing game approach,” IEEE Trans. Mob. Comput., 2021.
  • [34] M. Neely, Stochastic network optimization with application to communication and queueing systems.   Springer Nature, 2022.
  • [35] L. Georgiadis, M. J. Neely, L. Tassiulas et al., “Resource allocation and cross-layer control in wireless networks,” Found. Trends® Netw., vol. 1, no. 1, pp. 1–144, 2006.
  • [36] P. M. Castro, “Tightening piecewise McCormick relaxations for bilinear problems,” Comput. Chem. Eng., vol. 72, pp. 300–311, 2015.
  • [37] R. Karuppiah and I. E. Grossmann, “Global optimization for the synthesis of integrated water systems in chemical processes,” Comput. Chem. Eng., vol. 30, no. 4, pp. 650–673, 2006.
  • [38] C. A. Meyer and C. A. Floudas, “Global optimization of a combinatorially complex generalized pooling problem,” AIChE J., vol. 52, no. 3, pp. 1027–1037, 2006.
  • [39] C. Bliek, P. Bonami, and A. Lodi, “Solving mixed-integer quadratic programming problems with IBM-CPLEX: a progress report,” in Proc. RAMP Symp., 2014, pp. 16–17.
  • [40] J. Bolte, S. Sabach, and M. Teboulle, “Proximal alternating linearized minimization for nonconvex and nonsmooth problems,” Math. Program., vol. 146, no. 1-2, pp. 459–494, 2014.
  • [41] T. P. Peixoto, “Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models,” Phys. Rev. E, vol. 89, no. 1, p. 012804, 2014.
  • [42] S. P. Boyd and L. Vandenberghe, Convex Optimization.   Cambridge University Press, 2004.
  • [43] A. Srinivasan, “Approximation algorithms via randomized rounding: a survey,” Adv. Topics in Math., PWN, pp. 9–71, 1999.
  • [44] M. Zhang and A. A. Sawchuk, “USC-HAD: A daily activity dataset for ubiquitous activity recognition using wearable sensors,” in Proc. ACM SAGAware, Pittsburgh, Pennsylvania, USA, September 2012.
  • [45] S. Telecom, “The distribution of 3233 base stations,” 2019.
  • [46] D. B. Johnson and D. A. Maltz, “Dynamic source routing in ad hoc wireless networks,” Mobile Comput., pp. 153–181, 1996.

Appendix A

By squaring both sides of the delay overflow queue update in (22), we have

\displaystyle H_{i}^{2}(\tau+1)=\{[H_{i}(\tau)+T_{i}^{tol}(\tau)-T_{i}^{max}]^{+}\}^{2} (52)
\displaystyle\leq H_{i}^{2}(\tau)+(T_{i}^{tol}(\tau)-T_{i}^{max})^{2}+2H_{i}(\tau)(T_{i}^{tol}(\tau)-T_{i}^{max}).

By subtracting H_{i}^{2}(\tau) from both sides, dividing by 2, and summing the resulting inequalities over i\in\mathcal{I}, we have

\displaystyle\frac{1}{2}\sum_{i\in\mathcal{I}}(H_{i}^{2}(\tau+1)-H_{i}^{2}(\tau)) (53)
\displaystyle\leq\frac{1}{2}\sum_{i\in\mathcal{I}}(T_{i}^{tol}(\tau)-T_{i}^{max})^{2}+\sum_{i\in\mathcal{I}}H_{i}(\tau)(T_{i}^{tol}(\tau)-T_{i}^{max}).

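The steps above rest on the elementary bound ([a]^{+})^{2}\leq a^{2} applied to the queue update. As a quick numerical sanity check (the sampled values below are arbitrary placeholders, not the paper's simulation settings), inequality (52) can be verified pointwise:

```python
import random

def drift_bound_holds(H, T_tol, T_max, eps=1e-9):
    """Check inequality (52): the squared queue update H_i^2(tau+1)
    never exceeds its quadratic upper bound."""
    lhs = max(H + T_tol - T_max, 0.0) ** 2               # H_i^2(tau+1)
    rhs = H**2 + (T_tol - T_max) ** 2 + 2 * H * (T_tol - T_max)
    return lhs <= rhs + eps

random.seed(0)
print(all(drift_bound_holds(random.uniform(0, 10),     # backlog H_i(tau)
                            random.uniform(0, 5),      # experienced delay T_i^tol(tau)
                            random.uniform(0, 5))      # requirement T_i^max
          for _ in range(1000)))  # True
```

The bound is tight whenever the projection [\cdot]^{+} is inactive, i.e., when H_{i}(\tau)+T_{i}^{tol}(\tau)-T_{i}^{max}\geq 0.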
Similarly, by squaring both sides of the energy deficit queue update in (23), we have

\displaystyle E^{2}(\tau+1)=\{[E(\tau)+E^{tol}(\tau)-E^{max}]^{+}\}^{2} (54)
\displaystyle\leq E^{2}(\tau)+(E^{tol}(\tau)-E^{max})^{2}+2E(\tau)(E^{tol}(\tau)-E^{max}).

By subtracting E^{2}(\tau) from both sides and dividing by 2, we have

\displaystyle\frac{1}{2}(E^{2}(\tau+1)-E^{2}(\tau)) (55)
\displaystyle\leq\frac{1}{2}(E^{tol}(\tau)-E^{max})^{2}+E(\tau)(E^{tol}(\tau)-E^{max}).

Combining (53) and (55), we have

\displaystyle L(\bm{\Theta}(\tau+1))-L(\bm{\Theta}(\tau)) (56)
\displaystyle=\frac{1}{2}\sum_{i\in\mathcal{I}}(H_{i}^{2}(\tau+1)-H_{i}^{2}(\tau))+\frac{1}{2}(E^{2}(\tau+1)-E^{2}(\tau))
\displaystyle\leq\frac{1}{2}\sum_{i\in\mathcal{I}}(T_{i}^{tol}(\max)-T_{i}^{max})^{2}+\frac{1}{2}(E^{tol}(\max)-E^{max})^{2}
\displaystyle+\sum_{i\in\mathcal{I}}H_{i}(\tau)(T_{i}^{tol}(\tau)-T_{i}^{max})+E(\tau)(E^{tol}(\tau)-E^{max}).

Lastly, by adding V\cdot\sum_{i\in\mathcal{I}}A_{i}(\tau) to both sides of (56) and taking the conditional expectation of both sides given \bm{\Theta}(\tau), inequality (27) can be derived.
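Because (53) and (55) hold for every sample path, the one-step drift bound can be checked by directly simulating the two virtual queues. The sketch below uses hypothetical queue and cost values (not the paper's settings) and evaluates the right-hand side at the sampled (T^{tol}, E^{tol}) rather than their maxima, which still upper-bounds the drift per sample:

```python
import random

def lyapunov(Hs, E):
    # L(Theta) = 1/2 * sum_i H_i^2 + 1/2 * E^2
    return 0.5 * sum(h * h for h in Hs) + 0.5 * E * E

def one_step_drift_holds(Hs, E, T_tol, T_max, E_tol, E_max, eps=1e-9):
    # queue updates (22)-(23): H <- [H + T_tol - T_max]^+, E <- [E + E_tol - E_max]^+
    Hs_next = [max(h + t - tm, 0.0) for h, t, tm in zip(Hs, T_tol, T_max)]
    E_next = max(E + E_tol - E_max, 0.0)
    drift = lyapunov(Hs_next, E_next) - lyapunov(Hs, E)
    bound = (0.5 * sum((t - tm) ** 2 for t, tm in zip(T_tol, T_max))
             + 0.5 * (E_tol - E_max) ** 2
             + sum(h * (t - tm) for h, t, tm in zip(Hs, T_tol, T_max))
             + E * (E_tol - E_max))
    return drift <= bound + eps

random.seed(1)
print(all(one_step_drift_holds(
              [random.uniform(0, 5) for _ in range(3)], random.uniform(0, 5),
              [random.uniform(0, 3) for _ in range(3)],
              [random.uniform(0, 3) for _ in range(3)],
              random.uniform(0, 3), random.uniform(0, 3))
          for _ in range(500)))  # True
```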

Appendix B

For subproblem \mathcal{P}_{5-1}, the Lagrangian function is given by

\displaystyle\mathcal{L}_{1}(\bm{y})=\sum_{i\in\mathcal{I}}H_{i}(\tau)z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau)) (57)
\displaystyle+E(\tau)\sum_{i\in\mathcal{I}}\sum_{m\in\mathcal{M}}z_{i}(\tau)(E_{i,m}^{ul}(\tau)+E_{i,m}^{ud}(\tau))-V\cdot\sum_{i\in\mathcal{I}}A_{i}(\tau).

By taking the first-order and second-order derivatives of (57), we have

\displaystyle\frac{\partial\mathcal{L}_{1}}{\partial y_{i}(\tau)}=\sum_{i\in\mathcal{I}}H_{i}(\tau)z_{i}(\tau)\sum_{m\in\mathcal{M}}(\frac{a_{i,m}(t)S_{i}(\tau)C_{m}}{f_{i}(\tau)F_{m}}+\frac{S_{i}(\tau)}{r_{i,m}(\tau)}) (58)
\displaystyle+E(\tau)\sum_{i\in\mathcal{I}}\sum_{m\in\mathcal{M}}(a_{i,m}(t)\rho_{m}(F_{m})^{2}S_{i}(\tau)C_{m}+\frac{S_{i}(\tau)p_{i}}{r_{i,m}(\tau)})-2V\sum_{i\in\mathcal{I}}\frac{z_{i}(\tau)S_{i}(\tau)}{D_{i}(t)+S_{i}(\tau)},
\displaystyle\frac{\partial^{2}\mathcal{L}_{1}}{\partial(y_{i}(\tau))^{2}}=2V\cdot\sum_{i\in\mathcal{I}}S_{i}(\tau)^{2}\geq 0. (59)

Since \frac{\partial^{2}\mathcal{L}_{1}}{\partial(y_{i}(\tau))^{2}}\geq 0, we can conclude that problem \mathcal{P}_{5-1} is convex.
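The nonnegative second derivative in (59) stems from the quadratic accuracy term 1-(1-\frac{x_{i}(t)D_{i}(t)+y_{i}(\tau)S_{i}(\tau)}{D_{i}(t)+S_{i}(\tau)})^{2}, which enters \mathcal{L}_{1} with a minus sign. A midpoint-convexity check over y_{i}(\tau)\in(0,1] illustrates this (the values of x_i, D_i, S_i below are placeholders, not taken from the paper):

```python
def neg_accuracy(y, x=0.3, D=4.0, S=2.0):
    """-A_i(y) under the quadratic accuracy model; only the y-dependent part matters."""
    q = (x * D + y * S) / (D + S)      # fraction of fresh data used for the VT
    return -(1.0 - (1.0 - q) ** 2)

# midpoint-convexity check on a grid of y in (0, 1]
ys = [k / 100 for k in range(1, 101)]
convex = all(neg_accuracy((a + b) / 2) <= (neg_accuracy(a) + neg_accuracy(b)) / 2 + 1e-12
             for a in ys for b in ys)
print(convex)  # True
```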

For subproblem \mathcal{P}_{5-2}, we denote \bm{\gamma}=\{\gamma_{i}(\tau),\forall i\in\mathcal{I},\forall\tau\in\mathcal{T}_{t}\} as the corresponding Lagrangian multipliers; the Lagrangian function can then be expressed as

\displaystyle\mathcal{L}_{2}(\bm{b},\bm{\gamma})=\sum_{i\in\mathcal{I}}H_{i}(\tau)z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ofld}(\tau)) (60)
\displaystyle+E(\tau)\sum_{i\in\mathcal{I}}\sum_{m\in\mathcal{M}}(E_{i,m}^{ul}(\tau)+E_{i,m}^{ofld}(\tau))+\sum_{i\in\mathcal{I}}\gamma_{i}(\tau)(a_{i,m}(t)b_{i}(\tau)-1).

Since 1/r_{i,m}(\tau) in (60) is monotonically decreasing and convex in b_{i}(\tau)\in(0,1], problem \mathcal{P}_{5-2} is convex.
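The convexity argument for \mathcal{P}_{5-2} reduces to the fact that the transmission delay scales as 1/b_{i}(\tau) when the achievable rate grows linearly with the allocated bandwidth fraction. A small numerical illustration (the rate constant r_unit is a placeholder, not a value from the paper):

```python
def t_ul(b, S=1.0, r_unit=2.0):
    """Uplink delay S / (b * r_unit): rate scales linearly with bandwidth fraction b."""
    return S / (b * r_unit)

# midpoint-convexity check on a grid of b in (0, 1]
bs = [k / 50 for k in range(1, 51)]
convex = all(t_ul((a + c) / 2) <= (t_ul(a) + t_ul(c)) / 2 + 1e-12
             for a in bs for c in bs)
print(convex)  # True
```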

For subproblem \mathcal{P}_{5-3}, we denote \bm{\eta}=\{\eta_{i}(\tau),\forall i\in\mathcal{I},\forall\tau\in\mathcal{T}_{t}\} as the corresponding Lagrangian multipliers; the Lagrangian function is then

\displaystyle\mathcal{L}_{3}(\bm{f},\bm{\eta})=\sum_{i\in\mathcal{I}}H_{i}(\tau)(\sum_{m\in\mathcal{M}}T_{i,m}^{pl}(t)/K+z_{i}(\tau)\sum_{m\in\mathcal{M}}T_{i,m}^{ud}(\tau)+T_{i}^{exec}(\tau)) (61)
\displaystyle+E(\tau)\sum_{i\in\mathcal{I}}(\sum_{m\in\mathcal{M}}E_{i,m}^{pl}(t)/K+\sum_{m\in\mathcal{M}}z_{i}(\tau)E_{i,m}^{ud}(\tau)+E_{i}^{exec}(\tau))+\sum_{i\in\mathcal{I}}\eta_{i}(\tau)(a_{i,m}(t)f_{i}(\tau)-1).

Taking the first-order and second-order derivatives yields

\displaystyle\frac{\partial\mathcal{L}_{3}}{\partial f_{i}(\tau)}=-\sum_{i\in\mathcal{I}}H_{i}(\tau)(\sum_{m\in\mathcal{M}}\frac{a_{i,m}(t)x_{i}(t)D_{i}(t)C_{m}}{(f_{i}(tK))^{2}F_{m}K} (62)
\displaystyle+z_{i}(\tau)\sum_{m\in\mathcal{M}}\frac{a_{i,m}(t)y_{i}(\tau)S_{i}(\tau)C_{m}}{(f_{i}(\tau))^{2}F_{m}}+\sum_{m\in\mathcal{M}}a_{i,m}(t)z_{i}(\tau)\frac{\lambda_{i}(\tau)C_{m}}{(f_{i}(\tau))^{2}F_{m}})+\sum_{i\in\mathcal{I}}\eta_{i}(\tau)a_{i,m}(t),
\displaystyle\frac{\partial^{2}\mathcal{L}_{3}}{\partial(f_{i}(\tau))^{2}}=2\sum_{i\in\mathcal{I}}H_{i}(\tau)\{\sum_{m\in\mathcal{M}}\frac{a_{i,m}(t)x_{i}(t)D_{i}(t)C_{m}}{(f_{i}(tK))^{3}F_{m}K} (63)
\displaystyle+z_{i}(\tau)\sum_{m\in\mathcal{M}}\frac{a_{i,m}(t)y_{i}(\tau)S_{i}(\tau)C_{m}}{(f_{i}(\tau))^{3}F_{m}}+\sum_{m\in\mathcal{M}}a_{i,m}(t)z_{i}(\tau)\frac{\lambda_{i}(\tau)C_{m}}{(f_{i}(\tau))^{3}F_{m}}\}.

Obviously, when f_{i}(\tau)\in(0,1], \frac{\partial^{2}\mathcal{L}_{3}}{\partial(f_{i}(\tau))^{2}}\geq 0, and thus problem \mathcal{P}_{5-3} is convex.
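Each term of \mathcal{L}_{3} depends on f_{i}(\tau) through a factor of the form C_{m}/(f_{i}(\tau)F_{m}), whose second derivative 2C_{m}/((f_{i}(\tau))^{3}F_{m}) is exactly what appears in (63). A finite-difference check (with a placeholder constant C_over_F standing in for C_m/F_m) confirms the formula:

```python
def term(f, C_over_F=1.0):
    """Representative delay term C_m / (f * F_m) appearing in L_3."""
    return C_over_F / f

def num_second_deriv(g, f, h=1e-4):
    """Central finite-difference approximation of g''(f)."""
    return (g(f + h) - 2 * g(f) + g(f - h)) / (h * h)

f = 0.5
analytic = 2.0 / f**3            # second derivative of 1/f, matching (63)
numeric = num_second_deriv(term, f)
print(abs(numeric - analytic) < 1e-3)  # True
```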

For subproblem \mathcal{P}_{5-4}, the Lagrangian function is given by

\displaystyle\mathcal{L}_{4}(\bm{z})=\sum_{i\in\mathcal{I}}H_{i}(\tau)[z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau)+T_{i,m}^{ofld}(\tau))+T_{i}^{exec}(\tau)] (64)
\displaystyle+E(\tau)\sum_{i\in\mathcal{I}}[\sum_{m\in\mathcal{M}}z_{i}(\tau)(E_{i,m}^{ul}(\tau)+E_{i,m}^{ud}(\tau)+E_{i,m}^{ofld}(\tau))+E_{i}^{exec}(\tau)]-V\sum_{i\in\mathcal{I}}A_{i}(\tau).

By taking the first-order and second-order derivatives of (64), we have

\displaystyle\frac{\partial\mathcal{L}_{4}}{\partial z_{i}(\tau)}=\sum_{i\in\mathcal{I}}H_{i}(\tau)[\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau)+T_{i,m}^{ofld}(\tau))+T_{i}^{exec}(\tau)] (65)
\displaystyle+E(\tau)\sum_{i\in\mathcal{I}}[\sum_{m\in\mathcal{M}}(E_{i,m}^{ul}(\tau)+E_{i,m}^{ud}(\tau)+E_{i,m}^{ofld}(\tau))+E_{i}^{exec}(\tau)]
\displaystyle-V\sum_{i\in\mathcal{I}}[1-(1-\frac{x_{i}(t)D_{i}(t)+y_{i}(\tau)S_{i}(\tau)}{D_{i}(t)+S_{i}(\tau)})^{2}-g_{i}^{local}],
\displaystyle\frac{\partial^{2}\mathcal{L}_{4}}{\partial(z_{i}(\tau))^{2}}=0. (66)

Since \frac{\partial^{2}\mathcal{L}_{4}}{\partial(z_{i}(\tau))^{2}}=0, \mathcal{L}_{4} is linear, and hence convex, in z_{i}(\tau), so problem \mathcal{P}_{5-4} is convex.
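Because \mathcal{L}_{4} is linear in each z_{i}(\tau), its minimum over the relaxed interval z_{i}(\tau)\in[0,1] is attained at an endpoint, so a relaxed solution could be mapped back to a binary offloading decision by inspecting the sign of the coefficient \frac{\partial\mathcal{L}_{4}}{\partial z_{i}(\tau)}. A minimal sketch of this endpoint rule (the coefficients below are hypothetical, and this sign test is an illustration rather than the paper's rounding procedure):

```python
def best_z(coeff):
    """Minimize coeff * z over the relaxed interval z in [0, 1].

    A linear objective attains its minimum at an endpoint, so the
    binary decision follows from a sign test on the coefficient.
    """
    return 0.0 if coeff >= 0 else 1.0

# hypothetical per-PT coefficients dL4/dz_i(tau)
coeffs = [2.5, -1.3, 0.0, -0.7]
print([best_z(c) for c in coeffs])  # [0.0, 1.0, 0.0, 1.0]
```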