
Dynamic Human Digital Twin Deployment at the Edge for Task Execution: A Two-Timescale Accuracy-Aware Online Optimization

Yuye Yang, You Shi, Changyan Yi, Jun Cai, Jiawen Kang,
Dusit Niyato, and Xuemin (Sherman) Shen
Y. Yang, Y. Shi and C. Yi are with the College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China. (E-mail: {mryyy, shyou, changyan.yi}@nuaa.edu.cn). J. Cai is with the Department of Electrical and Computer Engineering, Concordia University, Montréal, QC, H3G 1M8, Canada. (E-mail: [email protected]). J. Kang is with the School of Automation, Guangdong University of Technology, Guangzhou 510062, China (E-mail: [email protected]). D. Niyato is with the School of Computer Science and Engineering, Nanyang Technological University, Singapore. (E-mail: [email protected]). X. (Sherman) Shen is with the Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada. (E-mail: [email protected]).
Abstract

Human digital twin (HDT) is an emerging paradigm that bridges physical twins (PTs) with powerful virtual twins (VTs) for assisting complex task executions in human-centric services. In this paper, we study a two-timescale online optimization for building HDT under an end-edge-cloud collaborative framework. As a unique feature of HDT, we consider that PTs’ corresponding VTs are deployed on edge servers, consisting of not only generic models placed by downloading experiential knowledge from the cloud but also customized models updated by collecting personalized data from end devices. To maximize the task execution accuracy under stringent energy and delay constraints, and by taking into account HDT’s inherent mobility and status variation uncertainties, we jointly and dynamically optimize VTs’ construction and PTs’ task offloading, along with communication and computation resource allocations. Observing that the decision variables are asynchronous with different triggers, we propose a novel two-timescale accuracy-aware online optimization approach (TACO). Specifically, TACO utilizes an improved Lyapunov method to decompose the problem into multiple instant ones, and then leverages piecewise McCormick envelopes and block coordinate descent based algorithms to address the two timescales alternately. Theoretical analyses and simulations show that the proposed approach achieves an asymptotically optimal solution with polynomial-time complexity, and demonstrate its superiority over counterparts.

Index Terms:
HDT, end-edge-cloud collaboration, placement and update, accuracy awareness, two-timescale online optimization

1 Introduction

Human digital twin (HDT) is defined as a paradigm that vividly characterizes the replication of each individual human in the virtual space while reflecting, in real time, his/her actual physical and mental status in the physical space [1], [2]. With the personalized status information maintained in a high-fidelity virtual environment, HDT can be regarded as a “sandbox”, where complex tasks for human-centric services (e.g., activity recognition [3] and vital signal measurement [4]) can be repeatedly simulated and tested, guiding the practical implementation. Because of its large potential in assisting complex task execution with human-centric concerns, HDT has been envisioned as a key enabler for the Metaverse, Healthcare 5.0, Society 5.0, etc., attracting significant attention recently [5].

Essentially, the HDT system consists of a number of physical twin (PT) and virtual twin (VT) pairs, where a PT stands for the physical entity (i.e., a human) and a VT represents the corresponding virtual model [6]. Obviously, the successful realization of HDT largely depends on the proper construction and management of VTs, so as to provide fast-response interactions and highly accurate task execution for each paired PT. These requirements prompt the adoption of the end-edge-cloud collaborative framework [7], by which HDT can be built and operated at the network edge (while supported by both end devices and the cloud center), guaranteeing pervasive connectivity, customized services and low-delay feedback. Although some preliminary efforts have been dedicated to studying similar problems, such as industrial digital twin construction at the edge [8], [9], [10] and service application deployment across edges [11], [12], [13], establishing HDT at the edge for assisting task execution involves some fundamentally different and unique issues that remain unexplored but are of great importance. On one hand, different from position-fixed industrial plants, PTs in the HDT system are highly mobile with unpredictable mobility patterns (including positional and postural variations), leading to potential instability of PT-VT connectivity [1]. Therefore, to guarantee seamless PT-VT interactions, it is necessary to dynamically place the associated VT of each PT on the edge server (ES) that this PT may switch its access to (caused by its random mobility). On the other hand, unlike generalized applications requesting encapsulated services, PTs in the HDT system are extremely personalized, and their status may vary frequently due to uncertain external factors or physiological state changes, resulting in potential inconsistency between each PT and its associated VT [14].
Hence, to maintain up-to-date, high-fidelity VTs, it is necessary to keep the associated VT on the ES updated in a real-time manner (especially its customized part).

Nevertheless, meeting all the aforementioned requirements is very challenging because of the following reasons.

  • 1)

    To enhance the accuracy of complex task execution assisted by HDT, it is required to construct fine-grained VTs on ESs (by both dynamically placing generic models via the cloud and updating customized components via sensors worn on PTs). However, the data brought by VT constructions can be massive [15], inherently increasing the service delay and energy consumption, and thus the data sizes of generic model placement and customized model update should be carefully optimized to strike a balance between accuracy and cost. On top of this, considering that the ultimate goal is to assist task execution, how to offload tasks from PTs to ESs (hosting the associated VTs) in a timely and efficient manner should also be jointly considered with VT constructions, because both share the same communication and computation resources.

  • 2)

    Since HDT evolves over time with uncertain PT-VT mobility and status variations, the system optimization has to be conducted online, while the statistics of future information (related to human activities) may be hard, if not impossible, to obtain [16]. Moreover, the dynamic placement of generic VT models is triggered by PTs’ mobility and access handover among different ESs, which usually happens over a long time period. In contrast, the dynamic update of customized VT models and the complex task offloading are triggered by PTs’ status variations, which may need to be adapted at a much higher frequency. These indicate that such actions for system optimization should be performed asynchronously in different timescales.

To tackle the aforementioned difficulties, in this paper, we propose a novel two-timescale accuracy-aware online optimization approach for building HDT in assisting complex task execution under an end-edge-cloud collaborative framework. Specifically, we consider that each PT’s associated VT (deployed on the ES) consists of a generic model placed by downloading experiential knowledge (e.g., feature parameters and weights) from the cloud center and a customized model updated by uploading personalized data (e.g., behavior characteristics) from sensors worn on the PT. With the objective of maximizing the average accuracy of complex task execution assisted by HDT under stringent energy and delay constraints, and by taking into account the system uncertainties (e.g., random mobility and status variations), we formulate a two-timescale online optimization problem. Particularly, we aim to dynamically optimize i) large-timescale decisions, including the granularity of each PT’s experiential knowledge for placing its generic VT model and the ES access selection of each PT, and ii) small-timescale decisions, including the amount of each PT’s personalized data for updating its customized VT model, task offloading decision (i.e., local computing or edge computing) for each task, and communication and computation resource allocations of each ES. To this end, we develop a novel two-timescale accuracy-aware online optimization approach (TACO) based on the improved Lyapunov optimization. Specifically, the long-term problem is first decomposed into a series of short-term deterministic subproblems with different timescales, and then an alternating algorithm is proposed, integrating piecewise McCormick envelopes (PME) and block coordinate descent (BCD) based methods, for iteratively solving these subproblems. Theoretical analyses show that the proposed approach can produce an asymptotically optimal outcome with a polynomial-time complexity.

The main contributions of this paper are summarized in the following.

  • To the best of our knowledge, we are the first to study the HDT deployment at the network edge for assisting human-centric task execution by formulating a two-timescale accuracy-aware online optimization problem, which jointly optimizes VTs’ construction (including dynamic generic model placement and customized model update) and PTs’ task offloading together with the management of PT-ES access selection and corresponding communication and computation resource allocations.

  • We propose a novel approach, called TACO, which first decomposes the long-term optimization problem into multiple instant ones. Then, we leverage PME and BCD based algorithms to alternately solve the decoupled subproblems in the large and small timescales, respectively.

  • We theoretically analyze the performance of the proposed TACO approach by rigorously deriving, in closed form, its gap to the optimum and its computational complexity. Furthermore, extensive simulations show that the proposed TACO approach outperforms counterparts in terms of improving the HDT-assisted task execution accuracy while reducing the service response delay and overall system energy consumption.

The rest of this paper is organized as follows. Section 2 reviews the recent related work and highlights the novelties of this paper. Section 3 describes the considered system model and the problem formulation. In Section 4, the two-timescale accuracy-aware online optimization approach, i.e., TACO, is proposed and analyzed theoretically. Simulation results are presented in Section 5, followed by the conclusion in Section 6.

2 Related Work

As one of the key enablers for emerging applications, HDT has recently drawn a lot of research attention from both academia and industry. For example, Lee et al. in [17] proposed a large-scale HDT construction framework on the cloud server, integrated with a synchronization mechanism to reduce the overall data transmission cost of the system. Zhong et al. in [18] introduced a bidirectional long short-term memory based algorithm for designing a high-fidelity HDT model with multimodal data on the cloud platform. In [19], Liu et al. developed a cloud-HDT based healthcare system by optimizing a patient-data processing workflow to improve the quality of personal health management. However, these papers were mainly restricted to constructing HDT solely on the cloud server, ignoring the potential of utilizing network edge resources to empower HDT with the capability of providing pervasive, customized and low-delay services.

While deploying HDT at the network edge has rarely been investigated, some researchers have dedicated efforts to studying general DT construction at the edge. Dong et al. in [9] proposed a deep learning algorithm for constructing DTs of the mobile edge network, aiming to minimize the normalized energy consumption through the optimization of user associations, resource allocations and offloading probabilities. Zhang et al. in [10] formulated a DT adaptive placement and transfer problem to minimize the DT synchronization delay, which was then solved by double auction based optimization. Nevertheless, these papers considered that DTs were constructed at fixed locations or placed following pre-known mobility patterns, making them unsuitable for HDT with human-centric features, where PTs are highly mobile and unpredictable. Another stream of related work has been conducted on general mobile service application deployment across edges. For instance, Wang et al. in [11] developed a user-centric learning-driven method for deploying and migrating delay-aware service applications to minimize the total service delay of mobile users. Ouyang et al. [12] formulated a dynamic service deployment problem with the objective of minimizing the user-perceived delay under uncertain user mobility. However, in these papers, service applications were assumed to have limited and encapsulated types, meaning that they are not customized and do not need to be updated, which largely differs from HDT (where on-demand evolution is essential).

To guarantee long-term performance in online problems, the Lyapunov optimization method has been widely recognized as an efficient approach [20], [21], [22], yet most existing solutions were restricted to problems with decisions in a single timescale only. Recently, some preliminary studies [23], [24] have delved into designing two-timescale Lyapunov methods, by which the original problem was decomposed and further decoupled into subproblems in two different timescales and then optimized separately. Besides these, in [16], [25], alternating algorithms were developed on top of the Lyapunov framework to tackle subproblems in two timescales that are coupled but both convex (or can be easily converted into convex forms). However, these solutions cannot be directly applied in this paper because the considered subproblems (after the decomposition) are not only tightly coupled but also highly non-convex.

In summary, different from all the existing works, this paper proposes a novel two-timescale accuracy-aware online optimization approach to jointly optimize the HDT deployment (i.e., generic placement and customized update of VT model) and task offloading under an end-edge-cloud collaborative framework, where the novelty lies in not only the system model but also the solution.

3 System Model and Problem Formulation

In this section, we first provide a system overview on how HDT is deployed and functions on ESs. Then, to be more specific, the generic VT model placement in the large timescale and the customized VT model update together with the task offloading in the small timescale are described. After that, aiming to enhance the accuracy of complex task execution assisted by HDT, a two-timescale online optimization problem is formulated.

3.1 System Overview

Figure 1: The end-edge-cloud collaborative HDT system.

Consider an HDT system built upon an end-edge-cloud collaborative framework, as illustrated in Fig. 1, consisting of a set of end users (regarded as PTs) $\mathcal{I}$ with cardinality $|\mathcal{I}|=I$, multiple geographically distributed ESs denoted as $\mathcal{M}$ with $|\mathcal{M}|=M$, and a cloud center (acting as the central controller). PTs (roaming around) generate streams of complex tasks which require the construction of exclusive VT models (forming one-to-one PT-VT pairs) at the edge to assist their task executions. Each VT should be deployed on the ES that its associated PT may access, so as to offer high-quality and low-delay services. Note that one ES can host multiple VT models for different PTs, and the corresponding communication and computation resources are shared among them. Furthermore, the construction of a high-fidelity VT on the ES consists of two main procedures, i.e., generic model placement and customized model update [26]. For each VT, the generic part of the model is obtained by downloading the experiential knowledge with a selected granularity from the cloud center (selecting a large granularity may increase the fidelity of the VT model, but also introduces a large amount of data to be transferred, resulting in high service delay and energy consumption, and vice versa), and the target ES for its placement is determined following the access selection of the associated PT. By contrast, the customized part of each VT model is updated by uploading the personalized data, with an optimized data size, obtained from sensors worn on the associated PT (the required size of personalized data for customized model update affects not only the fidelity of the VT model, but also the uplink communication and computation resource allocations among different PT-VT pairs and between their data transmission and task offloading).
After VT establishment, PTs’ tasks can be either transmitted (offloaded) to the VTs deployed on the ES or processed locally, depending on the demanded task execution accuracy versus the requirements on service delay and energy consumption. It is worth noting that although VT models cannot be built locally, PTs’ tasks can still be executed by running offline service applications pre-installed on PTs, which are much less powerful but do not require real-time updates.

Figure 2: Two-timescale online optimization framework.

In practice, the dynamic placement of the generic VT model and the ES access handover of each PT commonly happen at a low frequency (i.e., over a large timescale), since frequently downloading experiential knowledge and switching access points can lead to large configuration costs, including a dramatic increase of delay and energy consumption [5]. In contrast, the update of customized VT models and the task offloading, together with the corresponding communication and computation resource allocations, require immediate and frequent adaptation to accommodate the status variation of each PT and its task generations [2]. To this end, we define that, in the considered online optimization framework, the access selection of each PT and the granularity of experiential knowledge for its generic VT model placement are decided in the large timescale, while the amount of personalized data for its customized VT model update, the task offloading, and the communication and computation resource allocations are decided in the small timescale, as shown in Fig. 2. Specifically, the timeline is segmented into $T\in\mathbb{N}^{+}$ coarse-grained time frames, and each frame is further divided into $K\in\mathbb{N}^{+}$ fine-grained time slots. Let $t\in\mathcal{T}=\{0,1,\ldots,T-1\}$ be the index of the $t$-th time frame, and define $\tau\in\mathcal{T}_{t}=\{tK,tK+1,\ldots,tK+K-1\}$ as the index of the $\tau$-th time slot in the $t$-th time frame.
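The frame/slot indexing above can be sketched with a few lines of Python; the values of $T$ and $K$ here are purely illustrative.

```python
# Minimal sketch of the two-timescale timeline: T frames, each containing
# K slots, with slot index set T_t = {tK, ..., tK+K-1}.

def slots_of_frame(t, K):
    """Return the slot indices tau belonging to frame t."""
    return list(range(t * K, t * K + K))

T, K = 3, 4  # illustrative values
timeline = [(t, tau) for t in range(T) for tau in slots_of_frame(t, K)]

assert len(timeline) == T * K
assert slots_of_frame(2, K) == [8, 9, 10, 11]
assert timeline[0] == (0, 0) and timeline[-1] == (2, 11)
```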

Overall, we aim to optimize the long-term system performance (i.e., the average accuracy of complex task execution assisted by HDT under stringent delay and energy consumption constraints) by determining, in an online manner, i) which ES each PT should access and what granularity of experiential knowledge should be chosen for placing its generic VT model on that ES in each time frame, and ii) how much personalized data is required for updating the customized part of each VT model and whether each PT’s task should be processed locally or on the ES (hosting its associated VT), together with the communication and computation resource allocations, in each time slot. For convenience, all important notations in this paper are listed in Table I.

TABLE I: IMPORTANT NOTATIONS IN THIS PAPER
Symbol Meaning
$A_{i}(\tau)$ task execution accuracy for PT $i$ in time slot $\tau$
$a_{i,m}(t)$ access selection of PT $i$ in time frame $t$
$B_{m}$ total bandwidth resource for accessing ES $m$
$b_{i}(\tau)$ bandwidth resource allocation of PT $i$ in time slot $\tau$
$D_{i}(t)$ total experiential knowledge of PT $i$ in time frame $t$
$E_{i}^{dl}(t)$ energy consumption of downloading the experiential knowledge of PT $i$ in time frame $t$
$E_{i,m}^{pl}(t)$ energy consumption of placing generic VT model $i$ on ES $m$ in time frame $t$
$E_{i,m}^{ul}(\tau)$ energy consumption of uploading the personalized data from PT $i$ to ES $m$ in time slot $\tau$
$E_{i,m}^{ud}(\tau)$ energy consumption of updating customized VT model $i$ on ES $m$ in time slot $\tau$
$E_{i,m}^{ofld}(\tau)$ energy consumption of transmitting task data of PT $i$ to ES $m$ in time slot $\tau$
$E_{i}^{exec}(\tau)$ energy consumption of executing PT $i$’s task
$F_{i}$ total computation resource of PT $i$’s local device
$F_{m}$ total computation resource of ES $m$ for all VTs
$f_{i}(\tau)$ computation resource allocation of PT $i$ in time slot $\tau$
$\mathcal{I}$ set of end users (PTs) / corresponding VTs
$\mathcal{M}$ set of ESs
$r_{i,m}(\tau)$ transmission rate between PT $i$ and ES $m$ in time slot $\tau$
$S_{i}(\tau)$ total personalized data generated by PT $i$ in time slot $\tau$
$T_{i}^{dl}(t)$ delay of downloading the experiential knowledge of PT $i$ in time frame $t$
$T_{i,m}^{pl}(t)$ delay of placing generic VT model $i$ on ES $m$
$T_{i,m}^{ul}(\tau)$ delay of uploading the personalized data from PT $i$ to ES $m$ in time slot $\tau$
$T_{i,m}^{ud}(\tau)$ delay of updating customized VT model $i$ on ES $m$
$T_{i,m}^{ofld}(\tau)$ delay of transmitting task data of PT $i$ to ES $m$
$T_{i}^{exec}(\tau)$ delay of task execution of PT $i$ in time slot $\tau$
$V$ Lyapunov control parameter
$x_{i}(t)$ experiential knowledge granularity for generic model placement of PT $i$ in time frame $t$
$y_{i}(\tau)$ personalized data size for customized model update of PT $i$ in time slot $\tau$
$z_{i}(\tau)$ task offloading decision of PT $i$ in time slot $\tau$

3.2 Generic VT Model Placement

Since PTs are mobile, to construct a VT for each of them at the network edge and enable seamless PT-VT interactions, the generic VT model should be re-placed on the ES that its associated PT switches its access to in each time frame $t\in\mathcal{T}$. Denote $a_{i,m}(t)\in\{0,1\}$ as the access selection decision in the large timescale, indicating whether PT $i\in\mathcal{I}$ selects to access ES $m\in\mathcal{M}$ in time frame $t\in\mathcal{T}$, i.e., $a_{i,m}(t)=1$ if PT $i$ connects to ES $m$, and $a_{i,m}(t)=0$ otherwise. Obviously, we should have $\sum_{m\in\mathcal{M}}a_{i,m}(t)\leq 1$, meaning that PT $i$ cannot access multiple ESs simultaneously [16]. For each PT $i\in\mathcal{I}$, its generic VT model is placed on the accessed ES by downloading a certain granularity of experiential knowledge from the cloud center. We define that the full experiential knowledge of each PT $i\in\mathcal{I}$ for its generic VT model placement has a total size of $D_{i}(t)$, and denote $x_{i}(t)\in[0,1]$ as its granularity decision in time frame $t\in\mathcal{T}$. Then, the data size of the experiential knowledge downloaded for placing PT $i$’s VT model in time frame $t\in\mathcal{T}$ can be represented as $x_{i}(t)D_{i}(t)$. Note that even in the special case where PT $i$’s access selection remains unchanged, it is still necessary to periodically re-place its generic VT model on the same ES, because the experiential knowledge may experience “data drift” over time [27]. Based on these, the corresponding delay and energy consumption of downloading such experiential knowledge can be respectively expressed as

T_{i}^{dl}(t)=\sum_{m\in\mathcal{M}}a_{i,m}(t)\frac{x_{i}(t)D_{i}(t)}{r^{c}},\quad\forall i\in\mathcal{I},\forall t\in\mathcal{T}, (1)
E_{i}^{dl}(t)=T_{i}^{dl}(t)\,p^{c},\quad\forall i\in\mathcal{I},\forall t\in\mathcal{T}, (2)

where $r^{c}$ and $p^{c}$ stand for the downlink transmission rate and the unit transmission power from the cloud center to each ES, respectively, both considered as constants [25], [28].
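Eqs. (1)-(2) can be illustrated with a small numerical sketch; all parameter values below (data size, rate, power) are illustrative assumptions, not values from the paper.

```python
# Sketch of Eqs. (1)-(2): delay and energy of downloading the experiential
# knowledge for one PT's generic VT model.

def download_cost(a_im, x_i, D_i, r_c, p_c):
    """a_im: access indicators over ESs (at most one equals 1);
    x_i: granularity in [0,1]; D_i: full knowledge size (bits);
    r_c: cloud-to-ES rate (bit/s); p_c: unit transmission power (W)."""
    T_dl = sum(a_im) * x_i * D_i / r_c   # Eq. (1)
    E_dl = T_dl * p_c                    # Eq. (2)
    return T_dl, E_dl

T_dl, E_dl = download_cost(a_im=[0, 1, 0], x_i=0.5, D_i=8e6, r_c=2e6, p_c=0.8)
assert abs(T_dl - 2.0) < 1e-9   # (0.5 * 8e6) / 2e6 = 2 s
assert abs(E_dl - 1.6) < 1e-9   # 2 s * 0.8 W = 1.6 J
```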

To exploit this experiential knowledge, each ES has to allocate a proportion of its computation resource for completing the generic VT model placement at the beginning of each time frame. The delay of doing this for PT $i\in\mathcal{I}$ on ES $m\in\mathcal{M}$ in time frame $t\in\mathcal{T}$ can be calculated as

T_{i,m}^{pl}(t)=\frac{a_{i,m}(t)x_{i}(t)D_{i}(t)C_{m}}{f_{i}(tK)F_{m}}, (3)

where $C_{m}$ is the number of CPU cycles required for ES $m$ to process a unit of data, $F_{m}$ is the CPU speed (measured in cycles/s) of ES $m$, and $f_{i}(tK)\in(0,1]$ represents the ratio of computation resource allocated to PT $i$ for its VT construction at the beginning of time frame $t$ (i.e., the first time slot of the frame, with $\tau=tK$). Referring to the energy model widely used in CMOS circuits [29], the energy consumption of constructing the generic VT model of the associated PT $i\in\mathcal{I}$ on ES $m\in\mathcal{M}$ in each time frame $t\in\mathcal{T}$ can be calculated as

E_{i,m}^{pl}(t)=\rho_{m}f_{i}(tK)(F_{m})^{3}T_{i,m}^{pl}(t), (4)

where $\rho_{m}$ is the effective switched capacitance of ES $m$, depending on its chip architecture.
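Eqs. (3)-(4) can likewise be checked numerically; all parameter values below are illustrative assumptions.

```python
# Sketch of Eqs. (3)-(4): delay and energy of placing the generic VT model
# on ES m, given the fraction f_i of the ES's CPU speed allocated to PT i.

def placement_cost(a_im, x_i, D_i, C_m, f_i, F_m, rho_m):
    """C_m: cycles per unit data; F_m: ES CPU speed (cycles/s);
    f_i: allocated fraction in (0,1]; rho_m: effective switched capacitance."""
    T_pl = a_im * x_i * D_i * C_m / (f_i * F_m)   # Eq. (3)
    E_pl = rho_m * f_i * (F_m ** 3) * T_pl        # Eq. (4)
    return T_pl, E_pl

T_pl, E_pl = placement_cost(a_im=1, x_i=0.5, D_i=8e6, C_m=100,
                            f_i=0.2, F_m=2e9, rho_m=1e-27)
assert abs(T_pl - 1.0) < 1e-9   # (0.5*8e6*100) / (0.2*2e9) = 1 s
assert abs(E_pl - 1.6) < 1e-9   # 1e-27 * 0.2 * (2e9)^3 * 1 = 1.6 J
```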

3.3 Customized VT Model Update

Since PTs are personalized and their status may vary frequently due to uncertain external or internal factors, to guarantee the timeliness and high fidelity of VTs on ESs, the customized VT model of each PT should be updated in each time slot $\tau\in\mathcal{T}_{t}$. Let $S_{i}(\tau)$ be the total amount of personalized data generated by PT $i\in\mathcal{I}$, and define $y_{i}(\tau)\in[0,1]$ as the percentage of personalized data chosen to be uploaded in time slot $\tau\in\mathcal{T}_{t}$. Then, the size of the personalized data uploaded for updating PT $i$’s customized VT model in time slot $\tau\in\mathcal{T}_{t}$ can be expressed as $y_{i}(\tau)S_{i}(\tau)$.

Within each time slot $\tau\in\mathcal{T}_{t}$, we denote the location of PT $i\in\mathcal{I}$ as $(x_{i}(\tau),y_{i}(\tau))$, which is state information following its random mobility pattern, and let $(x_{m},y_{m})$ be the fixed location of each ES $m\in\mathcal{M}$. The distance between any PT $i\in\mathcal{I}$ and ES $m\in\mathcal{M}$ can then be calculated as $S_{i,m}(\tau)=\sqrt{(x_{i}(\tau)-x_{m})^{2}+(y_{i}(\tau)-y_{m})^{2}}$, and according to the Shannon-Hartley formula, the transmission rate from PT $i\in\mathcal{I}$ to its accessed ES $m\in\mathcal{M}$ is written as

r_{i,m}(\tau)=a_{i,m}(t)b_{i}(\tau)B_{m}\log\left(1+\frac{(S_{i,m}(\tau))^{-\theta}p_{i}|h_{i,m}(\tau)|^{2}}{N_{0}b_{i}(\tau)B_{m}}\right), (5)

where $b_{i}(\tau)\in(0,1]$ is the proportion of communication resource allocated to PT $i\in\mathcal{I}$ in time slot $\tau\in\mathcal{T}_{t}$, $h_{i,m}(\tau)$ is the fading amplitude between PT $i\in\mathcal{I}$ and ES $m\in\mathcal{M}$ in time slot $\tau\in\mathcal{T}_{t}$ (modeled as a circularly symmetric complex Gaussian random variable [30]), $B_{m}$ is the communication bandwidth of ES $m\in\mathcal{M}$, $N_{0}$ is the spectral density of the channel noise power, $p_{i}$ is the pre-determined transmission power of PT $i\in\mathcal{I}$, and $\theta\geq 2$ is the path loss exponent [30]. Correspondingly, the delay and energy consumption of PT $i\in\mathcal{I}$ in uploading the personalized data of size $y_{i}(\tau)S_{i}(\tau)$ for updating its VT on ES $m\in\mathcal{M}$ in each time slot $\tau\in\mathcal{T}_{t}$ can be respectively expressed as

T_{i,m}^{ul}(\tau)=\frac{y_{i}(\tau)S_{i}(\tau)}{r_{i,m}(\tau)},\quad\forall i\in\mathcal{I},\forall m\in\mathcal{M},\forall\tau\in\mathcal{T}_{t}, (6)
E_{i,m}^{ul}(\tau)=T_{i,m}^{ul}(\tau)\,p_{i},\quad\forall i\in\mathcal{I},\forall m\in\mathcal{M},\forall\tau\in\mathcal{T}_{t}. (7)
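Eqs. (5)-(7) can be sketched as follows; the base-2 logarithm, the $d^{-\theta}$ path loss form, and all parameter values are illustrative assumptions.

```python
# Sketch of Eqs. (5)-(7): uplink rate via Shannon-Hartley, then the upload
# delay/energy for the customized-model data.
import math

def uplink_rate(a_im, b_i, B_m, d_im, theta, p_i, h2, N0):
    """d_im: PT-ES distance (m); h2: |h|^2 fading power; N0: noise density."""
    snr = (d_im ** -theta) * p_i * h2 / (N0 * b_i * B_m)
    return a_im * b_i * B_m * math.log2(1 + snr)          # Eq. (5)

def upload_cost(y_i, S_i, rate, p_i):
    T_ul = y_i * S_i / rate   # Eq. (6)
    E_ul = T_ul * p_i         # Eq. (7)
    return T_ul, E_ul

r = uplink_rate(a_im=1, b_i=0.25, B_m=20e6, d_im=50.0, theta=2,
                p_i=0.2, h2=1.0, N0=4e-15)
T_ul, E_ul = upload_cost(y_i=0.8, S_i=1e6, rate=r, p_i=0.2)

assert r > 0 and T_ul > 0
# rate should decrease with distance, consistent with d^{-theta} path loss
assert uplink_rate(1, 0.25, 20e6, 100.0, 2, 0.2, 1.0, 4e-15) < r
```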

To utilize this personalized data, each ES has to allocate a proportion of its computation resource for completing the customized VT model update in each time slot. The delay of doing this for PT $i\in\mathcal{I}$ on ES $m\in\mathcal{M}$ in time slot $\tau\in\mathcal{T}_{t}$ can be calculated as

T_{i,m}^{ud}(\tau)=\frac{a_{i,m}(t)y_{i}(\tau)S_{i}(\tau)C_{m}}{f_{i}(\tau)F_{m}}, (8)

where $f_{i}(\tau)\in(0,1]$ indicates the proportion of computation resource allocated to PT $i$ for its VT update in each time slot $\tau\in[tK,tK+K-1]$. Besides, the corresponding energy consumption can be calculated as

E_{i,m}^{ud}(\tau)=\rho_{m}f_{i}(\tau)(F_{m})^{3}T_{i,m}^{ud}(\tau). (9)
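Eqs. (8)-(9) mirror the placement cost model of Eqs. (3)-(4) at the slot level; a sketch with illustrative parameter values:

```python
# Sketch of Eqs. (8)-(9): per-slot delay and energy of updating the
# customized VT model on ES m.

def update_cost(a_im, y_i, S_i, C_m, f_i, F_m, rho_m):
    """y_i*S_i: uploaded personalized data; f_i: allocated CPU fraction."""
    T_ud = a_im * y_i * S_i * C_m / (f_i * F_m)   # Eq. (8)
    E_ud = rho_m * f_i * (F_m ** 3) * T_ud        # Eq. (9)
    return T_ud, E_ud

T_ud, E_ud = update_cost(a_im=1, y_i=0.8, S_i=1e6, C_m=100,
                         f_i=0.1, F_m=2e9, rho_m=1e-27)
assert abs(T_ud - 0.4) < 1e-9    # (0.8e6*100) / (0.1*2e9) = 0.4 s
assert abs(E_ud - 0.32) < 1e-9   # 1e-27 * 0.1 * (2e9)^3 * 0.4 = 0.32 J
```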

3.4 HDT-Assisted Task Execution

Let $\lambda_{i}(\tau)$ be the data size of the complex task produced by PT $i\in\mathcal{I}$ in each time slot $\tau\in\mathcal{T}_{t}$, which is allowed to follow a general random distribution. Denote the task offloading decision of PT $i\in\mathcal{I}$ in time slot $\tau\in\mathcal{T}_{t}$ as $z_{i}(\tau)\in\{0,1\}$, i.e., $z_{i}(\tau)=1$ if PT $i$ offloads the task to its VT on the ES for assistance, and $z_{i}(\tau)=0$ if PT $i$ processes it locally. The delay and energy consumption of offloading/transmitting such a task from PT $i\in\mathcal{I}$ to its associated VT deployed on ES $m\in\mathcal{M}$ in time slot $\tau\in\mathcal{T}_{t}$ can be respectively expressed as

T_{i,m}^{ofld}(\tau)=\frac{\lambda_{i}(\tau)}{r_{i,m}(\tau)}, (10)
E_{i,m}^{ofld}(\tau)=T_{i,m}^{ofld}(\tau)\,p_{i}. (11)

Considering the possibility of both edge and local processing, the delay of HDT-assisted task execution of PT $i\in\mathcal{I}$ in time slot $\tau\in\mathcal{T}_{t}$ can be calculated as

T_{i}^{exec}(\tau)=\sum_{m\in\mathcal{M}}a_{i,m}(t)z_{i}(\tau)\frac{\lambda_{i}(\tau)C_{m}}{f_{i}(\tau)F_{m}}+(1-z_{i}(\tau))\frac{\lambda_{i}(\tau)C_{i}}{F_{i}}, (12)

where $C_{i}$ is the number of CPU cycles required for PT $i$ to locally process a unit of data, and $F_{i}$ denotes its CPU speed (measured in cycles/s). Besides, the corresponding energy consumption can be calculated as

E_{i}^{exec}(\tau)=\sum_{m\in\mathcal{M}}a_{i,m}(t)z_{i}(\tau)\rho_{m}(F_{m})^{2}\lambda_{i}(\tau)C_{m}+(1-z_{i}(\tau))\rho_{i}(F_{i})^{2}\lambda_{i}(\tau)C_{i}. (13)
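The binary branch in Eqs. (12)-(13) can be sketched as follows; all parameter values are illustrative assumptions chosen so that edge and local delays coincide.

```python
# Sketch of Eqs. (12)-(13): task-execution delay and energy under the
# binary offloading decision z_i (1 = edge, 0 = local), for one accessed ES.

def exec_cost(z_i, a_im, lam, C_m, f_i, F_m, rho_m, C_i, F_i, rho_i):
    T = (z_i * a_im * lam * C_m / (f_i * F_m)
         + (1 - z_i) * lam * C_i / F_i)                    # Eq. (12)
    E = (z_i * a_im * rho_m * (F_m ** 2) * lam * C_m
         + (1 - z_i) * rho_i * (F_i ** 2) * lam * C_i)     # Eq. (13)
    return T, E

params = dict(a_im=1, lam=1e6, C_m=100, f_i=0.1, F_m=2e9,
              rho_m=1e-27, C_i=500, F_i=1e9, rho_i=1e-27)
edge = exec_cost(z_i=1, **params)
local = exec_cost(z_i=0, **params)

assert abs(edge[0] - 0.5) < 1e-9    # (1e6*100) / (0.1*2e9) = 0.5 s
assert abs(local[0] - 0.5) < 1e-9   # (1e6*500) / 1e9 = 0.5 s
assert edge[1] < local[1]           # edge cheaper in energy here
```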

With the help of HDT at the edge, the accuracy of executing each task from PT $i\in\mathcal{I}$ in each time slot $\tau\in\mathcal{T}_{t}$ can be defined as

A_{i}(\tau)=z_{i}(\tau)g_{i}^{edge}(d_{i}(\tau))+(1-z_{i}(\tau))g_{i}^{local}, (14)

where $g_{i}^{local}$ represents the task execution accuracy of local processing on PT $i$ itself, and $g_{i}^{edge}(d_{i}(\tau))$ stands for the task execution accuracy of edge computing for PT $i$, depending on the total size of data used for its VT construction, i.e., $d_{i}(\tau)=x_{i}(t)D_{i}(t)+y_{i}(\tau)S_{i}(\tau),\forall\tau\in\mathcal{T}_{t},\forall t\in\mathcal{T}$, consisting of both the experiential knowledge for generic VT model placement in the large timescale and the personalized data for customized VT model update in the small timescale. Note that $g_{i}^{edge}(d_{i}(\tau))$ is a mapping function that can be obtained via empirical studies or experiments [31].
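Eq. (14) can be sketched as below; the saturating mapping `g_edge` and all constants (`g_local`, `a_max`, `k`) are purely hypothetical stand-ins for the empirically obtained function in [31].

```python
# Sketch of Eq. (14): task-execution accuracy under the offloading decision.
import math

def accuracy(z_i, d_i, g_local=0.7, a_max=0.98, k=1e-6):
    """d_i = x_i*D_i + y_i*S_i: total VT construction data (bits).
    g_edge is a hypothetical saturating mapping of d_i, not the paper's."""
    g_edge = a_max * (1.0 - math.exp(-k * d_i))
    return z_i * g_edge + (1 - z_i) * g_local   # Eq. (14)

d_i = 0.5 * 8e6 + 0.8 * 1e6   # x_i*D_i + y_i*S_i, illustrative values

assert accuracy(0, d_i) == 0.7          # local processing: fixed accuracy
assert 0.7 < accuracy(1, d_i) <= 0.98   # edge: improves with more VT data
```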

Similar to [29], [32], we ignore the overheads induced by feeding back computing outcomes and by system control signals, because the sizes of computing outcomes and control signals are much smaller than that of the input data. Technically, these overheads can be treated as small constants [33] that do not affect our analyses.

3.5 Problem Formulation

In summary, the total response delay for all tasks of PT i\in\mathcal{I} in each time frame t\in\mathcal{T} can be derived as

T_{i}^{tol}(t)=T_{i}^{dl}(t)+\sum_{m\in\mathcal{M}}T_{i,m}^{pl}(t)+\sum_{\tau\in\mathcal{T}_{t}}[z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau)+T_{i,m}^{ofld}(\tau))+T_{i}^{exec}(\tau)], (15)

and the overall system energy consumption in each time frame t\in\mathcal{T} can be derived as

E^{tol}(t)=\sum_{i\in\mathcal{I}}(E_{i}^{dl}(t)+\sum_{m\in\mathcal{M}}E_{i,m}^{pl}(t))+\sum_{i\in\mathcal{I}}\sum_{\tau\in\mathcal{T}_{t}}[z_{i}(\tau)\sum_{m\in\mathcal{M}}(E_{i,m}^{ul}(\tau)+E_{i,m}^{ud}(\tau)+E_{i,m}^{ofld}(\tau))+E_{i}^{exec}(\tau)]. (16)

To evaluate the core value of building HDT at the network edge, we take as the performance measure the long-term average accuracy of complex task execution, assisted by the considered end-edge-cloud collaborative HDT system, over all time frames, which can be expressed as

\mathcal{A}=\frac{1}{TK}\sum_{t\in\mathcal{T}}\sum_{\tau\in\mathcal{T}_{t}}\sum_{i\in\mathcal{I}}A_{i}(\tau). (17)

With the objective of maximizing \mathcal{A} while ensuring that the response delay for all tasks of each PT i\in\mathcal{I} and the overall system energy consumption do not exceed certain thresholds, we formulate a two-timescale online optimization problem by jointly optimizing a_{i,m}(t) and x_{i}(t), denoted in short by a set of large-timescale decision variables \mathcal{J}_{i}^{A}(t)=\{a_{i,m}(t),x_{i}(t)\}, for each PT i in any time frame t\in\mathcal{T}, and y_{i}(\tau), b_{i}(\tau), f_{i}(\tau) and z_{i}(\tau), denoted in short by a set of small-timescale decision variables \mathcal{J}_{i}^{B}(\tau)=\{b_{i}(\tau),y_{i}(\tau),f_{i}(\tau),z_{i}(\tau)\}, for each PT i in any time slot \tau\in\mathcal{T}_{t}.

Mathematically, such a two-timescale online optimization problem can be formulated as

\mathcal{P}_{1}:\max_{\mathcal{J}_{i}^{A}(t),\mathcal{J}_{i}^{B}(\tau)}\lim_{t\rightarrow\infty}\mathcal{A}
 s.t. \sum_{m\in\mathcal{M}}a_{i,m}(t)\leq 1,\forall i\in\mathcal{I},\forall t\in\mathcal{T}, (18a)
\sum_{i\in\mathcal{I}}a_{i,m}(t)b_{i}(\tau)\leq 1,\forall m\in\mathcal{M},\forall t\in\mathcal{T},\forall\tau\in\mathcal{T}_{t}, (18b)
\sum_{i\in\mathcal{I}}a_{i,m}(t)f_{i}(\tau)\leq 1,\forall m\in\mathcal{M},\forall t\in\mathcal{T},\forall\tau\in\mathcal{T}_{t}, (18c)
\lim_{T\rightarrow\infty}\frac{1}{T}\sum_{t\in\mathcal{T}}T_{i}^{tol}(t)\leq T_{i}^{max},\forall i\in\mathcal{I}, (18d)
\lim_{T\rightarrow\infty}\frac{1}{T}\sum_{t\in\mathcal{T}}E^{tol}(t)\leq E^{max}, (18e)

where constraint (18a) guarantees that each PT can connect to at most one ES in each time frame t\in\mathcal{T}; constraints (18b) and (18c) restrict the communication and computation resource allocations to be no more than the total capacities of each ES in any time slot \tau\in\mathcal{T}_{t}; constraints (18d) and (18e) are the long-term average task response delay and overall system energy consumption constraints (in which T_{i}^{max} and E^{max} represent the pre-determined response delay threshold of each PT i\in\mathcal{I} and the overall system energy consumption threshold, respectively).

Obviously, solving problem \mathcal{P}_{1} directly is very challenging because i) PTs' mobilities and status variations, due to human-related characteristics, are highly unpredictable, meaning that it is extremely hard, if not impossible, to obtain system statistics in advance, which necessitates the design of an online optimization algorithm; ii) although Lyapunov optimization [34] is well known as an effective method for solving such online problems in general, decision variables on different timescales (i.e., \mathcal{J}_{i}^{A}(t) and \mathcal{J}_{i}^{B}(\tau)) are tightly coupled in not only the objective function but also constraints (18d) and (18e); and iii) all constraints include discrete decision variables (i.e., a_{i,m}(t) or z_{i}(\tau)), and constraints (18d) and (18e) are both non-convex. These indicate that \mathcal{P}_{1} is a two-timescale online non-convex mixed-integer programming problem, which is NP-hard.

4 A Two-Timescale Online Optimization Approach (TACO)

In this section, we propose a novel approach, namely TACO, for jointly optimizing VT construction and task offloading in the considered end-edge-cloud collaborative HDT system. Specifically, we first reformulate the two-timescale problem by distributing the task response delay and overall system energy consumption of each time frame t\in\mathcal{T} over its constituent time slots \tau\in\mathcal{T}_{t}. Then, we decompose the problem into multiple short-term deterministic subproblems on different timescales with the help of the Lyapunov optimization method, extended to two timescales. After that, we introduce an alternating algorithm integrating PME and BCD methods to solve the large-timescale and small-timescale subproblems, respectively.

4.1 Problem Reformulation

Observe from problem \mathcal{P}_{1} that the delay and energy consumption caused by generic VT model placement are on the large timescale, while those caused by customized VT model update and task execution are on the small timescale. To facilitate the solution, we evenly distribute the task response delay and overall system energy consumption in each time frame t\in\mathcal{T} over all \mid\mathcal{T}_{t}\mid=K time slots within this frame, which yields

T_{i}^{tol}(\tau)=(T_{i}^{dl}(t)+\sum_{m\in\mathcal{M}}T_{i,m}^{pl}(t))/K+z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau)+T_{i,m}^{ofld}(\tau))+T_{i}^{exec}(\tau), (19)
E^{tol}(\tau)=\sum_{i\in\mathcal{I}}(E_{i}^{dl}(t)+\sum_{m\in\mathcal{M}}E_{i,m}^{pl}(t))/K+\sum_{i\in\mathcal{I}}[z_{i}(\tau)\sum_{m\in\mathcal{M}}(E_{i,m}^{ul}(\tau)+E_{i,m}^{ud}(\tau)+E_{i,m}^{ofld}(\tau))+E_{i}^{exec}(\tau)]. (20)

Substituting (19) and (20) into (18d) and (18e) of problem \mathcal{P}_{1}, we have

\mathcal{P}_{2}:\max_{\mathcal{J}_{i}^{A}(t),\mathcal{J}_{i}^{B}(\tau)}\lim_{t\rightarrow\infty}\mathcal{A}
 s.t. (18a), (18b), (18c),
\lim_{T\rightarrow\infty}\frac{1}{TK}\sum_{t\in\mathcal{T}}\sum_{\tau\in\mathcal{T}_{t}}T_{i}^{tol}(\tau)\leq T_{i}^{max}/K,\forall i\in\mathcal{I}, (21a)
\lim_{T\rightarrow\infty}\frac{1}{TK}\sum_{t\in\mathcal{T}}\sum_{\tau\in\mathcal{T}_{t}}E^{tol}(\tau)\leq E^{max}/K. (21b)

Note that the reformulated problem \mathcal{P}_{2} is equivalent to the original problem \mathcal{P}_{1}: exactly the same decision variables remain on their two different timescales, while all long-term constraints have been unified into a single timescale (i.e., expressed per time slot only) without affecting the optimization performance.

Obviously, \mathcal{P}_{2} is still a long-term optimization problem, and the major difficulties in solving it are i) how to address the long-term average delay and energy consumption constraints; and ii) how to optimize two-timescale decision variables simultaneously. To this end, in the next subsection, we employ the Lyapunov method [34] and reformulate \mathcal{P}_{2} to accommodate the two-timescale features.

4.2 Problem Decomposition

We first define a delay overflow queue and an energy deficit queue to describe how the task response delay T_{i}^{tol}(\tau) of each PT i\in\mathcal{I} and the overall system energy consumption E^{tol}(\tau) in time slot \tau\in\mathcal{T}_{t} may deviate from their long-term budgets T_{i}^{max}/K and E^{max}/K, respectively. The dynamic evolution of these two queues can be expressed as

H_{i}(\tau+1)=\max[H_{i}(\tau)+T_{i}^{tol}(\tau)-T_{i}^{max}/K,0], (22)
E(\tau+1)=\max[E(\tau)+E^{tol}(\tau)-E^{max}/K,0]. (23)
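The virtual-queue dynamics (22)-(23) amount to a one-line update per queue. The sketch below traces a few slots with made-up per-slot delays, energies and budgets: the backlog grows by each slot's budget overrun and is floored at zero.

```python
def queue_update(backlog, actual, budget):
    # Eqs. (22)-(23): add the per-slot budget overrun, floored at 0.
    return max(backlog + actual - budget, 0.0)

H, E = 0.0, 0.0                      # delay overflow / energy deficit backlogs
T_budget, E_budget = 0.1, 0.5        # per-slot budgets T_i^max/K, E^max/K (made up)
for t_tol, e_tol in [(0.15, 0.4), (0.05, 0.7), (0.2, 0.6)]:
    H = queue_update(H, t_tol, T_budget)
    E = queue_update(E, e_tol, E_budget)
```

Keeping these backlogs bounded is exactly what forces the time-averaged delay and energy to respect (21a)-(21b).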

After that, we collect the delay overflow queues H_{i}(\tau) of all PTs and the energy deficit queue E(\tau) in a vector \bm{\Theta}(\tau)=[\bm{H}(\tau),E(\tau)], and introduce a quadratic Lyapunov function as [34]:

L(\bm{\Theta}(\tau))\triangleq\frac{1}{2}[\sum_{i\in\mathcal{I}}H_{i}(\tau)^{2}+E(\tau)^{2}], (24)

which quantitatively reflects the congestion of all queues and should be persistently pushed towards a minimum value to keep the queues stable. Referring to [35], the conditional Lyapunov drift is given by

\Delta(\bm{\Theta}(\tau))=\mathbb{E}[L(\bm{\Theta}(\tau+K))-L(\bm{\Theta}(\tau))|\bm{\Theta}(\tau)], (25)

where \mathbb{E}[\cdot] denotes the expectation, and \Delta(\bm{\Theta}(\tau)) measures the change of the Lyapunov function across K consecutive time slots. Intuitively, by minimizing the Lyapunov drift in (25), we can prevent the queue backlogs from growing unboundedly, and thus guarantee that the desired delay and energy consumption constraints are met.

Accordingly, the Lyapunov drift-plus-penalty function becomes

\Delta(\bm{\Theta}(\tau))-V\mathbb{E}[\sum_{i\in\mathcal{I}}A_{i}(\tau)\mid\bm{\Theta}(\tau)], (26)

where a control parameter V>0 is introduced, representing the weight placed on maximizing the HDT-assisted task execution accuracy versus strictly satisfying the delay and energy consumption constraints.

Theorem 1

For any V>0, the drift-plus-penalty is bounded under any possible decisions in any time slot \tau\in\mathcal{T}_{t}, i.e.,

\Delta(\bm{\Theta}(\tau))-V\mathbb{E}[\sum_{i\in\mathcal{I}}A_{i}(\tau)\mid\bm{\Theta}(\tau)]\leq G+\sum_{i\in\mathcal{I}}\mathbb{E}[H_{i}(\tau)(T_{i}^{tol}(\tau)-T_{i}^{max}/K)\mid\bm{\Theta}(\tau)]+\mathbb{E}[E(\tau)(E^{tol}(\tau)-E^{max}/K)\mid\bm{\Theta}(\tau)]-V\mathbb{E}[\sum_{i\in\mathcal{I}}A_{i}(\tau)\mid\bm{\Theta}(\tau)], (27)

where

G=\frac{1}{2}\sum_{i\in\mathcal{I}}[T_{i}^{tol}(max)-T_{i}^{max}]^{2}+\frac{1}{2}[E^{tol}(max)-E^{max}]^{2}

is a positive constant, with T_{i}^{tol}(max) and E^{tol}(max) denoting the maximum possible per-slot task response delay and system energy consumption, respectively.

Proof:

Please see Appendix A. ∎

Theorem 1 shows that the drift-plus-penalty is deterministically upper bounded in each time slot \tau\in\mathcal{T}_{t}. Then, summing both sides of (27) over all time slots within time frame t, we have

\sum_{\tau\in\mathcal{T}_{t}}\Delta(\bm{\Theta}(\tau))-V\sum_{\tau\in\mathcal{T}_{t}}\mathbb{E}[\sum_{i\in\mathcal{I}}A_{i}(\tau)\mid\bm{\Theta}(\tau)]\leq GK+\sum_{\tau\in\mathcal{T}_{t}}\sum_{i\in\mathcal{I}}\mathbb{E}[H_{i}(\tau)(T_{i}^{tol}(\tau)-T_{i}^{max}/K)\mid\bm{\Theta}(\tau)]+\sum_{\tau\in\mathcal{T}_{t}}\mathbb{E}[E(\tau)(E^{tol}(\tau)-E^{max}/K)\mid\bm{\Theta}(\tau)]-V\sum_{\tau\in\mathcal{T}_{t}}\mathbb{E}[\sum_{i\in\mathcal{I}}A_{i}(\tau)\mid\bm{\Theta}(\tau)]. (28)

Then, \mathcal{P}_{2} can be decomposed into multiple subproblems, each of which opportunistically minimizes the right-hand side of (28) in one time frame t\in\mathcal{T}, i.e.,

\mathcal{P}_{3}:\min_{\mathcal{J}_{i}^{A}(t),\mathcal{J}_{i}^{B}(\tau)}\sum_{\tau\in\mathcal{T}_{t}}\sum_{i\in\mathcal{I}}\mathbb{E}[H_{i}(\tau)(T_{i}^{tol}(\tau)-T_{i}^{max}/K)\mid\bm{\Theta}(\tau)]+\sum_{\tau\in\mathcal{T}_{t}}\mathbb{E}[E(\tau)(E^{tol}(\tau)-E^{max}/K)\mid\bm{\Theta}(\tau)]-V\sum_{\tau\in\mathcal{T}_{t}}\mathbb{E}[\sum_{i\in\mathcal{I}}A_{i}(\tau)\mid\bm{\Theta}(\tau)]
 s.t. (18a), (18b), (18c).

Note that, in problem \mathcal{P}_{3}, decisions \mathcal{J}_{i}^{A}(t)=\{a_{i,m}(t),x_{i}(t)\} and \mathcal{J}_{i}^{B}(\tau)=\{b_{i}(\tau),y_{i}(\tau),f_{i}(\tau),z_{i}(\tau)\} remain unchanged from those in \mathcal{P}_{2}. This means that, although \mathcal{P}_{3} focuses on the optimization within a single time frame, it still includes two-timescale variables.

4.3 Alternating Algorithm between Two Timescales

4.3.1 Two-Timescale Decoupling and Alternation

To solve problem \mathcal{P}_{3}, we decouple it into two subproblems (one for the large timescale and the other for the small timescale), and then solve them alternately until convergence.

Large-timescale Problem: Given the small-timescale decisions \mathcal{J}_{i}^{B}(\tau), together with the current backlogs of the delay overflow queues H_{i}(\tau) of all PTs and the energy deficit queue E(\tau), the large-timescale subproblem aims to jointly optimize the granularity of experiential knowledge x_{i}(t) and the access selection a_{i,m}(t) at the beginning of each time frame t\in\mathcal{T}, which can be formulated as

\mathcal{P}_{4}:\min_{\mathcal{J}_{i}^{A}(t)}\sum_{i\in\mathcal{I}}H_{i}(\tau)[(T_{i}^{dl}(t)+\sum_{m\in\mathcal{M}}T_{i,m}^{pl}(t))/K+z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau)+T_{i,m}^{ofld}(\tau))+T_{i}^{exec}(\tau)]+E(\tau)\sum_{i\in\mathcal{I}}[(E_{i}^{dl}(t)+\sum_{m\in\mathcal{M}}E_{i,m}^{pl}(t))/K+z_{i}(\tau)\sum_{m\in\mathcal{M}}(E_{i,m}^{ul}(\tau)+E_{i,m}^{ud}(\tau)+E_{i,m}^{ofld}(\tau))+E_{i}^{exec}(\tau)]-V\sum_{i\in\mathcal{I}}A_{i}(\tau)
 s.t. (18a), (18b), (18c).

Small-timescale Problem: Given the large-timescale decisions \mathcal{J}_{i}^{A}(t), the small-timescale subproblem aims to jointly optimize the personalized data size y_{i}(\tau), the bandwidth resource allocation b_{i}(\tau), the computation resource allocation f_{i}(\tau) and the task offloading decision z_{i}(\tau) in each time slot \tau\in\mathcal{T}_{t}, which can be formulated as

\mathcal{P}_{5}:\min_{\mathcal{J}_{i}^{B}(\tau)}\sum_{i\in\mathcal{I}}H_{i}(\tau)[\sum_{m\in\mathcal{M}}T_{i,m}^{pl}(t)/K+z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau)+T_{i,m}^{ofld}(\tau))+T_{i}^{exec}(\tau)]+E(\tau)\sum_{i\in\mathcal{I}}[\sum_{m\in\mathcal{M}}E_{i,m}^{pl}(t)/K+\sum_{m\in\mathcal{M}}z_{i}(\tau)(E_{i,m}^{ul}(\tau)+E_{i,m}^{ud}(\tau)+E_{i,m}^{ofld}(\tau))+E_{i}^{exec}(\tau)]-V\sum_{i\in\mathcal{I}}A_{i}(\tau)
 s.t. (18b), (18c).

Alternating Process: For \tau=tK (i.e., the first slot of each time frame), we iteratively optimize subproblems \mathcal{P}_{4} and \mathcal{P}_{5} until the objective of \mathcal{P}_{3} converges. Specifically, by fixing the small-timescale decisions as \mathcal{J}_{i}^{B}(tK)=\mathcal{J}_{i}^{B}(tK-1) (inherited from the last time slot \tau=tK-1 of the previous time frame \mathcal{T}_{t-1}), large-timescale subproblem \mathcal{P}_{4} is first solved to optimize \mathcal{J}_{i}^{A}(tK). Then, given the large-timescale decisions \mathcal{J}_{i}^{A}(tK), small-timescale subproblem \mathcal{P}_{5} is solved to update \mathcal{J}_{i}^{B}(tK), which is fed back to \mathcal{P}_{4}. For each \tau\in[tK+1,tK+K-1] (i.e., the remaining slots of the time frame), with the converged large-timescale decisions \mathcal{J}_{i}^{A}(tK), we repeatedly solve \mathcal{P}_{5} to obtain \mathcal{J}_{i}^{B}(\tau) in each time slot.
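The alternating process above can be sketched as a fixed-point loop. The two "solvers" below are toy placeholders for the PME- and BCD-based algorithms, and the objective is a stand-in for the \mathcal{P}_{3} objective, chosen only so that the alternation has a visible fixed point.

```python
def solve_P4(JB):
    # Placeholder for the large-timescale (PME-based) solver: J^A given J^B.
    return {"x": min(1.0, 0.5 + 0.5 * JB["y"]), "a": 1}

def solve_P5(JA):
    # Placeholder for the small-timescale (BCD-based) solver: J^B given J^A.
    return {"y": 0.5 * JA["x"], "b": 0.3, "f": 0.4, "z": 1}

def objective(JA, JB):
    # Stand-in for the P3 objective, monitored for convergence.
    return (JA["x"] - 0.8) ** 2 + (JB["y"] - 0.4) ** 2

JB = {"y": 0.2, "b": 0.3, "f": 0.4, "z": 1}   # inherited from slot tK - 1
prev = float("inf")
for _ in range(100):
    JA = solve_P4(JB)      # large-timescale step
    JB = solve_P5(JA)      # small-timescale step
    cur = objective(JA, JB)
    if abs(prev - cur) <= 1e-9:
        break
    prev = cur
```

With these toy maps the alternation contracts to the fixed point x = 2/3, y = 1/3 within a handful of iterations, mirroring how the real loop stops once the \mathcal{P}_{3} objective stabilizes.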

Input: N partitions, and an initial feasible solution z^{\prime}
1  Linear relaxation: a_{i,m}(t)\in\{0,1\}\rightarrow\tilde{a}_{i,m}(t)\in[0,1];
2  for x_{i}(t)\in\bm{x} do
3      for \tilde{a}_{i,m}(t)\in\bm{\tilde{a}} do
4          for n\in\mathcal{N} do
5              x_{i}^{L}(t)=x_{i,n}^{L}(t); x_{i}^{U}(t)=x_{i,n}^{U}(t);
6              if (42) is feasible then
7                  Tighten bound [\tilde{a}_{i,m,n}^{L}(t),\tilde{a}_{i,m,n}^{U}(t)] from (42);
8              else
9                  Remove partition [x_{i,n}^{L}(t),x_{i,n}^{U}(t)];
10         Update lower bound as \tilde{a}_{i,m}^{L}(t)=\min_{n}\tilde{a}_{i,m,n}^{L}(t);
11         Update upper bound as \tilde{a}_{i,m}^{U}(t)=\max_{n}\tilde{a}_{i,m,n}^{U}(t);
12     Update lower bound as x_{i}^{L}(t)=\min_{n}x_{i,n}^{L}(t);
13     Update upper bound as x_{i}^{U}(t)=\max_{n}x_{i,n}^{U}(t);
14 Solve \mathcal{P}_{4-3} over the pruned partitions to get solution (\bm{x}^{R},\bm{\tilde{a}}^{R});
15 Round \tilde{a}_{i,m^{*}}(t) to 1, where m^{*}=\arg\max_{m\in\mathcal{M}}\tilde{a}_{i,m}(t), and set the others to 0;
16 Get solution (\bm{x}^{R},\bm{a}^{R}) of \mathcal{P}_{4};
Output: (\bm{x}^{R},\bm{a}^{R})
Algorithm 1 PME-Based Algorithm

4.3.2 Solution for Large-Timescale Decisions

Large-timescale subproblem \mathcal{P}_{4} is non-convex in general, but it can be regarded as a bilinear optimization problem (i.e., the problem is linear if we fix one decision variable within \mathcal{J}_{i}^{A}(t) and optimize the other). This motivates us to design a PME-based algorithm [36], which first constructs convex envelopes for the bilinear terms, transforming the problem into a piecewise linear form, and then solves it by partitioning and pruning.

Let u_{i,m}(t)=a_{i,m}(t)x_{i}(t) be an auxiliary variable for the bilinear term a_{i,m}(t)x_{i}(t). Then, \mathcal{P}_{4} can be relaxed into a convex optimization problem as

\mathcal{P}_{4-1}:\min_{\mathcal{J}_{i}^{A}(t),u_{i,m}(t)}\sum_{i\in\mathcal{I}}H_{i}(\tau)[(T_{i}^{dl}(t)+\sum_{m\in\mathcal{M}}u_{i,m}(t)\frac{D_{i}(t)C_{m}}{f_{i}(tK)F_{m}})/K+z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau)+T_{i,m}^{ofld}(\tau))+T_{i}^{exec}(\tau)]+E(\tau)\sum_{i\in\mathcal{I}}[(E_{i}^{dl}(t)+\sum_{m\in\mathcal{M}}u_{i,m}(t)\rho_{m}(F_{m})^{2}D_{i}(t)C_{m})/K+z_{i}(\tau)\sum_{m\in\mathcal{M}}(E_{i,m}^{ul}(\tau)+E_{i,m}^{ud}(\tau)+E_{i,m}^{ofld}(\tau))+E_{i}^{exec}(\tau)]-V\sum_{i\in\mathcal{I}}A_{i}(\tau)
 s.t. (18a), (18b), (18c),
u_{i,m}(t)\geq 0,\forall i\in\mathcal{I},\forall m\in\mathcal{M},\forall t\in\mathcal{T}, (29a)
u_{i,m}(t)\geq x_{i}(t)+a_{i,m}(t)-1,\forall i\in\mathcal{I},\forall m\in\mathcal{M},\forall t\in\mathcal{T}, (29b)
u_{i,m}(t)\leq a_{i,m}(t),\forall i\in\mathcal{I},\forall m\in\mathcal{M},\forall t\in\mathcal{T}, (29c)
u_{i,m}(t)\leq x_{i}(t),\forall i\in\mathcal{I},\forall m\in\mathcal{M},\forall t\in\mathcal{T}, (29d)

where constraints (29a)-(29d) are the corresponding McCormick relaxations of the bilinear term a_{i,m}(t)x_{i}(t).
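For a\in[0,1] and x\in[0,1], the general McCormick envelope specializes exactly to (29a)-(29d). The check below verifies, on a grid over the unit box, that the true bilinear product u=ax always satisfies the four envelope inequalities, which is what makes \mathcal{P}_{4-1} a valid relaxation of \mathcal{P}_{4}.

```python
def mccormick_holds(a, x, aL=0.0, aU=1.0, xL=0.0, xU=1.0, eps=1e-12):
    # The four McCormick inequalities for u = a * x over [aL, aU] x [xL, xU];
    # with the unit box they reduce to (29a)-(29d): u>=0, u>=x+a-1, u<=a, u<=x.
    u = a * x
    return (u >= aL * x + xL * a - aL * xL - eps
            and u >= aU * x + xU * a - aU * xU - eps
            and u <= aU * x + xL * a - aU * xL + eps
            and u <= aL * x + xU * a - aL * xU + eps)

# Every point of an 11x11 grid in the unit box satisfies the envelope.
checks = [mccormick_holds(a / 10, x / 10) for a in range(11) for x in range(11)]
```

Outside the declared box the envelope need not hold, which is precisely why partitioning the box (as done next) tightens the relaxation.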

Next, to improve the solution quality, we divide \bm{x}=\{x_{i}(t),\forall i\in\mathcal{I},\forall t\in\mathcal{T}\} and \bm{a}=\{a_{i,m}(t),\forall i\in\mathcal{I},\forall m\in\mathcal{M},\forall t\in\mathcal{T}\} into \mid\mathcal{N}\mid=N partitions. Specifically, let x_{i,n}(t)\in[x_{i,n}^{L}(t),x_{i,n}^{U}(t)] be the range of x_{i}(t) in partition n\in\mathcal{N}, where x_{i,n}^{L}(t) and x_{i,n}^{U}(t) are its lower and upper bounds, respectively. Besides, a new auxiliary variable y_{i,n}(t)\in\{0,1\} is introduced, where y_{i,n}(t)=1 if the value of x_{i}(t) belongs to partition n, and y_{i,n}(t)=0 otherwise. Similarly, the binary variable a_{i,m}(t) is first relaxed into a continuous variable \tilde{a}_{i,m}(t)\in[0,1] and then divided into N piecewise areas, where the range of partition n\in\mathcal{N} is \tilde{a}_{i,m,n}(t)\in[\tilde{a}_{i,m,n}^{L}(t),\tilde{a}_{i,m,n}^{U}(t)]. Based on these, \mathcal{P}_{4-1} can be converted into a generalized disjunctive programming problem as

\mathcal{P}_{4-2}:\min_{\mathcal{J}_{i}^{A}(t),u_{i,m}(t)}\sum_{i\in\mathcal{I}}H_{i}(\tau)[(T_{i}^{dl}(t)+\sum_{m\in\mathcal{M}}u_{i,m}(t)\frac{D_{i}(t)C_{m}}{f_{i}(tK)F_{m}})/K+z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau)+T_{i,m}^{ofld}(\tau))+T_{i}^{exec}(\tau)]+E(\tau)\sum_{i\in\mathcal{I}}[(E_{i}^{dl}(t)+\sum_{m\in\mathcal{M}}u_{i,m}(t)\rho_{m}(F_{m})^{2}D_{i}(t)C_{m})/K+z_{i}(\tau)\sum_{m\in\mathcal{M}}(E_{i,m}^{ul}(\tau)+E_{i,m}^{ud}(\tau)+E_{i,m}^{ofld}(\tau))+E_{i}^{exec}(\tau)]-V\sum_{i\in\mathcal{I}}A_{i}(\tau)
 s.t. (18a), (18b), (18c),
\bigvee_{n\in\mathcal{N}}\left[\begin{array}{l}u_{i,m}(t)\geqslant\tilde{a}_{i,m}(t)\cdot x_{i,n}^{L}(t)+\tilde{a}_{i,m,n}^{L}(t)\cdot x_{i}(t)-\tilde{a}_{i,m,n}^{L}(t)\cdot x_{i,n}^{L}(t)\\ u_{i,m}(t)\geqslant\tilde{a}_{i,m}(t)\cdot x_{i,n}^{U}(t)+\tilde{a}_{i,m,n}^{U}(t)\cdot x_{i}(t)-\tilde{a}_{i,m,n}^{U}(t)\cdot x_{i,n}^{U}(t)\\ u_{i,m}(t)\leqslant\tilde{a}_{i,m}(t)\cdot x_{i,n}^{L}(t)+\tilde{a}_{i,m,n}^{U}(t)\cdot x_{i}(t)-\tilde{a}_{i,m,n}^{U}(t)\cdot x_{i,n}^{L}(t)\\ u_{i,m}(t)\leqslant\tilde{a}_{i,m}(t)\cdot x_{i,n}^{U}(t)+\tilde{a}_{i,m,n}^{L}(t)\cdot x_{i}(t)-\tilde{a}_{i,m,n}^{L}(t)\cdot x_{i,n}^{U}(t)\\ \tilde{a}_{i,m,n}^{L}(t)\leqslant\tilde{a}_{i,m}(t)\leqslant\tilde{a}_{i,m,n}^{U}(t)\\ x_{i,n}^{L}(t)\leqslant x_{i}(t)\leqslant x_{i,n}^{U}(t)\\ y_{i,n}(t)\end{array}\right], (41)
x_{i,n}^{L}(t)=x_{i}^{L}(t)+\frac{(x_{i}^{U}(t)-x_{i}^{L}(t))\cdot(n-1)}{N},\forall i,\forall n,\forall t,
x_{i,n}^{U}(t)=x_{i}^{L}(t)+\frac{(x_{i}^{U}(t)-x_{i}^{L}(t))\cdot n}{N},\forall i,\forall n,\forall t,
y_{i,n}(t)\in\{0,1\},\forall n,\forall t.

Then, by applying the convex hull relaxation [37], [38], we can further transform problem \mathcal{P}_{4-2} into a piecewise linear form as

\mathcal{P}_{4-3}:\min_{\mathcal{J}_{i}^{A}(t),u_{i,m}(t)}\sum_{i\in\mathcal{I}}H_{i}(\tau)[(T_{i}^{dl}(t)+\sum_{m\in\mathcal{M}}u_{i,m}(t)\frac{D_{i}(t)C_{m}}{f_{i}(tK)F_{m}})/K+z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau)+T_{i,m}^{ofld}(\tau))+T_{i}^{exec}(\tau)]+E(\tau)\sum_{i\in\mathcal{I}}[(E_{i}^{dl}(t)+\sum_{m\in\mathcal{M}}u_{i,m}(t)\rho_{m}(F_{m})^{2}D_{i}(t)C_{m})/K+z_{i}(\tau)\sum_{m\in\mathcal{M}}(E_{i,m}^{ul}(\tau)+E_{i,m}^{ud}(\tau)+E_{i,m}^{ofld}(\tau))+E_{i}^{exec}(\tau)]-V\sum_{i\in\mathcal{I}}A_{i}(\tau)
 s.t. (18a), (18b), (18c),
u_{i,m}(t)\geqslant\hat{a}_{i,m,n}(t)\cdot x_{i,n}^{L}(t)+\tilde{a}_{i,m,n}^{L}(t)\cdot\hat{x}_{i,n}(t)-\tilde{a}_{i,m,n}^{L}(t)\cdot x_{i,n}^{L}(t)\cdot y_{i,n}(t),\forall i,\forall m,\forall n,\forall t,
u_{i,m}(t)\geqslant\hat{a}_{i,m,n}(t)\cdot x_{i,n}^{U}(t)+\tilde{a}_{i,m,n}^{U}(t)\cdot\hat{x}_{i,n}(t)-\tilde{a}_{i,m,n}^{U}(t)\cdot x_{i,n}^{U}(t)\cdot y_{i,n}(t),\forall i,\forall m,\forall n,\forall t,
u_{i,m}(t)\leqslant\hat{a}_{i,m,n}(t)\cdot x_{i,n}^{L}(t)+\tilde{a}_{i,m,n}^{U}(t)\cdot\hat{x}_{i,n}(t)-\tilde{a}_{i,m,n}^{U}(t)\cdot x_{i,n}^{L}(t)\cdot y_{i,n}(t),\forall i,\forall m,\forall n,\forall t,
u_{i,m}(t)\leqslant\hat{a}_{i,m,n}(t)\cdot x_{i,n}^{U}(t)+\tilde{a}_{i,m,n}^{L}(t)\cdot\hat{x}_{i,n}(t)-\tilde{a}_{i,m,n}^{L}(t)\cdot x_{i,n}^{U}(t)\cdot y_{i,n}(t),\forall i,\forall m,\forall n,\forall t,
\tilde{a}_{i,m}(t)=\sum_{n\in\mathcal{N}}\hat{a}_{i,m,n}(t),\forall i,\forall m,\forall t,
x_{i}(t)=\sum_{n\in\mathcal{N}}\hat{x}_{i,n}(t),\forall i,\forall t,
\sum_{n\in\mathcal{N}}y_{i,n}(t)=1,\forall i,\forall t,
x_{i,n}^{L}(t)=x_{i}^{L}(t)+\frac{(x_{i}^{U}(t)-x_{i}^{L}(t))\cdot(n-1)}{N},\forall i,\forall n,\forall t,
x_{i,n}^{U}(t)=x_{i}^{L}(t)+\frac{(x_{i}^{U}(t)-x_{i}^{L}(t))\cdot n}{N},\forall i,\forall n,\forall t,
x_{i,n}^{L}(t)y_{i,n}(t)\leqslant\hat{x}_{i,n}(t)\leqslant x_{i,n}^{U}(t)y_{i,n}(t),\forall i,\forall n,\forall t,
\tilde{a}_{i,m,n}^{L}(t)y_{i,n}(t)\leqslant\hat{a}_{i,m,n}(t)\leqslant\tilde{a}_{i,m,n}^{U}(t)y_{i,n}(t),\forall i,\forall m,\forall n,\forall t,
y_{i,n}(t)\in\{0,1\},\forall n,\forall t.

To this end, we prune x_{i}(t) and \tilde{a}_{i,m}(t) to tighten their relaxed bounds. By traversing each partition n\in\mathcal{N} of x_{i}(t), we first determine the lower bound \tilde{a}_{i,m,n}^{L}(t) and upper bound \tilde{a}_{i,m,n}^{U}(t) of \tilde{a}_{i,m}(t) by solving the following linear programming problem:

\tilde{a}_{i,m,n}^{L}(t)=\min\tilde{a}_{i,m}(t)\text{ or }\tilde{a}_{i,m,n}^{U}(t)=\max\tilde{a}_{i,m}(t) (42)
 s.t. (18a), (18b), (18c), (29a)-(29d),
obj(\mathcal{P}_{4-3})\leqslant z^{\prime},
x_{i}^{L}(t)=x_{i,n}^{L}(t)\leqslant x_{i}(t)\leqslant x_{i,n}^{U}(t)=x_{i}^{U}(t).

Besides, the range of x_{i}(t) is updated as x_{i}^{L}(t)=\min_{n}x_{i,n}^{L}(t) and x_{i}^{U}(t)=\max_{n}x_{i,n}^{U}(t) after traversing all of its partitions. Then, after pruning all x_{i}(t) and \tilde{a}_{i,m}(t), we can solve problem \mathcal{P}_{4-3} over the pruned partitions with software-based optimization solvers (e.g., CVX [39]), and obtain its solution (\bm{x}^{R},\bm{\tilde{a}}^{R}). Lastly, we round the continuous variables \bm{\tilde{a}}^{R} to binary form to obtain integer solutions. Note that all constraints are automatically satisfied under (\bm{x}^{R},\bm{a}^{R}) because they have been taken into account in the pruning process of (42). The detailed steps of the designed PME-based algorithm for solving the large-timescale problem are presented in Algorithm 1.
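The final rounding step of Algorithm 1 can be sketched as follows; the relaxed matrix a_tilde is a made-up stand-in for \bm{\tilde{a}}^{R}, with one row per PT and one column per ES.

```python
def round_association(a_tilde_row):
    # Keep only the ES with the largest relaxed value: set a_{i,m*} to 1,
    # all other entries of the row to 0.
    m_star = max(range(len(a_tilde_row)), key=lambda m: a_tilde_row[m])
    return [1 if m == m_star else 0 for m in range(len(a_tilde_row))]

a_tilde = [[0.10, 0.70, 0.20],   # hypothetical relaxed solution (row = PT, col = ES)
           [0.55, 0.05, 0.40]]
a_bin = [round_association(row) for row in a_tilde]
```

By construction, every rounded row has exactly one nonzero entry, so constraint (18a) (each PT connects to at most one ES) is satisfied automatically.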

4.3.3 Solution for Small-Timescale Decisions

Small-timescale subproblem \mathcal{P}_{5} is also non-convex in general, but by relaxing the integer decision z_{i}(\tau) (i.e., z_{i}(\tau)\in\{0,1\}\rightarrow z_{i}(\tau)\in[0,1]), it becomes a block multi-convex problem (i.e., the problem is convex in one block of decision variables when all other blocks are fixed). This motivates us to design a BCD-based algorithm by further dividing problem \mathcal{P}_{5} into four subproblems and solving them alternately until the objective function of \mathcal{P}_{5} converges.

Personalized Data Size Determination: Given b_{i}(\tau), f_{i}(\tau) and z_{i}(\tau), we have

\mathcal{P}_{5-1}:\min_{y_{i}(\tau)}\sum_{i\in\mathcal{I}}H_{i}(\tau)z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau))+E(\tau)\sum_{i\in\mathcal{I}}\sum_{m\in\mathcal{M}}z_{i}(\tau)(E_{i,m}^{ul}(\tau)+E_{i,m}^{ud}(\tau))-V\sum_{i\in\mathcal{I}}A_{i}(\tau)
 s.t. y_{i}(\tau)\in[0,1].

Bandwidth Resource Allocation: Given y_{i}(\tau), f_{i}(\tau) and z_{i}(\tau), we have

\mathcal{P}_{5-2}:\min_{b_{i}(\tau)}\sum_{i\in\mathcal{I}}H_{i}(\tau)z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ofld}(\tau))+E(\tau)\sum_{i\in\mathcal{I}}\sum_{m\in\mathcal{M}}z_{i}(\tau)(E_{i,m}^{ul}(\tau)+E_{i,m}^{ofld}(\tau))
 s.t. (18b), b_{i}(\tau)\in(0,1].

Computation Resource Allocation: Given y_{i}(\tau), b_{i}(\tau) and z_{i}(\tau), we have

𝒫53:minfi(τ)iHi(τ)(mTi,mpl(t)/K+zi(τ)m\displaystyle\mathcal{P}_{5-3}:\min_{f_{i}(\tau)}\sum_{i\in\mathcal{I}}H_{i}(\tau)(\sum_{m\in\mathcal{M}}T_{i,m}^{pl}(t)/K+z_{i}(\tau)\sum_{m\in\mathcal{M}}
Ti,mud(τ)+Tiexec(τ))+E(τ)i(mEi,mpl(t)/K\displaystyle T_{i,m}^{ud}(\tau)+T_{i}^{exec}(\tau))+E(\tau)\sum_{i\in\mathcal{I}}(\sum_{m\in\mathcal{M}}E_{i,m}^{pl}(t)/K
+mzi(τ)Ei,mud(τ)+Eiexec(τ))\displaystyle+\sum_{m\in\mathcal{M}}z_{i}(\tau)E_{i,m}^{ud}(\tau)+E_{i}^{exec}(\tau))
s.t. (18c),fi(τ)(0,1].\displaystyle\text{ s.t. }(\ref{C4}),f_{i}(\tau)\in(0,1].
Input: Iteration index l=0l=0, iterative convergence threshold ϵ\epsilon, and initial feasible solutions (𝒚(0),𝒃(0),𝒇(0),𝒛(0)\bm{y}^{(0)},\bm{b}^{(0)},\bm{f}^{(0)},\bm{z}^{(0)}).
1 Linear relaxation: zi(τ){0,1}zi(τ)[0,1]z_{i}(\tau)\in\{0,1\}\rightarrow z_{i}(\tau)\in[0,1];
2 repeat
3       Solve 𝒚(l)\bm{y}^{(l)} from 𝒫51\mathcal{P}_{5-1} with given 𝒃(l1),𝒇(l1),𝒛(l1)\bm{b}^{(l-1)},\bm{f}^{(l-1)},\bm{z}^{(l-1)};
4       Solve 𝒃(l)\bm{b}^{(l)} from 𝒫52\mathcal{P}_{5-2} with given 𝒚(l1),𝒇(l1),𝒛(l1)\bm{y}^{(l-1)},\bm{f}^{(l-1)},\bm{z}^{(l-1)};
5       Solve 𝒇(l)\bm{f}^{(l)} from 𝒫53\mathcal{P}_{5-3} with given 𝒚(l1),𝒃(l1),𝒛(l1)\bm{y}^{(l-1)},\bm{b}^{(l-1)},\bm{z}^{(l-1)};
6       Solve 𝒛(l)\bm{z}^{(l)} from 𝒫54\mathcal{P}_{5-4} with given 𝒚(l1),𝒃(l1),𝒇(l1)\bm{y}^{(l-1)},\bm{b}^{(l-1)},\bm{f}^{(l-1)};
7       l=l+1l=l+1;
8until obj(l)(𝒫5)obj(l1)(𝒫5)ϵ\mid obj^{(l)}(\mathcal{P}_{5})-obj^{(l-1)}(\mathcal{P}_{5})\mid\leq\epsilon;
9Round zi(τ)z_{i}(\tau) to 1 with probability zi(τ)z_{i}(\tau);
Output: solution of 𝒫5\mathcal{P}_{5}: (𝒚(l),𝒃(l),𝒇(l),𝒛(l)\bm{y}^{(l)},\bm{b}^{(l)},\bm{f}^{(l)},\bm{z}^{(l)}).
Algorithm 2 BCD-Based Algorithm
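Line 9 of Algorithm 2 recovers a binary offloading decision from the relaxed one by randomized rounding; a minimal sketch of that step (function name and sampling source are illustrative) is:

```python
import random

# Round each relaxed decision z_i(τ) ∈ [0,1] to 1 with probability z_i(τ).
# The rounding is unbiased: E[round(z_i)] = z_i, so z_i = 1.0 always maps
# to 1 and z_i = 0.0 always maps to 0.

def randomized_round(z, rng=random):
    return [1 if rng.random() < zi else 0 for zi in z]
```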

Task Offloading Decision: Given yi(τ)y_{i}(\tau), bi(τ)b_{i}(\tau) and fi(τ)f_{i}(\tau), we have

𝒫54:minzi(τ)iHi(τ)[zi(τ)m(Ti,mul(τ)+Ti,mud(τ)\displaystyle\mathcal{P}_{5-4}:\min_{z_{i}(\tau)}\sum_{i\in\mathcal{I}}H_{i}(\tau)[z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau)
+Ti,mofld(τ))+Tiexec(τ)]+E(τ)i[mzi(τ)(Ei,mul(τ)\displaystyle+T_{i,m}^{ofld}(\tau))+T_{i}^{exec}(\tau)]+E(\tau)\sum_{i\in\mathcal{I}}[\sum_{m\in\mathcal{M}}z_{i}(\tau)(E_{i,m}^{ul}(\tau)
+Ei,mud(τ)+Ei,mofld(τ))+Eiexec(τ)]ViAi(τ)\displaystyle+E_{i,m}^{ud}(\tau)+E_{i,m}^{ofld}(\tau))+E_{i}^{exec}(\tau)]-V\sum_{i\in\mathcal{I}}A_{i}(\tau)
s.t. zi(τ)[0,1].\displaystyle\text{ s.t. }z_{i}(\tau)\in[0,1].
Theorem 2

Problems 𝒫51\mathcal{P}_{5-1}, 𝒫52\mathcal{P}_{5-2}, 𝒫53\mathcal{P}_{5-3} and 𝒫54\mathcal{P}_{5-4} are all convex.

Proof:

Please see Appendix B. ∎

Thanks to Theorem 2, we can easily solve problems 𝒫51\mathcal{P}_{5-1}, 𝒫52\mathcal{P}_{5-2}, 𝒫53\mathcal{P}_{5-3} and 𝒫54\mathcal{P}_{5-4} by leveraging existing software-based optimization solvers (e.g., CVX [39]). Note that all these problems have to be solved iteratively, and the iteration terminates once the objective of problem 𝒫5\mathcal{P}_{5} can no longer be improved. The detailed steps of the designed BCD-based algorithm for solving the small-timescale problem are illustrated in Algorithm 2.

Refer to caption
Figure 3: Flowchart of the proposed TACO approach.

4.4 Analysis of Proposed TACO Approach

In summary, the proposed two-timescale accuracy-aware online optimization approach (TACO) consists of problem reformulation, decomposition and alternation between two timescales. The flowchart of TACO is shown in Fig. 3.

Theorem 3

The proposed TACO approach converges within a finite number of alternations and iterations.

Proof:

The convergence of TACO depends on that of the designed BCD-based algorithm for problem 𝒫5\mathcal{P}_{5} and that of the employed alternating algorithm for problem 𝒫3\mathcal{P}_{3}.

First, to prove the convergence of the BCD-based algorithm, we derive the partial derivatives of the objective function of problem 𝒫5\mathcal{P}_{5} as follows:

𝒚obj(𝒫5)=iHi(τ)zi(τ)m(Si(τ)ri,m(τ)\displaystyle\nabla_{\bm{y}}obj(\mathcal{P}_{5})=\sum_{i\in\mathcal{I}}H_{i}(\tau)z_{i}(\tau)\sum_{m\in\mathcal{M}}(\frac{S_{i}(\tau)}{r_{i,m}(\tau)} (43)
+ai,m(t)Si(τ)Cmfi(τ)Fm)+E(τ)izi(τ)m(Si(τ)piri,m(τ)\displaystyle+\frac{a_{i,m}(t)S_{i}(\tau)C_{m}}{f_{i}(\tau)F_{m}})+E(\tau)\sum_{i\in\mathcal{I}}z_{i}(\tau)\sum_{m\in\mathcal{M}}(\frac{S_{i}(\tau)p_{i}}{r_{i,m}(\tau)}
+ai,m(t)ρm(Fm)2Si(τ)Cm)2Vizi(τ)Si(τ)Di(t)+Si(τ),\displaystyle+a_{i,m}(t)\rho_{m}(F_{m})^{2}S_{i}(\tau)C_{m})-2V\sum_{i\in\mathcal{I}}\frac{z_{i}(\tau)S_{i}(\tau)}{D_{i}(t)+S_{i}(\tau)},
𝒃obj(𝒫5)=[iHi(τ)zi(τ)(yi(τ)Si(τ)\displaystyle\nabla_{\bm{b}}obj(\mathcal{P}_{5})=[\sum_{i\in\mathcal{I}}H_{i}(\tau)z_{i}(\tau)(y_{i}(\tau)S_{i}(\tau)
+λi(τ))+E(τ)izi(τ)pi(yi(τ)Si(τ)+λi(τ))]\displaystyle+\lambda_{i}(\tau))+E(\tau)\sum_{i\in\mathcal{I}}z_{i}(\tau)p_{i}(y_{i}(\tau)S_{i}(\tau)+\lambda_{i}(\tau))]
[pi|hi,m(τ)|2ai,m(t)(Si,m(τ))θN0(τ)(Bm)2ln2\displaystyle[\frac{p_{i}|h_{i,m}(\tau)|^{2}}{a_{i,m}(t)(S_{i,m}(\tau))^{\theta}N_{0}(\tau)(B_{m})^{2}\ln 2} (44)
(Si,m(τ))θN0(τ)Bm((bi(τ))3+(bi(τ))2pi|hi,m(τ)|2)\displaystyle\cdot\frac{(S_{i,m}(\tau))^{\theta}N_{0}(\tau)B_{m}}{((b_{i}(\tau))^{3}+(b_{i}(\tau))^{2}p_{i}|h_{i,m}(\tau)|^{2})}
1log22(1+pi|hi,m(τ)|2(Si,m(τ))θN0bi(τ)Bm)\displaystyle\cdot\frac{1}{\log_{2}^{2}(1+\frac{p_{i}|h_{i,m}(\tau)|^{2}}{(S_{i,m}(\tau))^{\theta}N_{0}b_{i}(\tau)B_{m}})}
1ai,m(t)(bi(τ))2Bmlog2(1+pi|hi,m(τ)|2(Si,m(τ))θN0bi(τ)Bm)],\displaystyle-\frac{1}{a_{i,m}(t)(b_{i}(\tau))^{2}B_{m}\log_{2}(1+\frac{p_{i}|h_{i,m}(\tau)|^{2}}{(S_{i,m}(\tau))^{\theta}N_{0}b_{i}(\tau)B_{m}})}],
𝒇obj(𝒫5)=iHi(τ)[mai,m(t)xi(t)Di(t)Cm(fi(tK))2FmK\displaystyle\nabla_{\bm{f}}obj(\mathcal{P}_{5})=-\sum_{i\in\mathcal{I}}H_{i}(\tau)[\sum_{m\in\mathcal{M}}\frac{a_{i,m}(t)x_{i}(t)D_{i}(t)C_{m}}{(f_{i}(tK))^{2}F_{m}K} (45)
+zi(τ)mai,m(t)yi(τ)Si(τ)Cm(fi(τ))2Fm\displaystyle+z_{i}(\tau)\sum_{m\in\mathcal{M}}\frac{a_{i,m}(t)y_{i}(\tau)S_{i}(\tau)C_{m}}{(f_{i}(\tau))^{2}F_{m}}
+mai,m(t)zi(τ)λi(τ)Cm(fi(τ))2Fm],\displaystyle+\sum_{m\in\mathcal{M}}a_{i,m}(t)z_{i}(\tau)\frac{\lambda_{i}(\tau)C_{m}}{(f_{i}(\tau))^{2}F_{m}}],
𝒛obj(𝒫5)=iHi(τ)[mM(Ti,mul(τ)+Ti,mud(τ)\displaystyle\nabla_{\bm{z}}obj(\mathcal{P}_{5})=\sum_{i\in\mathcal{I}}H_{i}(\tau)[\sum_{m\in M}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau)
+Ti,mofld(τ))+mai,m(t)λi(τ)Cmfi(τ)Fmλi(τ)CiFi]\displaystyle+T_{i,m}^{ofld}(\tau))+\sum_{m\in\mathcal{M}}a_{i,m}(t)\frac{\lambda_{i}(\tau)C_{m}}{f_{i}(\tau)F_{m}}-\frac{\lambda_{i}(\tau)C_{i}}{F_{i}}]
+E(τ)i[m(Ei,mul(τ)+Ei,mud(τ)+Ei,mofld(τ))\displaystyle+E(\tau)\sum_{i\in\mathcal{I}}[\sum_{m\in\mathcal{M}}(E_{i,m}^{ul}(\tau)+E_{i,m}^{ud}(\tau)+E_{i,m}^{ofld}(\tau)) (46)
+mai,m(t)ρm(Fm)2λi(τ)Cmρi(Fi)2λi(τ)Ci]\displaystyle+\sum_{m\in\mathcal{M}}a_{i,m}(t)\rho_{m}(F_{m})^{2}\lambda_{i}(\tau)C_{m}-\rho_{i}(F_{i})^{2}\lambda_{i}(\tau)C_{i}]
Vi[1(1xi(t)Di(t)+yi(τ)Si(τ)Di(t)+Si(τ))2gilocal].\displaystyle-V\sum_{i\in\mathcal{I}}[1-(1-\frac{x_{i}(t)D_{i}(t)+y_{i}(\tau)S_{i}(\tau)}{D_{i}(t)+S_{i}(\tau)})^{2}-g_{i}^{local}].

Evidently, (43) and (46) are constants, and (44) and (45) are linear, meaning that all derived partial derivatives are L-Lipschitz continuous according to [40], and hence the BCD-based algorithm converges within a finite number of iterations.

Then, to prove the convergence of the alternating algorithm, we scale the objective function of 𝒫3\mathcal{P}_{3} as

objl1(𝒫3)=obj(𝒫3)(𝒂l1,𝒙l1,𝒚l1,𝒃l1,𝒇l1,𝒛l1)\displaystyle obj^{l-1}(\mathcal{P}_{3})=obj(\mathcal{P}_{3})(\bm{a}^{l-1},\bm{x}^{l-1},\bm{y}^{l-1},\bm{b}^{l-1},\bm{f}^{l-1},\bm{z}^{l-1})
obj(𝒫3)(𝒂l,𝒙l,𝒚l1,𝒃l1,𝒇l1,𝒛l1)\displaystyle\geq obj(\mathcal{P}_{3})(\bm{a}^{l},\bm{x}^{l},\bm{y}^{l-1},\bm{b}^{l-1},\bm{f}^{l-1},\bm{z}^{l-1}) (47)
obj(𝒫3)(𝒂l,𝒙l,𝒚l,𝒃l,𝒇l,𝒛l)=objl(𝒫3),\displaystyle\geq obj(\mathcal{P}_{3})(\bm{a}^{l},\bm{x}^{l},\bm{y}^{l},\bm{b}^{l},\bm{f}^{l},\bm{z}^{l})=obj^{l}(\mathcal{P}_{3}),

where ll is the number of alternations. The first inequality holds due to the sub-optimality of {𝒂l,𝒙l}\{\bm{a}^{l},\bm{x}^{l}\} given by the PME-based algorithm, and the second inequality holds because of the sub-optimality of {𝒚l,𝒃l,𝒇l,𝒛l}\{\bm{y}^{l},\bm{b}^{l},\bm{f}^{l},\bm{z}^{l}\} given by the BCD-based algorithm. This indicates that the objective function of 𝒫3\mathcal{P}_{3} decreases monotonically over the alternating process; since it is lower-bounded by VKI-VKI (i.e., by setting the two queue backlogs to 0 and all Ai(τ)A_{i}(\tau) to 11), the alternation terminates within a finite number of rounds. ∎

Theorem 4

The computational complexity of the proposed TACO approach is O(TRmax((I+M)log2(I+M)+I3M3N3+IM2.055+4WmaxI3+22.055I))O(TR^{max}((I+M)log_{2}(I+M)+I^{3}M^{3}N^{3}+IM^{2.055}+4W^{max}I^{3}+2^{2.055}I)), where II is the number of PTs, MM is the number of ESs, NN is the number of partitions in the PME-based algorithm, TT is the number of time frames, RmaxR^{max} is the number of iterations in the alternating algorithm between two timescales, and WmaxW^{max} is the number of iterations in the BCD-based algorithm.

Proof:

The complexity of TACO mainly depends on the alternating algorithm, which integrates the PME-based and BCD-based algorithms.

For the PME-based algorithm, as stated in [41], the computational complexity for obtaining an initial feasible solution is O((I+M)log2(I+M))O((I+M)log_{2}(I+M)), and that of solving (42) is O(I3M3N3)O(I^{3}M^{3}N^{3}) with the interior point method [42] in the CVX solver. Besides, the linear relaxation and linear programming for solving ai,m(t)a_{i,m}(t) has an asymptotic computational complexity of O(IM2.055)O(IM^{2.055}) [43]. For the BCD-based algorithm, the total computational complexity for solving all subproblems is O(4I3)O(4I^{3}) with the interior point method [42] in the CVX solver, and that of the linear relaxation and linear programming for solving zi(τ)z_{i}(\tau) is O(22.055I)O(2^{2.055}I).

To sum up, the computational complexity of TACO can be expressed as O(TRmax((I+M)log2(I+M)+I3M3N3+IM2.055+4WmaxI3+22.055I))O(TR^{max}((I+M)log_{2}(I+M)+I^{3}M^{3}N^{3}+IM^{2.055}+4W^{max}I^{3}+2^{2.055}I)). ∎
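To see which term drives the bound in Theorem 4 in practice, one can evaluate the per-frame terms for sample sizes (the values of N and W^max below are assumed for illustration; I and M match the simulation settings of Section 5.1):

```python
from math import log2

# Per-frame complexity terms of Theorem 4; for even moderate partition
# numbers N, the PME term I^3 * M^3 * N^3 dominates all the others.

def per_frame_terms(I, M, N, W_max):
    return {
        "init": (I + M) * log2(I + M),   # initial feasible solution [41]
        "pme":  I ** 3 * M ** 3 * N ** 3,  # solving (42) via interior point
        "lp_a": I * M ** 2.055,            # LP rounding of a_{i,m}(t)
        "bcd":  4 * W_max * I ** 3,        # four convex subproblems per BCD pass
        "lp_z": 2 ** 2.055 * I,            # LP rounding of z_i(τ)
    }

terms = per_frame_terms(I=40, M=10, N=4, W_max=20)
dominant = max(terms, key=terms.get)   # → "pme"
```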

Theorem 5

Given Lyapunov control parameter VV, the optimality gap between the solution obtained by the proposed TACO approach and the theoretically optimal solution to problem 𝒫1\mathcal{P}_{1} can be expressed as

t𝒯τ𝒯t𝔼[iAi(τ)𝚯(τ)]/KT𝒪G/V+(Λ+Γ)/VT,\sum_{t\in\mathcal{T}}\sum_{\tau\in\mathcal{T}_{t}}\mathbb{E}[\sum_{i\in\mathcal{I}}A_{i}(\tau)\mid\bm{\Theta}(\tau)]/KT-\mathcal{O}\leq G/V+(\Lambda+\Gamma)/VT, (48)

where 𝒪\mathcal{O} stands for the theoretically optimal solution, Λ\Lambda is the optimality gap of the PME-based algorithm, Γ\Gamma represents the optimality gap of the BCD-based algorithm, and GG is defined in (27).

Proof:

First, the drift-plus-penalty inequality derived in Section 4.2 can be expanded and rewritten as

τ𝒯tΔ(𝚯(τ))Vτ𝒯t𝔼[iAi(τ)𝚯(τ)]\displaystyle\sum_{\tau\in\mathcal{T}_{t}}\Delta(\bm{\Theta}(\tau))-V\cdot\sum_{\tau\in\mathcal{T}_{t}}\mathbb{E}[\sum_{i\in\mathcal{I}}A_{i}(\tau)\mid\bm{\Theta}(\tau)] (49)
GK+τ𝒯t{i𝔼[Hi(τ)[Titol(τ)Timax]𝚯(τ)]\displaystyle\leq GK+\sum_{\tau\in\mathcal{T}_{t}}\{\sum_{i\in\mathcal{I}}\mathbb{E}[H_{i}(\tau)[T_{i}^{tol}(\tau)-T_{i}^{max}]\mid\bm{\Theta}(\tau)]
+𝔼{E(τ)[Etol(τ)Emax]𝚯(τ)}\displaystyle+\mathbb{E}\{E(\tau)\cdot[E^{tol}(\tau)-E^{max}]\mid\bm{\Theta}(\tau)\}
V𝔼[iAi(τ)𝚯(τ)]}\displaystyle-V\cdot\mathbb{E}[\sum_{i\in\mathcal{I}}A_{i}(\tau)\mid\bm{\Theta}(\tau)]\}
GK+VK𝒪+Λ+Γ.\displaystyle\leq GK+V\cdot K\mathcal{O}+\Lambda+\Gamma.

Then, by summing up (49) over TT time frames, we have

(G+V𝒪+1K(Λ+Γ))KT\displaystyle(G+V\cdot\mathcal{O}+\frac{1}{K}(\Lambda+\Gamma))\cdot KT (50)
tT{τ𝒯tΔ(𝚯(τ))Vτ𝒯t𝔼[iAi(τ)𝚯(τ)]}\displaystyle\geq\sum_{t\in T}\{\sum_{\tau\in\mathcal{T}_{t}}\Delta(\bm{\Theta}(\tau))-V\cdot\sum_{\tau\in\mathcal{T}_{t}}\mathbb{E}[\sum_{i\in\mathcal{I}}A_{i}(\tau)\mid\bm{\Theta}(\tau)]\}
=𝔼[L(𝚯(KT))L(𝚯(0))]\displaystyle=\mathbb{E}[L(\bm{\Theta}(KT))-L(\bm{\Theta}(0))]
+VtTτ𝒯t𝔼[iAi(τ)𝚯(τ)]\displaystyle+V\cdot\sum_{t\in T}\sum_{\tau\in\mathcal{T}_{t}}\mathbb{E}[\sum_{i\in\mathcal{I}}A_{i}(\tau)\mid\bm{\Theta}(\tau)]
=𝔼[L(𝚯(KT))]𝔼[L(𝚯(0))]\displaystyle=\mathbb{E}[L(\bm{\Theta}(KT))]-\mathbb{E}[L(\bm{\Theta}(0))]
+VtTτ𝒯t𝔼[iAi(τ)𝚯(τ)]\displaystyle+V\cdot\sum_{t\in T}\sum_{\tau\in\mathcal{T}_{t}}\mathbb{E}[\sum_{i\in\mathcal{I}}A_{i}(\tau)\mid\bm{\Theta}(\tau)]

Afterwards, by moving 𝔼[L(𝚯(0))]\mathbb{E}[L(\bm{\Theta}(0))] to the left-hand-side of (50), and then dividing both sides by TVTV, inequality (48) can be obtained.

Next, we further analyze the optimality gaps Λ\Lambda and Γ\Gamma as follows. For Λ\Lambda, according to [36], its value is theoretically bounded within the range Λ[0,0.12]\Lambda\in[0,0.12]. For Γ\Gamma, by letting 𝒫~5\tilde{\mathcal{P}}_{5} and 𝒫5\mathcal{P}_{5}^{*} denote the objective values achieved by the BCD-based algorithm and the optimal solution, respectively, and taking the difference between them, we have

𝒫~5𝒫5\displaystyle\tilde{\mathcal{P}}_{5}-\mathcal{P}_{5}^{*}
Γ\displaystyle\leq\Gamma
=iHi(τ){z~i(τ)m(Si(τ)+λi(τ)+ai,m(t)Si(τ)Cm)\displaystyle=\sum_{i\in\mathcal{I}}H_{i}(\tau)\{\tilde{z}_{i}(\tau)\sum_{m\in\mathcal{M}}(S_{i}(\tau)+\lambda_{i}(\tau)+a_{i,m}(t)S_{i}(\tau)C_{m})
+zi(τ)ai,m(t)Si(τ)Cm}+E(τ)im[z~i(τ)\displaystyle+z_{i}^{*}(\tau)a_{i,m}(t)S_{i}(\tau)C_{m}\}+E(\tau)\sum_{i\in\mathcal{I}}\sum_{m\in\mathcal{M}}[\tilde{z}_{i}(\tau)
[ai,m(t)ρm(Fm)2Si(τ)Cm+Si(τ)pi+zi(τ)Si(τ)pi]\displaystyle[a_{i,m}(t)\rho_{m}(F_{m})^{2}S_{i}(\tau)C_{m}+S_{i}(\tau)p_{i}+z_{i}^{*}(\tau)S_{i}(\tau)p_{i}]
+Vi(Ai(τ)+A~i(τ))=iHi(τ)(z~i(τ)ωi(τ)+zi(τ)ϕi(τ))\displaystyle+V\sum_{i\in\mathcal{I}}(A_{i}^{*}(\tau)+\tilde{A}_{i}(\tau))=\sum_{i\in\mathcal{I}}H_{i}(\tau)(\tilde{z}_{i}(\tau)\omega_{i}(\tau)+z_{i}^{*}(\tau)\phi_{i}(\tau))
+E(τ)im[z~i(τ)ωi(τ)+zi(τ)ϕi(τ)]\displaystyle+E(\tau)\sum_{i\in\mathcal{I}}\sum_{m\in\mathcal{M}}[\tilde{z}_{i}(\tau)\omega_{i}^{\prime}(\tau)+z_{i}^{*}(\tau)\phi_{i}^{\prime}(\tau)]
+Vi(Ai(τ)+A~i(τ)),\displaystyle+V\sum_{i\in\mathcal{I}}(A_{i}^{*}(\tau)+\tilde{A}_{i}(\tau)), (51)

where ωi(τ)=m(Si(τ)+λi(τ)+ai,m(t)Si(τ)Cm)\omega_{i}(\tau)=\sum_{m\in\mathcal{M}}(S_{i}(\tau)+\lambda_{i}(\tau)+a_{i,m}(t)S_{i}(\tau)C_{m}), ϕi(τ)=ai,m(t)Si(τ)Cm\phi_{i}(\tau)=a_{i,m}(t)S_{i}(\tau)C_{m}, ωi(τ)=ai,m(t)ρm(Fm)2Si(τ)Cm\omega_{i}^{\prime}(\tau)=a_{i,m}(t)\rho_{m}(F_{m})^{2}S_{i}(\tau)C_{m} and ϕi(τ)=Si(τ)pi\phi_{i}^{\prime}(\tau)=S_{i}(\tau)p_{i}, all of which are constants in each time slot τ𝒯t\tau\in\mathcal{T}_{t}. ∎

5 Simulation Results

In this section, simulations are conducted to evaluate the performance of the proposed TACO approach for jointly optimizing the HDT deployment (i.e., generic placement and customized update of the VT model) and task offloading in an end-edge-cloud collaborative framework. All simulation results are obtained based on real-world datasets (including a human activity dataset [44] and an ES distribution dataset [45]), by taking averages over 1000 runs under various parameter settings.

5.1 Simulation Settings

Consider an HDT system in a 1km×1km1km\times 1km square area with M=10M=10 ESs and I=40I=40 PTs (following a Random-Waypoint (RWP) mobility model under the same settings as those in [46]). According to [31], for the HDT-assisted complex task execution, the average accuracies of edge execution and local execution for any PT ii’s tasks are approximated as giedge(di(τ))=1[1di(τ)/(Di(t)+Si(τ))]2g_{i}^{edge}(d_{i}(\tau))=1-[1-d_{i}(\tau)/(D_{i}(t)+S_{i}(\tau))]^{2} (which is a function of the total data size for PT ii’s corresponding VT construction at the edge, i.e., di(τ)d_{i}(\tau)) and gilocal=0.5g_{i}^{local}=0.5, respectively. Table II lists the values of the main simulation parameters, most of which have also been widely employed in the literature [11], [12], [20]. Furthermore, to show the superiority of the proposed TACO approach, the following schemes are simulated as benchmarks. Note that, since the original objectives of these benchmark schemes are different from ours, for the fairness of comparison, we have modified them to adapt to the considered settings and particularly changed their optimization objectives to align with ours.

  • LOT [20]: Generic VT model placement from the cloud and PTs’ task offloading are jointly determined by adopting a contract theory-based incentive mechanism. However, this scheme ignores the customized VT update by collecting personalized data from end devices, and it optimizes all decisions synchronously in a single-timescale only.

  • CRO [10]: Both generic VT model placement from the cloud and customized VT update from end devices along with PTs’ task offloading are jointly determined by adopting a double auction based optimization, while all decisions are optimized in a single-timescale for simplicity.

TABLE II: Simulation Parameters
Parameter Value Parameter Value
pip_{i} 500 mW BmB_{m} 5 MHz
pcp^{c} 5 W FmF_{m} 20 GHz
θ\theta 4 Cm,CiC_{m},C_{i} 300 cycles/bit
N0N_{0} -174 dBm/Hz ρm,ρi\rho_{m},\rho_{i} 102710^{-27}
Si(τ)S_{i}(\tau) [6.1, 12.2] Mbits rcr^{c} 50 Mbps
Di(t)D_{i}(t) [73.2, 97.6] Mbits KK 10
λi(τ)\lambda_{i}(\tau) [10, 20] Mbits MM 10
FiF_{i} 1 GHz II 40
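The accuracy model described above can be transcribed directly; the sample values of D_i(t) and S_i(τ) below are illustrative picks from within the ranges in Table II:

```python
# Accuracy models used in the simulations (Section 5.1): edge-execution
# accuracy grows with the total data size d used for VT construction,
# reaching 1 when the full generic + personalized data (D + S) is used,
# while local execution is fixed at 0.5.

def g_edge(d, D, S):
    """1 - [1 - d/(D + S)]^2, for 0 <= d <= D + S (all in Mbits)."""
    return 1 - (1 - d / (D + S)) ** 2

G_LOCAL = 0.5

# With D = 80 and S = 10 Mbits, d = 0 gives accuracy 0, d = 45 gives 0.75
# (already above G_LOCAL), and d = 90 gives 1.0.
```

The concavity of g_edge means early data contributes more accuracy per bit than later data, which is why partial VT construction can already beat local execution.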

5.2 Performance Evaluations

Refer to caption
Figure 4: Convergence of the proposed TACO approach.

Fig. 4 examines the convergence of the proposed TACO approach in solving problem 𝒫1\mathcal{P}_{1} by showing the average HDT-assisted task execution accuracy under different values of parameter VV ranging from 10610^{6} to 8×1068\times 10^{6}. The timeline is divided into T=200T=200 time frames, and each frame has K=10K=10 time slots. It is shown that, for all three cases, within 5050 time frames, the task execution accuracy decreases at the beginning but quickly converges over time, which verifies the convergence property of TACO in solving problem 𝒫1\mathcal{P}_{1}. Besides, from this figure, it can also be observed that the average task execution accuracy increases with VV. The reason is that the Lyapunov parameter VV controls the relative weight placed on maximizing the HDT-assisted task execution accuracy versus enforcing the delay and energy consumption constraints, and a larger VV indicates more emphasis on the task execution accuracy.

Refer to caption
(a) Queue backlogs w.r.t VV.
Refer to caption
(b) Queue backlogs w.r.t KK.
Figure 5: Stability of two queue backlogs by varying VV and KK.
Refer to caption
Figure 6: Comparison on updating delay of customized VT models with different bandwidth resources of ESs BmB_{m}.
Refer to caption
Figure 7: Comparison on placement delay of generic VT models with different transmission rate rcr^{c}.

Fig. 5 demonstrates the stability of the proposed TACO approach by showing the performance of the delay overflow queue backlog Hi(τ)H_{i}(\tau) and the energy consumption deficit queue backlog E(τ)E(\tau) brought by the Lyapunov decomposition under varying VV and KK. From this figure, we can see that the two queue backlogs decrease and quickly stabilize over time, because TACO focuses on controlling the system costs (i.e., service response delay and system energy consumption) to minimize the objective function of 𝒫3\mathcal{P}_{3}, thereby shrinking both queue backlogs, and can eventually achieve a good balance between the task execution accuracy and system costs, leading to a stable outcome. Besides, in Fig. 5(a), it is intuitive that both queue backlogs stabilize at higher values as VV increases, since more emphasis is placed on maximizing the task execution accuracy, resulting in the growth of delay and energy consumption. Meanwhile, Fig. 5(b) shows that both queue backlogs stabilize at lower values as KK increases. This is because a larger value of KK means that generic VT models are placed less frequently, which greatly reduces the generic VT model placement delay (i.e., Ti,mdl(τ)+Ti,mpl(τ)T_{i,m}^{dl}(\tau)+T_{i,m}^{pl}(\tau)) and energy consumption (i.e., Ei,mdl(τ)+Ei,mpl(τ)E_{i,m}^{dl}(\tau)+E_{i,m}^{pl}(\tau)).
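The stabilizing behavior in Fig. 5 follows the standard Lyapunov virtual-queue dynamics; a minimal sketch (illustrative of the mechanism, not the paper's exact update rule) is:

```python
# Virtual-queue sketch: the delay-overflow backlog H grows by the per-slot
# constraint violation T_tol(τ) - T_max and is truncated at zero, so it
# rises while the delay constraint is violated and drains back to zero
# once the controller keeps the long-term delay below T_max.

def queue_trajectory(violations, H0=0.0):
    """violations[τ] = T_tol(τ) - T_max; returns the backlog over time."""
    H, traj = H0, []
    for v in violations:
        H = max(H + v, 0.0)
        traj.append(H)
    return traj
```

Feeding a burst of violations followed by sustained slack reproduces the rise-then-stabilize shape seen in the figure.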

Refer to caption
Figure 8: Comparison on the service response delay with different numbers of PTs II.
Refer to caption
Figure 9: Comparison on the system energy consumption with different numbers of PTs II.
Refer to caption
(a) Accuracy w.r.t BmB_{m}.
Refer to caption
(b) Accuracy w.r.t rcr^{c}.
Refer to caption
(c) Accuracy w.r.t II.
Figure 10: Comparison on the average task execution accuracy by varying bandwidth resources of ESs BmB_{m}, transmission rate rcr^{c} and numbers of PTs II.

Fig. 6 illustrates the average updating delay of customized VT models (i.e., t𝒯τ𝒯ti(Ti,mul(τ)+Ti,mud(τ))/TKI\sum_{t\in\mathcal{T}}\sum_{\tau\in\mathcal{T}_{t}}\sum_{i\in\mathcal{I}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau))/TKI) with different ES bandwidth BmB_{m} under LOT, CRO and the proposed TACO approach. Since LOT ignores the customized VT model update, meaning that PTs’ personalized data does not need to be uploaded and processed, its average updating delay remains zero regardless of the ES bandwidth BmB_{m}. In contrast, the average updating delay decreases with the increase of BmB_{m} for both CRO and the proposed TACO approach due to the reduction in the uploading delay Ti,mul(τ)T_{i,m}^{ul}(\tau), as more uplink bandwidth resource is provided. Besides, we can also see that TACO outperforms CRO because, with the help of generic VT model placement, TACO needs to upload much less personalized data than CRO does.
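The bandwidth effect in Fig. 6 follows from the Shannon-rate model implied by (44): more bandwidth raises the uplink rate and thus lowers the uploading delay. The sketch below uses illustrative numbers (power, channel gain, distance and noise density are assumptions, roughly consistent with Table II), not the paper's exact settings:

```python
from math import log2

# Uplink delay T_ul = S / r with r = b*B*log2(1 + p|h|^2 / (dist^θ * N0 * b*B)),
# the rate form appearing inside (44). The Shannon rate is increasing in the
# allocated bandwidth b*B, so the delay decreases as B_m grows.

def uplink_delay(S_bits, b, B_hz, p_w, h_gain, dist, theta, N0_w_per_hz):
    bw = b * B_hz                                   # allocated bandwidth (Hz)
    snr = p_w * h_gain / (dist ** theta * N0_w_per_hz * bw)
    rate = bw * log2(1 + snr)                       # bits per second
    return S_bits / rate
```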

Fig. 7 shows the average placement delay of generic VT models (i.e., t𝒯τ𝒯ti(Ti,mdl(t)+Ti,mpl(t))/TKI\sum_{t\in\mathcal{T}}\sum_{\tau\in\mathcal{T}_{t}}\sum_{i\in\mathcal{I}}(T_{i,m}^{dl}(t)+T_{i,m}^{pl}(t))/TKI) with different cloud-ES transmission rates rcr^{c} under LOT, CRO and the proposed TACO approach. It is obvious that the placement delay decreases logarithmically as rcr^{c} increases for all schemes. Furthermore, from this figure, we can see that LOT has the worst performance because it ignores the customized model update and can only download as much experiential knowledge as possible to improve the average task execution accuracy, resulting in the highest average placement delay. In contrast, both CRO and TACO outperform LOT due to the introduction of the customized VT update, alleviating the burden on generic VT model placement. Moreover, the proposed TACO approach achieves the best performance (with the placement delay reduced by 14.3%14.3\% and 25.5%25.5\% on average compared to CRO and LOT, respectively). This is because TACO builds VTs in two timescales (with large-timescale generic model placement and small-timescale customized model update), which only requires downloading experiential knowledge at the beginning of each time frame, significantly reducing the total data size in the process of model placement.

Figs. 8 and 9 investigate the service response delay and system energy consumption achieved by LOT, CRO and the proposed TACO approach with different numbers of PTs II. Both figures show that the service response delay and system energy consumption increase exponentially with II for all schemes, because a larger II implies more demands for VT construction with a potentially larger amount of data competing for limited communication and computation resources. Besides, LOT performs worst in these two metrics because its task execution accuracy relies only on the experiential knowledge downloaded in VT construction, and thus it has to download a much larger amount of experiential knowledge to guarantee a desired task execution accuracy. Meanwhile, CRO and TACO outperform LOT, especially when II becomes larger, which reveals the necessity of the customized VT model update in addition to the generic VT model placement. Moreover, the proposed TACO approach achieves the best performance, for the same reason as explained for Fig. 7.

Fig. 10 compares the average task execution accuracy achieved by LOT, CRO and the proposed TACO approach by varying the ES bandwidth BmB_{m}, transmission rate rcr^{c} and number of PTs II. It is shown that, i) in Fig. 10(a), the average task execution accuracy increases with BmB_{m} for all schemes except LOT; ii) in Fig. 10(b), the average task execution accuracy increases with rcr^{c} for all three schemes; and iii) in Fig. 10(c), the average task execution accuracy decreases with II for all three schemes. The main reason is that, when more resource is supplied (i.e., BmB_{m} and rcr^{c} are large) or the total resource demand is less competitive (i.e., II is small), more experiential knowledge and personalized data can be transmitted for dynamic VT construction and more tasks can be offloaded for HDT-assisted edge execution, all contributing to the enhancement of the task execution accuracy. Furthermore, from this figure, we can see that LOT has the worst performance because it only considers the generic VT model placement, while CRO is better than LOT thanks to the joint consideration of both the generic VT model placement and the customized VT model update. Moreover, TACO achieves the best performance because it further strikes a balance between generic VT model placement and customized VT model update by conducting these two processes asynchronously in two different timescales.

6 Conclusion

In this paper, the optimization of HDT deployment at the network edge has been studied. Particularly, aiming to maximize the accuracy of complex task execution assisted by HDT under various system uncertainties (e.g., random mobility and status variations), a two-timescale online optimization problem has been formulated for jointly determining VTs’ construction (including dynamic generic model placement and customized model update) and PTs’ task offloading, together with the management of access selection and the corresponding communication and computation resource allocations. A novel solution approach, called TACO, has been proposed, which decomposes the online problem into a series of deterministic ones and then leverages PME-based and BCD-based algorithms for solving different subproblems in different timescales alternately. Theoretical analyses and simulations show that TACO is superior to its counterparts in the optimization of HDT deployment at the edge, not only improving the HDT-assisted task execution accuracy but also reducing the service response delay and the overall system energy consumption.

References

  • [1] S. D. Okegbile, J. Cai, C. Yi, and D. Niyato, “Human digital twin for personalized healthcare: Vision, architecture and future directions,” IEEE Netw., 2022.
  • [2] Y. Lin, L. Chen, A. Ali, C. Nugent, C. Ian, R. Li, D. Gao, H. Wang, Y. Wang, and H. Ning, “Human digital twin: A survey,” arXiv preprint arXiv:2212.05937, 2022.
  • [3] P. Thamotharan, S. Srinivasan et al., “Human digital twin for personalized elderly type 2 diabetes management,” J. Clin. Med., vol. 12, no. 6, p. 2094, 2023.
  • [4] B. Björnsson, C. Borrebaeck et al., “Digital twins to personalize medicine,” Genome Med., vol. 12, pp. 1–4, 2020.
  • [5] J. Chen, C. Yi et al., “Networking architecture and key supporting technologies for human digital twin in personalized healthcare: A comprehensive survey,” IEEE Commun. Surv. Tutor., 2023.
  • [6] S. D. Okegbile and J. Cai, “Edge-assisted human-to-virtual twin connectivity scheme for human digital twin frameworks,” in Proc. IEEE VTC, 2022, pp. 1–6.
  • [7] J. Ren, D. Zhang, S. He et al., “A survey on end-edge-cloud orchestrated network computing paradigms: Transparent computing, mobile edge computing, fog computing, and cloudlet,” ACM Comput. Surv. (CSUR), vol. 52, no. 6, pp. 1–36, 2019.
  • [8] P. Jia, X. Wang, and X. Shen, “Accurate and efficient digital twin construction using concurrent end-to-end synchronization and multi-attribute data resampling,” IEEE Internet Things J., vol. 10, no. 6, pp. 4857–4870, 2022.
  • [9] R. Dong, C. She et al., “Deep learning for hybrid 5G services in mobile edge computing systems: Learn from a digital twin,” IEEE Trans. Wirel. Commun., vol. 18, no. 10, pp. 4692–4707, 2019.
  • [10] Y. Zhang, H. Zhang, Y. Lu, W. Sun, L. Wei, Y. Zhang, and B. Wang, “Adaptive digital twin placement and transfer in wireless computing power network,” IEEE Internet Things J., pp. 1–1, 2023.
  • [11] J. Wang, J. Hu et al., “Online service migration in mobile edge with incomplete system information: A deep recurrent actor-critic learning approach,” IEEE Trans. Mob. Comput., 2022.
  • [12] T. Ouyang, Z. Zhou et al., “Follow me at the edge: Mobility-aware dynamic service placement for mobile edge computing,” IEEE J. Sel. Areas Commun., vol. 36, no. 10, pp. 2333–2345, 2018.
  • [13] D. Harutyunyan, N. Shahriar et al., “Latency and mobility–aware service function chain placement in 5G networks,” IEEE Trans. Mob. Comput., vol. 21, no. 5, pp. 1697–1709, 2020.
  • [14] B. Wang, H. Zhou et al., “Human digital twin in the context of Industry 5.0,” Robot Comput. Integr. Manuf., vol. 85, p. 102626, 2024.
  • [15] M. Lauer-Schmaltz, P. Cash et al., “Designing human digital twins for behaviour-changing therapy and rehabilitation: a systematic review,” Proc. Design Soc., vol. 2, pp. 1303–1312, 2022.
  • [16] Y. Shi, C. Yi, R. Wang, Q. Wu, B. Chen, and J. Cai, “Service migration or task rerouting: A two-timescale online resource optimization for MEC,” IEEE Trans. Wirel. Commun., 2023.
  • [17] D. Lee, J. Cho, and J. Kim, “Meta-human synchronization framework for large-scale digital twin,” in Proc. IEEE MetaCom, 2023, pp. 738–741.
  • [18] R. Zhong, B. Hu et al., “Construction of human digital twin model based on multimodal data and its application in locomotion mode identification,” Chin. J. Mech. Eng., vol. 36, no. 1, p. 126, 2023.
  • [19] Y. Liu, L. Zhang et al., “A novel cloud-based framework for the elderly healthcare services using digital twin,” IEEE Access, vol. 7, pp. 49 088–49 101, 2019.
  • [20] X. Lin, J. Wu, J. Li, W. Yang, and M. Guizani, “Stochastic digital-twin service demand with edge response: An incentive-based congestion control approach,” IEEE Trans. Mob. Comput., 2021.
  • [21] Y. Ding, K. Li, C. Liu, and K. Li, “A potential game theoretic approach to computation offloading strategy optimization in end-edge-cloud computing,” IEEE Trans. Parallel Distrib. Syst., vol. 33, no. 6, pp. 1503–1519, 2021.
  • [22] Y. Shi, C. Yi, B. Chen, C. Yang, K. Zhu, and J. Cai, “Joint online optimization of data sampling rate and preprocessing mode for edge–cloud collaboration-enabled industrial IoT,” IEEE Internet Things J., vol. 9, no. 17, pp. 16 402–16 417, 2022.
  • [23] Y. He, Y. Ren et al., “Two-timescale resource allocation for automated networks in IIoT,” IEEE Trans. Wirel. Commun., vol. 21, no. 10, pp. 7881–7896, 2022.
  • [24] H. Ma, Z. Zhou, and X. Chen, “Leveraging the power of prediction: Predictive service placement for latency-sensitive mobile edge computing,” IEEE Trans. Wirel. Commun., vol. 19, no. 10, pp. 6454–6468, 2020.
  • [25] R. Zhou, X. Wu, H. Tan, and R. Zhang, “Two time-scale joint service caching and task offloading for UAV-assisted mobile edge computing,” in Proc. IEEE INFOCOM, 2022, pp. 1189–1198.
  • [26] W. Shengli, “Is human digital twin possible?” Comput. Methods and Programs in Biomed. Update, vol. 1, p. 100014, 2021.
  • [27] J. Xie, R. Yang et al., “Dual digital twin: Cloud–edge collaboration with lyapunov-based incremental learning in EV batteries,” Appl. Energy, vol. 355, p. 122237, 2024.
  • [28] M. Mohammadi, H. A. Suraweera, and C. Tellambura, “Uplink/downlink rate analysis and impact of power allocation for full-duplex cloud-RANs,” IEEE Trans. Wirel. Commun., vol. 17, no. 9, pp. 5774–5788, 2018.
  • [29] C. Yi, J. Cai et al., “A queueing game based management framework for fog computing with strategic computing speed control,” IEEE Trans. Mob. Comput., vol. 21, no. 5, pp. 1537–1551, 2020.
  • [30] A. Goldsmith, Wireless communications.   Cambridge university press, 2005.
  • [31] W. Wu, P. Yang et al., “Accuracy-guaranteed collaborative DNN inference in industrial IoT via deep reinforcement learning,” IEEE Trans. Ind. Inform., vol. 17, no. 7, pp. 4988–4998, 2020.
  • [32] K.-C. Wu, W.-Y. Liu, and S.-Y. Wu, “Dynamic deployment and cost-sensitive provisioning for elastic mobile cloud services,” IEEE Trans. Mob. Comput., vol. 17, no. 6, pp. 1326–1338, 2017.
  • [33] C. Yi, J. Cai et al., “Workload re-allocation for edge computing with server collaboration: A cooperative queueing game approach,” IEEE Trans. Mob. Comput., 2021.
  • [34] M. Neely, Stochastic network optimization with application to communication and queueing systems.   Springer Nature, 2022.
  • [35] L. Georgiadis, M. J. Neely, L. Tassiulas et al., “Resource allocation and cross-layer control in wireless networks,” Found. Trends® Netw., vol. 1, no. 1, pp. 1–144, 2006.
  • [36] P. M. Castro, “Tightening piecewise McCormick relaxations for bilinear problems,” Comput. Chem. Eng., vol. 72, pp. 300–311, 2015.
  • [37] R. Karuppiah and I. E. Grossmann, “Global optimization for the synthesis of integrated water systems in chemical processes,” Comput. Chem. Eng., vol. 30, no. 4, pp. 650–673, 2006.
  • [38] C. A. Meyer and C. A. Floudas, “Global optimization of a combinatorially complex generalized pooling problem,” AIChE J., vol. 52, no. 3, pp. 1027–1037, 2006.
  • [39] C. Bliek, P. Bonami, and A. Lodi, “Solving mixed-integer quadratic programming problems with IBM-CPLEX: a progress report,” in Proc. RAMP Symp., 2014, pp. 16–17.
  • [40] J. Bolte, S. Sabach, and M. Teboulle, “Proximal alternating linearized minimization for nonconvex and nonsmooth problems,” Math. Program., vol. 146, no. 1-2, pp. 459–494, 2014.
  • [41] T. P. Peixoto, “Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models,” Phys. Rev. E, vol. 89, no. 1, p. 012804, 2014.
  • [42] S. P. Boyd and L. Vandenberghe, Convex Optimization.   Cambridge University Press, 2004.
  • [43] A. Srinivasan, “Approximation algorithms via randomized rounding: a survey,” Adv. Topics in Math., PWN, pp. 9–71, 1999.
  • [44] M. Zhang and A. A. Sawchuk, “USC-HAD: A daily activity dataset for ubiquitous activity recognition using wearable sensors,” in Proc. ACM SAGAware, Pittsburgh, Pennsylvania, USA, September 2012.
  • [45] S. Telecom, “The distribution of 3233 base stations,” 2019.
  • [46] D. B. Johnson and D. A. Maltz, “Dynamic source routing in ad hoc wireless networks,” Mobile Comput., pp. 153–181, 1996.

Appendix A

By squaring both sides of the delay overflow queue update in (22), we have

\displaystyle H_{i}^{2}(\tau+1)=\{[H_{i}(\tau)+T_{i}^{tol}(\tau)-T_{i}^{max}]^{+}\}^{2} (52)
\displaystyle\leq H_{i}^{2}(\tau)+(T_{i}^{tol}(\tau)-T_{i}^{max})^{2}+2H_{i}(\tau)(T_{i}^{tol}(\tau)-T_{i}^{max}).

By subtracting H_{i}^{2}(\tau) from both sides, dividing by 2, and summing the resulting inequalities over i\in\mathcal{I}, we have

\displaystyle\frac{1}{2}\sum_{i\in\mathcal{I}}(H_{i}^{2}(\tau+1)-H_{i}^{2}(\tau)) (53)
\displaystyle\leq\frac{1}{2}\sum_{i\in\mathcal{I}}(T_{i}^{tol}(\tau)-T_{i}^{max})^{2}+\sum_{i\in\mathcal{I}}H_{i}(\tau)(T_{i}^{tol}(\tau)-T_{i}^{max}).

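The steps above rest on the elementary bound ([a]^{+})^{2}\leq a^{2} applied to the queue update. As a quick numerical sanity check (the sampled values below are arbitrary placeholders, not the paper's simulation settings), inequality (52) can be verified pointwise:

```python
import random

def drift_bound_holds(H, T_tol, T_max, eps=1e-9):
    """Check inequality (52): the squared queue update H_i^2(tau+1)
    never exceeds its quadratic upper bound."""
    lhs = max(H + T_tol - T_max, 0.0) ** 2               # H_i^2(tau+1)
    rhs = H**2 + (T_tol - T_max) ** 2 + 2 * H * (T_tol - T_max)
    return lhs <= rhs + eps

random.seed(0)
print(all(drift_bound_holds(random.uniform(0, 10),     # backlog H_i(tau)
                            random.uniform(0, 5),      # experienced delay T_i^tol(tau)
                            random.uniform(0, 5))      # requirement T_i^max
          for _ in range(1000)))  # True
```

The bound is tight whenever the projection [\cdot]^{+} is inactive, i.e., when H_{i}(\tau)+T_{i}^{tol}(\tau)-T_{i}^{max}\geq 0.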
Similarly, by squaring both sides of the energy deficit queue update in (23), we have

\displaystyle E^{2}(\tau+1)=\{[E(\tau)+E^{tol}(\tau)-E^{max}]^{+}\}^{2} (54)
\displaystyle\leq E^{2}(\tau)+(E^{tol}(\tau)-E^{max})^{2}+2E(\tau)(E^{tol}(\tau)-E^{max}).

By subtracting E^{2}(\tau) from both sides and dividing by 2, we have

\displaystyle\frac{1}{2}(E^{2}(\tau+1)-E^{2}(\tau)) (55)
\displaystyle\leq\frac{1}{2}(E^{tol}(\tau)-E^{max})^{2}+E(\tau)(E^{tol}(\tau)-E^{max}).

Combining (53) and (55), we have

\displaystyle L(\bm{\Theta}(\tau+1))-L(\bm{\Theta}(\tau)) (56)
\displaystyle=\frac{1}{2}\sum_{i\in\mathcal{I}}(H_{i}^{2}(\tau+1)-H_{i}^{2}(\tau))+\frac{1}{2}(E^{2}(\tau+1)-E^{2}(\tau))
\displaystyle\leq\frac{1}{2}\sum_{i\in\mathcal{I}}(T_{i}^{tol}(\max)-T_{i}^{max})^{2}+\frac{1}{2}(E^{tol}(\max)-E^{max})^{2}
\displaystyle+\sum_{i\in\mathcal{I}}H_{i}(\tau)(T_{i}^{tol}(\tau)-T_{i}^{max})+E(\tau)(E^{tol}(\tau)-E^{max}).

Lastly, by adding V\cdot\sum_{i\in\mathcal{I}}A_{i}(\tau) to both sides of (56) and taking the conditional expectation of both sides given \bm{\Theta}(\tau), inequality (27) can be derived.
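Because (53) and (55) hold for every sample path, the one-step drift bound can be checked by directly simulating the two virtual queues. The sketch below uses hypothetical queue and cost values (not the paper's settings) and evaluates the right-hand side at the sampled (T^{tol}, E^{tol}) rather than their maxima, which still upper-bounds the drift per sample:

```python
import random

def lyapunov(Hs, E):
    # L(Theta) = 1/2 * sum_i H_i^2 + 1/2 * E^2
    return 0.5 * sum(h * h for h in Hs) + 0.5 * E * E

def one_step_drift_holds(Hs, E, T_tol, T_max, E_tol, E_max, eps=1e-9):
    # queue updates (22)-(23): H <- [H + T_tol - T_max]^+, E <- [E + E_tol - E_max]^+
    Hs_next = [max(h + t - tm, 0.0) for h, t, tm in zip(Hs, T_tol, T_max)]
    E_next = max(E + E_tol - E_max, 0.0)
    drift = lyapunov(Hs_next, E_next) - lyapunov(Hs, E)
    bound = (0.5 * sum((t - tm) ** 2 for t, tm in zip(T_tol, T_max))
             + 0.5 * (E_tol - E_max) ** 2
             + sum(h * (t - tm) for h, t, tm in zip(Hs, T_tol, T_max))
             + E * (E_tol - E_max))
    return drift <= bound + eps

random.seed(1)
print(all(one_step_drift_holds(
              [random.uniform(0, 5) for _ in range(3)], random.uniform(0, 5),
              [random.uniform(0, 3) for _ in range(3)],
              [random.uniform(0, 3) for _ in range(3)],
              random.uniform(0, 3), random.uniform(0, 3))
          for _ in range(500)))  # True
```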

Appendix B

For subproblem \mathcal{P}_{5-1}, the Lagrangian function is given by

\displaystyle\mathcal{L}_{1}(\bm{y})=\sum_{i\in\mathcal{I}}H_{i}(\tau)z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau)) (57)
\displaystyle+E(\tau)\sum_{i\in\mathcal{I}}\sum_{m\in\mathcal{M}}z_{i}(\tau)(E_{i,m}^{ul}(\tau)+E_{i,m}^{ud}(\tau))-V\cdot\sum_{i\in\mathcal{I}}A_{i}(\tau).

By taking the first-order and second-order derivatives of (57), we have

\displaystyle\frac{\partial\mathcal{L}_{1}}{\partial y_{i}(\tau)}=\sum_{i\in\mathcal{I}}H_{i}(\tau)z_{i}(\tau)\sum_{m\in\mathcal{M}}(\frac{a_{i,m}(t)S_{i}(\tau)C_{m}}{f_{i}(\tau)F_{m}}+\frac{S_{i}(\tau)}{r_{i,m}(\tau)}) (58)
\displaystyle+E(\tau)\sum_{i\in\mathcal{I}}\sum_{m\in\mathcal{M}}(a_{i,m}(t)\rho_{m}(F_{m})^{2}S_{i}(\tau)C_{m}+\frac{S_{i}(\tau)p_{i}}{r_{i,m}(\tau)})-2V\sum_{i\in\mathcal{I}}\frac{z_{i}(\tau)S_{i}(\tau)}{D_{i}(t)+S_{i}(\tau)},
\displaystyle\frac{\partial^{2}\mathcal{L}_{1}}{\partial(y_{i}(\tau))^{2}}=2V\cdot\sum_{i\in\mathcal{I}}S_{i}(\tau)^{2}\geq 0. (59)

Since \frac{\partial^{2}\mathcal{L}_{1}}{\partial(y_{i}(\tau))^{2}}\geq 0, we can conclude that problem \mathcal{P}_{5-1} is convex.
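The nonnegative second derivative in (59) stems from the quadratic accuracy term 1-(1-\frac{x_{i}(t)D_{i}(t)+y_{i}(\tau)S_{i}(\tau)}{D_{i}(t)+S_{i}(\tau)})^{2}, which enters \mathcal{L}_{1} with a minus sign. A midpoint-convexity check over y_{i}(\tau)\in(0,1] illustrates this (the values of x_i, D_i, S_i below are placeholders, not taken from the paper):

```python
def neg_accuracy(y, x=0.3, D=4.0, S=2.0):
    """-A_i(y) under the quadratic accuracy model; only the y-dependent part matters."""
    q = (x * D + y * S) / (D + S)      # fraction of fresh data used for the VT
    return -(1.0 - (1.0 - q) ** 2)

# midpoint-convexity check on a grid of y in (0, 1]
ys = [k / 100 for k in range(1, 101)]
convex = all(neg_accuracy((a + b) / 2) <= (neg_accuracy(a) + neg_accuracy(b)) / 2 + 1e-12
             for a in ys for b in ys)
print(convex)  # True
```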

For subproblem \mathcal{P}_{5-2}, we denote \bm{\gamma}=\{\gamma_{i}(\tau),\forall i\in\mathcal{I},\forall\tau\in\mathcal{T}_{t}\} as the corresponding Lagrangian multipliers; the Lagrangian function can then be expressed as

\displaystyle\mathcal{L}_{2}(\bm{b},\bm{\gamma})=\sum_{i\in\mathcal{I}}H_{i}(\tau)z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ofld}(\tau)) (60)
\displaystyle+E(\tau)\sum_{i\in\mathcal{I}}\sum_{m\in\mathcal{M}}(E_{i,m}^{ul}(\tau)+E_{i,m}^{ofld}(\tau))+\sum_{i\in\mathcal{I}}\gamma_{i}(\tau)(a_{i,m}(t)b_{i}(\tau)-1).

Since 1/r_{i,m}(\tau) in (60) is monotonically decreasing and convex in b_{i}(\tau)\in(0,1], problem \mathcal{P}_{5-2} is convex.
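The convexity argument for \mathcal{P}_{5-2} reduces to the fact that the transmission delay scales as 1/b_{i}(\tau) when the achievable rate grows linearly with the allocated bandwidth fraction. A small numerical illustration (the rate constant r_unit is a placeholder, not a value from the paper):

```python
def t_ul(b, S=1.0, r_unit=2.0):
    """Uplink delay S / (b * r_unit): rate scales linearly with bandwidth fraction b."""
    return S / (b * r_unit)

# midpoint-convexity check on a grid of b in (0, 1]
bs = [k / 50 for k in range(1, 51)]
convex = all(t_ul((a + c) / 2) <= (t_ul(a) + t_ul(c)) / 2 + 1e-12
             for a in bs for c in bs)
print(convex)  # True
```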

For subproblem \mathcal{P}_{5-3}, we denote \bm{\eta}=\{\eta_{i}(\tau),\forall i\in\mathcal{I},\forall\tau\in\mathcal{T}_{t}\} as the corresponding Lagrangian multipliers; the Lagrangian function is then

\displaystyle\mathcal{L}_{3}(\bm{f},\bm{\eta})=\sum_{i\in\mathcal{I}}H_{i}(\tau)(\sum_{m\in\mathcal{M}}T_{i,m}^{pl}(t)/K+z_{i}(\tau)\sum_{m\in\mathcal{M}}T_{i,m}^{ud}(\tau)+T_{i}^{exec}(\tau)) (61)
\displaystyle+E(\tau)\sum_{i\in\mathcal{I}}(\sum_{m\in\mathcal{M}}E_{i,m}^{pl}(t)/K+\sum_{m\in\mathcal{M}}z_{i}(\tau)E_{i,m}^{ud}(\tau)+E_{i}^{exec}(\tau))+\sum_{i\in\mathcal{I}}\eta_{i}(\tau)(a_{i,m}(t)f_{i}(\tau)-1).

Taking the first-order and second-order derivatives yields

\displaystyle\frac{\partial\mathcal{L}_{3}}{\partial f_{i}(\tau)}=-\sum_{i\in\mathcal{I}}H_{i}(\tau)(\sum_{m\in\mathcal{M}}\frac{a_{i,m}(t)x_{i}(t)D_{i}(t)C_{m}}{(f_{i}(tK))^{2}F_{m}K} (62)
\displaystyle+z_{i}(\tau)\sum_{m\in\mathcal{M}}\frac{a_{i,m}(t)y_{i}(\tau)S_{i}(\tau)C_{m}}{(f_{i}(\tau))^{2}F_{m}}+\sum_{m\in\mathcal{M}}a_{i,m}(t)z_{i}(\tau)\frac{\lambda_{i}(\tau)C_{m}}{(f_{i}(\tau))^{2}F_{m}})+\sum_{i\in\mathcal{I}}\eta_{i}(\tau)a_{i,m}(t),
\displaystyle\frac{\partial^{2}\mathcal{L}_{3}}{\partial(f_{i}(\tau))^{2}}=2\sum_{i\in\mathcal{I}}H_{i}(\tau)\{\sum_{m\in\mathcal{M}}\frac{a_{i,m}(t)x_{i}(t)D_{i}(t)C_{m}}{(f_{i}(tK))^{3}F_{m}K} (63)
\displaystyle+z_{i}(\tau)\sum_{m\in\mathcal{M}}\frac{a_{i,m}(t)y_{i}(\tau)S_{i}(\tau)C_{m}}{(f_{i}(\tau))^{3}F_{m}}+\sum_{m\in\mathcal{M}}a_{i,m}(t)z_{i}(\tau)\frac{\lambda_{i}(\tau)C_{m}}{(f_{i}(\tau))^{3}F_{m}}\}.

Obviously, when f_{i}(\tau)\in(0,1], \frac{\partial^{2}\mathcal{L}_{3}}{\partial(f_{i}(\tau))^{2}}\geq 0, and thus problem \mathcal{P}_{5-3} is convex.
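Each term of \mathcal{L}_{3} depends on f_{i}(\tau) through a factor of the form C_{m}/(f_{i}(\tau)F_{m}), whose second derivative 2C_{m}/((f_{i}(\tau))^{3}F_{m}) is exactly what appears in (63). A finite-difference check (with a placeholder constant C_over_F standing in for C_m/F_m) confirms the formula:

```python
def term(f, C_over_F=1.0):
    """Representative delay term C_m / (f * F_m) appearing in L_3."""
    return C_over_F / f

def num_second_deriv(g, f, h=1e-4):
    """Central finite-difference approximation of g''(f)."""
    return (g(f + h) - 2 * g(f) + g(f - h)) / (h * h)

f = 0.5
analytic = 2.0 / f**3            # second derivative of 1/f, matching (63)
numeric = num_second_deriv(term, f)
print(abs(numeric - analytic) < 1e-3)  # True
```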

For subproblem \mathcal{P}_{5-4}, the Lagrangian function is given by

\displaystyle\mathcal{L}_{4}(\bm{z})=\sum_{i\in\mathcal{I}}H_{i}(\tau)[z_{i}(\tau)\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau)+T_{i,m}^{ofld}(\tau))+T_{i}^{exec}(\tau)] (64)
\displaystyle+E(\tau)\sum_{i\in\mathcal{I}}[\sum_{m\in\mathcal{M}}z_{i}(\tau)(E_{i,m}^{ul}(\tau)+E_{i,m}^{ud}(\tau)+E_{i,m}^{ofld}(\tau))+E_{i}^{exec}(\tau)]-V\sum_{i\in\mathcal{I}}A_{i}(\tau).

By taking the first-order and second-order derivatives of (64), we have

\displaystyle\frac{\partial\mathcal{L}_{4}}{\partial z_{i}(\tau)}=\sum_{i\in\mathcal{I}}H_{i}(\tau)[\sum_{m\in\mathcal{M}}(T_{i,m}^{ul}(\tau)+T_{i,m}^{ud}(\tau)+T_{i,m}^{ofld}(\tau))+T_{i}^{exec}(\tau)] (65)
\displaystyle+E(\tau)\sum_{i\in\mathcal{I}}[\sum_{m\in\mathcal{M}}(E_{i,m}^{ul}(\tau)+E_{i,m}^{ud}(\tau)+E_{i,m}^{ofld}(\tau))+E_{i}^{exec}(\tau)]
\displaystyle-V\sum_{i\in\mathcal{I}}[1-(1-\frac{x_{i}(t)D_{i}(t)+y_{i}(\tau)S_{i}(\tau)}{D_{i}(t)+S_{i}(\tau)})^{2}-g_{i}^{local}],
\displaystyle\frac{\partial^{2}\mathcal{L}_{4}}{\partial(z_{i}(\tau))^{2}}=0. (66)

Since \frac{\partial^{2}\mathcal{L}_{4}}{\partial(z_{i}(\tau))^{2}}=0, \mathcal{L}_{4} is linear, and hence convex, in z_{i}(\tau), so problem \mathcal{P}_{5-4} is convex.
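Because \mathcal{L}_{4} is linear in each z_{i}(\tau), its minimum over the relaxed interval z_{i}(\tau)\in[0,1] is attained at an endpoint, so a relaxed solution could be mapped back to a binary offloading decision by inspecting the sign of the coefficient \frac{\partial\mathcal{L}_{4}}{\partial z_{i}(\tau)}. A minimal sketch of this endpoint rule (the coefficients below are hypothetical, and this sign test is an illustration rather than the paper's rounding procedure):

```python
def best_z(coeff):
    """Minimize coeff * z over the relaxed interval z in [0, 1].

    A linear objective attains its minimum at an endpoint, so the
    binary decision follows from a sign test on the coefficient.
    """
    return 0.0 if coeff >= 0 else 1.0

# hypothetical per-PT coefficients dL4/dz_i(tau)
coeffs = [2.5, -1.3, 0.0, -0.7]
print([best_z(c) for c in coeffs])  # [0.0, 1.0, 0.0, 1.0]
```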