
Wireless Human-Machine Collaboration in Industry 5.0

Gaoyang Pang, Wanchun Liu, Dusit Niyato, Daniel E. Quevedo, Branka Vucetic, Yonghui Li

The work of W. Liu was supported by the Australian Research Council’s Discovery Early Career Researcher Award (DECRA) Project DE230100016. (Corresponding author: W. Liu.)
G. Pang, W. Liu, D. Quevedo, B. Vucetic, and Y. Li are with the School of Electrical and Computer Engineering, The University of Sydney, Sydney, NSW 2006, Australia (e-mail: {gaoyang.pang, wanchun.liu, daniel.quevedo, branka.vucetic, yonghui.li}@sydney.edu.au).
D. Niyato is with the College of Computing and Data Science, Nanyang Technological University, Singapore 639798 (e-mail: [email protected]).

Abstract

Wireless Human-Machine Collaboration (WHMC) represents a critical advancement for Industry 5.0, enabling seamless interaction between humans and machines across geographically distributed systems. As the WHMC systems become increasingly important for achieving complex collaborative control tasks, ensuring their stability is essential for practical deployment and long-term operation. Stability analysis certifies how the closed-loop system will behave under model randomness, which is essential for systems operating with wireless communications. However, the fundamental stability analysis of the WHMC systems remains an unexplored challenge due to the intricate interplay between the stochastic nature of wireless communications, dynamic human operations, and the inherent complexities of control system dynamics. This paper establishes a fundamental WHMC model incorporating dual wireless loops for machine and human control. Our framework accounts for practical factors such as short-packet transmissions, fading channels, and advanced HARQ schemes. We model human control lag as a Markov process, which is crucial for capturing the stochastic nature of human interactions. Building on this model, we propose a stochastic cycle-cost-based approach to derive a stability condition for the WHMC system, expressed in terms of wireless channel statistics, human dynamics, and control parameters. Our findings are validated through extensive numerical simulations and a proof-of-concept experiment, where we developed and tested a novel wireless collaborative cart-pole control system. The results confirm the effectiveness of our approach and provide a robust framework for future research on WHMC systems in more complex environments.

Index Terms:
Wireless control, Industry 5.0, Human-machine collaboration, Stability analysis.

I Introduction

The Fourth Industrial Revolution, known as Industry 4.0, envisions significantly increased automation and mechanization in manufacturing, driven by rapidly advancing cyber-physical systems (CPS) with minimal human intervention on the factory floor [1]. However, many dynamically changing and unforeseen control tasks in manufacturing, such as reconfiguring the production line, are challenging for autonomous machines to handle alone [2]. Therefore, humans are reintroduced to the manufacturing process to collaborate with machines in the fifth industrial revolution, Industry 5.0 [3]. In the Industry 5.0 era, human-machine collaboration (HMC) emerges as a key enabling technology to boost productivity, efficiency, and sustainability by combining human’s creativity, cognitive ability, and dexterity with machine’s strength, precision, and speed [3]. Future wireless communications, e.g., 6G, will be essential to provide high-performance connectivity for humans, machines (including robots), autonomous controllers, and ubiquitous sensors, enabling the flexible, scalable, and low-cost deployment of geographically distributed HMC systems [4]. Integrating wireless capabilities within an HMC system will unlock the full potential of human-machine collaboration in Industry 5.0, offering unprecedented flexibility and scalability. This wireless HMC (WHMC) framework will serve as the backbone for seamlessly connecting humans, machines, and sensors across geographically distributed environments, enabling real-time collaboration and decision-making.

The main application of WHMC is in collaborative control, where humans and autonomous controllers work together to achieve shared objectives [5]. WHMC systems enable seamless coordination between human operators and machines, enhancing the efficiency of control tasks. Existing research on WHMC has focused on applied areas such as teleoperation [6], driver assistance systems [7], and human-machine interaction [8], including scenarios where robots anticipate human intentions and assist in tasks like tool-passing during assembly [5]. While these efforts have led to successful implementations in specific domains, they often lack the foundational modeling and theoretical analysis needed for broader application [9, 10].

In a WHMC system, stability is a fundamental property that determines whether the controlled states will converge to a steady state and remain bounded under given collaborations. Stability analysis is essential for certifying that the closed-loop system will perform safely and effectively, even in the face of human- and network-induced challenges such as random delays and packet loss [11]. However, the fundamental theories and analytical tools for designing a WHMC system with guaranteed stability are scarce, as this research area is relatively new. Analyzing the stability of a WHMC system presents unique challenges, as it is determined by three tightly coupled domains: wireless communication, human behavior, and control dynamics. Whilst the dynamical properties of the individual components are well understood, the stability condition of WHMC systems has yet to be thoroughly investigated.

I-A Related Work

Establishing fundamental theories is important for guiding the systematic design of a desired WHMC system. Researchers have extensively explored theoretical aspects, such as human control modeling, human characteristics modeling, system stability analysis, and wireless networked control.

I-A1 Human control modeling

The primary goal of human control modeling is to mathematically represent how humans perform tasks, enabling machines to understand and adapt to human control policies. This modeling is essential for the design of high-performance machine control systems that can effectively collaborate with human operators. Researchers have proposed various methods to model human control policies. For example, the human operator is often modeled as a classical machine controller, such as a linear feedback controller [12, 13], a proportional-integral-feedback controller [14], or an impedance controller [15]. The human control behavior can also be modeled using the crossover-reference model with time-invariant dynamics [16], where the human operator is characterized by an open-loop transfer function. In addition to the above deterministic models, researchers have also proposed several probabilistic models, such as hidden Markov models (HMMs) [17], partially observable Markov decision processes (POMDPs) [18], and Markov decision processes (MDPs) [19]. Despite significant progress in human control modeling, accurately formulating human control policies mathematically remains a long-standing unsolved challenge.

I-A2 Human characteristics modeling

Human characteristics modeling aims to represent the stochastic human traits that implicitly influence the delivery and accuracy of control commands generated by the human decision-making process. These time-varying characteristics include operator workload, proficiency, fatigue, and control lag. For example, the operator’s workload can be modeled as a uniform distribution over binary state sets of high and low workload [19]. The operator’s fatigue can be modeled as a binary state set (awake or sleepy) with a certain distribution [18]. However, the human characteristics in these works are modeled as independent and identically distributed random states, whereas human characteristics are commonly time-correlated. In order to capture the time-correlated feature, many works model human characteristics as a Markov process [20] and adopt HMMs to infer human characteristics based on the recorded temporal data [17, 21]. Although using the Markov process to model time-varying human characteristics has garnered significant attention, its application to modeling human control lag has been less considered.¹ The impact of such stochastic human control lag on system performance remains underexplored.

¹ Modeling the human characteristics impacting the accuracy of human control commands relies on the precise formulation of human control policies, which is a long-standing unsolved challenge and beyond the scope of our current work [22]. For a specific collaboration task, the control policy of a human operator commonly remains unchanged in the short term. Thus, we focus on the human control lag, which influences the delivery of human control commands and impacts the collaborative control performance.

I-A3 System stability analysis

Stability analysis is crucial for designing a WHMC system that operates efficiently and safely. Effective stability analysis requires tractable modeling of the WHMC system. However, most works focus on the fundamental stability analysis of simplified WHMC systems [12, 14, 15, 23, 16]. This simplification allows them to apply classical analytical frameworks and obtain optimal control with a stability guarantee in specific applications, such as irrigation canals [14], robotic exoskeletons [15], collaborative driving [23], and collaborative piloting [16]. As a result, the methodology of most existing works on stability analysis is tied to specific control applications, which limits the generality of their analytical frameworks. In addition, these systems do not include wireless communication links. Tractable mathematical modeling of advanced WHMC systems that integrate wireless communication links, as a basis for establishing the stability condition, remains an unsolved problem.

I-A4 Wireless networked control

Wireless networked control involves integrating autonomous control systems with wireless communication networks. It primarily focuses on establishing systematic theories related to the stability and optimization of state estimation and automatic control over wireless networks [24, 25, 26]. Existing research has largely concentrated on developing optimal control algorithms that address the challenges posed by imperfect wireless communication channels, such as errors and delays [27, 28]. Some studies investigate the impact of communication protocols and parameters on the stability of automatic control systems [29, 30]. WHMC extends wireless networked control by incorporating human intelligence into the control loop, enhancing system adaptability and performance. While traditional wireless networked control focuses on how communication systems affect control stability, it does not account for the complexities introduced by human operators. Consequently, existing methods in wireless networked control are insufficient for WHMC systems, which require new approaches to address the challenges posed by integrating human factors into the control process.

I-B Motivation

A WHMC system is significantly more complex than a conventional control system. This complexity arises from the integration of wireless human control loops, the need for collaborative control, and the challenge of addressing time-varying and unforeseen tasks. In a WHMC system, the wireless communication links, the human operator, and the automatic machine controller collectively work to achieve dynamic control objectives under stringent stability constraints. This creates a novel networked topology with tightly coupled wireless human and machine control loops. The system’s stability and performance are critically influenced by three factors: wireless communication errors and delays, the stochastic nature of human behavior, and the dynamics of the physical system under control. We refer to these three factors as the “three-level dynamics”. Addressing these factors presents a unique challenge in modeling and stability analysis for WHMC systems. To date, the impact of these dynamics on WHMC system stability has not been investigated.

Fundamental modeling and analysis of a WHMC system, which features a substantially different control model, requires addressing the following fundamental questions:

  1. How can we achieve tractable mathematical modeling of a WHMC system that effectively captures the three-level dynamics?

  2. How can we establish an analytical framework for stability analysis when an accurate mathematical model of the human control policy is unavailable?

  3. What are the primary conditions within the three-level dynamics that enable the stable operation of a WHMC system?

I-C Contributions

In this work, we address the fundamental questions outlined above, and our novel contributions are summarized below.

  1. Novel tractable modeling of the WHMC system. For the first time, we propose a WHMC model that consists of dual wireless loops, i.e., the machine control loop and the human control loop. In particular, we take into account practical wireless communication factors such as short-packet communications, fading channel models, and advanced hybrid automatic repeat request (HARQ) schemes for wireless sensors-controller-actuator transmission (referred to as the machine control loop) and sensors-human-actuator transmission (referred to as the human control loop). Unlike most existing HMC studies, which typically overlook the temporal variability and stochastic nature of human interactions, we model the dynamics of human control lag as a Markov process.

  2. Stability analysis of the WHMC system. Leveraging the proposed system model, we introduce a novel cycle-cost-based approach to derive, for the first time, a sufficient condition for the stochastic stability of the WHMC system. This stability condition is expressed in terms of wireless channel statistics, human state dynamics, and control system parameters. We thoroughly investigate the structural properties and special cases of the derived stability condition, providing comprehensive analysis and numerical illustrations.

  3. Proof-of-concept experiment for the proposed WHMC system. To demonstrate the advantages of WHMC and validate the developed fundamental theories and analytical tools, a proof-of-concept experiment is conducted. Specifically, we develop and evaluate a wireless collaborative cart-pole control system in terms of control performance and system stability. The experiment confirms the practicality of our approach and validates the theoretical framework established in contributions 1) and 2).

Outline. The proposed model of the WHMC system is described in Section II. The stability analysis is presented in Section III. A proof-of-concept experiment is demonstrated in Section IV, followed by conclusions in Section V.

Notations. Matrices and vectors are denoted by capital and lowercase upright bold letters, e.g., $\mathbf{A}$ and $\mathbf{a}$, respectively. $|\mathbf{v}|$ is the Euclidean norm of vector $\mathbf{v}$. $\mathbb{E}\left[\cdot\right]$ is the expectation operator. $\left[\mathbf{A}\right]_{i,j}$ denotes the element at the $i$-th row and $j$-th column of a matrix $\mathbf{A}$. $(\cdot)^{\top}$ is the vector or matrix transpose operator. $\mathbb{R}$ and $\mathbb{N}$ denote the sets of real numbers and positive integers, respectively. $\mathbb{N}_{0}$ denotes the non-negative integers.

II WHMC System

II-A Control System Dynamics

Figure 1: Illustration of the WHMC system, consisting of two types of control loops, i.e., the machine control loop and the human control loop.

We consider a WHMC system consisting of a dynamic plant, two actuators, an autonomous controller (i.e., a machine), and a human operator, as shown in Fig. 1. The sensors attached to the plant send state measurements to the remote controller and the human operator. These two agents then send their individual control signals to the corresponding actuators in order to complete a collaborative control task of the plant. All information for sensing and control is exchanged via four wireless links: sensor-human (SH) uplink, sensor-controller (SC) uplink, human-actuator (HA) downlink, and controller-actuator (CA) downlink. Such a system model has two types of control loops, i.e., the machine control loop and the human control loop. It is abstracted from the existing visions of HMC systems, e.g., homecare robotic systems for Healthcare 4.0 [31], factory edge robotic systems for Industry 5.0 [4], collaborative surgery in healthcare [32], collaborative piloting in aviation [16], and collaborative driving in a vehicle [23]. These systems require a human operator to control an actuator as well as collaborate with other machine-controlled actuators.

Having two loops in parallel allows one to clearly distinguish between human and machine contributions and enables individual analysis of each loop’s dynamics and their interactions. Our model can also adjust the degree of influence each loop has, allowing for a spectrum of control schemes, such as human-in-the-loop, supervisory, and shared control.²

² For example, if the time period of a human control loop is far longer than that of a machine control loop, our model becomes supervisory control, where the machine is predominantly responsive. If the time period of a machine control loop is longer than that of a human control loop, it can be seen as human-in-the-loop control, where the human operator is predominantly responsive. If the time period of a machine control loop is close to that of a human control loop, our model encompasses shared control, where both the human operator and the machine contribute significantly.

The plant dynamics are modeled as a nonlinear discrete-time, time-invariant system

$$\mathbf{x}(t+1)=f(\mathbf{x}(t),\mathbf{u}_{H}(t),\mathbf{u}_{M}(t),\mathbf{w}(t)), \tag{1}$$

where $t$ is the time index given the sampling period $T_{s}$; $\mathbf{x}(t)\in\mathbb{R}^{l_{s}}$ is the plant state vector at time $t$; $\mathbf{u}_{H}(t)\in\mathbb{R}^{l_{h}}$ and $\mathbf{u}_{M}(t)\in\mathbb{R}^{l_{m}}$ are the human control input and machine control input, respectively; and $\mathbf{w}(t)\in\mathbb{R}^{l_{w}}$ is the plant disturbance. The control algorithms for generating control inputs will be presented later in this section.

II-B Wireless Control Loops

The temporal operation of the two control loops is shown in Fig. 2. We assume block Rayleigh fading channels, where the channel characteristics remain constant during each time slot but change independently from one time slot to the next.

Figure 2: Temporal operation of the two control loops.

II-B1 Machine control loops

Each machine control loop takes a single time step, i.e., the period of a machine control loop is $T_{s}$, and consists of a pair of SC uplink and CA downlink transmissions. If the SC packet is not detected successfully, there is no CA transmission scheduled as the controller has no instantaneous plant state information. We consider short-packet transmissions for low-latency communications [33]. The computation time for generating a control signal is usually much shorter than the transmission delay and thus is omitted [34, 35]. A machine control loop is closed only when both the SC and CA transmissions within it are successful.

II-B2 Human control loops

Each human control loop period is delineated by an HA downlink transmission, as illustrated in Fig. 2. A human control loop contains multiple SH uplink transmissions, a human control procedure, and an HA downlink transmission. A downlink transmission attempt marks the end of one human control loop period and the beginning of the next. Each period starts from a new packet transmission from the sensors, which contains the current plant state measurement. If the transmission fails, then a retransmission takes place using a HARQ protocol. In instances where a given maximum number of retransmissions $N$ has been reached, a new transmission is triggered.³ The human operator generates a control command after receiving a successful packet, and then sends the command to the actuator. There is no retransmission for the HA and CA downlinks, since retransmissions lead to unpredictable delays, making the generated time-sensitive control command useless. Let $t^{\prime}$ denote the human control loop index. Then, the transmission delays for the SH and HA transmissions are $\tau_{SH}(t^{\prime})$ and $\tau_{HA}=1$, respectively, and the lag of human control is $\tau_{H}(t^{\prime})\in\mathcal{S}\triangleq\{1,2,\dots,\tau_{\max}\}$. In particular, $\{\tau_{H}(t^{\prime})\}$ is modeled as a finite-state Markov chain with a transition probability matrix $\mathbf{M}$, where $p_{i,j}\triangleq[\mathbf{M}]_{i,j}$. A shorter lag of human control leads to better control performance. The stationary distribution of $\tau_{H}(t^{\prime})$ is given as

$$v_{k}\triangleq\mathbb{P}\left[\tau_{H}(t^{\prime})=k\right],\quad 1\leq k\leq\tau_{\max}. \tag{2}$$

³ Unlike machine control loops, the lag of human control, which captures the delay in human decision-making, is significantly longer than the transmission delay [36]. This results in frequent machine control actions and infrequent human interventions. In contexts where the lag of human control is substantial, the transmission delay becomes relatively insignificant. Consequently, retransmissions are used to improve transmission reliability, as a longer transmission delay caused by retransmissions does not notably affect the overall human control process.

We assume each transmission in a human control loop takes one time step because human-type communication generally requires a larger packet length than machine-type communication [37]. Considering the random period of each human control loop, we define $\kappa(t^{\prime})$ as the starting time slot of the $t^{\prime}$-th human control loop. The human control loop is closed once the HA transmission is successful.
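To make the stationary distribution in (2) concrete, the following minimal sketch computes it by power iteration for a hypothetical three-state lag chain. The transition matrix `M_example` and the helper name `stationary_distribution` are illustrative assumptions, not values or code from this work, and the iteration presumes an irreducible, aperiodic chain.

```python
def stationary_distribution(M, iters=10_000, tol=1e-12):
    """Power iteration for v satisfying v = v M, with sum(v) = 1.

    M is a row-stochastic matrix given as a list of lists, where M[i][j]
    is the probability of moving from lag state i+1 to lag state j+1.
    Convergence is assumed (irreducible, aperiodic chain).
    """
    n = len(M)
    v = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        nxt = [sum(v[i] * M[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, v)) < tol:
            return nxt
        v = nxt
    return v


# Hypothetical 3-state chain (tau_max = 3); entries are illustrative only.
M_example = [
    [0.7, 0.2, 0.1],
    [0.3, 0.5, 0.2],
    [0.1, 0.3, 0.6],
]
v_example = stationary_distribution(M_example)
```

Here `v_example[k-1]` plays the role of $v_{k}$ in (2) for this assumed chain.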

II-C Control Algorithms

Due to packet detection errors, the sensor’s packet for the remote controller may not be received by the remote controller, and the machine control input may not be received by the actuator at every time step. Let the binary variables $\zeta_{SC}(t),\zeta_{CA}(t),\zeta_{SH}(t^{\prime}),\zeta_{HA}(t^{\prime})\in\{1,0\}$ denote the transmission success and failure of the corresponding channel, respectively. The machine control input at $t$ is given as

$$\mathbf{u}_{M}(t)=\begin{cases}f_{M}\left(\mathbf{x}(t)\right),&\text{if }\zeta_{CA}(t)\zeta_{SC}(t)=1,\\ \mathbf{0},&\text{otherwise,}\end{cases} \tag{3}$$

where $f_{M}(\cdot)$ is the machine control policy. Hence, only a pair of successful uplink and downlink transmissions can generate an effective control input, closing the machine control loop.

From the definition of human control loops, a human control input can only be available at the beginning of each control loop. Considering the random delay of SH transmissions and human decision-making, the human control input at $t$ is

$$\mathbf{u}_{H}(t)=\begin{cases}f_{H}\left(\mathbf{x}(t-\tau_{SHA}(t^{\prime}))\right),&\text{for }t=\kappa(t^{\prime})\text{ and }\zeta_{HA}(t^{\prime})=1,\\ \mathbf{0},&\text{otherwise,}\end{cases} \tag{4}$$

where $f_{H}(\cdot)$ is the human control policy and

$$\tau_{SHA}(t^{\prime})\triangleq\underbrace{\tau_{SH}(t^{\prime})}_{\text{SH Tx. delay}}+\underbrace{\tau_{H}(t^{\prime})}_{\text{Human control lag}}+\underbrace{\tau_{HA}}_{\text{HA Rx. delay}}. \tag{5}$$

As an accurate model of the human control policy is unavailable, we propose an analytical framework for stability analysis without using specific control policies of human and machine, but using their control significance in the next section.
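The switching logic in (3)–(5) can be sketched as follows for a scalar plant state. This is only an illustration under stated assumptions: the linear policies `f_m` and `f_h`, the state history, and all function names are hypothetical stand-ins, since the analysis itself does not rely on specific control policies.

```python
def machine_input(x, zeta_sc, zeta_ca, f_m):
    """Eq. (3): the machine input is f_m(x) only if both SC and CA succeed."""
    return f_m(x) if zeta_sc * zeta_ca == 1 else 0.0


def total_human_delay(tau_sh, tau_h, tau_ha=1):
    """Eq. (5): SH transmission delay + human control lag + HA delay."""
    return tau_sh + tau_h + tau_ha


def human_input(history, t, loop_start, zeta_ha, tau_sha, f_h):
    """Eq. (4): a delayed-state human input, available only at a loop start.

    history[s] holds the scalar state x(s); loop_start plays the role of
    kappa(t') for the current human control loop.
    """
    if t == loop_start and zeta_ha == 1:
        return f_h(history[t - tau_sha])
    return 0.0


# Hypothetical policies and state trajectory, for illustration only.
f_m = lambda x: -0.5 * x
f_h = lambda x: -0.2 * x
history = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5]

tau_sha = total_human_delay(tau_sh=2, tau_h=1)  # 2 + 1 + 1
u_m = machine_input(history[5], zeta_sc=1, zeta_ca=1, f_m=f_m)
u_h = human_input(history, t=5, loop_start=5, zeta_ha=1,
                  tau_sha=tau_sha, f_h=f_h)
```

In this example the human input at $t=5$ is computed from the state observed $\tau_{SHA}=4$ steps earlier, whereas the machine input uses the current state.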

III Stability Analysis

From (3)–(5), we see that the two control loops can be either open or closed due to the packet loss and delays, which may cause instability of the WHMC system. In this section, we derive the stability condition of the proposed WHMC system by taking into account the randomness in wireless communications and human decision-making. Since only closed control loops generate effective control inputs that regulate plant state and affect stability, we analyze the statistics of the stochastic closed (and open) control loop first.

III-A Stochastic Control Loop Analysis

III-A1 Open loop probabilities of human and machine control

Let $\gamma_{HA}(t)$, $\gamma_{SH}(t)$, $\gamma_{CA}(t)$, and $\gamma_{SC}(t)$ denote the signal-to-noise ratio (SNR) of received packets in the HA, SH, CA, and SC channels, respectively. Given the packet length $l_{p}$ (i.e., the number of symbols per packet), the number of data bits $b$ in the packet, and the SNR $\gamma$ of the packet, the decoding error probability of a packet is approximated as [38]

$$\varepsilon\left(\gamma\right)\approx\mathcal{Q}\left(\frac{\mathcal{C}\left(\gamma\right)-\frac{b}{l_{p}}}{\sqrt{\frac{\mathcal{V}\left(\gamma\right)}{l_{p}}}}\right), \tag{6}$$

where $\mathcal{C}(\gamma)=\log_{2}(1+\gamma)$ and $\mathcal{V}(\gamma)=\left(1-(1+\gamma)^{-2}\right)(\log_{2}e)^{2}$ are the Shannon capacity and the channel dispersion, respectively, and $\mathcal{Q}(x)=\frac{1}{\sqrt{2\pi}}\int_{x}^{\infty}e^{-\frac{t^{2}}{2}}\,\mathrm{d}t$ is the Gaussian Q-function.
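As a quick numerical companion to (6), the sketch below evaluates the short-packet error approximation with the Python standard library, using the identity $\mathcal{Q}(x)=\tfrac{1}{2}\operatorname{erfc}(x/\sqrt{2})$. The function names and the example packet parameters are assumptions made for illustration.

```python
import math


def q_function(x):
    """Gaussian Q-function, Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def decoding_error_prob(snr, n_bits, n_symbols):
    """Approximate decoding error probability of Eq. (6).

    snr       : linear-scale SNR (gamma)
    n_bits    : number of data bits per packet (b)
    n_symbols : packet length in symbols (l_p)
    """
    if snr <= 0.0:
        return 1.0  # no usable signal: treat decoding as failed
    capacity = math.log2(1.0 + snr)
    dispersion = (1.0 - (1.0 + snr) ** -2) * math.log2(math.e) ** 2
    rate = n_bits / n_symbols
    return q_function((capacity - rate) / math.sqrt(dispersion / n_symbols))


# Illustrative numbers: 100 bits in 100 symbols (rate of 1 bit per symbol).
eps_at_capacity = decoding_error_prob(1.0, 100, 100)  # C(1) equals the rate
eps_high_snr = decoding_error_prob(10.0, 100, 100)
```

When the coding rate equals the capacity, the argument of the Q-function is zero, so the approximation gives an error probability of one half; raising the SNR drives it down sharply.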

The probability of the machine control operating in an open loop at time $t$, i.e., the probability that at least one of the SC and CA transmissions fails, can be obtained as

$$\begin{aligned}p_{M}(t)&=\mathbb{P}\left[\zeta_{SC}(t)\zeta_{CA}(t)=0\right]\\&=1-\left(1-\varepsilon\left(\gamma_{SC}(t)\right)\right)\left(1-\varepsilon\left(\gamma_{CA}(t)\right)\right).\end{aligned} \tag{7}$$

The expectation of (7) with respect to $\gamma_{SC}(t)$ and $\gamma_{CA}(t)$ is denoted as $\bar{p}_{M}$, and can be obtained by

$$\bar{p}_{M}\triangleq\mathbb{E}\left[p_{M}(t)\right]=1-\left(1-\mathbb{E}\left[\varepsilon\left(\gamma_{SC}(t)\right)\right]\right)\left(1-\mathbb{E}\left[\varepsilon\left(\gamma_{CA}(t)\right)\right]\right). \tag{8}$$
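A minimal sketch of how (8) might be evaluated numerically under the block Rayleigh fading assumption, where each SNR is exponentially distributed and the SC and CA channels are independent. The mean SNRs, packet parameters, quadrature rule, and function names are illustrative assumptions rather than the settings used in this work.

```python
import math


def decoding_error_prob(snr, n_bits, n_symbols):
    """Short-packet error approximation of Eq. (6)."""
    if snr <= 0.0:
        return 1.0
    capacity = math.log2(1.0 + snr)
    dispersion = (1.0 - (1.0 + snr) ** -2) * math.log2(math.e) ** 2
    arg = (capacity - n_bits / n_symbols) / math.sqrt(dispersion / n_symbols)
    return 0.5 * math.erfc(arg / math.sqrt(2.0))


def mean_error_rayleigh(mean_snr, n_bits, n_symbols, steps=5000):
    """E[eps(gamma)] for gamma ~ Exp(mean_snr), via a midpoint rule.

    The integral is taken over the quantile u in (0, 1), with
    gamma = -mean_snr * ln(1 - u) (inverse CDF of the exponential).
    """
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) / steps
        gamma = -mean_snr * math.log(1.0 - u)
        total += decoding_error_prob(gamma, n_bits, n_symbols)
    return total / steps


def mean_open_loop_prob_machine(mean_snr_sc, mean_snr_ca, n_bits, n_symbols):
    """Eq. (8): average probability that the machine control loop is open."""
    e_sc = mean_error_rayleigh(mean_snr_sc, n_bits, n_symbols)
    e_ca = mean_error_rayleigh(mean_snr_ca, n_bits, n_symbols)
    return 1.0 - (1.0 - e_sc) * (1.0 - e_ca)


# Illustrative mean SNRs of 10 dB and 20 dB (linear values 10 and 100).
p_bar_low = mean_open_loop_prob_machine(10.0, 10.0, 100, 100)
p_bar_high = mean_open_loop_prob_machine(100.0, 100.0, 100, 100)
```

The same averaging applied to the HA channel alone gives a numerical estimate of the human open-loop probability defined analogously.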

Since each human control loop contains a successful SH packet, the probability of an open human control loop only depends on the HA transmission and is given by

$$p_{H}(t^{\prime})=\mathbb{P}\left[\zeta_{HA}(t^{\prime})=0\right]=\varepsilon\left(\gamma_{HA}(\kappa(t^{\prime}+1)-1)\right). \tag{9}$$

The expectation of (9) with respect to $\gamma_{HA}(\kappa(t^{\prime})-1)$ is denoted as $\bar{p}_{H}$, and can be obtained by

$$\bar{p}_{H}\triangleq\mathbb{E}\left[p_{H}(t^{\prime})\right]=\mathbb{E}\left[\varepsilon\left(\gamma_{HA}(\kappa(t^{\prime})-1)\right)\right]. \tag{10}$$

Figure 3: Illustration of the time horizon of plant dynamics between two adjacent closed human control loops. $\Delta(\cdot)=\tau_{SH}(\cdot)+\tau_{H}(\cdot)+\tau_{HA}$.

III-A2 Distribution of the SH delay

The duration of a human control loop $\tau_{SHA}(t^{\prime})$ in (5) includes the SH channel delay $\tau_{SH}(t^{\prime})$, the human control lag $\tau_{H}(t^{\prime})$, and the HA channel delay $\tau_{HA}$. The HA channel delay is constant across human control loops, while the human control lag is time-correlated across human control loops due to the Markovian property. The SH channel delay is attributed to the HARQ and is i.i.d. across all human control loops. We analyze the distribution of the SH channel delay before proceeding with the distribution of the duration of consecutive time steps where the human control loop is open. We consider the following three types of HARQ schemes for the SH channel: Type I HARQ (TI-HARQ), Chase Combining HARQ (CC-HARQ), and Incremental Redundancy HARQ (IR-HARQ).⁴

⁴ In TI-HARQ, the same packet is sent in every re/transmission, and all erroneously decoded packets are discarded at the receiver side. All decoding attempts during re/transmissions of the packet are independent. In CC-HARQ, all erroneously decoded packets in previous re/transmissions are saved and their signals are combined together as a single strengthened signal for decoding. In IR-HARQ, the packet in each re/transmission is a punctured version of a low-rate mother packet. If errors occur, only the additional redundancy for the previous uncorrectable packets is retransmitted. The newly received redundancy is combined with the previously received packets to construct a longer packet for decoding.

The number of re/transmission attempts is $r\in\{1,2,\dots,N\}$. Let $\bm{\gamma}_{r}(\kappa(t^{\prime}-1),r)$ denote the set of experienced SNRs during $r$ re/transmission attempts, that is

$$\bm{\gamma}_{r}(\kappa(t^{\prime}-1),r)\triangleq\{\gamma_{SH}(\kappa(t^{\prime}-1)),\dots,\gamma_{SH}(\kappa(t^{\prime}-1)+r-1)\}. \tag{11}$$

The decoding error probability of the packet after $r$ re/transmission attempts, $\Theta(r)$, is an expectation over (11), and can be approximated as [39, 40, 41]

$$\begin{aligned}\Theta(r)&\triangleq\mathbb{P}\left[\zeta_{SH}(t^{\prime})=0\mid\bm{\gamma}_{r}(\kappa(t^{\prime}-1),r)\right]\\&\approx\begin{cases}\prod_{i=0}^{r-1}\varepsilon\left(\gamma_{SH}(\kappa(t^{\prime}-1)+i)\right),&\text{TI-HARQ},\\ \varepsilon\left(\sum_{i=0}^{r-1}\gamma_{SH}(\kappa(t^{\prime}-1)+i)\right),&\text{CC-HARQ},\\ \mathcal{Q}\left(\frac{\sum_{i=0}^{r-1}\mathcal{C}\left(\gamma_{SH}(\kappa(t^{\prime}-1)+i)\right)-\frac{b}{l_{p}}}{\sqrt{\frac{\sum_{i=0}^{r-1}\mathcal{V}\left(\gamma_{SH}(\kappa(t^{\prime}-1)+i)\right)}{l_{p}}}}\right),&\text{IR-HARQ}.\end{cases}\end{aligned} \tag{12}$$

To facilitate our subsequent analysis, we assume that all packets have the same length $l_{p}$. For CC-HARQ, since the channel gain is exponentially distributed, the combined SNR $\sum_{i=0}^{r-1}\gamma_{SH}(\kappa(t^{\prime}-1)+i)$ in (12) is gamma distributed with the probability density function [42]

$$\mathbb{P}\left[\sum_{i=0}^{r-1}\gamma_{SH}(\kappa(t^{\prime}-1)+i)=\hat{\gamma}\right]=\frac{\frac{1}{\bar{\gamma}^{r}}\hat{\gamma}^{r-1}e^{-\frac{\hat{\gamma}}{\bar{\gamma}}}}{\left(r-1\right)!}, \tag{13}$$

where $\bar{\gamma}$ is the mean of the exponential distribution. Thus, for TI- and CC-HARQ, $\Theta(r)$ is obtained by leveraging (6), (12), and (13). For IR-HARQ, $\Theta(r)$ can be determined by Monte Carlo simulations.
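The three cases of (12) can be sketched as below for a given list of per-attempt SNRs. This evaluates only the conditional expression for fixed, strictly positive SNRs; averaging over the fading distribution, as required for $\Theta(r)$ in the analysis, is left out. The function names and the scheme labels `"TI"`, `"CC"`, and `"IR"` are hypothetical.

```python
import math


def q_function(x):
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def capacity(snr):
    return math.log2(1.0 + snr)


def dispersion(snr):
    return (1.0 - (1.0 + snr) ** -2) * math.log2(math.e) ** 2


def eps(snr, n_bits, n_symbols):
    """Single-transmission error approximation of Eq. (6)."""
    return q_function(
        (capacity(snr) - n_bits / n_symbols)
        / math.sqrt(dispersion(snr) / n_symbols)
    )


def harq_error_prob(snrs, n_bits, n_symbols, scheme):
    """Conditional error probability after len(snrs) attempts, as in Eq. (12).

    snrs holds the (strictly positive, linear-scale) SNR of each attempt.
    """
    if scheme == "TI":  # independent decoding attempts
        return math.prod(eps(g, n_bits, n_symbols) for g in snrs)
    if scheme == "CC":  # SNRs add up after signal combining
        return eps(sum(snrs), n_bits, n_symbols)
    if scheme == "IR":  # capacities and dispersions accumulate
        c_sum = sum(capacity(g) for g in snrs)
        v_sum = sum(dispersion(g) for g in snrs)
        return q_function(
            (c_sum - n_bits / n_symbols) / math.sqrt(v_sum / n_symbols)
        )
    raise ValueError(f"unknown HARQ scheme: {scheme}")


# Illustrative SNR realizations over three attempts.
snrs_example = [0.5, 0.8, 1.2]
theta_ti = harq_error_prob(snrs_example, 100, 100, "TI")
theta_cc = harq_error_prob(snrs_example, 100, 100, "CC")
theta_ir = harq_error_prob(snrs_example, 100, 100, "IR")
```

With a single attempt, all three branches reduce to the same expression (6), which gives a simple sanity check on any implementation.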

wk[τSH(t)=k]={(Θ(N))q(1Θ(kqN)),for k=qN+1,(Θ(N))q(Θ(kqN1)Θ(kqN)),for qN+2k(q+1)N,0,otherwise.\!w_{k}\triangleq\mathbb{P}\!\left[\tau_{SH}(t^{\prime})\!=\!k\right]\!\!=\!\begin{cases}\!\left(\Theta(N)\right)^{q}\!\left(1\!-\!\Theta(k-qN)\right),&\!\text{for }k\!=\!qN\!+\!1,\\ \!\left(\Theta(N)\right)^{q}\!\left(\Theta(k-qN-1)\!-\!\Theta(k-qN)\right),&\!\text{for }qN\!+\!2\!\leq k\!\leq(q\!+\!1)N,\\ 0,&\text{otherwise.}\end{cases} (14)

The delay induced by the SH transmission period $\tau_{SH}(t')$ is $k\in\mathbb{N}_{0}$. Note that the SH transmission period may contain multiple $N$-trials of the retransmission process, as described in Section II-B2, and the number of $N$-trials that fail before the successful one is $q\in\mathbb{N}_{0}$. The probability distribution of $\tau_{SH}(t')$ is then given by (14). When $q=0$, the SH transmission is successful in the first $N$-trial. In this case, if $2\leq k\leq N-1$, $\tau_{SH}(t')=k$ indicates that the first $k-1$ attempts have failed and the $k$-th transmission attempt is successful. When $q>0$, the SH transmission is successful in the $(q+1)$th $N$-trial, while the preceding $q$ $N$-trials are decoded erroneously.
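The piecewise form of (14) can be tabulated directly once $\Theta(1),\dots,\Theta(N)$ are known. The following is a minimal sketch; the function name `sh_delay_pmf` and the list-based representation of $\Theta$ are illustrative choices, not part of the paper.

```python
def sh_delay_pmf(theta, k_max):
    """Return [w_1, ..., w_{k_max}] following (14).

    theta[r - 1] holds Theta(r) for r = 1..N, so N = len(theta).
    """
    n = len(theta)
    pmf = []
    for k in range(1, k_max + 1):
        q, j = divmod(k - 1, n)   # q failed N-trials, then attempt r of the next one
        r = j + 1                 # r = k - qN, with 1 <= r <= N
        # Probability that the first r - 1 attempts of this N-trial failed
        # (equal to 1 when r = 1, the first case of (14)).
        failed_before = 1.0 if r == 1 else theta[r - 2]
        pmf.append(theta[n - 1] ** q * (failed_before - theta[r - 1]))
    return pmf
```

For example, with `theta = [0.5, 0.2, 0.1]` the first three entries are the probabilities of succeeding at the first, second, and third attempt of the first $N$-trial, and the mass left after each $N$-trial shrinks by a factor $\Theta(N)$.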

III-A3 Time interval distribution between consecutive closed human control loops

We denote the starting time of the $n$th closed human control loop as $t=k_{n}$, as shown in Fig. 3. Let $L$ and $M$ denote the number of time steps and the number of (open or closed) human control loops between $k_{n}$ and $k_{n+1}$, respectively, i.e.,

$$
L\triangleq\sum_{i=1}^{M}\tau_{SH}(t'+i)+\sum_{i=1}^{M}\tau_{H}(t'+i)+M,
\tag{15}
$$

where $t'$ is the index of the $n$th closed human control loop among all the loops. The probability distribution of $L$ in (15) can be expressed as

$$
z_{l}\triangleq\mathbb{P}\left[L=l\right]=\sum_{m=1}^{\infty}\mathbb{P}\left[L=l\mid M=m\right]\mathbb{P}\left[M=m\right].
\tag{16}
$$

The probability distribution of the number of consecutive open human control loops in (16) can be expressed as

$$
\mathbb{P}\left[M=m\right]=(1-\bar{p}_{H})(\bar{p}_{H})^{m-1},\quad m\in\mathbb{N},
\tag{17}
$$

where $\bar{p}_{H}$ is defined in (10). The time interval distribution of $L$ conditioned on $m$ open human control loops in (16) consists of two independent stochastic parts, i.e., the total delay induced by the SH channel and the total human control lag. In the following, we analyze the conditional probabilities of the two parts. The conditional probability of the delay induced by the SH channel can be expressed as

$$
w_{k,m}\triangleq\mathbb{P}\left[\sum_{i=1}^{m}\tau_{SH}(t'+i)=k\mid M=m\right]=
\begin{cases}
\sum_{i=1}^{k}w_{i,m-1}w_{k-i+1}, & \text{for }m>1,\\
w_{k}, & \text{for }m=1,
\end{cases}
\tag{18}
$$

where $w_{k}$ is defined in (14). The conditional probability of the delay induced by the human control lag can be expressed as (19),

$$
\begin{aligned}
v_{k,m}&\triangleq\mathbb{P}\left[\sum_{i=1}^{m}\tau_{H}(t'+i)=k\mid M=m\right]\\
&=\begin{cases}
\sum_{\delta_{1}+\dots+\delta_{m}=k}\mathbb{P}\left[\tau_{H}(t'+1)=\delta_{1},\dots,\tau_{H}(t'+m)=\delta_{m}\right], & \text{for }m>1,\\
v_{k}, & \text{for }m=1,
\end{cases}\\
&=\begin{cases}
\sum_{\delta_{1}+\dots+\delta_{m}=k}v_{\delta_{1}}p_{\delta_{1},\delta_{2}}p_{\delta_{2},\delta_{3}}\dots p_{\delta_{m-1},\delta_{m}}, & \text{for }m>1,\\
v_{k}, & \text{for }m=1,
\end{cases}
\end{aligned}
\tag{19}
$$

where $\delta_{m}\in\mathcal{S}$, $v_{k}$ is defined in (2), and $p_{\delta_{m-1},\delta_{m}}=[\mathbf{M}]_{\delta_{m-1},\delta_{m}}$. Then, the time interval distribution conditioned on $m$ open human control loops is

$$
z_{l,m}\triangleq\mathbb{P}\left[L=l\mid M=m\right]=
\begin{cases}
\sum_{k=1}^{l-m}w_{k,m}v_{l-m-k+1,m}, & \text{for }l>m,\\
0, & \text{otherwise,}
\end{cases}
\tag{20}
$$

where $w_{k,m}$ and $v_{k,m}$ are the conditional probabilities defined in (18) and (19), respectively. In summary, by using (18) and (19), we can obtain (20). By substituting (17) and (20) into (16), we can obtain the time interval distribution between consecutive closed human control loops.
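As a cross-check on the analytical distribution in (16), one can also sample $L$ directly from its definition in (15). The sketch below is a Monte Carlo estimate under the same modelling assumptions as the text: $M$ is geometric as in (17), the SH delays are i.i.d. draws from $w_k$ in (14), and the human lags follow a Markov chain started from a given initial distribution. The function names and argument layout are hypothetical.

```python
import random

def sample_cycle_length(rng, p_h, w, lag_states, lag_matrix, lag_init):
    """Draw one realization of L in (15)."""
    # M: number of human control loops in the cycle, geometric as in (17).
    m = 1
    while rng.random() < p_h:
        m += 1
    # SH delays: i.i.d. draws, where w[k - 1] = P[tau_SH = k].
    sh_total = sum(rng.choices(range(1, len(w) + 1), weights=w, k=m))
    # Human lags: Markov chain over lag_states, as in the product form of (19).
    s = rng.choices(range(len(lag_states)), weights=lag_init)[0]
    lag_total = lag_states[s]
    for _ in range(m - 1):
        s = rng.choices(range(len(lag_states)), weights=lag_matrix[s])[0]
        lag_total += lag_states[s]
    return sh_total + lag_total + m

def cycle_length_pmf(p_h, w, lag_states, lag_matrix, lag_init, n=50000, seed=0):
    """Empirical estimate of z_l = P[L = l] as a dict {l: probability}."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n):
        l = sample_cycle_length(rng, p_h, w, lag_states, lag_matrix, lag_init)
        counts[l] = counts.get(l, 0) + 1
    return {l: c / n for l, c in sorted(counts.items())}
```

An empirical distribution of this kind is truncated by construction, so it is best used to sanity-check the analytical $z_l$ rather than to replace it, especially when the tail of $L$ matters.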

III-B Stability Condition of WHMC

Lyapunov functions are powerful tools for stability analysis of dynamic systems without needing explicit control policies. A function $V:\mathbb{R}^{l_{s}}\rightarrow\mathbb{R}_{\geq 0}$ is said to be a Lyapunov-like function if $V(0)=0$, $V\left(\mathbf{x}(t)\right)>0$ for $\mathbf{x}(t)\neq 0$, and $\lim_{\|\mathbf{x}(t)\|\rightarrow\infty}V\left(\mathbf{x}(t)\right)=\infty$. It is a scalar function that can be treated as a cost function associated with the system state $\mathbf{x}(t)$. For example, $V(\mathbf{x}(t))$ can be the magnitude of the state vector $\mathbf{x}(t)$. The dynamic system is stable if the expected cumulative cost over an infinite time horizon remains bounded. Thus, we have the following definition.

Definition 1 (Stochastic Stability [43, 44, 45]).

The wireless networked human-machine collaborative system is stochastically stable if, for some Lyapunov-like function $V:\mathbb{R}^{l_{s}}\rightarrow\mathbb{R}_{\geq 0}$, the expected cumulative cost satisfies $\sum_{t=0}^{\infty}\mathbb{E}\left[V\left(\mathbf{x}(t)\right)\right]<\infty$.

From (7), (9) and (18), we note that the WHMC system randomly switches between the following four cases: 1) Case one: both the machine control loop and the human control loop are closed; 2) Case two: only the machine control loop is closed; 3) Case three: only the human control loop is closed; and 4) Case four: both the machine control loop and the human control loop are open. We next examine the stability condition taking into account each individual case.

For tractable analysis, we make the following assumption.

Assumption 1 (Lyapunov-Like Function Gains).

There exists a Lyapunov-like function $V:\mathbb{R}^{l_{s}}\rightarrow\mathbb{R}_{\geq 0}$ and non-negative control system parameters $\alpha_{HM}\in\mathbb{R}_{\geq 0}$, $\alpha_{M}\in\mathbb{R}_{\geq 0}$, $\alpha_{H}\in\mathbb{R}_{\geq 0}$, and $\alpha\in\mathbb{R}_{>0}$, such that for all $\mathbf{x}(t)$ following (1) and the initial plant state satisfying $\mathbb{E}\left[V(\mathbf{x}(0))\right]<\infty$, we have

$$
V(\mathbf{x}(t+1))\leq
\begin{cases}
\alpha_{HM}V(\mathbf{x}(t)), & \text{for case one},\\
\alpha_{M}V(\mathbf{x}(t)), & \text{for case two},\\
\alpha_{H}V(\mathbf{x}(t)), & \text{for case three},\\
\alpha V(\mathbf{x}(t)), & \text{for case four}.
\end{cases}
\tag{21}
$$

Assumption 1 bounds the one-step cost ratio between $V(\mathbf{x}(t+1))$ and $V(\mathbf{x}(t))$ in the four cases by the Lyapunov gains $\alpha_{HM}$, $\alpha_{M}$, $\alpha_{H}$, and $\alpha$. Note that Lyapunov gains are often assumed in non-linear system stability analysis [43, 44, 45]. If a ratio is less than $1$, then the cost decreases; otherwise, it increases. Considering extreme cases, if all ratios in the four cases are less than $1$, the WHMC system is directly stabilized, as the cost in all cases decreases over time. Conversely, if all ratios are significantly larger than $1$, the system may not stabilize according to Definition 1. The control system parameters $\alpha_{HM}$, $\alpha_{M}$, $\alpha_{H}$, and $\alpha$ are determined by the plant dynamics (1) and the control algorithms (3) and (4).

III-B1 Stability condition

In the following, we propose a stochastic cycle-cost-based approach to obtain sufficient stability conditions for the WHMC system.

Theorem 1.

The plant of the WHMC system defined in Section II is stochastically stable if

$$
\mathbb{E}\left[\left(\alpha_{M}\left(1-\bar{p}_{M}\right)+\alpha\bar{p}_{M}\right)^{L}\right]\left(\alpha_{HM}\left(1-\bar{p}_{M}\right)+\alpha_{H}\bar{p}_{M}\right)<1,
\tag{22}
$$

where $\bar{p}_{M}$ is the expected probability of an open machine control loop defined in (8); the control system parameters $\alpha_{HM}$, $\alpha_{M}$, $\alpha_{H}$, and $\alpha$ are defined in Assumption 1; $L$ is the random time interval between consecutive closed human control loops with the probability distribution defined in (16).

Proof.

(Main ideas) We investigate the stability condition of the WHMC system defined in (1) by following the stability analysis framework adopting Lyapunov-like functions [43, 44, 45] (footnote 5: the methods in [43, 44, 45] are not directly applicable, as the control process involves human control operations with a Markovian lag model). Since human control is less frequent than machine control, it is convenient to focus on the plant events in which the actuator receives human control commands. Therefore, the control process is divided by the closed-human-control-loop events. We call the time interval between consecutive closed human control loops a cycle within the control process, and the sum of stochastic costs in a cycle a cycle cost. Thus, the total cost of the control process is the sum of all cycle costs. According to Definition 1, stability is equivalent to the sum of all cycle costs being bounded. To prove the stability condition, we first analyze a stochastic cycle cost, where only case two and case four defined in Assumption 1 occur. It depends on the number of occurrences of these two cases, conditioned on the open loop probabilities and the time interval distribution presented in Section III-A. Then, we analyze the sum of stochastic cycle costs over infinitely many cycles by further considering case one and case three defined in Assumption 1. Finally, we derive the stability condition by requiring the sum of the stochastic cycle costs to be bounded, as in Definition 1. See Appendix A for the detailed proof. ∎

Sufficient conditions in stability analysis are critical because they guarantee that the system is stable under the stated assumptions. They are thus preferred, since they give engineers and researchers a clear set of criteria to design and analyze their systems safely. The stability condition of the WHMC system depends on the wireless communication parameters, i.e., the open loop probabilities of human and machine control $\bar{p}_{H}$ and $\bar{p}_{M}$, the control system parameters, i.e., $\alpha_{HM}$, $\alpha_{M}$, $\alpha_{H}$, and $\alpha$, and the Markov human state transition matrix $\mathbf{M}$. In particular, $\bar{p}_{H}$ and $\mathbf{M}$ impact the distribution of $L$, which in turn affects the stability condition. The condition indicates that if the WHMC system exhibits high dynamics (i.e., the plant state changes significantly even with a very small control input), the human operator experiences fatigue with a high control lag, and the open-loop probability is high, then the WHMC system becomes difficult to stabilize through collaboration.
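Given a distribution of $L$, checking (22) is a direct computation. The sketch below assumes the distribution is supplied as a dictionary `z` mapping each cycle length $l$ to $\mathbb{P}[L=l]$ (for example, a truncated version of (16)); the function names are illustrative.

```python
def stability_lhs(a_hm, a_m, a_h, a, p_m, z):
    """Left-hand side of (22) for a cycle-length distribution z = {l: P[L = l]}."""
    omega = a_m * (1.0 - p_m) + a * p_m      # average gain per step inside a cycle
    lam = a_hm * (1.0 - p_m) + a_h * p_m     # average gain at the closed human loop
    e_omega_l = sum(prob * omega ** l for l, prob in z.items())
    return e_omega_l * lam

def is_stable(a_hm, a_m, a_h, a, p_m, z):
    """True when the sufficient condition (22) holds for the given parameters."""
    return stability_lhs(a_hm, a_m, a_h, a, p_m, z) < 1.0
```

When the per-step gain inside a cycle exceeds 1, long cycles dominate the expectation, so a truncated `z` can underestimate the left-hand side; the truncation point should be chosen with that in mind.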

III-B2 Stability region

The stability region in WHMC systems defines the range of system parameters that ensure stable operation, as per Theorem 1. The boundary of this stability region represents the critical limits beyond which the system may become unstable. The properties of this boundary are elucidated next.

Corollary 1.

Given the WHMC stability condition in Theorem 1,

(i) the stability region boundary in terms of $\alpha_{HM}$ and $\alpha_{H}$ is linear:

$$
\alpha_{HM}=-\frac{\bar{p}_{M}}{\left(1-\bar{p}_{M}\right)}\alpha_{H}+\frac{1}{\mathbb{E}\left[\Omega^{L}\right]\left(1-\bar{p}_{M}\right)},
\tag{23}
$$

where $\Omega\triangleq\alpha_{M}\left(1-\bar{p}_{M}\right)+\alpha\bar{p}_{M}$;

(ii) the stability region boundary in terms of $\alpha_{M}$ and $\alpha$ is linear:

$$
\alpha_{M}=-\frac{\bar{p}_{M}}{1-\bar{p}_{M}}\alpha+\frac{1}{\left(1-\bar{p}_{M}\right)\bar{L}}\sum_{l=1}^{\bar{L}}\left(\bar{L}\Lambda\mathbb{P}\left[L=l\right]\right)^{-l},
\tag{24}
$$

where $\bar{L}\triangleq\mathbb{E}\left[L\right]$ and $\Lambda\triangleq\alpha_{HM}\left(1-\bar{p}_{M}\right)+\alpha_{H}\bar{p}_{M}$;

(iii) the stability region boundaries in terms of the other four possible pairs of the control system parameters $\alpha_{HM}$, $\alpha_{M}$, $\alpha_{H}$, and $\alpha$ are concave.

Proof.

See Appendix B. ∎

As illustrated in Fig. 4, a linear stability region (e.g., Corollary 1 (i) and (ii)) means the boundary is governed by a linear function. It implies that any combination of the control system parameters within the region will maintain system stability, offering engineers substantial flexibility in parameter selection and system tuning without compromising stability. This implication is applicable to the convex stability region, where the boundary is governed by a convex function. In contrast, a concave stability region (e.g., Corollary 1 (iii)) has a boundary governed by a concave function. This indicates that while individual parameter sets within the region ensure stability, linear combinations of these parameters may not. For any stable parameter set, all parameter sets within the rectangular area defined by this set and the origin are also stable. In addition to control system parameters, communication system parameters also impact the stability region, which is presented in Section III-C.
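For reference, the linear boundary in Corollary 1 (i) can be recovered in two steps from (22), assuming $\bar{p}_{M}<1$ and noting that $\mathbb{E}\left[\Omega^{L}\right]$ does not depend on $\alpha_{HM}$ or $\alpha_{H}$. This is only a sketch of the algebra; the full argument is in Appendix B.

```latex
% Boundary of (22): replace the inequality with equality.
\mathbb{E}\left[\Omega^{L}\right]
  \left(\alpha_{HM}\left(1-\bar{p}_{M}\right)+\alpha_{H}\bar{p}_{M}\right) = 1
% Isolate the term in \alpha_{HM}.
\;\Longrightarrow\;
\alpha_{HM}\left(1-\bar{p}_{M}\right)
  = \frac{1}{\mathbb{E}\left[\Omega^{L}\right]} - \bar{p}_{M}\,\alpha_{H}
% Divide by (1-\bar{p}_{M}) to obtain (23).
\;\Longrightarrow\;
\alpha_{HM}
  = -\frac{\bar{p}_{M}}{1-\bar{p}_{M}}\,\alpha_{H}
    + \frac{1}{\mathbb{E}\left[\Omega^{L}\right]\left(1-\bar{p}_{M}\right)}
```

The slope depends only on $\bar{p}_{M}$, while the human model and the SH channel enter through the intercept via the distribution of $L$.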

Figure 4: Illustration of the stability region boundaries.

III-B3 Special cases

Given the stability condition of the general WHMC system in Theorem 1, we examine the stability conditions for three specific cases.

Error-free channel: assuming the communication channels are perfect, we have $p_{H}(t)=p_{M}(t)=0,\forall t$. The stability condition in Theorem 1 reduces to

$$
\mathbb{E}\left[\alpha_{M}^{L}\right]\alpha_{HM}<1,
\tag{25}
$$

where $\mathbb{E}\left[\alpha_{M}^{L}\right]=\sum_{k=1}^{\tau_{\text{max}}}\alpha_{M}^{k+1}v_{k}$ and $v_{k}$, defined in (2), is determined by the human state transition matrix $\mathbf{M}$. In this case, the stability depends on $\alpha_{M}$, $\alpha_{HM}$, and $\mathbf{M}$. Since the communication channels are perfect, human control loops may be open only because of the human control lag. Thus, only the Lyapunov gains in cases one and two of Assumption 1, i.e., $\alpha_{HM}$ and $\alpha_{M}$, play a role in this scenario.

Human control only: assuming that the plant is only controlled by a human operator, i.e., the machine control loop is always open ($p_{M}(t)=1,\forall t$), the stability condition is

$$
\mathbb{E}\left[\alpha^{L}\right]\alpha_{H}<1,
\tag{26}
$$

where $\mathbb{E}\left[\alpha^{L}\right]=\sum_{l=0}^{\infty}\alpha^{l}z_{l}$ and $z_{l}$ is defined in (16). In this case, the stability depends on $\alpha_{H}$, $\alpha$, and $\mathbf{M}$. Since the machine control loop is always open, only the Lyapunov gains in cases three and four of Assumption 1 are relevant. We note that even if the human control lag is constant, $L$ is still a random time interval due to the random SH delay.

Machine control only: assume that the plant is only controlled by a machine controller, i.e., the human control loop is always open ($p_{H}(t)=1,\forall t$). The stability condition of this case cannot be directly obtained from Theorem 1, because the stochastic cycle-based approach in Theorem 1 is built on closed human control loops. The definition of stochastic cycles must be modified to analyze the stability condition. Our results are presented next:

Proposition 1.

The plant in Section II controlled solely by the machine is stochastically stable if

$$
\frac{\alpha_{M}}{\alpha}\mathbb{E}\left[\alpha^{\hat{L}}\right]<1,
\tag{27}
$$

where the control system parameters $\alpha$ and $\alpha_{M}$ are defined in Assumption 1; $\hat{L}$ is the number of time steps between two consecutive closed machine control loops, with the probability distribution $\mathbb{P}\left[\hat{L}=l\right]=(1-\bar{p}_{M})\bar{p}_{M}^{l-1}$.

Proof.

See Appendix C. ∎

In this case, the stability depends on $\alpha$, $\alpha_{M}$, and $\bar{p}_{M}$. Since the human control loop is always open in this case, only the Lyapunov gains in cases two and four of Assumption 1 are applicable. Condition (27) resembles existing results [43].

III-C Numerical Examples of the Stability Region

Figure 5: Numerical examples of the boundary of stability conditions: (a) Impacts of the human state transition matrix on the stability region in terms of $\alpha_{HM}$ and $\alpha_{H}$, where $\alpha=1.02$, $\alpha_{M}=1.01$, $N=3$, and TI-HARQ are adopted. (b) Impacts of HARQ schemes on the stability region in terms of $\alpha_{HM}$ and $\alpha_{H}$, where $\alpha=1.02$, $\alpha_{M}=1.01$, $\mathbf{M}=\mathbf{M}_{l}$, and $N=3$ are adopted. (c) Impacts of the maximum number of retransmissions on the stability region in terms of $\alpha_{HM}$ and $\alpha_{H}$, where $\alpha=1.02$, $\alpha_{M}=1.01$, $\mathbf{M}=\mathbf{M}_{l}$, and IR-HARQ are adopted. (d) The stability region in terms of $\alpha_{M}$ vs. $\alpha_{H}$ and $\alpha_{M}$ vs. $\alpha_{HM}$, where $\mathbf{M}=\mathbf{M}_{l}$ and IR-HARQ are adopted. (e) The stability region in terms of $\alpha_{HM}$ and $\alpha_{H}$, where $\alpha=1.02$, $\mathbf{M}=\mathbf{M}_{l}$, and IR-HARQ are adopted. (f) The stability region in terms of $\alpha_{M}$ and $\alpha_{H}$, where $\alpha_{HM}=0.3$, $\mathbf{M}=\mathbf{M}_{l}$, and IR-HARQ are adopted. Coloured areas are stable regions.

We present numerical results to illustrate the stability region in terms of the communication, control system, and human model parameters, showing how these parameters affect the stability condition (22) in Theorem 1. The average channel gain is denoted as $\bar{h}$ and follows the free-space path loss model $\bar{h}=A\left(\frac{3\times 10^{8}}{4\pi f_{c}d}\right)^{d_{e}}$, where $A$ denotes the antenna gain, $f_{c}$ denotes the carrier frequency, $d$ denotes the distance from the human operator or the machine to the plant, and $d_{e}$ denotes the path loss exponent [46]. The time-varying wireless channel power gains are generated from Rayleigh fading channel models, i.e., $h(t)\sim\mathrm{Exp}(1)$. Given the transmission power $P_{\text{tx}}$ and the receiving noise power $\sigma^{2}$, the SNR of received packets in each channel is obtained from $\gamma(t)=\frac{\bar{h}h(t)P_{\text{tx}}}{\sigma^{2}}$. The communication parameters are summarized in Table I.

TABLE I: Communication Parameters in Simulation

| Item | Value |
|---|---|
| Communication parameters | |
| Code rate [bps], $b/l_{p}$ | 2 |
| Packet length [symbols], $l_{p}$ | 1500 |
| Transmit power [dBm], $P_{\text{tx}}$ | 23 |
| Background noise power [dBm], $\sigma^{2}$ | -70 |
| Maximum number of re/transmissions, $N$ | $\{1,3,5\}$ |
| Free-space path loss model | |
| Antenna gain, $A$ | 4 |
| Carrier frequency [MHz], $f_{c}$ | 915 |
| Distance from machine to plant [m], $d$ | 40 |
| Distance from human to plant [m], $d$ | 45 |
| Path loss exponent, $d_{e}$ | 2.9 |
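To make the link budget concrete, the sketch below computes the average channel gain and samples instantaneous SNRs from the path loss and Rayleigh fading models described above, using the Table I values. It assumes the antenna gain $A=4$ is a linear (not dB) factor, since it multiplies the path loss term directly; the helper names are hypothetical.

```python
import math
import random

def mean_channel_gain(antenna_gain, f_c_hz, distance_m, path_loss_exp):
    """Average channel gain from the path loss model (antenna gain assumed linear)."""
    return antenna_gain * (3e8 / (4 * math.pi * f_c_hz * distance_m)) ** path_loss_exp

def dbm_to_watt(p_dbm):
    """Convert a power level in dBm to watts."""
    return 10 ** ((p_dbm - 30) / 10)

def sample_snr(rng, h_bar, p_tx_dbm, noise_dbm):
    """One SNR sample: average gain times a unit-mean exponential fading gain."""
    h = rng.expovariate(1.0)
    return h_bar * h * dbm_to_watt(p_tx_dbm) / dbm_to_watt(noise_dbm)

# Table I values: 915 MHz carrier, 23 dBm transmit power, -70 dBm noise power.
h_bar_machine = mean_channel_gain(4, 915e6, 40, 2.9)
h_bar_human = mean_channel_gain(4, 915e6, 45, 2.9)
mean_snr_machine = h_bar_machine * dbm_to_watt(23) / dbm_to_watt(-70)
mean_snr_human = h_bar_human * dbm_to_watt(23) / dbm_to_watt(-70)
```

The human link is slightly longer than the machine link, so its average SNR is lower, which is one reason the human loop is the more fragile of the two in this setup.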

The human control lag has two states $\mathcal{S}=\{5,25\}$ (i.e., fast and slow) with the stationary probability distribution $(0.5,0.5)$, and the state transition matrix $\mathbf{M}$ can be one of the three cases below:

$$
\mathbf{M}_{h}=\begin{bmatrix}0.9&0.1\\ 0.1&0.9\end{bmatrix},\quad
\mathbf{M}_{e}=\begin{bmatrix}0.5&0.5\\ 0.5&0.5\end{bmatrix},\quad
\mathbf{M}_{l}=\begin{bmatrix}0.1&0.9\\ 0.9&0.1\end{bmatrix}.
\tag{28}
$$

$\mathbf{M}_{h}$ is a Prolonged Response Model, where the human operator tends to remain in a single state, either fast (low lag) or slow (high lag), for extended periods. This reflects a tendency for the operator's reaction time to be consistently fast or slow, with infrequent transitions between these two states. $\mathbf{M}_{e}$ is a Random Response Model, where the human operator has an equal probability of staying in the current state or switching to the other, leading to unpredictable shifts between fast and slow reactions. $\mathbf{M}_{l}$ is a Variable Response Model, where the human operator frequently switches between fast and slow reactions, indicating high variability in response times.

Numerical results are illustrated in Fig. 5. We select the pair of $\alpha_{H}$ and $\alpha_{HM}$ to show the impacts because this pair has the simplest linear relationship (see Corollary 1). Fig. 5(a) illustrates the impacts of human model parameters on the stability region. In particular, a human operator with a variable response model shows the largest stability region, while a human operator with a prolonged response model has the smallest stability region. A human operator with a prolonged response model has a higher chance of remaining in the large-lag state for a long time. Thus, to guarantee closed loop stability, more reliable communications are required. As shown in Corollary 1(i), the slope of the linear stability region boundary in Fig. 5(a) depends on the expected probability of an open machine control loop $\bar{p}_{M}$ defined in (8).

Fig. 5(b) illustrates the impacts of three HARQ schemes on the stability region. Compared with TI-HARQ, WHMC systems with IR-HARQ and CC-HARQ schemes show a larger stability region, because packet combining can significantly reduce the number of retransmissions by taking advantage of the accumulated SNRs. A WHMC system with the IR-HARQ scheme has the largest stability region because only incremental redundancies are retransmitted for each erroneous packet. Fig. 5(c) illustrates the impacts of the maximum number of re/transmission attempts on the stability region. We see that the system with HARQ schemes (i.e., $N>1$) has a larger stability region than the system without retransmission (i.e., $N=1$). As $N$ increases, the additional expansion of the stability region diminishes; thus, $N=3$ is used in most of the numerical illustrations.

In addition to the linear boundary, Fig. 5(d) illustrates the concave boundaries in terms of $\alpha_{M}$ vs. $\alpha_{H}$ and $\alpha_{M}$ vs. $\alpha_{HM}$, where a small variation of $\alpha_{M}$ leads to a significant change in both $\alpha_{H}$ and $\alpha_{HM}$. This is because machine control attempts are more frequent than human control attempts, so the accumulated effect of $\alpha_{M}$ is significantly larger than that of the Lyapunov gains $\alpha_{H}$ and $\alpha_{HM}$ involving human control attempts. Fig. 5(e) illustrates stability regions in terms of the pair of $\alpha_{HM}$ and $\alpha_{H}$ with different $\alpha_{M}$. As $\alpha_{M}$ decreases, the stability region expands dramatically, highlighting the significant reduction of the human control effort needed to stabilize the system. Fig. 5(f) illustrates stability regions in terms of the pair of $\alpha_{M}$ and $\alpha_{H}$ with different $\alpha$. The stability region expands with decreasing $\alpha$. This is because a larger open loop Lyapunov gain $\alpha$ indicates greater effort required for both automatic machine and human control inputs.

IV Proof of Concept Experiment

In this section, we present a case study of WHMC to illustrate its advantage in control performance. The experiment data of the case study are recorded to estimate the control system and the human model parameters, followed by the stability analysis of the case study to show the effectiveness of Theorem 1.

Figure 6: The cart-pole system to illustrate the WHMC.

IV-A Experiment Setups

We build a WHMC system where a cart-pole system is simulated and controlled by a machine controller and a real human operator, as shown in Fig. 6. The machine controller is implemented to control the force $F$ applied to the cart, which carries an unknown weight $m_{c}$, in order to balance the pole. The dynamic weight $m_{c}$ on the cart can be observed by the human operator, who monitors the state of the simulated cart-pole system and uses the 'S' key (between 'A' and 'D' on a keyboard) to intervene in the control of the cart-pole system and remove the dynamic weight on the cart. The dynamic weight can be seen as a catastrophic disturbance to the system, which cannot be handled by the machine controller designed without such knowledge. Therefore, such a control system has nonlinear dynamics and a disturbance unknown to the machine controller, which is challenging to handle without collaboration with a human operator. The IR-HARQ scheme is adopted with a maximum number of re/transmissions of $N=3$. Other communication parameters and the free-space path loss model are the same as in Table I.

IV-A1 Cart-pole dynamics

In the simulated cart-pole system, the mass of the pole is assumed to be concentrated at its end. The states of the cart-pole system consist of the position $x(t)$ and velocity $\dot{x}(t)$ of the cart, the angle $\theta(t)$ and angular velocity $\dot{\theta}(t)$ of the pole, and the unknown weight $m_{c}(t)$ on the cart, which is denoted as $\mathbf{x}(t)\triangleq(x(t),\dot{x}(t),\theta(t),\dot{\theta}(t),m_{c}(t))^{\top}$. The dynamics of the cart-pole system are governed by the non-linear dynamic equations in (29),

$$
\begin{cases}
2M_{p}gL_{p}\sin\theta(t)=(2I+M_{p}L_{p}^{2})\ddot{\theta}(t)+M_{p}L_{p}\cos\theta(t)\,\ddot{x}(t)+2c\dot{\theta}(t), & \text{Rotational dynamics (pendulum)},\\
M_{p}L_{p}\sin\theta(t)\,\dot{\theta}(t)^{2}=(M_{c}+M_{p}+m_{c}(t))\ddot{x}(t)+b\dot{x}(t)+M_{p}L_{p}\cos\theta(t)\,\ddot{\theta}(t)-u_{M}(t), & \text{Horizontal force balance (cart)},
\end{cases}
\tag{29}
$$

where $M_{p}=2$ kg and $M_{c}=10$ kg are the mass of the pole and cart, respectively; $g=9.8$ m/s$^{2}$ is the gravitational acceleration; $L_{p}=6$ m is the length of the pole; $I=\frac{M_{p}L_{p}^{2}}{4}$ is the moment of inertia for a point mass in terms of the center of the pole; $u_{M}(t)$ is the force applied to the cart by the machine controller in (3); $c=b=0.1$ are the damping coefficients for the pole and cart, respectively. For the dynamics of the unknown weight on the cart, $m_{c}(t)$, we assume that once the weight is successfully removed by the actuator remotely controlled by the human operator, it will reappear on the cart after a random time interval; otherwise, the unknown weight will remain on the cart continuously. Thus, $m_{c}(t)$ has the following updating rule

$$
m_{c}(t+1)=
\begin{cases}
m_{c}(t)+u_{H}(t), & \text{for }m_{c}(t)\neq 0,\\
m_{c}(t)+w(t), & \text{otherwise,}
\end{cases}
\tag{30}
$$

where $w(t)\in\{0,5\}$ is randomly generated and $u_{H}(t)$ is the human control input. The sampling period is $T_{s}=0.05$ s. $\ddot{x}(t)$ and $\ddot{\theta}(t)$ can be derived from (29) given $\mathbf{x}(t)$. Then, by leveraging $\mathbf{x}(t)$, $\ddot{x}(t)$, $\ddot{\theta}(t)$, $T_{s}$, and (30), we can get $\mathbf{x}(t+1)$, indicating that the proposed cart-pole system follows (1). The initial state is $\mathbf{x}(0)=(0,0,\frac{\pi}{6},0,5)^{\top}$.
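The state update described above can be sketched as follows: (29) is linear in $\ddot{\theta}(t)$ and $\ddot{x}(t)$, so the two accelerations are obtained by solving a 2×2 system, after which the state is advanced by one sampling period. The integration order (velocities first, then positions) is an assumption chosen to match the Euler form used in (31); the function names are illustrative.

```python
import math

# Physical parameters as given in the text (SI units).
M_P, M_C, G, L_P = 2.0, 10.0, 9.8, 6.0
I_P = M_P * L_P ** 2 / 4.0
C_DAMP, B_DAMP, T_S = 0.1, 0.1, 0.05

def accelerations(state, u_m):
    """Solve (29) for (x_ddot, theta_ddot) given the state and machine input."""
    _, x_dot, th, th_dot, m_c = state
    a11 = 2 * I_P + M_P * L_P ** 2      # coefficient of theta_ddot, pendulum equation
    a12 = M_P * L_P * math.cos(th)      # coupling coefficient (appears in both equations)
    a22 = M_C + M_P + m_c               # coefficient of x_ddot, cart equation
    r1 = 2 * M_P * G * L_P * math.sin(th) - 2 * C_DAMP * th_dot
    r2 = M_P * L_P * math.sin(th) * th_dot ** 2 - B_DAMP * x_dot + u_m
    det = a11 * a22 - a12 * a12
    th_ddot = (r1 * a22 - a12 * r2) / det   # Cramer's rule
    x_ddot = (a11 * r2 - a12 * r1) / det
    return x_ddot, th_ddot

def step(state, u_m, u_h=0.0, w=0.0):
    """Advance the state by one sampling period; the weight follows (30)."""
    x, x_dot, th, th_dot, m_c = state
    x_ddot, th_ddot = accelerations(state, u_m)
    x_dot_next = x_dot + T_S * x_ddot
    th_dot_next = th_dot + T_S * th_ddot
    x_next = x + T_S * x_dot_next           # positions use the updated velocities
    th_next = th + T_S * th_dot_next
    m_c_next = m_c + u_h if m_c != 0 else m_c + w
    return (x_next, x_dot_next, th_next, th_dot_next, m_c_next)
```

Starting from the initial state given in the text with no machine input, the pole angle grows at the first step, which is the behaviour the controller has to counteract.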

IV-A2 Control policies

In this experiment, $m_{c}(t)$ is unknown to the machine controller, but all other states, parameters, and the system dynamics in (29) (excluding $m_{c}(t)$) are known. The machine control policy seeks to achieve a decaying angle as per $\theta(t+1)=\eta\theta(t)$ by applying force $u_{M}(t)$, where $\eta\in(0,1)$. Using the Euler approximation to update the angle, we obtain

$$
\theta(t+1)=\dot{\theta}(t+1)T_{s}+\theta(t)=\eta\theta(t).
\tag{31}
$$

The updated angular velocity $\dot{\theta}(t+1)$ is also obtained by using the Euler approximation, i.e., $\dot{\theta}(t+1)=\dot{\theta}(t)+T_{s}\ddot{\theta}(t)$. A smaller $\eta$ leads to a control policy that drives $\theta(t)\rightarrow 0$ faster; we set $\eta=0.7$ in the experiment. By leveraging (31) to determine $\ddot{\theta}(t)$ and (29) with $m_{c}(t)=0$, the machine control policy can be obtained as (32),

$$
\begin{aligned}
f_{M}\left(\mathbf{x}(t)\right)={}&\frac{2M_{p}gL_{p}\sin(\theta(t))(M_{c}+M_{p})}{M_{p}L_{p}\cos(\theta(t))}-\frac{2c\dot{\theta}(t)(M_{c}+M_{p})}{M_{p}L_{p}\cos(\theta(t))}+b\dot{x}(t)-M_{p}L_{p}\dot{\theta}(t)^{2}\sin(\theta(t))\\
&-\frac{(\eta-1)\theta(t)\Gamma(t)}{T_{s}^{2}M_{p}L_{p}\cos(\theta(t))}+\frac{\dot{\theta}(t)\Gamma(t)}{T_{s}M_{p}L_{p}\cos(\theta(t))},
\end{aligned}
\tag{32}
$$

where $\Gamma(t)\triangleq(M_{c}+M_{p})(2I+M_{p}L_{p}^{2})-(M_{p}L_{p}\cos(\theta(t)))^{2}$. Recall that the human control policy is to remove the unknown weight on the cart if the human operator observes it through visual feedback. Thus, the human control policy is $f_{H}(\mathbf{x}(t))=-m_{c}(t)$.
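A direct way to check the machine control policy is to apply the force from (32) to the dynamics (29) with $m_{c}(t)=0$ and confirm that one Euler step yields $\theta(t+1)=\eta\theta(t)$. The sketch below does this; it assumes the velocity-then-position Euler update implied by (31), and the function names are hypothetical.

```python
import math

# Physical parameters as given in the text (SI units).
M_P, M_C, G, L_P = 2.0, 10.0, 9.8, 6.0
I_P = M_P * L_P ** 2 / 4.0
C_DAMP, B_DAMP, T_S = 0.1, 0.1, 0.05

def machine_policy(state, eta=0.7):
    """Force from (32); the weight m_c is ignored because it is unknown to the controller."""
    _, x_dot, th, th_dot, _ = state
    k = M_P * L_P * math.cos(th)
    gamma = (M_C + M_P) * (2 * I_P + M_P * L_P ** 2) - k ** 2
    return (2 * M_P * G * L_P * math.sin(th) * (M_C + M_P) / k
            - 2 * C_DAMP * th_dot * (M_C + M_P) / k
            + B_DAMP * x_dot
            - M_P * L_P * th_dot ** 2 * math.sin(th)
            - (eta - 1) * th * gamma / (T_S ** 2 * k)
            + th_dot * gamma / (T_S * k))

def pole_acceleration(state, u_m):
    """theta_ddot obtained by solving the two equations in (29)."""
    _, x_dot, th, th_dot, m_c = state
    a11 = 2 * I_P + M_P * L_P ** 2
    a12 = M_P * L_P * math.cos(th)
    a22 = M_C + M_P + m_c
    r1 = 2 * M_P * G * L_P * math.sin(th) - 2 * C_DAMP * th_dot
    r2 = M_P * L_P * math.sin(th) * th_dot ** 2 - B_DAMP * x_dot + u_m
    return (r1 * a22 - a12 * r2) / (a11 * a22 - a12 ** 2)

def next_angle(state, u_m):
    """Pole angle after one sampling period, using the Euler update in (31)."""
    _, _, th, th_dot, _ = state
    th_dot_next = th_dot + T_S * pole_acceleration(state, u_m)
    return th + T_S * th_dot_next
```

When a nonzero weight is present, the same force no longer produces the target decay exactly, which is the model mismatch that the human operator removes in the collaborative setting.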

IV-B WHMC Control Performance

Definition 2 (Collaborative Control Performance).

The control performance of a WHMC system at each time step is evaluated by a cost function $J:\mathbb{R}^{l_{s}}\rightarrow\mathbb{R}_{\geq 0}$, which is defined as

$$
J(t)=\mathbf{x}(t)^{\top}\mathbf{P}\mathbf{x}(t),
\tag{33}
$$

where $\mathbf{P}\in\mathbb{R}^{l_{s}\times l_{s}}$ is a positive diagonal matrix to individually penalize the states of interest. A smaller control cost indicates a better control performance.

In the experiment described in Section IV-A, the objective of the WHMC system is to balance the pole (i.e., to keep $\theta(t)$ close to zero). Thus, we are only interested in the angle of the pole, resulting in a cost function $J(t)=(\theta(t))^{2}$. The control costs of the machine control only case, the human control only case, and the WHMC case are shown in Fig. 7. The human operator's objective is to remove the weight, not to balance the pole. Only the machine controller handles pole balancing. Without machine control inputs, the control cost increases. Both the WHMC and machine-only cases can reduce the cost over time, with their stability guaranteed, as will be further discussed in Section IV-C3. Compared to the machine-only case, the WHMC case shows a faster decrease in cost, demonstrating the importance of WHMC.

Figure 7: The control cost of the cart-pole system.

IV-C WHMC System Stability

IV-C1 Estimation of control system parameters

Since the control objective is to balance the pole, the Lyapunov-like function $V(\cdot)$ is defined as

$$
V\left(\theta(t)\right)=
\begin{cases}
|\theta(t)|, & \text{for }|\theta(t)|\geq 0.05,\\
0, & \text{otherwise,}
\end{cases}
\tag{34}
$$

where the threshold of 0.05 is set to eliminate the impacts of uncontrolled $\dot{\theta}(t)$ on the control objective $\theta(t)$ when $\theta(t)$ reaches the desired zero point. To estimate the four control system parameters, i.e., $\alpha_{HM}$, $\alpha_{M}$, $\alpha_{H}$, and $\alpha$, we collect data by conducting the experiment in four cases, i.e., no control inputs, machine control only, human control only, and human-machine collaborative control, respectively. The parameter in each case is estimated by $\max\frac{V(\theta(t+1))}{V(\theta(t))}$ based on the corresponding data set, where $V(\cdot)$ is defined in (34). The estimated four control system parameters are $\alpha_{HM}=0.5271$, $\alpha_{M}=0.7949$, $\alpha_{H}=1.0196$, and $\alpha=1.0134$.
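The estimation rule above can be sketched in a few lines. The version below takes a recorded angle trajectory for one of the four cases and returns the largest one-step ratio of $V$ in (34). Skipping time steps where $V(\theta(t))=0$, so that the ratio is well defined, is an assumption of this sketch rather than something stated in the text; the function names are illustrative.

```python
def lyapunov_value(theta, threshold=0.05):
    """Lyapunov-like function in (34): |theta| above the threshold, else 0."""
    return abs(theta) if abs(theta) >= threshold else 0.0

def estimate_gain(thetas):
    """Largest ratio V(theta(t+1)) / V(theta(t)) over a recorded trajectory.

    Steps with V(theta(t)) = 0 are skipped (assumption of this sketch).
    Returns None if no step has a positive V(theta(t)).
    """
    ratios = []
    for th_now, th_next in zip(thetas, thetas[1:]):
        v_now = lyapunov_value(th_now)
        if v_now > 0.0:
            ratios.append(lyapunov_value(th_next) / v_now)
    return max(ratios) if ratios else None
```

Because the estimate is a maximum over the data, it is sensitive to single outlying steps, so longer recordings tend to give more conservative (larger) gains.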

IV-C2 Estimation of human model

To reduce the estimation complexity, we quantize the human control lag to the two-state set $\{3,7\}$, which corresponds to 0.15 s and 0.35 s. The state transition matrix, estimated by the maximum likelihood approach, is

$$\mathbf{M}=\begin{bmatrix}0.2576&0.7424\\ 0.4404&0.5596\end{bmatrix}. \tag{35}$$

The corresponding stationary probability distribution is $(0.3723,0.6277)$, i.e., $\mathbb{P}\left[\tau_{H}(t^{\prime})=0.15\right]=0.3723$ and $\mathbb{P}\left[\tau_{H}(t^{\prime})=0.35\right]=0.6277$.
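As an illustration, the maximum likelihood estimate of a Markov transition matrix reduces to normalized transition counts, and the stationary distribution of a two-state chain has a closed form. The sketch below uses hypothetical function names and takes the matrix in (35) as input to recover the stationary distribution reported above.

```python
def estimate_transition_matrix(lags, states=(3, 7)):
    """Maximum likelihood estimate of a Markov chain: normalized transition counts."""
    index = {s: i for i, s in enumerate(states)}
    n = len(states)
    counts = [[0] * n for _ in range(n)]
    for cur, nxt in zip(lags, lags[1:]):
        counts[index[cur]][index[nxt]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("no transition observed out of one state")
        matrix.append([c / total for c in row])
    return matrix


def stationary_two_state(matrix):
    """Stationary distribution of a two-state chain, solving pi = pi M."""
    a = matrix[0][1]  # probability of moving from state 0 to state 1
    b = matrix[1][0]  # probability of moving from state 1 to state 0
    return (b / (a + b), a / (a + b))


# Transition matrix reported in (35).
M = [[0.2576, 0.7424], [0.4404, 0.5596]]
pi = stationary_two_state(M)
```

Rounded to four decimals, `pi` matches the distribution $(0.3723, 0.6277)$ stated above.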

IV-C3 Stability of the cart-pole system

Based on the above estimates and the parameters in Table I, the left-hand side of the stability condition in Theorem 1 is $0.3539<1$, indicating a stable WHMC system. In the machine-only control scenario, the left-hand side of the stability condition (27) is $0.8594<1$, indicating a stochastically stable system. Conversely, in the human-control-only case, the left-hand side of the stability condition (26) is $3.3088>1$, signifying an unstable system. This instability also explains the increasing control cost observed in Fig. 7.

V Conclusions

We have developed a foundational WHMC model that integrates dual wireless loops for both machine and human control, addressing the intricate challenges associated with WHMC systems. By introducing a novel stochastic cycle-cost-based approach, we have derived a stability condition that accounts for the complexities of wireless communication, human behavior, and control system dynamics. Our approach has been validated through extensive numerical analysis and the creation of a new case study, demonstrating its practical effectiveness. These contributions offer a strong basis for advancing WHMC systems in increasingly complex and dynamic environments.

Acknowledgments

The authors would like to express their sincere gratitude to Dr. Anuradha Annaswamy, Director of the Active-Adaptive Control Laboratory at MIT, for her valuable comments on this paper. Her insightful feedback and suggestions have been instrumental in improving the clarity and rigor of this work.

Appendix A Proof of Theorem 1

The time step of the $n$th closed-loop human control is denoted by $t=k_{n}$, as shown in Fig. 3. Let $m_{C}$ and $m_{O}$ denote the numbers of occurrences of case two and case four defined in Assumption 1 between $t=k_{n}$ and $t=k_{n+1}$, respectively. Then we have

$$V(\mathbf{x}(k_{n}+l))\leq\alpha_{M}^{m_{C}}\alpha^{m_{O}}V(\mathbf{x}(k_{n})), \tag{36}$$

and

$$\mathbb{E}\left[V(\mathbf{x}(k_{n}+l))\right]\leq\mathbb{E}\left[\alpha_{M}^{m_{C}}\alpha^{m_{O}}\right]\mathbb{E}\left[V(\mathbf{x}(k_{n}))\right]. \tag{37}$$

Since

$$\mathbb{E}\left[\alpha_{M}^{m_{C}}\alpha^{m_{O}}\mid m_{C}+m_{O}=l\right]=\left(\alpha_{M}\left(1-\bar{p}_{M}\right)+\alpha\bar{p}_{M}\right)^{l}, \tag{38}$$

the sum of $V(\cdot)$ between two adjacent closed human control loops satisfies

$$\sum_{t=k_{n}}^{k_{n+1}-1}\mathbb{E}\left[V(\mathbf{x}(t))\right]\leq\sum_{l=0}^{k_{n+1}-k_{n}-1}\Omega^{l}\mathbb{E}\left[V(\mathbf{x}(k_{n}))\right], \tag{39}$$

where

$$\Omega\triangleq\alpha_{M}\left(1-\bar{p}_{M}\right)+\alpha\bar{p}_{M}.$$
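The identity in (38) can be checked numerically. The sketch below assumes that each of the $l$ steps is independently case four with probability $\bar{p}_{M}$ and case two otherwise, so that $m_{O}$ is binomial; this independence is an assumption made for the illustration, and the value of $\bar{p}_{M}$ is hypothetical. Under that assumption, (38) is the binomial theorem.

```python
from math import comb, isclose


def conditional_expectation(alpha_m, alpha, p_m, l):
    """E[alpha_M^{m_C} alpha^{m_O} | m_C + m_O = l] with m_O ~ Binomial(l, p_m)."""
    return sum(
        comb(l, m_o) * p_m**m_o * (1 - p_m) ** (l - m_o)
        * alpha_m ** (l - m_o) * alpha**m_o
        for m_o in range(l + 1)
    )


def omega(alpha_m, alpha, p_m):
    """Per-step factor Omega = alpha_M (1 - p_M) + alpha p_M."""
    return alpha_m * (1 - p_m) + alpha * p_m


alpha_m, alpha = 0.7949, 1.0134  # estimates reported in Section IV-C1
p_m = 0.1                        # hypothetical value of \bar{p}_M
checks = [
    isclose(conditional_expectation(alpha_m, alpha, p_m, l),
            omega(alpha_m, alpha, p_m) ** l)
    for l in range(8)
]
```

Every entry of `checks` should be `True`, i.e., the binomial sum and $\Omega^{l}$ agree for each tested $l$.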

By further processing (39), we have

$$\sum_{t=k_{n}}^{k_{n+1}-1}\mathbb{E}\left[V(\mathbf{x}(t))\right]\leq\frac{1-\Omega^{L-1}}{1-\Omega}\mathbb{E}\left[V(\mathbf{x}(k_{n}))\right]. \tag{40}$$

Then,

$$\sum_{t=0}^{\infty}\mathbb{E}\left[V(\mathbf{x}(t))\right]\leq\sum_{n=0}^{\infty}\frac{1-\Omega^{L-1}}{1-\Omega}\mathbb{E}\left[V(\mathbf{x}(k_{n}))\right]. \tag{41}$$

Let $h_{C}$ and $h_{O}$ denote the numbers of occurrences of case one and case three defined in Assumption 1 between $t=0$ and $t=k_{n}$, respectively. In this time interval, $\hat{m}_{C}$ and $\hat{m}_{O}$ denote the numbers of occurrences of case two and case four, respectively. Then,

$$\mathbb{E}\left[V(\mathbf{x}(k_{n}))\right]\leq\mathbb{E}\left[\alpha_{M}^{\hat{m}_{C}}\alpha^{\hat{m}_{O}}\alpha_{HM}^{h_{C}}\alpha_{H}^{h_{O}}\right]\mathbb{E}\left[V(\mathbf{x}(0))\right]. \tag{42}$$

It can be further processed as

$$\mathbb{E}\left[V(\mathbf{x}(k_{n}))\right]\leq\mathbb{E}\left[\Omega^{nL}\right]\mathbb{E}\left[\alpha_{HM}^{h_{C}}\alpha_{H}^{h_{O}}\right]\mathbb{E}\left[V(\mathbf{x}(0))\right]. \tag{43}$$

Since

$$\mathbb{E}\left[\alpha_{HM}^{h_{C}}\alpha_{H}^{h_{O}}\mid h_{C}+h_{O}=n\right]=\left(\alpha_{HM}\left(1-\bar{p}_{M}\right)+\alpha_{H}\bar{p}_{M}\right)^{n}, \tag{44}$$

we have

$$\mathbb{E}\left[V(\mathbf{x}(k_{n}))\right]\leq\mathbb{E}\left[\Omega^{nL}\right]\Lambda^{n}\mathbb{E}\left[V(\mathbf{x}(0))\right], \tag{45}$$

where

$$\Lambda\triangleq\alpha_{HM}\left(1-\bar{p}_{M}\right)+\alpha_{H}\bar{p}_{M}.$$

By leveraging (41) and (45), we have

$$\sum_{t=0}^{\infty}\mathbb{E}\left[V(\mathbf{x}(t))\right]\leq\sum_{n=0}^{\infty}\frac{1-\Omega^{L-1}}{1-\Omega}\mathbb{E}\left[\Omega^{nL}\right]\Lambda^{n}\mathbb{E}\left[V(\mathbf{x}(0))\right]. \tag{46}$$

Since $\mathbb{E}\left[V(\mathbf{x}(0))\right]<\infty$, to ensure $\sum_{t=0}^{\infty}\mathbb{E}\left[V\left(\mathbf{x}(t)\right)\right]<\infty$, we need

$$\sum_{n=0}^{\infty}\frac{1-\Omega^{L-1}}{1-\Omega}\mathbb{E}\left[\Omega^{nL}\right]\Lambda^{n}<\infty. \tag{47}$$

Let

$$\mathbb{E}\left[\Xi(n)\right]\triangleq\mathbb{E}\left[\frac{1-\Omega^{L-1}}{1-\Omega}\Omega^{nL}\right]\Lambda^{n}. \tag{48}$$

Then, we have

$$\mathbb{E}\left[\Xi(n+1)\right]=\mathbb{E}\left[\frac{1-\Omega^{L-1}}{1-\Omega}\Omega^{(n+1)L}\right]\Lambda^{n+1}, \tag{49}$$

and

$$\mathbb{E}\left[\Xi(n+1)\right]=\mathbb{E}\left[\Omega^{L}\right]\Lambda\,\mathbb{E}\left[\Xi(n)\right]. \tag{50}$$

By (50), consecutive terms of the series in (47) have the constant ratio $\mathbb{E}\left[\Omega^{L}\right]\Lambda$, so the series converges when this ratio is less than one. This yields (22) as the stability condition of the WHMC system.
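To make the quantity $\mathbb{E}\left[\Omega^{L}\right]\Lambda$ concrete, the following minimal sketch evaluates it for a finite distribution of $L$. The function names, the value of $\bar{p}_{M}$, and the distribution of $L$ are hypothetical and are not the parameters of Table I, so the result is not expected to reproduce the value reported in Section IV-C3.

```python
def omega(alpha_m, alpha, p_m):
    """Omega = alpha_M (1 - p_M) + alpha p_M."""
    return alpha_m * (1 - p_m) + alpha * p_m


def lam(alpha_hm, alpha_h, p_m):
    """Lambda = alpha_HM (1 - p_M) + alpha_H p_M."""
    return alpha_hm * (1 - p_m) + alpha_h * p_m


def stability_lhs(alpha_hm, alpha_m, alpha_h, alpha, p_m, l_pmf):
    """E[Omega^L] * Lambda for a finite pmf of L given as {value: probability}."""
    if abs(sum(l_pmf.values()) - 1.0) > 1e-9:
        raise ValueError("probabilities of L must sum to one")
    om = omega(alpha_m, alpha, p_m)
    expected_omega_l = sum(prob * om**l for l, prob in l_pmf.items())
    return expected_omega_l * lam(alpha_hm, alpha_h, p_m)


# Control parameters estimated in Section IV-C1.
alpha_hm, alpha_m, alpha_h, alpha = 0.5271, 0.7949, 1.0196, 1.0134
# Hypothetical channel and cycle-length statistics, for illustration only.
p_m = 0.1
l_pmf = {2: 0.5, 4: 0.5}

lhs = stability_lhs(alpha_hm, alpha_m, alpha_h, alpha, p_m, l_pmf)
is_stable = lhs < 1.0
```

With these illustrative inputs the ratio is below one, so the sufficient condition would be met.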

Appendix B Proof of Corollary 1

According to (22), the boundary of the stability region is

$$\mathbb{E}\left[\Omega^{L}\right]\Lambda=1. \tag{51}$$

(i) When the control system parameters $\alpha_{M}$ and $\alpha$ are fixed, solving (51) for $\alpha_{HM}$ gives

$$\alpha_{HM}=\frac{1}{\mathbb{E}\left[\Omega^{L}\right]\left(1-\bar{p}_{M}\right)}-\frac{\bar{p}_{M}}{1-\bar{p}_{M}}\alpha_{H}, \tag{52}$$

which shows that $\alpha_{HM}$ is linear in $\alpha_{H}$ on the boundary.

(ii) When the control system parameters $\alpha_{HM}$ and $\alpha_{H}$ are fixed, by further processing (51), we have

$$\sum_{l=1}^{\bar{L}}\Omega^{l}\mathbb{P}\left[L=l\right]=\sum_{l=1}^{\bar{L}}\frac{1}{\bar{L}\Lambda}, \tag{53}$$

which is a sum of $\bar{L}\triangleq\mathbb{E}\left[L\right]$ linear equations and can be represented as

$$\alpha_{M}=-\frac{\bar{p}_{M}}{1-\bar{p}_{M}}\alpha+\frac{1}{\left(1-\bar{p}_{M}\right)\bar{L}}\sum_{l=1}^{\bar{L}}\left(\bar{L}\Lambda\mathbb{P}\left[L=l\right]\right)^{-l}. \tag{54}$$

Thus, the linearity between $\alpha_{M}$ and $\alpha$ is demonstrated.

(iii) For any other pair of control system parameters, we take one of the $\bar{L}$ equations in (53) and show that it is convex in that pair. In particular, we have

$$\Omega^{l}\mathbb{P}\left[L=l\right]=\frac{1}{\bar{L}\Lambda}. \tag{55}$$

Then, it can be represented as

$$\left(\alpha_{M}\left(1-\bar{p}_{M}\right)+\alpha\bar{p}_{M}\right)^{l}=\frac{\frac{1}{\bar{L}\mathbb{P}\left[L=l\right]}}{\alpha_{HM}\left(1-\bar{p}_{M}\right)+\alpha_{H}\bar{p}_{M}}. \tag{56}$$

Take the pair $\alpha$ and $\alpha_{H}$ as an example, with $\alpha_{M}$ and $\alpha_{HM}$ fixed. Then, (56) can be represented as

$$\alpha=\frac{1}{\bar{p}_{M}}\left(\left(\frac{\frac{1}{\bar{L}\mathbb{P}\left[L=l\right]}}{\alpha_{HM}\left(1-\bar{p}_{M}\right)+\alpha_{H}\bar{p}_{M}}\right)^{\frac{1}{l}}-\alpha_{M}\left(1-\bar{p}_{M}\right)\right). \tag{57}$$

The first-order derivative $\dot{\alpha}(\alpha_{H})$ is

$$\dot{\alpha}(\alpha_{H})=-\frac{\left(\alpha_{HM}\left(1-\bar{p}_{M}\right)+\alpha_{H}\bar{p}_{M}\right)^{-1-\frac{1}{l}}}{l\left(\bar{L}\mathbb{P}\left[L=l\right]\right)^{\frac{1}{l}}}. \tag{58}$$

The second-order derivative $\ddot{\alpha}(\alpha_{H})$ is

$$\ddot{\alpha}(\alpha_{H})=\frac{\left(1+\frac{1}{l}\right)\bar{p}_{M}\left(\alpha_{HM}\left(1-\bar{p}_{M}\right)+\alpha_{H}\bar{p}_{M}\right)^{-2-\frac{1}{l}}}{l\left(\bar{L}\mathbb{P}\left[L=l\right]\right)^{\frac{1}{l}}}. \tag{59}$$

We note that $\bar{p}_{M}\in(0,1)$, $\alpha_{H}\geq 0$, and $\alpha_{HM}\geq 0$. Thus, $\ddot{\alpha}(\alpha_{H})\geq 0$, so (57) is convex in $\alpha_{H}$. Since a sum of convex functions is convex, (53) is convex and has a concave stability boundary. The remaining pairs other than those in (i) and (ii) can be handled by the same analysis.
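Cases (i) and (iii) can be checked numerically. The sketch below evaluates the linear boundary (52) on an equally spaced grid of $\alpha_{H}$ and compares the closed-form second derivative (59) with a central finite difference of (57). All numerical inputs other than $\alpha_{HM}$ and $\alpha_{M}$ are hypothetical, and the symbol `c` stands for the product $\bar{L}\,\mathbb{P}\left[L=l\right]$.

```python
def alpha_hm_boundary(alpha_h, e_omega_l, p_m):
    """alpha_HM on the stability boundary as a function of alpha_H, from (52)."""
    return 1.0 / (e_omega_l * (1.0 - p_m)) - p_m / (1.0 - p_m) * alpha_h


def alpha_boundary(alpha_h, alpha_hm, alpha_m, p_m, l, c):
    """alpha as a function of alpha_H, from (57); c = bar{L} * P[L = l]."""
    lam = alpha_hm * (1.0 - p_m) + alpha_h * p_m
    return ((1.0 / (c * lam)) ** (1.0 / l) - alpha_m * (1.0 - p_m)) / p_m


def alpha_second_derivative(alpha_h, alpha_hm, p_m, l, c):
    """Closed-form second derivative of alpha with respect to alpha_H, from (59)."""
    lam = alpha_hm * (1.0 - p_m) + alpha_h * p_m
    return (1.0 + 1.0 / l) * p_m * lam ** (-2.0 - 1.0 / l) / (l * c ** (1.0 / l))


alpha_hm, alpha_m = 0.5271, 0.7949   # estimates from Section IV-C1
p_m, e_omega_l = 0.1, 0.6            # hypothetical values
l, c = 3, 0.5                        # hypothetical values
grid = [0.5, 1.0, 1.5, 2.0]

# Case (i): equal steps in alpha_H give equal steps in alpha_HM (a straight line).
line = [alpha_hm_boundary(a, e_omega_l, p_m) for a in grid]
steps = [line[i + 1] - line[i] for i in range(len(line) - 1)]

# Case (iii): central finite difference of (57) versus the closed form (59).
h = 1e-3
finite_diff = [
    (alpha_boundary(a + h, alpha_hm, alpha_m, p_m, l, c)
     - 2.0 * alpha_boundary(a, alpha_hm, alpha_m, p_m, l, c)
     + alpha_boundary(a - h, alpha_hm, alpha_m, p_m, l, c)) / h**2
    for a in grid
]
closed_form = [alpha_second_derivative(a, alpha_hm, p_m, l, c) for a in grid]
```

The constant entries of `steps` reflect the linearity in (i), while positive entries of `closed_form` that agree with `finite_diff` reflect the convexity argued in (iii).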

Appendix C Proof of Proposition 1

We also leverage the stochastic cycle-based approach in Section III-B1. Assume that the time steps of two adjacent closed machine control loops are $k_{n}$ and $k_{n+1}$. Then, we have

$$V(\mathbf{x}(k_{n}+l))\leq\alpha^{l}V(\mathbf{x}(k_{n})), \tag{60}$$

and

$$\mathbb{E}\left[V(\mathbf{x}(k_{n}+l))\right]\leq\alpha^{l}\mathbb{E}\left[V(\mathbf{x}(k_{n}))\right]. \tag{61}$$

The sum of $V(\cdot)$ between two adjacent closed machine control loops satisfies

$$\sum_{t=k_{n}}^{k_{n+1}-1}\mathbb{E}\left[V(\mathbf{x}(t))\right]\leq\sum_{l=0}^{k_{n+1}-k_{n}-1}\alpha^{l}\mathbb{E}\left[V(\mathbf{x}(k_{n}))\right]. \tag{62}$$

By further processing the above inequality, we have

$$\sum_{t=k_{n}}^{k_{n+1}-1}\mathbb{E}\left[V(\mathbf{x}(t))\right]\leq\frac{1-\alpha^{\hat{L}-1}}{1-\alpha}\mathbb{E}\left[V(\mathbf{x}(k_{n}))\right]. \tag{63}$$

Then,

$$\sum_{t=0}^{\infty}\mathbb{E}\left[V(\mathbf{x}(t))\right]\leq\sum_{n=0}^{\infty}\frac{1-\alpha^{\hat{L}-1}}{1-\alpha}\mathbb{E}\left[V(\mathbf{x}(k_{n}))\right]. \tag{64}$$

Since

$$\mathbb{E}\left[V(\mathbf{x}(k_{n}))\right]\leq\alpha_{M}^{n}\mathbb{E}\left[\alpha^{n\left(\hat{L}-1\right)}\right]\mathbb{E}\left[V(\mathbf{x}(0))\right], \tag{65}$$

we have

$$\sum_{t=0}^{\infty}\mathbb{E}\left[V(\mathbf{x}(t))\right]\leq\sum_{n=0}^{\infty}\frac{1-\alpha^{\hat{L}-1}}{1-\alpha}\alpha_{M}^{n}\mathbb{E}\left[\alpha^{n\left(\hat{L}-1\right)}\right]\mathbb{E}\left[V(\mathbf{x}(0))\right]. \tag{66}$$

Since $\mathbb{E}\left[V(\mathbf{x}(0))\right]<\infty$, to ensure $\sum_{t=0}^{\infty}\mathbb{E}\left[V\left(\mathbf{x}(t)\right)\right]<\infty$, we need

$$\sum_{n=0}^{\infty}\frac{1-\alpha^{\hat{L}-1}}{1-\alpha}\alpha_{M}^{n}\mathbb{E}\left[\alpha^{n\left(\hat{L}-1\right)}\right]<\infty. \tag{67}$$

Let

$$\mathbb{E}\left[\Xi(n)\right]\triangleq\mathbb{E}\left[\frac{1-\alpha^{\hat{L}-1}}{1-\alpha}\alpha^{n\left(\hat{L}-1\right)}\right]\alpha_{M}^{n}. \tag{68}$$

Then we have

$$\mathbb{E}\left[\Xi(n+1)\right]=\mathbb{E}\left[\frac{1-\alpha^{\hat{L}-1}}{1-\alpha}\alpha^{(n+1)\left(\hat{L}-1\right)}\right]\alpha_{M}^{n+1}, \tag{69}$$

and therefore

$$\mathbb{E}\left[\Xi(n+1)\right]=\alpha_{M}\mathbb{E}\left[\alpha^{\hat{L}-1}\right]\mathbb{E}\left[\Xi(n)\right]=\frac{\alpha_{M}}{\alpha}\mathbb{E}\left[\alpha^{\hat{L}}\right]\mathbb{E}\left[\Xi(n)\right]. \tag{70}$$

By (70), consecutive terms of the series in (67) have a constant ratio, so the series converges when this ratio is less than one. This yields the stability condition in (27).
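The per-cycle ratio in (70) can be evaluated for a given distribution of $\hat{L}$, as in the minimal sketch below. The function name and the distribution of $\hat{L}$ are hypothetical and do not correspond to the parameters of Table I, so the result is not expected to match the value reported in Section IV-C3.

```python
from math import isclose


def machine_only_ratio(alpha_m, alpha, l_hat_pmf):
    """(alpha_M / alpha) * E[alpha^{hat L}], the per-cycle ratio in (70).

    l_hat_pmf maps each possible value of hat L to its probability.
    """
    if abs(sum(l_hat_pmf.values()) - 1.0) > 1e-9:
        raise ValueError("probabilities of hat L must sum to one")
    expected = sum(prob * alpha**l for l, prob in l_hat_pmf.items())
    return alpha_m / alpha * expected


alpha_m, alpha = 0.7949, 1.0134            # estimates from Section IV-C1
l_hat_pmf = {1: 0.80, 2: 0.15, 3: 0.05}    # hypothetical distribution of hat L

ratio = machine_only_ratio(alpha_m, alpha, l_hat_pmf)

# The first form in (70), alpha_M * E[alpha^{hat L - 1}], should give the same value.
first_form = alpha_m * sum(p * alpha ** (l - 1) for l, p in l_hat_pmf.items())
forms_agree = isclose(ratio, first_form)
```

When every cycle has length one, the ratio reduces to $\alpha_{M}$, which is a quick sanity check on the implementation.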

References

  • [1] H. Lasi, P. Fettke, H.-G. Kemper, T. Feld, and M. Hoffmann, “Industry 4.0,” Bus. Inf. Syst. Eng., vol. 6, pp. 239–242, 2014.
  • [2] S. Kumar, C. Savur, and F. Sahin, “Survey of human-robot collaboration in industrial settings: Awareness, intelligence, and compliance,” IEEE Trans. Syst. Man Cybern. Syst., vol. 51, no. 1, pp. 280–297, 2020.
  • [3] P. K. R. Maddikunta, Q.-V. Pham, P. B, N. Deepa, K. Dev, T. R. Gadekallu, R. Ruby, and M. Liyanage, “Industry 5.0: A survey on enabling technologies and potential applications,” J. Ind. Infor. Integr., vol. 26, 2021, Art. no. 100257.
  • [4] I. Kardush, S. Kim, and E. Wong, “A techno-economic study of Industry 5.0 enterprise deployments for human-to-machine communications,” IEEE Commun. Mag., vol. 60, no. 12, pp. 74–80, 2022.
  • [5] A. P. Dani, I. Salehi, G. Rotithor, D. Trombetta, and H. Ravichandar, “Human-in-the-loop robot control for human-robot collaboration: Human intention estimation and safe trajectory tracking control for collaborative tasks,” IEEE Control Syst. Mag., vol. 40, no. 6, pp. 29–56, 2020.
  • [6] Z. Lu, Y. Guan, and N. Wang, “An adaptive fuzzy control for human-in-the-loop operations with varying communication time delays,” IEEE Robot. Autom. Lett., vol. 7, no. 2, pp. 5599–5606, 2022.
  • [7] F. Mars and P. Chevrel, “Modelling human control of steering for the design of advanced driver assistance systems,” Annu. Rev. Control, vol. 44, pp. 292–302, 2017.
  • [8] H. Kress-Gazit, K. Eder, G. Hoffman, H. Admoni, B. Argall, R. Ehlers, C. Heckman, N. Jansen, R. Knepper, J. Křetínský, S. Levy-Tzedek, J. Li, T. Murphey, L. Riek, and D. Sadigh, “Formalizing and guaranteeing human-robot interaction,” Commun. ACM, vol. 64, no. 9, p. 78–84, 2021.
  • [9] J. v. Oosterhout, J. G. W. Wildenbeest, H. Boessenkool, C. J. M. Heemskerk, M. R. d. Baar, F. C. T. v. d. Helm, and D. A. Abbink, “Haptic shared control in tele-manipulation: Effects of inaccuracies in guidance on task execution,” IEEE Trans. Haptic, vol. 8, no. 2, pp. 164–175, 2015.
  • [10] A. Lopes, J. Rodrigues, J. Perdigao, G. Pires, and U. Nunes, “A new hybrid motion planner: Applied in a brain-actuated robotic wheelchair,” IEEE Rob. Autom. Mag., vol. 23, no. 4, pp. 82–93, 2016.
  • [11] W. Liu, D. E. Quevedo, K. H. Johansson, B. Vucetic, and Y. Li, “Stability conditions for remote state estimation of multiple systems over multiple Markov fading channels,” IEEE Trans. Autom. Control, vol. 68, no. 7, pp. 4273–4280, 2022.
  • [12] T. Yucelen, Y. Yildiz, R. Sipahi, E. Yousefi, and N. Nguyen, “Stability limit of human-in-the-loop model reference adaptive control architectures,” Int. J. Control, vol. 91, no. 10, pp. 2314–2331, 2018.
  • [13] H.-N. Wu and M. Wang, “Human-in-the-loop behavior modeling via an integral concurrent adaptive inverse reinforcement learning,” IEEE Trans. Neural Networks Learn. Sys., pp. 1–12, 2023.
  • [14] P. van Overloop, J. Maestre, A. D. Sadowska, E. F. Camacho, and B. De Schutter, “Human-in-the-loop model predictive control of an irrigation canal [applications of control],” IEEE Control Syst. Mag., vol. 35, no. 4, pp. 19–29, 2015.
  • [15] Z. Li, J. Liu, Z. Huang, Y. Peng, H. Pu, and L. Ding, “Adaptive impedance control of human–robot cooperation using reinforcement learning,” IEEE Trans. Ind. Electron., vol. 64, no. 10, pp. 8013–8022, 2017.
  • [16] E. Eraslan, Y. Yildiz, and A. M. Annaswamy, “Shared control between pilots and autopilots: An illustration of a cyberphysical human system,” IEEE Control Syst. Mag., vol. 40, no. 6, pp. 77–97, 2020.
  • [17] Q. Deng and D. Söffker, “A review of HMM-based approaches of driving behaviors recognition and prediction,” IEEE Trans. Intell. Veh., vol. 7, no. 1, pp. 21–31, 2022.
  • [18] C.-P. Lam and S. S. Sastry, “A POMDP framework for human-in-the-loop system,” in Proc. IEEE CDC, 2014, pp. 6031–6036.
  • [19] L. Feng, C. Wiltsche, L. Humphrey, and U. Topcu, “Synthesis of human-in-the-loop control protocols for autonomous systems,” IEEE Trans. Autom. Sci. Eng., vol. 13, no. 2, pp. 450–462, 2016.
  • [20] B. Hu and J. Chen, “Optimal task allocation for human–machine collaborative manufacturing systems,” IEEE Robot. Autom. Lett., vol. 2, no. 4, pp. 1933–1940, 2017.
  • [21] C. Craye, A. Rashwan, M. S. Kamel, and F. Karray, “A multi-modal driver fatigue and distraction assessment system,” Int. J. Intelligent Transp. Syst. Res., vol. 14, no. 3, pp. 173–194, 2016.
  • [22] M. Protte, R. Fahr, and D. E. Quevedo, “Behavioral economics for human-in-the-loop control systems design: Overconfidence and the hot hand fallacy,” IEEE Control Syst. Mag., vol. 40, no. 6, pp. 57–76, 2020.
  • [23] H.-N. Wu and X.-M. Zhang, “Stochastic stability analysis and synthesis of a class of human-in-the-loop control systems,” IEEE Trans. Syst. Man Cybern. Syst., vol. 52, no. 2, pp. 822–832, 2022.
  • [24] P. Park, S. Coleri Ergen, C. Fischione, C. Lu, and K. H. Johansson, “Wireless network design for control systems: A survey,” IEEE Commun. Surv. Tutor., vol. 20, no. 2, pp. 978–1013, 2018.
  • [25] W. Liu, X. Zang, Y. Li, and B. Vucetic, “Over-the-air computation systems: Optimization, analysis and scaling laws,” IEEE Trans. Wirel. Commun., vol. 19, no. 8, pp. 5488–5502, 2020.
  • [26] J. Chen, W. Liu, D. E. Quevedo, S. R. Khosravirad, Y. Li, and B. Vucetic, “Structure-enhanced DRL for optimal transmission scheduling,” IEEE Trans. Wirel. Commun., vol. 23, no. 1, pp. 379–393, 2023.
  • [27] L. Schenato, B. Sinopoli, M. Franceschetti, K. Poolla, and S. S. Sastry, “Foundations of control and estimation over lossy networks,” Proc. IEEE, vol. 95, no. 1, pp. 163–187, 2007.
  • [28] P. Minero, M. Franceschetti, S. Dey, and G. N. Nair, “Data rate theorem for stabilization over time-varying feedback channels,” IEEE Trans. Autom. Control, vol. 54, no. 2, pp. 243–255, 2009.
  • [29] K. Huang, W. Liu, Y. Li, A. Savkin, and B. Vucetic, “Wireless feedback control with variable packet length for industrial IoT,” IEEE Wirel. Commun. Lett., vol. 9, no. 9, pp. 1586–1590, 2020.
  • [30] W. Liu, P. Popovski, Y. Li, and B. Vucetic, “Wireless networked control systems with coding-free data transmission for industrial IoT,” IEEE Internet Things J., vol. 7, no. 3, pp. 1788–1801, 2020.
  • [31] G. Yang, Z. Pang, M. Jamal Deen, M. Dong, Y.-T. Zhang, N. Lovell, and A. M. Rahmani, “Homecare robotic systems for healthcare 4.0: Visions and enabling technologies,” IEEE J. Biomedical Health Informat., vol. 24, no. 9, pp. 2535–2549, 2020.
  • [32] G. Zhao, M. A. Imran, Z. Pang, Z. Chen, and L. Li, “Toward real-time control in future wireless networks: Communication-control co-design,” IEEE Commun. Mag., vol. 57, no. 2, pp. 138–144, 2019.
  • [33] Y. Polyanskiy, H. V. Poor, and S. Verdu, “Channel coding rate in the finite blocklength regime,” IEEE Trans. Inform. Theory, vol. 56, no. 5, pp. 2307–2359, 2010.
  • [34] C. Xu, H. Yu, P. Zeng, and Y. Li, “Towards critical industrial wireless control: Prototype implementation and experimental evaluation on URLLC,” IEEE Commun. Mag., vol. 61, no. 9, pp. 193–199, 2023.
  • [35] Z. Xiang, F. Gabriel, E. Urbano, G. T. Nguyen, M. Reisslein, and F. H. P. Fitzek, “Reducing latency in virtual machines: Enabling tactile internet for human-machine co-working,” IEEE J. Sel. Areas Commun., vol. 37, no. 5, pp. 1098–1116, 2019.
  • [36] S. Mondal, L. Ruan, M. Maier, D. Larrabeiti, G. Das, and E. Wong, “Enabling remote human-to-machine applications with AI-enhanced servers over access networks,” IEEE Open J. Commun. Soc., vol. 1, pp. 889–899, 2020.
  • [37] X. Kuai, X. Yuan, W. Yan, and Y.-C. Liang, “Coexistence of human-type and machine-type communications in uplink massive MIMO,” IEEE J. Sel. Areas Commun., vol. 39, no. 3, pp. 804–819, 2021.
  • [38] G. Pang, W. Liu, Y. Li, and B. Vucetic, “DRL-based resource allocation in remote state estimation,” IEEE Trans. Wirel. Commun., vol. 22, no. 7, pp. 4434–4448, 2022.
  • [39] K. Huang, W. Liu, M. Shirvanimoghaddam, Y. Li, and B. Vucetic, “Real-time remote estimation with hybrid ARQ in wireless networked control,” IEEE Trans. Wirel. Commun., vol. 19, no. 5, pp. 3490–3504, 2020.
  • [40] C. Sahin, L. Liu, E. Perrins, and L. Ma, “Delay-sensitive communications over IR-HARQ: Modulation, coding latency, and reliability,” IEEE J. Sel. Areas Commun., vol. 37, no. 4, pp. 749–764, 2019.
  • [41] F. Ghanami, G. A. Hodtani, B. Vucetic, and M. Shirvanimoghaddam, “Performance analysis and optimization of NOMA with HARQ for short packet communications in massive IoT,” IEEE Internet Things J., vol. 8, no. 6, pp. 4736–4748, 2021.
  • [42] P. Larsson, B. Smida, T. Koike-Akino, and V. Tarokh, “Analysis of network coded HARQ for multiple unicast flows,” IEEE Trans. Commun., vol. 61, no. 2, pp. 722–732, 2013.
  • [43] W. Liu, D. E. Quevedo, Y. Li, and B. Vucetic, “Anytime control under practical communication models,” IEEE Trans. Autom. Control, vol. 67, no. 10, pp. 5400–5407, 2021.
  • [44] T. V. Dang, K.-V. Ling, and D. E. Quevedo, “Stability analysis of event-triggered anytime control with multiple control laws,” IEEE Trans. Autom. Control, vol. 64, no. 1, pp. 420–426, 2019.
  • [45] D. E. Quevedo, W.-J. Ma, and V. Gupta, “Anytime control using input sequences with Markovian processor availability,” IEEE Trans. Autom. Control, vol. 60, no. 2, pp. 515–521, 2015.
  • [46] L. Huang, S. Bi, and Y.-J. A. Zhang, “Deep reinforcement learning for online computation offloading in wireless powered mobile-edge computing networks,” IEEE Trans. Mob. Comput., vol. 19, no. 11, pp. 2581–2593, 2020.