
Predictive Wireless Based Status Update for Communication-Agnostic Sampling

Zhiyuan Jiang, Wei Zhang, Zixu Cao, Shan Cao, Shunqing Zhang,
and Shugong Xu, Fellow, IEEE
The authors are with Shanghai Institute for Advanced Communication and Data Science, Shanghai University, China. Emails: {jiangzhiyuan, 16dqzw, caozixu, cshan, shunqing, shugong}@shu.edu.cn. A preliminary version of the paper has been presented at IEEE INFOCOM 2020 [1]. This work was supported by the National Key R&D Program of China (No. 2017YFE0121400), the program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning, and Shanghai Institute for Advanced Communication and Data Science (SICS).
Abstract

In a wireless network that conveys status updates from sources (i.e., sensors) to destinations, one of the key issues studied by existing literature is how to design an optimal source sampling strategy on account of the communication constraints which are often modeled as queues. In this paper, an alternative perspective is presented—a novel status-aware communication scheme, namely parallel communications, is proposed which allows sensors to be communication-agnostic. Specifically, the proposed scheme can determine, based on an online prediction functionality, whether a status packet is worth transmitting considering both the network condition and status prediction, such that sensors can generate status packets without communication constraints. We evaluate the proposed scheme on a Software-Defined-Radio (SDR) test platform, which is integrated with a collaborative autonomous driving simulator, i.e., Simulation-of-Urban-Mobility (SUMO), to produce realistic vehicle control models and road conditions. The results show that with online status predictions, the channel occupancy is significantly reduced, while guaranteeing low status recovery error. Then the framework is applied to two scenarios: a multi-density platooning scenario, and a flight formation control scenario. Simulation results show that the scheme achieves better performance on the network level, in terms of keeping the minimum safe distance in both vehicle platooning and flight control.

Index Terms:
Age of information, status update, reinforcement learning, cellular-V2X, software defined radio

I Introduction

With the emergence and development of the Internet-of-Things (IoT), more and more applications require that timely information be communicated through wireless networks, e.g., collaborative autonomous driving, wherein only the latest information reflects the accurate status of the physical world. In this context, leveraging the wireless interface for real-time status updating has gained much traction. A status update is a process in which the source node transfers an information flow to the target node (possibly with feedback), so that the target node can remotely obtain the real-time status generated at the source node.

Age of Information (AoI) [2] serves as a performance metric for characterizing the freshness of information in a status update system. Most existing works on time-sensitive status update in wireless networks focus on AoI optimization. Recent research advances suggest that many well-known design principles (e.g., for providing high throughput and low delay) that led to the success of traditional packet-based networks are inappropriate for AoI optimization. Many studies build on queuing theory. The pioneering work in [3, 2] analyzes the tradeoff between sampling frequency and communication delay (e.g., network scheduling delay and queuing delay). It is shown that the source node can transmit more timely data packets by increasing the sampling frequency of status information, thus reducing the status error between the target node and the source node. However, more sampling packets cause longer network delay and eventually network congestion, and each sampling packet carries less new information. Many later studies have extended this result to, e.g., scenarios with packet management [4], multi-class queuing systems [5], gamma-distributed service times [6], controlling the status packet arrival process instead of assuming it is random [7], and energy harvesting sources [8]. In addition, Ref. [9] derives the stationary distribution of AoI under various queuing disciplines, and Ref. [10] models the spatially correlated statuses as a random field. In conclusion, status-unaware schemes that minimize AoI must control the sampling frequency to avoid network congestion.

Wireless networks, as the communication medium, have also been studied extensively for AoI optimization, especially with respect to scheduling. A centralized scheduling strategy based on the Whittle’s index is used in [11, 12]. Refs. [13, 14, 15, 16, 17, 18] study decentralized scheduling strategies in wireless multi-access networks. Considering that each node may have different channel conditions, service rates and packet arrival rates, Refs. [16, 17, 18] optimize the access probability or the backoff window size on top of an existing MAC-layer protocol, e.g., CSMA or ALOHA. In [14, 15], the access probability is associated with the index of the Whittle’s index method. When the number of nodes is large, the round-robin scheduling strategy is proved to be optimal in [13]. Moreover, [19, 20] show that a stationary policy achieves order-optimal performance in general network topologies.

On the other hand, there are many schemes based on status-aware update [21, 22]. An update approach based on status-error thresholds is proposed to optimize sampling from Wiener and Ornstein-Uhlenbeck processes in [23, 24]. Ref. [25] proposes a method to capture the change of status by using effective AoI instead of simple AoI. In this respect, the value of information [26, 27] is designed to reflect on the impact of AoI on a specific application. Ref. [28] proposes a mean-field approach to calculate the access probabilities with random-walk status transitions to optimize the wireless network. According to the content of some status packets, Ref. [29] proposes to assign priorities to them. Most of the mentioned works are based on a status model, e.g., Markov source model, to obtain theoretical tractability, and have not considered status predictions.

In summary, existing works either design the sampler based on communication constraints modeled by queues, or develop status-aware schedulers without considering status predictions. Is it possible to build a status update system wherein the sampler can sample the information source without any frequency constraints, while the communication interface automatically adapts to both network conditions and status trends, such that the destination node observes frequent and fresh updates while the network occupancy is still low? In this paper, a novel wireless communication protocol named parallel communications, based on communication-agnostic sampling, is proposed. Specifically, two types of packets are conveyed at the source node: conventional Over-The-Air (OTA) packets and model prediction packets. Only when statuses are not predictable by the built-in online status predictor are they transmitted OTA; otherwise, the status updates are produced by model predictions. A novel Status-aware Multi-Agent Reinforcement learning neTworking solution (SMART) based on the Whittle’s index methodology is proposed, which can be extended to any status vectors (handled by a DNN) without being limited to a specific network. We use a Software-Defined-Radio (SDR) platform to test the proposed framework. We simulate a typical autonomous driving scenario on an integrated hardware-software evaluation platform, in which the vehicle control and motion dynamics are simulated by Simulation of Urban MObility (SUMO), and the communications among vehicles are realized by SDR. The results show that by adopting parallel communications, the channel occupancy is significantly reduced, while a low status recovery error is guaranteed.
We then apply the framework to two realistic scenarios: a multi-density platooning scenario implemented in SUMO and a UAV formation flight control scenario implemented in MATLAB. The results show that, compared with the current packet-based delay-optimized scheme and the AoI-optimized status-unaware scheme, the parallel communication framework achieves better wireless networked control performance, e.g., in keeping a safe inter-vehicle distance.

The rest of the paper is organized as follows. In Section II, a single-link example illustrates the core idea of parallel communications, followed by the SDR verification and then the migration to the hardware-software evaluation platform. Section III describes the networking solution for parallel communications, i.e., SMART. Section IV provides a case study using SUMO, applying the proposed framework to multi-density platooning. Section V provides another case study using MATLAB, applying the framework to UAV formation flight control. Finally, in Section VI, conclusions are drawn and future directions are discussed.

Figure 1: Parallel communications diagram of a single link, and implementation on the SDR testbed.

II Predictive Wireless: A Single-Link Example and SDR Proof-of-Concept

First, we illustrate the central idea of predictive wireless intuitively, by analogy with human communications. Suppose that Alice and Bob live in different cities and Bob sends a letter to Alice every day about whether it is rainy in his city on that day. This is a typical status update setting where a fixed update interval (one day) is adopted. However, further suppose that it rarely rains in Bob’s city; he would soon discover that he could make the status update much more efficient (in the sense of sending fewer reports) by defining and communicating a model with Alice prescribing that whenever there is no letter, there is no rain; otherwise, Bob would report a model-unexpected status indicating rain. The model in this case is a time-series forecasting model that predicts no rain every time, and the status is a binary scalar value.
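As a minimal sketch of the letter example, the following Python snippet counts how many letters Bob sends when both sides agree on a "no rain" model, and checks that Alice still recovers every daily status. The function name and the ten-day weather record are illustrative assumptions, not part of the paper.

```python
def run_no_rain_model(weather):
    """Simulate Bob's model-based reporting.

    weather: list of booleans, True meaning it rained that day.
    Returns (letters_sent, recovered), where recovered is Alice's view.
    """
    letters_sent = 0
    recovered = []
    for rainy in weather:
        predicted = False          # shared model: always predict "no rain"
        if rainy != predicted:     # model-unexpected status: send a letter
            letters_sent += 1
            recovered.append(rainy)
        else:                      # no letter: Alice falls back on the model
            recovered.append(predicted)
    return letters_sent, recovered


# Hypothetical ten-day record for a dry city (two rainy days).
weather = [False, False, True, False, False, False, False, True, False, False]
letters_sent, recovered = run_no_rain_model(weather)
```

With a fixed daily update Bob would send ten letters; under the shared model only the rainy days trigger a letter, and Alice's reconstruction is unchanged.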

Such a simple but useful idea can be generalized to improve the communication efficiency of status-update-based wireless systems. In essence, such systems can benefit much from a status prediction functionality. The proposed system architecture is shown in Fig. 1. We term this architecture parallel communications, since the status information is communicated by both model predictions and OTA packets. The parallel communication operation is transparent to the upper layers or modules, i.e., the communication-agnostic sensing module. Several key design aspects are illustrated in detail as follows. We assume a time-slotted system in which the time index is denoted by $t$ (the duration of a time slot is $1$ millisecond in the SDR implementation). At each time slot, a status (or a set of statuses) is sampled by the sensing module and fed into the Parallel Transmit Function Block (PTxFB).

II-A Model Estimation and Calibration

A model, denoted by $\mathcal{M}_{t}(\cdot)$, is identified and estimated online by collecting the statuses over time at the source node, e.g., by well-known system identification and time-series forecasting techniques such as Auto-Regressive Integrated Moving Average (ARIMA) or Recurrent Neural Network (RNN); this is analogous to Bob discovering the no-rain model mentioned above. Specifically, the PTxFB at the source node takes input from the status sampling (sensing) module, which samples a specific physical-world status with an arbitrarily high frequency (denoted by $f_{\mathsf{sample}}$) to minimize the sampling error. That is to say, the sensing module operates regardless of the communication burden, i.e., communication-agnostic sensing. Based on the status input data, the PTxFB fits a parametric model, whose parameters can be the matrices in ARIMA or a neural network, to the status data. This model estimation process is performed online and continuously to adapt to status model changes in the real world. Afterwards, the estimated model is transmitted to the destination node, i.e., calibrated online; in our design, we adopt a fixed model calibration interval, denoted by $T_{\mathsf{model}}$. It is essential that the source and destination maintain the same model and are synchronized; this issue is addressed in Section II-E by a model confirmation mechanism.

II-B Parallel Transmitter

At a source node, i.e., transmitter side, assuming that the PTxFB has a current status model which is calibrated with a corresponding destination node, a parallel transmitter works as follows.

1. At time $t$, when a new status $\bm{s}(t)$ comes from the sensing module, the PTxFB makes a transmit decision:

\[
\mathsf{Transmit}(t)=
\begin{cases}
\mathsf{True}, & \textrm{if } \operatorname{g}(\bm{s}(t),\bar{\bm{s}}(t))>\delta; \\
\mathsf{False}, & \textrm{otherwise},
\end{cases}
\tag{3}
\]

where $\operatorname{g}(\cdot)$ denotes an error measure function, e.g., the $\ell^{1}$ or $\ell^{2}$ norm, the model-predicted status at time $t$ is denoted by $\bar{\bm{s}}(t)$, which is based on the estimated model and previous statuses, and $\delta$ denotes a threshold controlling how much status error the system can tolerate.

2. The model prediction at time $t$ and the status estimation at the source and destination node (assuming calibrated) are respectively expressed as follows:

\begin{align}
\bar{\bm{s}}(t) &= \mathcal{M}_{t}(\bar{\bm{s}}(t-N_{\mathsf{input}}),\,\cdots,\,\bar{\bm{s}}(t-1)), \notag \\
\hat{\bm{s}}(t) &=
\begin{cases}
\bm{s}(t), & \textrm{if } \mathsf{ACK}_{t}; \\
\bar{\bm{s}}(t), & \textrm{otherwise},
\end{cases}
\tag{6}
\end{align}

where $\mathsf{ACK}_{t}$ denotes a successful transmission at time $t$ (if the MAC layer protocol does not support acknowledgment feedback, each transmission is assumed to be successful), and the input data size of the model is denoted by $N_{\mathsf{input}}$.

Two points of explanation are in order. In the first step, when $\delta$ is larger, the status recovery error increases, because a status packet is transmitted only when the deviation of the status from the model prediction exceeds $\delta$; when $\delta$ is smaller, the error is smaller but the network occupancy is higher, since more unexpected packets are transmitted OTA. In a distributed wireless network employing contention-based channel access mechanisms such as Carrier-Sense Multiple-Access (CSMA), higher occupancy directly leads to a higher collision rate, and thus lower transmission reliability. Therefore, the threshold $\delta$ represents an important tradeoff that is considered in more detail in Section III. Based on the second step, at the parallel transmitter, the status model takes the previous $N_{\mathsf{input}}$ estimated statuses as input and produces the current model prediction, denoted by $\bar{\bm{s}}(t)$. Afterwards, $\bar{\bm{s}}^{\prime}(t)$ is compared with the real status ${\bm{s}}^{\prime}(t)$, and if the error is beyond the threshold, a status packet is transmitted. The estimated status at the destination node, considering status prediction output and OTA packets, is obtained at the source node and denoted by $\hat{\bm{s}}(t)$. The issue of unaligned status estimations between source and destination is considered in Section II-E.
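The two transmitter steps can be sketched in Python as follows. This is only an illustration under stated assumptions: the class name, the `send` callback (which returns True on acknowledgment) and the linear-extrapolation stand-in for $\mathcal{M}_{t}$ are hypothetical, and, following the prose description, the model is fed with the previous estimated statuses.

```python
from collections import deque


def l1_error(actual, predicted):
    """l1 norm of the difference between two status vectors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted))


class ParallelTransmitter:
    """Sketch of the PTxFB decision rule (3) and estimate update (6)."""

    def __init__(self, model, history, delta, error_fn=l1_error):
        self.model = model        # callable: list of past estimates -> prediction
        self.history = deque(history, maxlen=len(history))  # last N_input estimates
        self.delta = delta        # tolerated status error
        self.error_fn = error_fn

    def step(self, status, send):
        """Process one sampled status; `send(status)` returns True on ACK."""
        predicted = self.model(list(self.history))
        transmit = self.error_fn(status, predicted) > self.delta
        acked = transmit and send(status)
        estimate = status if acked else predicted
        self.history.append(estimate)   # keep the source view of the destination
        return transmit, estimate


# Hypothetical stand-in for M_t: linear extrapolation from the last two estimates.
def linear_model(history):
    return [2 * last - prev for last, prev in zip(history[-1], history[-2])]


sent_packets = []


def send(status):
    sent_packets.append(status)   # pretend the packet goes over the air
    return True                   # assume every transmission is acknowledged


tx = ParallelTransmitter(linear_model, history=[[0.0], [1.0]], delta=0.1)
first = tx.step([2.0], send)    # prediction [2.0] matches: no transmission
second = tx.step([3.5], send)   # prediction [3.0] is off by 0.5: transmit
```

Raising `delta` in this sketch suppresses more transmissions at the price of a larger recovery error, which is the tradeoff discussed above.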

II-C Parallel Receiver

At a destination node, i.e., receiver side, again assuming the current status model is calibrated with a corresponding source node, the PRxFB works as follows.

\[
\hat{\bm{s}}^{\prime}(t)=
\begin{cases}
\bm{r}(t), & \textrm{if } \mathsf{ACK}_{t}; \\
\bar{\bm{s}}(t), & \textrm{otherwise},
\end{cases}
\tag{9}
\]

where $\hat{\bm{s}}^{\prime}(t)$ denotes the status estimation at the destination node, and $\bm{r}(t)$ denotes the received status when a packet is successfully decoded over the air interface; hence $\bm{r}(t)=\bm{s}(t)$ when sensing noise is neglected (in the testbed verification, sensing noise is added). Assuming a calibrated status model, the model predictions at the source and destination are aligned, i.e., when there is no packet transmitted, the destination node uses the model prediction as the received status and outputs it to upper layers. Here, a successful reception is denoted by $\mathsf{ACK}_{t}$, the same as on the transmitter side.
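The receiver rule (9) can be sketched in the same way. The class name, the use of `None` to signal "no packet decoded in this slot" and the linear-extrapolation model are assumptions made for illustration only.

```python
from collections import deque


class ParallelReceiver:
    """Sketch of the PRxFB rule (9): use the packet if decoded, else the model."""

    def __init__(self, model, history):
        self.model = model
        self.history = deque(history, maxlen=len(history))  # last N_input estimates

    def step(self, received=None):
        """received: decoded status for this slot, or None if nothing arrived."""
        predicted = self.model(list(self.history))
        estimate = received if received is not None else predicted
        self.history.append(estimate)
        return estimate   # delivered to upper layers in every slot


# Hypothetical stand-in for the calibrated model shared with the source.
def linear_model(history):
    return [2 * last - prev for last, prev in zip(history[-1], history[-2])]


rx = ParallelReceiver(linear_model, history=[[0.0], [1.0]])
outputs = [rx.step(None), rx.step([3.5]), rx.step(None)]
```

The upper layers receive one status per slot regardless of whether a packet was sent, which is the transparency property discussed next.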

Remark 1 (Transparency)

Based on the above design, one of the main advantages of parallel communications is transparency to the sensing layer. Specifically, from the sensing layer’s perspective, the status packets are constantly fed into the communication module without any consideration of the queuing or network load conditions. At the destination node, the status recovery module, which takes input from the parallel receiver, receives status packets at the same rate as the sampling rate at the source. Therefore, the upper layers of the status update system adopting parallel communications work completely transparent to the communication conditions; however, the true network occupancy is greatly reduced thanks to status model predictions. The term “parallel” refers to the fact that two communication paths are present: one is real OTA transmission and the other is model-prediction outputs by calibrated models between source and destination which also involve communications.

II-D SDR Implementation

Figure 2: The overall structure of the physical layer and MAC layer of the SDR platform.

A two-vehicle platoon (a leader and a follower) with artificial status data is implemented on SDR. The leader vehicle collects kinematic status information from the follower vehicle, and then assigns an acceleration to the follower vehicle to keep the distance between two vehicles a constant. The kinematic status information consists of the distance from the leader vehicle, and the instantaneous velocity and acceleration.

The proposed scheme is implemented on an SDR platform consisting of two National Instruments (NI) USRP-2974 [30] devices. The USRP-2974 device is composed of a Kintex-7 XC7K410T Field Programmable Gate Array (FPGA) baseband board and a general purpose processor (GPP) as the system controller. The physical layer of Cellular Vehicle-to-Everything (C-V2X) Mode 4 is implemented in the FPGA. A simplified MAC layer and the parallel communication framework are both implemented on GPP. The overall structure of the physical layer and MAC layer is shown in Fig. 2. The details are specified below.

C-V2X Mode 4 allows vehicles to communicate with each other directly without base station coverage. In the PHY layer, one Resource Block (RB) is $180$ kHz wide ($12$ subcarriers with $15$ kHz subcarrier spacing). One subchannel is defined as a set of RBs in the same subframe, and the number of RBs can vary depending on applications. One subframe is $1$ ms. One vehicle User Equipment (UE) can transmit in one subchannel. A sensing-based Semi-Persistent Scheduling (SPS), a decentralized resource allocation scheme, is adopted to enable direct V2X communications [31]. Some of the system parameters are shown in Table I. The control channel uses QPSK modulation with a tail-biting convolutional code; the data channel uses 64QAM modulation with turbo coding. Both USRPs are placed on a table about $50$ centimeters apart.

TABLE I: The system parameters of the experiment
Parameter | Value
Central frequency | 5.9 GHz
Bandwidth | 20 MHz
Number of RBs | 100
Number of subchannels | 2
Number of RBs used to transmit messages in one subchannel | 24
Transmit power | -20 dBm

In the experiment, two USRPs are utilized to mimic two vehicle UEs. Considering the cost issue, the kinematic status information is generated by SUMO instead of real measurements. However, the proposed scheme is tested on our SDR testbed without SUMO. As long as the real status data can be predicted, the experiment is practical. In addition, Gaussian white noise is added to the velocity and distance observations to simulate real-world imperfections, since kinematic sensors always suffer from measurement errors. Denote the observed noisy status information as $\bar{x}$, with $x$ being the real status. The detailed procedure of our implemented parallel communication framework is as follows.

At the follower vehicle, the current spacing ($\bar{d}(t)$), velocity ($\bar{v}(t)$) and acceleration ($\bar{a}(t)$) are sampled every $1$ ms for model estimation and calibration. Denote $\bar{\bm{s}}(t)\triangleq[\bar{d}(t),\,\bar{v}(t),\,\bar{a}(t)]^{\mathsf{T}}$ and $\bar{\bm{w}}(t)\triangleq[\bar{d}(t),\,\bar{v}(t)]^{\mathsf{T}}$. Every $100$ ms, model parameters are calculated from the stored kinematic statuses by a Least Mean Squares (LMS) estimation method:

\[
\bm{A}(t)=[\bar{\bm{w}}(t),\,\cdots,\,\bar{\bm{w}}(t-99)]\cdot[\bar{\bm{s}}(t-1),\,\cdots,\,\bar{\bm{s}}(t-100)]^{\dagger},
\tag{10}
\]

where $\bm{X}^{\dagger}$ denotes the pseudo-inverse of $\bm{X}$ (in particular, the right pseudo-inverse). Meanwhile, the follower vehicle transmits the model parameters, piggybacking the current kinematic status, to the leader vehicle to calibrate the model. Every message conveying model parameters is time-stamped for model confirmation, which is illustrated in the following paragraphs. Every $10$ ms, the model-predicted status, i.e., the distance and velocity, is compared with the actual observed status output from the sensors. If the error (measured by the $\ell^{1}$ norm) between the actual and predicted statuses is higher than a threshold ($0.1$ in our experiment), the actual status is transmitted; otherwise, there is no transmission and both ends use the model prediction as the communicated status at that time.
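The estimate in (10) can be sketched without external libraries by using the right pseudo-inverse identity $\bm{S}^{\dagger}=\bm{S}^{\mathsf{T}}(\bm{S}\bm{S}^{\mathsf{T}})^{-1}$, valid when $\bm{S}$ has full row rank. The function name and the small synthetic data set below are assumptions for illustration; in the experiment the columns would be the $100$ stored noisy statuses.

```python
def fit_model(w_cols, s_cols):
    """Least-squares fit of A such that w_k is approximately A s_k.

    w_cols: list of N output vectors (each of length m).
    s_cols: list of N input vectors (each of length n), with N >= n.
    Returns A as an m-by-n list of rows, A = W S^T (S S^T)^{-1}.
    """
    m, n = len(w_cols[0]), len(s_cols[0])
    # Gram matrix G = S S^T (n-by-n) and cross term C = W S^T (m-by-n).
    gram = [[sum(s[i] * s[j] for s in s_cols) for j in range(n)] for i in range(n)]
    cross = [[sum(w[i] * s[j] for w, s in zip(w_cols, s_cols)) for j in range(n)]
             for i in range(m)]
    # Solve G A^T = C^T by Gauss-Jordan elimination with partial pivoting.
    aug = [gram[i] + [cross[r][i] for r in range(m)] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("status history is rank deficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    # Column n + r of the reduced system holds row r of A.
    return [[aug[i][n + r] for i in range(n)] for r in range(m)]


# Synthetic check: generate outputs from a known (hypothetical) 2-by-3 model.
a_true = [[1.0, -0.5, 0.0], [0.0, 1.0, 0.25]]
s_cols = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0],
          [1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
w_cols = [[sum(a * x for a, x in zip(row, s)) for row in a_true] for s in s_cols]
a_fit = fit_model(w_cols, s_cols)
```

On noise-free data the fit recovers the generating matrix up to rounding; with noisy observations it returns the least-squares solution instead.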

At the leader vehicle, the status information is passed to higher layers every $1$ ms. The parallel receiver procedure described in Section II-C is implemented. If the leader vehicle receives a model calibration message, it replaces the previous model, and the predicted velocity of the follower vehicle and the distance between the vehicles are calculated accordingly. A timestamp is received together with the message and is utilized for feedback as illustrated below. Every $10$ ms, the control algorithm at the leader vehicle takes the status information of the vehicles as input, outputs an acceleration value based on (11), and transmits it to the follower vehicle together with the last timestamp received from the follower vehicle. The follower vehicle can determine whether a packet is missing according to the timestamp. This acceleration assignment message is transmitted using the conventional communication approach.

To calculate the acceleration value, the leader vehicle needs to know the desired distance, the current distance, the velocities of both vehicles and its own acceleration. We adopt a specific Cooperative Adaptive Cruise Control (CACC) method to calculate the acceleration. The method is shown to be string-stable, i.e., a small disturbance in the platoon is attenuated by the control method [32]. Specifically,

\begin{align}
a_{\mathsf{des},n} ={}& \omega_{1}(d_{\mathsf{des}}-\hat{d}_{n})+\omega_{2}(\hat{v}_{n}-\hat{v}_{n-1})+\omega_{3}(\hat{v}_{n}-\bar{v}_{1}) \notag \\
&+\omega_{4}a_{\mathsf{des},n-1}+\omega_{5}a_{\mathsf{des},1},
\tag{11}
\end{align}

where $d_{\mathsf{des}}$ is the desired inter-vehicle distance of the platoon, which is limited by the safety requirements and should be kept as small as possible to reduce air drag; the status recovery of a status $x$ at the leader vehicle is denoted by $\hat{x}$ ($x$ being arbitrary); the desired acceleration of the $n$-th vehicle in the platoon (the leader vehicle is the first) is denoted by $a_{\mathsf{des},n}$; and $\omega_{i}$, $i=1,...,5$, are constants satisfying certain conditions for string stability [32]. In practice, the desired acceleration is fed into the lower controller of each vehicle such that $d_{\mathsf{des}}$ is maintained.
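For concreteness, the control law (11) can be evaluated as in the following Python sketch. The function name, argument names and the numerical weights are placeholders chosen only to exercise the arithmetic; they are not the values used in the experiment and have not been checked against the string-stability conditions of [32].

```python
def desired_acceleration(d_des, d_n, v_n, v_prev, v_lead, a_prev, a_lead, weights):
    """Evaluate the CACC law (11) for the n-th vehicle.

    d_des:  desired inter-vehicle distance
    d_n:    recovered spacing of vehicle n
    v_n:    recovered velocity of vehicle n
    v_prev: recovered velocity of vehicle n-1
    v_lead: observed velocity of the leader (vehicle 1)
    a_prev: desired acceleration of vehicle n-1
    a_lead: desired acceleration of the leader
    weights: (w1, ..., w5), assumed to be supplied by the control design
    """
    w1, w2, w3, w4, w5 = weights
    return (w1 * (d_des - d_n)
            + w2 * (v_n - v_prev)
            + w3 * (v_n - v_lead)
            + w4 * a_prev
            + w5 * a_lead)


# Placeholder weights (hypothetical, not taken from the paper).
weights = (0.5, 0.2, 0.3, 0.4, 0.1)

# At the desired spacing, with equal speeds and no acceleration, the output is zero.
a_equilibrium = desired_acceleration(10.0, 10.0, 10.0, 10.0, 10.0, 0.0, 0.0, weights)

# A perturbed state: every term of (11) contributes.
a_perturbed = desired_acceleration(10.0, 8.0, 11.0, 10.0, 10.0, 1.0, 1.0, weights)
```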

Figure 3: Status recovery error measured on the SDR testbed. The transmission timing of both OTA status packets and the model calibration packets is also depicted.

In the feedback process, as mentioned above, a timestamp together with the model calibration message is first transmitted from the follower to the leader vehicle. Once this timestamp is received, the leader vehicle transmits a confirmation packet back to the follower in order to confirm the model calibration and the calibration timing. After the follower receives the timestamp fed back by the leader (a timeout mechanism is applied: after $10$ ms without feedback, the model calibration message is considered lost), it compares the received timestamp with the last timestamp it transmitted. If the two timestamps differ, meaning that the leader vehicle failed to receive the model calibration message, the follower vehicle does not adopt the new model parameters; when the two timestamps match, the leader and follower vehicles both adopt the same new model parameters.
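The follower-side part of this confirmation handshake can be sketched as a small state holder. The class and method names, the message layout and the choice to keep a candidate model pending until the timeout after a mismatching echo are assumptions for illustration; only the timestamp comparison and the $10$ ms timeout come from the description above.

```python
class FollowerModelCalibration:
    """Follower-side bookkeeping for timestamp-based model confirmation."""

    TIMEOUT_MS = 10  # feedback timeout stated in the text

    def __init__(self, active_model):
        self.active_model = active_model
        self.pending = None   # (timestamp_ms, candidate_model) awaiting feedback

    def send_calibration(self, now_ms, candidate_model):
        """Record the candidate model and return the message to transmit."""
        self.pending = (now_ms, candidate_model)
        return {"timestamp": now_ms, "model": candidate_model}

    def on_feedback(self, now_ms, echoed_timestamp):
        """Handle the timestamp echoed by the leader; return True if adopted."""
        if self.pending is None:
            return False
        sent_at, candidate = self.pending
        if now_ms - sent_at > self.TIMEOUT_MS:
            self.pending = None        # too late: treat the calibration as lost
            return False
        if echoed_timestamp != sent_at:
            return False               # leader echoed an older timestamp
        self.active_model = candidate  # both ends now use the new model
        self.pending = None
        return True

    def on_tick(self, now_ms):
        """Drop the pending model once the timeout has elapsed."""
        if self.pending is not None and now_ms - self.pending[0] > self.TIMEOUT_MS:
            self.pending = None


cal = FollowerModelCalibration(active_model="A0")
cal.send_calibration(100, "A1")
stale = cal.on_feedback(105, 0)       # leader still echoes an old timestamp
fresh = cal.on_feedback(108, 100)     # matching timestamp within the timeout
model_after_confirm = cal.active_model

cal.send_calibration(200, "A2")
cal.on_tick(215)                      # no feedback within 10 ms: candidate dropped
late = cal.on_feedback(216, 200)
model_after_timeout = cal.active_model
```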

Fig. 3 shows the status recovery error at the leader vehicle, tested on the SDR platform. The timing of the status packets transmitted OTA and of the model calibration packets is also depicted for ease of exposition. The initial distance between the two vehicles is $10$ meters, and the initial velocity and acceleration of both vehicles are $10$ m/s and $0$ m/s$^{2}$, respectively. The initial status models at both vehicles are generated randomly. The acceleration of the leader vehicle follows the formula:

\[
a_{1}=
\begin{cases}
c(2478-t)(t-478), & \textrm{if } 478\leq t\leq 2478; \\
0, & \textrm{otherwise},
\end{cases}
\tag{14}
\]

where $c=4$ m/s$^{2}$. Two observations from Fig. 3 are in order. First (Obv. $1$ as denoted in Fig. 3), before the first model calibration, i.e., when the source and destination do not have an agreed model for status prediction, all packets are transmitted OTA; this is equivalent to the scheme without the proposed parallel communications. The status recovery error jumps to zero when there is an update and deviates between two updates. After the first model calibration and confirmation, the status is predictable for about $400$ ms, i.e., the error stays within the predefined threshold during this period, and hence no packets are transmitted OTA; nevertheless, the status recovery error stays low thanks to the model prediction outputs. Second (Obv. $2$), starting from $478$ ms, the leader vehicle begins to accelerate based on (14). During this period, the status model changes due to the acceleration, and hence the status recovery error starts to accumulate over time; once there is a packet transmission OTA, the error returns to zero. As we can see, the unexpected packet transmissions are dense at the beginning of this period. After about $200$ ms, the model is calibrated based on online estimations to capture the model change, and hence the status error is small afterwards, with relatively less frequent unexpected packet transmissions. Overall, we can conclude that the proposed scheme is feasible in practice, and that parallel communications can significantly reduce the packet transmissions OTA while maintaining a low status recovery error.

The SDR platform alone cannot simulate the movement of real vehicles. We therefore build a hardware-software evaluation platform that combines the SDR platform with the traffic simulator SUMO. The SDR platform is programmed to run the standardized C-V2X Mode 4 protocol over a realistic channel. SUMO, the road traffic simulator, is added to model the vehicles and the traffic flow. Data is exchanged between SUMO and LabVIEW through the extension interface TraCI. Our platform is the first to integrate SUMO with an SDR platform that is based on a realistic wireless communication interface. We can set the communication parameters in LabVIEW and observe the driving states of the vehicles in the SUMO simulator.

The two-vehicle platoon scenario is then extended to a three-vehicle platoon scenario (a leader and two followers) on the hardware-software evaluation platform. In the experiment, each USRP represents one vehicle. The USRPs implement the MAC layer with distributed scheduling and the C-V2X Mode 4 physical layer, so they communicate with each other over a real wireless interface, which verifies the signal transmission in an actual environment. SUMO simulates the traffic conditions of the vehicles on the road using the control information transmitted from the HOST layer of LabVIEW, and feeds the real status of the vehicles obtained through simulation back to the HOST layer of the SDR platform for network simulation processing. In the SDR platform, the real state information from SUMO is used for updating the real state and for data processing in the HOST layer.

Figure 4: The frame structure of the platoon model implemented on the platform based on the SDR platform and the traffic simulator SUMO.

The three-vehicle platoon experiment is divided into two parts: the prediction model and the communication of the vehicles are handled in the SDRs, and the actual movement of the vehicles is simulated in SUMO. Specifically, the leader USRP sends accelerations to the follower USRPs and the SUMO simulator, and the parallel model parameters are transmitted from the follower USRPs to the leader USRP periodically. Each USRP can get its own actual status from the SUMO simulator. If the error between the actual status and the predicted status is greater than the threshold in a follower USRP, the follower USRP transmits the actual status obtained from SUMO to the leader USRP. Fig. 4 shows the frame structure of the platoon model implemented on the platform. SUMO is installed on the leader USRP, so the follower USRPs need to get the real state information from the leader USRP. In our design, the leader USRP sends state information to the follower USRPs along with the control information. It is worth noting that this transfer of state information is not considered part of the communication process.

In the scenario, the initial distance between the leader vehicle and the first follower vehicle is $10$ meters. The initial distance between the leader vehicle and the second follower vehicle is $20$ meters. The initial velocity and acceleration of all vehicles are $10$ m/s and $0$ m/s$^{2}$, respectively. The acceleration of the leader vehicle follows the formula:

\[
a_{1}=
\begin{cases}
4(4000-t)(t-2400), & \textrm{if } 2400\leq t\leq 4000; \\
0, & \textrm{otherwise},
\end{cases}
\tag{17}
\]
Figure 5: The distance between the leader vehicle and the second vehicle obtained from the experiment. The status error, the transmission timing of both OTA status packets and the model calibration packets are also depicted.
Figure 6: The distance between the leader vehicle and the third vehicle obtained from the experiment. The status error, the transmission timing of both OTA status packets and the model calibration packets are also depicted.

Each follower vehicle keeps its own prediction model, and updates and sends the model parameters to the leader vehicle every 100 ms. The leader USRP updates the acceleration of each vehicle and sends it to the follower USRPs and the SUMO simulator every 10 ms. Correspondingly, the model-predicted status, $d_{\textrm{Model}}$ and $v_{\textrm{Model}}$, is compared with the actual status, $d_{\textrm{Act}}$ and $v_{\textrm{Act}}$, obtained from SUMO every 10 ms in the follower USRPs. If the error exceeds the threshold 0.15, the actual status of the follower USRP is sent to the leader USRP to update the input of the parallel model. The simulation step of SUMO is 1 ms, and SUMO transmits the actual states of the vehicles to LabVIEW every 1 ms.
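The follower-side timing described above can be sketched as follows. This is a minimal illustration, not the testbed code: the names `Status`, `status_error` and `follower_actions` are hypothetical, and the error metric (largest absolute deviation over distance and velocity) is an assumption, since the metric is not restated in this section.

```python
from dataclasses import dataclass
from typing import List

ERROR_THRESHOLD = 0.15   # threshold quoted in the text
CHECK_PERIOD_MS = 10     # predicted vs. actual status is compared every 10 ms
MODEL_PERIOD_MS = 100    # model parameters are sent every 100 ms


@dataclass
class Status:
    d: float  # distance (m)
    v: float  # velocity (m/s)


def status_error(predicted: Status, actual: Status) -> float:
    """Assumed metric: largest absolute deviation over distance and velocity."""
    return max(abs(predicted.d - actual.d), abs(predicted.v - actual.v))


def follower_actions(t_ms: int, predicted: Status, actual: Status) -> List[str]:
    """Packets a follower would send to the leader at time t_ms (milliseconds)."""
    actions = []
    if t_ms % MODEL_PERIOD_MS == 0:
        actions.append("model_parameters")
    if t_ms % CHECK_PERIOD_MS == 0 and status_error(predicted, actual) > ERROR_THRESHOLD:
        actions.append("actual_status")
    return actions
```

With an accurate model, only the periodic model parameters are sent; an actual-status packet appears only at a 10 ms check instant where the prediction has drifted beyond the threshold.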

Fig. 5 and Fig. 6 show the information obtained from the second vehicle and the third vehicle, respectively. The first sub-figure shows the error between the real states and the states obtained by the prediction model. The second sub-figure shows when real states are sent: the real states of the follower vehicle must be sent to the leader vehicle if the error exceeds 0.15. When the system is not stable, the error is large and state packets are sent frequently. In contrast, when the system is stable, the error gets smaller and state packets are sent less often. The third sub-figure shows that the model parameters are sent to the leader vehicle every 100 ms. The fourth sub-figure shows the distance between the follower vehicle and the leader vehicle. When the leader vehicle accelerates, the follower vehicle also accelerates to keep the distance constant. The results show that by adopting parallel communication, the channel occupancy is significantly reduced by online model predictions, while keeping the same state estimation error. The experiment also demonstrates that autonomous driving can indeed be studied on the platform.

II-E Issues and Solutions

Several practical issues found in experiments are discussed here. Status misalignment: One of the crucial requirements for the proposed design is that the status estimations (including model predictions) should be aligned at both ends, i.e., $\hat{\bm{s}}(t)=\hat{\bm{s}}^{\prime}(t)$, $\forall t$. Such a requirement could be jeopardized by the following causes.

1) Model calibration packet transmission error: The packet that contains the newly estimated model using recent status data is lost due to channel error, e.g., collisions or channel fading.

2) OTA status packet transmission error: The transmission of an unexpected status packet OTA is erroneous.

3) Timing misalignment: Due to the hardware signal processing latency or other causes, the timing of a status packet received from the air interface at the destination could be unaligned with the source. For example, the transmitter sends a packet at time $t_{1}$, but the perceived sending time is $t_{1}+\tau$, where $\tau$ denotes the processing latency. This issue is identified in the testbed and is believed to be ubiquitous in practice.

Notice that status misalignment caused by even one packet loss leads to persistent status recovery error over time. To see this, assume that one model calibration packet is lost; all subsequent model predictions would then differ, because the source node would use the newly calibrated model while the destination node would still use the old model. The situation with status packet loss is similar, because new model predictions would be unaligned due to different views of the status history at source and destination. Timing misalignment causes the same problem.

To address this issue, we adopt three methods in the implementation that proved effective in counteracting the three causes, namely model confirmation, periodical model and status retransmissions, and timestamps. The three methods are used jointly. The model confirmation packet is a feedback packet transmitted by the destination node immediately (1 timeslot later, to allow signals to be processed) after it receives a model calibration packet; it contains a timestamp indicating the time when the destination received the model calibration. If the source receives the confirmation packet, the source and destination have agreed on the new model and the time to begin using it, and the issue is solved. In practice, we use repetition to ensure the successful reception of the model confirmation packet since it is vital. Note that repetition cannot be directly applied to the original model calibration packet, since the destination would be confused about the exact timing to use the new model. Moreover, periodical packets containing the model calibration and current status, termed correction packets, are transmitted to avoid the following problem: when an unexpected status packet is lost, a long time could pass without any further unexpected status packet being transmitted if the model prediction is precise, in which case the model predictions at source and destination would be unaligned for a long time, resulting in severe status recovery error. In practice, the time interval between correction packets is long, e.g., 1 second, to limit the overhead. Last but not least, a useful method to counteract timing misalignment caused by hardware impairments is piggybacking a timestamp on each status packet that is transmitted OTA. By doing this, the source and destination nodes can agree on the correct status timing carried by the timestamp and thus generate aligned status predictions.
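The effect of the piggybacked timestamp can be illustrated with a small sketch. It is a simplified model under stated assumptions rather than the implemented protocol: time is counted in integer timeslots, every packet is delivered, the processing latency is a fixed number of slots, and the function name `destination_history` is hypothetical.

```python
from typing import Dict, List, Tuple

Packet = Tuple[int, float]  # (sending timeslot carried as timestamp, status value)


def destination_history(packets: List[Packet], latency: int,
                        use_timestamp: bool) -> Dict[int, float]:
    """Status history as seen by the destination.

    A packet sent at slot t is perceived at slot t + latency. Without timestamps
    the destination files the status under the perceived slot; with timestamps it
    files the status under the slot carried in the packet.
    """
    history: Dict[int, float] = {}
    for sent_slot, status in packets:
        perceived_slot = sent_slot + latency
        slot = sent_slot if use_timestamp else perceived_slot
        history[slot] = status
    return history


# Source-side view of the statuses it transmitted OTA.
source_view = {10: 1.5, 25: 1.8}
packets = sorted(source_view.items())

aligned = destination_history(packets, latency=3, use_timestamp=True)
shifted = destination_history(packets, latency=3, use_timestamp=False)
```

Here `aligned` equals the source-side history, while `shifted` places every status three slots late, so any prediction computed from it would differ from the source's.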

1 Phase 1: Election of Supervision Node
2 Every destination node is elected as a Supervision Node (SN).
3 Phase 2: Offline Single-Agent Training
4 Initialization: Source nodes: Initialize their DDPG by the normal distribution.
5 for $n=1:N$ do
6       for $m=[m_{\mathsf{min}}:m_{\mathsf{int}}:m_{\mathsf{max}}]$ do
7             DDPG training for source node-$n$ to solve the MDP expressed in (18) with given $m_{n}=m$. Afterwards, save the DDPG parameters ($\bm{w}_{m,n}$) to its database.
8      
9 Phase 3: Online Auxiliary Cost Adaptation
10 Initialization: Source nodes: Initialize their DDPG by $\bm{w}_{m_{\mathsf{max}},n}$ ($n=1,...,N$). SNs: Assign $m=m_{\mathsf{max}}$ as the initial auxiliary cost.
11 for $t=1:T$ do
12       for $n=1:N$ do
13             if the output of DDPG of source node-$n$ is $\mathsf{transmit}$ based on its state then
14                   Source node-$n$ transmits, following the underlying MAC protocol.
15            else
16                  Source node-$n$ stays silent.
17            
18      if $t\,\%\,\mathsf{evaInt}=0$ then
19             $\mathsf{cost}=\sum_{\tau=t-\mathsf{evaInt}+1}^{t}\mathcal{E}(\bm{x}_{1}(\tau),...,\bm{x}_{N}(\tau))$
20             if $\mathsf{cost}-\mathsf{costPrev}>\delta$ and the number of transmission collisions increases then
21                   $m=\min\{m+m_{\mathsf{int}},\,m_{\mathsf{max}}\}$ at the corresponding source nodes.
22             else if $|\mathsf{cost}-\mathsf{costPrev}|<\delta$ then
23                  $m$ is unchanged.
24            else
25                   $m=\max\{m-m_{\mathsf{int}},\,m_{\mathsf{min}}\}$ at the corresponding source nodes.
26            $\mathsf{costPrev}=\mathsf{cost}$
27      
Algorithm 1 SMART

III Networking of Predictive Wireless Devices for Status Update

In this section, we describe in general a networking scheme for efficient communications among ad hoc wireless devices running the parallel communication protocol. A critical issue that distinguishes this networking scheme from conventional ad hoc network problems is how to properly make transmit decisions when the decisions depend on locally observable status information, i.e., when they are status-aware. The scheme is inspired by the Whittle’s index methodology [33] and is explained as follows.

Denote $\bm{x}_{n}(t)\triangleq[\bm{s}_{n}(t),\,\hat{\bm{s}}_{n}(t)]$ as the Markov state for the $n$-th source node in the system at time $t$, wherein $\bm{s}_{n}(t)$ denotes the local status information, and $\hat{\bm{s}}_{n}(t)$ denotes the status recovery at the $n$-th destination (estimated at source nodes based on aligned status). Note that for non-Markov status evolution, a finite-length history of statuses can be approximately defined as the Markov state, which is common in practice. The overall system space is hence $\{\bm{x}_{n}(t)|n=1,...,N\}$, wherein $N$ is the total number of source nodes. Denote $\bm{u}(t)=\{u_{n}(t)|n=1,...,N\}$ as the transmit decision of sources at time $t$; $u_{n}(t)=1$ denotes transmit and silent otherwise. The Markov Decision Process (MDP) can be defined as

MDP-1:
$$\begin{aligned}\mathop{\textrm{minimize}}\limits_{\bm{u}(t)}\,\,&\limsup_{T\to\infty}\frac{1}{T}\sum_{t=0}^{T-1}\mathcal{E}(\bm{x}_{1}(t),...,\bm{x}_{N}(t))\\ \textrm{s.t.,}\,\,&\textrm{source-$n$ knows }\bm{x}_{n}(t),\,\sum_{n=1}^{N}u_{n}(t)=1,\end{aligned}$$

where $\mathcal{E}(\cdot)$ denotes a predefined error function. MDP-1 is essentially a Partially Observable MDP (POMDP) as seen by each source-destination pair, since each source can only observe its local current status. MDP-1 suffers from the curse of dimensionality when using, e.g., value iteration. In addition, as a POMDP it does not have a general solution method; multi-agent reinforcement learning techniques may be leveraged but with convergence issues [34].

Towards this end, we adopt the concept of Whittle’s index [33] to decouple MDP-1. The idea of this approach is that instead of considering all nodes simultaneously, the problem is decoupled and reduced to transmit decisions for one source node. To avoid the trivial, selfish and useless solution of always transmitting, each transmission is associated with an auxiliary cost of $m_{n}$. In other words, the decoupled MDP problem for source node-$n$ is formulated as

$$f(\bm{x}_{n})+\hat{J}^{*}=\min\left\{\begin{array}{l}\mathcal{E}^{(0)}_{\bm{x}_{n}}+\sum_{\bm{x}_{n}^{\prime}}\mathcal{P}^{(0)}_{\bm{x}_{n}\bm{x}_{n}^{\prime}}f(\bm{x}_{n}^{\prime}),\\ m_{n}+\mathcal{E}^{(1)}_{\bm{x}_{n}}+\sum_{\bm{x}_{n}^{\prime}}\mathcal{P}^{(1)}_{\bm{x}_{n}\bm{x}_{n}^{\prime}}f(\bm{x}_{n}^{\prime})\end{array}\right\},\tag{18}$$

wherein the top and bottom terms in the minimization operator represent the cost-to-go from state $\bm{x}_{n}$ onwards with the action of silent and transmit, respectively. The expected cost functions are denoted by $\mathcal{E}^{(0)}_{\bm{x}_{n}}$ and $\mathcal{E}^{(1)}_{\bm{x}_{n}}$, respectively, for the two actions (assuming the error function is decomposable, e.g., $\ell^{1}$, $\ell^{2}$ norms); the transition matrices $\mathcal{P}^{(0)}_{\bm{x}_{n}}$ and $\mathcal{P}^{(1)}_{\bm{x}_{n}}$ are denoted likewise. The relative cost-to-go function of state $\bm{x}_{n}$ and the average cost are denoted by $f(\bm{x}_{n})$ and $\hat{J}^{*}$, respectively. It has been shown that solving (18) for each source node and selecting the node with the largest index ($\max\{m_{n}|n=1,...,N\}$) to transmit leads to near-optimal performance in many problems [11, 12, 14]. Inspired by these results, we design the following scheme, as shown in Algorithm 1, which is termed SMART. Because the status evolution is unknown and no supervisor is available in the real world, we adopt a deep reinforcement learning-based approach. The status in general is comprehended using a Deep Neural Network (DNN), specifically a Deep Deterministic Policy Gradient (DDPG) approach [35], while in practice, if the status is straightforward, simpler approaches can be adopted.
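To make the structure of (18) concrete, the sketch below solves it by relative value iteration on a toy chain whose transition model is known. This is purely illustrative and is not the DDPG-based method used in SMART. All modelling choices are assumptions: the state is a scalar error level in $\{0,...,K\}$, staying silent costs the current level and raises it by one with probability `p`, and transmitting costs only the auxiliary cost `m` and resets the level to 0 (a loss-free channel). The function name `solve_decoupled_mdp` is hypothetical.

```python
def solve_decoupled_mdp(m, K=10, p=0.5, tol=1e-10, max_iter=100_000):
    """Relative value iteration for the two-action average-cost equation (18).

    Returns (f, J, policy): the relative cost-to-go per state, the average cost,
    and the action per state (0 = silent, 1 = transmit).
    """
    def q_values(f):
        q_silent, q_transmit = [], []
        for x in range(K + 1):
            nxt = min(x + 1, K)  # error level saturates at K
            # Silent: pay the current error, then move up with probability p.
            q_silent.append(x + p * f[nxt] + (1.0 - p) * f[x])
            # Transmit: pay auxiliary cost m, zero error cost, reset to level 0.
            q_transmit.append(m + 0.0 + f[0])
        return q_silent, q_transmit

    f = [0.0] * (K + 1)
    for _ in range(max_iter):
        q_silent, q_transmit = q_values(f)
        t_f = [min(a, b) for a, b in zip(q_silent, q_transmit)]
        f_new = [v - t_f[0] for v in t_f]  # normalise so that f[0] = 0
        converged = max(abs(a - b) for a, b in zip(f_new, f)) < tol
        f = f_new
        if converged:
            break

    q_silent, q_transmit = q_values(f)
    avg_cost = min(q_silent[0], q_transmit[0])  # equals J because f[0] = 0
    policy = [1 if b < a else 0 for a, b in zip(q_silent, q_transmit)]
    return f, avg_cost, policy
```

In this toy model the resulting policy is a threshold on the error level: a larger auxiliary cost pushes the threshold up, and a sufficiently large one silences the node entirely, which is the lever that the auxiliary cost provides.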

In Algorithm 1, we use $m_{\mathsf{max}}$ and $m_{\mathsf{min}}$ to denote the maximum and minimum auxiliary costs, respectively. In practice, $m_{\mathsf{max}}$ and $m_{\mathsf{min}}$ are hyper-parameters that depend on network conditions to provide adaptation capabilities inside the interval $[m_{\mathsf{min}},m_{\mathsf{max}}]$. SMART works roughly as follows. In the single-agent training phase, a DDPG is trained for each source node and for each value in a set of possible auxiliary costs $m$, as if the node were the only one updating, subject to the additional auxiliary cost. A mapping from each $m$ to the DDPG parameters is stored in the node’s database. Afterwards, in the auxiliary cost adaptation phase, the SNs (in practice, it is found that electing representative, rather than all, destination nodes as SNs is sufficient) observe the network conditions (i.e., collisions, channel idle and status values) for a period of time $\mathsf{evaInt}$ in the cost evaluation step of Algorithm 1 and thereby adjust the auxiliary cost values. When the cost increases due to congestion, the auxiliary cost should be increased to discourage source nodes from transmitting; in other cases, the auxiliary cost should be decreased, or stay the same after convergence. The corresponding source nodes switch their DDPG parameters based on their respective SNs’ feedback. SMART is also compatible with different MAC protocols. For example, nodes would transmit randomly based on the backoff window size in CSMA and C-V2X protocols. Eventually, the network converges to a situation wherein appropriate auxiliary costs are attained such that only nodes with urgent transmission needs (e.g., large status prediction error) are transmitting and the network is properly loaded. Note that the auxiliary costs can be different among nodes since each source node follows its corresponding SN’s feedback. In the following section, we will observe in detail how a system of platooning vehicles behaves under SMART and parallel communications.
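One evaluation round of the auxiliary cost adaptation can be sketched as a single function. This is a minimal reading of Phase 3 of Algorithm 1, assuming the update is clipped so that the auxiliary cost stays inside $[m_{\mathsf{min}},m_{\mathsf{max}}]$ as the text describes; the function name and argument names are hypothetical.

```python
def adapt_auxiliary_cost(m, cost, cost_prev, collisions_increased,
                         delta, m_int, m_min, m_max):
    """Return the auxiliary cost an SN would feed back after one evaluation interval."""
    if cost - cost_prev > delta and collisions_increased:
        # Cost rose because of congestion: discourage transmissions.
        return min(m + m_int, m_max)
    if abs(cost - cost_prev) < delta:
        # Converged: keep the current auxiliary cost.
        return m
    # All other cases: encourage transmissions.
    return max(m - m_int, m_min)
```

Because the offline phase only stores DDPG parameters for costs on the grid between the minimum and maximum, clipping keeps every value returned here mapped to a trained parameter set, provided the initial cost lies on that grid.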

Figure 7: A multi dense platooning scenario wherein each leader vehicle controls the corresponding platoon. The C-V2X mode-4 MAC layer is implemented on SUMO to simulate vehicular communications.
Figure 8: Performance comparisons with status-unaware scheme with the optimal update interval. The left, middle and right columns represent optimal status-unaware, parallel communication without and with correction packets, respectively. The top, middle and bottom rows represent vehicle distances from the front vehicles, status and correction packets transmissions, and status predictions at both ends. The bottom-left figure shows the velocity of the leader vehicle.
Figure 9: Minimum safe distance versus the status update interval by status-unaware update schemes.
Figure 10: Effectiveness of auxiliary cost adaptation in SMART. The number in the bracket represents platoon quantity on the road (each with 8 vehicles).

IV Case Study: Multi Dense Platooning in Steady State

In this section, we demonstrate through SUMO a Wireless Networked Control (WNC) application, namely multiple dense platooning in steady state using C-V2X communication protocol and the proposed framework (see Fig. 7). The application is selected for two reasons. First, one of the most promising 5G vertical applications is C-V2X-enabled autonomous driving and platooning is perhaps the most attractive fully autonomous driving technology; secondly, platooning represents a WNC system wherein timely (milliseconds level) and precise status update and control are essential.

Scenario and Platoon Control: Several platoons, each led by a leader vehicle, travel close to each other (within the communication range of C-V2X, which is typically 700 meters [36]) on the highway in steady state, i.e., after platoons are formed. We consider only the longitudinal drive control, that is, the vehicles travel on a straight line. In each platoon, the leader vehicle is driven by a human and the others are controlled by the leader vehicle based on control algorithms such as (11) and wireless signals. Naturally, for the leader vehicles to exert precise control, the status information they collect from their follower vehicles, including distances from the front vehicle and instantaneous velocities, should be as timely as possible. The ultimate goal of the system is to save fuel while ensuring safety, and hence each vehicle is designed to follow its front vehicle as closely as possible without crashing into it. In the simulations, we set a predefined desired distance between two vehicles and let the control algorithm maintain this desired distance; the minimum safe distance is therefore the maximum distance reduction from the desired distance during the whole trajectory. The simulation scenario is a two-way, four-lane highway with a length of 1500 meters. The length of vehicles is 5 meters. The leader vehicle enters with a speed of 10 m/s, then accelerates to 22.2 m/s in 5 seconds; from the 15-th second, the leader decelerates to 9.7 m/s in 5 seconds, and then accelerates to 22.2 m/s in 15 seconds (depicted in the bottom-left figure in Fig. 8). The acceleration of vehicles is restricted to $[-2.94,4]$ m/s$^2$ to avoid very rapid and abrupt changes in speed. The simulation time step is 1 ms. The actuation delay and sensing delay are ignored to focus on information timeliness, and hence the performance evaluation can be viewed as an optimistic bound.
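The leader's speed profile and the distance metric described above can be written down directly. The sketch assumes that each speed change is a constant-acceleration ramp and that the first ramp starts at time zero; neither detail is stated explicitly. The names `leader_speed`, `segment_accelerations` and `max_distance_reduction` are hypothetical.

```python
# (time in s, speed in m/s) breakpoints taken from the scenario description.
BREAKPOINTS = [(0.0, 10.0), (5.0, 22.2), (15.0, 22.2), (20.0, 9.7), (35.0, 22.2)]
ACCEL_LIMITS = (-2.94, 4.0)  # m/s^2


def leader_speed(t):
    """Leader speed at time t, assuming linear ramps between breakpoints."""
    if t <= BREAKPOINTS[0][0]:
        return BREAKPOINTS[0][1]
    for (t0, v0), (t1, v1) in zip(BREAKPOINTS, BREAKPOINTS[1:]):
        if t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return BREAKPOINTS[-1][1]


def segment_accelerations():
    """Constant acceleration implied by each segment of the profile."""
    return [(v1 - v0) / (t1 - t0)
            for (t0, v0), (t1, v1) in zip(BREAKPOINTS, BREAKPOINTS[1:])]


def max_distance_reduction(gaps, desired_gap):
    """Largest shortfall of the inter-vehicle gap below the desired gap."""
    return max(desired_gap - g for g in gaps)
```

Under these assumptions the implied accelerations are about 2.44, 0, -2.5 and 0.83 m/s$^2$, so the leader's own profile stays inside the stated acceleration limits.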

Network Protocol: The underlying MAC- and PHY-layer protocols that convey the status and control information are based on C-V2X Mode 4 [31]. Specifically, as shown in Fig. 7, an SPS decentralized time/frequency resource allocation scheme is adopted wherein a vehicle UE with a packet (Cooperative Awareness Message, CAM) to send chooses uniformly at random from a pool of time/frequency resources. The pool is composed of several subchannels in the frequency domain and a number of subframes whose total length is denoted by the Resource Reservation Interval (RRI). Since the selection is fully autonomous, collisions may happen, especially when the system is heavily loaded. In addition to collisions, Half-Duplex (HD) error also occurs in C-V2X Mode 4, which represents the error caused by receiving a packet while transmitting on the same subframe. In the simulations, we set the RRI to 10 ms and the number of subchannels to 2, consistent with current standards [37].
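A back-of-the-envelope view of why collisions grow with load can be obtained from the pool size alone. The sketch below is a deliberate simplification: it assumes 1 ms subframes, that every UE independently picks one resource uniformly at random per RRI, and it ignores the sensing and reservation behaviour of SPS, so it is not a model of the simulated protocol. The function names are hypothetical.

```python
SUBFRAME_MS = 1  # LTE subframe duration


def resource_pool_size(rri_ms=10, subchannels=2):
    """Number of time/frequency resources in one RRI."""
    return (rri_ms // SUBFRAME_MS) * subchannels


def collision_probability(n_ues, pool_size):
    """Probability that a given UE shares its resource with at least one other UE."""
    if n_ues < 1 or pool_size < 1:
        raise ValueError("n_ues and pool_size must be positive")
    return 1.0 - (1.0 - 1.0 / pool_size) ** (n_ues - 1)
```

With the simulation settings, `resource_pool_size()` gives 20 resources per RRI, so under this simplified model the collision probability for a tagged UE rises quickly once the number of simultaneously updating UEs becomes comparable to the pool size.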

Parallel Communications Implementation: The majority of implementation details are the same as described in Sections II and III. In particular, the status model is estimated based on a sliding-window LMS method (10). We choose the cost function $\mathcal{E}(\cdot)$ in SMART as the standard deviation from the desired distance between successive vehicles. The model calibration time interval is 500 ms.

Simulation Results: In Fig. 8, the proposed parallel communication framework is tested in comparison with the status-unaware scheme using the optimal status update interval. Status-unaware schemes represent those without knowledge of the status content. Therefore, the best they can achieve is to optimize over the update interval; we obtain the optimum by simulating over a set of status update intervals from 10 ms to 150 ms, with a step size of 10 ms, and the optimum is shown to be 40 ms (see Fig. 9). Note that optimizing the average AoI among all vehicles is equivalent to optimizing over update intervals since the statuses are generate-at-will [11]. Intuitively, the tradeoff is that a smaller update interval leads to worse congestion but lower update waiting delay, and vice versa. Comparing across the top row in Fig. 8, it is observed that the parallel communication scheme with correction packets outperforms the others significantly, showing that by using model predictions, the network load is reduced such that the status attained and the control are more timely, and hence the vehicle distance variance is much smaller. Observing the middle column (Obv. 1), it is shown that without correction packets (which are transmitted periodically and carry status information) the parallel communication scheme suffers from the issue we discussed in Section II-E. That is, after a status packet loss, the source and destination are unaligned, as can be observed in the bottom-middle figure; hence, the control is affected and the vehicle distance is kept larger than desired until the next unexpected status packet transmission. The transmissions of correction packets certainly entail overhead. From Obv. 2, it is shown that the optimal status packet update interval for status-unaware schemes is 40 ms, and the correction packet transmission interval (the same as that of model calibration packets) is 500 ms. Therefore, the additional overhead of correction packets is well worth it, since it is relatively small and useful. A key observation in Fig. 9, which shows that the optimal status update interval of status-unaware schemes is about 40 ms, is that in status update, sometimes it is better to be timely but unreliable than ultra-reliable but sacrificing timeliness. This differs from the current ultra-Reliable Low-Latency Communications (uRLLC) principle. Specifically, as shown in the figure, the status becomes stale when waiting to be updated in the out-of-date regime with large status update intervals; on the other hand, the transmission reliability is high in this regime since the network load is low and hence collisions rarely happen.
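As a rough check on the overhead argument, the two intervals quoted above can be converted to per-source packet rates and compared. This only relates the correction-packet rate to the rate of the best status-unaware scheme; it does not count the unexpected status packets that parallel communication also sends, whose rate depends on the prediction error.

$$\frac{1/(500\ \textrm{ms})}{1/(40\ \textrm{ms})}=\frac{2\ \textrm{packets/s}}{25\ \textrm{packets/s}}=8\%.$$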

The effectiveness of dynamic auxiliary cost adaptation in SMART is shown in Fig. 10. The reference design is the status-unaware scheme with the optimal update interval. SMART is tested with different initial auxiliary cost values, as shown on the $x$-axis; note that for fixed auxiliary cost schemes, the initial value never changes afterwards. It is found that the reinforcement learning-based scheme can adapt to different network conditions, as represented by different numbers of vehicles, and obtain the best performance. However, a fixed auxiliary cost scheme without considering the networking aspect is insufficient to provide robust performance.

Figure 11: Performance comparisons with status-unaware scheme. The first, second, third and fourth rows represent status-unaware, parallel communication x-axis, y-axis and z-axis, respectively. The left, middle and right columns represent the UAV coordinates, status and model calibration packets transmissions, and status predictions at both ends. The first-row-right figure shows the velocity of the leader UAV.

V Case Study: Multi UAV Formation Flight in Steady State

In this section, we demonstrate through MATLAB another Wireless Networked Control (WNC) application, namely UAV formation flight control using the C-V2X communication protocol. The application is selected for two reasons. First, unmanned aerial vehicles (UAVs) are attracting considerable attention across various areas due to their high potential and versatility; secondly, applications that need high reliability and responsiveness may impose special collision-avoidance requirements, which means that a low-latency and precise status update wireless network is required.

Scenario and UAV Formation Flight Control: One formation of 10 UAVs, led by a leader UAV, travels in close proximity in steady state. We consider full-direction flight control, that is, the UAVs travel freely in three dimensions. Each UAV can sense its absolute three-dimensional coordinates. In the formation, similar to the previous case, the leader UAV is driven by a preset algorithm and the others are controlled by the leader UAV based on control algorithms. The status information, in this case, includes coordinates and instantaneous velocities. The ultimate goal of the system is to precisely control the UAV formation flight, and hence the UAVs are designed to reach the target coordinates required by the leader UAV as closely as possible. In the simulations, we set each UAV a desired distance away from its leader in each dimension. The simulation scenario is a space where all UAVs can travel freely. The leader UAV enters with a speed of 0 m/s on all three axes, then accelerates to 0.49 m/s on the x-axis in 0.5 second, 1.715 m/s on the y-axis in 0.5 second and 0.49 m/s on the z-axis in 0.5 second; from the 0.5-th second, the leader decelerates to 0.245 m/s on the x-axis in 0.5 second, 1.0682 m/s on the y-axis in 0.5 second, and 0 m/s on the z-axis in 0.5 second (the velocity on the x-axis is depicted in the bottom-left figure in Fig. 11). The acceleration of UAVs is restricted to $[-4,4]$ m/s$^2$ to avoid very rapid and abrupt changes in speed. The simulation time step is 1 ms.

Network Protocol: The protocol is similar to that of the previous application. In this application, the resource re-selection period is set small, so each UAV re-selects the radio resource frequently, which could cause a more severe radio collision risk.

Parallel Communications Implementation: The majority of implementation details are the same as described in Section IV. In the previous application, the status includes the distance between vehicles, whereas in this application it includes the absolute coordinates. As there are three axes, two more models are added to estimate the status in the other two dimensions. The model calibration time interval is 1000 ms in this application.

Simulation Results: In Fig. 11, the proposed parallel communication framework is tested in comparison with the status-unaware scheme. The figure shows the result from 5 s to 20 s, when the model has converged. Since the status includes the absolute coordinates, the figure shows the coordinates rather than the distance. Comparing with the first row in Fig. 11, it is observed that the parallel communication scheme, by using status predictions, outperforms the status-unaware scheme when the radio resource is insufficient compared to the number of nodes. The distance between any two UAVs cannot be kept constant in the status-unaware scheme, while it can in the proposed scheme. The second row shows the performance on the x-axis. In the middle figure, the model calibration packets are transmitted every 1000 ms and only a few status packets are transmitted after the model has converged. The proposed scheme clearly reduces the radio resource cost compared to the status-unaware scheme. Since the model has converged and there are fewer radio resource collisions, the model-predicted status is close to the actual status, as seen in the right figure. The prediction error from 6000 ms to 6030 ms is about 0.005 m, which is smaller than the status transmission trigger threshold. The third row shows the performance on the y-axis, which is similar to that on the x-axis. The prediction error from 6000 ms to 6030 ms is about 0.003 m. The estimation on the y-axis works well after the model is updated. The last row shows the performance on the z-axis. The prediction error from 6000 ms to 6030 ms is about 0.002 m and, similar to the x-axis, the proposed scheme works well. The performance in this application indicates that the proposed parallel communication scheme generalizes to other status communication problems.

VI Conclusions and Outlook

In this paper, we proposed a parallel communication scheme whereby status information is communicated by both OTA packets and aligned predictions from calibrated status models. With this scheme, the wireless communication devices can adjust packet transmission frequencies based on network conditions and status predictions, such that the sampler can be communication-agnostic. The system is implemented on an SDR platform, and link-level experimental results show that the network load can be significantly reduced while maintaining low status recovery error by leveraging model predictions, namely revealing much while saying less. We also test the model on an integrated hardware-software evaluation platform for collaborative autonomous driving that we proposed. The results show that by adopting parallel communications, the channel occupancy is significantly reduced by online model predictions, while keeping the same status estimation error. In addition, SMART is proposed for networking such predictive wireless devices in an ad hoc WNC system based on a Whittle’s index-inspired reinforcement learning framework. To test the proposed approach in practice, we simulate two scenarios with promising future applications. First, we simulate a multi dense platooning application on SUMO. Then we use MATLAB to simulate a UAV formation flight control scenario. The vehicle and flight control performance in the two scenarios shows that the parallel communication scheme significantly outperforms both AoI-optimized status-unaware schemes and conventional uRLLC schemes, because the proposed scheme reduces the network load and thus improves transmission reliability with fewer packet collisions.

Future research topics include investigating more robust system identification, model estimation and time-series forecasting mechanisms for status model predictions in parallel communications. More generally, the interplay between sensing, communication, computation and control is worth studying to enable more efficient wireless applications.

References

  • [1] Z. Jiang, Z. Cao, S. Fu, F. Peng, S. Cao, S. Zhang, and S. Xu, “Revealing much while saying less: Predictive wireless for status update,” in IEEE Conf. Comput. Commun. (INFOCOM), May 2020.
  • [2] S. Kaul, R. Yates, and M. Gruteser, “Real-time status: How often should one update?,” in IEEE Conf. Comput. Commun. (INFOCOM), pp. 2731–2735, Mar 2012.
  • [3] S. Kaul, M. Gruteser, V. Rai, and J. Kenney, “Minimizing age of information in vehicular networks,” in IEEE Int. Conf. Sens., Commun., Netw. (SECON), pp. 350–358, Jun 2011.
  • [4] M. Costa, M. Codreanu, and A. Ephremides, “On the age of information in status update systems with packet management,” IEEE Trans. Inform. Theory, vol. 62, pp. 1897–1910, April 2016.
  • [5] L. Huang and E. Modiano, “Optimizing age-of-information in a multi-class queueing system,” in IEEE Int’l Symp. Info. Theory (ISIT), pp. 1681–1685, Jun 2015.
  • [6] E. Najm and R. Nasser, “The age of information: The gamma awakening,” in IEEE Int’l Symp. Info. Theory (ISIT), pp. 2574–2578, 2016.
  • [7] Y. Sun, E. Uysal-Biyikoglu, R. D. Yates, C. E. Koksal, and N. B. Shroff, “Update or wait: How to keep your data fresh,” IEEE Trans. Inform. Theory, vol. 63, pp. 7492–7508, Nov 2017.
  • [8] A. Arafa, J. Yang, S. Ulukus, and H. V. Poor, “Age-minimal online policies for energy harvesting sensors with incremental battery recharges,” in Information Theory and Applications Workshop (ITA), pp. 1–10, Feb 2018.
  • [9] Y. Inoue, H. Masuyama, T. Takine, and T. Tanaka, “The stationary distribution of the age of information in FCFS single-server queues,” in IEEE Int’l Symp. Info. Theory (ISIT), pp. 571–575, June 2017.
  • [10] Z. Jiang and S. Zhou, “Status from a random field: How densely should one update?,” in IEEE Int’l Symp. Info. Theory (ISIT), 2019.
  • [11] I. Kadota, A. Sinha, E. Uysal-Biyikoglu, R. Singh, and E. Modiano, “Scheduling policies for minimizing age of information in broadcast wireless networks,” IEEE/ACM Trans. Netw., vol. 26, pp. 2637–2650, Dec. 2018.
  • [12] Y. Hsu, “Age of information: Whittle index for scheduling stochastic arrivals,” in IEEE Int’l Symp. Info. Theory (ISIT), pp. 2634–2638, Jun. 2018.
  • [13] Z. Jiang, B. Krishnamachari, X. Zheng, S. Zhou, and Z. Niu, “Timely status update in wireless uplinks: Analytical solutions with asymptotic optimality,” IEEE Internet of Things Journal, vol. 6, pp. 3885–3898, Apr 2019.
  • [14] Z. Jiang, B. Krishnamachari, S. Zhou, and Z. Niu, “Can decentralized status update achieve universally near-optimal age-of-information in wireless multiaccess channels?,” in International Teletraffic Congress (ITC 30), vol. 01, pp. 144–152, Sep. 2018.
  • [15] J. Sun, Z. Jiang, B. Krishnamachari, S. Zhou, and Z. Niu, “Closed-form Whittle’s index-enabled random access for timely status update,” IEEE Trans. Commun., to appear.
  • [16] R. D. Yates and S. K. Kaul, “Status updates over unreliable multiaccess channels,” in IEEE Int’l Symp. Info. Theory (ISIT), pp. 331–335, Jun 2017.
  • [17] A. Kosta, N. Pappas, A. Ephremides, and V. Angelakis, “Age of information performance of multiaccess strategies with packet management,” arXiv preprint arXiv:1812.09201, 2018.
  • [18] A. Maatouk, M. Assaad, and A. Ephremides, “Minimizing the age of information in a CSMA environment,” arXiv preprint arXiv:1901.00481, 2019.
  • [19] R. Talak, S. Karaman, and E. Modiano, “Distributed scheduling algorithms for optimizing information freshness in wireless networks,” in IEEE Int. Workshop Signal Process. Adv. Wireless Commun. (SPAWC), pp. 1–5, Jun 2018.
  • [20] R. Talak, S. Karaman, and E. Modiano, “Optimizing information freshness in wireless networks under general interference constraints,” in ACM Int. Symp. Mobile Ad Hoc Netw. Comput. (MobiHoc), pp. 61–70, 2018.
  • [21] Z. Jiang, S. Fu, S. Zhou, Z. Niu, S. Zhang, and S. Xu, “AI-assisted low information latency wireless networking,” IEEE Wireless Commun., to appear.
  • [22] B. Yin, S. Zhang, and Y. Cheng, “Application-oriented scheduling for optimizing the age of correlated information: A deep reinforcement learning based approach,” IEEE Internet of Things Journal, pp. 1–1, 2020.
  • [23] Y. Sun, Y. Polyanskiy, and E. Uysal-Biyikoglu, “Remote estimation of the Wiener process over a channel with random delay,” in IEEE Int’l Symp. Info. Theory (ISIT), pp. 321–325, Jun 2017.
  • [24] T. Z. Ornee and Y. Sun, “Sampling for remote estimation through queues: Age of information and beyond,” arXiv preprint arXiv:1902.03552, 2019.
  • [25] C. Kam, S. Kompella, G. D. Nguyen, J. E. Wieselthier, and A. Ephremides, “Towards an effective age of information: Remote estimation of a Markov source,” in IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), pp. 367–372, Apr. 2018.
  • [26] A. Kosta, N. Pappas, A. Ephremides, and V. Angelakis, “Age and value of information: Non-linear age case,” in IEEE Int’l Symp. Info. Theory (ISIT), pp. 326–330, Jun 2017.
  • [27] O. Ayan, M. Vilgelm, M. Klügel, S. Hirche, and W. Kellerer, “Age-of-information vs. value-of-information scheduling for cellular networked control systems,” in ACM/IEEE International Conference on Cyber-Physical Systems, pp. 109–117, 2019.
  • [28] Z. Jiang, S. Zhou, Z. Niu, and Y. Cheng, “A unified sampling and scheduling approach for status update in wireless multiaccess networks,” in IEEE Conf. Comput. Commun. (INFOCOM), May 2019.
  • [29] E. Najm, R. Nasser, and E. Telatar, “Content based status updates,” in IEEE Int’l Symp. Info. Theory (ISIT), pp. 2266–2270, Jun 2018.
  • [30] Online; http://www.ni.com; accessed 20-Jul-2019.
  • [31] R. Molina-Masegosa and J. Gozalvez, “LTE-V for sidelink 5G V2X vehicular communications: A new 5G technology for short-range vehicle-to-everything communications,” IEEE Veh. Technol. Mag., vol. 12, pp. 30–39, Dec 2017.
  • [32] V. Vukadinovic, K. Bakowski, P. Marsch, I. D. Garcia, H. Xu, M. Sybis, P. Sroka, K. Wesolowski, D. Lister, and I. Thibault, “3GPP C-V2X and IEEE 802.11p for vehicle-to-vehicle communications in highway platooning scenarios,” Ad Hoc Networks, vol. 74, pp. 17–29, 2018.
  • [33] R. R. Weber and G. Weiss, “On an index policy for restless bandits,” Journal of Applied Probability, vol. 27, no. 3, pp. 637–648, 1990.
  • [34] L. Peshkin, K.-E. Kim, N. Meuleau, and L. P. Kaelbling, “Learning to cooperate via policy search,” in Proceedings of the Conference on Uncertainty in Artificial Intelligence, pp. 489–496, 2000.
  • [35] D. Silver, G. Lever, N. Heess, T. Degris, D. Wierstra, and M. Riedmiller, “Deterministic policy gradient algorithms,” in International Conference on Machine Learning (ICML), pp. 387–395, 2014.
  • [36] Qualcomm, The path to 5G: Cellular Vehicle-to-Everything (C-V2X), https://www.qualcomm.com/documents/path-5g-cellular-vehicle-everything-c-v2x.
  • [37] 3GPP TS 36.300: “Evolved universal terrestrial radio access (E-UTRA) and evolved universal terrestrial radio access network (E-UTRAN); overall description; stage 2”.