A Generative Learning Approach for Spatio-temporal Modeling in Connected Vehicular Network
Abstract
Spatio-temporal modeling of wireless access latency is of great importance for connected vehicular systems. The quality of the modeling results relies heavily on the number and quality of samples, which can vary significantly with the sensor deployment density as well as the traffic volume and density. This paper proposes LaMI (Latency Model Inpainting), a novel framework to generate a comprehensive spatio-temporal model of the wireless access latency of connected vehicles across a wide geographical area. LaMI adopts the idea of image inpainting and synthesis and reconstructs missing latency samples with a two-step procedure. In particular, it first discovers the spatial correlation between samples collected in various regions using a patching-based approach and then feeds the original and highly correlated samples into a Variational Autoencoder (VAE), a deep generative model, to create latency samples whose probability distribution is similar to that of the original samples. Finally, LaMI establishes the empirical PDF of the latency performance and maps the PDFs into the confidence levels of different vehicular service requirements. Extensive performance evaluation has been conducted using real traces collected in a commercial LTE network on a university campus. Simulation results show that our proposed model can significantly improve the accuracy of latency modeling, especially compared to existing popular solutions such as interpolation and nearest neighbor-based methods.
Index Terms:
Connected vehicle, latency modeling, Variational Autoencoder, C-V2X
I Introduction
Fog computing-supported smart vehicular systems are gaining strong momentum due to their capability of offering computationally intensive services that exceed the capacity of in-vehicle computers. According to [1], by 2020, there will be 153.6 billion connected vehicles in total on the road globally, creating a market of over 122.51 billion US dollars. Recognizing fog computing as a new way to create business opportunities and increase revenues, major telecommunication mobile network operators (MNOs) throughout the world are now investing heavily in upgrading their existing wireless network infrastructure with fog computing. One of the key challenges for such a system is to maintain ultra-low latency and ultra-high availability between connected vehicles and fog computing networks [2].
Due to the random nature of wireless networks, the latency performance of connected vehicles can vary significantly at different times and locations. In particular, recent results [3] as well as our own measurements have shown that the round-trip time (RTT) varies substantially from second to second even at a fixed location with direct Line-of-Sight (LoS) connectivity to the base station (e.g., eNB). In other words, it is generally impossible to guarantee a deterministic latency value for a wireless connection in most connected vehicular systems. In addition, the fact that different vehicles may request different sets of services when driving into different locations further exacerbates the challenge. It is therefore important to establish a spatio-temporal statistical model for wireless connections between moving vehicles and a fog computing network. Each connected vehicle can use this model to recover and predict the latency performance when driving to different locations. This also creates opportunities for an autonomous vehicle to warn human drivers to take over control of the vehicle when the performance of the wireless connection fails to meet the requirement of the requested service at a certain location.
Spatio-temporal modeling of wireless access latency is known to be a notoriously difficult problem for the following reasons. First, long-term data measurement and analysis across a wide geographical area is costly and generally difficult. Second, the number and quality of samples directly affect the accuracy of the derived model. In particular, when the number of samples is insufficient, an occasional bias can lead to a serious error. It is generally impossible to collect a sufficient number of samples with the same quality across wide temporal and spatial dimensions. Also, the number of samples required to train a model with satisfactory performance can vary significantly under different scenarios and services. The gap between the insufficient number of samples that can be collected with limited time and cost and the large amount of data required for training an accurate model needs to be filled. Finally, synthesizing a large number of samples across a wide range of time and space is notoriously difficult. Currently, there is still no simple and accurate model for modeling and predicting the latency in wireless networks.
In this paper, we introduce LaMI (Latency Model Inpainting), a novel model inpainting framework to generate a complete spatio-temporal model that can capture the statistical features of the latency performance of a connected vehicular system across a wide geographical area. LaMI first collects latency performance traces from vehicles driving throughout the considered area for a certain period of time. Due to the different location popularity and population densities, the number of samples recorded during data collection may vary significantly across locations, and at some locations it is not sufficient to generate the empirical latency PDF with the required accuracy. To address this issue, LaMI adopts the concept of inpainting from image processing: the spatial correlation of the empirical PDFs across different locations is evaluated, and samples collected in highly correlated regions are carefully selected and duplicated into each other. For those regions that still cannot collect sufficient numbers of samples, we propose a VAE-based approach to generate new samples according to the unknown probability distribution of the existing samples. Finally, LaMI fits the histogram of the samples collected and generated at each location and evaluates the latency and reliability performance of the different services that can be provided to each connected vehicle.
Extensive simulations have been conducted based on the latency data collected in a commercial LTE network on a university campus to evaluate the performance of LaMI. Results show that LaMI can significantly improve the recovery accuracy of spatio-temporal modeling of a vehicular network, especially for latency-demanding services. In particular, compared to interpolation and nearest neighbor approaches, LaMI can significantly improve the accuracy of the empirical latency PDF. Furthermore, LaMI can be directly extended to other network models with limited or unbalanced samples across wide temporal and geographical ranges.
II Related Work
Connected Vehicle Networks: Connected vehicular networks, especially C-V2X, have attracted significant interest in both academia and industry [4, 5, 6, 7, 8]. More specifically, a multilevel information fusion approach to better process atomic messages was proposed in [5]. In [6], the authors focused on collision avoidance to improve vehicular safety. Readers can refer to [4] for a detailed survey of the typical use cases, methodologies, and service level requirements of C-V2X.
Latency modeling: Most existing works focused on how to ensure the latency experienced by a connected vehicle below a deterministic threshold when driving into different locations[7]. Therefore, modeling the probability distribution function of latency is critically important. In [3], the authors proposed Adaptivefog, a novel framework to maximize confidence levels in LTE-based fog computing for smart vehicles. A new scheduler was proposed in [9] for mitigating the latency in vehicular systems. In [10], the authors modeled some key characteristics for wireless channels in connected vehicular networks.
Generative modeling: Deep generative models have already exhibited great potential in generating different kinds of complicated datasets. In [11], the authors gave a comprehensive introduction on a popular deep generative model, named VAE. In [12], the authors presented many applications based on VAE, such as generating undistinguished house numbers [13] and high-resolution photographic images [14].
III Architecture Overview
A fog computing-supported vehicular system consists of the following elements[2]:
1. Vehicles: connected to the wireless communication network to obtain different in-vehicle functions, with different latency requirements for different service requests.
2. Wireless access points (APs): part of the network infrastructure of a mobile network operator (MNO). They provide the wireless communication links between vehicles and fog nodes. In this paper, we consider the commercial LTE connection offered by an MNO to transmit workload requests and processing results between vehicles and fog nodes.
3. Fog nodes: low-cost edge computing servers that can support local computation-intensive services for connected vehicles.
In this paper, we focus on the spatio-temporal statistical modeling of a connected smart vehicular system in which each vehicle must always be able to evaluate and generate its latency performance to the nearest fog node while driving to different locations. The accurate evaluation and generation of the latency performance is critically important for smart vehicular systems that rely on services offered by fog computing networks to make driving decisions. In particular, a vehicle may request different sets of services when driving to different locations [4].

In this paper, we assume a connected vehicle will only rely on the fog computing network to make its driving decision if the wireless connection between it and the chosen fog node can satisfy these requirements. Otherwise, the vehicle will either switch to a conservative driving mode relying on the on-board sensors or simply return the full control of the vehicle back to the human driver. In Fig. 1, we illustrate the possible services and the corresponding performance requirements as well as the empirical latency Cumulative Distribution Function (CDF) generated by our own measured dataset.
We propose a novel framework, called LaMI, to establish a complete spatio-temporal stochastic model that captures granular details of the latency performance of different services for a connected smart vehicle driving across a wide geographical area. In particular, LaMI does not require collecting sufficient samples at every individual location and time segment. It takes advantage of the spatial correlation of the latency performance across different locations and adopts a deep generative neural network to generate new samples for under-sampled locations. LaMI consists of four main modules, as shown in Fig. 2.

Trace collection: Samples related to the latency performance of connected vehicles must first be collected throughout the considered area. Generally speaking, the number of samples collected at different locations during different time periods can vary significantly due to the randomness of human driving behavior and different population distributions.
Patching: To estimate the latency performance at locations with insufficient numbers of samples or without samples, LaMI searches for regions that have similar latency performance so that samples can be shared across these regions.
Synthesizing: For the locations that still lack enough samples, we adopt a deep generative neural network to learn an approximate probability distribution from the existing samples and generate new samples from the learned distribution.
Modeling: An empirical spatial-temporal stochastic model will be established according to the collected as well as generated samples. The established model can be used to estimate the confidence levels of different supported services when driving to each specific location as well as further improve the prediction accuracy of the latency performance as shown in Fig. 2.
IV Methodology
IV-A Trace Collection
IV-A1 Data Description
We evaluate the service latency by measuring the round-trip time (RTT) between a connected vehicle and its fog server. We have developed a trace recording app called Delay Explorer using the Android API to periodically ping the IP addresses of a cloud server and a fog node and record their RTTs every 500 ms. We follow a commonly adopted setting and assume that the fog server is located at the serving gateway (S-GW). According to [15], the S-GW is a more suitable location for installing computing servers than the eNodeB (eNB) or Remote Radio Unit (RRU). This is because fog servers usually have high power consumption and require a stable power supply, which is typically unavailable at eNBs and RRUs. In addition to the RTT, Delay Explorer also records time stamps, GPS coordinates, driving speed, network connection types, etc. We used several smartphones running Delay Explorer, which cumulatively collected more than 760,000 samples on a university campus in Wuhan, China. The trace consists of recordings from both fixed locations and driving on roads. Detailed information about the trace is given in Table I.
Location | a University Campus |
---|---|
Number of samples | 760,000 |
Duration of measurement | Mar 16, 2019-Apr 7, 2019 |
Size of the campus | 1153 acres |
Status | Empirical PDF | Mean(ms) | STD(ms) | Median(ms) |
---|---|---|---|---|
Fixed | (histogram) | 43.9350 | 24.7486 | 37.2690 |
Moving | (histogram) | 43.6104 | 28.9172 | 29.1490 |
IV-A2 Preliminary Results and Observations
Typically, the latency performance of a connected smart vehicle can be affected by its driving speed, the base station location, and the geographical environment (e.g., blockage). As a result, it varies significantly across locations and over time. We present the histogram, mean, standard deviation (STD), and median values of the collected trace from both fixed locations and driving recordings in Table II. We can observe that the mobility of a vehicle has a very limited influence on the average RTT, but can lead to a much bigger increase in the STD. The collected traces also imply that the temporal and spatial correlation between specific RTT values is generally very small. Even at the same location, there is no observable correlation between two consecutive samples recorded 0.5 s apart by the same vehicle. However, the PDF of RTTs recorded at each specific location over a certain time duration can be treated as stationary, as shown in Fig. 3.

In addition, the PDF of RTTs always follows a bimodal Gaussian mixture distribution. We generate an empirical PDF at a fixed location in Wuhan, China with different numbers of RTT recordings by creating histograms of the samples and fitting them with Gaussian kernel functions, as shown in Fig. 4. We can observe that the empirical PDF of RTTs follows a Gaussian mixture model with two major components centered at around 25 ms and 105 ms. This observation aligns with recent studies [9, 15] as well as our previous work [3], which suggest that the two modes may be caused by the scheduling request (SR) channel periodicity and the hybrid automatic repeat request (HARQ) retransmission delay. To evaluate the impact of under-sampling on the accuracy of the predicted latency performance, we use the weighted KR distance [3] as the main metric to quantify the difference between two empirical PDFs generated with different numbers of samples. The weighted KR distance between two empirical PDFs whose CDFs are $F_1$ and $F_2$ is defined as:
$$D_{\mathrm{KR}}(F_1, F_2) = \int w(x)\,\lvert F_1(x) - F_2(x)\rvert\,\mathrm{d}x \qquad (1)$$
where $\lvert F_1(x) - F_2(x)\rvert$ is the absolute value of the difference between the two CDF values at $x$. The weight $w(x)$ can adjust the importance of different latency values when comparing different CDFs, which makes the KR distance more flexible in dealing with diverse service requirements.
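As an illustration of the metric described above, the following Python sketch approximates a weighted KR distance between the empirical CDFs of two sample sets. The function names `ecdf` and `weighted_kr_distance`, the default weight of 1, and the evaluation of the weight at interval midpoints are assumptions made for this example; it is not the implementation used in our measurements.

```python
from bisect import bisect_right


def ecdf(samples):
    """Return a function F(x) giving the empirical CDF of `samples`."""
    data = sorted(samples)
    n = len(data)

    def F(x):
        # Fraction of samples that are <= x.
        return bisect_right(data, x) / n

    return F


def weighted_kr_distance(a, b, weight=lambda x: 1.0):
    """Approximate the integral of weight(x) * |F_a(x) - F_b(x)| over x.

    Both empirical CDFs are step functions that only change at sample
    values, so |F_a - F_b| is constant between consecutive sample values.
    The weight is evaluated at the midpoint of each interval, which is
    exact for a constant weight and an approximation otherwise.
    """
    F_a, F_b = ecdf(a), ecdf(b)
    grid = sorted(set(a) | set(b))
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        mid = 0.5 * (left + right)
        total += weight(mid) * abs(F_a(left) - F_b(left)) * (right - left)
    return total
```

For example, `weighted_kr_distance([20.0, 30.0], [20.0, 100.0])` evaluates to 35.0: the two CDFs differ by 0.5 over the 70 ms between 30 ms and 100 ms. A weight such as `lambda x: 2.0 if x > 90.0 else 1.0` (a hypothetical choice) would emphasize disagreement in the high-latency tail.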
In Fig. 4(a), we present the Gaussian kernel function fitting result of 10 samples randomly selected from the total 40,000 samples. In Fig. 4(b), we present the same fitting for all 40,000 samples. In Fig. 4(c), we further compare the KR distance between the empirical PDF generated with different numbers of samples and that generated with all 40,000 samples collected at a fixed location. We can observe that a larger number of samples yields an empirical PDF with higher accuracy. In particular, as the number of samples increases, the KR distance first decreases rapidly and then decreases more slowly.
We can also observe that the number of samples collected at different locations is strongly imbalanced, with fewer samples collected in remote areas and a much higher density of samples recorded in urban areas. In Fig. 5, we use different shades to show the number of samples collected at each location for all four traces.
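The trend described above can be reproduced qualitatively with a small experiment on synthetic data. The sketch below is only an illustration: the mixture weights and spreads, the 4,000-sample pool (smaller than the 40,000 measured samples, to keep the run short), and the helper names are all assumptions; only the component centers of about 25 ms and 105 ms come from our observations.

```python
import random
from bisect import bisect_right


def kr_distance(a, b):
    """Unweighted KR distance between the empirical CDFs of two sample lists."""
    a, b = sorted(a), sorted(b)
    grid = sorted(set(a) | set(b))
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        gap = abs(bisect_right(a, left) / len(a) - bisect_right(b, left) / len(b))
        total += gap * (right - left)
    return total


def synthetic_rtt(rng):
    """Draw one RTT (ms) from an assumed two-component Gaussian mixture."""
    if rng.random() < 0.8:                      # assumed mixing weight
        return max(0.0, rng.gauss(25.0, 5.0))   # assumed spread of the first mode
    return max(0.0, rng.gauss(105.0, 10.0))     # assumed spread of the second mode


rng = random.Random(0)
full = [synthetic_rtt(rng) for _ in range(4000)]

# For each subset size, average the distance to the full set over 20 random subsets.
for n in (10, 100, 1000, 4000):
    runs = [kr_distance(rng.sample(full, n), full) for _ in range(20)]
    print(n, round(sum(runs) / len(runs), 3))
```

When the subset contains every sample, the distance is zero by construction; for small subsets the averaged distance is expected to be noticeably larger, mirroring the behavior reported for the measured trace.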




IV-B Patching
As mentioned before, in some sparsely populated areas, it is difficult to record enough samples to establish an accurate empirical PDF for latency performance prediction. Most existing solutions estimate the performance of the under-sampled region directly from the samples or performance of its nearest neighbors. In this paper, we adopt the concept of inpainting from image processing and propose a global searching and patching approach. In this approach, we first search for areas whose PDFs have the smallest weighted KR distance to the under-sampled region, and then patch the under-sampled region with samples from the chosen areas.
Our approach is based on semantic image inpainting and synthesis methods that have been successfully used in fixing and reconstructing images with missing regions, which is also known as exemplar-based image inpainting. In this approach, if one or more patches are identified as similar to the missing or damaged part of the image, chunks of these patches will be copied and pasted to reconstruct the missing region.
Unfortunately, exemplar-based image inpainting approaches cannot be directly applied to complete the spatio-temporal model of latency performance due to the following reasons:
1. Due to the inaccuracy of GPS locations, it is generally impossible to identify the nearest neighbors of the under-sampled locations. In other words, $k$-nearest neighbor-based approaches cannot be directly applied to reconstruct the latency traces for the under-sampled area.
2. Instead of recovering individual pixel values for the missing regions as in image processing, our problem requires reconstructing the PDF for each location. As mentioned before, the RTT is a stochastic value that varies sharply at different times and locations. We focus on reconstructing the empirical PDF of the RTT for a connected smart vehicle. In other words, Euclidean distance-based metrics cannot be used to evaluate the similarity of different regions (e.g., patches).
3. Driving latency can be affected by many external factors such as driving speed, weather, and the locations of network infrastructure, all of which should be considered when comparing the similarity between different patches.
To address the above issues, we propose a novel patching approach based on the statistical distance between regions. In particular, to address issue 1), we divide the entire area of sample collection into a number of equal-sized regions and assume that vehicles experience the same PDF when driving within each region. To address issue 2), we measure the similarity of different regions using a statistical distance metric between their PDFs. In particular, suppose $\mathcal{X}_i$ and $\mathcal{X}_j$ are the sets of samples in regions $i$ and $j$, respectively. In this paper, we consider the weighted KR distance metric with the weight factor $w(x) = 1$ to evaluate the similarity between two regions $i$ and $j$. To address issue 3), we introduce a weighting factor, denoted as $\alpha_{ij}$, to capture the similarity of environmental elements such as surrounding buildings, types of roads (highway or in-city road), and other features between the two regions. Suppose region $i$ is the region without a sufficient number of samples and can be patched with the samples from region $j$. In this case, the value of $\alpha_{ij}$ is used to calculate the number of samples that will be duplicated from region $j$ to patch region $i$.
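Algorithm 1
Input: Objective region $i$
if number of samples in $\mathcal{X}_i$ is less than $N_{th}$:
    Search for candidate patches:
    Find the region $j^*$: $j^* = \arg\min_{j \neq i} D_{\mathrm{KR}}\big(F(\mathcal{X}_i \cup \mathcal{N}_i), F(\mathcal{X}_j \cup \mathcal{N}_j)\big)$, where $F(\cdot)$ is the empirical CDF generated with the given samples;
    Duplicate a portion of $\mathcal{X}_{j^*}$ to $\tilde{\mathcal{X}}_i$. The portion size is evaluated according to $\alpha_{ij^*}$;
    Update $\mathcal{X} = \mathcal{X} - \mathcal{X}_{j^*}$;
else:
    Do not search, return $\mathcal{X}_i$;
Output: Updated $\tilde{\mathcal{X}}_i$;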
In Algorithm 1, we try to recover and improve the empirical PDFs for regions with fewer than $N_{th}$ samples. Let $\mathcal{X}$ be the set of all the samples collected in the $K$ regions. Suppose we try to recover the empirical PDF of region $i$. Let $\mathcal{X}_i$ be the set of samples in region $i$, and let $\mathcal{N}_i$ be the combination of all the samples in the nearest neighboring regions of region $i$. A global search for candidate patches, i.e., regions whose PDFs are similar to that of region $i$, is then performed by evaluating the KR distance between the combination of $\mathcal{X}_i$ and $\mathcal{N}_i$ and all the other similar combinations of regions throughout the entire area. We use $\tilde{\mathcal{X}}_i$ to denote the updated set of samples, including the original samples and the samples duplicated from the candidate patches.
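The search-and-patch step of Algorithm 1 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not our implementation: regions are identified by hypothetical dictionary keys, the unweighted KR distance is used, a single best-matching donor region is selected, and the similarity factor is read from a hypothetical `alpha` dictionary that defaults to 1 (copy all donor samples) when no value is given.

```python
import random
from bisect import bisect_right


def kr_distance(a, b):
    """Unweighted KR distance between the empirical CDFs of two sample lists."""
    a, b = sorted(a), sorted(b)
    grid = sorted(set(a) | set(b))
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        gap = abs(bisect_right(a, left) / len(a) - bisect_right(b, left) / len(b))
        total += gap * (right - left)
    return total


def pooled(region, regions, neighbors):
    """Samples of a region combined with those of its neighboring regions."""
    samples = list(regions[region])
    for nb in neighbors.get(region, []):
        samples.extend(regions[nb])
    return samples


def patch_region(target, regions, neighbors, min_samples, alpha, rng):
    """Return the samples of `target`, patched from the most similar region.

    regions:   dict mapping region id -> list of RTT samples (ms)
    neighbors: dict mapping region id -> list of neighboring region ids
    alpha:     dict mapping (target, donor) -> fraction of donor samples to copy
    """
    own = list(regions[target])
    if len(own) >= min_samples:
        return own  # enough samples: do not search
    reference = pooled(target, regions, neighbors)
    candidates = [r for r in regions if r != target and regions[r]]
    if not reference or not candidates:
        return own  # nothing to compare against
    # Global search: pick the region whose pooled samples are closest in KR distance.
    donor = min(
        candidates,
        key=lambda r: kr_distance(reference, pooled(r, regions, neighbors)),
    )
    share = alpha.get((target, donor), 1.0)  # assumed default: copy everything
    count = max(0, min(len(regions[donor]), int(share * len(regions[donor]))))
    return own + rng.sample(regions[donor], count)
```

A call such as `patch_region("A", regions, neighbors, 5, {("A", "B"): 0.5}, random.Random(1))` returns the original samples of region `"A"` followed by half of the samples of the selected donor, provided that `"B"` is the closest region.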
IV-C Synthesizing
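There are locations that still do not have sufficient numbers of samples to generate an accurate empirical PDF. In this case, we adopt the VAE, a deep generative neural network approach, to generate more new samples. Compared to other neural network-based approaches such as regression, VAEs are generative models that can output new random samples that have a similar distribution to the input data, i.e., the samples collected or patched at a given location. In this paper, we focus on generating new samples that have a similar distribution to the original and patched samples $\tilde{\mathcal{X}}_i$ at location $i$. To achieve this, the VAE introduces a latent variable that can influence the model to generate new artificial samples that are similar to the existing samples in $\tilde{\mathcal{X}}_i$. We use $\boldsymbol{z}$ to denote the vector of latent variables that are sampled according to the PDF $P(\boldsymbol{z})$. Our main objective is to find the PDF of samples in location $i$, defined as $P(x)$ for $x \in \tilde{\mathcal{X}}_i$. Following the law of total probability, we can write the relationship between $P(x)$ and $P(\boldsymbol{z})$ as follows:
$$P(x) = \int P(x \mid \boldsymbol{z})\, P(\boldsymbol{z})\, \mathrm{d}\boldsymbol{z} \qquad (2)$$
To infer the distribution $P(\boldsymbol{z} \mid x)$, we adopt the classical variational inference approach to approximate the real distribution using some typical distributions, e.g., the Gaussian distribution. In particular, we introduce a Gaussian distribution function $Q(\boldsymbol{z} \mid x)$ and try to infer $P(\boldsymbol{z} \mid x)$ from $Q(\boldsymbol{z} \mid x)$ by minimizing the KL divergence between these two distributions. According to its definition, the KL divergence between $Q(\boldsymbol{z} \mid x)$ and $P(\boldsymbol{z} \mid x)$ is given by
$$\begin{aligned} D_{\mathrm{KL}}\big[Q(\boldsymbol{z} \mid x) \,\|\, P(\boldsymbol{z} \mid x)\big] &\overset{(A)}{=} \mathbb{E}_{\boldsymbol{z} \sim Q}\big[\log Q(\boldsymbol{z} \mid x) - \log P(\boldsymbol{z} \mid x)\big] & (3)\\ &\overset{(B)}{=} \mathbb{E}_{\boldsymbol{z} \sim Q}\big[\log Q(\boldsymbol{z} \mid x) - \log P(x \mid \boldsymbol{z}) - \log P(\boldsymbol{z})\big] + \log P(x) & (4) \end{aligned}$$
where the equality (A) follows from the definition of the KL divergence and (B) follows from Bayes' rule together with the fact that the expectation over $\boldsymbol{z}$ does not affect $\log P(x)$, so we can move $\log P(x)$ outside of the expectation. The right-hand side of (4) contains another KL divergence, and we have
$$\log P(x) - D_{\mathrm{KL}}\big[Q(\boldsymbol{z} \mid x) \,\|\, P(\boldsymbol{z} \mid x)\big] = \mathbb{E}_{\boldsymbol{z} \sim Q}\big[\log P(x \mid \boldsymbol{z})\big] - D_{\mathrm{KL}}\big[Q(\boldsymbol{z} \mid x) \,\|\, P(\boldsymbol{z})\big] \qquad (5)$$
Note that (5) involves two distribution functions: $Q(\boldsymbol{z} \mid x)$, which projects samples into a latent variable space, and $P(x \mid \boldsymbol{z})$, which generates new samples for a given latent variable $\boldsymbol{z}$. We can then adopt the autoencoder to train a neural network with $Q(\boldsymbol{z} \mid x)$ as the encoder net and $P(x \mid \boldsymbol{z})$ as the decoder net.
Another way to interpret equation (5) is that VAEs try to find $\log P(x)$ under the error $D_{\mathrm{KL}}[Q(\boldsymbol{z} \mid x) \,\|\, P(\boldsymbol{z} \mid x)]$. Since this error is non-negative, VAEs in effect maximize a lower bound of $\log P(x)$, i.e., we have
$$\log P(x) \geq \mathbb{E}_{\boldsymbol{z} \sim Q}\big[\log P(x \mid \boldsymbol{z})\big] - D_{\mathrm{KL}}\big[Q(\boldsymbol{z} \mid x) \,\|\, P(\boldsymbol{z})\big] \qquad (6)$$
As shown in [11], this maximized lower bound is a good approximation of $\log P(x)$, especially considering that the exact distribution $P(x)$ is generally intractable.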
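To make the two terms of the variational lower bound concrete, the sketch below evaluates them for a single RTT sample with a scalar latent variable. It assumes a standard normal prior P(z), a Gaussian encoder output Q(z|x) described by a mean and a log-variance, and a Gaussian observation model around the decoder output; the function names, the `decode` callable, and the observation standard deviation `obs_std` are hypothetical choices for illustration. No network training is performed here.

```python
import math
import random


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, exp(log_var)) || N(0, 1) ) for a scalar latent variable."""
    return 0.5 * (mu * mu + math.exp(log_var) - 1.0 - log_var)


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1)."""
    return mu + math.exp(0.5 * log_var) * rng.gauss(0.0, 1.0)


def gaussian_log_likelihood(x, mean, std):
    """log N(x; mean, std^2), used here as a stand-in for log P(x | z)."""
    return -0.5 * math.log(2.0 * math.pi * std * std) - (x - mean) ** 2 / (2.0 * std * std)


def elbo_estimate(x, mu, log_var, decode, rng, draws=64, obs_std=5.0):
    """Monte Carlo estimate of E_Q[log P(x|z)] - KL(Q(z|x) || P(z))."""
    recon = 0.0
    for _ in range(draws):
        z = reparameterize(mu, log_var, rng)
        recon += gaussian_log_likelihood(x, decode(z), obs_std)
    return recon / draws - kl_to_standard_normal(mu, log_var)


# Hypothetical linear decoder mapping the latent variable to an RTT in ms.
decode = lambda z: 25.0 + 10.0 * z
print(elbo_estimate(27.0, 0.2, -1.0, decode, random.Random(0)))
```

In a trained VAE, the encoder network would produce `mu` and `log_var` from each sample and the decoder network would replace `decode`; training adjusts both so that the average of this estimate over the samples is as large as possible.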
IV-D Modeling
After creating a sufficient number of samples, an empirical PDF can be established by fitting the histogram of the RTTs using the Gaussian kernel function. The empirical PDF can be used to estimate the confidence levels for different services.
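As a minimal sketch of the fitting step, the following Python code builds a Gaussian kernel density estimate from a list of RTT samples. The bandwidth is left as a parameter because our fitting settings are not specified here; the helper `rule_of_thumb_bandwidth` uses the common normal-reference rule (1.06 times the standard deviation times n to the power of -1/5), which is an assumption of this example and may oversmooth a bimodal distribution.

```python
import math
import statistics


def gaussian_kde(samples, bandwidth):
    """Return pdf(x): the average of Gaussian kernels centered at the samples."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def pdf(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return pdf


def rule_of_thumb_bandwidth(samples):
    """Normal-reference bandwidth: 1.06 * std * n^(-1/5)."""
    return 1.06 * statistics.stdev(samples) * len(samples) ** (-0.2)


# Hypothetical RTT samples (ms) with one cluster near 25 ms and one near 105 ms.
rtts = [22.0, 25.0, 27.0, 30.0, 104.0, 108.0]
pdf = gaussian_kde(rtts, bandwidth=3.0)
print(pdf(25.0), pdf(65.0), pdf(105.0))
```

With a small bandwidth the two clusters remain visible as separate peaks, while the density between them stays close to zero.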
Our generated PDF is an approximation of the PDF of the real service latency, which is critical to safety-related services requiring stringent latency performance guarantees. In this case, we can apply a weighting factor to different segments of the generated empirical CDF according to the required latency values of the safety-related services. Supposing that the maximum tolerable latency of the vulnerable road user service is 100 ms, we can write the adjusted confidence of supporting the vulnerable road user service with the required latency level as
(7) |
where ms and .
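For intuition, the unweighted version of such a confidence level is simply the fraction of samples that meet the latency requirement. The sketch below computes this fraction and compares it with a required confidence; it deliberately omits the weighting factor of the adjusted confidence, and the function names and the required confidence values in the usage note are hypothetical.

```python
def confidence_level(samples, max_latency_ms):
    """Fraction of RTT samples (ms) that do not exceed the latency requirement."""
    if not samples:
        raise ValueError("need at least one sample")
    return sum(1 for s in samples if s <= max_latency_ms) / len(samples)


def service_is_supported(samples, max_latency_ms, required_confidence):
    """True if the empirical confidence reaches the required confidence."""
    return confidence_level(samples, max_latency_ms) >= required_confidence
```

For the hypothetical samples `[20.0, 30.0, 90.0, 120.0]` and the 100 ms requirement of the vulnerable road user service, `confidence_level` returns 0.75, so the service would be considered supported for a required confidence of 0.7 but not for 0.9.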
V Numerical Results


In this section, we evaluate the performance of LaMI using the dataset collected on a university campus. The sample collection locations and volumes are shown in Figure 5.
As mentioned earlier, the total number of samples directly affects the accuracy of the recovered models, especially when the number of samples is small. We therefore compare the sample recovery performance of the VAE and linear interpolation, one of the popular sample recovery approaches, in Figure 6. In particular, we compare the empirical CDF recovered by the VAE from a very small number of samples (e.g., 10 and 25 samples) with the true distribution generated from all 50,000 samples collected at the same location. We can observe that the result obtained by interpolation tends to weaken the peak of the PDF, which may cause a larger bias between the two peaks at around 90 ms. In contrast, the VAE outperforms the interpolation approach and better captures the properties of the true latency distribution.


To evaluate the sample recovery performance of the two main modules of LaMI, patching and synthesizing, we consider a special scenario and assume that no samples have been recorded for the considered area. In this case, LaMI first performs the global search for candidate patches and then duplicates highly correlated samples collected in other areas. Finally, LaMI applies the VAE to the patched samples to generate empirical CDFs. Experimental results are presented in Fig. 7. We can observe that LaMI achieves much better performance than the nearest neighbor approach, a commonly used approach that generates the empirical CDF with samples from the neighboring regions.

In Figure 8, we evaluate the impact of the searchable area size on the performance of LaMI. In particular, we compare the KR distance between the recovered empirical PDF and the true distribution under different sizes of the total searchable area. We can observe that, as the searchable area for candidate patches increases, the KR distance between the empirical PDF recovered by LaMI and the true distribution decreases. However, the KR distance approaches a constant as the searchable area continues to expand. This means that a larger searchable area does not always result in better performance. Note that as the searchable area continues to increase, the misjudgment rate can also rise. In other words, the searchable area should be carefully chosen to balance the improvement in model accuracy against the recovery error.
VI Conclusion
In this paper, we have proposed a novel approach to model the spatio-temporal latency performance of connected vehicular networks. In particular, we have collected RTT measurements between vehicles and fog nodes through commercial LTE networks on a university campus. Based on a thorough analysis of the collected dataset, a novel spatio-temporal generative model named LaMI has been proposed to handle the challenges of latency modeling in smart vehicular systems. LaMI can inpaint the missing latency samples by searching for similar regions and sharing their samples. A deep generative model based on the VAE is also adopted to further improve the modeling accuracy. Numerical results show that the proposed LaMI framework can significantly improve the modeling accuracy of latency performance compared to existing popular solutions such as linear interpolation and nearest neighbor-based methods.
Acknowledgment
The authors would like to thank Ericsson (China) Hubei Branch and China Mobile Hubei 5G Joint-innovation Lab for help in the data collection.
References
- [1] Market Research Engine, “Global connected cars market - trends and forecast: 2015-2020.” [Online]. Available: https://www.marketresearchengine.com/reportdetails/global-connected-cars-market
- [2] Y. Xiao and M. Krunz, “Distributed optimization for energy-efficient fog computing in the tactile Internet,” IEEE J. Sel. Area Commun., vol. 36, no. 11, pp. 2390–2400, Nov. 2018.
- [3] Y. Xiao, M. Krunz, H. Volos, and T. Bando, “Driving in the fog: Latency measurement, modeling, and optimization of LTE-based fog computing for smart vehicles,” in IEEE SECON, Boston, MA, Jun. 2019.
- [4] 5GAA, “C-V2X use cases: Methodology, examples and service level requirements,” White Paper, Jun. 2019.
- [5] L. Zhang, D. Gao, W. Zhao, and H. C. Chao, “A multilevel information fusion approach for road congestion detection in VANETs,” Mathematical and Computer Modelling, vol. 58, no. 5-6, pp. 1206–1221, Sep. 2013.
- [6] T. ElBatt, S. K. Goel, G. Holland, H. Krishnan, and J. Parikh, “Cooperative collision warning using dedicated short range wireless communications,” in Proceedings of the 3rd international workshop on Vehicular ad hoc networks, Los Angeles, CA, Sep. 2006.
- [7] S. K. Bhoi and P. M. Khilar, “Vehicular communication: a survey,” IET networks, vol. 3, no. 3, pp. 204–217, Sep. 2014.
- [8] X. Ge, B. Yang, J. Ye, G. Mao, C.-X. Wang, and T. Han, “Spatial spectrum and energy efficiency of random cellular networks,” IEEE Trans. Commun., vol. 63, no. 3, pp. 1019–1030, Mar. 2015.
- [9] H. Lee, J. Flinn, and B. Tonshal, “RAVEN: Improving interactive latency for the connected car,” in ACM MobiCom, New Delhi, India, Oct. 2018.
- [10] L. Liang, H. Peng, G. Y. Li, and X. Shen, “Vehicular communications: A physical layer perspective,” IEEE Transactions on Vehicular Technology, vol. 66, no. 12, pp. 10 647–10 659, Dec. 2017.
- [11] D. P. Kingma and M. Welling, “Auto-encoding variational bayes,” arXiv, Dec. 2013. [Online]. Available: https://arxiv.org/abs/1312.6114
- [12] C. Doersch, “Tutorial on variational autoencoders,” arXiv, Jun. 2016.
- [13] K. Gregor, I. Danihelka, A. Graves, D. J. Rezende, and D. Wierstra, “DRAW: A recurrent neural network for image generation,” arXiv, Feb. 2015. [Online]. Available: https://arxiv.org/abs/1502.04623
- [14] H. Huang, R. He, Z. Sun, T. Tan et al., “IntroVAE: Introspective variational autoencoders for photographic image synthesis,” in NIPS 2018, Montreal, Canada, Nov. 2018, pp. 52–63.
- [15] I. Hadžić, Y. Abe, and H. C. Woithe, “Edge computing in the EPC: A reality check,” in Proceedings of the Second ACM/IEEE Symposium on Edge Computing, San Jose/Fremont, CA, Oct. 2017.