Personalized Route Recommendation Based on User Habits for Vehicle Navigation

Yinuo Huang, Xin Jin (Toyota Motor Engineering & Manufacturing (China) Co., Ltd., Beijing, China), Miao Fan, Xunwei Yang, and Fangliang Jiang (NavInfo Co., Ltd., Beijing, China)

Abstract.

Navigation route recommendation is one of the important functions of intelligent transportation. However, users frequently deviate from recommended routes for various reasons, which makes personalization a key research problem in this field. This paper introduces a personalized route recommendation method based on users' historical navigation data. First, we formulate route sorting as a pointwise problem based on a large set of pertinent features. Second, we construct route features and user profiles to establish a comprehensive feature dataset. Furthermore, we propose a Deep-Cross-Recurrent (DCR) learning model that learns route sorting scores and offers customized route recommendations. By integrating DCN-v2 and LSTM, this approach effectively captures both the information in recommended navigation routes and user preferences. In offline evaluations, our method reduced the mean inconsistency rate by 8.72, 2.19, and 0.9 percentage points compared with the minimum ETA (estimated time of arrival) baseline, LightGBM, and DCN-v2, respectively, demonstrating significant improvements in recommendation accuracy.

Route recommendation, Neural networks, User habits, Vehicle navigation

CCS Concepts: Human-centered computing → Ubiquitous and mobile computing → Ubiquitous and mobile computing systems and tools

1. Introduction

Route recommendation plays a pivotal role in our daily lives as a crucial element of intelligent transportation systems. When a user requests route guidance during a trip, the navigation service generates multiple routes and selects the top three options best suited to the user’s needs. However, a user’s actual trajectory does not always match the recommended route, which indicates that the user deviated from the chosen route. This misalignment can stem from various factors, including the user’s specific intentions, missing road data or traffic incidents along the recommended route, and the user’s personalized route preferences. As shown in Fig. 1, the user deviates from the recommended navigation route and chooses a scenic route along the riverbank. Therefore, uncovering latent information within users’ historical navigation data and tailoring personalized route recommendations can enhance user satisfaction with navigation services.

Figure 1. Misalignment between the user trajectory and navigation route. The user diverged from the navigation’s recommended route, opting instead for a scenic route along the riverbank.

With the rapid advancement of intelligent navigation services and the exponential growth of big data, navigation services preferred by most people often fall short of meeting users’ diverse needs. Customized, personalized navigation has emerged as the prevailing trend. When selecting a recommended navigation route, users take into account various factors such as time, distance, traffic congestion, tolls, safety, and convenience. A route is composed of multiple links, and the attributes of each link and the road network encapsulate hidden information about the route. In addition to analyzing route information, learning user profiles is crucial for effective route recommendation.

To solve the above problems, a personalized route recommendation model based on user habits for vehicle navigation is presented in this research. In addition to the overall route information such as time, distance, toll, etc., link sequence features are introduced to learn the detailed information of the routes. Simultaneously, user profiles are extracted from historical trajectory data and navigation behaviors. Subsequently, a Deep-Cross-Recurrent (DCR) learning model is employed to facilitate personalized route recommendations. The principal contributions of this research are summarized as follows:

• Link sequence attribute features and landscape features are introduced to enrich the representation of route information.
• User profiles are constructed from historical trajectory data and navigation behavior to lay the groundwork for personalized route recommendation.
• A DCR learning model is proposed for route recommendation. The model is tested on offline datasets and compared with other methods, and the results validate its superior performance.

2. Related Work

In the realm of navigation route research, Wuman Luo et al. leveraged trajectory data from a substantial user pool to construct mobile networks, identifying the most popular route between origins and destinations at different times (Luo et al., 2013). Although this method effectively captures the preferences of the majority of users, it falls short of reflecting personalized preferences. Peisong Li et al. introduced a road attribute description method based on polychromatic sets (PS) theory together with a personalized route planning algorithm. After the user selects a travel preference plan, the priority of the route attributes is set, favoring routes that align closely with the target direction (Li et al., 2022). However, it is difficult for users to accurately express their preferences for each criterion.

In recent years, numerous scholars have employed deep learning methods to investigate personalized route recommendations. Jingyuan Wang et al. employed neural networks to learn the cost function within the A* algorithm for personalized route recommendations (Wang et al., 2019). Ran Cheng et al. proposed the R4 learning framework to predict the deviation rate of candidate routes, recommending the route with the lowest deviation rate to users (Cheng et al., 2021). Shan Liu et al. integrated the Dijkstra algorithm into deep inverse reinforcement learning (IRL) to produce personalized routes when road network information is unknown (Liu et al., 2020). Building upon this approach, Shan Liu and Hai Jiang introduced a graph attention network and improved the IRL method to recommend personalized routes that account for real-time traffic conditions (Liu and Jiang, 2022). However, when the state space is large, accurately computing the expected state visitation frequency becomes challenging. These studies inspire us to learn personalized route recommendations through deep learning techniques.

Route recommendation involves two key processes: recall and sorting. The recall phase aims to generate a wide array of relevant candidate routes between the origin and the destination. Subsequently, in the sorting phase, these recalled routes are ranked, with the top-ranked route being presented to the user. This paper primarily concentrates on route sorting, delving deeply into route data and user profiles derived from users’ historical trajectories and navigation behaviors.

3. Methodology

In the context of navigation recommendation, the inconsistency rate measures how far the user deviates from a navigation route after selecting it. It can be employed to evaluate the quality of route recommendations: a higher inconsistency rate indicates that the user is less satisfied with the recommended route. In this paper, we define the personalized route recommendation problem as follows. Given a navigation database, an origin, a destination, and a request time, our objective is to predict the inconsistency rate of each candidate route and recommend the candidate route with the lowest predicted inconsistency rate to the user. The inconsistency rate is defined as follows:

$$IR=1-\frac{dis_{tc}}{dis_{t}} \tag{1}$$

where $IR$ is the inconsistency rate and $dis_{t}$ is the total distance of the user’s trajectory. By mapping the trajectory data onto the road network, we obtain the sequence of links the user traversed; each candidate route is likewise represented as a link sequence. $dis_{tc}$ is the total length of the links shared by the user’s trajectory and the candidate route. The binary label is set to 0 if the user deviates from the route and to 1 otherwise.
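As a minimal sketch of Eq. (1), the function below computes the inconsistency rate from two link sequences. The names `inconsistency_rate` and `link_length_m` are hypothetical, and "same links" is interpreted here as set membership that ignores traversal order, which is an assumption rather than something the text specifies.

```python
from typing import Dict, Sequence


def inconsistency_rate(
    trajectory_links: Sequence[str],
    candidate_links: Sequence[str],
    link_length_m: Dict[str, float],
) -> float:
    """Return IR = 1 - dis_tc / dis_t for one trajectory/candidate pair."""
    candidate_set = set(candidate_links)
    # dis_t: total length of the map-matched user trajectory.
    dis_t = sum(link_length_m[link] for link in trajectory_links)
    if dis_t <= 0:
        raise ValueError("trajectory must have positive length")
    # dis_tc: length of trajectory links that also lie on the candidate route
    # (assumption: order and repeated traversals are ignored).
    dis_tc = sum(
        link_length_m[link] for link in trajectory_links if link in candidate_set
    )
    return 1.0 - dis_tc / dis_t


# Toy data: the user follows links a and b, then leaves the candidate route.
lengths = {"a": 100.0, "b": 200.0, "c": 300.0, "d": 400.0}
ir = inconsistency_rate(["a", "b", "d"], ["a", "b", "c"], lengths)
ir_same = inconsistency_rate(["a", "b", "c"], ["a", "b", "c"], lengths)
```

In the toy example, 300 m of the 700 m trajectory overlaps the candidate route, so `ir` equals 1 - 300/700, while an identical trajectory and candidate give 0.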

3.1. Feature Extraction

With the widespread adoption of navigation services, a significant volume of user trajectory data is generated daily. To provide a more comprehensive representation of navigation recommendations, we first systematically build rich features derived from these datasets. We summarize the features into two aspects: route features and user profiles.

3.1.1. Route features.

Spatial information: By mapping trajectory data into the road network, we can extract overall information about the route, including distance, number of traffic lights, tolls, number of turns, and so on. Additionally, we can gather data on the sequence of links, such as link length, number of lanes, and road types. Location-type information can be acquired through the POI grid vector.

Temporal information: The time of route recommendation requests is also a pivotal factor in navigation recommendations. This time can be refined into various features, including whether it falls within peak periods, weekends, specific days of the week, or hours of the day.
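The temporal features listed above can be derived from the request timestamp roughly as follows. This is a sketch only: the feature names and the peak windows in `PEAK_WINDOWS` are hypothetical, since the text does not specify which hours count as peak periods.

```python
from datetime import datetime
from typing import Dict

# Hypothetical peak windows as (start_hour, end_hour), end exclusive.
# The paper does not state the actual peak periods.
PEAK_WINDOWS = ((7, 9), (17, 19))


def temporal_features(request_time: datetime) -> Dict[str, int]:
    """Derive simple calendar features from a route request time."""
    hour = request_time.hour
    weekday = request_time.weekday()  # Monday = 0, Sunday = 6
    return {
        "hour_of_day": hour,
        "day_of_week": weekday,
        "is_weekend": int(weekday >= 5),
        "is_peak": int(any(start <= hour < end for start, end in PEAK_WINDOWS)),
    }


monday_rush = temporal_features(datetime(2024, 3, 4, 8, 30))
saturday_noon = temporal_features(datetime(2024, 3, 9, 12, 0))
```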

Traffic information: The road conditions along the route significantly impact the user’s driving experience. Individuals in a hurry typically seek to avoid congested routes. Traffic information can be conveyed through various features, such as the levels of road congestion at the time of departure.

Landscape information: In current research, the incorporation of scenic elements along routes is often overlooked. Because manual assessment is costly, this paper proposes an automated approach. As shown in Fig. 2, the route data is first mapped onto the road network and then divided into grid cells. The water system data and green space data of the grid cells along the route are then aggregated. This information serves as a representation of the landscape attributes of the route.
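One way to picture the gridding step is the sketch below, which maps route points to grid cells and measures the share of route cells that contain water or green space. The cell size, the ratio-based statistic, and all names (`to_cell`, `landscape_features`, `water_cells`, `green_cells`) are assumptions for illustration; the paper does not state which statistics it computes per grid.

```python
import math
from typing import Dict, Iterable, Set, Tuple

Cell = Tuple[int, int]


def to_cell(lon: float, lat: float, cell_deg: float) -> Cell:
    """Map a coordinate to the index of the grid cell that contains it."""
    return (math.floor(lon / cell_deg), math.floor(lat / cell_deg))


def landscape_features(
    route_points: Iterable[Tuple[float, float]],
    water_cells: Set[Cell],
    green_cells: Set[Cell],
    cell_deg: float = 0.001,
) -> Dict[str, float]:
    """Share of the route's grid cells that contain water or green space."""
    route_cells = {to_cell(lon, lat, cell_deg) for lon, lat in route_points}
    if not route_cells:
        return {"water_ratio": 0.0, "green_ratio": 0.0}
    n = len(route_cells)
    return {
        "water_ratio": len(route_cells & water_cells) / n,
        "green_ratio": len(route_cells & green_cells) / n,
    }


# Toy example with a 1-degree grid so the cell indices are easy to read.
points = [(0.5, 0.5), (1.5, 0.5), (2.5, 0.5), (3.5, 0.5)]
scenic = landscape_features(
    points,
    water_cells={(1, 0)},
    green_cells={(2, 0), (3, 0), (9, 9)},
    cell_deg=1.0,
)
```

Here the route crosses four cells, one of which contains water and two of which contain green space, so the ratios are 0.25 and 0.5.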

3.1.2. User profiles.

The user’s personalized route preference refers to the relatively stable behavioral features of the user when navigating. Drawing from daily navigation experience, along with the aforementioned route information and the user’s historical travel data, relevant behavioral characteristics are extracted. These include the user’s inconsistency rate, the proportion of trips in which the user opts for the fastest route, and so on. These characteristics help reveal the user’s personalized preferences and form the dataset of user profiles.

Cluster analysis is performed on the extracted historical behavior features to determine the preference category of each user in the dataset. The K-Means algorithm is an unsupervised clustering technique. It partitions a given sample set into K clusters based on the distances between samples, aiming to keep points within a cluster close together while maximizing the separation between clusters. t-SNE is a nonlinear dimensionality reduction and data visualization technique that transforms high-dimensional data into two or three dimensions (der Maaten and Hinton, 2008), preserving local relationships between data points to the greatest extent possible. As shown in Fig. 3, for a more intuitive analysis of user profiles based on historical behavioral data, this paper employs the K-Means algorithm to categorize users into 6 clusters and uses t-SNE for dimensionality reduction and visualization of the clustering outcomes.
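To illustrate the clustering step only, here is a small pure-Python version of Lloyd's K-Means iteration on toy two-dimensional data. It is a sketch under simplifying assumptions: random initialization from the data points, Euclidean distance, and k = 2 rather than the 6 clusters used in the paper. The t-SNE visualization is not reproduced, and the function name `kmeans` is illustrative.

```python
import math
import random
from typing import List, Sequence, Tuple


def kmeans(
    points: List[Sequence[float]], k: int, iters: int = 50, seed: int = 0
) -> Tuple[List[List[float]], List[int]]:
    """Plain Lloyd iteration: assign to nearest center, then recompute means."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels: List[int] = []
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


# Two well-separated toy groups of "user feature vectors".
users = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels = kmeans(users, k=2)
```

On this toy input the first three users end up in one cluster and the last three in the other; in practice the feature vectors would be the per-user behavioral statistics described above, typically standardized before clustering.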

The cluster center of each category encapsulates comprehensive information about the cluster. A comparison of the features within each category reveals distinct differences in the feature variables, leading to the definition of corresponding labels for each category. Cluster 0 users are more toll-sensitive and willing to accept some time loss to choose routes with lower tolls. The user preference of cluster 1 is characterized by frequently choosing the fastest or shortest route. Users in cluster 2 prefer higher-quality roads, reflected in their preference for wider and safer routes. Cluster 3 users prioritize highways, often opting for highways when travel times are small, and they are insensitive to tolls. Users in cluster 4 prefer scenic routes, characterized by their selection of paths often near rivers or green parks. Cluster 5 users are sensitive to congestion and often opt for routes with smoother traffic conditions.

3.2. Route Rank Model

In route sorting applications, traditional machine learning methods such as LightGBM are limited in their ability to capture the hidden patterns within sequences of link attributes along a route. To address this problem, this paper integrates the DCN-v2 and LSTM models into a more expressive model.

DCN-v2 is an improved version of DCN proposed for recommender systems (Wang et al., 2021). Its cross network introduces a mixture-of-experts structure to strengthen feature crossing in different subspaces. Its deep network comprises multiple MLP layers designed to uncover underlying patterns effectively.
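For intuition, the basic full-rank cross layer of DCN-v2 computes $x_{l+1} = x_0 \odot (W_l x_l + b_l) + x_l$. The sketch below implements that single operation in plain Python on toy values; it does not include the mixture-of-experts low-rank variant mentioned above, and the weights shown are arbitrary illustrative choices, not trained parameters.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def cross_layer(x0: Vector, xl: Vector, weight: Matrix, bias: Vector) -> Vector:
    """One full-rank cross layer: x0 * (W @ xl + b) + xl, with elementwise *."""
    projected = [
        sum(w_ij * x_j for w_ij, x_j in zip(row, xl)) + b_i
        for row, b_i in zip(weight, bias)
    ]
    # The residual term xl keeps the lower-order interactions.
    return [x0_i * p_i + xl_i for x0_i, p_i, xl_i in zip(x0, projected, xl)]


x0 = [1.0, 2.0]
identity = [[1.0, 0.0], [0.0, 1.0]]  # toy weights, chosen for easy checking
zero_bias = [0.0, 0.0]
x1 = cross_layer(x0, x0, identity, zero_bias)  # first layer takes x_l = x_0
x2 = cross_layer(x0, x1, identity, zero_bias)  # second layer
```

With identity weights, the first layer yields `[2.0, 6.0]` (each entry is $x^2 + x$) and the second yields `[4.0, 18.0]`, which shows how stacking layers raises the polynomial degree of the feature interactions.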

LSTM is a special type of RNN (Hochreiter and Schmidhuber, 1997) that can learn long-term dependencies, and it has demonstrated excellent performance in natural language processing, speech recognition, and other domains. Like all recurrent networks, LSTM comprises a chain of repeating neural network modules. Each LSTM module contains an input gate, a forget gate, an output gate, a modulated input, a memory cell, and a hidden state.

This paper combines the DCN-v2 and LSTM models into a Deep-Cross-Recurrent network that learns to estimate route sorting scores. The model structure is shown in Fig. 4. The route features and user features described above serve as the model input. First, the sparse features of the global route are converted into dense features by an embedding layer and concatenated with the other dense features as input to DCN-v2. The cross network consists of 2 layers, and the MLP has two hidden layers with output sizes of 128 and 64, respectively. To further capture the link sequence attribute information of each route, the link sequence features are converted into high-dimensional features by an embedding layer. These features are then projected into a 128-dimensional space by a fully connected layer with ReLU activation. The transformed features are fed into an LSTM with a cell size of 256. Finally, the output of DCN-v2 and the last hidden state of the LSTM are combined, and the final score is produced by a linear layer followed by a sigmoid activation.
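For reference, the components named above are commonly written as follows, where $x_t$ is the input at step $t$ (here, the features of the $t$-th link), $\sigma$ is the logistic sigmoid, and $\odot$ is the elementwise product. This is the standard textbook formulation and is given as an assumption; the paper does not state which LSTM variant it uses.

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
g_t &= \tanh(W_g x_t + U_g h_{t-1} + b_g) && \text{(modulated input)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t && \text{(memory cell)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state)}
\end{aligned}
```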

The embedding dimension of the link IDs is set to 32. Adam is used as the optimizer and the learning rate of Adam is set to 0.0001. The loss function is the cross-entropy loss defined below:

$$L=-\frac{1}{N}\sum_{i=1}^{N}\left[y_{i}\log(\hat{y}_{i})+(1-y_{i})\log(1-\hat{y}_{i})\right] \tag{2}$$

where $N$ is the size of the training set, $y_{i}$ is the binary target, and $\hat{y}_{i}$ is the predicted value.
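A direct transcription of Eq. (2) into Python might look like the sketch below. The clipping constant `eps` is an added assumption that guards against taking the logarithm of zero; it is not part of the paper's definition.

```python
import math
from typing import Sequence


def binary_cross_entropy(
    y_true: Sequence[int], y_pred: Sequence[float], eps: float = 1e-12
) -> float:
    """Mean binary cross-entropy over N samples, as in Eq. (2)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # assumption: clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


uninformative = binary_cross_entropy([1, 0], [0.5, 0.5])
near_perfect = binary_cross_entropy([1, 0], [1.0, 0.0])
```

An uninformative prediction of 0.5 for every sample gives a loss of $\ln 2$, while near-perfect predictions drive the loss toward zero.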

4. Experiments

4.1. Datasets

The dataset used in this study is drawn from private car users of our company’s navigation services. It encompasses the historical trajectory data and navigation records of 10,000 users over one month, covering cities such as Beijing, Shanghai, Guangzhou, and Shenzhen. The dataset comprises approximately 290,000 trajectories, with approximately 3 million candidate routes provided by the navigation service. Table 1 lists the statistics of the datasets.

Table 1. Statistics of datasets.

	Number of links	Number of navigations	Average number of candidate routes
training set	2,105,063	236,660	11
validation set	1,369,689	26,295	11
test set	1,335,577	25,604	11

4.2. Evaluation Metrics

We use two metrics in our experiments, including the mean of inconsistency rate (mean_IR) and AUC, to evaluate the performance of the method. The calculation formula of AUC is defined as follows:

$$AUC=\frac{\sum I(p_{\text{pos}},p_{\text{neg}})}{P\cdot N} \tag{3}$$

$$I(p_{\text{pos}},p_{\text{neg}})=\begin{cases}1&\text{if }p_{\text{pos}}>p_{\text{neg}}\\ 0.5&\text{if }p_{\text{pos}}=p_{\text{neg}}\\ 0&\text{if }p_{\text{pos}}<p_{\text{neg}}\end{cases} \tag{4}$$

where $P$ is the number of positive samples, $N$ is the number of negative samples (not the training-set size used in Eq. (2)), $p_{\text{pos}}$ is the predicted probability of the positive class for a positive sample, $p_{\text{neg}}$ is the predicted probability of the positive class for a negative sample, and $I$ is the pairwise comparison function defined in Eq. (4). The sum runs over all positive-negative sample pairs.
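Eqs. (3) and (4) can be evaluated literally by looping over every positive-negative pair, as in the sketch below. The function name `pairwise_auc` is illustrative, and both scores are taken to be the model's predicted score for the same (positive) class. The quadratic loop is meant for clarity on small inputs, not efficiency.

```python
from typing import Sequence


def pairwise_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    total = 0.0
    for p_pos in pos:
        for p_neg in neg:
            if p_pos > p_neg:
                total += 1.0  # correctly ordered pair
            elif p_pos == p_neg:
                total += 0.5  # tie counts as half
    return total / (len(pos) * len(neg))


auc = pairwise_auc([1, 1, 0, 0], [0.9, 0.4, 0.4, 0.1])
```

In this toy case three of the four pairs are ordered correctly and one is tied, so the result is 3.5 / 4 = 0.875.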

4.3. Competing method

To validate the efficacy of the method proposed in this paper, we compared multiple solutions, including the minimum estimated time of arrival (ETA) solution, LightGBM (Ke et al., 2017), and the DCN-v2 model (Wang et al., 2021). The minimum ETA solution recommends the route with the smallest ETA. The LightGBM solution predicts the inconsistency rate with a LightGBM model whose input features do not include link sequence information. The input data of DCN-v2 is the same as that of LightGBM; only the model architecture differs.

4.4. Comparison results

The DCR model effectively learns route information, sequential link information, and user preferences. The experimental results are presented in Table 2. Compared with the minimum ETA solution, LightGBM, DCN-v2, and DCR reduce the mean_IR by 6.53, 7.82, and 8.72 percentage points, respectively. This improvement indicates that models trained on large-scale historical data effectively capture route recommendation information. The minimum ETA solution does not make use of the information available in the historical data, so its performance is unsatisfactory.

Table 2. The result on the test dataset.

	Test size	AUC	mean_IR
Min_ETA	25,604	-	38.97%
LightGBM	25,604	81.09%	32.44%
DCN-v2	25,604	85.18%	31.15%
DCR	25,604	86.32%	30.25%

The DCR model was compared with the LightGBM and DCN-v2 models to assess the influence of modeling link sequence information along the route; the inputs of LightGBM and DCN-v2 differ from those of DCR only in that the link sequence information is removed. The results show that, compared with the LightGBM and DCN-v2 models, the AUC of DCR on the test set increases by 5.23 and 1.14 percentage points, and the mean_IR decreases by 2.19 and 0.9 percentage points, which confirms the benefit of introducing the recurrent network structure.

In addition to presenting the overall performance on the test dataset, we divide the test dataset into three distance types: short (0 to 10 km), medium (10 to 20 km), and long (longer than 20 km). Specifically, the test dataset comprises 15k short trajectories, 6k medium trajectories, and 4k long trajectories. As shown in Fig. 5, performance worsens as the trajectory length increases, which indicates that personalized routes are easier to predict for shorter trips. Additionally, across all distance types, the DCR model consistently and significantly outperforms the other models.
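The reported gaps can be rechecked from Table 2 with a few lines of arithmetic. The snippet below copies the table values and computes absolute differences, which shows that the quoted figures are differences in percentage points rather than relative changes.

```python
# Values copied from Table 2, in percent.
mean_ir = {"Min_ETA": 38.97, "LightGBM": 32.44, "DCN-v2": 31.15, "DCR": 30.25}
auc = {"LightGBM": 81.09, "DCN-v2": 85.18, "DCR": 86.32}

# Absolute reduction in mean_IR achieved by DCR over each baseline.
ir_reduction = {
    name: round(value - mean_ir["DCR"], 2)
    for name, value in mean_ir.items()
    if name != "DCR"
}

# Absolute gain in AUC achieved by DCR over each baseline.
auc_gain = {
    name: round(auc["DCR"] - value, 2)
    for name, value in auc.items()
    if name != "DCR"
}
```

The resulting values are 8.72, 2.19, and 0.9 for mean_IR (against Min_ETA, LightGBM, and DCN-v2) and 5.23 and 1.14 for AUC (against LightGBM and DCN-v2).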
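The per-distance breakdown can be sketched as a simple bucketing step followed by a grouped mean. The boundary handling (exactly 10 km counted as short, exactly 20 km as medium) is an assumption, because the text leaves the edges of the ranges ambiguous, and the sample values are made up purely for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def distance_bucket(distance_km: float) -> str:
    """Assign a trip to the short / medium / long group used in the paper."""
    if distance_km < 0:
        raise ValueError("distance must be non-negative")
    if distance_km <= 10:  # assumption: the 10 km boundary counts as short
        return "short"
    if distance_km <= 20:  # "longer than 20 km" is long, so 20 km is medium
        return "medium"
    return "long"


def mean_ir_by_bucket(samples: Iterable[Tuple[float, float]]) -> Dict[str, float]:
    """Average the inconsistency rate of (distance_km, IR) samples per group."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for distance_km, ir in samples:
        grouped[distance_bucket(distance_km)].append(ir)
    return {bucket: sum(values) / len(values) for bucket, values in grouped.items()}


# Illustrative values only, not results from the paper.
toy = mean_ir_by_bucket([(3.0, 0.25), (8.0, 0.75), (15.0, 0.5), (42.0, 0.75)])
```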

5. Conclusion

In this paper, we introduce a novel approach to learning personalized route recommendations. The goal is to learn the route ranking score and recommend the route with the best score to the user. To establish a comprehensive feature database, we construct route features encompassing temporal, spatial, traffic, and landscape aspects. We use K-Means and t-SNE to extract user characteristics and effectively capture users’ behavioral preferences. Moreover, we propose a new deep learning model named DCR, which integrates DCN-v2 and LSTM to tackle the challenge. This model effectively learns user preferences and the route information needed for route recommendation. We evaluated our approach offline with user navigation data and found that, on the full test dataset, the mean_IR of the DCR was 8.72 percentage points lower than that of the minimum ETA recommendation. Compared with LightGBM and DCN-v2, the DCR reduced the mean_IR by 2.19 and 0.9 percentage points, respectively, and increased the AUC by 5.23 and 1.14 percentage points. We categorized the test dataset into three groups based on short, medium, and long distances. As the distance increased, the mean_IR increased, indicating greater difficulty in predicting personalized routes for longer routes than for shorter ones. Furthermore, the DCR outperformed the other models in these evaluations. The results demonstrate the effectiveness of the approach.

References

Luo et al. (2013) Wuman Luo, Haoyu Tan, Lei Chen, and Lionel M. Ni. 2013. Finding time period-based most frequent path in big trajectory data. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data (SIGMOD ’13). Association for Computing Machinery, New York, NY, 713–724. https://doi.org/10.1145/2463676.2465287
Li et al. (2022) Peisong Li, Xinheng Wang, Honghao Gao, Xiaolong Xu, Muddesar Iqbal, and Keshav Dahal. 2022. A Dynamic and Scalable User-Centric Route Planning Algorithm Based on Polychromatic Sets Theory. IEEE Transactions on Intelligent Transportation Systems 23, 3 (2022), 2762–2772. https://doi.org/10.1109/TITS.2021.3085026
Wang et al. (2019) Jingyuan Wang, Ning Wu, Wayne Xin Zhao, Fanzhang Peng, and Xin Lin. 2019. Empowering A* Search Algorithms with Neural Networks for Personalized Route Recommendation. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD ’19). Association for Computing Machinery, New York, NY, 539–547. https://doi.org/10.1145/3292500.3330824
Cheng et al. (2021) Ran Cheng, Chao Chen, Longfei Xu, Shen Li, Lei Wang, Hengbin Cui, Kaikui Liu, and Xiaolong Li. 2021. R4: A Framework for Route Representation and Route Recommendation. https://doi.org/10.48550/arXiv.2110.10474
Liu et al. (2020) Shan Liu, Hai Jiang, Shuiping Chen, Jing Ye, Renqing He, and Zhizhao Sun. 2020. Integrating Dijkstra’s algorithm into deep inverse reinforcement learning for food delivery route planning. Transportation Research Part E: Logistics and Transportation Review 142 (2020), 102070. https://doi.org/10.1016/j.tre.2020.102070
Liu and Jiang (2022) Shan Liu and Hai Jiang. 2022. Personalized route recommendation for ride-hailing with deep inverse reinforcement learning and real-time traffic conditions. Transportation Research Part E: Logistics and Transportation Review 164 (2022), 102780. https://doi.org/10.1016/j.tre.2022.102780
der Maaten and Hinton (2008) Laurens Van der Maaten and Geoffrey Hinton. 2008. Visualizing data using t-SNE. Journal of Machine Learning Research 9, 86 (2008), 2579–2605.
Wang et al. (2021) Ruoxi Wang, Rakesh Shivanna, Derek Cheng, Sagar Jain, Dong Lin, Lichan Hong, and Ed Chi. 2021. DCN V2: Improved Deep & Cross Network and Practical Lessons for Web-scale Learning to Rank Systems. In Proceedings of the Web Conference 2021 (WWW ’21). Association for Computing Machinery, New York, NY, 1785–1797. https://doi.org/10.1145/3442381.3450078
Hochreiter and Schmidhuber (1997) Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation 9, 8 (1997), 1735–1780.
Ke et al. (2017) Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, and Tie-Yan Liu. 2017. LightGBM: a highly efficient gradient boosting decision tree. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS’17). Curran Associates Inc., Red Hook, NY, 3149–3157.