This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

REPLAY: Modeling Time-Varying Temporal Regularities of Human Mobility for Location Prediction over Sparse Trajectories

Bangchao Deng,  Bingqing Qu,  Pengyang Wang,  Dingqi Yang*,  Benjamin Fankhauser,  and Philippe Cudre-Mauroux Bangchao Deng, Pengyang Wang, and Dingqi Yang are with the State Key Laboratory of Internet of Things for Smart City and Department of Computer and Information Science, University of Macau, Macao SAR, China, E-mail: [email protected], [email protected], [email protected]. Bingqing Qu is with BNU-HKBU United International College, China, E-mail: [email protected]. Benjamin Fankhauser is with Bern University of Applied Sciences, Switzerland, E-mail: [email protected]. Philippe Cudre-Mauroux is with the University of Fribourg, Switzerland, E-mail: [email protected].
*Corresponding author: Dingqi Yang (email: [email protected])Manuscript received April 19, 2005; revised August 26, 2015.
Abstract

Next-location prediction aims to forecast which location a user is most likely to visit given the user’s historical data. As a sequence modeling problem by nature, it has been widely addressed using Recurrent Neural Networks (RNNs). To tackle the intrinsic sparsity issue of real-world user mobility traces, spatiotemporal contexts have been shown as significantly useful. Existing solutions mostly incorporate spatiotemporal distances between locations in mobility traces, either by integrating them into the RNN units as additional information, or utilizing them to search for informative historical hidden states to improve prediction. However, such distance-based methods fail to capture the time-varying temporal regularities of human mobility, where human mobility is often more regular in the morning than in other time periods, for example; this suggests the usefulness of the actual timestamps besides the temporal distances. Under this circumstance, we propose REPLAY, learning to capture the time-varying temporal regularities for location prediction based on general RNN architecture. Specifically, REPLAY is designed on top of a flashback mechanism, where the spatiotemporal distances in sparse trajectories are used to search for the informative past hidden states; to accommodate the time-varying temporal regularities, REPLAY incorporates smoothed timestamp embeddings using Gaussian weighted averaging with timestamp-specific learnable bandwidths, which can flexibly adapt to the temporal regularities of different strengths across different timestamps. We conduct a comprehensive evaluation, comparing REPLAY against a wide range of state-of-the-art methods. Experimental results show REPLAY significantly and consistently outperforms state-of-the-art methods by 7.7%-10.9% in the location prediction task, and the learnt bandwidths reveal interesting patterns of the time-varying temporal regularities.

Index Terms:
User mobility, Sparse trajectory, Location prediction, Recurrent neural networks

1 Introduction

Location prediction plays a crucial role in the modeling of human mobility, serving as a fundamental building block for various location based services. The primary aim is to forecast which location a user is most likely to visit given the user’s historical data [1]. As a sequence modeling problem by nature, location prediction problems have been widely tackled by the literature using Recurrent Neural Networks (RNNs), such as vanilla RNN [2], Gated Recurrent Unit (GRU) [3] and Long Short-Term Memory (LSTM) [4]; these techniques have shown great success in modeling language data. However, different from word sequences (i.e., text sentences), real-world user mobility traces are often sparse and incomplete, due to the nature of the data collection scheme and privacy issues. More precisely, Location Based Social Networks (LBSNs) are valuable data sources for benchmarking location prediction techniques, where users voluntarily share their spatiotemporal presence (check-ins) at specific Points of Interest (POIs) and specific times. However, due to the voluntary sharing, the resulting mobility traces tend to be sparse in nature. For example, an empirical investigation using the commonly employed Foursquare dataset shows that the median time interval between consecutive check-ins is approximately 16.72 hours [5]. This sparsity and incompleteness in data present challenges for applying RNNs to the task of location prediction.

To tackle this sparsity issue, existing works strive to extend RNN architectures by incorporating spatiotemporal contexts, which have been shown as strong predictors for location prediction [6]. A widely adopted approach is to add the spatiotemporal distances between successive check-ins as additional inputs together with sequences of POIs to RNN units [7, 8, 9, 10, 11, 12, 13]. Despite its wide adoption, recent studies have shown that this approach cannot fully capture the temporal periodicity and spatial regularity of human mobility patterns [14], as considering spatiotemporal distances between check-ins that are several steps apart (i.e., high-order distances) in the check-in sequences are often more informative than those between successive check-ins for location prediction. Subsequently, a flashback mechanism has been introduced to explicitly use such spatiotemporal distances in sparse trajectories search for the informative past hidden states (i.e., historical hidden states with similar spatiotemporal contexts to the current one) [14, 15, 16, 17, 18, 19, 20, 21, 20]. However, despite the improved performance, the temporal distances still fail to capture the time-varying temporal regularities of human mobility, as we discuss below.

Temporal regularity has been revealed as a universal law of human mobility [6, 22], which can be evidenced by the periodicity of human activities even in sparse human mobility traces [23]. Figure 1 shows the temporal returning probability of user check-ins on the Foursquare dataset. This probability represents the probability of users revisiting a particular POI within a specific time interval after their initial check-in at that POI [6, 24]. We observe an obvious daily (periodic) revisiting pattern, which implies that historical check-ins closer to these daily peaks in temporal distance have higher predictive power; this serves as the foundation for the flashback mechanism [14]. In this paper, we further reveal that the strength of the temporal regularities varies across different time periods during a day (or across different days in a week, e.g., weekdays v.s. weekends). On the Foursquare dataset, the mobility pattern in the daytime is mostly attributed to working-related activities, which is usually more regular than the nighttime mobility pattern that is mostly attributed to entertainment-related activities111Note that user check-ins on LBSNs are mostly for social sharing purposes, where check-ins at entertainment-related POIs are much more than those at home-related POIs during the nighttime [25].. Figure 1 also compares the daytime and nighttime returning probabilities, where the daytime returning probability is significantly higher than nighttime, implying that the temporal regularity is much stronger in the daytime than in the nighttime. This implies that the hidden state 24 hours back is more informative for location prediction if the current time is daytime rather than nighttime. However, such time-varying temporal regularities cannot be captured by the distance-based methods (i.e., merely considering the temporal distances), and thus require involving the actual timestamps of check-ins.

Refer to caption
Figure 1: Returning probability w.r.t. the temporal distances between check-ins. We plot the overall returning probability and its breakdown in the daytime (6:00-18:00) or nighttime (18:00-6:00). We observe that the daytime returning probability is significantly higher than in nighttime, which implies that the temporal regularity is much stronger in the daytime than in the nighttime.

Motivated by this observation, we propose REPLAY, a general RNN based architecture learning to capture the vaRying tEmPoral reguLAritY of human mobility for location prediction. Specifically, REPLAY is designed to seamlessly incorporate smoothed timestamp embeddings with learnable bandwidths into the flashback mechanism. More precisely, for each check-in in a check-in sequence, we first transform its time to an hour-in-week timestamp tnt_{n}, where 1n1681\leq n\leq 168 (one of the 168 hours in a week) and then generate a smoothed timestamp embedding using Gaussian weighted averaging with a timestamp-specific learnable bandwidth σn\sigma_{n} (controlling the width of the Gaussian weighting function centered at the timestamp nn). Afterward, we concatenated the smoothed timestamp embedding with the corresponding POI embedding, which is then fed into an RNN with the flashback mechanism, where the spatiotemporal distances in sparse trajectories are used to search for the informative past hidden states. Finally, we combine the hidden state output by the RNN, a user embedding, and a smoothed timestamp embedding generated from a query time to make predictions on future POIs. The timestamp-specific learnable bandwidths here can automatically learn to adapt to the temporal regularities of different strengths across different timestamps. Intuitively, a smaller σn\sigma_{n} leads to a peaky Gaussian, which implies a stronger regularity at timestamp nn, as the smoothed timestamp embedding relies mostly on the embedding of tnt_{n} itself and less on the information from its neighboring timestamps for prediction. This is also evidenced by our experiments later; the learnt bandwidths for the timestamps in the daytime are indeed 7.5% smaller on average than those in the nighttime, which well corresponds to our observations in Figure 1. We summarize our contributions as follows:

  • We revisit the existing sequence models for location prediction, and identify their limitation in capturing the time-varying temporal regularities of human mobility.

  • We propose REPLAY, a general RNN architecture capturing the time-varying temporal regularities via smoothed timestamp embeddings using Gaussian weighted averaging with timestamp-specific learnable bandwidths, which can flexibly adapt to the temporal regularities of different strengths across different timestamps.

  • We conduct a thorough evaluation using two real-world LBSN datasets. Results show that our REPLAY outperforms a sizable collection of cutting-edge location prediction methods, with 7.7%-10.9% improvement over the best-performing baselines. Moreover, the learnt bandwidths reveal interesting temporal regularity patterns: 1) morning mobility shows a stronger regularity compared to other time periods, which is consistent on both weekdays and weekends; 2) weekend mobility is less regular than weekdays in general, while the weekend afternoon sometimes shows the weakest regularity.

2 Related Work

2.1 Human Mobility Trajectory

Ravenstein published work on the analysis of migration patterns using human mobility trajectories derived from demographic data [26] is pioneering work in studying human mobility. Nowadays, the widespread use of smart devices equipped with sensors has significantly enhanced the accessibility of human trajectory data for comprehensive studies [27]. Human trajectory data can be divided into two different types based on the data collection method as follows.

Firstly, there are continuously sampled trajectories, which encompass regularly recorded location sequences obtained from devices equipped by individuals. For instance, Microsoft Research Asia conducted the GeoLife experiment [28] which collects trajectories from 182 volunteers for more than three years by capturing GPS traces every five seconds or ten meters. Another notable example is that IDIAP Research Institute and Nokia Research Center conducted the Lausanne Data Collection Campaign [29], where the mobility traces of 200 individuals were collected. Each individual carried a smartphone that periodically recorded various sensor readings (e.g., Bluetooth, WLAN) at intervals such as every 60 seconds. While these human trajectory datasets provide detailed information about individual mobility patterns, their scale is often limited due to designed settings and individuals’ concerns regarding privacy (e.g., unwillingness to install software that continuously monitors and collects data on personal devices).

Secondly, there are voluntarily shared trajectories, which consist of self-reported location sequences, primarily provided by users on social media network platforms. For instance, a large number of individuals willingly share their check-ins on LBSNs, contributing to the accumulation of a substantial collection of human trajectories. However, people often choose to disclose a portion of their check-ins by reporting at specific POIs. This means that users may visit various POIs without recording, influenced by factors such as their preferences for certain places, simply forgetting to do so, or even privacy concerns. Consequently, despite LBSNs being commonly acknowledged as an important data source for conducting extensive research and data collection, the resulting mobility trajectories inherently exhibit sparsity [30, 31, 32]. This sparsity needs to be taken into careful consideration when conducting data analysis tasks [33]. In light of this context, the focus of this paper is to investigate the problem of location prediction over such sparsely populated trajectories.

2.2 Location Prediction

The spatiotemporal regularity of human mobility has been revealed with ample empirical evidence in the literature [6]. Universal mobility patterns have been discovered across different cities using statistical methods [22, 34], which serve as the foundation for human mobility modeling. Location prediction plays an important role in human mobility studying, which aims to predict the next location of users that they are most likely to visit by leveraging their historical traces. Traditional methods mainly focus on various mobility features including activity preferences [35, 36], historical visit counts [1, 37], social information [23, 38], or features (embedding) learnt by graph embedding techniques [39, 40, 41, 32]. In addition, factorization and generative methods have been utilized to tackle the challenges of location recommendation and prediction tasks [42, 43, 44, 45, 46]. However, these techniques possess limited ability to capture the sequential patterns that are inherent in human mobility. , which have been proven to be crucial predictors for accurate location prediction [7].

To model the sequential patterns of human mobility, (Hidden) Markov Chain based methods have been extensively employed for sequential prediction [47, 48, 49]. The fundamental idea is to calculate the transition probability of a particular state according to the preceding one. Factorizing Personalized Markov Chains (FPMC) is a typical technique employed in this context [50], which factorizes a personalized transition matrix in user-specific latent factors. To address the location prediction problem, FPMC has been extended by integrating spatial constraints into its modeling framework [48, 49]. However, one limitation of these location prediction techniques is their inability to effectively capture long-term dependencies in human mobility trajectories.

To capture the long-term dependencies, Recurrent Neural Networks (e.g., RNN, GRU and LSTM) have been widely used in location prediction tasks. To handle real-world mobility traces that are often sparse and incomplete, spatiotemporal contexts are often considered in the RNNs. A common approach widely adopted is to augment the Recurrent Neural Network (RNN) units with the inclusion of temporal and spatial distances between consecutive check-ins as extra information. For instance, Distance2Pre [11] considers the spatial distance between consecutive check-ins as extra information; STRNN [7] extends RNN with time and spatial-specific transition matrices for location prediction; HST-LSTM [10] extends LSTMs gates to consider spatiotemporal distance; STGN [12] adds spatial and temporal gates to the conventional LSTM to capture sequential patterns of users; NeuNext [13] jointly learn the location context prediction and the next location prediction tasks. STAN [51] uses spatiotemporal attention networks to capture the dependencies between non-adjacent check-ins; GeoSAN [52] uses hierarchical gridding of GPS locations for spatial discretization and uses self-attention layers for matching. Despite the improved performance of these methods, we have shown in our previous study [14] that only considering the spatiotemporal distances between consecutive check-ins cannot fully capture the two universal human mobility laws, i.e., spatial regularity and temporal periodicity, for location prediction.

In this context, the flashback mechanism proposed by our previous work [14] explicitly utilizes high-order spatiotemporal distance information to find informative past hidden states. This flashback mechanism has been widely utilized by subsequent studies. For instance, BiGRU [16] incorporates the flashback network with a bi-directional GRU; BSDA [15] incorporates the flashback mechanism with an extra RNN which models correlations of user appealed by POIs from bi-direction speculation; Bi-STAN [19] utilizes flashback mechanism for missing value imputation task; RTPM [18] addresses real-time location recommendation task by combining flashback mechanism with real-time user preference; Graph-Flashback [17] incorporates knowledge graph embeddings technique into the flashback framework. Note that apart from RNNs, other sequential methods have also been utilized for location prediction, such as geography-aware self-attention networks [52], graph-enhanced attention networks [53, 54], spatiotemporal attention networks [51], and hierarchical multi-task graph recurrent networks [55]. However, all of these existing methods overlook the time-varying temporal regularities of user mobility (as evidenced in Figure 1), which motivates us to design REPLAY that can automatically adapt to the temporal regularities of different strengths over time.

Compared to our previous work Flashback [14], the current work REPLAY makes the following differences. First, beyond the temporal periodicity, we further reveal the time-varying temporal regularities of human mobility in this work. Second, to capture the time-varying temporal regularities for location prediction, we design REPLAY on top of Flashback by seamlessly incorporating smoothed timestamp embeddings with learnable bandwidths into RNNs with the Flashback mechanism, where Flashback is now one of the three building blocks of REPLAY (more detail below). Third, the new experiments show superior performance of REPLAY against state-of-the-art techniques; in particular, compared to our previous work Flashback, REPLAY shows a significant improvement of 21.8%-43.9% in location prediction tasks.

Refer to caption
Figure 2: Overview of our proposed REPLAY. It consists of three components: 1) Smoothed timestamp embedding, 2) Flashback network with smoothed timestamp embeddings, and 3) Prediction conditioned on query timestamp.

3 REPLAY

Motivated by the observation of the time-varying temporal regularities of human mobility, we design REPLAY to seamlessly incorporate smoothed timestamp embeddings with learnable bandwidths into the flashback mechanism. The timestamp-specific learnable bandwidths can automatically adapt to the temporal regularities of different strengths across different timestamps. Figure 2 shows the overall architecture of our proposed REPLAY. Given an individual’s trace denoted as a sequence of POIs {,pi2,pi1,pi,}\{...,p_{i-2},p_{i-1},p_{i},...\} and its associated check-in time {,ti2,ti1,ti,}\{...,t_{i-2},t_{i-1},t_{i},...\}, where ii denote the order of check-in in the mobility trace, our objective is to predict a future POI pi+1p_{i+1} that an individual will probably visit at a query time ti+1t_{i+1}. To this end, REPLAY first generates a smoothed timestamp embedding for each check-in time in the sequence, and then concatenates it with the corresponding POI embedding which is then fed into an RNN with the flashback mechanism. Finally, we combine the hidden state output by the RNN, a user embedding, and a smoothed timestamp embedding generated from the query time to make predictions on future POIs conditioned on the query time. We present each component in detail below.

3.1 Smoothed Timestamp Embeddings

To flexibly capture the time-varying temporal regularities of human mobility, REPLAY designs a smoothed timestamp embedding method, which first transforms check-in time into hour-in-week timestamps, and then generates smoothed timestamp embeddings using Gaussian weighted averaging with learnable bandwidths, where cyclical timestamps are considered in the Gaussian kernel.

3.1.1 Hour-in-Week Timestamp Transformation

For ii-th check-in in the input mobility trace, we first transform its check-in time tit_{i} into an hour-in-week timestamp tnt_{n}, where 1n1681\leq n\leq 168 (i.e., one of the 168 hours in a week). For example, ti=t_{i}= (15:00 Tue. 07/11/2023) is transformed into tnt_{n}, n=39n=39, as tit_{i} is the 39-th hour in a week. We choose to model the temporal regularity at an hour granularity on a weekly scale, as it can capture not only daily patterns but also the difference between weekdays and weekends, which is also suggested by [32]. Moreover, we experimentally justify this design choice by comparing it with other time granularities and scales (see Section 4.4 for more detail).

3.1.2 Gaussian Smoothing with Learnable Bandwidth

For each hour-in-week timestamp, we initialize a learnable embedding vector, denoted as tn\vec{t}_{n}, and then compute a smoothed timestamp embedding sn\vec{s}_{n} using Gaussian weighted averaging with a learnable bandwidth σn\sigma_{n}, flexibly combining the neighboring timestamp embeddings. This design is inspired by kernel regression techniques [56], and we further incorporate timestamp-specific learnable bandwidths here to flexibly capture the temporal regularities of different strengths across different timestamps. Specifically, for the input timestamp tnt_{n}, the Gaussian kernel function is defined as:

fn(l)=1σn2πedist(l,n)22σn2f_{n}(l)=\frac{1}{\sigma_{n}\sqrt{2\pi}}e^{-\frac{dist(l,n)^{2}}{2\sigma_{n}^{2}}} (1)

where dist(l,n)dist(l,n) is a function computing the distance between timestamps ll and nn. The bandwidth (i.e., standard deviation) σn\sigma_{n} here controls the width of the Gaussian weighting function. A smaller bandwidth σn\sigma_{n} leads to a peaky Gaussian, where the smoothed timestamp embedding of tnt_{n} requires less information from its neighboring timestamps; in other words, the location prediction relies mostly on the timestamp embedding of tnt_{n}, which implies a stronger regularity at timestamp nn. In contrast, a larger bandwidth σn\sigma_{n} leads to a wide Gaussian, where the smoothed timestamp embedding of tnt_{n} integrates a significant amount of information from its neighboring timestamps for location prediction, which thus implies a weaker regularity at timestamp nn.

3.1.3 Cyclical Timestamps

Different from traditional distance metrics in kernel regression, our distance function for hour-in-week timestamps should be carefully designed, as Sunday night 11pm (l=168l=168) and Monday morning 1am (l=2l=2) are only two hours apart, for example. Therefore, we consider cyclical timestamps for the distance computation:

dist(l,n)={|ln|,if|ln|<84168|ln|,otherwisedist(l,n)=\begin{cases}|l-n|,&\text{if}\ |l-n|<84\\ 168-|l-n|,&\text{otherwise}\end{cases} (2)

Afterward, we compute the smoothed timestamp embedding sn\vec{s}_{n} as a weighted average of all timestamp embeddings:

sn=lfn(l)tllfn(l)\vec{s}_{n}=\frac{\sum_{l}f_{n}(l)\cdot\vec{t}_{l}}{\sum_{l}f_{n}(l)} (3)

3.2 Flashback Network with Smoothed Timestamp Embeddings

After obtaining the smoothed timestamp embedding, we feed it together with the corresponding POI embedding to RNN units with the flashback mechanism. In the following, we first briefly present the Flashback mechanism, followed by our extension of its input to add smoothed timestamp embeddings.

3.2.1 Flashback Mechanism

Flashback [14] is a general RNN architecture leveraging the spatiotemporal information in sparse trajectories to search the informative past hidden states for location prediction. Specifically, it takes a sequence of check-in POIs {,pi2,pi1,pi,}\{...,p_{i-2},p_{i-1},p_{i},...\} as the input of an RNN (vanilla RNN, GRU, or LSTM) and output its recurrent hidden state hi\vec{h}_{i}. Then, the flashback mechanism leverages the spatiotemporal distance information to search the informative past hidden states. To this end, we use learnable weight 𝒲(ΔTi,j,ΔDi,j)\mathcal{W}(\Delta T_{i,j},\Delta D_{i,j}) to re-weighting the historical hidden states hj(j<i)h_{j}(j<i) for location prediction. The weight used for aggregating the hidden state is parameterized by the spatial distance ΔDi,j\Delta D_{i,j} and temporal distance ΔTi,j\Delta T_{i,j} between the check-ins (pi,ti)(p_{i},t_{i}) and (pj,tj)(p_{j},t_{j}), which is defined as follows:

𝒲(ΔTi,j,ΔDi,j)=hvc(2πΔTi,j)eαΔTi,jeβΔDi,j\mathcal{W}(\Delta T_{i,j},\Delta D_{i,j})=\mbox{hvc}(2\pi\Delta T_{i,j})e^{-\alpha\Delta T_{i,j}}e^{-\beta\Delta D_{i,j}} (4)

where the Havercosine function hvc(x)=1+cos(x)2\mbox{hvc}(x)=\frac{1+\cos(x)}{2} is used to capture the temporal periodicity (ΔTi,j\Delta T_{i,j} is measured in days here). This Havercosine function effectively models the daily periodicity of human mobility based on the temporal distances ΔTi,j\Delta T_{i,j} as shown in Figure 1. Note that the distance between transformed timestamps used in smoothed timestamp embeddings differs from the temporal distance in Flashback here; the former is used for smoothing timestamp embeddings to accommodate the time-varying temporal regularities, while the latter is used for weighing the past hidden state to capture the temporal periodicity. The two exponential terms model the temporal and spatial decay of past hidden states, respectively, following the intuition that the hidden states of older and farther check-ins have less predictive power in general; α\alpha and β\beta are tunable parameters for decay rates. Please refer to [14] for more details.

3.2.2 Adding Smoothed Timestamp as Input

Different from the original Flashback network [14] that only takes POI embeddings as inputs of RNN units, REPLAY concatenates the POI embedding together with the corresponding smoothed timestamp embedding as inputs of the RNN units, i.e., hi=([pi;sn],hi1)\vec{h}_{i}=\mathcal{F}([\vec{p}_{i};\vec{s}_{n}],\vec{h}_{i-1}), as shown in Figure 2. The objective here is to capture the correlation between a check-in’s POI and timestamp. Such a correlation can effectively help the location prediction performance. For example, working-related POIs are usually checked by users in the daytime. Moreover, our smoothed timestamp embeddings here can flexibly capture this correlation by integrating information from neighboring timestamps, where the amount of the information is controlled by the bandwidth σ\sigma. Subsequently, the learnable σ\sigma can automatically adapt to time-varying temporal regularities; for example, a timestamp with a high mobility regularity will intuitively yield a smaller value of the bandwidth σ\sigma, as it relies less on the information from its neighboring timestamps for prediction.

3.3 Prediction Conditioned on Query Timestamp

To predict a future location, we feed the aggregated hidden state with a user embedding and a smoothed timestamp embedding of a query timestamp, to a fully connected layer to output a probability distribution of all POIs.

The design choice of prediction conditioned on timestamp is motivated by the fact that over a sparse user mobility trace, the time intervals between successive check-ins significantly vary. Subsequently, to predict the next check-in in the sequence, the predicted next POI should vary across different “query” time. Therefore, REPLAY is designed to make predictions conditioned on a query time. The same or similar settings exist in the literature by using a query timestamp (same as ours) [37, 32, 15], a query time interval (to predict the location of a user after a given time period) [57, 12, 13, 58], or a fixed query time threshold (to predict the location of a user within a forthcoming period of time) [49, 59, 58].

Note that an alternative setting is to make predictions on the next POI without considering the query time [7, 14, 17]. We do not advocate for this setting in this paper as it always makes the same prediction for all future time given an input mobility trace, which is less reasonable in real-world scenarios.

3.4 Discussions

In this section, we discuss how REPLAY captures the time-varying temporal regularities by comparing the utility of time interval and (smoothed) timestamps.

Time interval captures the temporal regularity of human mobility over sparse trajectories. Real-world mobility trajectories are often sparse and incomplete, which hinders the application of recurrent neural network models for mobility modeling. Our flashback mechanism is able to effectively utilize the rich spatiotemporal context to search the historical hidden states for location prediction, where the historical hidden states temporally close to the periodicity peaks of returning probability (as shown in Figure 1) are usually more informative for location prediction. For example, Figure 3 shows that using time interval (ΔT1\Delta T\approx 1 day), we can quickly find similar temporal context in sparse trajectories and retrieve the corresponding hidden state for prediction.

Refer to caption
Figure 3: A toy example showing the working principle of REPLAY on modeling time-varying temporal regularities of human mobility.

Smoothed Timestamps further captures the temporal variation (uncertainty) of the temporal regularities. Specifically, even though the time interval in the flashback mechanism is able to capture the temporal regularity over sparse trajectories to some extent, human mobility has an intrinsic uncertainty, which implies that the regular activity of a user might be recorded at slightly different timestamps in a day in sparse trajectories, for example. We reveal in this paper that the degree of such uncertainty (or the strength of the temporal regularities) varies across time (e.g., across different time periods during a day as shown in Figure 1). Subsequently, the smoothed timestamp embeddings are used to accommodate such variations. Instead of directly using a specific timestamp embedding, we smooth it with the neighboring timestamps as shown in Figure 3, where the timestamp-specific learnable bandwidths can automatically adapt to the temporal regularities of different strengths across different timestamps.

3.5 Model Training

The training process of REPLAY follows the traditional backpropagation-through-time training process of recurrent neural networks [60]. For each prediction, we compute the cross-entropy loss between the probability distribution of POIs output by REPLAY and the ground truth POIs, which are minimized using Adam optimizer [61]. The loss function is defined as follows:

=1Uj=1U1Nj1i=1Nj1yi+1,jlog(p(y^i+1,j|i,sk,uj))\mathcal{L}=-\frac{1}{U}\sum^{U}_{j=1}\frac{1}{N_{j}-1}\sum^{N_{j}-1}_{i=1}y_{i+1,j}log(p(\hat{y}_{i+1,j}|\mathcal{H}_{i},s_{k},u_{j}))

where UU is the number of users; NjN_{j} is the sequence length of jj-th user; yi+1,jy_{i+1,j} is the one-hot vector of ground truth POI for (ii+11)-th check-in of jj-th user. p(y^i+1,j|i,sk,uj)p(\hat{y}_{i+1,j}|\mathcal{H}_{i},s_{k},u_{j}) is the predicted probability distribution given the current aggregated hidden state i\mathcal{H}_{i}, smoothed query timestamp sks_{k} (transformed from time ti+1t_{i+1}), and user embedding uju_{j}. For more training and implementation details of REPLAY, please refer to our code available online222https://github.com/UM-Data-Intelligence-Lab/REPLAY.

4 Experiments

We conduct an extensive evaluation of REPLAY on location prediction tasks. In the following, we start by presenting our experimental setup, followed by our results and discussions.

4.1 Experiment Setup

We present our experiment setup below, including datasets, baselines, and our experiment settings.

4.1.1 Dataset

We evaluate our model on two widely used real-world datasets: Gowalla [23] and Foursquare [25], respectively. Table I shows the dataset statistics. Each mobility trace is chronologically split into 80% for training and 20% for testing.

Note that the original datasets use UTC time (Coordinated Universal Time) and it is inappropriate to directly use such time for modeling the temporal regularity of users worldwide, as the user mobility regularity depends on local time. We thus transform the UTC time of each check-in to a local time according to the time zone of the check-in location (GPS coordinates).

TABLE I: Datasets Statistics
Dataset Gowalla Foursquare
#Users 52,979 46,065
#POIs 121,851 69,005
#Checkins 3,300,986 9,450,342
Collection period 02/2009-10/2010 04/2012-01/2014
Median time between
successive check-ins
11.09 hours
(0.46 days)
16.72 hours
(0.70 days)

4.1.2 Baselines

Our baseline techniques can be divided into five types:

  • User Preference-based Methods. WRMF [62] converts implicit feedback into confidence values and employs alternating least squares for the computation of user and item latent factors. BPR proposed by [63] is a recommendation approach specifically designed for implicit feedback based on ranking, which employs a pairwise ranking loss to learn user preferences; it exploits direct user-item interaction to separate negative items from positive items to alleviate the sparsity of data and improves the performance. For these two methods, We employ the default configurations of LibRec333https://github.com/guoguibing/librec.

  • Feature-based methods: Most Frequent Time (MFT) proposed by [37] based on the observation that users often interact with POIs at specific times or during particular time intervals (24 hours a day) to rank a POI. LBSN2Vec [32] employs a random walk-with-stay scheme to automatically learn embeddings from an LBSN hypergraph; for location prediction tasks, LBSN2Vec ranks POIs by maximizing the cosine similarity with user and time embeddings. In our experiments, the ratio of mobility data, the number of the window size of random walk, and the negative sample number are set to {1, 10, 10}, respectively.

  • Markov-Chain-based Methods: FPMC [50] uses matrix factorization techniques to factorize a personalized transition matrix in user-POI latent factors. The dimension of factorization and the negative sample numbers are set to {32, 10} in our experiments. PRME [49] considers spatial constraints of human mobility and extends FPMC to location prediction problems; it captures the personalized POI mobility transition patterns by learning user-POI embeddings. The component weight, the hidden state size, and the threshold are set to {0.2, 20, 360} in our experiments. TribeFlow [57] predicts user trajectories using semi-Markov transition probabilities over latent space based on a mixture model. The number of transitions, cores, batches, and iterations are set to {0.3, 20, 20, 2000}, respectively.

  • Basic RNNs: RNN [64] is a basic Recurrent Neural Network architecture; for location prediction tasks, it captures sequential patterns from check-in datasets. LSTM [4] is a variant of RNN, which contains three multiplicative gates and a memory cell to hold long-term dependencies. GRU [3] is equipped with two gates to control the information flow for the location prediction task. For all three RNN architectures and across all datasets, the hidden size is set to 10 in our experiments.

  • Spatiotemporal Sequence Models: DeepMove [58, 65] proposes an attentional recurrent network for mobility prediction from lengthy and sparse trajectories. We employ default settings as suggested by the paper, and the embedding and the hidden size are set to {50, 10}, respectively. STRNN [7] adopts time and spatial-specific transition matrices in RNN to model mobility patterns. STGN [12] adds spatial and temporal gates to the conventional LSTM to capture sequential patterns of users. STGCN [12] uses coupled input and forget gates in STGN. The hidden size is set to 10 in our experiment. STAN [51] utilizes self-attention layers along the trajectory to capture the point-to-point interaction between non-adjacent check-ins. The negative sample number, embedding size, and trajectory sequence are set to {10, 50, 100}, respectively. GetNext [66] is a graph-enhanced transformer model that exploits user global trajectory flow map for location prediction; for a fair comparison, the category of POIs are uniformly assigned to the same. The user embedding size, GCN layers’ channel numbers, POIs embedding size, and transformer encoder embedding size are set to {32, 64, 128, 1024}, respectively. Flashback [14] uses a context-aware weighting mechanism to search informative past hidden states; Graph-Flashback [17] adds knowledge graph embeddings techniques to the flashback network, where the social relationships between users are also used. We set α\alpha and β\beta as suggested by Flashback in Graph-Flashback and Flashback.

TABLE II: Overall location prediction performance, where the best-performing ones are highlighted. Methods make predictions using query time information as a query timestamp (MFT, LBSN2Vec, and REPLAY), as a query time interval (TribeFlow, DeepMove, STGN, STGCN, STAN), or as a query time threshold (PRME). Note that DeepMove predicts locations in a fixed future time slot, implicitly implying a fixed query time interval. The top two best-performing results are denoted in bold and underlined, respectively.
Method Gowalla Foursquare
Acc@1 Acc@5 Acc@10 MRR Acc@1 Acc@5 Acc@10 MRR
User Preference based Methods WRMF 0.0112 0.0260 0.0367 0.0178 0.0278 0.0619 0.0821 0.0427
BPR 0.0131 0.0363 0.0539 0.0235 0.0315 0.0828 0.1143 0.0538
Feature-based Methods MFT 0.0525 0.0948 0.1052 0.0717 0.1945 0.2692 0.2788 0.2285
LBSN2Vec 0.0864 0.1186 0.1390 0.1032 0.2190 0.3955 0.4621 0.2781
Markov-Chain based Methods FPMC 0.0479 0.1668 0.2411 0.1126 0.0753 0.2384 0.3348 0.1578
PRME 0.0740 0.2146 0.2899 0.1504 0.0982 0.3167 0.4064 0.2040
TribeFlow 0.0256 0.0723 0.1143 0.0583 0.0297 0.0832 0.1239 0.0645
Basic RNNs RNN 0.0881 0.2140 0.2717 0.1507 0.1824 0.4334 0.5237 0.2984
LSTM 0.0621 0.1637 0.2182 0.1144 0.1144 0.2949 0.3761 0.2018
GRU 0.0528 0.1416 0.1915 0.0993 0.0606 0.1797 0.2574 0.1245
Spatiotemporal Sequence Models STRNN 0.0900 0.2120 0.2730 0.1508 0.2290 0.4310 0.5050 0.3248
DeepMove 0.0625 0.1304 0.1594 0.0982 0.2400 0.4319 0.4742 0.3270
STGN 0.0624 0.1586 0.2104 0.1125 0.2094 0.4734 0.5470 0.3283
STGCN 0.0546 0.1440 0.1932 0.1017 0.1878 0.4502 0.5329 0.3062
STAN 0.0891 0.2096 0.2763 0.1523 0.2265 0.4515 0.5310 0.3420
GETNext 0.0912 0.2003 0.2487 0.1484 0.1862 0.4702 0.5763 0.3153
Flashback 0.1158 0.2754 0.3479 0.1925 0.2496 0.5399 0.6236 0.3805
Graph-Flashback 0.1512 0.3425 0.4256 0.2422 0.2805 0.5757 0.6514 0.4136
REPLAY 0.1866 0.3476 0.4124 0.2635 0.3529 0.5953 0.6648 0.4638

4.1.3 Experiment Settings

We developed our model using the PyTorch framework and conducted experiments on the following hardware platform (CPU: Intel(R) Xeon(R) Gold 5320, GPU: NVIDIA GeForce RTX 3090). We evaluate REPLAY and the baselines in the location prediction task, aiming to predict the next POI a user will visit, based on her historical check-in sequence and a query time, as illustrated in Figure 2. We use two widely used evaluation metrics: Mean Reciprocal Rank (MRR) and average Accuracy@N (Acc@N), we adopt N = {1, 5, 10} in our experiment. We empirically set the dimension of hidden states and all (POI, timestamp, and user) embedding sizes as 10, and the temporal decay factor α\alpha and spatial decay factor β\beta following the suggested settings in [14], i.e., α=0.1\alpha=0.1 on both datasets, β=1000\beta=1000 and 100100 on Gowalla and Foursquare datasets, respectively, and the distance to flashback as 20. We implement our REPLAY with different RNN architectures (including vanilla RNN, GRU and LSTM), and systematically compare their performance in an ablation study below; in other experiments, we report the results of REPLAY using the best RNN architecture setting (vanilla RNN), if not specified otherwise.

4.2 Location Prediction Performance

Table II shows overall location prediction performance. First, we find that REPLAY achieves the best performance in most cases, yielding 7.7% and 10.9% improvement over the best-performing baselines on Gowalla and Foursquare, respectively. Moreover, REPLAY significantly outperforms the best baselines in Acc@1 (by 23.4% and 25.8% on Gowalla and Foursquare, respectively). The only exception is on Acc@10 where Graph-Flashback achieves slightly better performance than REPLAY; note that beyond user mobility trajectories, Graph-Flashback further uses user social relationships as input.

We also highlight in Table II the methods that make predictions using query time information in different formats (e.g., a query timestamp, a query time interval, or a query time threshold). However, we did not observe a consistent improvement of these methods over others that make predictions without using query time information. This is due to the fact that the distinct models adopted by these baseline methods dominate the prediction performance. Moreover, the query time information is often used by existing methods on an ad-hoc basis without a systematic study of its impact. In this context, we provide a systematic ablation study below to verify the usefulness of the query time in location prediction.

TABLE III: Ablation study on REPLAY
Method Gowalla Foursquare
Acc@1 Acc@5 Acc@10 MRR Acc@1 Acc@5 Acc@10 MRR
Flashback (RNN) 0.1158 0.2754 0.3479 0.1925 0.2496 0.5399 0.6236 0.3805
Flashback (LSTM) 0.1024 0.2575 0.3317 0.1778 0.2398 0.5169 0.6014 0.3654
Flashback (GRU) 0.0979 0.2526 0.3267 0.1731 0.2375 0.5154 0.6003 0.3631
REPLAY-noSTE (RNN) 0.1362 0.3099 0.3845 0.2185 0.3055 0.5690 0.6415 0.4251
REPLAY-noSTE (LSTM) 0.1263 0.2980 0.3724 0.2079 0.3002 0.5663 0.6396 0.4209
REPLAY-noSTE (GRU) 0.1356 0.3099 0.3856 0.2183 0.2993 0.5634 0.6360 0.4192
REPLAY-noQT (RNN) 0.1397 0.3169 0.3924 0.2237 0.2817 0.5744 0.6489 0.4139
REPLAY-noQT (LSTM) 0.1343 0.3123 0.3873 0.2187 0.2739 0.5620 0.6360 0.4043
REPLAY-noQT (GRU) 0.1319 0.3105 0.3866 0.2168 0.2716 0.5623 0.6376 0.4029
REPLAY-MultiG (RNN) 0.1856 0.3357 0.3942 0.2570 0.3501 0.5894 0.6578 0.4598
REPLAY-MultiG (LSTM) 0.1808 0.3314 0.3904 0.2527 0.3440 0.5816 0.6493 0.4529
REPLAY-MultiG (GRU) 0.1798 0.3310 0.3897 0.2520 0.3426 0.5797 0.6477 0.4513
REPLAY-FixedB(RNN) 0.1861 0.3419 0.4024 0.2601 0.3513 0.5899 0.6575 0.4607
REPLAY-FixedB (LSTM) 0.1837 0.3411 0.4032 0.2587 0.3438 0.5817 0.6498 0.4528
REPLAY-FixedB (GRU) 0.1817 0.3374 0.3991 0.2558 0.3418 0.5807 0.6487 0.4512
REPLAY (RNN) 0.1866 0.3476 0.4124 0.2635 0.3529 0.5953 0.6648 0.4638
REPLAY (LSTM) 0.1848 0.3467 0.4100 0.2618 0.3448 0.5850 0.6544 0.4549
REPLAY (GRU) 0.1826 0.3449 0.4091 0.2597 0.3430 0.5831 0.6524 0.4530

4.3 Ablation Study

We consider the following variation of REPLAY and results are shown in Table III.

  • REPLAY-noSTE is a variant of REPLAY without the Smoothed Timestamp Embeddings (noSTE). It is also equivalent to Flashback by adding the query time interval for prediction.

  • REPLAY-noQT is a variant of REPLAY making predictions without using the Query Time (noQT). It is also equivalent to Flashback integrated with the smoothed timestamp embeddings.

  • REPLAY-MultiG is a variant of REPLAY combining multi-granularity (hour-in-day and day-in-week) timestamp embeddings.

  • REPLAY-FixedB is a variant of REPLAY with a universal fixed non-learnable bandwidth.

  • Flaskback can also be considered as a variant of REPLAY without the smoothed timestamp embeddings or query time.

For each method, we instantiate it with the three common RNN architectures (vanilla RNN, LSTM, or GRU). First, we notice that REPLAY outperforms REPLAY-noSTE by 20.6% and 7.5% on Gowalla and Foursquare, respectively. This demonstrates our smoothed timestamp embeddings indeed boost performance significantly by flexibly capturing the time-varying temporal regularities of human mobility. Second, we find that REPLAY outperforms REPLAY-noQT by 18.0% and 11.1% on Gowalla and Foursquare, respectively, which shows the effectiveness of our prediction conditioned on query time as the query time intuitively provides additional information for location prediction problems. Moreover, REPLAY significantly outperforms our previous work Flashback by 43.9% and 21.8% on Gowalla and Foursquare, respectively, which verifies the effectiveness of integration smoothed timestamp embeddings and making predictions conditioned on query time.

To future explore the impact of different granularities and the timestamp-specific learnable bandwidths, we conducted experiments on two variants of REPLAY, named REPLAY-MultiG (Multi-Granularity) and REPLAY-FixedB (Fixed Bandwidth). From Table III, we observe that REPLAY consistently yields better performance (3.4% and 0.6% improvement on Gowalla and Foursquare, respectively) over REPLAY-MultiG, which further validates our design choice of hour-in-week timestamps. Besides, we find that REPLAY with timestamp-specific learnable bandwidths consistently outperforms, with improvements of 1.5% on Gowalla and 0.6% on Foursquare compared to REPLAY-FixedB. This highlights the importance of utilizing timestamp-specific learnable bandwidths in location prediction.

In addition, we also investigate the performance of using different RNN architectures (vanilla RNN, GRU or LSTM). We observe that the vanilla RNN yields the best performance in general, while LSTM and GRU show comparable results. This is probably due to the fact that the flashback mechanism is able to effectively find useful historical hidden states using spatiotemporal contexts for location prediction, which indeed weakens the utility of retaining long-term memory by LSTM or GRU.

Interestingly, we find that REPLAY is more robust than Flashback against different RNNs. Specifically, we use the coefficient of variation [67] to evaluate robustness, which is the standard deviation ratio to a variable’s mean. We compute the coefficient of variation on the performance using different RNNs, and compare the results from different methods. Table IV shows that compared to basic RNNs, Flashback can effectively reduce the coefficient of variation from 21.73% and 41.84% to 5.59% and 2.55% on Gowalla and Foursquare, respectively. More importantly, REPLAY can further reduce the coefficient of variation to 0.73% and 1.26% on Gowalla and Foursquare, respectively. In other words, integrating smoothed timestamp embeddings and making predictions conditioned on query time can make our model more robust against different RNNs.

TABLE IV: Coefficient of variation in MRR over Different RNN Architectures (RNN, GRU, LSTM).
Method Gowalla Foursquare
Basic RNNs 21.73% 41.84%
Flashback 5.59% 2.55%
REPLAY-noSTE 2.82% 0.71%
REPLAY-noQT 1.64% 1.47%
REPLAY-MultiG 1.06% 0.99%
REPLAY-FixedB 0.84% 1.12%
REPLAY 0.73% 1.26%
TABLE V: Settings on Time Granularities and Scales
Scale Granularity #Timestamps
Day Minute 1,440
Hour 24
Weekday& Weekend Minute 2,880
Hour 48
Week Minute 10,080
Hour 168
TABLE VI: Impact of different time granularities and scales on location prediction performance
Scale Granularity Gowalla Foursquare
Acc@1 Acc@5 Acc@10 MRR
Time per
epoch
Acc@1 Acc@5 Acc@10 MRR
Time per
epoch
Day Minute 0.1278 0.2918 0.3610 0.2060 41.35s 0.2801 0.5723 0.6468 0.4121 122.14s
Hour 0.1777 0.3411 0.4063 0.2555 39.35s 0.3483 0.5927 0.6622 0.4601 101.85s
Weekday& Weekend Minute 0.1804 0.3423 0.4090 0.2577 44.98s 0.3530 0.5938 0.6631 0.4633 142.74s
Hour 0.1821 0.3442 0.4097 0.2594 37.69s 0.3505 0.5941 0.6632 0.4618 102.33s
Week Minute 0.1756 0.3276 0.3891 0.2485 72.41s 0.3532 0.5940 0.6638 0.4636 264.00s
Hour 0.1866 0.3476 0.4124 0.2635 44.20s 0.3529 0.5953 0.6648 0.4638 103.49s

4.4 Temporal Regularities on Different Time Granularities and Scales

We investigate the impact of different time granularities (i.e., minute or hour) and time scales (i.e., day, weekday&weekend or week) in our smoothed timestamp embeddings used by REPLAY. Table V shows the settings we considered and their corresponding number of timestamps. Note that for the time scale weekday&weekend, we define two sets of daily cyclical timestamps for weekday and weekend, respectively; the similarity between the timestamps in each set is computed independently from the other set. This setting is motivated by the fact that human mobility patterns are often split into weekday and weekend patterns by existing studies [34]. The performance comparison between different settings is shown in Table VI.

First, we observe that the week scale is consistently better than the day scale. For example, on the hour granularity (the best time granularity setting as we discuss below), the week scale outperforms the day scale by 2.89% and 0.74% on Gowalla and Foursquare, respectively. This is due to the fact that the former can capture the different temporal regularities across different days in a week, in particular the different regularities on weekdays and weekends (more detail in Section 4.5 below), while the latter cannot. Moreover, compared to the weekday&weekend scale which can also capture the different regularities on weekdays and weekends, the week scale still achieves consistently better performance (with 1.43% and 0.4% improvement on Gowalla and Foursquare, respectively, under the hour granularity). Because compared to the weekday&weekend scale, the week scale can subtly capture the regularity differences between different weekdays (also between different days in the weekend).

Second, we observe that the hour granularity is generally better than the minute granularity, as the minute granularity is often over-specific to model the temporal regularity of user mobility. One exception is on the week scale on Foursquare where we observe comparable results; however, in this case, the minute granularity shows significantly worse runtime performance (more than twice the training time as shown in Table VI) than the hour granularity, as it needs to learn much more embeddings for minute-in-week timestamps (10,080 in total) than for hour-in-week timestamps (168 in total). Moreover, we also conducted experiments on multi-granularity combining both hour-in-day and day-in-week timestamp embeddings in Section 4.3; the results show 3.4% and 0.6% improvement of our REPLAY over this multi-granularity setting, on Gowalla and Foursquare, respectively. In summary, we choose hour-in-week timestamps in REPLAY as the default setting.

Refer to caption
Refer to caption
(a) Hours in a week
Refer to caption
Refer to caption
(b) Hours on weekdays
Refer to caption
Refer to caption
(c) Hours on weekends
Figure 4: The learnt bandwidths and the corresponding location prediction performance across different timestamps.

4.5 Revealing the Time-Varying Temporal Regularities

In this experiment, we investigate the time-vary temporal regularities of user mobility using the learnt bandwidths from REPLAY. Figure 4(a) shows the values of the learnt bandwidths for the 168 hour-in-week timestamps and the corresponding location prediction performance in MRR for each timestamp on both Gowalla and Foursquare datasets.

First, we find a clear daily pattern, where the bandwidths in the morning have smaller values in general compared to other time periods; this implies a stronger temporal regularity of the morning mobility (often working-related activities), as the smoothed timestamp embeddings require less information from the neighboring timestamps for location prediction. In contrast, the learnt bandwidths in the nighttime often have larger values, implying a weaker temporal regularity in the nighttime (often entertainment-related activities in LBSNs), as the smoothed timestamp embeddings require more information from the neighboring timestamps for location prediction. Moreover, we further compare the bandwidths between weekdays and weekends, by plotting the average bandwidth for each hour on weekdays and on weekends, as shown in Figure 4(b) and 4(c), respectively. We observe that despite the similarity of having smaller bandwidths in the morning, the bandwidths on weekends have larger values than on weekdays, which implies that weekend activities are less regular than weekdays in general. Moreover, we find that the weekend afternoon sometimes shows the weakest regularity evidenced by the largest learnt bandwidth.

Second, we also discover a distinct daily pattern in the performance of location prediction. As shown in Figure 4, the location prediction performance shows an inverse trend compared to the bandwidths. This is due to the fact that the weaker regularity of user mobility at a certain timestamp (in the nighttime or on weekends, for example) makes it harder for location prediction; subsequently, the learnt bandwidth has a larger value, as the query timestamp is less confident for location prediction and thus resorts more to its neighboring timestamps for prediction, and vice versa.

TABLE VII: Location prediction performance and the corresponding average bandwidth in different prediction periods.
Prediction time period Gowalla Foursquare
Average MRR Average Bandwidth Average MRR Average Bandwidth
Daytime (6:00-18:00) 0.2767 0.85 0.4856 1.76
Nighttime (18:00-6:00) 0.2362 0.88 0.4302 2.00
Weekday 0.2922 0.82 0.4841 1.84
Weekend 0.1898 0.98 0.3758 2.00

Finally, we quantitatively verify our motivating example where the strength of the temporal regularities varies across typical time periods (daytime/nighttime and weekday/weekend). Table VII shows the average MRR and learnt bandwidths in typical time periods. We see that the daytime prediction performance is 15.0% better (with 7.5% lower bandwidth) on average than the nighttime, while the weekday prediction performance is 41.4% better (with 11.9% lower bandwidth) on average than the weekend.

5 Conclusion

In this paper, motivated by the time-varying temporal regularities of user mobility, we propose REPLAY, a general RNN architecture designed to capture such varying regularities for location prediction. REPLAY seamlessly incorporates smoothed timestamp embeddings with learnable bandwidths into a flashback mechanism. The timestamp-specific learnable bandwidths can automatically learn to adapt to the temporal regularities of different strengths across different timestamps. We conduct a thorough evaluation using two real-world LBSN datasets. Results show that REPLAY outperforms a wide range of state-of-the-art location prediction methods, with 7.7%-10.9% improvement over the best-performing baselines. Moreover, we reveal interesting temporal regularity patterns using the learnt bandwidths: 1) morning mobility consistently shows a stronger regularity compared to other time periods; 2) weekend mobility is less regular than weekdays in general, while the weekend afternoon sometimes shows the weakest regularity.

In the future, we plan to further investigate the temporal regularities varying across different user groups and POI categories.

References

  • [1] A. Noulas, S. Scellato, N. Lathia, and C. Mascolo, “Mining user mobility features for next place prediction in location-based services,” in ICDM.   IEEE, 2012, pp. 1038–1043.
  • [2] T. Mikolov, M. Karafiát, L. Burget, J. Černockỳ, and S. Khudanpur, “Recurrent neural network based language model,” in INTERSPEECH, 2010.
  • [3] K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, “Learning phrase representations using rnn encoder-decoder for statistical machine translation,” arXiv:1406.1078, 2014.
  • [4] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural computation, vol. 9, no. 8, pp. 1735–1780, 1997.
  • [5] B. Deng, D. Yang, B. Qu, B. Fankhauser, and P. Cudre-Mauroux, “Robust location prediction over sparse spatiotemporal trajectory data: Flashback to the right moment!” ACM Transactions on Intelligent Systems and Technology, vol. 14, no. 5, pp. 1–24, 2023.
  • [6] M. C. Gonzalez, C. A. Hidalgo, and A.-L. Barabasi, “Understanding individual human mobility patterns,” Nature, vol. 453, no. 7196, p. 779, 2008.
  • [7] Q. Liu, S. Wu, L. Wang, and T. Tan, “Predicting the next location: A recurrent model with spatial and temporal contexts,” in AAAI, 2016.
  • [8] D. Neil, M. Pfeiffer, and S.-C. Liu, “Phased lstm: Accelerating recurrent network training for long or event-based sequences,” in NIPS, 2016, pp. 3882–3890.
  • [9] Y. Zhu, H. Li, Y. Liao, B. Wang, Z. Guan, H. Liu, and D. Cai, “What to do next: Modeling user behaviors by time-lstm.” in IJCAI, 2017, pp. 3602–3608.
  • [10] D. Kong and F. Wu, “Hst-lstm: A hierarchical spatial-temporal long-short term memory network for location prediction.” in IJCAI, 2018, pp. 2341–2347.
  • [11] Q. Cui, Y. Tang, S. Wu, and L. Wang, “Distance2pre: Personalized spatial preference for next point-of-interest prediction,” in PAKDD.   Springer, 2019, pp. 289–301.
  • [12] P. Zhao, H. Zhu, Y. Liu, J. Xu, Z. Li, F. Zhuang, V. S. Sheng, and X. Zhou, “Where to go next: A spatio-temporal gated network for next poi recommendation,” in AAAI, vol. 33, 2019, pp. 5877–5884.
  • [13] P. Zhao, A. Luo, Y. Liu, F. Zhuang, J. Xu, Z. Li, V. S. Sheng, and X. Zhou, “Where to go next: A spatio-temporal gated network for next poi recommendation,” IEEE Transactions on Knowledge and Data Engineering, 2020.
  • [14] D. Yang, B. Fankhauser, P. Rosso, and P. Cudré-Mauroux, “Location prediction over sparse user mobility traces using rnns: Flashback in hidden states!” in IJCAI, 2020, pp. 2184–2190.
  • [15] X. Li, R. Hu, Z. Wang, and T. Yamasaki, “Location predicts you: Location prediction via bi-direction speculation and dual-level association.” in IJCAI, 2021, pp. 529–536.
  • [16] Y. Cao, A. Li, J. Lou, M. Chen, X. Zhang, and B. Kang, “An attention-based bidirectional gated recurrent unit network for location prediction,” in 2021 13th International Conference on Wireless Communications and Signal Processing (WCSP).   IEEE, 2021, pp. 1–5.
  • [17] X. Rao, L. Chen, Y. Liu, S. Shang, B. Yao, and P. Han, “Graph-flashback network for next location recommendation,” in Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2022, pp. 1463–1471.
  • [18] X. Liu, Y. Yang, Y. Xu, F. Yang, Q. Huang, and H. Wang, “Real-time poi recommendation via modeling long-and short-term user preferences,” Neurocomputing, vol. 467, pp. 454–464, 2022.
  • [19] J. Wu, R. Hu, D. Li, L. Ren, W. Hu, and Y. Xiao, “Where have you been: Dual spatiotemporal-aware user mobility modeling for missing check-in poi identification,” Information Processing & Management, vol. 59, no. 5, p. 103030, 2022.
  • [20] Z. Ye, X. Zhang, X. Chen, H. Xiong, and D. Yu, “Adaptive clustering based personalized federated learning framework for next poi recommendation with location noise,” IEEE Transactions on Knowledge and Data Engineering, 2023.
  • [21] T. Yang, Y. Gao, Z. Huang, Y. Liu et al., “Uptdnet: A user preference transfer and drift network for cross-city next poi recommendation,” International Journal of Intelligent Systems, vol. 2023, 2023.
  • [22] A. Noulas, S. Scellato, R. Lambiotte, M. Pontil, and C. Mascolo, “A tale of many cities: universal patterns in human urban mobility,” PloS one, vol. 7, no. 5, p. e37027, 2012.
  • [23] E. Cho, S. A. Myers, and J. Leskovec, “Friendship and mobility: user movement in location-based social networks,” in KDD.   ACM, 2011, pp. 1082–1090.
  • [24] D. Lian, X. Xie, V. W. Zheng, N. J. Yuan, F. Zhang, and E. Chen, “Cepr: A collaborative exploration and periodically returning model for location prediction,” TIST, vol. 6, no. 1, p. 8, 2015.
  • [25] D. Yang, B. Qu, and P. Cudre-Mauroux, “Location-centric social media analytics: Challenges and opportunities for smart cities,” IEEE Intelligent Systems, vol. 36, no. 5, pp. 3–10, 2020.
  • [26] E. G. Ravenstein, “The laws of migration,” Journal of the statistical society of London, vol. 48, no. 2, pp. 167–235, 1885.
  • [27] C. Doulkeridis, A. Vlachou, N. Pelekis, and Y. Theodoridis, “A survey on big data processing frameworks for mobility analytics,” ACM SIGMOD Record, vol. 50, no. 2, pp. 18–29, 2021.
  • [28] Y. Zheng, Q. Li, Y. Chen, X. Xie, and W.-Y. Ma, “Understanding mobility based on gps data,” in Proceedings of the 10th international conference on Ubiquitous computing, 2008, pp. 312–321.
  • [29] J. K. Laurila, D. Gatica-Perez, I. Aad, O. Bornet, T.-M.-T. Do, O. Dousse, J. Eberle, M. Miettinen et al., “The mobile data challenge: Big data for mobile computing research,” Tech. Rep., 2012.
  • [30] C. Guo, B. Yang, J. Hu, and C. Jensen, “Learning to route with sparse trajectory sets,” in 2018 IEEE 34th International Conference on Data Engineering (ICDE).   IEEE, 2018, pp. 1073–1084.
  • [31] L. Shi, C. Huang, M. Liu, J. Yan, T. Jiang, Z. Tan, Y. Hu, W. Chen, and X. Zhang, “Urbanmotion: Visual analysis of metropolitan-scale sparse trajectories,” IEEE Transactions on Visualization and Computer Graphics, vol. 27, no. 10, pp. 3881–3899, 2020.
  • [32] D. Yang, B. Qu, J. Yang, and P. Cudre-Mauroux, “Revisiting user mobility and social relationships in lbsns: a hypergraph embedding approach,” in WWW.   ACM, 2019, pp. 2147–2157.
  • [33] G. Wang, S. Y. Schoenebeck, H. Zheng, and B. Y. Zhao, “”will check-in for badges”: Understanding bias and misbehavior on location-based social networks,” in Tenth International AAAI Conference on Web and Social Media, 2016.
  • [34] E. M. R. Oliveira, A. C. Viana, C. Sarraute, J. Brea, and I. Alvarez-Hamelin, “On the regularity of human mobility,” Pervasive and Mobile Computing, vol. 33, pp. 73–90, 2016.
  • [35] J. Ye, Z. Zhu, and H. Cheng, “What’s your next move: User activity prediction in location-based social networks,” in SDM.   SIAM, 2013, pp. 171–179.
  • [36] D. Yang, D. Zhang, V. W. Zheng, and Z. Yu, “Modeling user activity preference by leveraging user spatial temporal characteristics in lbsns,” TSMC, vol. 45, no. 1, pp. 129–142, 2015.
  • [37] H. Gao, J. Tang, and H. Liu, “Exploring social-historical ties on location-based social networks,” in ICWSM, 2012.
  • [38] A. Sadilek, H. Kautz, and J. P. Bigham, “Finding your friends and following them to where you are,” in WSDM.   ACM, 2012, pp. 723–732.
  • [39] M. Xie, H. Yin, H. Wang, F. Xu, W. Chen, and S. Wang, “Learning graph-based poi embedding for location-based recommendation,” in CIKM.   ACM, 2016, pp. 15–24.
  • [40] S. Feng, G. Cong, B. An, and Y. M. Chee, “Poi2vec: Geographical latent representation for predicting future visitors,” in AAAI, 2017.
  • [41] T. Qian, B. Liu, Q. V. H. Nguyen, and H. Yin, “Spatiotemporal representation learning for translation-based poi recommendation,” TOIS, vol. 37, no. 2, p. 18, 2019.
  • [42] T. Kurashima, T. Iwata, T. Hoshide, N. Takaya, and K. Fujimura, “Geo topic model: joint modeling of user’s activity area and interests for location recommendation,” in WSDM.   ACM, 2013, pp. 375–384.
  • [43] D. Yang, D. Zhang, Z. Yu, and Z. Wang, “A sentiment-enhanced personalized location recommendation system,” in HT, 2013.
  • [44] H. Yin, Y. Sun, B. Cui, Z. Hu, and L. Chen, “Lcars: a location-content-aware recommender system,” in KDD.   ACM, 2013, pp. 221–229.
  • [45] D. Lian, C. Zhao, X. Xie, G. Sun, E. Chen, and Y. Rui, “Geomf: joint geographical modeling and matrix factorization for point-of-interest recommendation,” in Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, 2014, pp. 831–840.
  • [46] Y. Wang, N. J. Yuan, D. Lian, L. Xu, X. Xie, E. Chen, and Y. Rui, “Regularity and conformity: Location prediction using heterogeneous mobility data,” in KDD.   ACM, 2015, pp. 1275–1284.
  • [47] W. Mathew, R. Raposo, and B. Martins, “Predicting future locations with hidden markov models,” in UbiComp.   ACM, 2012, pp. 911–918.
  • [48] C. Cheng, H. Yang, M. R. Lyu, and I. King, “Where you like to go next: Successive point-of-interest recommendation,” in IJCAI, 2013.
  • [49] S. Feng, X. Li, Y. Zeng, G. Cong, Y. M. Chee, and Q. Yuan, “Personalized ranking metric embedding for next new poi recommendation,” in IJCAI, 2015.
  • [50] S. Rendle, C. Freudenthaler, and L. Schmidt-Thieme, “Factorizing personalized markov chains for next-basket recommendation,” in WWW.   ACM, 2010, pp. 811–820.
  • [51] Y. Luo, Q. Liu, and Z. Liu, “Stan: Spatio-temporal attention network for next location recommendation,” in Proceedings of the Web Conference 2021, 2021, pp. 2177–2185.
  • [52] D. Lian, Y. Wu, Y. Ge, X. Xie, and E. Chen, “Geography-aware sequential location recommendation,” in Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining, 2020, pp. 2009–2019.
  • [53] Y. Li, T. Chen, Y. Luo, H. Yin, and Z. Huang, “Discovering collaborative signals for next poi recommendation with iterative seq2graph augmentation,” arXiv preprint arXiv:2106.15814, 2021.
  • [54] Z. Wang, Y. Zhu, Q. Zhang, H. Liu, C. Wang, and T. Liu, “Graph-enhanced spatial-temporal network for next poi recommendation,” ACM Transactions on Knowledge Discovery from Data (TKDD), vol. 16, no. 6, pp. 1–21, 2022.
  • [55] N. Lim, B. Hooi, S.-K. Ng, Y. L. Goh, R. Weng, and R. Tan, “Hierarchical multi-task graph recurrent network for next poi recommendation,” 2022.
  • [56] E. A. Nadaraya, “On estimating regression,” Theory of Probability & Its Applications, vol. 9, no. 1, pp. 141–142, 1964.
  • [57] F. Figueiredo, B. Ribeiro, J. M. Almeida, and C. Faloutsos, “Tribeflow: Mining & predicting user trajectories,” in WWW, 2016, pp. 695–706.
  • [58] J. Feng, Y. Li, C. Zhang, F. Sun, F. Meng, A. Guo, and D. Jin, “Deepmove: Predicting human mobility with attentional recurrent networks,” in WWW, 2018, pp. 1459–1468.
  • [59] J.-D. Zhang, C.-Y. Chow, and Y. Li, “Lore: Exploiting sequential influence for location recommendations,” in Proceedings of the 22nd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, 2014, pp. 103–112.
  • [60] P. J. Werbos, “Backpropagation through time: what it does and how to do it,” Proceedings of the IEEE, vol. 78, no. 10, pp. 1550–1560, 1990.
  • [61] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
  • [62] Y. Hu, Y. Koren, and C. Volinsky, “Collaborative filtering for implicit feedback datasets,” in ICDM.   Ieee, 2008, pp. 263–272.
  • [63] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme, “Bpr: Bayesian personalized ranking from implicit feedback,” in UAI.   AUAI Press, 2009, pp. 452–461.
  • [64] Y. Zhang, H. Dai, C. Xu, J. Feng, T. Wang, J. Bian, B. Wang, and T.-Y. Liu, “Sequential click prediction for sponsored search with recurrent neural networks,” in AAAI, 2014.
  • [65] J. Feng, Y. Li, Z. Yang, Q. Qiu, and D. Jin, “Predicting human mobility with semantic motivation via multi-task attentional recurrent networks,” IEEE Transactions on Knowledge and Data Engineering, 2020.
  • [66] S. Yang, J. Liu, and K. Zhao, “Getnext: trajectory flow map enhanced transformer for next poi recommendation,” in Proceedings of the 45th International ACM SIGIR Conference on research and development in information retrieval, 2022, pp. 1144–1153.
  • [67] B. S. Everitt and A. Skrondal, “The cambridge dictionary of statistics,” 2010.
[Uncaptioned image] Bangchao Deng received his B.Eng. degree in Computer Science and Technology from Nanjing University of Aeronautics and Astronautics, China, in 2019. He received his M.S. degree in Computer Science from University of Macau, Macao, China, in 2023. He is currently a Ph.D. student with the State Key Laboratory of Internet of Things for Smart City and Department of Computer and Information Science, University of Macau, Macao, China. His research interests lie in Spatiotemporal Data Mining and Urban Computing.
[Uncaptioned image] Bingqing Qu is an Assistant Professor at the BNU-HKBU United International College. She received her Ph.D. in Computer Science from the University of Rennes 1 in 2016. Before joining the BNU-HKBU United International College, she worked on multimedia data analytics for the Swiss Federal Institute of Technology (EPFL) and the University of Fribourg, Switzerland. Her research interests include multimedia data mining, social media data analytics, and computer vision.
[Uncaptioned image] Pengyang Wang is an Assistant Professor in the State Key Lab of Smart Cities and Internet-of-Things at the University of Macau. He obtained his Ph.D. in Computer Science from the University of Central Florida. His research interests are in data mining, machine learning and big data analytics. Pengyang has received “Global Top 100 Chinese Rising Stars in Artificial Intelligence”, one Best Student Paper Runner-up award of SIGKDD 2018, and one Best Paper Runner-up award of SIGSPATIAL 2020. His research work has been featured by Synced and UCF Today, and also highlighted by the Natural Science Foundation (NSF) of the U.S.
[Uncaptioned image] Dingqi Yang is an Associate Professor with the State Key Laboratory of Internet of Things for Smart City and Department of Computer and Information Science, University of Macau. He received his Ph.D. degree in Computer Science from Pierre and Marie Curie University and Institut Mines-TELECOM/TELECOM SudParis in France, where he won both the CNRS SAMOVAR Doctorate Award and the Press Mention in 2015. Before joining the University of Macau, he worked as a senior researcher at the University of Fribourg in Switzerland. His research interests include big data analytics, ubiquitous computing, and smart city.
[Uncaptioned image] Benjamin Fankhauser received his Bachelor of Science in Computer Science from the Bern University of Applied Sciences in 2016 and his Master Degree in Computer Science from the University of Bern in 2020. He has worked for various tech companies in Switzerland and is becoming a Ph.D. student at the Pattern Recognition Group at University of Bern. His research interests are in computer vision, data mining and deep learning.
[Uncaptioned image] Philippe Cudre-Mauroux is a Full Professor and the director of the eXascale Infolab at the University of Fribourg in Switzerland. He received his Ph.D. from the Swiss Federal Institute of Technology EPFL, where he won both the Doctorate Award and the EPFL Press Mention. Before joining the University of Fribourg he worked on information management infrastructures for IBM Watson Research, Microsoft Research Asia, and MIT. His research interests are in next-generation, Big Data management infrastructures for non-relational data. Webpage: http://exascale.info/phil