Balancing Exploration and Cybersickness: Investigating Curiosity-Driven Behavior in Virtual Environments

Tangyao Li and Yuyang Wang

Tangyao Li and Yuyang Wang are with the Information Hub, the Hong Kong University of Science and Technology (Guangzhou), China, 511453. E-mail: [email protected], [email protected]. Corresponding author. Email: [email protected]

Abstract

During virtual navigation, users exhibit varied interaction and navigation behaviors influenced by several factors. Existing theories and models have been developed to explain and predict these diverse patterns. While users often experience uncomfortable sensations, such as cybersickness, during virtual reality (VR) use, they do not always make optimal decisions to mitigate these effects. Although methods like reinforcement learning have been used to model decision-making processes, they typically rely on random selection to simulate actions, failing to capture the complexities of real navigation behavior. In this study, we propose curiosity as a key factor driving irrational decision-making, suggesting that users continuously balance exploration and cybersickness according to the free energy principle during virtual navigation. Our findings show that VR users generally adopt conservative strategies when navigating, with most participants displaying negative curiosity across trials. However, curiosity levels tend to rise when the virtual environment changes, illustrating the dynamic interplay between exploration and discomfort. This study provides a quantitative approach to decoding curiosity-driven behavior during virtual navigation, offering insights into how users balance exploration and the avoidance of cybersickness. Future research will further refine this model by incorporating additional psychological and environmental factors to improve the accuracy of navigation pattern predictions.

Index Terms:

Virtual navigation, Cybersickness, Curiosity-driven behavior, Exploration.

I Introduction

Virtual reality (VR) enables immersive experiences that transcend the boundaries of time and space, with virtual navigation serving as a core component of this immersion [1]. Through navigation, users explore environments and derive rewards such as completing tasks, discovering new spaces, or enjoying the immersive experience [2, 3, 4]. However, these experiences are shaped by a complex interplay between curiosity and potential risks, particularly cybersickness [5]. This interplay profoundly impacts user behavior, as curiosity drives exploration of novel environments, while the risk of discomfort often tempers such impulses [6]. Understanding this balance is crucial for designing VR systems that are both engaging and comfortable.

Curiosity is a key driver of exploratory behavior in VR, often prompting users to seek out novel and uncertain experiences [7]. Yet, individual differences, such as prior VR experience and personal risk tolerance, lead to diverse interaction patterns [8, 9]. While some users eagerly explore, others adopt conservative strategies to avoid risks, particularly in environments where the likelihood of cybersickness is high [6]. This highlights the importance of investigating how curiosity and reward interact to shape user decision-making. Insights into this relationship could boost the development of personalized VR applications that balance engagement with user comfort [10].

Despite its significance, the dynamic relationship between curiosity and cybersickness remains underexplored. Cybersickness arises from sensory conflicts between visual input and bodily perception during spatial locomotion in VR, leading to symptoms such as dizziness, nausea, and eye strain [11]. These symptoms are prevalent, with 80% to 95% of VR users experiencing some degree of discomfort, and 5% to 30% abandoning sessions due to severe symptoms [12]. While much research has focused on the prevalence and causes of cybersickness, the role of curiosity-driven exploration in exacerbating or mitigating discomfort has received limited attention [7, 13]. Addressing this gap is essential to improving user experiences in VR.

Behavior in VR is further complicated by the inherent uncertainty of virtual environments and the brain’s limited capacity to process real-time information [14]. This often leads to irrational decisions, particularly when users face conflicts between curiosity-driven exploration and reward-seeking behavior [15]. Sensory conflicts can amplify this tension, creating a complex dynamic where curiosity, reward, and discomfort intersect. Understanding these behavioral strategies is key to addressing both the cognitive and physiological challenges users encounter in VR.

Analyzing navigation speed profiles offers a novel approach to studying these dynamics. By examining how users balance curiosity and reward during navigation, we can gain insights into the factors influencing irrational decisions and the severity of cybersickness. Notably, in the context of the current study, the concept of “reward” is modeled not as a beneficial outcome but as a measure of the severity of cybersickness experienced during virtual navigation. This reframing highlights the tension between curiosity-driven exploration and the discomfort associated with sensory conflicts, offering a unique perspective on user behavior in VR.

Ultimately, the severity of cybersickness is closely tied to how users negotiate the tradeoff between immersive rewards and discomfort [16]. Investigating this “reward-curiosity conflict” through user navigation patterns offers critical insights into the mechanisms driving behavior in VR [15]. By modeling exploratory behavior, this research aims to advance VR interaction design, reduce cybersickness, and improve user experience. The contributions are as follows:

• We employed a machine learning framework, the inverse free energy principle (iFEP), to estimate the internal variables underlying decision-making processes and to model user behavior and interaction dynamics in VR environments.
• By applying the iFEP method to time-series data from VR navigation, we identified key variables such as curiosity and the modeled “reward” (quantified as the severity of cybersickness). This approach offers novel insights into how users negotiate the trade-off between curiosity-driven exploration and the risk of discomfort.

Section II reviews studies on individual differences in behavior during virtual navigation, identifying gaps this research addresses. Section III outlines the methodological framework, including the application of the inverse free energy principle (iFEP). Section IV presents and interprets the findings, alongside an evaluation of alternative approaches. Section V discusses the implications of the results for understanding VR user behavior and interaction design.

II Related Work

II-A User behavior patterns in VR

User behavior in virtual environments varies widely due to individual differences. VR provides a controlled platform for experiments, where participants’ movements can be continuously recorded, enabling the collection of validated datasets for comprehensive analysis [17]. These datasets are instrumental in identifying factors that influence user behavior during virtual navigation.

For example, visual gaze analysis has been used to replace traditional controllers for object selection, offering valuable insights into interaction dynamics [18]. Similarly, head orientation is often used to estimate gaze direction in social scenarios, and combined changes in head and hand positions can reveal distinct behavioral patterns [17]. Proxemic information, derived from user positioning, is critical for quantifying behaviors such as social approach and avoidance [19]. For instance, Won et al. [20] leveraged head orientation data to analyze scanning behavior in a virtual classroom, which could indicate student anxiety levels, while Gillath et al. [21] used proxemic patterns to identify prosocial approach behaviors.

Comparisons between VR and real-world proxemic patterns have revealed both similarities and differences. Bailenson et al.[22] reported that proxemic behaviors in VR often mirror those in physical settings. Additionally, walking speeds in virtual environments generally align with real-world speeds [23], although Iryo-Asano et al.[24] found that maximum walking speeds tend to be lower in VR. Notably, Wei et al. [25] reported that users with greater technological experience exhibit higher immersion and confidence, leading to more effective navigation behaviors.

II-B The role of curiosity in decision-making

Curiosity, often defined as a desire for information gain, drives exploratory actions that enhance knowledge about the environment [26, 27, 28]. It plays a pivotal role in decision-making, particularly in immersive environments requiring active exploration, such as VR [7]. In VR settings, curiosity has been shown to motivate users to engage more deeply with their surroundings, fostering active exploration.

Empirical studies shed light on the neural and behavioral dynamics of curiosity. For instance, Jepma et al.[29] demonstrated that curiosity activates brain regions associated with conflict and arousal, while resolving curiosity triggers reward-related neural pathways. Over time, the impact of curiosity on decision-making evolves, as prolonged information gaps can diminish users’ anticipation and reduce their exploratory drive [30]. Despite its potential to introduce discomfort in uncertain situations, individuals are naturally drawn to the unknown, as highlighted by [31] and [30].

To model and investigate curiosity, various methodologies have been employed [15, 30]. The free energy principle (FEP), introduced by Karl Friston under the Bayesian brain hypothesis [32, 33, 34], frames decision-making as the optimization of both reward-seeking and curiosity-driven behaviors. While FEP provides a framework for understanding these dynamics, a limitation lies in the assumption that curiosity and rewards are weighted equally, which does not fully reflect real-world behavior. For example, Konaka and Naoki [35] noted that traditional FEP models assume constant curiosity, neglecting the dynamic nature of user curiosity, which fluctuates over time. Similarly, Millidge et al.[36] criticized the oversimplification of curiosity in bi-choice tasks, where the interplay between curiosity and rewards is more complex and changes based on context.

Machine learning (ML) and reinforcement learning (RL) offer an alternative approach to modeling curiosity [37, 38]. While RL focuses on maximizing future rewards through a series of actions, it also distinguishes between extrinsic rewards (externally set) and intrinsic rewards (self-generated by the agent). Schmidhuber [38] introduced the idea of curiosity as an intrinsic reward, driving exploration even in the absence of external incentives. RL-based models of curiosity-driven exploration are typically divided into novelty-based and prediction-error algorithms. In novelty-based models, curiosity motivates the brain to explore new, unfamiliar tasks, while prediction-error models suggest that curiosity is driven by the need to improve predictions about the environment [38]. Despite the usefulness of these models, they fall short of capturing the complexity of user curiosity in immersive VR settings, where exploration is often intentional and guided by the user’s curiosity, not a passive, random process as often modeled in RL [39].

II-C Cybersickness and its impact on user experience

While curiosity can enhance engagement and exploration, cybersickness poses significant challenges to user experience, often altering behaviors and diminishing immersion. Individual characteristics such as age, gender, health, and VR experience influence the susceptibility and severity of cybersickness [8, 40]. For example, Kolasinski et al. [41] reported that younger users, particularly those aged 2–12, are more prone to cybersickness, while susceptibility decreases with age. Some studies suggest gender differences, with females potentially being more susceptible than males [42], though others have contested this [43].

Experience with VR can mitigate some effects of cybersickness. For instance, gamers tend to exhibit better spatial perception and fewer sickness symptoms due to their familiarity with rapid virtual movements [44, 45]. However, cybersickness remains a critical factor in shaping user behavior and overall satisfaction. Wang et al. [46] found that cybersickness affects walking patterns during navigation, reducing users’ sense of presence. Similarly, Gabel et al. [47] demonstrated that cybersickness could impair performance in tasks such as text reading, negatively impacting retention and comprehension in educational contexts.

Quantitative models of cybersickness have made strides in predicting its onset but often fail to account for the diversity of individual user profiles, limiting their applicability [48, 49]. Understanding the interplay between curiosity and cybersickness offers valuable insights into designing VR systems that balance user engagement with comfort. By identifying patterns in user behavior, designers can mitigate the adverse effects of cybersickness while enhancing exploratory freedom and interaction quality.

III Methods

To analyze users’ irrational curiosity-driven behavior during virtual navigation, we examined brain decision-making processes in immersive environments using the ReCU model and the iFEP method [35]. The ReCU model simulates decision-making by optimizing the expected net utility, while the iFEP method quantitatively decodes this process, estimating key influencing factors. By integrating a curiosity parameter, we accounted for the seemingly irrational decisions made by users, enabling a more nuanced understanding of interaction behavior patterns through quantitative analysis. Furthermore, we investigated how the severity of cybersickness influenced users’ navigation strategies, particularly their decisions to accelerate or rest while exploring virtual environments.

III-A The ReCU Model

The ReCU model was designed to replicate the user’s decision-making process. It assumes that curiosity and reward both influence decision-making: curiosity is one reason a user makes seemingly irrational decisions, and reward refers to the outcome received after choosing an action. In this way, we can examine how VR users decide their navigation speed in an immersive environment and, more specifically, how they balance the desire to explore the VR environment against the severity of cybersickness. The replication process involves two steps. First, the VR user updates the estimated reward probability for each action based on their observations. Second, the VR user chooses an action based on the reward probability and curiosity. According to the ReCU model, the VR user seeks to maximize their expected net utility, which is the sum of the expected reward and the curiosity-weighted expected information gain. The expected net utility at trial $t$ for choosing action $a_{t+1}$ is determined by

U_{t}(a_{t+1})=E[reward_{t+1}]+c_{t}\times E[info_{t+1}],

where $c_{t}$ denotes the curiosity level at trial $t$ and $a_{t}$ denotes the action selected at trial $t$. Curiosity is modeled as a factor that can lead the participant to make irrational decisions and is assumed to fluctuate over time. In our context, reward refers to the presence of cybersickness, which can be seen as a negative reward. In addition, the action selection $a_{t}$ follows a sigmoid function

P(a_{t})=\frac{1}{1+e^{-\beta\Delta U_{t}}},

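where $\beta$ controls the randomness of action selection and $\Delta U_{t}=U_{t}(a_{t+1})-U_{t}(a^{\prime}_{t+1})$ is the difference in expected net utility between the selected action and the alternative action $a^{\prime}_{t+1}$.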
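The two formulas above can be illustrated with a minimal sketch. The numeric values are hypothetical and chosen only for illustration; treating the expected reward of an action as a negative number (more sickening means lower reward) is an assumption that follows the "negative reward" reading in the text, and the function names are not taken from the original implementation.

```python
import math

def expected_net_utility(expected_reward, expected_info, curiosity):
    """U_t(a) = E[reward] + c_t * E[info] for one candidate action."""
    return expected_reward + curiosity * expected_info

def choice_probability(u_action, u_alternative, beta):
    """Sigmoid of the utility difference: 1 / (1 + exp(-beta * dU))."""
    delta_u = u_action - u_alternative
    return 1.0 / (1.0 + math.exp(-beta * delta_u))

def accelerate_probability(curiosity, beta=2.0):
    # Hypothetical single-trial values (assumption): Accelerate is more
    # sickening (lower expected reward) but more informative than Rest.
    u_accelerate = expected_net_utility(-0.6, 0.3, curiosity)
    u_rest = expected_net_utility(-0.2, 0.1, curiosity)
    return choice_probability(u_accelerate, u_rest, beta)

p_cautious = accelerate_probability(curiosity=-1.0)
p_neutral = accelerate_probability(curiosity=0.0)
p_curious = accelerate_probability(curiosity=4.0)
print(p_cautious, p_neutral, p_curious)
```

Under these assumed values, the probability of choosing Accelerate rises with the curiosity level, and a sufficiently positive curiosity makes the more sickening option the preferred one, which is the sense in which curiosity can produce seemingly irrational choices.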

III-B The iFEP Method

The free energy principle (FEP) posits that any system aims to minimize the discrepancy between predictions and actual observations through active inference [33]. This principle also suggests that systems seek to minimize free energy, which serves as an upper bound for surprise. In the model presented by [35], the primary objective is to maximize expected net utility, which corresponds to the negative of expected free energy. A detailed formulation of expected net utility can be found in Appendix A.

To decode the decision-making process of VR users, we employed the inverse Free Energy Principle (iFEP) method, which analyzes behavioral data, including user actions and observations. This decoding process quantifies internal states, such as curiosity levels, recognition of reward probabilities for various options, and confidence in those estimates. The iFEP method operates through a repetitive cycle of predicting internal states followed by corrections, necessitating prolonged trials of behavioral data for the particle filter to converge effectively [35].
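The predict-then-correct cycle can be sketched with a generic bootstrap particle filter. This is a simplified illustration and not the iFEP implementation of [35]: it tracks only the curiosity level, assumes curiosity follows a Gaussian random walk, assumes the differences in expected reward and expected information gain between the two actions are already known for every trial, and uses the sigmoid action model from Section III-A as the likelihood. The function name, the particle count, and the reuse of $\sigma_{w}$ as the random-walk step size are all assumptions.

```python
import math
import random

def sigmoid(x):
    # Clamp the argument to avoid overflow in math.exp.
    x = max(-60.0, min(60.0, x))
    return 1.0 / (1.0 + math.exp(-x))

def decode_curiosity(actions, d_reward, d_info, beta=2.0, sigma_w=0.4,
                     n_particles=500, seed=0):
    """Return the posterior-mean curiosity estimate for each trial.

    actions[t] is 1 if the first action was chosen and 0 otherwise;
    d_reward[t] and d_info[t] are the (assumed known) differences in
    expected reward and expected information gain between the actions.
    """
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    estimates = []
    for a, dr, di in zip(actions, d_reward, d_info):
        # Predict: diffuse every particle with random-walk noise.
        particles = [c + rng.gauss(0.0, sigma_w) for c in particles]
        # Correct: weight each particle by the likelihood of the action.
        weights = []
        for c in particles:
            p_first = sigmoid(beta * (dr + c * di))
            weights.append(p_first if a == 1 else 1.0 - p_first)
        total = sum(weights)
        if total == 0.0:
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        estimates.append(sum(w * c for w, c in zip(weights, particles)))
        # Resample particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates

# Toy check: always choosing the more informative action pulls the
# estimate up, always avoiding it pulls the estimate down.
est_up = decode_curiosity([1] * 60, [0.0] * 60, [1.0] * 60)
est_down = decode_curiosity([0] * 60, [0.0] * 60, [1.0] * 60)
print(est_up[-1], est_down[-1])
```

Because each correction step uses a single binary observation, the estimate sharpens only slowly, which is one way to see why long behavioral sequences are needed for convergence.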

III-C Dataset

User behavioral data at each trial during the navigation task is essential for modeling with the iFEP method. Such data must include the movement speed in the environment (generated by the user pushing the VR handle controller), the actions taken, and the severity of any cybersickness experienced during the navigation task. By collecting this information, we can more accurately analyze user behavior.

A cohort of eleven local inhabitants was invited to perform a navigation task in a VR environment (dataset available at https://github.com/coreturn/CybersicknessDataset). These volunteers, averaging 24.8 years of age (SD = 9.6), included 3 females. Before the study, a preliminary health and gaming/VR familiarity survey was administered, confirming the absence of any health-related impediments to the experiment’s integrity. The experiment also received IRB approval, and informed consent was obtained from all participants before the study.

Each participant was asked to undertake the navigation task several times with different speeds across separate days to create a dataset that included more individual behavior information. Each task lasts 4 minutes, so the total duration for participants who perform four navigation tasks is 16 minutes. At the end of the experiment, subjects received gifts in recognition of their contribution. Figure 2 gives the timeline of the experiment procedure.

Figure 1: The virtual scenario where the participants perform the navigation task along the highlighted path.

The experiment procedure is as follows:

• The participants were given a brief introduction on how to use the HTC Vive Pro handle controllers to navigate in the virtual environment. They could choose to terminate the experiment early if discomfort occurred.
• The participants wore an HTC Vive Pro head-mounted display on their head (to display the virtual environment) and an Empatica E4 wristband on their wrist (to collect the EDA signal at a frequency of 4 Hz).
• The participants entered the virtual environment and navigated using the HTC Vive Pro handle controller touchpad along the highlighted path illustrated in Figure 1. Head-tracking (head position and rotation), motion (speed and rotation), and biosignal (EDA, blood volume pulse, temperature, and heart rate) data were collected during the entire process.
• The participants completed the Simulator Sickness Questionnaire (SSQ) both before and after the navigation task. The SSQ score is the difference between the post-exposure and pre-exposure scores:

SSQ=SSQ_{post}-SSQ_{pre}.

The classification of the severity of cybersickness based on the SSQ score is as follows [49]:

SSQ=\begin{cases}\mathrm{Negligible},&\mathrm{if}\ 0\leq SSQ\leq 5\\ \mathrm{Low},&\mathrm{if}\ 5<SSQ\leq 20\\ \mathrm{Moderate},&\mathrm{if}\ 20<SSQ\leq 40\\ \mathrm{High},&\mathrm{if}\ SSQ>40\end{cases}\qquad(1)
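As a small illustration, the score difference and the thresholds of Equation (1) can be written as a lookup function. The function name is hypothetical, and mapping a negative difference (post-exposure score below pre-exposure score) to Negligible is an assumption, since Equation (1) does not define that case.

```python
def ssq_severity(ssq_pre, ssq_post):
    """Classify cybersickness severity from pre- and post-exposure SSQ."""
    ssq = ssq_post - ssq_pre
    if ssq <= 5:
        # Assumption: negative differences are also treated as Negligible.
        return "Negligible"
    if ssq <= 20:
        return "Low"
    if ssq <= 40:
        return "Moderate"
    return "High"

print(ssq_severity(0.0, 11.22))  # Low
```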

In the navigation task, we categorize the movement of each participant into Rest and Accelerate based on the average speed every 8 seconds. In addition, we calculate the motion sickness dose value (MSDV) [50] as an estimation of the participant’s cybersickness degree using the following formula:

MSDV=\Bigl{(}\int_{0}^{T}a^{2}(t)dt\Bigr{)}^{\frac{1}{2}},

where $T$ is the total duration and $a$ is the acceleration computed from speed and time. Based on the MSDV value, the participant’s state was then categorized as Cybersickness or No cybersickness, indicating whether or not the participant felt cybersickness. The participants also wore an Empatica E4 wristband, which recorded their electrodermal activity (EDA) at a rate of 4 Hz. The skin conductance response (SCR) signals extracted from the EDA signals serve as the ground truth of the participant’s probability of cybersickness.
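A discrete approximation of the MSDV formula can be sketched as follows, assuming uniformly sampled speed, a finite-difference estimate of the acceleration, and a rectangle-rule integral. The function name and sampling interval are illustrative assumptions. The MSDV threshold separating Cybersickness from No cybersickness is not stated in the text and is therefore left out.

```python
import math

def msdv(speeds, dt):
    """Approximate MSDV = sqrt(integral of a(t)^2 dt) from sampled speed.

    speeds: speed samples taken every dt seconds (uniform sampling assumed).
    """
    # Finite-difference acceleration between consecutive samples.
    accelerations = [(speeds[i + 1] - speeds[i]) / dt
                     for i in range(len(speeds) - 1)]
    # Rectangle-rule approximation of the integral of a^2 over time.
    integral = sum(a * a for a in accelerations) * dt
    return math.sqrt(integral)

# A speed ramp of 1 unit per second held for 4 seconds.
print(msdv([0.0, 1.0, 2.0, 3.0, 4.0], dt=1.0))  # 2.0
```

A constant speed yields an MSDV of zero under this approximation, so only changes in speed contribute to the estimated dose.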

We performed data preprocessing to fit the iFEP model. First, we utilized the original dataset’s speed, EDA signal, and SSQ score. Second, we divided the data collected in one task into 30 intervals. Each interval represents one trial and contains data over 8 seconds. Therefore, the data of a participant who completed four tasks can be divided into 120 trials.
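The segmentation step can be sketched as below. The 4 Hz sampling rate used in the example and the speed threshold that separates Rest from Accelerate are hypothetical values chosen for illustration; the text does not state the threshold, and the function names are not from the original pipeline.

```python
def split_into_trials(samples, sample_rate_hz, trial_seconds=8):
    """Cut one task recording into consecutive fixed-length trials."""
    per_trial = int(round(sample_rate_hz * trial_seconds))
    n_trials = len(samples) // per_trial
    return [samples[i * per_trial:(i + 1) * per_trial]
            for i in range(n_trials)]

def label_actions(trials, speed_threshold):
    """Label a trial Accelerate if its mean speed exceeds the threshold."""
    labels = []
    for trial in trials:
        mean_speed = sum(trial) / len(trial)
        labels.append("Accelerate" if mean_speed > speed_threshold else "Rest")
    return labels

# A 4-minute task sampled at an assumed 4 Hz gives 960 samples.
task_speed = [0.0] * 480 + [2.0] * 480
trials = split_into_trials(task_speed, sample_rate_hz=4.0)
labels = label_actions(trials, speed_threshold=1.0)
print(len(trials), labels[0], labels[-1])  # 30 Rest Accelerate
```

Four such tasks give 4 × 30 = 120 trials, matching the count used in the analysis.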

The iFEP method takes the participant’s action and cybersickness state at each trial as input. It predicts the internal states at each trial, including cybersickness probability, confidence, and curiosity. The internal states model factors that influence one participant to decide whether to go forward or stay still. Therefore, by analyzing these factors, the iFEP method could decode how the participants make decisions by estimating the intensity of curiosity and ground truth reward probability.

IV Results

This section presents the decoding results after applying the participant’s speed and EDA data to the iFEP method. We conducted a series of tests to evaluate the suitability of using the EDA data in the ReCU model and iFEP method. In this process, we substituted simulated EDA data into the ReCU model and iFEP method to analyze the results and validate the approach. After validating the suitability, we substituted the organized user behavioral data into the iFEP method and analyzed the predicted variables.

IV-A Validation of the ReCU Model

We generated simulated EDA signals to test the ReCU model’s validity in estimating the participant’s cybersickness level. The simulation test has 1000 trials, and the ground truth of the cybersickness degree is computed as described in Section III. Given the SCR signals extracted from the simulated EDA signals, the ReCU model replicates the decision-making process and estimates the selection probability for each option (Rest and Accelerate) and the reward probability (i.e., the probability of feeling cybersickness). We assume that curiosity changes over time and is determined by $c_{t}=4\sin(4T\pi t)$, where $t$ represents the trial number and $T$ represents the total number of trials. The learning rate $\alpha$ is 0.05, the randomness of action selection $\beta$ is 2, $P_{0}=0.8$, and $\sigma_{w}=0.4$. Moreover, the ground truth of the cybersickness degree was determined by the average of SCR signals every 8 seconds, and the probability of feeling cybersickness was assumed to be 10%, 40%, 60%, and 90%.

From Figure 3b, we can see that the ReCU model can predict the reward probability reasonably well. Also, the confidence in the estimated reward probability for one option increases when that option is selected and decreases when it is not (Figure 3c). See Appendix B for the formulation of confidence. Additionally, the new information gained may decrease if the same option is selected consecutively (Figure 3d), which may encourage the participant to choose another option to obtain more information about the environment. These results support the validity of the ReCU model for estimating user movement in a VR environment.

IV-B Validation of the iFEP method

Next, we used the same simulated EDA signals to test the validity of the iFEP method for decoding the user decision-making process. First, the extracted SCR signals were input into the ReCU model to estimate the action selection and whether a reward was received. The estimated reward probability for each option was treated as the ground truth (dashed lines in Figures 4c and 4d) and compared with the results from the iFEP method. In addition, the estimated action and reward were used as inputs for the iFEP method to decode the reward probability and intensity of curiosity. The learning rate $\alpha$ is set to 0.05, the inverse temperature $\beta$ is 2, $P_{0}=0.8$, and $\sigma_{w}=0.4$. Figure 4a shows the action selection for 1000 trials, where the shorter line indicates no reward after selecting an action, and the longer line indicates the presence of a reward after selecting an action. Moreover, we can see that the estimated reward probability for each option (red and blue solid lines in Figures 4c and 4d) correctly decodes the ground truth (dashed lines in Figures 4c and 4d). As a result, Figure 4 shows that the iFEP method could decode user decision-making processes in a VR environment.

Furthermore, we investigated the ability of the iFEP method to accurately decode user decision-making processes by applying different noise values to the intensity of curiosity. As shown in Figure 5a, the predictions of curiosity intensity with noise follow a similar pattern to the ground truth. Also, the root mean square error is smaller when the noise intensity is lower (Figure 5b). Moreover, there is a positive correlation between the estimated curiosity and the ground truth curiosity (Figures 5c and 5d). By comparing the estimated and ground truth curiosity values, we conclude that the prediction of curiosity intensity is robust under the influence of noise.

IV-C Results with virtual navigation data

After validating the ReCU model and iFEP method, we applied actual virtual navigation data from the abovementioned dataset to decode how the user chooses between Accelerate and Rest under varying curiosity levels. We analyzed how the iFEP method predicted the participants’ recognition of their probability of feeling cybersickness. Also, we evaluated the participant’s navigation behavior in the navigation task. The behavioral data are set up as described in Section III-C, and the behavioral data comprise 120 trials for each participant. Here, we present one participant’s result, whose features are commonly present in the results of other participants. However, the differences in features in other participants’ results will also be discussed.

Figure 6 shows the decoding result of one participant’s navigation data using the iFEP method. Figure 6a indicates the action selected and the presence of reward at every trial, and the line in between indicates the probability of choosing to Accelerate, which is identical to the blue line in Figure 6b. Although the estimated values do not perfectly match the ground truth values, the estimations for both actions track the fluctuations in the actual cybersickness probabilities (Figures 6c and 6d). One reason for the reduced accuracy of the estimated values could be that the iFEP method assumes a gradual change in reward probability rather than a sudden change [35]. The confidence in the prediction of the reward probability for choosing to Rest and Accelerate is shown in Figure 6e. Generally, the intensity of curiosity is negative for most trials, with some positive curiosity around trials 30 to 60, as shown in Figure 6f. This outcome shows that the participant is relatively conservative in choosing an action during the navigation task. Although the curiosity level is generally negative for most participants, the behavioral data of one participant yield positive curiosity estimates for all trials. According to the SSQ classification, this participant has an average SSQ score of 11.22, which falls into the low cybersickness category.

In addition, we analyzed the user behavioral data to explore the correlation between the expected information gain and curiosity. We computed the correlation coefficient between the expected information gain and curiosity and between the expected information gain and the temporal derivative of curiosity. As shown in Figure 7a, the correlation between expected information gain and curiosity at time lag = 0 is very small (0.0051), indicating no significant relationship between the two variables when they are aligned in time. However, the maximum correlation coefficient of 0.6316 occurs at a time lag of -24 samples. This suggests that, while there is no immediate correlation, a positive correlation emerges when expected information gain lags behind curiosity by 6 seconds (24 samples at 4 Hz). Moreover, the regression line illustrated in Figure 7b shows a positive correlation between the expected information gain and the temporal derivative of curiosity at zero time lag. The maximum correlation for this relationship occurs at a time lag of -4 samples, with a correlation coefficient of 0.5071. When considering the overall dataset from all participants, we observed that the average maximum correlation (accounting for both positive and negative correlations) occurs at a time lag of -2.36 samples, which corresponds to approximately -0.5909 seconds. This indicates that, on average, curiosity tends to precede expected information gain by about 0.59 seconds, suggesting that curiosity has a predictive relationship with expected information gain. Although the average reaction time for a target detection task is approximately 0.454 seconds [51], which is faster than the observed lag in our study, this discrepancy can be attributed to individual differences in cognitive processing. Thus, our findings are consistent with a reasonable reaction time, especially in the context of decision-making processes related to curiosity. Therefore, we can conclude that the participant would increase their curiosity when the expected information gain increases (i.e., the environment becomes unfamiliar).
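The lag analysis can be sketched as a search over shifted Pearson correlations. The sign convention below (a positive lag means the second series is delayed relative to the first) is an assumption and may differ from the convention behind the reported negative lags; the function names and the synthetic test signal are likewise illustrative only.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def lagged_correlation(x, y, lag):
    """Correlate x[t] with y[t + lag] over the overlapping samples."""
    if lag >= 0:
        return pearson(x[:len(x) - lag], y[lag:])
    return pearson(x[-lag:], y[:len(y) + lag])

def best_lag(x, y, max_lag):
    """Lag with the largest absolute correlation in [-max_lag, max_lag]."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda k: abs(lagged_correlation(x, y, k)))

def lag_to_seconds(lag_samples, sample_rate_hz):
    return lag_samples / sample_rate_hz

# Synthetic check: y is x delayed by 3 samples.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = [0.0, 0.0, 0.0] + x[:-3]
print(best_lag(x, y, max_lag=10))  # 3
print(lag_to_seconds(-24, 4.0))    # -6.0
```

At the 4 Hz rate implied by the text, a lag of 24 samples converts to 6 seconds and a lag of 2.36 samples to about 0.59 seconds.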

IV-D Alternative Methods

The subjective reward method and the Q-learning method are two alternative approaches for decoding a user’s decision-making process. The subjective reward method states that the expected net utility $U_{t}$ is computed by

U_{t}=d_{t}\times E[reward]+E[info],

where $d_{t}$ indicates the desire for reward at trial $t$. So, instead of weighting the expected information gain as in the iFEP method, the subjective reward method weights the expected reward. In the context of participants performing navigation tasks in a VR environment, the desire for reward would represent whether the participant would like to experience cybersickness. Therefore, we would expect the value of $d_{t}$ to be small. However, the average reward intensity was 41.1953, which does not indicate that the participant wanted to avoid cybersickness (Figure 8f).

On the other hand, the Q-learning method predicts the reward probability $Q$ for action $i$ using

Q_{i,t}=Q_{i,t-1}+\alpha_{t-1}(r_{t}a_{i,t-1}-Q_{i,t-1}),

where $\alpha_{t}$ indicates the learning rate at trial $t$, $r_{t}$ is the reward at trial $t$, and $a_{i,t}$ indicates whether action $i$ is selected at trial $t$ ($a_{i,t}=1$ if selected and 0 otherwise). Also, the action selection probability is determined by

P(a_{i,t}=1)=\frac{e^{\beta_{t}Q_{i,t}}}{\sum_{i}e^{\beta_{t}Q_{i,t}}},

where $\beta_{t}$ indicates the inverse temperature at trial $t$ and controls the randomness of action selection. Appendix C gives a detailed explanation of the formulation. Therefore, to understand the relationship between inverse temperature and action selection, we analyzed the correlation coefficient between the expected information gain, inverse temperature, and the derivative of inverse temperature at different time lags.
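For comparison, the Q-learning update and softmax rule above can be sketched as follows. The sketch implements the update exactly as written, so the value of the unselected action decays toward zero; the learning rate and inverse temperature are held constant here for simplicity, whereas the text lets both vary by trial. Function names and example values are illustrative assumptions.

```python
import math

def q_update(q_values, chosen, reward, alpha):
    """Apply Q_i <- Q_i + alpha * (r * a_i - Q_i) to every action i."""
    updated = []
    for i, q in enumerate(q_values):
        a_i = 1.0 if i == chosen else 0.0  # indicator of the chosen action
        updated.append(q + alpha * (reward * a_i - q))
    return updated

def softmax_policy(q_values, beta):
    """Selection probabilities exp(beta * Q_i) / sum_j exp(beta * Q_j)."""
    scaled = [beta * q for q in q_values]
    shift = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Index 0 = Rest, index 1 = Accelerate (hypothetical ordering).
q = [0.0, 0.0]
q = q_update(q, chosen=1, reward=1.0, alpha=0.05)
probs = softmax_policy(q, beta=2.0)
print(q, probs)
```

A larger inverse temperature makes the policy more deterministic for the same Q values, which is why it can absorb variation that the ReCU model attributes to curiosity.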

As shown in Figures 9a and 9b, there is a positive correlation between the expected information gain and inverse temperature (correlation coefficient = 0.7031 at time lag = 0) and no correlation between the expected information gain and the temporal derivative of inverse temperature (correlation coefficient = -0.1635 at time lag = 0).

V Discussion

Most prior research approaches user interaction behavior through psychological or physiological lenses [20, 21, 52]. Our findings expand on these perspectives by elucidating the dynamic influence of curiosity in shaping user interactions within immersive environments. Additionally, we provide a holistic analysis by summarizing observed navigation behaviors and conducting quantitative evaluations. Leveraging the ReCU model and the iFEP method, we estimated critical variables, including the probability of cybersickness, confidence, curiosity, and expected net utility. These estimations allowed us to examine decision-making dynamics, highlighting how factors such as curiosity, expected information gain, and anticipated rewards interplay. Our results notably reveal that participants maintained curiosity about the virtual environment, even while experiencing varying degrees of cybersickness. Moreover, our predictions regarding cybersickness probabilities indicate that participants adjusted their navigation strategies based on self-awareness of their health conditions. Continuous estimation of these internal variables enhances our understanding of decision-making processes in immersive settings.

From the user study, approximately 27% of participants reported severe cybersickness with an SSQ score above 40, while another 27% experienced moderate symptoms. The remaining participants reported minimal symptoms. These findings underscore the significant impact of cybersickness on decision-making and task performance during virtual navigation. Using behavioral data with the iFEP method, we estimated cybersickness probability, achieving reasonable predictive accuracy. While the model does not perfectly align with ground truth, it effectively captures temporal trends in increasing and decreasing probabilities. The gradual nature of these changes may impact prediction precision. Although numerous studies explore user interaction in VR [53, 54], limited research addresses how users complete tasks despite sustained discomfort. Identifying such patterns enables the customization of VR experiences to accommodate individual needs, thereby enhancing user engagement and comfort.

Explaining the interplay between curiosity and rewards in decision-making is complex, leaving multiple interpretations of navigation patterns. First, the secondary task may have distracted participants, exacerbating symptoms [5]. Participants often reduced movement speed during navigation, suggesting an implicit health management strategy beyond the primary task. Second, pre-exposure questionnaires indicated that only 36% of participants reported frequent gaming experience. This unfamiliarity with rapidly changing scenes may have contributed to cybersickness, reducing exploration willingness [55, 25]. Additionally, personality traits can shape perception [56] and responses to novel scenarios [57], as reflected in the trend toward negative curiosity (Figure 6f) and cautious navigation patterns among most participants. These findings emphasize the importance of examining cybersickness’ influence on individual navigation behaviors, particularly given its prevalence.

Previous studies associate excessively conservative or curious behaviors with conditions like autism spectrum disorder or attention deficit hyperactivity disorder, where individuals either avoid or seek novel information significantly [35]. Our findings similarly reveal a general trend of negative curiosity, with most participants exhibiting cautious navigation behaviors likely influenced by cybersickness. However, one participant deviated, displaying consistently positive curiosity throughout all trials. This individual, with a low average SSQ score indicating minimal cybersickness, demonstrated more exploratory behavior. These results suggest that cybersickness is not the sole determinant of navigation patterns, as individuals tolerate discomfort to varying degrees [8]. In addition, rational behavior reflects a balance between exploratory and reward-seeking actions, where users initially gather information in unfamiliar environments but gradually focus on reward maximization as they gain familiarity [7]. Despite a generally conservative exploratory approach, we observed that users actively sought additional environmental information to reduce uncertainty, aligning with deprivation sensitivity [30].

V-A Practicality of alternative models

The subjective reward method emphasizes the expected reward to model a reward-driven condition. In the context of virtual navigation, this desire for reward reflects participants’ inclination to experience or avoid cybersickness. We anticipated that the reward intensity would be low, as participants are likely motivated to avoid discomfort. However, the average reward intensity across all participants was 24.97, indicating a stronger-than-expected desire for reward. Notably, only three participants exhibited a reward intensity of less than 1, suggesting they were actively trying to avoid cybersickness. This finding indicates that the subjective reward method may not effectively explain user decision-making in this scenario.

In contrast, the Q-learning method employs reinforcement learning techniques to model adaptive reward-seeking behavior by adjusting time-dependent meta-parameters, focusing exclusively on maximizing expected cumulative rewards. As shown in Figure 9a, the inverse temperature correlates with expected information gain. Additionally, the majority of participants demonstrated a positive correlation between expected cumulative reward and inverse temperature. This result implies that the randomness of action selection aligns with curiosity levels, suggesting that the iFEP method is more suitable for decoding user actions in a VR environment.

V-B Limitations and future scope

By investigating the temporal balance between cybersickness and curiosity based on users’ virtual navigation behavior, we gain insights into the neural correlates of temporal variability during immersion. However, we are aware that our work has some limitations.

First, the short navigation duration may limit estimation accuracy. Results on simulated data showed that longer trials yield more accurate predictions. However, longer trials also raise the risk of severe sickness symptoms, which complicates the user study [8].

Second, only one scenario was used (see Figure 1). It may have appealed more to participants with an affinity for forest-related contexts, prompting more active navigation, whereas participants who were indifferent to the setting may have engaged less. Users often have varying levels of interest in VR scenes, which highlights the importance of scenario design in VR applications. Future studies should also incorporate additional psychological factors like attention and stress, which could enhance the precision of decision-making models and foster a more comprehensive understanding of user interaction in immersive settings.

A previous study [58] found several physiological variables to be positively correlated with cybersickness severity. Additionally, So et al. [59] used spatial velocity to estimate cybersickness exposure. To improve our analysis, we could collect additional user behavioral data (e.g., eyeblink rate and spatial velocity) during navigation. Expanding the number of participants in future experiments will also improve the reliability of our results.

VI Conclusion

This research aimed to decode user interaction patterns during virtual navigation, with a particular focus on the balance between exploration and the experience of cybersickness. Our findings reveal that users generally adopt a cautious approach to action selection in virtual environments, influenced by the discomfort of cybersickness. However, individuals experiencing fewer symptoms of cybersickness exhibit greater curiosity and more exploratory behavior in their navigation decisions. Moreover, we identified a positive correlation between expected information gain and curiosity, suggesting that users are more inclined to explore when the virtual environment undergoes changes. Overall, our study provides valuable insights into the interplay between curiosity and cybersickness, offering a quantitative framework for understanding how these factors shape user behavior in immersive environments. These contributions underscore the importance of balancing exploration with user comfort, and pave the way for future research to explore additional factors influencing this dynamic.

Acknowledgments

This work was supported by Guangzhou Municipal Science and Technology Bureau (2025A03J3955) and Science and Technology Planning Project of Guangdong Province (2024GXJK10).

Appendix A Expected net utility

This section provides the calculation process for the expected net utility according to [35] and [60]. The expected net utility is calculated using

U_{t}(a_{t+1})=E[Reward_{t+1}]+c_{t}\times E[Info_{t+1}],

(2)

where $a_{t}$ is the action and $c_{t}$ is the curiosity intensity at trial $t$ . The cybersickness probability depends on the latent cause $w$ and the participant’s action $a$ . The latent cause is assumed to follow a random walk:

w_{i,t}=w_{i,t-1}+\sigma_{w}\times\epsilon_{i,t},

where $\sigma_{w}$ is the noise intensity, $\epsilon_{i,t}$ is standard Gaussian noise, and $i$ is the option index ( $i=1$ denotes the action ‘Rest’ and $i=2$ the action ‘Accelerate’). The cybersickness probability is then obtained by applying the logistic function to the latent cause:

f(w_{i,t})=\frac{1}{1+e^{-w_{i,t}}}.
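As an illustration of the two equations above, the sketch below simulates the random-walk latent cause for both actions and maps it to a cybersickness probability. The function names, the initial value `w0`, and the parameter values are assumptions chosen for the example, not values reported in the study.

```python
import numpy as np

def sigmoid(w):
    """Logistic function f(w) = 1 / (1 + exp(-w))."""
    return 1.0 / (1.0 + np.exp(-w))

def simulate_latent_cause(n_trials, sigma_w, w0=(0.0, 0.0), seed=0):
    """Random walk w_t = w_{t-1} + sigma_w * eps_t for the two actions."""
    rng = np.random.default_rng(seed)
    w = np.empty((n_trials, 2))
    w[0] = w0  # assumed starting point; not specified in the text
    for t in range(1, n_trials):
        w[t] = w[t - 1] + sigma_w * rng.standard_normal(2)
    return w

# Illustrative parameter values only.
w = simulate_latent_cause(n_trials=100, sigma_w=0.1)
prob = sigmoid(w)  # column 0: 'rest', column 1: 'accelerate'
print(prob[:5])
```

Because the latent cause is unbounded while the logistic function maps it into (0, 1), the random walk can drift freely without ever producing an invalid probability.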

The participant’s recognition of the environment is represented as the following equation [35]:

P(o_{t}|\mathbf{w}_{t},\mathbf{a}_{t})=\prod_{i}\Bigl[f(w_{i,t})^{o_{t}}(1-f(w_{i,t}))^{1-o_{t}}\Bigr]^{a_{i,t}},

(3)

where $o_{t}\in\{0,1\}$ represents whether cybersickness occurs given the latent variable $\mathbf{w}_{t}=(w_{1,t},w_{2,t})^{T}$ and action $\mathbf{a}_{t}\in\{(1,0)^{T},(0,1)^{T}\}$ at trial $t$ . The Taylor series expansion of a real, infinitely differentiable function $f(x)$ about the point $x=a$ is the power series

f(x)=\sum_{n=0}^{\infty}\frac{f^{(n)}(a)}{n!}(x-a)^{n}.

Therefore, we obtain the following equation by applying the 2nd-order Taylor series expansion on equation (3):

\begin{aligned}
P(o_{t+1}|\mathbf{a}_{t+1}) &= \int P(o_{t+1}|\mathbf{w}_{t+1},\mathbf{a}_{t+1})\,Q(\mathbf{w}_{t+1}|\mathbf{a}_{t+1})\,d\mathbf{w}_{t+1} \\
&= \int\prod_{i}\Bigl(f(w_{i,t+1})^{o_{t+1}}(1-f(w_{i,t+1}))^{1-o_{t+1}}\Bigr)^{a_{i,t+1}}Q(\mathbf{w}_{t+1}|\mathbf{a}_{t+1})\,d\mathbf{w}_{t+1} \\
&\approx \prod_{i}\Bigl(f(\mu_{i,t+1})^{o_{t+1}}(1-f(\mu_{i,t+1}))^{1-o_{t+1}} \\
&\qquad +1^{o_{t+1}}(-1)^{1-o_{t+1}}\frac{1}{2}f(\mu_{i,t+1})(1-f(\mu_{i,t+1}))(1-2f(\mu_{i,t+1}))(p_{i,t}^{-1}+p_{w}^{-1})\Bigr)^{a_{i,t+1}}.
\end{aligned}

(4)
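The second-order approximation in equation (4) can be checked numerically. The sketch below compares it with a Monte Carlo estimate of the Gaussian average of $f(w)$ for the outcome $o_{t+1}=1$. The function names and numerical values are hypothetical, and `var` stands for the total variance $p_{i,t}^{-1}+p_{w}^{-1}$, where we assume $p_{w}^{-1}$ is the variance contributed by the latent-cause noise.

```python
import numpy as np

def sigmoid(w):
    return 1.0 / (1.0 + np.exp(-w))

def predicted_sickness_prob(mu, var):
    """Second-order Taylor approximation of E_Q[f(w)] with w ~ N(mu, var).

    Uses f''(mu) = f(mu) (1 - f(mu)) (1 - 2 f(mu)) for the logistic function.
    """
    f = sigmoid(mu)
    return f + 0.5 * f * (1.0 - f) * (1.0 - 2.0 * f) * var

# Illustrative values, not estimates from the study.
mu, var = 0.8, 0.25
rng = np.random.default_rng(1)
samples = rng.normal(mu, np.sqrt(var), size=200_000)
mc = float(sigmoid(samples).mean())      # Monte Carlo reference
approx = float(predicted_sickness_prob(mu, var))

print(f"Monte Carlo: {mc:.4f}  second-order: {approx:.4f}  zeroth-order: {sigmoid(mu):.4f}")
```

For moderate variances the second-order term moves the estimate noticeably closer to the sampled average than simply evaluating $f$ at the mean, which is the motivation for keeping it.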

$Q(\mathbf{w}_{t}|\varphi_{t})=N(\mathbf{w}_{t}|\boldsymbol{\mu}_{t},\boldsymbol{\Lambda}_{t}^{-1})$ is a Gaussian distribution where $\varphi_{t}=(\boldsymbol{\mu}_{t},\boldsymbol{\Lambda}_{t})$ . In addition, $\boldsymbol{\mu}_{t}=(\mu_{1,t},\mu_{2,t})^{T}$ represents the mean and $\boldsymbol{\Lambda}_{t}=diag(p_{1,t},p_{2,t})$ represents the precision. Based on the participant’s desired probability of cybersickness occurrence $P_{0}$ , the reward intensity is

R=\begin{cases}0,&\text{if $o_{t}=0$, i.e. no cybersickness occurs.}\\ \ln\frac{P_{0}}{1-P_{0}},&\text{if $o_{t}=1$, i.e. cybersickness occurs.}\end{cases}

Then, the expected reward (the first term of equation (2)) can be formulated as follows:

\begin{aligned}
E[Reward_{t+1}] &= E_{P(o_{t+1}|\mathbf{a}_{t+1})}[R(o_{t+1})] \\
&= \sum_{o_{t+1}\in\{0,1\}}P(o_{t+1}|\mathbf{a}_{t+1})R(o_{t+1}).
\end{aligned}

(5)

The expected information gain (the second term of equation (2)) is computed using

\begin{aligned}
E[Info_{t+1}] &= E_{P(o_{t+1}|\mathbf{a}_{t+1})}\Bigl[D_{KL}\bigl[Q(\mathbf{w}_{t+1}|o_{t+1},\mathbf{a}_{t+1})\,\|\,Q(\mathbf{w}_{t+1}|\mathbf{a}_{t+1})\bigr]\Bigr] \\
&= H(o_{t+1})-H(o_{t+1}|\mathbf{w}_{t+1}),
\end{aligned}

(6)

where the Kullback-Leibler divergence ( $D_{KL}$ ) measures the difference between the probability distributions $Q(\mathbf{w}_{t+1}|o_{t+1},\mathbf{a}_{t+1})$ and $Q(\mathbf{w}_{t+1}|\mathbf{a}_{t+1})$ ; its expectation is also known as the information gain [60]. The first term $H(o_{t+1})$ is the marginal entropy and the second term $H(o_{t+1}|\mathbf{w}_{t+1})$ is the conditional entropy. First, the marginal entropy is

\begin{aligned}
H(o_{t+1}) &= E_{P(o_{t+1}|\mathbf{a}_{t+1})}[-\ln P(o_{t+1}|\mathbf{a}_{t+1})] \\
&= -\sum_{i}a_{i,t+1}\Bigl(P(o_{t+1}=0|\mathbf{a}_{t+1})\ln P(o_{t+1}=0|\mathbf{a}_{t+1}) \\
&\qquad +P(o_{t+1}=1|\mathbf{a}_{t+1})\ln P(o_{t+1}=1|\mathbf{a}_{t+1})\Bigr).
\end{aligned}

(7)

We substitute equation (4) into equation (7) to solve the formula for the marginal entropy. The conditional entropy is

\begin{aligned}
H(o_{t+1}|\mathbf{w}_{t+1}) &= E_{P(o_{t+1}|\mathbf{w}_{t+1},\mathbf{a}_{t+1})Q(\mathbf{w}_{t+1}|\mathbf{a}_{t+1})}[-\ln P(o_{t+1}|\mathbf{w}_{t+1},\mathbf{a}_{t+1})] \\
&= -E_{Q(\mathbf{w}_{t+1}|\mathbf{a}_{t+1})}\Bigl[E_{P(o_{t+1}|\mathbf{w}_{t+1},\mathbf{a}_{t+1})}\Bigl[\ln\prod_{i}\Bigl(f(w_{i,t+1})^{o_{t+1}}(1-f(w_{i,t+1}))^{1-o_{t+1}}\Bigr)^{a_{i,t+1}}\Bigr]\Bigr] \\
&= -E_{Q(\mathbf{w}_{t+1}|\mathbf{a}_{t+1})}\Bigl[\sum_{i}a_{i,t+1}\Bigl(f(w_{i,t+1})\ln f(w_{i,t+1}) \\
&\qquad +(1-f(w_{i,t+1}))\ln(1-f(w_{i,t+1}))\Bigr)\Bigr].
\end{aligned}

We obtain the above equation by substituting equation (3) into $H(o_{t+1}|\mathbf{w}_{t+1})$ . Then, to approximate the conditional entropy, we use Taylor’s theorem. Let $g(w_{i,t+1})=f(w_{i,t+1})\ln f(w_{i,t+1})+(1-f(w_{i,t+1}))\ln(1-f(w_{i,t+1}))$ and apply the 2nd-order Taylor series expansion to it about $\mu_{i,t+1}$ . Therefore, we obtain

\begin{aligned}
H(o_{t+1}|\mathbf{w}_{t+1}) \approx -\sum_{i}a_{i,t+1}\Bigl[ & f(\mu_{i,t+1})\ln f(\mu_{i,t+1})+(1-f(\mu_{i,t+1}))\ln(1-f(\mu_{i,t+1})) \\
& +\frac{1}{2}f(\mu_{i,t+1})(1-f(\mu_{i,t+1}))\Bigl(1+(1-2f(\mu_{i,t+1}))\ln\frac{f(\mu_{i,t+1})}{1-f(\mu_{i,t+1})}\Bigr)(p_{i,t}^{-1}+p_{w}^{-1})\Bigr].
\end{aligned}

(8)
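For readers who wish to verify the second-derivative factor in equation (8), the intermediate steps can be written out as follows. This is a worked check under the stated definitions of the logistic function $f$ and of $g$, using the identities $f^{\prime}=f(1-f)$, $f^{\prime\prime}=f(1-f)(1-2f)$, and $\ln\frac{f(w)}{1-f(w)}=w$; it is not taken from the original derivation.

```latex
\begin{aligned}
g'(w)  &= f'(w)\ln f(w) + f'(w) - f'(w)\ln(1-f(w)) - f'(w)
        = f'(w)\ln\frac{f(w)}{1-f(w)}, \\
g''(w) &= f''(w)\ln\frac{f(w)}{1-f(w)} + f'(w)\,\frac{d}{dw}\ln\frac{f(w)}{1-f(w)} \\
       &= f(w)(1-f(w))\Bigl(1+(1-2f(w))\ln\frac{f(w)}{1-f(w)}\Bigr), \\
E_{Q}[g(w)] &\approx g(\mu) + \frac{1}{2}\,g''(\mu)\,\mathrm{Var}_{Q}[w].
\end{aligned}
```

The first-order term vanishes under the expectation because $E_{Q}[w-\mu]=0$, which is why only the value at the mean and the curvature term appear, with the variance given by $p_{i,t}^{-1}+p_{w}^{-1}$.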

Next, we model the curiosity intensity. Assuming that curiosity varies from trial to trial and is driven by a noise term $\zeta$ , the curiosity at trial $t$ is:

c_{t}=c_{t-1}+\epsilon_{c}\times\zeta_{t},

(9)

where $\epsilon_{c}$ is the noise intensity [35]. Therefore, by substituting equations (7) and (8) into equation (6), we obtain the value of the expected information gain. Finally, we attain the value of the expected net utility by substituting the expected reward (equation (5)), the expected information gain (equation (6)), and the curiosity intensity (equation (9)) into equation (2).
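The steps of this appendix can be combined into a short numerical sketch. It is only an illustration under several assumptions: the function name and arguments are hypothetical, all approximations are evaluated at the posterior mean $\mu$ (our reading of the second-order expansions), `var` denotes the total variance $p_{i,t}^{-1}+p_{w}^{-1}$, and the example values are not taken from the study.

```python
import numpy as np

def sigmoid(w):
    return 1.0 / (1.0 + np.exp(-w))

def expected_net_utility(mu, var, desired_prob, curiosity):
    """Expected net utility of one action, following equations (2) and (4)-(8).

    mu, var      : mean and total variance of the latent cause for this action
    desired_prob : P_0, the desired probability of cybersickness occurrence
    curiosity    : c_t, the curiosity intensity
    """
    f = sigmoid(mu)
    d2f = f * (1.0 - f) * (1.0 - 2.0 * f)

    # Equation (4): predictive probability that cybersickness occurs (o = 1).
    p_sick = np.clip(f + 0.5 * d2f * var, 1e-12, 1.0 - 1e-12)
    p_ok = 1.0 - p_sick

    # Equation (5): R(o=0) = 0 and R(o=1) = ln(P_0 / (1 - P_0)).
    reward = p_sick * np.log(desired_prob / (1.0 - desired_prob))

    # Equation (7): marginal entropy of the outcome.
    marginal = -(p_ok * np.log(p_ok) + p_sick * np.log(p_sick))

    # Equation (8): conditional entropy; ln(f / (1 - f)) equals mu for the logistic f.
    g = f * np.log(f) + (1.0 - f) * np.log(1.0 - f)
    d2g = f * (1.0 - f) * (1.0 + (1.0 - 2.0 * f) * mu)
    conditional = -(g + 0.5 * d2g * var)

    info = marginal - conditional              # equation (6)
    return reward + curiosity * info, reward, info   # equation (2)

# Illustrative comparison of a curious and a conservative agent.
for c in (1.0, -1.0):
    utility, reward, info = expected_net_utility(mu=0.0, var=1.0, desired_prob=0.2, curiosity=c)
    print(f"c={c:+.1f}  reward={reward:.4f}  info={info:.4f}  utility={utility:.4f}")
```

With a positive information gain, a negative curiosity intensity lowers the utility of the more uncertain option, which is how the model expresses the conservative behavior discussed in the main text.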

Appendix B Confidence

The confidence of cybersickness probability recognition is

\gamma_{i,t}=\frac{p_{i,t}}{f^{\prime}(\mu_{i,t})^{2}},

where $f$ is the cybersickness probability and $p_{i,t}$ is the precision of the probability distribution Q. Moreover, the value of $p_{i,t}$ is updated according to:

p_{i,t}=K_{i,t}^{-1}+f(\mu_{i,t})(1-f(\mu_{i,t})),

and

K_{i,t}=\sigma_{w}^{2}+p_{i,t}^{-1}.
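A minimal sketch of the confidence computation is given below. Because the trial indices in the precision update are ambiguous as written, the code assumes that $K_{i,t}$ is built from the precision of the previous trial; this reading, like the function names and example values, is an assumption rather than the authors' stated procedure.

```python
import math

def sigmoid(w):
    return 1.0 / (1.0 + math.exp(-w))

def update_precision(p_prev, mu, sigma_w):
    """One precision update: p = K^{-1} + f(mu)(1 - f(mu)), K = sigma_w^2 + 1/p_prev.

    Assumption: K uses the precision from the previous trial.
    """
    k = sigma_w ** 2 + 1.0 / p_prev
    f = sigmoid(mu)
    return 1.0 / k + f * (1.0 - f)

def confidence(p, mu):
    """Confidence gamma = p / f'(mu)^2, with f'(mu) = f(mu)(1 - f(mu))."""
    f = sigmoid(mu)
    return p / (f * (1.0 - f)) ** 2

# Illustrative values only: precision grows as observations accumulate.
p = 1.0
for trial in range(5):
    p = update_precision(p, mu=0.0, sigma_w=0.1)
    print(trial, round(p, 4), round(confidence(p, mu=0.0), 2))
```

Dividing by the squared slope of the logistic function converts the precision of the latent cause into a precision on the probability scale, so confidence rises both when the latent estimate is precise and when the probability is near 0 or 1.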

Appendix C The Q-learning Method

The Q-learning method uses the following action-value function to predict the reward obtained when choosing action $i$ at trial $t$ :

Q_{i,t}=Q_{i,t-1}+\alpha_{t-1}(r_{t}a_{i,t-1}-Q_{i,t-1}),

where $\alpha_{t}$ represents the learning rate. Additionally, the softmax function

P(a_{i,t}=1)=\frac{e^{\beta_{t}Q_{i,t}}}{\sum_{i}e^{\beta_{t}Q_{i,t}}},

where $\beta_{t}$ represents the inverse temperature and controls the randomness of action selection, is used to determine the action selected by the participant. The learning rate and inverse temperature are estimated from behavioral data at each time point. These parameters are assumed to change over time and follow

\theta_{t}=\theta_{t-1}+\epsilon_{\theta}\times\zeta_{\theta,t},

where $\theta\in\{\alpha,\beta\}$ , $\epsilon_{\theta}$ represents the noise intensity, and $\zeta_{\theta,t}$ represents white noise. This allows for adjustments based on the participant’s actions and outcomes. Consequently, the reward prediction at a given trial depends on the learning rate, inverse temperature, and the predicted reward from the previous trial.
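The action-value update and softmax rule of this appendix can be sketched as below. This is a simplified illustration: the learning rate and inverse temperature are held constant instead of following the random walk, and the reward rule, function names, and parameter values are hypothetical.

```python
import numpy as np

def softmax_policy(q_values, beta):
    """P(a_i = 1) = exp(beta * Q_i) / sum_j exp(beta * Q_j)."""
    z = beta * np.asarray(q_values, dtype=float)
    z -= z.max()  # subtract the maximum for numerical stability
    e = np.exp(z)
    return e / e.sum()

def q_update(q_values, action, reward, alpha):
    """Q_i <- Q_i + alpha * (r * a_i - Q_i), with a the one-hot chosen action.

    As the formula is written, the value of the unchosen action decays toward zero.
    """
    one_hot = np.zeros_like(q_values)
    one_hot[action] = 1.0
    return q_values + alpha * (reward * one_hot - q_values)

# Toy simulation with constant meta-parameters (an assumption for brevity).
rng = np.random.default_rng(0)
q = np.zeros(2)
alpha, beta = 0.2, 3.0
for t in range(50):
    probs = softmax_policy(q, beta)
    action = rng.choice(2, p=probs)
    reward = 1.0 if action == 1 else 0.0  # hypothetical reward rule
    q = q_update(q, action, reward, alpha)

print(q, softmax_policy(q, beta))
```

A larger inverse temperature concentrates the choice probabilities on the action with the higher value, while a value near zero makes the choice close to uniform, which is the sense in which it controls the randomness of action selection.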

References

[1] J.-R. Chardonnet, M. A. Mirzaei, and F. Mérienne, “Features of the postural sway signal as indicators to estimate and predict visually induced motion sickness in virtual reality,” International Journal of Human–Computer Interaction, vol. 33, no. 10, pp. 771–785, 2017.
[2] M. E. Gabyzon, B. Engel-Yeger, S. Tresser, and S. Springer, “Using a virtual reality game to assess goal-directed hand movements in children: A pilot feasibility study,” Technology and Health Care, vol. 24, no. 1, pp. 11–19, 2016.
[3] P.-K. Hung, R.-H. Liang, S.-Y. Ma, and B.-W. Kong, “Exploring the experience of traveling to familiar places in vr: an empirical study using google earth vr,” International Journal of Human–Computer Interaction, vol. 40, no. 2, pp. 255–277, 2024.
[4] Y. Kim and H. Lee, “Falling in love with virtual reality art: A new perspective on 3d immersive virtual reality for future sustaining art consumption,” International Journal of Human–Computer Interaction, vol. 38, no. 4, pp. 371–382, 2022.
[5] R. Venkatakrishnan, R. Venkatakrishnan, R. Canales, B. Raveendranath, D. M. Sarno, A. C. Robb, W.-C. Lin, and S. V. Babu, “The effects of secondary task demands on cybersickness in active exploration virtual reality experiences,” IEEE Transactions on Visualization and Computer Graphics, 2024.
[6] T. B. Kashdan, M. C. Stiksma, D. J. Disabato, P. E. McKnight, J. Bekier, J. Kaji, and R. Lazarus, “The five-dimensional curiosity scale: Capturing the bandwidth of curiosity and identifying four unique subgroups of curious people,” Journal of Research in Personality, vol. 73, pp. 130–149, 2018.
[7] M. P. Arnone, R. V. Small, S. A. Chauncey, and H. P. McKenna, “Curiosity, interest and engagement in technology-pervasive learning environments: A new research agenda,” Educational Technology Research and Development, vol. 59, pp. 181–198, 2011.
[8] S. Davis, K. Nesbitt, and E. Nalivaiko, “A systematic review of cybersickness,” in Proceedings of the 2014 conference on interactive entertainment. New York, NY, USA: Association for Computing Machinery, 2014, pp. 1–9.
[9] Y. Wang, J.-R. Chardonnet, F. Merienne, and J. Ovtcharova, “Using fuzzy logic to involve individual differences for predicting cybersickness during vr navigation,” in 2021 IEEE Virtual Reality and 3D User Interfaces (VR). IEEE, 2021, pp. 373–381.
[10] Y. Wang, J.-R. Chardonnet, and F. Merienne, “Modeling online adaptive navigation in virtual environments based on pid control,” in International Conference on Neural Information Processing. Springer, 2023, pp. 325–346.
[11] J. T. Reason and J. J. Brand, Motion sickness. Academic press, 1975.
[12] L. L. Arns and M. M. Cerney, “The relationship between age and incidence of cybersickness among immersive environment users,” in IEEE Proceedings. VR 2005. Virtual Reality, 2005. IEEE, 2005, pp. 267–268.
[13] T. Jung, M. C. tom Dieck, H. Lee, and N. Chung, “Effects of virtual reality and augmented reality on visitor experiences in museum,” in Information and communication technologies in tourism 2016: Proceedings of the international conference in Bilbao, Spain, February 2-5, 2016. Springer, 2016, pp. 621–635.
[14] M. Augier, “Administrative behavior: A study of decision-making processes in administrative organizations,” 2002.
[15] M. A. Gómez Maureira, I. Kniestedt, M. J. Van Duijn, C. Rieffe, and A. Plaat, “Shinobi valley: Studying curiosity for virtual spatial exploration through a video game,” in Extended Abstracts of the Annual Symposium on Computer-Human Interaction in Play Companion Extended Abstracts, 2019, pp. 421–428.
[16] Y. Wang, J.-R. Chardonnet, and F. Merienne, “Development of a speed protector to optimize user experience in 3d virtual environments,” International Journal of Human-Computer Studies, vol. 147, p. 102578, 2021.
[17] H. E. Yaremych and S. Persky, “Tracing physical behavior in virtual reality: A narrative review of applications to social psychology,” Journal of experimental social psychology, vol. 85, p. 103845, 2019.
[18] M. Rubo and M. Gamer, “Virtual reality as a proxy for real-life social attention?” in proceedings of the 2018 ACM symposium on eye tracking research & applications, 2018, pp. 1–2.
[19] M. Dechant, S. Trimpl, C. Wolff, A. Mühlberger, and Y. Shiban, “Potential of virtual reality as a diagnostic tool for social anxiety: A pilot study,” Computers in Human Behavior, vol. 76, pp. 128–134, 2017.
[20] A. S. Won, B. Perone, M. Friend, and J. N. Bailenson, “Identifying anxiety through tracked head movements in a virtual classroom,” Cyberpsychology, Behavior, and Social Networking, vol. 19, no. 6, pp. 380–387, 2016.
[21] O. Gillath, C. McCall, P. R. Shaver, and J. Blascovich, “What can virtual reality teach us about prosocial tendencies in real and virtual environments?” Media Psychology, vol. 11, no. 2, pp. 259–282, 2008.
[22] J. N. Bailenson, J. Blascovich, A. C. Beall, and J. M. Loomis, “Equilibrium theory revisited: Mutual gaze and personal space in virtual environments,” Presence: Teleoperators & Virtual Environments, vol. 10, no. 6, pp. 583–598, 2001.
[23] S. Deb, D. W. Carruth, R. Sween, L. Strawderman, and T. M. Garrison, “Efficacy of virtual reality in pedestrian safety research,” Applied ergonomics, vol. 65, pp. 449–460, 2017.
[24] M. Iryo-Asano, Y. Hasegawa, and C. Dias, “Applicability of virtual reality systems for evaluating pedestrians’ perception and behavior,” Transportation research procedia, vol. 34, pp. 67–74, 2018.
[25] W. Wei, R. Qi, and L. Zhang, “Effects of virtual reality on theme park visitors’ experience and behaviors: A presence perspective,” Tourism management, vol. 71, pp. 282–293, 2019.
[26] T. G. Reio Jr, J. M. Petrosko, A. K. Wiswell, and J. Thongsukmag, “The measurement and conceptualization of curiosity,” The Journal of Genetic Psychology, vol. 167, no. 2, pp. 117–135, 2006.
[27] J. A. Litman, “Interest and deprivation factors of epistemic curiosity,” Personality and individual differences, vol. 44, no. 7, pp. 1585–1595, 2008.
[28] ——, “Relationships between measures of i-and d-type curiosity, ambiguity tolerance, and need for closure: An initial test of the wanting-liking model of information-seeking,” Personality and Individual Differences, vol. 48, no. 4, pp. 397–402, 2010.
[29] M. Jepma, R. G. Verdonschot, H. Van Steenbergen, S. A. Rombouts, and S. Nieuwenhuis, “Neural mechanisms underlying the induction and relief of perceptual curiosity,” Frontiers in behavioral neuroscience, vol. 6, p. 5, 2012.
[30] M. K. Noordewier and E. Van Dijk, “Curiosity and time: from not knowing to almost knowing,” Cognition and Emotion, vol. 31, no. 3, pp. 411–421, 2017.
[31] J. Kruger and M. Evans, “The paradox of alypius and the pursuit of unwanted information,” Journal of Experimental Social Psychology, vol. 45, no. 6, pp. 1173–1179, 2009.
[32] K. Friston, “The free-energy principle: a rough guide to the brain?” Trends in Cognitive Sciences, vol. 13, no. 7, pp. 293–301, 2009.
[33] ——, “The free-energy principle: a unified brain theory?” Nature reviews neuroscience, vol. 11, no. 2, pp. 127–138, 2010.
[34] K. J. Friston, M. Lin, C. D. Frith, G. Pezzulo, J. A. Hobson, and S. Ondobaka, “Active inference, curiosity and insight,” Neural computation, vol. 29, no. 10, pp. 2633–2683, 2017.
[35] Y. Konaka and H. Naoki, “Decoding reward–curiosity conflict in decision-making from irrational behaviors,” Nature Computational Science, vol. 3, no. 5, pp. 418–432, 2023.
[36] B. Millidge, A. Tschantz, and C. L. Buckley, “Whence the expected free energy?” Neural Computation, vol. 33, no. 2, pp. 447–482, 2021.
[37] P. Schwartenbeck, J. Passecker, T. U. Hauser, T. H. FitzGerald, M. Kronbichler, and K. J. Friston, “Computational mechanisms of curiosity and goal-directed exploration,” elife, vol. 8, p. e41703, 2019.
[38] J. Schmidhuber, “Curious model-building control systems,” in Proc. international joint conference on neural networks, 1991, pp. 1458–1463.
[39] R. Dubey and T. L. Griffiths, “Understanding exploration in humans and machines by formalizing the function of curiosity,” Current Opinion in Behavioral Sciences, vol. 35, pp. 118–124, 2020.
[40] J. Häkkinen, M. Liinasuo, J. Takatalo, and G. Nyman, “Visual comfort with mobile stereoscopic gaming,” in Stereoscopic Displays and Virtual Reality Systems XIII, A. J. Woods, N. A. Dodgson, J. O. Merritt, M. T. Bolas, and I. E. McDowall, Eds., vol. 6055, International Society for Optics and Photonics. SPIE, 2006, pp. 85 – 93.
[41] E. Kolasinski, U. A. R. I. for the Behavioral, and S. Sciences, Simulator Sickness in Virtual Environments, ser. Simulator Sickness in Virtual Environments. U.S. Army Research Institute for the Behavioral and Social Sciences, 1995, no. v. 4. [Online]. Available: https://books.google.com/books?id=fy3q_5LbkLQC
[42] G. D. Park, R. W. Allen, D. Fiorentino, T. J. Rosenthal, and M. L. Cook, “Simulator sickness scores according to symptom susceptibility, age, and gender for an older driver assessment study,” in Proceedings of the human factors and ergonomics society annual meeting, vol. 50, no. 26. SAGE Publications Sage CA: Los Angeles, CA, 2006, pp. 2702–2706. [Online]. Available: https://doi.org/10.1177/154193120605002607
[43] E. M. Kolasinski and R. D. Gilson, “Simulator sickness and related findings in a virtual environment,” in Proceedings of the human factors and ergonomics society annual meeting, vol. 42, no. 21. SAGE Publications Sage CA: Los Angeles, CA, 1998, pp. 1511–1515.
[44] S. P. Smith and S. Du’Mont, “Measuring the effect of gaming experience on virtual environment navigation tasks,” in 2009 IEEE Symposium on 3D User Interfaces. IEEE, 2009, pp. 3–10.
[45] M. S. Dennison, A. Z. Wisti, and M. D’Zmura, “Use of physiological signals to predict cybersickness,” Displays, vol. 44, pp. 42–52, 2016.
[46] Y. Wang, J. Eckkrammer, M. Kocur, and P. Wintersberger, “Investigation of simulator sickness in walking with multiple locomotion technologies in virtual reality,” in Proceedings of the 30th ACM Symposium on Virtual Reality Software and Technology, 2024, pp. 1–2.
[47] J. Gabel, M. Ludwig, and F. Steinicke, “Immersive reading: Comparison of performance and user experience for reading long texts in virtual reality,” in Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, 2023, pp. 1–8.
[48] J. Kim, W. Kim, H. Oh, S. Lee, and S. Lee, “A deep cybersickness predictor based on brain signal analysis for virtual reality contents,” IEEE, 2019.
[49] R. Li, Y. Wang, H. Yin, J.-R. Chardonnet, and P. Hui, “A deep cybersickness predictor through kinematic data with encoded physiological representation,” in 2023 IEEE International Symposium on Mixed and Augmented Reality (ISMAR). IEEE, 2023, pp. 1132–1141.
[50] M. J. Griffin, Handbook of human vibration. Academic press, 2012.
[51] T. S. Braver, D. M. Barch, J. R. Gray, D. L. Molfese, and A. Snyder, “Anterior cingulate cortex and response conflict: effects of frequency, inhibition and errors,” Cerebral cortex, vol. 11, no. 9, pp. 825–836, 2001.
[52] S. J. Chung, S. Y. Kim, and K. H. Kim, “Comparison of visitor experiences of virtual reality exhibitions by spatial environment,” International Journal of Human-Computer Studies, vol. 181, p. 103145, 2024.
[53] P. I. Jaffe, R. A. Poldrack, R. J. Schafer, and P. G. Bissett, “Modelling human behaviour in cognitive tasks with latent dynamical systems,” Nature Human Behaviour, vol. 7, no. 6, pp. 986–1000, 2023.
[54] J. R. J. Neo, A. S. Won, and M. M. Shepley, “Designing immersive virtual environments for human behavior research,” Frontiers in Virtual Reality, vol. 2, p. 603750, 2021.
[55] N. Tian, P. Lopes, and R. Boulic, “A review of cybersickness in head-mounted displays: raising attention to individual susceptibility,” Virtual Reality, vol. 26, no. 4, pp. 1409–1441, 2022.
[56] Y. Ishikawa, A. Kobayashi, and D. Kamisaka, “Modelling and predicting an individual’s perception of advertising appeal,” User Modeling and User-Adapted Interaction, vol. 31, no. 2, pp. 323–369, 2021.
[57] L. Chen, W. Cai, D. Yan, and S. Berkovsky, “Eye-tracking-based personality prediction with recommendation interfaces,” User Modeling and User-Adapted Interaction, vol. 33, no. 1, pp. 121–157, 2023.
[58] Y. Y. Kim, H. J. Kim, E. N. Kim, H. D. Ko, and H. T. Kim, “Characteristic changes in the physiological components of cybersickness,” Psychophysiology, vol. 42, no. 5, pp. 616–625, 2005.
[59] R. H. So, A. Ho, and W. Lo, “A metric to quantify virtual scene movement for the study of cybersickness: Definition, implementation, and verification,” Presence, vol. 10, no. 2, pp. 193–215, 2001.
[60] T. Parr and K. J. Friston, “Generalised free energy and active inference,” Biological cybernetics, vol. 113, no. 5, pp. 495–513, 2019.
