Dynamic Difficulty Adjustment via Fast User Adaptation
Abstract
Dynamic difficulty adjustment (DDA) is a technology that adapts a game's challenge to match the player's skill. It is a key element in game development because it provides continuous motivation and immersion to the player. However, conventional DDA methods require tuning in-game parameters to generate suitable levels for various players. Recent DDA approaches based on deep learning can shorten this time-consuming tuning process, but they require a substantial amount of user demo data for adaptation. In this paper, we present a fast user adaptation method that can adjust the difficulty of a game for various players using only a small amount of demo data by applying a meta-learning algorithm. In a video game user test (n = 9), our proposed DDA method outperformed a typical deep learning-based baseline method.
Keywords: Dynamic difficulty adjustment; deep learning; meta-learning.
1 Introduction
Difficulty balancing is a key element in game development because players easily become bored or frustrated when a game is too easy or too difficult for them. Dynamic difficulty adjustment (DDA) is a method for adapting the difficulty of a game according to the player's ability, thereby providing continuous motivation to the player. Various studies in the HCI field [2, 3, 5] have revealed that DDA has positive effects, such as increasing players' immersion [4] and long-term motivation [12].
Several studies have been conducted on how to implement DDA [7, 15, 9, 13]. One of the most straightforward but powerful methods is to increase or decrease the game's strength index, e.g., the in-game parameters or the AI level, according to the player's in-game performance. However, this parameter adjustment method requires careful, time-consuming tuning. With the development of deep learning technology in various fields [16, 10, 11, 14], it is expected that the shortcomings of conventional DDA methods can be overcome using deep neural networks. In [12], a method was proposed for adapting the game challenge to the player by generating an enemy agent based on a player model trained on the player's actual movements and strategy. This method outperformed conventional DDA in several subjective metrics but required a substantial data acquisition process because the player model had to be newly trained for each player.
In this paper, we propose a novel DDA approach, referred to as fast user adaptation, based on deep neural networks that can quickly adapt to a player's capabilities with a small amount of play data. To use the sparse user demo data effectively, we employ the model-agnostic meta-learning (MAML) algorithm [6]. Meta-learning focuses on fast adaptation to various tasks, i.e., the generalization of network parameters, making it easy to respond to new, unseen tasks. We apply this meta-learning concept to create a DDA model that quickly adapts to new players.


2 Fast User Adaptation
Our fast user adaptation method modifies the MAML algorithm [6] to train a model that can quickly adapt to different users (Figure 1(a)). Dividing the training data obtained from each task $\tau$ into $D^{tr}_{\tau}$ and $D^{val}_{\tau}$, the MAML method first updates the network parameter $\theta$ in a few gradient steps computed on $D^{tr}_{\tau}$, and then trains $\theta$ to minimize the loss on $D^{val}_{\tau}$ computed with the updated parameter. This training method can be expressed as
\[
\min_{\theta} \sum_{\tau} \mathcal{L}\left(D^{val}_{\tau},\; \theta - \alpha \nabla_{\theta} \mathcal{L}(D^{tr}_{\tau}, \theta)\right),
\]
where $\mathcal{L}(D_{\tau}, \theta)$ denotes the loss value when data $D_{\tau}$, obtained from task $\tau$, is fed into the model with parameter $\theta$, and $\alpha$ is the inner-loop step size. Our fast user adaptation method applies the MAML algorithm, but in place of training data from various tasks ($D_{\tau}$), we use demo data from various players ($D_{p}$). Similar to [12], we hypothesize that a DDA scheme in which players encounter agents whose behavior and strategy resemble their own can effectively boost player motivation. Our DDA method is therefore designed to make an agent quickly learn the player's movements so that the player faces an agent who plays similarly to himself/herself.
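To make this concrete, below is a minimal sketch of the per-player MAML update, assuming a PyTorch policy network trained by behavioral cloning on demo batches. The names (`policy`, `loss_fn`, the per-player `(train, val)` split) and the single inner gradient step are illustrative assumptions, not the authors' exact implementation.

```python
import torch
from torch.func import functional_call

def inner_adapt(policy, loss_fn, demo_train, alpha, steps=1):
    # Inner loop: adapt a copy of the meta-parameters theta to one
    # player's demo data D_p^tr with a few gradient steps.
    params = {k: v.clone() for k, v in policy.named_parameters()}
    for _ in range(steps):
        states, actions = demo_train
        loss = loss_fn(functional_call(policy, params, (states,)), actions)
        grads = torch.autograd.grad(loss, list(params.values()), create_graph=True)
        params = {k: p - alpha * g for (k, p), g in zip(params.items(), grads)}
    return params

def meta_step(policy, loss_fn, players, alpha, meta_opt):
    # Outer loop: minimize the post-adaptation loss on each player's
    # held-out demos D_p^val, summed over players.
    meta_opt.zero_grad()
    meta_loss = 0.0
    for demo_train, demo_val in players:
        params = inner_adapt(policy, loss_fn, demo_train, alpha)
        states, actions = demo_val
        meta_loss = meta_loss + loss_fn(functional_call(policy, params, (states,)), actions)
    meta_loss.backward()  # gradients flow through the inner update (second-order MAML)
    meta_opt.step()
```

At deployment, applying the same `inner_adapt` step to a new player's short demo would yield the adapted agent.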
3 Experiment Details
For the user test, we developed a virtual Air Hockey game environment in which two players compete with their respective strikers and a single puck on a slippery surface. A player can freely move the striker within his/her own area and scores points by hitting the puck into the opponent's goal. We conducted a user test in which participants played against DDA-controlled agents in this Air Hockey environment.
To validate our fast user adaptation model, we implemented two baseline DDA methods: another data-driven approach utilizing neural networks, and a conventional DDA approach. For the data-driven baseline, referred to as LSTM-FC Net, we implemented a neural network model incorporating long short-term memory (LSTM) layers that can extract the user embedding information, e.g., users’ proficiency, from user demo data, and fully connected (FC) layers that output appropriate actions based on the current game state and the embedding information (Figure 1(b)). For the conventional DDA baseline, we generated agents corresponding to progressive levels of difficulty from 1 to 9. The level of difficulty was increased or decreased depending on the player’s win or loss.
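As a point of reference, the rule-based baseline can be summarized in a few lines; the paper specifies only that the level moves up or down within the range 1 to 9 on a win or loss, so the unit step size below is an assumption.

```python
def adjust_level(level: int, player_won: bool) -> int:
    """Conventional baseline: raise the agent level after a player win,
    lower it after a loss, clamped to the available range 1..9."""
    level += 1 if player_won else -1
    return max(1, min(9, level))
```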
In detail, our DDA network consists of four FC layers with 80 hidden units each, and the LSTM-FC Net consists of two LSTM layers with 10 hidden units and four FC layers with 80 hidden units. For model training, we used 60M timesteps of artificial agent data and 0.2M timesteps of data acquired from one human player. When the same data were used for five epochs, training our model was about nine times faster than training the LSTM-FC Net (2 hours vs. 18 hours).
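Given the stated layer sizes, the two models could look roughly as follows in PyTorch; the state, action, and demo-feature dimensions (`STATE_DIM`, `ACTION_DIM`, `DEMO_FEAT`) are placeholders, since the paper does not report them.

```python
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, DEMO_FEAT = 16, 2, 16  # illustrative; not reported in the paper

# Fast user adaptation policy: four FC layers with 80 hidden units.
fua_net = nn.Sequential(
    nn.Linear(STATE_DIM, 80), nn.ReLU(),
    nn.Linear(80, 80), nn.ReLU(),
    nn.Linear(80, 80), nn.ReLU(),
    nn.Linear(80, ACTION_DIM),
)

# LSTM-FC Net baseline: two LSTM layers (10 hidden units) embed the user's
# demo sequence; four FC layers (80 units) map state + embedding to an action.
class LSTMFCNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.LSTM(DEMO_FEAT, 10, num_layers=2, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(STATE_DIM + 10, 80), nn.ReLU(),
            nn.Linear(80, 80), nn.ReLU(),
            nn.Linear(80, 80), nn.ReLU(),
            nn.Linear(80, ACTION_DIM),
        )

    def forward(self, state, demo_seq):
        _, (h, _) = self.encoder(demo_seq)  # final hidden state = user embedding
        return self.head(torch.cat([state, h[-1]], dim=-1))
```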
Nine participants between 22 and 29 years of age (mean age = 25.33) were recruited for the user test. All participants were given sufficient practice time beforehand so that their skill would not increase during the user test. After the practice time, participants took part in three sessions with the three different types of DDA agents in random order. The initial difficulty adjustment of each session was performed using data acquired during a pre-session of about one minute conducted immediately before the session. Each session lasted about four minutes; halfway through each session (i.e., after two minutes), a short break was given and an additional difficulty adjustment was performed.
4 Results
We evaluated the DDA methods using both objective and subjective metrics. As objective measures of how successfully the DDA model adapted to the user, we measured the participants' win/loss rate and puck possession, i.e., the percentage of time with the puck on one's side. An even game is expected to yield 50 percent for each metric. Figure 2(a) indicates that our method achieves a win/loss rate comparable to the conventional method and superior to that of the LSTM-FC Net. In terms of puck possession, our method also shows a result comparable to the conventional method and superior to the LSTM-FC Net.
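For clarity, both objective metrics reduce to simple ratios over the session logs; the logging format assumed below (per-round winner labels, per-frame puck-side labels) is illustrative, not the authors' recording scheme.

```python
def win_rate(round_winners: list) -> float:
    """Fraction of rounds won by the participant; 0.5 indicates an even game."""
    return sum(w == "player" for w in round_winners) / len(round_winners)

def puck_possession(puck_sides: list) -> float:
    """Fraction of frames with the puck on the participant's side."""
    return sum(s == "player" for s in puck_sides) / len(puck_sides)
```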
As subjective metrics, we asked the participants to complete a questionnaire, modified from [8], that assessed enjoyment, suitable difficulty, engrossment, and personal gratification. Figure 2(b) shows the subjective evaluation results of our user test. Our DDA method shows superior results to the LSTM-FC Net in terms of enjoyment, engrossment, and, in particular, the suitable difficulty score.
5 Conclusion
In this paper, we proposed a novel DDA method, named fast user adaptation, based on a meta-learning algorithm. Our method surpassed a deep neural network-based baseline in both objective and subjective evaluations and showed a much faster learning speed. In addition, our method achieved performance comparable to the conventional DDA while avoiding its time-consuming parameter tuning.
6 Acknowledgments
This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1D1A1B07043580).
References
- [2] Alexander Baldwin, Daniel Johnson, and Peta A Wyeth. 2014. The effect of multiplayer dynamic difficulty adjustment on the player experience of video games. In CHI’14 Extended Abstracts on Human Factors in Computing Systems. 1489–1494.
- [3] Thomas Constant and Guillaume Levieux. 2019. Dynamic difficulty adjustment impact on players’ confidence. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI ’19). 1–12.
- [4] Alena Denisova and Paul Cairns. 2015. Adaptation in digital games: the effect of challenge adjustment on player performance and experience. In Proceedings of the 2015 Annual Symposium on Computer-Human Interaction in Play (CHI PLAY’15). 97–101.
- [5] Alena Denisova and Paul Cairns. 2019. Player experience and deceptive expectations of difficulty adaptation in digital games. Entertainment Computing 29 (2019), 56–68.
- [6] Chelsea Finn, Pieter Abbeel, and Sergey Levine. 2017. Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the 34th International Conference on Machine Learning (ICML). 1126–1135.
- [7] Suoju He, Junping Wang, Xiao Liu, Wan Huang, and others. 2010. Dynamic difficulty adjustment of game AI by MCTS for the game Pac-Man. In 2010 Sixth International Conference on Natural Computation, Vol. 8. IEEE, 3918–3922.
- [8] Takahiro Kusano, Yunshi Liu, Pujana Paliyawan, Tomohiro Harada, and Ruck Thawonmas. 2019. Motion Gaming AI using Time Series Forecasting and Dynamic Difficulty Adjustment for Improving Exercise Balance and Enjoyment. In 2019 IEEE Conference on Games (CoG).
- [9] David Melhart, Ahmad Azadvar, Alessandro Canossa, Antonios Liapis, and Georgios N Yannakakis. 2019. Your gameplay says it all: modelling motivation in Tom Clancy’s the division. In 2019 IEEE Conference on Games (CoG). IEEE, 1–8.
- [10] Hee-Seung Moon and Jiwon Seo. 2019a. Observation of human response to a robotic guide using a variational autoencoder. In 2019 Third IEEE International Conference on Robotic Computing (IRC). IEEE, 258–261.
- [11] Hee-Seung Moon and Jiwon Seo. 2019b. Prediction of human trajectory following a haptic robotic guide using recurrent neural networks. In 2019 IEEE World Haptics Conference (WHC). IEEE, 157–162.
- [12] Johannes Pfau, Jan David Smeddinck, and Rainer Malaka. 2020. Enemy within: Long-term motivation effects of deep player behavior models for dynamic difficulty adjustment. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (CHI’20). 1–10.
- [13] I-Chen Wu, Ti-Rong Wu, An-Jen Liu, Hung Guei, and Tinghan Wei. 2019. On strength adjustment for MCTS-based programs. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI’19), Vol. 33. 1222–1229.
- [14] Ziming Wu, Yulun Jiang, Yiding Liu, and Xiaojuan Ma. 2020. Predicting and Diagnosing User Engagement with Mobile UI Animation via a Data-Driven Approach. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (CHI’20). 1–13.
- [15] Haiyan Yin, Linbo Luo, Wentong Cai, Yew-Soon Ong, and Jinghui Zhong. 2015. A data-driven approach for online adaptation of game difficulty. In 2015 IEEE Conference on Computational Intelligence and Games (CIG). IEEE, 146–153.
- [16] Tianhe Yu, Chelsea Finn, Annie Xie, Sudeep Dasari, Tianhao Zhang, Pieter Abbeel, and Sergey Levine. 2018. One-shot imitation from observing humans via domain-adaptive meta-learning. arXiv preprint arXiv:1802.01557 (2018).