
AAMAS ’23: Proc. of the 22nd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2023), May 29 – June 2, 2023, London, United Kingdom. A. Ricci, W. Yeoh, N. Agmon, B. An (eds.). © 2023. Submission ID 71.

Affiliations: Delft University of Technology, Delft, The Netherlands; TNO, Soesterberg, The Netherlands.

Persuading to Prepare for Quitting Smoking with a Virtual Coach: Using States and User Characteristics to Predict Behavior

Nele Albers [email protected], Mark A. Neerincx [email protected], and Willem-Paul Brinkman [email protected]
Abstract.

Despite their prevalence in eHealth applications for behavior change, persuasive messages tend to have small effects on behavior. Conditions or states (e.g., confidence, knowledge, motivation) and characteristics (e.g., gender, age, personality) of persuadees are two promising components for more effective algorithms for choosing persuasive messages. However, it is not yet sufficiently clear how well considering these components allows one to predict behavior after persuasive attempts, especially in the long run. Since collecting data for many algorithm components is costly and places a burden on users, a better understanding of the impact of individual components in practice is welcome. This can help to make an informed decision on which components to use. We thus conducted a longitudinal study in which a virtual coach persuaded 671 daily smokers to do preparatory activities for quitting smoking and becoming more physically active, such as envisioning one’s desired future self. Based on the collected data, we designed a Reinforcement Learning (RL)-approach that considers current and future states to maximize the effort people spend on their activities. Using this RL-approach, we found, based on leave-one-out cross-validation, that considering states helps to predict both behavior and future states. User characteristics and especially involvement in the activities, on the other hand, only help to predict behavior if used in combination with states rather than alone. We see these results as supporting the use of states and involvement in persuasion algorithms. Our dataset is available online.

Key words and phrases:
Persuasion Algorithm; Reinforcement Learning; Conversational Agent; eHealth; Smoking; Behavior Change; Physical Activity

1. Introduction

Recent years have seen a surge of eHealth applications for behavior change (e.g., Ly et al. (2017); Fadhil and Gabrielli (2017); Meijer et al. (2021)), which provide behavior change support over the Internet or connected technologies such as apps and text messaging. Such applications often ask their users to do activities such as setting a goal, planning a running route, or watching an educational video. Persuasive messages are commonly used to motivate users to do these activities. For example, users may be reminded that doing an activity is in line with their decision to change their behavior. However, the effect of single persuasive attempts on behavior tends to be small (e.g., Kaptein et al. (2015); de Vries (2018); Albers et al. (2022)).

Several studies have tried to increase the effectiveness of a persuasive attempt. One way is to consider the current state people are in (e.g., confidence, knowledge, motivation). Such a state describes a person’s condition or status at a certain time that is relatively stable with regards to its elements American Psychological Association (2023). Carfora et al. (2020) and Klein et al. (2013), for instance, account for people’s self-efficacy when selecting messages for behavior change. Doing so is in line with behavior change theories, which posit that behavior is influenced by people’s current state (e.g., Michie et al. (2011); Ajzen (1991)). Yet, behavior in turn can also influence people’s states. For example, verbally persuading people Strecher et al. (1986) or improving their mood Kavanagh and Bower (1985) may increase their self-efficacy. Intuitively, we want to persuade people in such a way that they move to a state in which they are more likely to be successfully persuaded again. One framework that allows one to consider both current and future states is Reinforcement Learning (RL). RL with consideration of states has been applied to adapting the framing of messages for inducing healthy nutritional habits Carfora et al. (2020) or the affective behavior of a social robot teacher Gordon et al. (2016). However, it is not yet sufficiently clear how persuasive attempts affect behavior and future states, especially after a sequence of these attempts.

An alternative to considering people’s current state when choosing a persuasive strategy is to consider their characteristics such as gender, personality, and involvement in an issue. While previous work has found such characteristics to play a role (e.g., Kaptein and Eckles (2012); de Vries (2018); Maheswaran and Meyers-Levy (1990)), little work has comprehensively compared the use of user characteristics to that of states. In addition, it may be helpful to combine these two approaches: behavior after applying a persuasive strategy in a state may differ based on user characteristics.

Our goal thus is to shed light on the effects of considering algorithm components such as states, user characteristics, or both when choosing a persuasive strategy. While previous work has tested algorithms with such components (e.g., Gordon et al. (2016); Hors-Fraile et al. (2019)), we do not yet understand the effects of individual algorithm components in practice. Therefore, rather than developing a new algorithm and comparing it to existing ones, we want to first get a better understanding of the practical impact of algorithm components. This can enable informed decisions on which components to include, which is desirable due to the larger amount of human data that needs to be collected when more components are used. Collecting more human data is costly and places a burden on users of eHealth applications that is unlikely to benefit the already low adherence rates to these applications. If data is explicitly collected by means of questions, people are likely to stop using the application if many questions are asked. For example, Pommeranz et al. (2012) saw that more cognitively demanding preference elicitation methods were seen as more effortful and liked less, which is negatively associated with technology use Venkatesh et al. (2012). Moreover, while implicit data collection methods such as sensors have the potential to collect high-quality data less obtrusively, they often do not yet succeed at this. Yang et al. (2023), for instance, found in the context of smoking cessation that improvements in sensing technology are needed to obtain higher data quality, lower the burden to users, and increase adherence.

Thus, to get a better understanding of the effects of algorithm components, we conducted a study in which smokers interacted with the text-based virtual coach Sam in up to five sessions. In each session, Sam assigned people a new preparatory activity for quitting smoking together with a persuasive strategy. The goal of these activities was to prepare people for change, which is typically done at the start of a behavior change intervention to increase the likelihood of successful change. Half of the activities targeted becoming more physically active as this may facilitate quitting smoking Haasova et al. (2013); Trimbos Instituut (2016). In the next session, Sam asked about the effort people spent on their activity to measure their behavior. To determine people’s states, Sam asked questions about people’s capability, opportunity, and motivation to do an activity. Each pair of states from consecutive sessions forms a transition sample that we used to predict states after persuasive attempts. Moreover, we measured 32 characteristics covering demographics, smoking and physical activity, personality, and involvement in the activities. Based on the resulting 2366 transition samples from 671 people, we compared the effectiveness of considering states, user characteristics, or both for predicting behavior after persuasive attempts. In addition, we used simulations to assess the long-term effects of optimally persuading people based on an RL-approach that considers current and future states to maximize the effort people spend on their activities.

This paper’s contribution is evidence supporting the use of states derived from behavior change theories as well as people’s overall involvement as components in persuasion algorithms. Following the stages in the development of technological health interventions defined by Brinkman (2011), this justifies research on including these components in a full intervention as a next step.

2. Background

2.1. Persuasive strategies

Several sets of persuasive strategies have been defined. For example, Oinas-Kukkonen and Harjumaa (2008) distinguish seven persuasive strategies such as social comparison and competition, Cialdini (2006) defines six persuasive strategies such as authority, Fogg (2002) differentiates between persuasive strategies related to “technology as a tool” (e.g., self-monitoring) and those related to “technology as a social actor” (e.g., language cues), and Consolvo et al. (2009) describe nine persuasive strategies such as credibility. Such persuasive strategies are meant to directly influence people’s motivation Michie et al. (2011). In addition, there are strategies that are meant to influence motivation indirectly by, for example, restructuring a person’s environment. Examples include action and coping planning Sniehotta et al. (2005b). Notably, many of these persuasive strategies can be implemented in several ways. For instance, there are different ways of framing messages (e.g., Steward et al. (2003); Catellani et al. (2021)) and communication modalities (e.g., Vidrine et al. (2012); Wang et al. (2019)). In this work, we focus on persuasive strategies that can be implemented in a text-based virtual coach that supports a single person in their behavior change process, without requiring external elements such as sensor data or peers. We thereby interpret the term persuasion broadly to also include strategies that influence motivation indirectly.

2.2. States

Persuasive strategies are not equally effective in all circumstances: the context of a persuasive attempt matters Alslaity and Tran (2020); Oinas-Kukkonen and Harjumaa (2009). One way to describe the context is the state a persuadee is in. For example, the effectiveness of different health messages depends on a persuadee’s self-efficacy Bertolotti et al. (2019), and the processing of messages depends on a persuadee’s mood Bless et al. (1990); Fogg (2002). Several of these state features have been formalized as influencing behavior in behavior change theories. One such theory is the behavior change wheel Michie et al. (2011), at whose center lies the Capability-Opportunity-Motivation-Behavior (COM-B) model of behavior. This COM-B model is an overarching causal model of behavior, according to which a person’s capability, motivation, and opportunity determine their behavior. Capability includes having the necessary knowledge and skills, motivation considers the brain processes influencing behavior, and opportunity captures factors outside of an individual such as support from one’s social environment. The COM-B model is overarching in the sense that components of other behavior change theories can be mapped to it. For example, Fogg’s behavior model specifies that ability, motivation, and a trigger need to come together for behavior to happen Fogg (2002). Ability can be mapped to “Capability” in the COM-B model, motivation to “Motivation,” and the trigger to “Opportunity.” The COM-B model thus provides an indication of which information about a persuadee’s state needs to be considered to predict behavior after persuasive attempts. One question we pose hence is:

Q1: How well can states derived from the COM-B model predict behavior after persuasive attempts?

2.3. Future states

In the COM-B model, a person’s capability, opportunity, and motivation influence their behavior, and the behavior in turn influences their capability, opportunity, and motivation. Thus, behavior influences people’s future states. This effect of behavior on a person’s state has also been studied in the context of persuasion. For instance, Steward et al. (2003) found that the framing of messages influences their effect on self-efficacy, and Carfora et al. (2019) saw that the message type affects a person’s intention to act, anticipated regret, and attitude towards behavior. Thus, persuasive strategies differ in their effect on a persuadee’s state. Ideally, we would choose a persuasive strategy that positively influences a persuadee’s state by, for example, increasing motivation. To do so, we need to be able to predict not just the behavior, but also the state after a persuasive attempt. We thus investigate the following question:

Q2: How well can states derived from the COM-B model predict states after persuasive attempts?

Ideally, a persuasive attempt moves a person to a future state in which they are more likely to be successfully persuaded again. Since capability, opportunity, and motivation determine behavior, the goal is that each person ultimately moves to, and then stays in, a state with high values for these predictors of behavior. We, therefore, want to examine what happens to people’s states after a sequence of persuasive attempts in the ideal case. The ideal case is that we always use the optimal persuasive strategy:

Q3: What is the effect of (multiple) optimal persuasive attempts on persuadees’ states?

Being able to predict states may help to choose effective sequences of persuasive strategies, but how is behavior affected by using sequences of optimal persuasive strategies? And importantly, how much does it matter what a virtual coach says? Hence, we pose the following question:

Q4: How do optimal and sub-optimal persuasive attempts compare in their effect on behavior?

2.4. User characteristics

Considering people’s states is one way to capture their differing responses to persuasive strategies; considering user characteristics is another. By user characteristics, we mean information about a user that changes, if at all, very slowly and irrespective of persuasive attempts. Kaptein and Eckles (2012), for instance, showed that age, gender, and personality may influence which of the persuasive strategies by Cialdini (2006) is most effective. Several other works have confirmed the influence of user characteristics such as the stage of behavior change de Vries (2018), personality Alkış and Temizel (2015); de Vries (2018); Halko and Kientz (2010); Oyibo and Vassileva (2017); Zalake et al. (2021), age and gender Muhammad Abdullahi et al. (2018), cultural background Oyibo et al. (2018), and how people approach pleasure and pain Cesario et al. (2008). Another potentially important user characteristic is involvement. According to the Elaboration Likelihood Model (ELM) Petty and Cacioppo (1986), messages are more likely to be processed in detail when people are highly involved in an issue Maheswaran and Meyers-Levy (1990). Such in-depth processing in turn is more likely to have a persistent effect Petty and Cacioppo (1986). Predicting the effectiveness of persuasive attempts based on user characteristics has the advantage that we need to collect data less often: in contrast to states, we do not need to gather this data before each persuasive attempt. We thus pose the following question:

Q5: How does predicting behavior based on user characteristics compare to doing so based on states?

Rather than replacing states with user characteristics, one may also use both states and characteristics. For instance, Steward et al. (2003) showed that a person’s need for cognition influences the effect of message types on self-efficacy. Thus, user characteristics may have an effect on the states after persuasive attempts. Intuitively, one would expect people who are more similar with regard to these user characteristics to respond more similarly to persuasive attempts. We, therefore, investigate the following question:

Q6: How does incorporating users’ similarity based on characteristics, besides the consideration of states, improve the prediction of behavior?

3. Methodology

To answer our research questions, we developed the virtual coach Sam that persuaded people to do preparatory activities for quitting smoking based on an RL-algorithm. This algorithm for choosing persuasive strategies aimed to maximize the effort people spend on their activities over time. Data for the algorithm was collected in a longitudinal study. The data and analysis code underlying this paper as well as the Appendix can be found online Albers et al. (2023).

3.1. Virtual coach

We implemented the text-based virtual coach Sam that helped people prepare for quitting smoking and becoming more physically active in conversational sessions. In each session, Sam randomly proposed to users a new preparatory activity for quitting smoking or becoming more physically active such as tracking one’s smoking behavior. These activities were based on components of the StopAdvisor smoking cessation intervention Michie et al. (2012) and future-self exercises Meijer et al. (2018); Penfornis et al. (2023). After proposing the activity, Sam asked questions to determine a user’s current state. This state was used as input for choosing how to persuade the user to do the activity. In the next session, Sam asked about users’ experience with their activity and the effort they spent on it. Throughout the dialog, Sam used techniques from motivational interviewing Henkemans et al. (2009) such as giving compliments for spending a lot of effort on activities and otherwise expressing empathy. Empathy can also facilitate forming and maintaining a relationship with a user Bickmore et al. (2005), which can support behavior change Zhang et al. (2020). Moreover, based on discussions with smoking cessation experts, Sam maintained a generally positive and encouraging attitude while trying to avoid responses that may be perceived as too enthusiastic Free et al. (2009). The implementation of the virtual coach, based on Rasa and Rasa Webchat, can be found online Albers (2022). The structure and an example of the conversational sessions as well as examples of the activities are available in the Appendix.

3.2. Persuasion algorithm

For each persuasive attempt, Sam chose a persuasive strategy based on its learned policy. In the next session, the user provided Sam with feedback by reporting the effort they spent on their activity. Formally, we can define our approach as a Markov Decision Process (MDP) $\langle S,A,R,T,\gamma\rangle$. The action space $A$ consisted of different persuasive strategies, the reward function $R:S\times A\times S\rightarrow[-1,1]$ was determined by the self-reported effort, $T:S\times A\times S\rightarrow[0,1]$ described the transition function, and the discount factor $\gamma$ was set to $0.85$ to favor rewards obtained in the near future over rewards obtained in the more distant future. The intuition behind this value for $\gamma$ was that while we wanted to successfully persuade a user over multiple time steps, a failed persuasive attempt in the near future could cause a user to become less receptive to future ones or even to drop out entirely: early success might encourage people to continue Amabile and Kramer (2011). The finite state space $S$ described the state a user was in and was captured by answers to questions about a user’s capability, opportunity, and motivation to perform an activity Michie et al. (2014). The goal of an agent in an MDP is to learn an optimal policy $\pi^{*}:S\rightarrow\Pi(A)$ that maximizes the expected cumulative discounted reward $\mathbb{E}\big[\sum_{t=0}^{\infty}\gamma^{t}r_{t}\big]$ for acting in the given environment. The value function $V^{\pi}:S\rightarrow\mathbb{R}$ describes the expected cumulative discounted reward for executing $\pi$ in state $s$ and all subsequent states. $V^{*}$ denotes the value function for $\pi=\pi^{*}$. Figure 3 in the Appendix illustrates the algorithm idea.
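To make the role of the discount factor concrete, the following sketch (plain Python; the reward sequences are hypothetical) computes the cumulative discounted reward $\sum_{t}\gamma^{t}r_{t}$ for an early versus a late success, illustrating why $\gamma=0.85$ favors rewards obtained in the near future:

```python
def discounted_return(rewards, gamma=0.85):
    """Cumulative discounted reward: sum over t of gamma^t * r_t."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

# Hypothetical reward sequences: the same single success (reward +1),
# received early vs. late in a five-step interaction.
early = [1.0, 0.0, 0.0, 0.0, 0.0]
late = [0.0, 0.0, 0.0, 0.0, 1.0]

print(discounted_return(early))  # 1.0
print(discounted_return(late))   # 0.85**4, roughly 0.52
```

With $\gamma=0.85$, a success four sessions later is worth only about half as much as an immediate one, which matches the stated intuition that early success matters.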

State space.

In each session, users provided answers to questions about their capability, opportunity, and motivation to do preparatory activities (e.g., “I feel that I need to do the activity”) on 5-point Likert scales. These questions were based on the COM-B self-evaluation questionnaire Michie et al. (2014) with an additional question about self-efficacy based on Sniehotta et al. (2005a) to assess motivation (see Table 2 in the Appendix). To use the time and effort of users efficiently, we only asked those questions that we envisioned to differ between people for our domain. We transformed the answers to binary features based on whether a value was greater than or equal to the feature mean (1) or less than the feature mean (0). To further reduce the size of the state space, we used our collected data to select three out of eight features in a way that was inspired by the G-algorithm Chapman and Kaelbling (1991). This involved iteratively selecting the feature for which the Q-values were most different when the feature is 0 compared to when it is 1. Besides the reduction in state space size, this feature selection also has the benefit that fewer questions would need to be answered by users in practice. The three chosen features were 1) whether users felt like they wanted to do an activity, 2) whether they had things that prompted or reminded them to do an activity, and 3) whether they felt like they needed to do an activity. The resulting state space had a size of $|S|=2^{3}=8$. We denote states with binary strings such as $001$ (here the first and second features are 0 and the third feature is 1).
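The binarization and bit-string state encoding described above can be sketched as follows (a minimal illustration; the answers and feature means are hypothetical, and the G-algorithm-inspired feature selection itself is omitted):

```python
def binarize(responses, means):
    """Map Likert answers to binary features: 1 if >= feature mean, else 0."""
    return [1 if value >= mean else 0 for value, mean in zip(responses, means)]

def encode_state(bits):
    """Encode selected binary features as a bit string, e.g. [0, 0, 1] -> '001'."""
    return "".join(str(b) for b in bits)

# Hypothetical answers to the three selected questions (5-point Likert scales)
# and hypothetical feature means estimated from collected data.
answers = [4, 2, 5]
means = [3.2, 3.5, 3.1]

state = encode_state(binarize(answers, means))
print(state)  # '101'
```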

Action space.

Five persuasive strategies formed the action space: authority, commitment, and consensus from Cialdini (2006), action planning Hagger and Luszczynska (2014), and no persuasion. The first three persuasive strategies consisted of a persuasive message (e.g., “Experts recommend doingactivity\langle doing\ activity\rangle to positiveimpactofactivity\langle positive\ impact\ of\ activity\rangle.”) and a subsequent reflective question (e.g., “Which other experts, whose opinion you value, would agree with this?”). The latter was meant to increase the in-depth central processing of the persuasive message. According to the ELM, such high-effort central processing of messages leads to attitudes that are more likely to be persistent over time, resistant to counterattack, and influential in guiding thought and behavior Petty and Cacioppo (1986). Persuasive messages were based on the validated messages from Thomas et al. (2017). For action planning, users were asked to create an if-then plan for doing their activity based on the formulation by Sniehotta et al. (2005a). Yet, rather than asking users to enter their action plans in a table, the virtual coach prompted them to create an if-then plan of the form “If \langlesituation\rangle, then I will \langledo activity\rangle” based on Chapman et al. (2009). For the first four persuasive strategies, a message that reminded people of their new activity after the session also contained a question based on the persuasive strategy. These reminder questions were based on the ones by Schwerdtfeger et al. (2012). Repeating a persuasive attempt can also increase in-depth central processing Petty and Cacioppo (1986). Examples of persuasive messages and reflective questions are given in the Appendix.

Reward.

In sessions 2–5, participants were asked about the overall effort they spent on their last activity on a scale from 0 to 10, adapted from Hutchinson and Tenenbaum (2006). Based on the mean effort $\overline{e}$, the reward $r\in[-1,1]$ for an effort $e$ was computed as follows:

$$r=\begin{cases}-1+\frac{e}{\overline{e}} & \text{if } e<\overline{e}\\ 1-\frac{10-e}{10-\overline{e}} & \text{if } e>\overline{e}\\ 0 & \text{otherwise.}\end{cases}$$

The idea behind this reward signal was that an effort that was equal to the mean was awarded a reward of 0, and that rewards for efforts greater and lower than the mean were each equally spaced.
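This reward mapping can be written out directly (a sketch; the mean effort of 5 used in the example is hypothetical):

```python
def reward(effort, mean_effort):
    """Map self-reported effort (0-10) to a reward in [-1, 1].

    Efforts equal to the mean give 0; efforts below/above the mean are
    spread linearly over [-1, 0) and (0, 1], respectively.
    """
    if effort < mean_effort:
        return -1.0 + effort / mean_effort
    if effort > mean_effort:
        return 1.0 - (10.0 - effort) / (10.0 - mean_effort)
    return 0.0

# With a hypothetical mean effort of 5:
print(reward(0, 5))   # -1.0
print(reward(5, 5))   # 0.0
print(reward(10, 5))  # 1.0
```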

3.3. Data collection

Study.

We conducted a longitudinal study in which people interacted with Sam in up to five conversational sessions between 20 May 2021 and 30 June 2021. The Human Research Ethics Committee of Delft University of Technology granted ethical approval for the research (Letter of Approval number: 1523). Before the collection of data, the study was preregistered in the Open Science Framework (OSF) Albers and Brinkman (2021). Participants were recruited from the online crowdsourcing platform Prolific. Eligible were people who were contemplating or preparing to quit smoking DiClemente et al. (1991), smoked tobacco products daily, were fluent in English, were not part of another intervention to quit smoking, had an approval rate of at least 90% and at least one previous submission on Prolific, and provided informed consent. Participants were persuaded randomly in the first two sessions. Afterward, participants were split into four groups, each of which was persuaded based on a different policy. We provide details on these policies in Table 5 in the Appendix. 760 people started the first session, and 518 people successfully completed session 5 (see Figure 3 in the Appendix). Participant characteristics such as age and education level are shown in Table 6 in the Appendix.

Data.

We gathered 2366 $\langle s,a,r,s^{\prime}\rangle$-samples from 671 people, where $s$ is the state, $a$ the action, $r$ the reward, and $s^{\prime}$ the next state. Besides these transition samples, we also collected data on user characteristics. This includes 31 pre-characteristics (i.e., characteristics measured before any persuasive attempt) covering demographics, smoking, physical activity, personality, and need for cognition. Moreover, we measured users’ overall involvement in their assigned activities after the five sessions. Due to dropout, we obtained involvement data for only 500 participants. The Appendix provides more information on the user characteristics we measured.

4. Results

We now investigate each of our six research questions. For each research question, we first describe our setup, followed by our findings and the resulting answer to the research question.

Q1: How well can states derived from the COM-B model predict behavior after persuasive attempts?

Setup.

Knowing the state a persuadee is in may help to predict their behavior after persuading them with different persuasive strategies (i.e., actions). The behavior in our case is the effort people spend on their preparatory activities, which is captured by our reward function. We compared two approaches for predicting the reward: 1) the mean reward per action, and 2) the mean reward per action and state. We used leave-one-out cross-validation for the 671 participants with at least one transition sample to compare the two approaches based on the mean $L_{1}$-error and its Bayesian 95% credible interval (CI) Oliphant (2006) per state. In contrast to classical confidence intervals, Bayesian CIs provide information on the most likely values (i.e., a likely range) Hoekstra et al. (2014). We regard non-overlapping 95% CIs as a credible indication that values are different, both for this research question and the subsequent ones.
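The comparison can be sketched as follows (a simplified illustration with hypothetical samples; unlike our analysis, which leaves out one person at a time and reports Bayesian credible intervals, this sketch leaves out one sample at a time and omits the intervals):

```python
from collections import defaultdict

def fit_mean_rewards(samples, key):
    """Mean reward per key, where key(sample) is e.g. the action,
    or the (action, state) pair."""
    sums, counts = defaultdict(float), defaultdict(int)
    for sample in samples:
        k = key(sample)
        sums[k] += sample["r"]
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sums}

def loo_l1_errors(samples, key):
    """Leave-one-out L1 errors for predicting rewards via per-key means."""
    errors = []
    for i, sample in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        means = fit_mean_rewards(train, key)
        pred = means.get(key(sample), 0.0)  # fall back to 0 for unseen keys
        errors.append(abs(sample["r"] - pred))
    return errors

# Hypothetical transition samples (state, action, reward).
samples = [
    {"s": "000", "a": "authority", "r": -0.6},
    {"s": "000", "a": "authority", "r": -0.4},
    {"s": "111", "a": "authority", "r": 0.3},
    {"s": "111", "a": "planning", "r": 0.2},
]

per_action = loo_l1_errors(samples, key=lambda x: x["a"])
per_action_state = loo_l1_errors(samples, key=lambda x: (x["a"], x["s"]))
```

In this toy data, conditioning on the state as well as the action yields lower leave-one-out errors, mirroring the pattern reported below for states whose mean reward deviates strongly from the overall mean.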

Figure 1. Left axis: Mean $L_{1}$-error with 95% CIs for predicting rewards based on 1) the mean reward per action and 2) the mean reward per action and state. Right axis: Mean reward overall and per state.

Findings.

Considering the state tends to result in lower $L_{1}$-errors for predicting the reward than not considering the state (Figure 1). This makes sense, as the mean reward strongly differs between states. For example, while state 000 has a mean reward of -0.52, state 111 has a mean reward of 0.25 (see the red line in Figure 1). In such states with mean rewards much lower or higher than the overall mean reward, the advantage of considering states for the reward prediction is pronounced, with the 95% CIs for the two approaches not overlapping. This provides a credible indication that considering states performs better. For states with mean rewards more similar to the overall mean reward, on the other hand, the 95% CIs for the two approaches tend to overlap. So there is no credible indication that one of the two approaches is better for those states.

Answer to Q1.

Considering the state a persuadee is in helps to predict the effort they spend on an activity, as long as the state is one in which people tend to spend much less or more effort on activities than on average. Using features derived from the COM-B model, we obtained such states.

Q2: How well can states derived from the COM-B model predict states after persuasive attempts?

Setup.

Ideally, we want to persuade a person in such a way that they move to a state in which they are likely to again be persuaded to spend a lot of effort on an activity. Therefore, we need to be able to predict the state after a persuasive attempt. Using leave-one-out cross-validation, we compared three approaches for predicting the next states for the samples from the left-out person: 1) assigning an equal probability to all states, 2) predicting that people stay in their current state, and 3) using the transition function estimated from the training data. We compared the three approaches based on the mean likelihood of the next state and its 95% CI per state. A higher likelihood suggests that next states can be predicted better.
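The three prediction approaches can be sketched as follows (a simplified illustration with hypothetical samples; the likelihood each approach assigns to the observed next state is computed per test sample, and credible intervals are omitted):

```python
from collections import defaultdict

def estimate_transitions(samples, states):
    """Estimate T(s' | s, a) from counts; uniform for unseen (s, a) pairs."""
    counts = defaultdict(lambda: defaultdict(int))
    for s, a, s_next in samples:
        counts[(s, a)][s_next] += 1
    def t(s, a, s_next):
        total = sum(counts[(s, a)].values())
        if total == 0:
            return 1.0 / len(states)
        return counts[(s, a)][s_next] / total
    return t

def likelihoods(test, t, states):
    """Likelihood each of the three approaches assigns to the observed next state."""
    uniform = [1.0 / len(states) for _ in test]
    stay = [1.0 if s == s_next else 0.0 for s, _, s_next in test]
    learned = [t(s, a, s_next) for s, a, s_next in test]
    return uniform, stay, learned

states = ["000", "001", "010", "011", "100", "101", "110", "111"]
# Hypothetical (state, action, next_state) transition samples.
train = [("000", "authority", "000"), ("000", "authority", "001"),
         ("111", "planning", "111")]
test = [("000", "authority", "000"), ("111", "planning", "111")]

t = estimate_transitions(train, states)
uniform, stay, learned = likelihoods(test, t, states)
```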

Figure 2. Comparison of three approaches to predicting next states with regard to the mean likelihood of next states with 95% CIs for each state.

Findings.

Figure 2 shows that considering the current state, by either predicting that people stay in their current state or assigning a probability to next states based on the estimated transition function, leads to a higher mean likelihood of next states than assigning an equal probability to all next states. This shows that state transitions do not occur uniformly at random. Notably, predicting that people stay in their current state leads to the highest mean likelihood of next states in three of the eight states. These states are states 000, 010, and 111. In each of these states, the mean for predicting that people stay in their current state is highest and the corresponding 95% CI does not overlap with the ones for the other two approaches. This shows the high probability of staying in those three states, which are states with either very low or very high mean rewards (Figure 1).

Answer to Q2.

Our results show that considering the current state a persuadee is in helps to predict their next state after a persuasive attempt. For persuadees who are in states in which people tend to spend very little or very much effort on their activities, this next state tends to be the same as the current one. This means that if we just persuade people as we did in the study used to collect data, we will have limited success in moving people from low-effort to higher-effort states. However, once people are in higher-effort states, they are likely to stay there.

Q3: What is the effect of (multiple) optimal persuasive attempts on persuadees’ states?

Setup.

We would like people to ultimately move to the states in which they are most likely to be persuaded to spend a lot of effort on activities. Starting from an equal distribution of people across the states, we calculated the percentage of people in each state after following the optimal policy $\pi^*$ for a certain number of time steps. $\pi^*$ was computed via value iteration based on all gathered samples. Table 7 in the Appendix shows $\pi^*$.
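A minimal sketch of computing $\pi^*$ via value iteration for a tabular MDP (assuming estimated transition probabilities `T` and mean rewards `R`; the discount factor shown is an illustrative assumption, not necessarily the one we used):

```python
import numpy as np

def value_iteration(T, R, gamma=0.85, tol=1e-9):
    """Value iteration for a tabular MDP.
    T: (n_states, n_actions, n_states) transition probabilities.
    R: (n_states, n_actions) mean rewards.
    gamma: discount factor (an assumption for this sketch)."""
    n_states = T.shape[0]
    V = np.zeros(n_states)
    while True:
        # Q[s, a] = R[s, a] + gamma * sum_s' T[s, a, s'] * V[s']
        Q = R + gamma * (T @ V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return Q.argmax(axis=1), V_new  # optimal policy and V*
        V = V_new
```

The returned value function corresponds to the $V^*$ used below to compare states.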

Findings.

Figure 3 depicts the transition function under $\pi^*$. It is evident that people tend to move to better states or stay in the best state (blue lines). By better states we mean states with a higher $V^*$. In fact, for each state, there is a probability of at least $\frac{1}{|S|}$ that a person moves to a better state. And once people have reached the best state, which is state 111, there is a high probability of 0.8 that they stay there. However, there are some red lines in Figure 3 as well. These lines show that people sometimes move to worse states or stay in the worst state after being persuaded based on $\pi^*$. This happens especially for states with a lower $V^*$ such as states 000 and 010. For both of these states, there is also a relatively high probability of staying in them. For example, there is a probability of 0.41 that people stay in state 000 once there. Yet, people can also move from states with a relatively high $V^*$ to states with a low $V^*$. For state 011, for instance, there is a probability of 0.22 that people move to the lower-value state 010.

Refer to caption
Figure 3. Transition probabilities under $\pi^*$. Only transitions with a probability of at least $\frac{1}{|S|}$ are shown. We distinguish transitions to a state with a higher or the highest $V^*$ (blue), a lower or the lowest $V^*$ (red), and the same $V^*$ (black). A thicker line denotes a higher probability.
\Description

Network graph that shows the transition probabilities under the optimal policy.

Besides the short-term effects of following $\pi^*$, we are also interested in the long-term effects of multiple persuasive attempts. The results of simulating transitions under $\pi^*$ for up to 20 time steps are shown in Figure 4. It is evident that compared to the initial state distribution with an equal number of people in each state, more people are in state 111 and fewer people are in all other states after 20 time steps. Given that state 111 is the state with the highest value, people thus tend to move to the best state. In fact, 62.61% of people are in state 111 after 20 time steps. However, some people always remain in the states with lower values. For example, 6.63% of people are in state 000, the state with the lowest value, after 20 time steps.
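This simulation amounts to repeatedly pushing the state distribution through the transition matrix induced by $\pi^*$ (illustrative code; the actual transition function and policy come from our data):

```python
import numpy as np

def simulate_distribution(T, policy, n_steps=20):
    """Propagate a uniform initial state distribution through the
    transition function under a fixed deterministic policy for n_steps.
    T: (n_states, n_actions, n_states); policy: action index per state."""
    n_states = T.shape[0]
    dist = np.full(n_states, 1.0 / n_states)  # equal number of people per state
    # Transition matrix induced by the policy: P[s, s'] = T[s, policy[s], s']
    P = T[np.arange(n_states), policy, :]
    for _ in range(n_steps):
        dist = dist @ P
    return dist
```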

Answer to Q3.

While persuading people optimally multiple times allows most of them to move to and stay in the state in which they are expected to spend the most effort on their activities, a few people remain in the state in which they are expected to spend the least effort on their activities.

Refer to caption
Figure 4. Percentage of people in each state after following $\pi^*$ for varying numbers of time steps.
\Description

Bar chart that shows the percentage of people in each state after following the optimal policy for varying numbers of time steps.

Q4: How do optimal and sub-optimal persuasive attempts compare in their effect on behavior?

Setup.

Once we are able to predict states, we would like to choose effective sequences of persuasive strategies. Yet, it is not clear how much the choice of persuasive strategy matters when it comes to the effort people spend on their activities over time. We calculated the mean reward per transition over time when following 1) the optimal policy $\pi^*$, 2) the worst policy $\pi^-$, and 3) the average policy $\pi^\sim$. $\pi^\sim$ is a theoretical policy for comparison purposes in which each action is taken $\frac{1}{|A|}$ of the time for each person at each time step, where $|A|$ is the number of actions. We considered two initial state distributions, namely, the distributions across states in the first session of our study based on a) all people and b) only those people whose first reward was below the 25th percentile of all first rewards. Distributions are taken from our study’s first session to represent a general population of people who have never been persuaded to do preparatory activities. We further look specifically at people who initially spend very little effort on their activities when persuaded randomly as at the start of our study, because it is more beneficial to coach people who are not yet performing well.
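To illustrate, the mean reward per transition over time can be computed from the estimated model as follows (a sketch under the assumption of tabular `T` and `R`; the policy is passed as action probabilities per state so that the average policy, which assigns probability $\frac{1}{|A|}$ to each action, fits the same interface):

```python
import numpy as np

def mean_reward_over_time(T, R, policy_fn, init_dist, n_steps):
    """Mean reward per transition at each time step under a (stochastic) policy.
    T: (n_states, n_actions, n_states) transitions; R: (n_states, n_actions) rewards.
    policy_fn(s) returns the action probabilities for state s."""
    n_states = T.shape[0]
    A = np.array([policy_fn(s) for s in range(n_states)])  # (S, A) action probs
    P = np.einsum('sa,sap->sp', A, T)  # state-to-state transitions under the policy
    dist, means = init_dist.copy(), []
    for _ in range(n_steps):
        means.append(float(np.sum(dist[:, None] * A * R)))  # expected reward this step
        dist = dist @ P
    return means
```

Running this with $\pi^*$, $\pi^-$, and $\pi^\sim$ from the same initial distribution yields the curves compared in Figure 5.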

Findings.

The mean reward for $\pi^*$ is highest at all time steps and increases over time for an initial state distribution based on all people (Figure 5). After 100 time steps, the mean reward per transition is 0.17 and therewith above the 50th percentile of rewards for the first session, which is 0.13. This means that the mean reward is higher than the actual mean reward we observed in session 1. In contrast, the mean reward drops for the other two policies and is only 0.02 for $\pi^\sim$ and -0.13 for $\pi^-$ after 100 time steps. The former falls between the 40th and 50th percentile of rewards for the first session and the latter between the 30th and 40th percentile. Hence, the difference in mean reward between the three policies increases over time. We also observe this for the initial state distribution of only those people with low rewards in the first session. For example, the difference between $\pi^*$ and $\pi^\sim$ increases from 0.08 to 0.15 and thus almost doubles.

Answer to Q4.

These findings show that it matters, both for people overall and for people who are not performing well initially, how we persuade them to do preparatory activities for quitting smoking. Choosing how to persuade people based on an optimal RL-policy thus performs better than doing so based on a worst or an average RL-policy.

Refer to caption
Figure 5. Mean reward per transition over time for three policies. The initial populations are the state distribution of all people (solid line) or of only the people with a reward below the 25th percentile (dashed line) for the first session.
\Description

Line graph that shows the mean reward per transition over time for three policies. The initial populations are the state distribution of all people (solid line) or of only the people with a reward below the 25th percentile (dashed line) for the first session.

Q5: How does predicting behavior based on user characteristics compare to doing so based on states?

Setup.

An alternative to using states to predict behavior is using user characteristics. This alternative has the advantage that data on such characteristics do not need to be collected before each persuasive attempt. To compare the use of user characteristics to that of states, we selected three user characteristics in a similar fashion to the three state features. More precisely, we first turned the user characteristics into binary variables based on whether their value was greater than or equal to the mean (1) or less than the mean (0). Then we iteratively selected the variable with the largest difference in reward between the variable being 0 and being 1. This is because when the reward is very similar for both values of a variable, considering the value of the variable does not improve the reward prediction very much. We considered two different sets of candidate variables. First, we considered only the pre-characteristics and thus data that we can collect from people without having to provide any information about the activities (i.e., we excluded people’s involvement in their activities). Second, we considered all characteristics (i.e., we also included involvement). The selected characteristics in the first case were the Transtheoretical Model (TTM)-stage for becoming physically active, conscientiousness, and smoking status; the ones in the second case were involvement, physical activity identity, and smoking status. For each case, we created a user characteristic state space of size $2^3 = 8$ analogously to the case of state features. Based on these state spaces, we computed the mean $L_1$-error for predicting the reward using leave-one-out cross-validation. Our baselines were predicting the reward based on 1) the overall mean reward, 2) the mean reward per action, and 3) the mean reward per action and state.
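A simplified sketch of this selection procedure (hypothetical code; we select greedily by the marginal difference in mean reward between the two values of each binarized variable, which approximates the iterative procedure described above):

```python
import numpy as np

def binarize(X):
    """Turn each user characteristic (column) into 1 (>= mean) or 0 (< mean)."""
    return (X >= X.mean(axis=0)).astype(int)

def select_features(Xb, rewards, k=3):
    """Greedily pick the k binary variables with the largest absolute
    difference in mean reward between their two values."""
    selected, remaining = [], list(range(Xb.shape[1]))
    for _ in range(k):
        diffs = {j: abs(rewards[Xb[:, j] == 1].mean() - rewards[Xb[:, j] == 0].mean())
                 for j in remaining
                 if (Xb[:, j] == 1).any() and (Xb[:, j] == 0).any()}
        best = max(diffs, key=diffs.get)
        selected.append(best)
        remaining.remove(best)
    return selected
```

The three selected variables then span an $2^3 = 8$-cell characteristic space in which mean rewards per action are estimated, analogously to the state space.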

Findings.

Figure 6 shows that predicting rewards based on user characteristics in addition to actions outperforms predicting the overall mean reward. Of the two ways of predicting rewards based on user characteristics, the one that includes people’s involvement in their assigned activities leads to a lower $L_1$-error. More precisely, the mean $L_1$-error is 0.43 for user characteristics with involvement and 0.45 when excluding involvement. The two 95% CIs do not overlap, providing a credible indication that the mean $L_1$-error is lower for the former than for the latter. However, neither of the two ways of predicting rewards based on user characteristics performs better than using states, which leads to a mean $L_1$-error of 0.41. While the 95% CI for predicting rewards based on states overlaps with the one for predicting rewards based on user characteristics including involvement, it does not overlap with the one for using only user pre-characteristics.

Answer to Q5.

These results provide a credible indication that using states allows us to better predict the effort people spend on their activities than using only user characteristics that we can collect data on without having to tell people about the activities. If we include people’s involvement in the activities as a user characteristic, however, there is no longer a credible indication that using states outperforms using user characteristics.

Refer to caption
Figure 6. Mean $L_1$-error for predicting the reward with 95% CIs when considering different components for the reward prediction. None denotes that we predicted the reward based on the overall mean reward. Abbreviations: UPC, User pre-characteristic; UC, User characteristic; Inv., Involvement.
\Description

Bar chart that shows the mean L1-error for predicting the reward with 95% credible intervals when considering different components for the reward prediction.

Q6: How does incorporating users’ similarity based on characteristics, besides the consideration of states, improve the prediction of behavior?

Setup.

While user characteristics alone may not help to predict behavior compared to states, they may do so in combination with states: people with different characteristics may respond differently to a persuasive attempt in a certain state. We thus examine the effect of incorporating people’s similarity, based on user characteristics, on our ability to predict the effort people spend on their activities. We do so by weighting observed samples differently for each persuadee, giving a larger weight to samples from people more similar to the persuadee. Using different user characteristics and weighting parameters, we tried a total of 68 configurations for weighting samples based on similarity (see Appendix). We here report the results for the configuration with the lowest mean $L_1$-error based on leave-one-out cross-validation. This best configuration used people’s involvement in their activities to measure similarity.
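The sample-weighting idea can be sketched as follows (hypothetical code: the actual similarity measures and weighting parameters are among the 68 configurations in the Appendix; here we assume a simple match-fraction similarity over binary profiles and a single down-weight for dissimilar people):

```python
import numpy as np

def weighted_reward_estimate(samples, target_profile, weight_low=0.25):
    """Estimate the reward per (state, action) pair for one persuadee by
    weighting other people's samples by similarity of user characteristics.
    Each sample: (state, action, reward, profile); profiles are binary vectors.
    weight_low is an assumed down-weight for dissimilar people."""
    totals, weights = {}, {}
    for state, action, reward, profile in samples:
        # Simple similarity: fraction of matching characteristic values.
        sim = np.mean(np.array(profile) == np.array(target_profile))
        w = 1.0 if sim >= 0.5 else weight_low
        key = (state, action)
        totals[key] = totals.get(key, 0.0) + w * reward
        weights[key] = weights.get(key, 0.0) + w
    return {key: totals[key] / weights[key] for key in totals}
```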

Findings.

Even though the mean $L_1$-error is lower when incorporating users’ similarity than for the original approach without similarity, the 95% CIs overlap (see the two rightmost bars in Figure 6).

Answer to Q6.

Incorporating users’ similarity besides the consideration of states appears to offer some improvement, but there is no credible indication that it allows us to better predict the effort users spend on preparatory activities after persuasive attempts.

5. Discussion and Conclusion

The presented study examined the use of states and user characteristics to predict the effort people spend on preparatory activities for quitting smoking after being persuaded by a virtual coach. States were based on the COM-B model and captured people’s capability, opportunity, and motivation to do an activity. Our findings suggest that states derived from the COM-B model help to predict behavior: the effort people spend on their activities clearly differs between states (Q1). In addition, considering states also helps to predict next states (Q2). This may aid in choosing persuasive strategies that move people to future states in which they are more likely to be successfully persuaded again to spend a lot of effort on their activities. With regard to long-term effects, we find based on simulations that people tend to move to better states or stay in the best state when they are persuaded optimally (Q3). By good states we mean states in which people are expected to spend a high amount of effort over time when persuaded optimally. However, some people always remain in states in which little effort tends to be spent on activities. Our simulation further shows that it matters how we persuade people (Q4). More precisely, people tend to spend more effort on activities if they are persuaded optimally based on an RL-algorithm compared to being persuaded based on the worst or an average persuasive strategy. The difference in mean effort per persuasive attempt between the three strategies increases as more persuasive attempts are made before ultimately plateauing.

Using user characteristics to predict behavior did not perform as well in this study. Compared to using states, we observed worse results when using user pre-characteristics alone (Q5). This is the case even though we performed experiments with 31 pre-characteristics that capture a wide range of information about demographics, smoking and physical activity, personality, and need for cognition. Additionally considering users’ overall involvement in their activities led to slightly better predictions than considering pre-characteristics alone, but the predictions were still not better than for states. In line with findings by Kaptein (2018) in the context of persuasive marketing messages, this suggests that predictions of behavior improve if the predictors are conceptually closer to the behavior. While pre-characteristics such as quitter self-identity may say something about the effort a person is willing to spend to prepare to quit smoking, the person’s involvement in such activities is conceptually closer. And states derived from the COM-B model are even closer: they specify theoretically grounded predictors of behavior before each activity. Notably, we find that considering user characteristics in addition to states does offer some benefit (Q6). But even here, characteristics that are conceptually closer to the behavior we want to predict are most useful, with involvement performing best. However, it may not always be clear how to measure such conceptually closer characteristics. Involvement in our study was, for example, only measured after the persuasive attempts and could thus not inform the selection of persuasive attempts. Asking people to rate prototypes of activities in advance may be a way to address this. As involvement can change, it could also be measured in each session.

Limitations and directions for future work.

The main limitation of our work is the data it is based on. While we did gather data from human subjects, we did not assess the effects of our approaches on the actual behavior or states of these humans. Instead, we performed leave-one-out cross-validation and simulations. The primary reason is that this allowed us to test a large number of approaches while staying within a reasonable budget. The best-performing approaches can then be tested in the wild in the future. When doing so, however, several additional factors may need to be addressed. This is because all of our approaches assume the transitions between states and the effort people spend on their activities to be stationary. Stationary here means that the transition probabilities and the mean effort people spend for combinations of states and actions do not change. But intuitively, such changes may occur. For instance, repeatedly sending the same persuasive strategy may make it less effective Thomas et al. (2017), but could also help to strengthen the link between cue and response for action planning Schwerdtfeger et al. (2012) or to scrutinize arguments objectively Cacioppo and Petty (1985). One approach to address the effects of such repetitions is the work by Mintz et al. (2020) on non-stationary bandits. Moreover, once people move beyond preparatory activities and start to actually change their behavior, habits may form after several weeks Gardner and Rebar (2019). Such habits may reduce the cognitive effort and awareness required to do a behavior Gardner and Rebar (2019). One could address this by including information on habits in the state description (e.g., Zhang et al. (2022)).

A more general limitation of our work is the way we defined our problem. First, our state description is based on the COM-B self-evaluation questionnaire and only a subset of the questions therein. While this is a good starting point as our results show, other features, potentially derived from other theories, could be useful. For example, physical capability may play a role when people are to be persuaded to do more complex tasks such as going for a run. Importantly, however, not all people may be willing to answer many questions in each session. So it may be beneficial to either limit the number of questions for all people, or to give people the option to answer additional questions for more precise tailoring (e.g., Hors-Fraile et al. (2019)). Second, our results are based on five widely used persuasive strategies that we deemed to be applicable in our context. Given the large number of other strategies, it is possible that user characteristics play a more important role in explaining the effectiveness of other strategies. Notably, however, there is also ample literature suggesting the importance of user characteristics for the persuasive strategies we used (e.g., Oyibo and Vassileva (2017); Thomas et al. (2017); Zalake et al. (2021)). Third, we measured people’s response to persuasive attempts based on the self-reported effort they spent on their activities. It would be interesting to see whether our findings also hold when a more objective measure of behavior is used. Lastly, another interesting direction to improve our model is to use Bayesian RL. This allows one to incorporate prior information about the dynamics in a flexible manner as well as to consider the uncertainty in the learned parameters when making decisions Ross and Pineau (2008); Ghavamzadeh et al. (2015). For example, one can model relations between state features using a dynamic Bayesian network Ross and Pineau (2008). 
This may be useful, as behavior models such as COM-B specify relations between predictors of behavior.

Conclusion.

We want to make informed decisions on which components to use in persuasion algorithms for eHealth applications for behavior change that are effective as well as more cost-effective and user-friendly by reducing the amount of required human data. Therefore, a better understanding of the components’ individual effects on predicting behavior after persuasive attempts is welcome. We have thus compared the use of states and user characteristics, and a combination thereof, in predicting behavior after persuasive attempts in the context of preparing for quitting smoking with a virtual coach. Our results lend support to the idea of considering states and the user characteristic “involvement” in persuasion algorithms for behavior change. Research on smoking cessation can directly build on these insights and examine the use of these components in a full application. Moreover, both components seem to be domain-independent measures that could also be used in eHealth applications for other behaviors.

{acks}

This work is part of the multidisciplinary research project Perfect Fit, which is supported by several funders organized by the Netherlands Organization for Scientific Research (NWO), program Commit2Data - Big Data & Health (project number 628.011.211). Besides NWO, the funders include the Netherlands Organisation for Health Research and Development (ZonMw), Hartstichting, the Ministry of Health, Welfare and Sport (VWS), Health Holland, and the Netherlands eScience Center. The authors acknowledge the help they received from Eline Meijer in formulating the preparatory activities and from Mitchell Kesteloo in hosting the virtual coach on a server. The authors further thank the three anonymous reviewers for their helpful suggestions.

References

  • Ajzen (1991) Icek Ajzen. 1991. The theory of planned behavior. Organizational Behavior and Human Decision Processes 50, 2 (1991), 179–211.
  • Albers (2022) Nele Albers. 2022. Reinforcement Learning-based Persuasion for a Conversational Agent to Support Behavior Change: Code. Delft University of Technology. https://doi.org/10.5281/zenodo.6319356
  • Albers and Brinkman (2021) Nele Albers and Willem-Paul Brinkman. 2021. Perfect Fit - Experiment to Gather Data for and Test a Reinforcement Learning-Approach for Motivating People. https://doi.org/10.17605/OSF.IO/K2UAC
  • Albers et al. (2022) Nele Albers, Mark A Neerincx, and Willem-Paul Brinkman. 2022. Addressing people’s current and future states in a reinforcement learning algorithm for persuading to quit smoking and to be physically active. PLOS ONE 17, 12 (2022), e0277295.
  • Albers et al. (2023) Nele Albers, Mark A. Neerincx, and Willem-Paul Brinkman. 2023. Persuading to Prepare for Quitting Smoking with a Virtual Coach: Using States and User Characteristics to Predict Behavior - Data, Analysis Code and Appendix. https://doi.org/10.4121/22153898.v1
  • Alkış and Temizel (2015) Nurcan Alkış and Tuğba Taşkaya Temizel. 2015. The impact of individual differences on influence strategies. Personality and Individual Differences 87 (2015), 147–152.
  • Alslaity and Tran (2020) Alaa Alslaity and Thomas Tran. 2020. On the Impact of the Application Domain on Users’ Susceptibility to the Six Weapons of Influence. In International Conference on Persuasive Technology. Springer, Cham, Switzerland, 3–15.
  • Amabile and Kramer (2011) Teresa Amabile and Steven Kramer. 2011. The progress principle: Using small wins to ignite joy, engagement, and creativity at work. Harvard Business Review Press, Boston, USA.
  • American Psychological Association (2023) American Psychological Association. 2023. APA Dictionary of Psychology. https://dictionary.apa.org/state Accessed on 09.02.2023.
  • Bertolotti et al. (2019) Mauro Bertolotti, V. Carfora, and P. Catellani. 2019. Different Frames to Reduce Red Meat Intake: The Moderating Role of Self-Efficacy. Health Communication 35 (2019), 475 – 482.
  • Bickmore et al. (2005) Timothy W Bickmore, Lisa Caruso, Kerri Clough-Gorr, and Tim Heeren. 2005. ‘It’s just like you talk to a friend’relational agents for older adults. Interacting with Computers 17, 6 (2005), 711–735.
  • Bless et al. (1990) H. Bless, G. Bohner, Norbert Schwarz, and F. Strack. 1990. Mood and Persuasion. Personality and Social Psychology Bulletin 16 (1990), 331 – 345.
  • Brinkman (2011) Willem-Paul Brinkman. 2011. Guest editorial - Cognitive Engineering in mental health computing. Journal of CyberTherapy & Rehabilitation (JCR) 4, 1 (2011), 9–13.
  • Cacioppo and Petty (1985) John T Cacioppo and Richard E Petty. 1985. Central and peripheral routes to persuasion: The role of message repetition. In Psychological Processes and Advertising Effects. Lawrence Erlbaum Associates, Hillsdale, NJ, USA, 91–111.
  • Carfora et al. (2019) Valentina Carfora, Mauro Bertolotti, and Patrizia Catellani. 2019. Informational and emotional daily messages to reduce red and processed meat consumption. Appetite 141 (2019), 104331.
  • Carfora et al. (2020) Valentina Carfora, Francesca Di Massimo, Rebecca Rastelli, Patrizia Catellani, and Marco Piastra. 2020. Dialogue management in conversational agents through psychology of persuasion and machine learning. Multimedia Tools and Applications 79, 47 (2020), 35949–35971.
  • Catellani et al. (2021) Patrizia Catellani, Valentina Carfora, and Marco Piastra. 2021. Connecting Social Psychology and Deep Reinforcement Learning: A Probabilistic Predictor on the Intention to Do Home-Based Physical Activity After Message Exposure. Frontiers in Psychology 12 (2021), 2812.
  • Cesario et al. (2008) Joseph Cesario, E Tory Higgins, and Abigail A Scholer. 2008. Regulatory fit and persuasion: Basic principles and remaining questions. Social and Personality Psychology Compass 2, 1 (2008), 444–463.
  • Chapman and Kaelbling (1991) David Chapman and Leslie Pack Kaelbling. 1991. Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons. In Proceedings of the 12th International Joint Conference on Artificial Intelligence (Sydney, New South Wales, Australia), John Mylopoulos and Raymond Reiter (Eds.). Morgan Kaufmann, San Francisco, CA, USA, 726–731.
  • Chapman et al. (2009) Janine Chapman, Christopher J Armitage, and Paul Norman. 2009. Comparing implementation intention interventions in relation to young adults’ intake of fruit and vegetables. Psychology & Health 24, 3 (2009), 317–332.
  • Cialdini (2006) Robert B Cialdini. 2006. Influence: the psychology of persuasion, revised edition. Harper Business, New York, USA.
  • Consolvo et al. (2009) Sunny Consolvo, David W McDonald, and James A Landay. 2009. Theory-driven design strategies for technologies that support behavior change in everyday life. In Proceedings of the 27th International Conference on Human Factors in Computing Systems, Dan R. Olsen Jr., Richard B. Arthur, Ken Hinckley, Meredith Ringel Morris, Scott E. Hudson, and Saul Greenberg (Eds.). ACM, New York, NY, USA, 405–414. https://doi.org/10.1145/1518701.1518766
  • de Vries (2018) Roelof Anne Jelle de Vries. 2018. Theory-Based and Tailor-Made: Motivational Messages for Behavior Change Technology. Ph.D. Dissertation. University of Twente, Netherlands.
  • DiClemente et al. (1991) Carlo C DiClemente, James O Prochaska, Scott K Fairhurst, Wayne F Velicer, Mary M Velasquez, and Joseph S Rossi. 1991. The process of smoking cessation: an analysis of precontemplation, contemplation, and preparation stages of change. Journal of Consulting and Clinical Psychology 59, 2 (1991), 295.
  • Fadhil and Gabrielli (2017) Ahmed Fadhil and Silvia Gabrielli. 2017. Addressing challenges in promoting healthy lifestyles: the al-chatbot approach. In Proceedings of the 11th EAI International Conference on Pervasive Computing Technologies for Healthcare. ACM, New York, NY, USA, 261–265.
  • Fogg (2002) Brian J Fogg. 2002. Persuasive technology: using computers to change what we think and do. Morgan Kaufmann, USA.
  • Free et al. (2009) Caroline Free, R Whittaker, R Knight, T Abramsky, A Rodgers, and IG Roberts. 2009. Txt2stop: a pilot randomised controlled trial of mobile phone-based smoking cessation support. Tobacco Control 18, 2 (2009), 88–91.
  • Gardner and Rebar (2019) Benjamin Gardner and Amanda L. Rebar. 2019. Habit Formation and Behavior Change. In Oxford Research Encyclopedia of Psychology. Oxford University Press, Oxford. https://doi.org/10.1093/acrefore/9780190236557.013.129
  • Ghavamzadeh et al. (2015) Mohammad Ghavamzadeh, Shie Mannor, Joelle Pineau, Aviv Tamar, et al. 2015. Bayesian reinforcement learning: A survey. Foundations and Trends® in Machine Learning 8, 5-6 (2015), 359–483.
  • Gordon et al. (2016) Goren Gordon, Samuel Spaulding, Jacqueline Kory Westlund, Jin Joo Lee, Luke Plummer, Marayna Martinez, Madhurima Das, and Cynthia Breazeal. 2016. Affective personalization of a social robot tutor for children’s second language skills. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (Phoenix, Arizona). AAAI Press, USA, 3951–3957.
  • Haasova et al. (2013) Marcela Haasova, Fiona C Warren, Michael Ussher, Kate Janse Van Rensburg, Guy Faulkner, Mark Cropley, James Byron-Daniel, Emma S Everson-Hock, Hwajung Oh, and Adrian H Taylor. 2013. The acute effects of physical activity on cigarette cravings: systematic review and meta-analysis with individual participant data. Addiction 108, 1 (2013), 26–37.
  • Hagger and Luszczynska (2014) Martin S Hagger and Aleksandra Luszczynska. 2014. Implementation intention and action planning interventions in health contexts: State of the research and proposals for the way forward. Applied Psychology: Health and Well-Being 6, 1 (2014), 1–47.
  • Halko and Kientz (2010) Sajanee Halko and Julie A Kientz. 2010. Personality and persuasive technology: an exploratory study on health-promoting mobile applications. In International Conference on Persuasive Technology. Springer, Berlin, Heidelberg, 150–161.
  • Henkemans et al. (2009) O. B. Blanson Henkemans, P. V. D. van der Boog, J. Lindenberg, C. van der Mast, M. A. Neerincx, and Bertie Zwetsloot-Schonk. 2009. An online lifestyle diary with a persuasive computer assistant providing feedback on self-management. Technology and Health Care 17 3 (2009), 253–67.
  • Hoekstra et al. (2014) Rink Hoekstra, Richard D Morey, Jeffrey N Rouder, and Eric-Jan Wagenmakers. 2014. Robust misinterpretation of confidence intervals. Psychonomic Bulletin & Review 21 (2014), 1157–1164.
  • Hors-Fraile et al. (2019) Santiago Hors-Fraile, Shwetambara Malwade, Francisco Luna-Perejon, Claudio Amaya, Antón Civit, Francine Schneider, Panagiotis Bamidis, Shabbir Syed-Abdul, Yu-Chuan Li, and Hein De Vries. 2019. Opening the Black Box: Explaining the Process of Basing a Health Recommender System on the I-Change Behavioral Change Model. IEEE Access 7 (2019), 176525–176540.
  • Hutchinson and Tenenbaum (2006) Jasmin C Hutchinson and Gershon Tenenbaum. 2006. Perceived effort—Can it be considered gestalt? Psychology of Sport and Exercise 7, 5 (2006), 463–476.
  • Kaptein (2018) Maurits Kaptein. 2018. Customizing persuasive messages; the value of operative measures. Journal of Consumer Marketing 35, 2 (2018), 208–217.
  • Kaptein and Eckles (2012) Maurits Kaptein and Dean Eckles. 2012. Heterogeneity in the effects of online persuasion. Journal of Interactive Marketing 26, 3 (2012), 176–188.
  • Kaptein et al. (2015) Maurits Kaptein, Panos Markopoulos, Boris De Ruyter, and Emile Aarts. 2015. Personalizing persuasive technologies: Explicit and implicit personalization using persuasion profiles. International Journal of Human-Computer Studies 77 (2015), 38–51.
  • Kavanagh and Bower (1985) David J Kavanagh and Gordon H Bower. 1985. Mood and self-efficacy: Impact of joy and sadness on perceived capabilities. Cognitive Therapy and Research 9, 5 (1985), 507–525.
  • Klein et al. (2013) Michel Klein, Nataliya Mogles, and Arlette van Wissen. 2013. An intelligent coaching system for therapy adherence. IEEE Pervasive Computing 12, 3 (2013), 22–30.
  • Ly et al. (2017) Kien Hoa Ly, Ann-Marie Ly, and Gerhard Andersson. 2017. A fully automated conversational agent for promoting mental well-being: a pilot RCT using mixed methods. Internet Interventions 10 (2017), 39–46.
  • Maheswaran and Meyers-Levy (1990) Durairaj Maheswaran and Joan Meyers-Levy. 1990. The influence of message framing and issue involvement. Journal of Marketing Research 27, 3 (1990), 361–367.
  • Meijer et al. (2018) Eline Meijer, Winifred A Gebhardt, Colette van Laar, Bas van den Putte, and Andrea WM Evers. 2018. Strengthening quitter self-identity: An experimental study. Psychology & Health 33, 10 (2018), 1229–1250.
  • Meijer et al. (2021) Eline Meijer, Janneke S Korst, Kristiene G Oosting, Eline Heemskerk, Sander Hermsen, Marc C Willemsen, Bas van den Putte, Niels H Chavannes, and Jamie Brown. 2021. “At least someone thinks I’m doing well”: a real-world evaluation of the quit-smoking app StopCoach for lower socio-economic status smokers. Addiction Science & Clinical Practice 16, 1 (2021), 1–14.
  • Michie et al. (2014) Susan Michie, Lou Atkins, and Robert West. 2014. The behaviour change wheel: A guide to designing interventions. Silverback Publishing, Great Britain.
  • Michie et al. (2012) Susan Michie, Jamie Brown, Adam WA Geraghty, Sascha Miller, Lucy Yardley, Benjamin Gardner, Lion Shahab, Andy McEwen, John A Stapleton, and Robert West. 2012. Development of StopAdvisor: a theory-based interactive internet-based smoking cessation intervention. Translational Behavioral Medicine 2, 3 (2012), 263–275.
  • Michie et al. (2011) Susan Michie, Maartje M Van Stralen, and Robert West. 2011. The behaviour change wheel: a new method for characterising and designing behaviour change interventions. Implementation Science 6, 1 (2011), 1–12.
  • Mintz et al. (2020) Yonatan Mintz, Anil Aswani, Philip Kaminsky, Elena Flowers, and Yoshimi Fukuoka. 2020. Nonstationary bandits with habituation and recovery dynamics. Operations Research 68, 5 (2020), 1493–1516.
  • Muhammad Abdullahi et al. (2018) Aisha Muhammad Abdullahi, Rita Orji, and Kiemute Oyibo. 2018. Personalizing persuasive technologies: Do gender and age affect susceptibility to persuasive strategies?. In Adjunct Publication of the 26th Conference on User Modeling, Adaptation and Personalization. ACM, New York, NY, USA, 329–334.
  • Oinas-Kukkonen and Harjumaa (2008) Harri Oinas-Kukkonen and Marja Harjumaa. 2008. A systematic framework for designing and evaluating persuasive systems. In International Conference on Persuasive Technology. Springer, Berlin, Heidelberg, 164–176.
  • Oinas-Kukkonen and Harjumaa (2009) Harri Oinas-Kukkonen and Marja Harjumaa. 2009. Persuasive systems design: Key issues, process model, and system features. Communications of the Association for Information Systems 24, 1 (2009), 28. https://doi.org/10.17705/1cais.02428
  • Oliphant (2006) Travis E Oliphant. 2006. A Bayesian perspective on estimating mean, variance, and standard-deviation from data.
  • Oyibo et al. (2018) Kiemute Oyibo, Ifeoma Adaji, Rita Orji, Babatunde Olabenjo, and Julita Vassileva. 2018. Susceptibility to persuasive strategies: a comparative analysis of Nigerians vs. Canadians. In Proceedings of the 26th Conference on User Modeling, Adaptation and Personalization. ACM, New York, NY, USA, 229–238.
  • Oyibo and Vassileva (2017) Kiemute Oyibo and Julita Vassileva. 2017. Effects of Personality on Cialdini’s Persuasive Strategies. In Adjunct Proceedings of the 12th International Conference on Persuasive Technology. Centre for eHealth & Wellbeing Research, Department of Psychology, Health and Technology, University of Twente, The Netherlands, 54–55.
  • Penfornis et al. (2023) Kristell M Penfornis, Winifred A Gebhardt, Ralph CA Rippe, Colette Van Laar, Bas van den Putte, and Eline Meijer. 2023. My future-self has (not) quit smoking: An experimental study into the effect of a future-self intervention on smoking-related self-identity constructs. Social Science & Medicine 320 (2023), 115667. https://doi.org/10.1016/j.socscimed.2023.115667
  • Petty and Cacioppo (1986) Richard E Petty and John T Cacioppo. 1986. The elaboration likelihood model of persuasion. In Communication and Persuasion. Springer, New York, NY, 1–24.
  • Pommeranz et al. (2012) Alina Pommeranz, Joost Broekens, Pascal Wiggers, Willem-Paul Brinkman, and Catholijn M Jonker. 2012. Designing interfaces for explicit preference elicitation: a user-centered investigation of preference representation and elicitation process. User Modeling and User-Adapted Interaction 22, 4 (2012), 357–397.
  • Ross and Pineau (2008) Stéphane Ross and Joelle Pineau. 2008. Model-based Bayesian reinforcement learning in large structured domains. In Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence, Vol. 2008. AUAI Press, Helsinki, Finland, 476–483.
  • Schwerdtfeger et al. (2012) Andreas Richard Schwerdtfeger, Catalina Schmitz, and Matthias Warken. 2012. Using text messages to bridge the intention-behavior gap? A pilot study on the use of text message reminders to increase objectively assessed physical activity in daily life. Frontiers in Psychology 3 (2012), 270.
  • Sniehotta et al. (2005a) Falko F Sniehotta, Urte Scholz, Ralf Schwarzer, Bärbel Fuhrmann, Ulrich Kiwus, and Heinz Völler. 2005a. Long-term effects of two psychological interventions on physical exercise and self-regulation following coronary rehabilitation. International Journal of Behavioral Medicine 12, 4 (2005), 244–255.
  • Sniehotta et al. (2005b) Falko F Sniehotta, Ralf Schwarzer, Urte Scholz, and Benjamin Schüz. 2005b. Action planning and coping planning for long-term lifestyle change: theory and assessment. European Journal of Social Psychology 35, 4 (2005), 565–576.
  • Steward et al. (2003) Wayne T Steward, Tamera R Schneider, Judith Pizarro, and Peter Salovey. 2003. Need for Cognition Moderates Responses to Framed Smoking-Cessation Messages. Journal of Applied Social Psychology 33, 12 (2003), 2439–2464.
  • Strecher et al. (1986) Victor J Strecher, Brenda McEvoy DeVellis, Marshall H Becker, and Irwin M Rosenstock. 1986. The role of self-efficacy in achieving health behavior change. Health Education Quarterly 13, 1 (1986), 73–92.
  • Thomas et al. (2017) Rosemary Josekutty Thomas, Judith Masthoff, and Nir Oren. 2017. Adapting healthy eating messages to personality. In International Conference on Persuasive Technology. Springer, Cham, Switzerland, 119–132.
  • Trimbos Instituut (2016) Trimbos Instituut. 2016. Richtlijn Behandeling van tabaksverslaving en stoppen met roken ondersteuning: Herziening 2016 [Guideline for the treatment of tobacco addiction and smoking cessation support: Revision 2016].
  • Venkatesh et al. (2012) Viswanath Venkatesh, James YL Thong, and Xin Xu. 2012. Consumer acceptance and use of information technology: extending the unified theory of acceptance and use of technology. MIS Quarterly 36, 1 (2012), 157–178.
  • Vidrine et al. (2012) Damon J Vidrine, Faith E Fletcher, Heather E Danysh, Salma Marani, Jennifer Irvin Vidrine, Scott B Cantor, and Alexander V Prokhorov. 2012. A randomized controlled trial to assess the efficacy of an interactive mobile messaging intervention for underserved smokers: Project ACTION. BMC Public Health 12, 1 (2012), 1–12.
  • Wang et al. (2019) Wenxin Wang, Céline L van Lint, Willem-Paul Brinkman, Ton JM Rövekamp, Sandra van Dijk, Paul van der Boog, and Mark A Neerincx. 2019. Guided or factual computer support for kidney patients with different experience levels and medical health situations: preferences and usage. Health and Technology 9, 3 (2019), 329–342.
  • Yang et al. (2023) Min-Jeong Yang, Steven K Sutton, Laura M Hernandez, Sarah R Jones, David W Wetter, Santosh Kumar, and Christine Vinci. 2023. A Just-In-Time Adaptive intervention (JITAI) for smoking cessation: Feasibility and acceptability findings. Addictive Behaviors 136 (2023), 107467.
  • Zalake et al. (2021) Mohan Zalake, Alexandre Gomes de Siqueira, Krishna Vaddiparti, and Benjamin Lok. 2021. The Effects of Virtual Human’s Verbal Persuasion Strategies on User Intention and Behavior. International Journal of Human-Computer Studies 156 (2021), 102708.
  • Zhang et al. (2022) Chao Zhang, Joaquin Vanschoren, Arlette van Wissen, Daniël Lakens, Boris de Ruyter, and Wijnand A IJsselsteijn. 2022. Theory-based habit modeling for enhancing behavior prediction in behavior change support systems. User Modeling and User-Adapted Interaction 32 (2022), 389–415.
  • Zhang et al. (2020) Jingwen Zhang, Yoo Jung Oh, Patrick Lange, Zhou Yu, and Yoshimi Fukuoka. 2020. Artificial intelligence Chatbot behavior change model for designing artificial intelligence Chatbots to promote physical activity and a healthy diet. Journal of Medical Internet Research 22, 9 (2020), e22845.