GlassMessaging: Supporting Messaging Needs During Daily Activities Using OST-HMDs
Abstract.
The act of communicating with others during routine daily tasks is both common and intuitive. However, the hands- and eyes-engaged nature of current digital messaging applications makes it difficult to message someone amidst such activities. We introduce GlassMessaging, a messaging application designed for Optical See-Through Head-Mounted Displays (OST-HMDs). It facilitates messaging through both voice and manual inputs, catering to situations where hands and eyes are preoccupied. GlassMessaging was iteratively developed through a formative study that identified current messaging behaviors and the challenges of messaging during common multitasking scenarios.
1. Introduction and Related Work
The proliferation of mobile devices has transformed our means of communication, making applications (henceforth referred to as apps) like WhatsApp, Telegram, Messenger, and WeChat commonplace (Curry, 2022). However, using these apps during daily tasks, such as cooking or walking, is hindered by their design, which demands extensive visual and manual interaction. Research reveals that individuals often use messaging apps while multitasking, with 13% of messages sent on the move (Battestini et al., 2010).
Given this context, we ask, “How can we refine mobile messaging for effective communication during routine multitasking?” This brings us to Optical See-Through Head-Mounted Displays (OST-HMDs, or Augmented Reality Smart Glasses) (Itoh et al., 2021), designed for hands-free usage and maintaining situational awareness (Orlosky et al., 2014; Zhao et al., 2023; Janaka et al., 2022). There remains a void in crafting interfaces tailored for OST-HMDs suited to daily multitasking: current messaging apps for OST-HMDs, such as Vuzix Blade’s WeChat (https://apps.vuzix.com/app/wechat), are primarily derivatives of mobile phone apps (a notable exception is Google Glass XE (2013-2017) (Google Glass, 2013b, a), discussed in Appendix A).
The inclination to communicate while multitasking is evident in messaging app usage (Curry, 2022), fostering closeness and support (Cho et al., 2020; Grinter and Eldridge, 2003). Mobile phones, while supporting multitasking, can be hazardous in situations needing acute awareness, such as walking (Hashish et al., 2017; Sullman et al., 2021). OST-HMDs appear promising due to their hands-free nature and enhanced situational awareness (Orlosky et al., 2014; Lucero and Vetek, 2014). Voice input stands out as a feasible hands-free technique for OST-HMDs, as other methods like head and gaze inputs might be less accurate or result in ergonomic strain (Lee and Hui, 2018; Cohen and Oviatt, 1995; Revilla et al., 2020).
Consequently, we present GlassMessaging (Janaka et al., 2023), a messaging application for OST-HMDs, iteratively designed after examining the prevailing needs, habits, challenges, and constraints users encounter while messaging and multitasking.

Figure 1. A flowchart of five steps a user can follow after receiving a notification: (1) the user utters ‘show chat’, and the chat interface appears along with the notification; (2) the user says a contact’s name, and the system navigates to that contact’s chat; (3) the user speaks a message, and the system transcribes it into the text entry box; (4) the user says ‘send’, and the system sends the message; (5) the user utters ‘hide chat’, and the full interface is hidden, restoring the complete view of the environment.
2. System
After evaluating existing mobile messaging apps (e.g., WhatsApp, Telegram) sideloaded onto OST-HMDs, we found that, while their UIs were generally intuitive, they were not optimized for OST-HMDs (Janaka et al., 2023). For example, content often obstructed the view, the color schemes were either too bright or too dark, and some elements were too small, all of which led to usability issues. To cater to hands-busy scenarios, we introduced voice dictation for text entry and voice commands for hands-free UI navigation. We also implemented ring-mouse interaction for faster and more precise scrolling and selection (Sapkota et al., 2021), while retaining mid-air gestures for their “intuitive” touch-like content manipulation paradigm.
2.1. Apparatus
We selected Microsoft HoloLens 2 (HL2), an OST-HMD with hand-tracking, voice commands, and world-scale positioning (2k resolution, 52° diagonal FoV), to develop GlassMessaging. A wireless ring mouse (Sanwa Supply 400-MA077) facilitated easy directional selection of UI elements (Figure 1). We built the app with Unity 3D and the Mixed Reality Toolkit (MRTK 2.8), leveraging MRTK’s built-in functions for mid-air gestures, voice inputs, the virtual keyboard, and content stabilization. To simulate a realistic messaging experience, we implemented a virtual chat server in Python, running on a tablet computer connected to the HL2 via Wi-Fi, with bi-directional communication over a socket connection between client and server (see https://github.com/NUS-HCILab/GlassMessaging).
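As a rough illustration of this client-server setup, the sketch below shows how the HL2-side client might exchange messages with the Python chat server over a socket. The server address and the newline-delimited, tab-separated wire format are assumptions made for illustration; the actual protocol is defined in the repository above.

```csharp
// Minimal sketch of an HL2-side socket client (illustrative only).
// Assumptions: hypothetical server address, newline-delimited UTF-8
// messages of the form "<contact>\t<text>".
using System.IO;
using System.Net.Sockets;
using System.Text;
using System.Threading;
using UnityEngine;

public class ChatClient : MonoBehaviour
{
    private TcpClient client;
    private StreamReader reader;
    private StreamWriter writer;

    void Start()
    {
        // Connect to the tablet running the Python chat server (assumed address).
        client = new TcpClient("192.168.1.10", 9000);
        NetworkStream stream = client.GetStream();
        reader = new StreamReader(stream, Encoding.UTF8);
        writer = new StreamWriter(stream, Encoding.UTF8) { AutoFlush = true };

        // Receive on a background thread so the UI stays responsive.
        new Thread(ReceiveLoop) { IsBackground = true }.Start();
    }

    // Send one outgoing message; one line per message in this sketch.
    public void Send(string contact, string text)
    {
        writer.WriteLine($"{contact}\t{text}");
    }

    private void ReceiveLoop()
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // In the real app, incoming messages would be marshalled to the
            // UI thread to update the chat and notification panels.
            Debug.Log($"Incoming: {line}");
        }
    }

    void OnDestroy()
    {
        client?.Close();
    }
}
```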
2.2. Interface Design
To enhance learnability and maintain consistency (Nielsen, 1994) with familiar interfaces, we chose to modify the UIs of existing mobile messaging apps and tailor them to OST-HMDs, rather than redeveloping them from scratch. The final interface, reached after two design iterations, is shown in Figure 1 (see (Janaka et al., 2023) for details).
2.2.1. Visual interface (output)
The visual interface of GlassMessaging (Figure 1) consists of four main UI panels, namely, notifications, contacts, chat messages, and voice/keyboard input panels. This allows users to receive notifications, select contacts, and compose/send messages using voice and keyboard input.
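GlassMessaging itself relies on MRTK’s built-in content stabilization to place these panels (Section 2.1); as a simplified stand-in, the sketch below positions three of the panels relative to the user’s line of sight, matching the placements listed in Table 2 (chat at middle-center, notifications at top-center, contacts at middle-right). The distance and offset values are illustrative assumptions, not the app’s actual parameters.

```csharp
// Simplified head-relative panel layout (illustrative; the real app uses
// MRTK's content-stabilization solvers). Offsets are assumed values.
using UnityEngine;

public class PanelLayout : MonoBehaviour
{
    public Transform chatPanel;         // middle-center (LoS)
    public Transform notificationPanel; // top-center (above LoS)
    public Transform contactPanel;      // middle-right (right of LoS)
    public float distance = 1.5f;       // meters in front of the user (assumed)

    void LateUpdate()
    {
        Transform head = Camera.main.transform;
        Vector3 center = head.position + head.forward * distance;

        chatPanel.position = center;
        notificationPanel.position = center + head.up * 0.25f;   // assumed offset
        contactPanel.position = center + head.right * 0.35f;     // assumed offset

        // Keep all panels facing the user.
        Quaternion facing = Quaternion.LookRotation(center - head.position);
        chatPanel.rotation = facing;
        notificationPanel.rotation = facing;
        contactPanel.rotation = facing;
    }
}
```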
2.2.2. Audio interface (Input-Output)
As depicted in Figure 1, users can interact with GlassMessaging via voice commands (Table 1) to navigate the UI (e.g., ‘SCROLL UP’, ‘SCROLL TO TOP’) and dictate text (using ‘VOICE MESSAGE’). Audio feedback (e.g., beeps) accompanies some input interactions.
When the app is not in dictation mode, voice commands can directly activate various functionalities, such as opening notifications (‘OPEN NOTIFICATION’), selecting contacts (‘<NAME>’), sending the message (‘SEND’), and hiding the interface (‘HIDE CHAT’). Voice shortcuts such as ‘TEXT <NAME>’ are also available, which combine ‘<NAME>’ and ‘VOICE MESSAGE’ for direct text entry. Similarly, the ‘REPLY’ command opens the notification and begins dictation for a reply immediately.
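MRTK’s voice input builds on Unity’s Windows speech APIs (KeywordRecognizer and DictationRecognizer). The sketch below, a minimal approximation rather than the app’s actual code, shows how the command vocabulary and the ‘VOICE MESSAGE’ dictation mode could be wired together; UI updates are stubbed out with Debug.Log.

```csharp
// Sketch of the two voice modes: a KeywordRecognizer for commands
// (subset of Table 1) and a DictationRecognizer for text entry.
using UnityEngine;
using UnityEngine.Windows.Speech;

public class VoiceInput : MonoBehaviour
{
    private KeywordRecognizer commands;
    private DictationRecognizer dictation;

    void Start()
    {
        commands = new KeywordRecognizer(new[]
            { "show chat", "hide chat", "open notification", "voice message", "send" });
        commands.OnPhraseRecognized += OnCommand;
        commands.Start();
    }

    private void OnCommand(PhraseRecognizedEventArgs args)
    {
        switch (args.text)
        {
            case "voice message":
                // Keyword and dictation recognizers cannot run concurrently,
                // so shut down phrase recognition before dictating.
                commands.Stop();
                PhraseRecognitionSystem.Shutdown();
                dictation = new DictationRecognizer();
                dictation.DictationResult += (text, confidence) =>
                    Debug.Log($"Transcribed: {text}"); // real app fills the text entry box
                dictation.DictationComplete += cause =>
                {
                    // Return to command mode once dictation ends.
                    dictation.Dispose();
                    PhraseRecognitionSystem.Restart();
                    commands.Start();
                };
                dictation.Start();
                break;
            case "send":
                Debug.Log("Send current message"); // real app triggers the send action
                break;
            // ... remaining commands from Table 1 are handled similarly.
        }
    }
}
```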
2.2.3. Manual-input interface (Input)
GlassMessaging supports two manual input methods: a wearable ring-mouse and mid-air hand gestures as shown in Table 1.
Ring mouse: The user can scroll through the contact list using the ring mouse’s ‘up’ and ‘down’ buttons. The ‘right’ button toggles between input modalities and selects the send button. The ‘center’ button activates the selected virtual button and serves as a long-press toggle to hide/reveal the entire interface.
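To illustrate the ‘center’ button’s dual role (short press to activate, roughly 1-second hold to toggle the interface), the sketch below implements the hold detection. It assumes the ring mouse presents itself as a standard HID mouse whose primary button maps to ‘center’, which is an assumption about the device rather than a documented mapping.

```csharp
// Sketch of short-press vs. 1-second-hold handling for the ring's
// 'center' button (assuming it maps to the primary mouse button).
using UnityEngine;

public class RingCenterButton : MonoBehaviour
{
    public float holdThreshold = 1.0f; // seconds, per Table 1
    private float pressStart;
    private bool longPressHandled;

    void Update()
    {
        if (Input.GetMouseButtonDown(0))
        {
            pressStart = Time.time;
            longPressHandled = false;
        }

        // Hold: toggle the whole interface once the threshold passes.
        if (Input.GetMouseButton(0) && !longPressHandled &&
            Time.time - pressStart >= holdThreshold)
        {
            Debug.Log("Toggle interface visibility"); // hide/reveal UI
            longPressHandled = true;
        }

        // Short press: activate the currently selected virtual button.
        if (Input.GetMouseButtonUp(0) && !longPressHandled)
        {
            Debug.Log("Activate selected button");
        }
    }
}
```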
Mid-air interaction: Users can interact with the visual interface through mid-air gestures. The contact list can be scrolled by swiping, and a contact’s chat can be opened by pressing their virtual icon. The input modality is chosen by pressing the corresponding virtual button (voice or keyboard), and pressing a notification opens the chat with its sender (a minimal handler is sketched below).
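Mid-air presses are delivered through MRTK’s pointer events. The sketch below is a minimal example rather than the app’s actual code: it handles a press on a notification object (which needs a collider) and opens the sender’s chat, with the chat-opening logic stubbed out.

```csharp
// Sketch of a mid-air press handler on a notification, using MRTK 2.8
// pointer events; requires a collider on the notification object.
using Microsoft.MixedReality.Toolkit.Input;
using UnityEngine;

public class NotificationPress : MonoBehaviour, IMixedRealityPointerHandler
{
    public string senderName; // set when the notification is created

    public void OnPointerClicked(MixedRealityPointerEventData eventData)
    {
        // Real app navigates to the sender's chat panel.
        Debug.Log($"Opening chat with {senderName}");
    }

    public void OnPointerDown(MixedRealityPointerEventData eventData) { }
    public void OnPointerUp(MixedRealityPointerEventData eventData) { }
    public void OnPointerDragged(MixedRealityPointerEventData eventData) { }
}
```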
3. Evaluation
To assess the effectiveness of GlassMessaging, we compared it to the Telegram application on mobile phones in a controlled study set in daily multitasking situations. Our findings (Janaka et al., 2023) indicate that, even with the present technological constraints of the OST-HMD platform, GlassMessaging provided enhanced voice input access and enabled smoother interactions than phones. This resulted in a 33.1% reduction in response time and a 40.3% increase in texting speed. These findings underscore the significant potential of OST-HMDs as a meaningful complement to mobile phone-based messaging in multitasking scenarios.
However, there are several challenges to overcome before fully harnessing this platform’s potential. For example, the use of GlassMessaging resulted in a 2.5% drop in texting accuracy, especially with complex texts. Moreover, current OST-HMDs have some inherent downsides (e.g., rudimentary hardware capabilities, unfamiliarity, limited interactions (technavio, 2022; Itoh et al., 2021; Lee and Hui, 2018)) when contrasted with the mature and extensively tested mobile phones currently available.
4. Conclusion and Future Work
While multitasking with messaging is a frequent real-life activity, current mobile applications and platforms fall short in providing adequate support. We pinpointed two primary situational impediments (i.e., hands-busy and eyes-busy) arising from existing mobile platforms, which drove us to iteratively develop GlassMessaging, a messaging application tailored for OST-HMDs to address these shortcomings. We envision messaging on OST-HMDs as the forthcoming communication frontier, acting as a valuable adjunct to mobile phones during multitasking and driven forward by technological progress. To realize this vision, it is essential to re-conceptualize communication interfaces that align with OST-HMD affordances and to devise strategies to overcome potential situational challenges (e.g., privacy and social concerns with voice).
Acknowledgements.
We thank the volunteers who participated in our studies. This research is supported by the National Research Foundation, Singapore, under its AI Singapore Programme (AISG Award No: AISG2-RP-2020-016). It is also supported in part by the Ministry of Education, Singapore, under its MOE Academic Research Fund Tier 2 programme (MOE-T2EP20221-0010), and by a research grant #22-5913-A0001 from the Ministry of Education of Singapore. Any opinions, findings and conclusions, or recommendations expressed in this material are those of the author(s) and do not reflect the views of the National Research Foundation or the Ministry of Education, Singapore.
References
- Battestini et al. (2010) Agathe Battestini, Vidya Setlur, and Timothy Sohn. 2010. A large scale study of text-messaging use. In Proceedings of the 12th international conference on Human computer interaction with mobile devices and services (MobileHCI ’10). Association for Computing Machinery, New York, NY, USA, 229–238. https://doi.org/10.1145/1851600.1851638
- Cho et al. (2020) Hyunsung Cho, Jinyoung Oh, Juho Kim, and Sung-Ju Lee. 2020. I Share, You Care: Private Status Sharing and Sender-Controlled Notifications in Mobile Instant Messaging. Proceedings of the ACM on Human-Computer Interaction 4, CSCW1 (May 2020), 1–25. https://doi.org/10.1145/3392839
- Cohen and Oviatt (1995) P R Cohen and S L Oviatt. 1995. The role of voice input for human-machine communication. Proceedings of the National Academy of Sciences 92, 22 (Oct. 1995), 9921–9927. https://doi.org/10.1073/pnas.92.22.9921 Publisher: Proceedings of the National Academy of Sciences.
- Curry (2022) David Curry. 2022. Most Popular Apps (2022). https://www.businessofapps.com/data/most-popular-apps/
- Google Glass (2013a) Google Glass. 2013a. Google Glass - YouTube. https://www.youtube.com/user/googleglass
- Google Glass (2013b) Google Glass. 2013b. Google Glass: How to use voice actions. https://www.youtube.com/watch?v=rv3KU0Yo5ZM
- Grinter and Eldridge (2003) Rebecca Grinter and Margery Eldridge. 2003. Wan2tlk? everyday text messaging. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’03). Association for Computing Machinery, New York, NY, USA, 441–448. https://doi.org/10.1145/642611.642688
- Hashish et al. (2017) Rami Hashish, Megan E. Toney-Bolger, Sarah S. Sharpe, Benjamin D. Lester, and Adam Mulliken. 2017. Texting during stair negotiation and implications for fall risk. Gait & Posture 58 (Oct. 2017), 409–414. https://doi.org/10.1016/j.gaitpost.2017.09.004
- Itoh et al. (2021) Yuta Itoh, Tobias Langlotz, Jonathan Sutton, and Alexander Plopski. 2021. Towards Indistinguishable Augmented Reality: A Survey on Optical See-through Head-mounted Displays. Comput. Surveys 54, 6 (July 2021), 120:1–120:36. https://doi.org/10.1145/3453157
- Janaka et al. (2023) Nuwan Janaka, Jie Gao, Lin Zhu, Shengdong Zhao, Lan Lyu, Peisen Xu, Maximilian Nabokow, Silang Wang, and Yanch Ong. 2023. GlassMessaging: Towards Ubiquitous Messaging Using OHMDs. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 7, 3 (Sept. 2023). https://doi.org/10.1145/3610931
- Janaka et al. (2022) Nuwan Janaka, Xinke Wu, Shan Zhang, Shengdong Zhao, and Petr Slovak. 2022. Visual Behaviors and Mobile Information Acquisition. https://doi.org/10.48550/arXiv.2202.02748
- Lee and Hui (2018) Lik-Hang Lee and Pan Hui. 2018. Interaction Methods for Smart Glasses: A Survey. IEEE Access 6 (2018), 28712–28732. https://doi.org/10.1109/ACCESS.2018.2831081
- Lucero and Vetek (2014) Andrés Lucero and Akos Vetek. 2014. NotifEye: using interactive glasses to deal with notifications while walking in public. In Proceedings of the 11th Conference on Advances in Computer Entertainment Technology - ACE ’14. ACM Press, Funchal, Portugal, 1–10. https://doi.org/10.1145/2663806.2663824
- Nielsen (1994) Jakob Nielsen. 1994. Enhancing the explanatory power of usability heuristics. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’94). Association for Computing Machinery, New York, NY, USA, 152–158. https://doi.org/10.1145/191666.191729
- Niora (2023) Niora. 2023. Google Glass 3.0 - Review. https://www.niora.net/en/p/google_glass_3
- Orlosky et al. (2014) Jason Orlosky, Kiyoshi Kiyokawa, and Haruo Takemura. 2014. Managing mobile text in head mounted displays: studies on visual preference and text placement. ACM SIGMOBILE Mobile Computing and Communications Review 18, 2 (June 2014), 20–31. https://doi.org/10.1145/2636242.2636246
- Revilla et al. (2020) Melanie Revilla, Mick P. Couper, Oriol J. Bosch, and Marc Asensio. 2020. Testing the Use of Voice Input in a Smartphone Web Survey. Social Science Computer Review 38, 2 (April 2020), 207–224. https://doi.org/10.1177/0894439318810715 Publisher: SAGE Publications Inc.
- Sapkota et al. (2021) Shardul Sapkota, Ashwin Ram, and Shengdong Zhao. 2021. Ubiquitous Interactions for Heads-Up Computing: Understanding Users’ Preferences for Subtle Interaction Techniques in Everyday Settings. In Proceedings of the 23rd International Conference on Mobile Human-Computer Interaction. ACM, Toulouse & Virtual France, 1–15. https://doi.org/10.1145/3447526.3472035
- Sullman et al. (2021) Mark J. M. Sullman, Aneta M. Przepiorka, Agata P. Błachnio, and Tetiana Hill. 2021. Can’t text, I’m driving – Factors influencing intentions to text while driving in the UK. Accident Analysis & Prevention 153 (April 2021), 106027. https://doi.org/10.1016/j.aap.2021.106027
- technavio (2022) technavio. 2022. Smart Glass Market by Application and Geography - Forecast and Analysis 2022-2026. https://www.technavio.com/report/global-smart-glasses-market-industry-analysis
- Wikipedia (2023) Wikipedia. 2023. Google Glass. https://en.wikipedia.org/w/index.php?title=Google_Glass&oldid=1139958030 Page Version ID: 1139958030.
- Zhao et al. (2023) Shengdong Zhao, Felicia Tan, and Katherine Fennedy. 2023. Heads-Up Computing: Moving Beyond the Device-Centered Paradigm. arXiv:2305.05292 [cs] (2023), 11 pages. https://doi.org/10.48550/arXiv.2305.05292
Table 1. Supported input interactions in the final version of GlassMessaging. Each function is listed with its associated input methods: mid-air gesture, ring interaction, and voice command (a blank cell indicates the method is not available for that function).

| Function | Mid-air gesture | Ring interaction | Voice command |
| Reveal interface | | Click ‘center’ button for 1 second | Show chat |
| Hide interface | | Click ‘center’ button for 1 second | Hide chat |
| Open the chat related to a notification | Press on notification | | Open notification |
| Open the chat with the contact <NAME> | Press on contact <NAME> | Click ‘up’/‘down’ button to navigate | <NAME> |
| Activate voice dictation | Press on ‘voice’ button | Click ‘right’ button to navigate + click ‘center’ button to activate | Voice message |
| Send the message | Press on ‘send’ button | | Send |
| Open the virtual keyboard | Press on ‘keyboard’ button | | Open keyboard |
| Close the virtual keyboard | Press anywhere on the interface | Click any button | Close keyboard |
| Go to the topmost contact | Scroll up using the finger | Click and hold the ‘up’ button | Scroll to the top |
| Go to the closest top contact | Scroll up using the finger | Click ‘up’ button | Scroll up |
| Go to the closest bottom contact | Scroll down using the finger | Click ‘down’ button | Scroll down |
| Start voice dictation to the contact who messaged | | | Reply (Open notification + Voice message) |
| Start voice dictation to the contact <NAME> | | | Text <NAME> (<NAME> + Voice message) |
Appendix A GlassMessaging vs. Google Glass XE
Google Glass XE (2013-2017) (Google Glass, 2013b, a; Wikipedia, 2023), a discontinued product, supported heads-up messaging. Here, we distinguish between our application and Google Glass XE, showcasing our contributions from both practical and academic perspectives.
A.1. Google Glass XE (GG) interface
GG incorporated a default set of voice action commands for messaging (Google Glass, 2013b). Its lightweight and seamless design combined voice, head gestures, and touch gestures for inputs and an OST-HMD for output. To activate voice commands or send messages, users would utter “OK Glass” and “Send a message to”, followed by the contact’s name and message content. Users would respond to a message by saying “Reply” followed by their message content. Hence, GG provided an efficient method for sending and replying to individual messages.
A.2. Comparison
Table 2. Comparison of GlassMessaging (GM) on HoloLens 2 and Google Glass XE (GG).

| Features | GlassMessaging with HL2 | Google Glass XE (GG) |
| Interactions | Voice, Ring-mouse, Mid-air | Voice, Touchpad (on the right temple), Head gestures |
| Text entry | Voice, Mid-air keyboard | Voice |
| Display | Binocular, higher resolution (2048x1080 px per eye), larger FoV (30° horizontal) | Monocular, lower resolution (640x360 px, right eye), smaller FoV (13° horizontal) |
| Chat history | Shows last 3 messages and 3 contacts | Shows last message and last contact |
| Chat position | LoS (middle-center) | Above LoS (top-center); manual switching between each UI |
| Notification position | Above LoS (top-center) | |
| Contact position | Right of LoS (middle-right) | |
| UI opacity | Increased for new messages | Fixed |
| UI access | On-demand (using voice or ring-mouse) | On-demand (by looking up or using voice) |
As Table 2 shows, both GlassMessaging (GM) and GG utilize voice input for text entry and navigation. Our study (Janaka et al., 2023) validates voice input as an efficient tool, aligning with GG’s design. However, speech recognition errors limit GM’s accuracy, a challenge likely shared by GG users. GG catered to immediate messaging needs but struggled with intricate conversations; GM, in contrast, upholds modern standards, emphasizing context via features such as full chat history and unread indicators. The display location also differs: GG showed content above the line of sight, demanding attention shifts, whereas GM, leveraging display advancements, positions content within the line of sight and uses opacity adjustments to preserve awareness.
Interaction-wise, GG relied on head gestures and its touchpad, while GM introduces a broader set of methods, such as ring and mid-air gestures, providing flexibility for multitasking. Ultimately, while both platforms serve heads-up messaging, their designs cater to different generations of hardware and user demands. GG, tailored for earlier-generation glasses, prioritized real-time singular messages; GM leverages more capable OST-HMDs to manage both immediate and layered messaging. A fusion of their strengths, such as integrating GG’s head-gesture system into GM, could further improve the user experience, especially in intricate messaging scenarios.