Talkative
You need about 5-7 minutes to review this case study. Enjoy!
Challenge
Conduct exploratory research on voice user interfaces in a multilingual environment. Identify key users and propose design recommendations.
Process
Literature review | Interviews | VUI home deployments | Onsite observations | Post-deployment surveys | Data analysis | Personas | Design recommendations
Outcome
A set of scenario-based personas of key users and VUI design recommendations.
My Role
UX Research Lead
Overview
The way we live is changing faster than ever before. Increased globalisation affects our daily lives, continually setting new challenges for us. One of the striking features of this growing globalisation is multiculturalism and the multilingualism associated with it. As a result, only a few modern societies can be considered homogeneous and monolingual today. The number of multilingual speakers is steadily increasing, creating new linguistic requirements that voice user interfaces (VUIs) must meet to be globally successful.
Literature Review
Voice User Interfaces (VUIs) are becoming increasingly ubiquitous, appearing in a growing number of everyday technologies, including smartphones, interactive systems, and cars. The global success of voice user interfaces greatly depends on how effectively the voice technology will recognise and understand different languages, dialects, and accents in the future (Furui, 2005). For many years, speech, the most natural form of human communication, proved to be one of the most difficult for machines to understand (Rabiner, 1995). Significant advances appear periodically; however, many technological barriers have yet to be overcome.
As of 2019, Amazon’s Alexa supported eight languages: English, German, Japanese, French, Italian, Spanish, Portuguese, and Hindi (Amazon Alexa, 2019). By comparison, a study by Mocanu, Baronchelli, Perra, Gonçalves, Zhang, and Vespignani (2013) identified 78 languages actively used on microblogging platforms. It is evident that voice user interfaces have a long way to go. Limited multilingual support in voice user interfaces (Deng & Huang, 2004) affects a significant population, restricting people to a narrow subset of languages when interacting by voice.
Interviews, VUI Home Deployments, Onsite Observations and Surveys
To investigate voice technology in a real-life situation, the Amazon Alexa voice assistant was deployed in three multilingual households. Each VUI deployment lasted two weeks, during which each family was able to use the voice assistant without limitations and verify how effectively the voice technology met their needs. The deployment was accompanied by interviews, observations, and post-deployment surveys.
The multilingual families that agreed to participate in the study migrated to Ireland from India, Ethiopia and France.
Data Analysis
The data analysis process was divided into two phases. The first phase focused on analysing data collected during interviews and onsite observations with the multilingual families. The second phase focused on analysing data collected through a survey shared with each family after their in-home VUI deployment was completed. The primary objective of the process was to discover meaningful information about voice user interfaces in the context of multilingual speakers.
Part of the master codebook created during the analysis of the qualitative and quantitative data.
Personas
Based on the data collected during the in-home deployments, scenario-based personas of key users and a set of design recommendations were created.
Design Recommendations
A set of design recommendations was developed for teams working with voice technology in a multilingual environment.
Freedom to interact
Users want to issue voice commands in different languages and switch between them seamlessly, reflecting the natural multilingual environment in which they live.
Easy onboarding
Users expect the onboarding process to be as short and intuitive as the voice commands themselves. The existing business model of voice assistant providers requires users to subscribe to the entire ecosystem, install the application, create an account, purchase a premium account, and choose from the supplier's limited pool of languages.
Engaging conversations
Users expect more detailed, accurate and personalised conversations with voice assistants, reflecting everyday family conversations and language preferences. Currently, voice assistants only support one-way queries. They cannot ask the user to clarify an intent or adapt the response to the person asking the question.
Sense of control
Users want a sense of control over the state of the device, knowing what is happening and whether their voice queries were effective. The lack of visual indicators confuses users and sometimes leaves them uncertain as to whether a given command has been understood. Introducing visual indicators while the device is listening or processing a response would create a greater sense of control.
Sense of trust and privacy
Users want the voice device to be safe to use and not to threaten their privacy, especially in their home life. They want to be sure that their voice queries, searches and conversations are safe and will not be exposed to the risk of unauthorised access from other platforms such as social media and search engines.
Final Conclusions
We live in a multilingual world. Only a few modern societies can be considered homogeneous and monolingual today. Speaking one language is a basic need, but speaking more than one opens a much wider range of possibilities. The languages that people have access to or want to use make a significant contribution to their success. This highlights the growing need for significant advances in the development of voice technology, to effectively support multilingual speakers in and outside of the home.
The research focused on what would appear to be a simple multilingual environment compared to others, such as schools, universities, or airports, where linguistic diversity is much greater. Only three families took part in this study, but in total, they use as many as nine different languages daily. It is evident that voice interactions are still underdeveloped in comparison to other forms of interaction. More research is needed to analyse the needs of multilingual environments, identify recommendations to improve experiences with VUIs, reduce frustration, and positively impact long-term adoption.