Designing for Samsung Bixby - Voice
I worked at Samsung HQ in Seoul, South Korea, and at SRIB, and contributed to the design of Samsung's new voice-powered digital assistant.
Overview
Voice assistants are seen as the next big medium in device interaction. According to internal research, 72% of S-Voice users reported that they would use it more often if it were improved. At that point Samsung took on the mission to re-imagine the voice assistant, and Bixby was born. In October 2017, Samsung Bixby had 10 million+ active users. It’s poised to reach 500 million devices (including TVs, refrigerators etc.) in a couple of years.
To comply with my non-disclosure agreement, I have omitted and obfuscated confidential information in this case study. All information in this case study is my own and does not necessarily reflect the views of Samsung.
My Role
I joined the Bixby-Voice team at the Samsung Seoul R&D Campus to come up with the UX principles. These principles acted as the guiding force in defining how the voice assistant would behave, speak and interact with users.
The major things I worked on were:
• Bixby Experience Principles
• Identifying Key Use-Cases
• Navigation Flow of Bixby Compatible Devices
• Innovation — Generate IP
Bixby Experience Principles
Bixby Experience Principles describe our values, and inform our business strategy or product decisions. These Principles are based on research and incorporate what the team has learned about users’ needs and wants. They also reflect an understanding of the business needs and brand positioning.
How did our team come up with them?
The process was threefold. First, we identified problems in the voice assistant space. Second, we empathised with users to understand what they wanted. Third, we used experiential prototyping to brainstorm what an ideal interaction between a user and a voice assistant should look like.
What are the principles?
Comfortable: Comfort the user by guiding them, recovering gracefully from errors, reducing anxiety and making them feel in control by taking privacy concerns into account.
Clear and Concise: Maximize clarity by giving the necessary information while avoiding repetition. Conversation should be effortless and efficient. This principle is especially helpful for maximizing efficiency on non-screen devices.
Natural: Communicate effortlessly and contextually with the user while giving them the freedom to choose the right modality.
Delightful: Present the user with ‘meaningful’ suggestions while keeping an element of surprise, using enticing visuals to achieve a delightful experience.
Adaptive: The agent should adapt to factors such as the user’s context and usage time, while keeping its personality consistent.
Impact of Bixby Experience Principles
Our team created an internal guide of 105 policies, with examples, to realize our vision for these 5 principles. It was sent out globally to all the internal research and design teams, and it influenced the assistant's personality, dialogue style and GUI design.
Identifying Key Use-Cases
Overview
We asked a very basic question: "Where exactly is the value in what we’re doing? How can we filter out and design for the most meaningful use-cases, the ones that provide value?"
One insight that shaped the future direction of the product was:
People are unaware of even basic features that are within the smartphone and find it difficult to use them!
For example, some of our users didn’t know how to attach a picture to a message and send it. Basic features such as turning on safe mode or opening the subscription page of the YouTube app, along with numerous other scenarios, were either entirely unknown to users or users didn't know how to reach them. This led to the most important domain of use-cases: “Phone feature usage”.
We identified more use-cases by considering screen-based and voice-based devices, indoor and outdoor scenarios, the most important domains for all devices (based on research data), single-intent and multiple-intent queries, and so on. We also surveyed 40,000 voice assistant users in the US about which domains they had used in the past year.
Navigation flow of Bixby compatible devices
We designed the voice assistant sequence for all the compatible devices, such as the Galaxy Smartphone, Smart TV, Galaxy Gear, Speaker and Gear IconX. We had to carefully balance consistency and usability: every device has a different usage context and hardware specification, so we can’t force the same flow on all of them. Here are a few sequence flows for the Galaxy Smartphone.
Basic Voice Interaction Flow
• To show the conversation history, user text is aligned to the right and agent text to the left.
• ASR text flows in from the right, and the conversation stacks from the bottom.
• If the conversation becomes longer, the voice chrome can expand.
No Voice Input Flow
• To guide the user, give a comforting message and a command hint at the same time
• In the standby status, guide the user to say the help command
• The number and contents of Chrome Hints can vary by device (TV, Refrigerator, Smart Speaker etc.)
Cancel Flow
• The user can cancel by touching the indicator while Bixby is processing or listening; it then goes to standby
• During early usage, a guide message such as “tap here to cancel” can appear below the chrome
Restart Interaction Flow
• The user can restart at any time by pressing the HW key or calling Bixby in the processing, result or standby status
• The user can also restart by touching the mic icon in the result or standby status
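The cancel and restart rules above can be read as a small state machine. The following is only an illustrative sketch, not the production Bixby logic: the state and event names are my own, "restart" is assumed to mean returning to listening, and the `SPEECH_END` and `RESULT_READY` events are assumptions added to connect the states.

```python
from enum import Enum, auto


class State(Enum):
    STANDBY = auto()
    LISTENING = auto()
    PROCESSING = auto()
    RESULT = auto()


class Event(Enum):
    TAP_INDICATOR = auto()  # cancel gesture
    HW_KEY = auto()         # hardware key press
    WAKE_WORD = auto()      # user calls Bixby by voice
    TAP_MIC = auto()        # mic icon touch
    SPEECH_END = auto()     # assumed: user finished speaking
    RESULT_READY = auto()   # assumed: processing finished


# Transition table built from the bullets above.
TRANSITIONS = {}

# Cancel flow: touching the indicator while listening or processing -> standby.
for s in (State.LISTENING, State.PROCESSING):
    TRANSITIONS[(s, Event.TAP_INDICATOR)] = State.STANDBY

# Restart flow: HW key or calling Bixby in processing, result or standby.
# Assumption: restarting means going back to listening.
for s in (State.PROCESSING, State.RESULT, State.STANDBY):
    for e in (Event.HW_KEY, Event.WAKE_WORD):
        TRANSITIONS[(s, e)] = State.LISTENING

# Restart flow: mic icon is only available in result or standby.
for s in (State.RESULT, State.STANDBY):
    TRANSITIONS[(s, Event.TAP_MIC)] = State.LISTENING

# Assumed happy path linking the states together.
TRANSITIONS[(State.LISTENING, Event.SPEECH_END)] = State.PROCESSING
TRANSITIONS[(State.PROCESSING, Event.RESULT_READY)] = State.RESULT


def next_state(state, event):
    """Return the next state; events with no rule leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, start=State.STANDBY):
    """Apply a sequence of events starting from `start`."""
    state = start
    for event in events:
        state = next_state(state, event)
    return state
```

For instance, `run([Event.HW_KEY, Event.SPEECH_END, Event.TAP_INDICATOR])` walks through listening and processing and ends back in standby. Keeping the rules in a table like this makes it easier to vary the flow per device without rewriting the logic.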
Innovation-Generate IP
Rollout and Impact
• Immediately after Bixby's release, S-Voice was discontinued. In October 2017, Samsung Bixby had 10 million+ active users. It’s poised to reach 500 million devices soon.
• It has been covered globally by all the major news channels, newspapers, and online and print magazines.
• Bixby has become a ubiquitous voice-powered digital assistant across all Samsung flagship devices.
What I learnt by working on this project
• Power and Limitations of Voice: While voice is seen as the future of human-machine interaction, it has technological and privacy-related limitations. For example, end-of-speech detection still needs to get smarter. Social anxiety is also an issue: people still feel awkward using voice in social or public settings, and some feel shy speaking to a voice assistant.
• Designing for AI: I learnt to design a flow when the flow itself is not known in advance and the number of variables is far too large to enumerate.
• Domain Expertise: I learnt a lot about voice interfaces, including the direction they’re headed in and which pieces are still missing.
• Plan for Complexity and Risk: When working on a product that involves multiple teams across continents, there’s a need to stick to a plan and clearly account for both known and unforeseen risks.
I hope you enjoyed reading about how we approached voice principles, use cases, and various user flows.
Hi, I'm Sourabh Pateriya.
Currently I'm building Soundverse Inc., which I founded in June '23. I'm a product leader who has led teams at Spotify, Samsung and Tobii, with experience in generative AI, music-tech, extended reality, eye-tracking, voice assistants and mobile. I hold 10+ patents.