Cameron Wong
Despite how far VR avatars have advanced in the past few years, they still suffer from what some are calling "the Botox problem": avatars in VR have mostly static, unmoving faces. When you're communicating with another person's avatar in VR, you see their mouth moving, but little else is expressed on their face. Without facial expressions – our primary means of nonverbal communication and a key component of conversation in general – our ability to meaningfully connect with others in VR is limited.
While the technology evolves, some VR programs have developed stopgap solutions. Developers can use body language to cue facial expressions: a thumbs up produces a smile, rock and roll horns mean excitement. But this is a temporary fix for a perennial problem with all communication that's not face-to-face, dating back to misread emails and misinterpreted AIM messages. Facial expressions allow us to clearly communicate meaning, tone, and intention in a way that text or expressionless speech cannot. For virtual reality to compete with video calls as the go-to solution for catching up, coworking, and communicating in general, the importance of facial expressions cannot be overstated.
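To make that stopgap concrete, here is a minimal sketch of what a gesture-to-expression mapping might look like. The gesture names, expression labels, and function are hypothetical illustrations, not taken from any particular VR platform:

```python
# Hypothetical sketch: mapping detected hand gestures to avatar expressions.
# Gesture and expression names are illustrative, not from any real VR SDK.

GESTURE_TO_EXPRESSION = {
    "thumbs_up": "smile",
    "rock_horns": "excited",
    "open_palms": "surprised",
}

def expression_for_gesture(gesture: str) -> str:
    # Fall back to a neutral face when no gesture is recognized –
    # which is exactly the "Botox problem" described above.
    return GESTURE_TO_EXPRESSION.get(gesture, "neutral")

if __name__ == "__main__":
    for g in ["thumbs_up", "wave", "rock_horns"]:
        print(g, "->", expression_for_gesture(g))
```

The limitation is visible right in the fallback line: anything the system doesn't recognize collapses back to a blank face.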
People communicate in VR today using verbal communication and hand gestures. Both are important channels: verbal communication speaks for itself (get it?), and gesticulating with hands and arms helps convey emotion alongside speech. Full-body tracking takes things a step further, since body language is another important source of information in a conversation. Posture and orientation can say a lot about a person's mood or bearing, and full-body tracking technology has made substantial gains in recent years.
But studies show that at the end of the day, facial expressions are the most effective way to express emotion nonverbally. In one study comparing pairs of avatars conversing with and without facial expressions, the pairs with expressions understood their conversation partners better and grew to like one another more deeply. Indeed, facial expressions are often considered the main source of information in a conversation, even more so than actual verbal communication. Just ask a poker player: the truth lies not in the words, but in the expression.
The benefits offered by expressive avatars go beyond clear communication. As anyone with a Zoom-based job will tell you, people talking over one another during video calls is a major problem. Overlapping speakers create misunderstandings and unnecessarily extend meetings with plenty of "Sorry, you go ahead." "No, you go first." The inverse can be just as frustrating: anticipating potential interruptions with apprehensive pauses makes conversations unnaturally stilted.
This is a problem in VR too, where the only way to know someone is done talking is when their mouth stops moving. When speaking face-to-face, we're able to read the facial cues that signal the end of a thought or sentence and know when to jump in naturally. Once we achieve facially expressive avatars, the pace and rhythm of our conversations in VR will feel as natural as they do in real life.
We know that facially expressive avatars would be a game-changer for VR users everywhere. But there's perhaps an even more important reason to develop this technology: accessibility. Today's industry-standard VR controllers can present a real challenge for users with mobility impairments. Handheld joysticks and buttons make sense from a design standpoint, but they fail to accommodate those without a full range of mobility in their hands.
Using facial expressions as a means of conducting actions in VR would represent a major step forward in this regard. We’ve already seen hand motions substitute for facial expressions – remember the rock and roll horns? – but the inverse, where facial expressions replace hand motions, would enable more people than ever to comfortably engage in VR experiences.
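As a rough illustration of what expression-driven input could look like, here is a hedged sketch that thresholds continuous expression intensities, similar in spirit to the blend-shape weights many face trackers output. Every expression name, action, and threshold below is an assumption made for illustration:

```python
# Illustrative sketch of facial expressions as an accessibility input method.
# Expression weights (0.0-1.0) mimic typical blend-shape outputs from face
# tracking; the expression names, actions, and thresholds are all hypothetical.

ACTION_TRIGGERS = {
    "smile": ("select", 0.7),          # a strong smile confirms a selection
    "brow_raise": ("open_menu", 0.6),  # raised eyebrows open the menu
    "jaw_left": ("go_back", 0.5),      # a jaw shift navigates back
}

def actions_from_expressions(weights: dict[str, float]) -> list[str]:
    """Return the actions whose expression weight crosses its threshold."""
    triggered = []
    for expression, (action, threshold) in ACTION_TRIGGERS.items():
        if weights.get(expression, 0.0) >= threshold:
            triggered.append(action)
    return triggered

if __name__ == "__main__":
    frame = {"smile": 0.85, "brow_raise": 0.2}
    print(actions_from_expressions(frame))  # ['select']
```

Nothing in this sketch requires the hands at all, which is the point: the same face tracking that powers expressive avatars could double as an input device.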
We know that facially expressive avatars would bring a lot to the table in VR. So what's the holdup? While the specifics of development are closely guarded at all of the major VR hardware companies, we can infer a good deal from what we do know. To that end, I spoke with Jake Maymar, VP of Innovation at The Glimpse Group, to hear his thoughts on the technological advances needed to bring facial expressions to VR.
The core of the issue, according to Jake, is that VR headsets only have so much processing power. Juggling body and eye tracking, real-time machine learning, and the four requisite cameras (plus, potentially, inward-facing cameras as well), all while running applications smoothly in a lightweight, consumer-friendly wireless HMD, is perhaps the biggest challenge these companies face today. “To put it in a metaphor,” Jake explained, “we may already have all the eggs we need, but we still don’t have a basket big enough to fit them all.” The technology for real-time facial mapping may well exist today; the hard part is fitting the compact processors required to power it, alongside every other requisite component, into a reasonably sized headset.
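Jake's egg-basket metaphor can be restated as a frame-budget problem: a standalone headset gets a fixed number of milliseconds per frame, and every tracking subsystem bids for a slice of it. The sketch below uses invented costs purely to illustrate the trade-off; none of these numbers come from a real headset:

```python
# Illustrative frame-budget arithmetic for a standalone headset.
# All costs are invented for the sake of the metaphor; real figures
# vary by hardware and are not published in this article.

FRAME_BUDGET_MS = 13.9  # one frame at 72 Hz

SUBSYSTEM_COST_MS = {
    "rendering": 8.0,
    "hand_tracking": 2.5,
    "eye_tracking": 1.5,
    "body_tracking": 1.0,
}
FACE_TRACKING_COST_MS = 2.0  # hypothetical cost of real-time facial mapping

def remaining_budget(costs: dict[str, float]) -> float:
    return FRAME_BUDGET_MS - sum(costs.values())

if __name__ == "__main__":
    headroom = remaining_budget(SUBSYSTEM_COST_MS)
    print(f"Headroom without face tracking: {headroom:.1f} ms")
    # Adding facial mapping overdraws the frame: the eggs exist,
    # but the basket is too small.
    print(f"Headroom with face tracking: {headroom - FACE_TRACKING_COST_MS:.1f} ms")
```

With these toy numbers, the frame overdraws the moment facial mapping is added, which is exactly why headset makers must either shrink each subsystem's cost or grow the basket.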
From clear communication to naturally flowing conversations to accessibility, facial expressions offer a variety of benefits to VR users everywhere. Full-body tracking, controllerless hand tracking, and even precision eye tracking are key advancements that make interactions in VR feel more intuitive and natural. But facial expressions represent perhaps the most important advancement of them all, finally dispelling the “uncanny valley” sensation some experience when interacting with unconvincing avatars. While the technology behind facially expressive avatars continues to develop, creating more realistic avatars is the next best thing. Lifelike, customizable avatars, such as those designed by Wolf3D and integrated across all of The Glimpse Group’s VR platforms, immerse users more deeply in their virtual worlds, creating experiences that look and feel more like the real thing.
Cameron is the Content Writer/Editor at The Glimpse Group. As a former academic researcher in the humanities, he blends his outside perspective as a relative newcomer to tech with Glimpse's industry-leading expertise to demystify the world of VR, AR, and the metaverse.