- Ajinkya Waghulde
- March 17, 2022 | 8 min
A gaze expression experiment with MetaHuman — enabling more expressive photorealistic avatars
Human beings are social creatures and rely on communication to build and grow connections with others. Of all our communication tools, the eyes are unique in that they both receive and send social cues. Through eye contact, we feel recognized, understood, and validated. Infants are more sensitive to the state of an adult's eyes than to their head movements. Partners convey love in a long, deep gaze without needing more words. Experienced speakers deliver powerful speeches by making eye contact with the audience. Eye contact holds the key to our communication.
As we begin to imagine another life in the metaverse, I believe most of us want to bring authenticity to our avatars, because they represent us in augmented and virtual space. It's not hard to imagine that we will socialize with other users (through avatars) and engage in group activities and entertainment. Enabling eye contact and eye movement for avatars through eye tracking technology becomes a no-brainer: it will significantly enhance users' social presence.
Depending on the activities they plan to pursue in the metaverse, users may choose a Disney-character-like appearance or build a digital twin that resembles them in every possible way. We've seen cross-platform anime-style avatar systems like Ready Player Me supporting hundreds of apps and games. Tobii recently announced a collaboration with the VR streaming platform LIV that incorporates Ready Player Me avatars and enables Tobii eye tracking for content creators.
To see whether eye tracking truly brings more realism even to photorealistic avatars, I recently spent some time enabling Tobii eye tracking in Epic's popular avatar-making tool, MetaHuman. Today, I want to share what excited me in this fascinating "hacking" experience.
Capturing facial expressions in VR
MetaHuman characters are hyper-realistic and can be brought to life using real-time motion capture (mocap) devices connected via Live Link. Several motion capture solutions exist that create amazingly lifelike characters. They capture facial movement too, but in different ways, from the built-in approach of high-end full-body motion capture setups (e.g. Cubic Motion) to the innovative use of the iPhone X. Nevertheless, they can't be used directly for avatars in VR applications, because the user is already wearing a VR headset that covers the face.
Cubic Motion (left) and Xsens with iPhone X (right)
Next-generation VR headsets are equipped with additional sensors for eye tracking and lip tracking to capture facial expressions. For example, Tobii eye tracking has been integrated into the Pico Neo 3 Pro Eye, HP Reverb G2 Omnicept Edition, Pico Neo 2 Eye, and HTC Vive Pro Eye. In this experiment, I used the HP Reverb G2 Omnicept Edition headset and leveraged real-time gaze input to drive expression controls on MetaHuman characters in VR applications.
Exploring expressions and controls with gaze input
The MetaHuman face rig provides several control points for the movement of the eyes and the muscles around them. The face control board shows how the eye muscles move as the gaze-direction controls are adjusted. We configured it to take real-time gaze input from the VR headset and update the MetaHuman look-direction expression controls, along with other muscle control points, to produce realistic gaze expressions.
MetaHuman face control board. Image courtesy of Epic Games
Using the combined gaze direction as input, we normalize it and break it down into the left-right and up-down components needed to update the control points.
Look movement expression controls
MetaHuman expression controls for look direction
The expression controls for looking in each direction take a float value from 0.0 to 1.0 per eye, fed by the normalized gaze direction vector. Since we used combined gaze, a single vector in this case, the same value is applied to both eyes. Each control ranges from 0.0 when the user looks straight ahead to 1.0 at the maximum deflection in that direction.
MetaHuman characters with eye movement enabled by Tobii eye tracking
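As a rough sketch of this mapping, the snippet below converts a combined gaze direction vector into the four look-direction control weights. This is only an illustration of the idea, not actual MetaHuman or Tobii API code: the function name, the axis conventions (+x right, +y up, +z forward), and the 30-degree saturation angle are all assumptions made for the example.

```python
import math

def gaze_to_look_controls(gaze_dir, max_angle_deg=30.0):
    """Map a combined gaze direction vector (x, y, z) to look-direction
    control values in [0.0, 1.0].

    Hypothetical conventions (not from the MetaHuman API): +x is the
    user's right, +y is up, +z is forward, and the controls saturate
    at max_angle_deg of deflection from straight ahead.
    """
    x, y, z = gaze_dir
    # Normalize the input vector so only direction matters.
    length = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / length, y / length, z / length

    # Horizontal and vertical deflection angles from straight ahead.
    yaw = math.degrees(math.atan2(x, z))
    pitch = math.degrees(math.atan2(y, z))

    def weight(angle_deg):
        # 0.0 when looking straight ahead, 1.0 at or beyond saturation.
        return max(0.0, min(1.0, angle_deg / max_angle_deg))

    # Combined gaze is a single vector, so the same values drive both eyes.
    return {
        "look_right": weight(yaw),
        "look_left": weight(-yaw),
        "look_up": weight(pitch),
        "look_down": weight(-pitch),
    }
```

With these conventions, a straight-ahead gaze yields all zeros, and a gaze deflected 30 degrees or more to the right drives the right-look control to 1.0 while the opposing control stays at 0.0.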
Brow raise expression controls
MetaHuman expression controls for brow raise
In addition to the look expression controls, we also experimented with brow controls, driven by the gaze direction alone when the user looks up. We did not use any additional input or signal for the brows; the effect is triggered purely by upward gaze. MetaHuman breaks the brow-raise controls down into inner and outer regions, which can be weighted separately.
MetaHuman character with brow raise expression enabled by Tobii eye tracking
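The brow behavior can be sketched in the same spirit. Again, the function and the inner/outer scale factors are hypothetical, illustrative choices; the only real signal in play is the upward gaze weight.

```python
def brow_raise_from_gaze(look_up_weight, inner_scale=0.6, outer_scale=0.4):
    """Drive brow-raise controls from the upward gaze weight alone.

    Hypothetical weighting: the inner and outer brow regions are raised
    with different scales, since MetaHuman exposes them separately.
    """
    # Clamp the incoming weight to the valid control range.
    w = max(0.0, min(1.0, look_up_weight))
    return {
        "brow_raise_inner": w * inner_scale,
        "brow_raise_outer": w * outer_scale,
    }
```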
In this experiment, we used only gaze direction as input, updating around 12 expression control points on the MetaHuman face rig to achieve realistic eye movement. More control points can be triggered to produce additional gaze expressions for VR applications.
For future development and exploration, we would like to evaluate blink and eye openness signals, which can enable expression controls for wink, squint, scrunch, and even cheek raise.
Additional MetaHuman expression controls for the eyes
The convergence signal can also be used when the player looks at objects at varying distances, so that the avatar's eyes converge correctly as observed by other players in the same VR application.
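One simple way to estimate where the eyes converge is from the vergence angle between the two per-eye gaze rays. The sketch below is a hypothetical helper, not part of any SDK; it assumes symmetric vergence and a fixed interpupillary distance (IPD), both of which are simplifications for illustration.

```python
import math

def convergence_distance(left_dir, right_dir, ipd_m=0.063):
    """Estimate the fixation distance in meters from per-eye gaze directions.

    Hypothetical helper: uses the vergence angle between the two gaze
    vectors and an assumed interpupillary distance of ipd_m meters.
    """
    def norm(v):
        length = math.sqrt(sum(c * c for c in v))
        return tuple(c / length for c in v)

    l, r = norm(left_dir), norm(right_dir)
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(l, r))))
    vergence = math.acos(dot)  # angle between the two gaze rays
    if vergence < 1e-6:
        return float("inf")  # parallel gaze: effectively looking at infinity
    # Symmetric-vergence model: each eye toes in by vergence / 2.
    return (ipd_m / 2.0) / math.tan(vergence / 2.0)
```

Parallel gaze rays give an effectively infinite distance, while rays that toe inward meet at a finite fixation point; the result could then drive how strongly the avatar's eyes converge.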
To wrap up, I believe that authentic and expressive avatars will be highly desired in the metaverse, not only because users want to avoid the uncanny valley effect but also because they wish to be accurately and effectively represented wherever they are. Therefore, I strongly recommend that developers start exploring the benefits of eye tracking today. Find out more about enabling avatars in social applications on our XR Developer Zone, or contact us if you'd like to collaborate on our future development.