Eye tracking — making the metaverse authentic

  • by Johan Hellqvist

A couple of months ago, I took a trip to the US, my first in over 18 months. The purpose of the visit was to meet up with customers, partners, and colleagues. For some, ours was the first face-to-face business encounter they’d had in nearly two years. The experience, and I am surely not alone in this feeling, was a mixture of relief and enthusiasm. After such a lengthy absence, meeting IRL felt like proof of the human need for connection. And that got me thinking about the technologies that have enabled some businesses to continue to operate throughout the pandemic and how important it is that the tech world continues to evolve these solutions along the lines of intuitiveness and authentic human connectivity.

For this post, my colleague Maggie and I sat down to discuss the state of XR, the metaverse, what it means for everyone, and how eye tracking will bring a level of authenticity to it.

There’s been a lot of hype recently, at least within parts of the tech industry, about the metaverse.

So, Johan, my first question for you is, what do you think the metaverse is? What’s your take on it?

Although not always labeled as such, I think the metaverse has been coming for a while. It has come about because people have made innovative connections between evolving technologies. The metaverse is a mixed-reality (MR) space where people can come together to work, play, collaborate, and even make a living, either as themselves or as an avatar. It’s a level playing field that anyone can enter from any device, anywhere. Technology platforms and metaverse solutions will solve the time and space challenges so that being together in MR will feel authentic and coherent.

Why do you think the metaverse is happening now? Why not earlier, when mixed reality was more of a vision?

Much of the hype is, of course, because big tech companies like Meta (formerly Facebook), Microsoft, and NVIDIA are using the term metaverse to describe the convergence of a range of technologies and, perhaps more importantly, to tell the story for consumers without getting buried in the details. Cloud services, wide area networks, and mobile networking have become established to the point where they can support complex, bandwidth-heavy services with sustainable business models for operators. Devices have matured and are starting to deliver on the immersive promise, with adoption further fueled by ever-smaller form factors and lower prices, which creates new demand for content and gives developers an expanding market. And when you join all these dots together, you get the metaverse and what it promises.

So, what has the metaverse got to do with us, with Tobii? Why are we sitting here today discussing it?

As the metaverse evolves, we will see a rise in the popularity of VR and AR wearables because of the level of immersion they offer. Spatial acoustics will simulate sound authentically so that people can be in an MR space and rely on auditory cues to interact with each other. But to answer your question, Tobii’s technology will enable authentic connections that cannot otherwise be achieved in the metaverse because we deliver realistic eye contact and an understanding of human attention, the core elements of communication. From a very young age, children learn to follow visual cues, and somewhere around 80 percent of the information our brains process is visual. So how you interpret someone’s intent differs massively depending on whether you look them in the eye and watch how their lips move or, say, stare just above their head. Proper communication in the metaverse will therefore only work if devices come equipped with eye tracking.

And then there’s the controller thing. Controllers aren’t necessarily that intuitive, and neither are keyboards or mice for that matter, but we have grown to accept these input modalities as a means of getting information into systems. To ensure an authentic metaverse experience, though, we don’t want to be waving our hands or controllers around in an awkward manner, especially in an AR environment. In real life, we use gestures, speech, and eye contact to show intent. And we will do the same in the metaverse. Eye tracking, which captures gaze, head movements, and facial landmarks, will enable rapid authentication as well as intuitive communication that leverages both head pose and visual cues. It’s likely that emerging gloveless hand tracking will also be part of the mix, as it provides yet another intuitive input modality. Eye tracking will support the use of simple, intuitive gestures, replacing pointing and complex gestures with discreet swipes (potentially with gloves for cold climates).

For people entering the metaverse with augmented reality devices, eye tracking is critical because of the additional data it provides. The kind of information our technology delivers will help developers weave virtual objects seamlessly into the physical world. So, eye tracking will help to reduce the parallax between real and virtual worlds, making user experiences enjoyable because they will feel authentic and coherent. The technology is also fundamental for developing multifocal displays and auto accommodation systems.
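To illustrate why gaze data matters for auto accommodation, here is a minimal Python sketch of one common idea: estimating how far away a user is focusing from the angle at which the two eyes converge. It is not Tobii's implementation. The function name, the symmetric straight-ahead fixation, and the sample values are all assumptions made for this example.

```python
import math

def vergence_distance_m(ipd_mm, vergence_deg):
    """Estimate fixation distance in metres from the vergence angle.

    Assumes (for illustration only) that the user fixates a point straight
    ahead, so each eye rotates inward by half the vergence angle.
    """
    if vergence_deg <= 0:
        # Parallel gaze rays: the user is looking at optical infinity.
        return math.inf
    half_angle = math.radians(vergence_deg) / 2.0
    half_ipd_m = (ipd_mm / 1000.0) / 2.0
    return half_ipd_m / math.tan(half_angle)

# Hypothetical sample: 63 mm IPD and a vergence angle of 3.6 degrees.
print(vergence_distance_m(63.0, 3.6))
```

A varifocal display could use an estimate like this to decide which focal plane to present, although a real system would also need filtering and per-user calibration.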

XR devices will be a big part of the metaverse because they deliver immersion, which is key to feeling like you are in another place and participating in it for real. So why is comfort such an issue?

Office-based workforces have grown accustomed to multiple monitors and devices, choosing whatever solution best suits what they need to do and where they need to do it. In the same way that ergonomics is essential to sustainable working conditions, comfort is crucial for XR. You know about our partners Pico, who have just launched the third generation of their untethered headset that includes our eye tracking technology, and how they have worked relentlessly on the comfort factor. They’ve made a significant investment because they know that comfort is crucial for people to use XR devices throughout a full working day.

You can enhance comfort with Tobii eye tracking because a headset can leverage interpupillary distance measurements to automatically adjust for different users and slippage without disturbing the user experience.
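As a rough illustration of the idea, the Python sketch below derives an interpupillary distance from two pupil positions and turns it into a lens offset. It is an assumption-laden example, not how any particular headset works: the function names, the millimetre coordinates, and the 58 to 72 mm mechanical range are all hypothetical.

```python
import math

def interpupillary_distance_mm(left_pupil, right_pupil):
    """Euclidean distance between two 3D pupil positions given in millimetres."""
    return math.dist(left_pupil, right_pupil)

def lens_offset_mm(ipd_mm, min_ipd_mm=58.0, max_ipd_mm=72.0):
    """Offset of each lens from the headset centreline.

    The IPD is clamped to a hypothetical mechanical adjustment range
    before being split evenly between the two lenses.
    """
    clamped = max(min_ipd_mm, min(max_ipd_mm, ipd_mm))
    return clamped / 2.0

# Hypothetical pupil positions reported by an eye tracker (x, y, z in mm).
left = (-31.5, 0.0, 0.0)
right = (31.5, 0.0, 0.0)
ipd = interpupillary_distance_mm(left, right)
print(ipd, lens_offset_mm(ipd))
```

Re-running a measurement like this whenever the headset shifts is one way a device could compensate for slippage without asking the user to recalibrate.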

And then you’ve got the issue of motion sickness, which high-resolution graphics can help solve. That’s where technologies like dynamic foveated rendering (DFR), which renders full detail only where the user is looking, are essential to keep XR devices performing in split- and on-device processing architectures. This year, we saw NVIDIA release the second version of their Variable Rate Supersampling, VRSS 2, which leverages Tobii Spotlight Technology. The two technologies work together to provide gaze information at minimal latency to the NVIDIA driver, enabling developers to create applications that leverage DFR without additional coding for each device. The resulting stable and high frame rates help reduce the dizziness that some people experience with VR.
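The core idea behind DFR can be sketched in a few lines of Python: measure how far a screen region lies from the gaze direction and lower the rendering detail as that angle grows. This is a simplified illustration and not the VRSS 2 or Tobii Spotlight API. The function names, the 10 and 25 degree thresholds, and the rate values are assumptions chosen for the example.

```python
import math

def eccentricity_deg(gaze_dir, tile_dir):
    """Angle in degrees between the gaze direction and the direction to a screen tile."""
    dot = sum(g * t for g, t in zip(gaze_dir, tile_dir))
    norm = math.hypot(*gaze_dir) * math.hypot(*tile_dir)
    # Clamp to guard against rounding slightly outside acos's domain.
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))

def shading_rate(ecc_deg, foveal_deg=10.0, mid_deg=25.0):
    """Relative shading rate for a tile: full detail near the gaze, less in the periphery."""
    if ecc_deg <= foveal_deg:
        return 1.0   # foveal region: full resolution
    if ecc_deg <= mid_deg:
        return 0.5   # mid-periphery: half rate
    return 0.25      # far periphery: quarter rate

# Hypothetical frame: the user looks straight ahead, the tile sits off to the right.
gaze = (0.0, 0.0, 1.0)
tile = (0.5, 0.0, 1.0)
print(shading_rate(eccentricity_deg(gaze, tile)))
```

Because the rate is recomputed from the latest gaze sample each frame, low-latency gaze data matters: if the high-detail region lags behind the eye, the user sees the blur.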

What about new technology trends from the hardware perspective? And what is Tobii doing to ensure our integration platforms adapt?

Image quality and device weight also contribute to wearability, which is pushing the development and adoption of folded optics (often referred to as pancake lenses in XR) because their form factor and image quality often result in lighter, less bulky headsets than conventional Fresnel lens designs.

To support manufacturers as they shift to pancake lenses for commercial devices, Tobii XR has spent much of its 2021 R&D effort on developing the integration platforms to deliver eye tracking for this emerging architecture. Our focus has been on a compact design that enables integration without invasive changes to the headset and minimizes resource consumption. We have a couple of implementations in development and are looking forward to talking about them in more detail once they launch.

I’d say the other major hardware development is the adoption of eye tracking in commercial headsets. Last year (2021), we had two major releases of devices natively equipped with our eye tracking — the Pico Neo 3 Pro Eye, which I’ve already mentioned, and the HP Reverb G2 Omnicept Edition. Pico’s success points to the demand for standalone devices and HP’s to the need for additional sensors. And then there’s the partnership we announced with Pimax, who will bring the full potential of the metaverse to consumers. From a Tobii perspective, I would say that our design approach, making sure that our technology lives up to the everyone, everywhere, every device principle, is a crucial differentiator for manufacturers looking to embed eye tracking in consumer devices.

Announcements made by Meta and Sony PlayStation reveal that eye tracking will be part of their next-generation consumer headsets, and I believe this will happen soon.

What do you see as the biggest challenge when applying eye tracking technology in consumer XR devices?

That brings us deeper into what I’ve just mentioned about systems living up to the everyone, everywhere, every device principle (the three e’s). Because, as you know, it’s relatively easy to develop an eye tracking system that works in a controlled environment, but making one that lives up to the three e’s isn’t. We’ve been working for over two decades now, doing the research and gaining the competence to deliver a solution that works, one that caters for different eye shapes, colors, retinal reflectivity, sight correction, and so on. But for the consumer environment, it’s not just about reaching 95% population coverage; it’s about delivering a robust solution that is repeatable in large-scale production. If you want to read more about how we’ve done this, you can check out Andreas’s post on demystifying eye tracking for XR, which covers the importance of building universal technologies.

And what eye tracking application areas do you think will first jump into the metaverse?

Well, I would say it’s the usual suspects: healthcare and wellbeing, collaboration, gaming, social, education, and simulation-based training. However, I’d say healthcare and wellbeing are the front runners in the enterprise sector. I say this because of the increased interest we saw in 2021 for Tobii Ocumen, our analytics toolkit for the data recording, organization, and analysis of human behavior that these kinds of advanced applications require.

Remember when we met up with Scott Anderson of SyncThink to find the keys to scalability and he talked about how they were an application waiting for the perfect hardware to come along? I think we can expect to see more of these kinds of applications waiting for the right hardware and eye tracking signals. And our partners at REACT Neuro and Apeiron Life have taken a serious look at the importance of cognitive performance through decades of life. As Dr. Martin points out, wellbeing is no longer limited to keeping physically active and eating properly, it’s also about preventing cognitive decline.

On the consumer side, social and gaming applications will experience hockey stick growth given the current level of interest and how platforms are developing. I think too that these are the areas where consumers will appreciate the metaverse the most. For Tobii, it means we will continue to work closely with headset makers, developers, and content creators to bring features such as presence, realistic avatars, intuitive interfaces, and biometrics for authentication into the metaverse.

Before we wrap things up, Johan, I’d like to ask you about standards and the need for OpenXR.

One of the most effective ways to promote cross-platform agility is through open standards because they help prevent vendor lock-in and time-consuming device adaptation. They also allow big tech and niche operators to work together. And as you know, an eye tracking extension was added to the OpenXR standard back in 2020, but what we can look forward to this year is the implementation of the standard in commercial headsets and supporting solutions like Microsoft’s Windows Mixed Reality platform.

To summarize then, Johan, what can we expect in 2022? How will Tobii add value to the metaverse user experience?

For me, it’s about eye tracking in every device, which will require integrated technologies like ours to live up to the everyone, everywhere, every device principle. We will see continued diversity in device types, with workhorses for simulation environments, but the major trend points to untethered, lightweight headsets. For developers, the metaverse presents a solid platform for storytelling and new applications. We can expect to see some early prosumer examples, but like most emerging solutions, the initial experience will probably be a bit fragmented. There will be a lot of collaborative applications that will help enterprises cut through organizational boundaries and remove the silos created by time and space constraints. I think we have a good example of this already with VirtualRetail.io and how they are f.

For consumers, it will be interesting to see how people react to Meta’s Horizon now that it’s passed the beta phase and moved on to a broader public release. We won’t reach the level playing field just yet, but we will see enhanced devices with eye tracking and a rise in authentic avatars, realistic eye contact, easy-to-use interfaces, and enhanced gameplay, for example.

Great summary, Johan. Thank you.

Thank you, Maggie. It was nice to see you again in person, but now it’s back to virtual.

Written by

  • Johan Hellqvist

    Head of XR

    As head of the XR segment at Tobii Tech, I work with AR and VR device manufacturers, helping them to integrate eye tracking into their products. My focus is on solving the challenges facing manufacturers as the market shifts toward mobile untethered devices, and on shaping our development to meet the tough demands of consumer electronics.
