XR-4 Features Explained: World’s First Gaze-Driven XR Autofocus Camera System
Varjo is the first company to bring a variable-focus camera pass-through system to the virtual and mixed reality market. In this blog, Varjo’s Technical Fellow Ville Timonen details the technology behind this remarkable new feature implemented in Varjo XR-4 Focal Edition.
Read previous editions of our Varjo XR-4 blog series here:
• Varjo XR-4 Product Design Story: Next-Gen Headsets Optimized for Comfort and Usability
• Visual quality of the XR-4 Series: Transforming Professional Workflows with Highest-Immersion VR&XR
The need for variable focus optics
All other XR HMDs today use fixed-focus optics in video pass-through cameras, whose focus distance cannot be changed. The human eye can discern detail up to roughly 60 pixels per degree (PPD), but the problem with fixed-focus optics is that, in practice, they reach their resolution limit at around 30 PPD.
This is because of a balancing act. On the one hand, the lens aperture needs to be small enough for the depth of field (DoF) to cover the full operating range, e.g. from 20 cm to infinity, at the target PPD. On the other hand, the aperture cannot be too small, because a certain amount of light needs to reach the sensor for the image processor to produce a noise-free, high-quality image. We are also so close to the diffraction limit that making the aperture smaller would in fact lower the effective resolution. Exposure time cannot be made too long either, because XR HMDs require high-speed cameras (at least 90 Hz) and, in fast-paced training scenarios for example, the exposure time must be reduced even below 1/90 s to limit motion blur.
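To make the balancing act concrete, here is a minimal Python sketch of the two opposing blur sources. It relies on two textbook approximations, not on Varjo's actual optical model: the small-angle thin-lens rule that defocus blur grows with aperture times defocus in diopters, and the Rayleigh criterion for diffraction. The function names, the 550 nm wavelength and any numbers you feed in are illustrative assumptions.

```python
import math


def defocus_blur_deg(aperture_m, focus_m, object_m):
    """Angular blur-circle diameter (degrees) of an out-of-focus point.

    Small-angle thin-lens approximation:
    blur [rad] ~ aperture [m] * |1/object - 1/focus| [diopters].
    """
    defocus_diopters = abs(1.0 / object_m - 1.0 / focus_m)
    return math.degrees(aperture_m * defocus_diopters)


def diffraction_blur_deg(aperture_m, wavelength_m=550e-9):
    """Smallest resolvable angle (degrees) by the Rayleigh criterion:
    theta [rad] ~ 1.22 * wavelength / aperture. 550 nm is assumed (green light).
    """
    return math.degrees(1.22 * wavelength_m / aperture_m)


def worst_case_blur_deg(aperture_m, focus_m, near_m, far_m=math.inf):
    """Largest blur over a fixed-focus operating range [near_m, far_m]."""
    return max(
        defocus_blur_deg(aperture_m, focus_m, near_m),
        defocus_blur_deg(aperture_m, focus_m, far_m),
        diffraction_blur_deg(aperture_m),
    )
```

Calling `worst_case_blur_deg` over a range of aperture sizes shows the squeeze: widening the aperture increases defocus blur at the ends of the range, while narrowing it increases diffraction blur and, outside this model, reduces the light reaching the sensor.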
For similar reasons, mobile phones have long since transitioned from fixed-focus to variable-focus cameras. Indeed, even the human eye uses variable-focus optics. Varjo is the first company to bring a variable-focus camera pass-through system to the XR market.
Achieving the industry-first gaze-driven autofocus system
Variable-focus cameras optimize PPD and the amount of light reaching the sensor at the expense of DoF, allowing only a small distance range to be in focus at a time. Now the problem becomes: how do we focus at the right distance, and how do we do it quickly enough?
You might be familiar with how e.g. mobile phones focus: You tap an object on the screen and the camera evaluates different focusing distances and chooses the one that extracts the highest frequencies of the object. This is not good enough for XR HMDs.
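The tap-to-focus behaviour described above is commonly called contrast-detection autofocus. The sketch below is a toy Python version under stated assumptions: `simulated_capture` is a hypothetical stand-in for a real camera (a step edge that is box-blurred in proportion to defocus), and `sharpness` is a crude measure of high-frequency energy. It is not how any particular phone implements focusing.

```python
def sharpness(row):
    """Sum of squared differences between neighbouring pixels:
    a crude measure of high-frequency content."""
    return sum((b - a) ** 2 for a, b in zip(row, row[1:]))


def simulated_capture(focus_m, object_m, width=64):
    """Hypothetical camera: one image row showing a step edge at object_m,
    box-blurred with a radius proportional to the defocus in diopters."""
    edge = [0.0] * (width // 2) + [1.0] * (width // 2)
    radius = round(20 * abs(1.0 / focus_m - 1.0 / object_m))  # pixels, assumed scale
    if radius == 0:
        return edge
    blurred = []
    for i in range(width):
        lo, hi = max(0, i - radius), min(width, i + radius + 1)
        window = edge[lo:hi]
        blurred.append(sum(window) / len(window))
    return blurred


def contrast_autofocus(capture, candidates_m):
    """Try every candidate focus distance and keep the sharpest one."""
    return max(candidates_m, key=lambda d: sharpness(capture(d)))


# Usage: an object at 0.5 m, five candidate focus distances.
best = contrast_autofocus(
    lambda d: simulated_capture(d, object_m=0.5),
    [0.2, 0.3, 0.5, 1.0, 2.0],
)
```

Note that the sweep needs one capture per candidate distance before it can decide, so the image visibly hunts through out-of-focus states on the way.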
XR-4 Focal Edition has carefully calibrated optics and very fast focus actuators (less than a millisecond from end to end). Most importantly, we have developed a novel autofocus system that mimics the human eye.
It works by tracking your gaze at 200 Hz and, together with our advanced LiDAR depth sensor, settles at the correct focus distance quicker than the human eye itself could. The end result is as natural as it can be: wherever you look, you see an in-focus image, and focusing is so quick that you never see it change. It’s as if you weren’t looking through cameras in the first place.
Our system can disambiguate difficult cases, such as looking at or between your fingers, by choosing LiDAR depth samples around your gaze location that correlate with the gaze convergence distance. One might worry that objects outside the fixation point may not be in focus. The human eye does not have the resolution to notice this outside the fovea, and because XR-4 Focal Edition’s aperture is roughly the size of a typical iris, this aspect, too, works just like the human eye.
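One way to picture the finger example is the Python sketch below. It is purely illustrative and not Varjo's implementation: the function names, the sample format `(x_deg, y_deg, depth_m)`, the 2-degree search radius and the 0.5-diopter tolerance are all assumptions. The idea is to estimate a rough convergence distance from the two eyes' gaze directions, then keep only the nearby depth samples that agree with it.

```python
import math
from statistics import median


def vergence_distance_m(ipd_m, left_yaw_rad, right_yaw_rad):
    """Distance at which the two gaze rays cross, assuming symmetric fixation.
    Yaw is positive toward the user's right, so converging eyes have
    left_yaw > right_yaw."""
    vergence = left_yaw_rad - right_yaw_rad
    if vergence <= 0:
        return math.inf  # parallel or diverging rays: treat as far away
    return (ipd_m / 2) / math.tan(vergence / 2)


def focus_distance_m(samples, gaze_xy_deg, vergence_m,
                     radius_deg=2.0, tolerance_diopters=0.5):
    """Pick a focus distance from depth samples near the gaze point.

    samples: iterable of (x_deg, y_deg, depth_m) in view-angle coordinates.
    Samples whose depth agrees with the vergence estimate (compared in
    diopters) win; otherwise fall back to the closest match.
    """
    gx, gy = gaze_xy_deg
    nearby = [d for x, y, d in samples
              if math.hypot(x - gx, y - gy) <= radius_deg]
    if not nearby:
        return vergence_m
    target = 1.0 / vergence_m
    consistent = [d for d in nearby
                  if abs(1.0 / d - target) <= tolerance_diopters]
    if consistent:
        return median(consistent)
    return min(nearby, key=lambda d: abs(1.0 / d - target))


# Usage: fingers at 0.3 m and a wall at 2.0 m both fall near the gaze point.
samples = [(0.0, 0.0, 0.3), (0.5, 0.0, 0.3), (0.2, 0.3, 2.0), (-0.4, 0.1, 2.0)]
yaw = math.atan(0.032 / 2.0)  # eyes converged on the wall, 64 mm IPD assumed
looking_at_wall = focus_distance_m(
    samples, (0.0, 0.0), vergence_distance_m(0.064, yaw, -yaw))
```

Comparing in diopters rather than metres keeps the tolerance meaningful at both near and far distances, since focus error scales with reciprocal distance.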
The key benefits for mission-critical work
But how much does it matter to go from 30 PPD to 50 PPD?
I must admit I did not originally realize that reaching human eye resolution in video pass-through is even more important than it is in VR. In hindsight it’s obvious, but it caught me by surprise how almost everything around me in the physical world has been designed for the resolution of the human visual system, be it the font size in my favourite magazine, the resolution of my computer displays, or the size of the lettering on my keyboard. With any HMD other than the XR-4 Focal Edition, everything is frustratingly just not quite readable.
If you fall short of human eye resolution in XR, it forces you to bring what you are looking at closer to your eyes, effectively changing your natural behaviour. This is not acceptable in many use cases, and especially not in advanced training. The instruments you train with have been carefully optimized for human sight, and your head is at a certain location for a reason. You cannot train pilots to lean in to read instruments, because that is not something you would do in an actual situation. Fortunately, Varjo XR-4 Focal Edition sees the world like your eyes do.
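As a rough back-of-the-envelope check on the "bring it closer" effect, the Python sketch below computes how many pixels an object spans at a given PPD, and how close the same object has to be at a lower PPD to span the same number of pixels. The 3 mm lettering height and 0.7 m viewing distance are illustrative assumptions, not measurements from any cockpit or headset.

```python
import math


def pixels_across(size_m, distance_m, ppd):
    """Pixels spanned by an object of size_m seen at distance_m."""
    angle_deg = math.degrees(2 * math.atan(size_m / (2 * distance_m)))
    return angle_deg * ppd


def distance_for_pixels(size_m, pixels, ppd):
    """Viewing distance at which an object of size_m spans `pixels` pixels."""
    half_angle_rad = math.radians(pixels / ppd) / 2
    return size_m / (2 * math.tan(half_angle_rad))


# Assumed example: 3 mm instrument lettering viewed from 0.7 m.
px_at_50 = pixels_across(0.003, 0.7, ppd=50)
needed_distance_at_30 = distance_for_pixels(0.003, px_at_50, ppd=30)
```

For small objects the required distance scales almost linearly with PPD, so in this example 30 PPD means leaning in to roughly 60% of the original distance to see the lettering with the same pixel detail as at 50 PPD.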