Meta, formerly known as Facebook, is making significant strides in its metaverse venture by introducing highly detailed facial scans of individuals that can be integrated into virtual environments for interactive experiences.
This development comes on the heels of Meta’s unveiling of the Quest 3 headset, and Mark Zuckerberg himself recently showcased the company’s remarkable progress on “codec scans.” These scans enable the real-time generation of intricate facial animations, enhancing virtual conversations.
In a demonstration conducted through virtual reality, Zuckerberg engaged in an hour-long video podcast conversation with Lex Fridman, utilizing these scans to illustrate the technology’s capabilities.
Zuckerberg emphasized that the avatars traditionally employed in games and virtual reality often appear cartoonish and limited in their emotional expressions, typically relying on pre-rendered responses. In contrast, codec scans offer a degree of detail that, while computer-generated, achieves striking realism.
Codec scanning presents an intriguing alternative to video conferencing thanks to its low latency. Although the scanning process remains somewhat time-consuming, once completed it demands minimal processing power. While the technology is still evolving and primarily focused on facial scans, with some limitations in body representation, its potential applications in gaming, work, education, and social interactions are readily apparent.
Zuckerberg’s interview, accessible on YouTube, provides insight into Meta’s vision and aspirations concerning this technology. He explained, “Instead of our avatars being cartoony and transmitting a video, we’ve scanned ourselves with various expressions, creating a computer model of our faces and bodies. We collapse that into a codec. When you wear the headset, it captures your face and expressions, sending an encoded representation of your appearance over the network. In addition to being photorealistic, this approach is more bandwidth-efficient than transmitting full videos or immersive 3D scenes.”
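The bandwidth claim is easy to sanity-check with rough arithmetic. The sketch below compares the per-frame payload of sending a compact latent "expression code" versus an uncompressed video frame. All numbers here (the latent size, float32 values, 1080p RGB frames) are illustrative assumptions for the sake of the comparison, not Meta's actual figures.

```python
# Back-of-the-envelope comparison: per-frame payload of a learned
# "codec" representation versus an uncompressed video frame.
# The latent dimensionality and value width are assumptions.

def payload_bytes_video(width: int, height: int, bytes_per_pixel: int = 3) -> int:
    """Bytes for one uncompressed RGB video frame."""
    return width * height * bytes_per_pixel

def payload_bytes_codec(latent_dims: int, bytes_per_value: int = 4) -> int:
    """Bytes for one frame's latent expression code (float32 values)."""
    return latent_dims * bytes_per_value

video = payload_bytes_video(1920, 1080)  # one 1080p RGB frame: 6,220,800 B
codec = payload_bytes_codec(256)         # a 256-dim float32 code: 1,024 B
print(f"video: {video} B/frame, codec: {codec} B/frame, "
      f"ratio: {video // codec}x")
```

Even with generous video compression, a latent code several thousand times smaller than a raw frame leaves plenty of headroom, which is consistent with the efficiency Zuckerberg describes.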
The primary challenge lies in capturing the emotional depth of human beings, particularly through facial expressions and, notably, the eyes. Zuckerberg remarked, “There’s a certain realism that comes with delivering this photorealistic experience, delivering a sense of presence as if you’re there together, no matter where you actually are in the world.”
Before this technology becomes widely accessible, various obstacles must be overcome, with streamlining the scanning process being a top priority. Zuckerberg expressed his aim to simplify the process further, envisioning a scenario where “you just take your phone, wave it in front of your face for a couple of minutes, say a few sentences, make a bunch of expressions… so the whole process is just two to three minutes, producing results of the current quality.”
While Fridman commended the codec scans for avoiding the “uncanny valley” effect, Zuckerberg acknowledged the complexities of creating scans that seamlessly suit diverse individuals. “Different people emote to different extents,” he noted, emphasizing the need for personalized avatars that better express users’ emotions.
While the podcast setting was a dimly lit room highlighting the avatars, Zuckerberg underscored the importance of the broader environment and activities in real-world applications. He envisions codec scans being valuable for video calls but has grander aspirations for the metaverse, where users can engage in a variety of physical activities together. “Once you get mixed reality and augmented reality, we could have codec avatars and go into a meeting with some people physically present and others appearing in a photorealistic form superimposed on the physical environment. Stuff like that is going to be super powerful.”
Meta plans to gradually roll out this technology, with Zuckerberg stating, “We want to get more people scanned into the system and start integrating it into each of our apps. Something like this could make a big difference for remote meetings… It’s not ready to be a mainstream product yet, but we’ll keep refining it, incorporating more scans, and introducing additional features. In the next few years, we’ll likely see numerous experiences like this emerge.”