New virtual human technology released: Apple still has plenty of metaverse moves left

Mondo Anime Updated on 2024-01-31

As the year draws to a close, Vision Pro, the headset Apple unveiled half a year ago, is getting ever closer to its official launch.

Over the past six months, attention has focused on the new immersive experiences Vision Pro may bring, but hidden within it are also many explorations of virtual humans:

From Animoji, which pioneered consumer 3D facial motion capture back in 2017, to HUGS, released last week, which can generate a digital twin of a real person, Apple has been determined to carve out a different path for virtual humans.

And these technologies, accumulated over the years, are about to have their highlight moment on Vision Pro. It is fair to say that Apple may well give the metaverse a new lease on life and inject new possibilities into it.

Apple's newly released virtual human technology, HUGS (short for Human Gaussian Splats), is built on 3D Gaussian Splatting (3DGS) and the SMPL body model; by fusing these two techniques, it creates more vivid and realistic digital characters.

One of the key advantages of HUGS over traditional avatar generation techniques is its data efficiency: it needs only about 2 to 4 seconds of monocular video (50-100 frames) to produce a complete digital avatar, greatly reducing the raw material required to create one.
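To make the architecture concrete, below is a minimal Swift sketch of the initialization idea described in the HUGS paper: anchor one 3D Gaussian to each vertex of the SMPL body template, then let optimization against the input video refine the parameters. The type and function names, and all initial values, are illustrative assumptions, not Apple's code.

```swift
import simd

// One 3D Gaussian primitive, as used in 3D Gaussian Splatting:
// a position, an anisotropic scale, an orientation, an opacity, and a color.
struct Gaussian3D {
    var mean: SIMD3<Float>       // center of the Gaussian in body space
    var scale: SIMD3<Float>      // per-axis extent of the ellipsoid
    var rotation: simd_quatf     // orientation of the covariance ellipsoid
    var opacity: Float           // alpha used during splatting
    var color: SIMD3<Float>      // RGB, refined from the input video
}

// Anchor one Gaussian to each vertex of an SMPL-style body template.
// Training then refines every parameter against the captured frames.
func initializeGaussians(smplVertices: [SIMD3<Float>]) -> [Gaussian3D] {
    smplVertices.map { vertex in
        Gaussian3D(
            mean: vertex,
            scale: SIMD3<Float>(repeating: 0.01),  // small initial footprint
            rotation: simd_quatf(angle: 0, axis: SIMD3<Float>(0, 1, 0)),
            opacity: 0.5,                          // refined during training
            color: SIMD3<Float>(repeating: 0.5)    // gray before optimization
        )
    }
}
```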

Of course, detail capture still has its limits, but HUGS can algorithmically fill in elements that were not captured, safeguarding the overall quality of the digital clone.

In addition, generation speed is another advantage of HUGS: according to the paper Apple released, HUGS can finish generating a digital human in about 30 minutes, roughly 100 times faster than comparable approaches currently available.

Beyond needing less material and generating faster, HUGS also markedly improves rendering quality and speed: it achieves high-quality rendering at 60 fps while handling the hard problems of dynamic scenes, such as avoiding artifacts and keeping motion coherent during animation.
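On the animation side, Gaussians attached to an SMPL body are typically deformed with linear blend skinning (LBS), and the HUGS paper describes deforming its Gaussians this way, with skinning weights that are themselves optimized. The sketch below shows plain LBS applied to a Gaussian center; it is a simplified illustration of the standard technique, not Apple's implementation.

```swift
import simd

// Linear blend skinning: a point is transformed by a weighted blend of
// bone transforms. weights[i] says how strongly bone i influences the
// point; the weights are assumed to sum to 1.
func skinPosition(
    _ position: SIMD3<Float>,
    weights: [Float],
    boneTransforms: [simd_float4x4]
) -> SIMD3<Float> {
    // Accumulate the weighted bone matrices (default init is the zero matrix).
    var blended = simd_float4x4()
    for (weight, transform) in zip(weights, boneTransforms) {
        blended += weight * transform
    }
    let posed = blended * SIMD4<Float>(position.x, position.y, position.z, 1)
    return SIMD3<Float>(posed.x, posed.y, posed.z)
}
```

Because every Gaussian rides on the skeleton this way, a new pose moves the whole splat set coherently instead of tearing the surface apart, which is what keeps animation free of the artifacts mentioned above.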

While some of the motion design looks a little uncanny, the demo video shows that digital humans generated with HUGS are already close to the finished article in terms of motion fluidity and character realism.

It also heralds a significant step forward for Apple in digital rendering, particularly in the ability to create and render human digital twins in dynamic scenes. With Vision Pro, it should therefore be possible to quickly generate digital humans with these techniques and build a variety of applications on top of them.

Beyond the potential for combination with Vision Pro, these qualities also make HUGS a valuable tool in filmmaking, game development, virtual reality, and more, especially in scenes that demand fast, high-quality rendering of dynamic human characters.

With HUGS technology, creators and developers have more freedom to compose novel poses and views, opening up new possibilities for digital creation.

Of course, this breakthrough did not happen overnight: Apple's many explorations of virtual human technology in recent years have laid the cornerstone for the picture Vision Pro is painting.

Looking back at Apple's moves around virtual humans, Animoji in 2017 was undoubtedly a key step.

Animoji debuted alongside the iPhone X at Apple's fall 2017 event, showcasing a new way to interact at the time.

This technology accurately captures the user's facial movements (mouth, eyebrows, eyes) through the iPhone's front-facing TrueDepth camera system and maps these expressions in real time onto animated characters such as a unicorn, robot, or owl. Users can pick a character and record and send animated voice messages that accurately mirror their expressions and speech.
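The capture side of this pipeline is also exposed to third-party developers through ARKit's face tracking, which reports the facial movements described above as named blend-shape coefficients in the 0...1 range. A minimal sketch of reading those coefficients and forwarding them to a character rig (the rig-driving part is left as a stub):

```swift
import ARKit

// Reads facial expression coefficients from the TrueDepth camera via ARKit.
// Each blend shape (jaw open, brow raise, smile, ...) arrives as a 0...1
// value that can drive the matching control of an animated character.
final class FaceCaptureDelegate: NSObject, ARSessionDelegate {
    func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
        for case let faceAnchor as ARFaceAnchor in anchors {
            let shapes = faceAnchor.blendShapes
            let jawOpen   = shapes[.jawOpen]?.floatValue ?? 0
            let smileLeft = shapes[.mouthSmileLeft]?.floatValue ?? 0
            let browUp    = shapes[.browInnerUp]?.floatValue ?? 0
            driveAvatar(jawOpen: jawOpen, smile: smileLeft, browRaise: browUp)
        }
    }

    // Placeholder: apply the coefficients to your character rig,
    // e.g. as morph-target weights on a SceneKit or RealityKit model.
    private func driveAvatar(jawOpen: Float, smile: Float, browRaise: Float) {
    }
}

// Usage: face tracking requires a TrueDepth-equipped device.
let delegate = FaceCaptureDelegate()
let session = ARSession()
session.delegate = delegate
session.run(ARFaceTrackingConfiguration())
```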

And the rest of the story is familiar: as the technology advanced, Animoji found its way into a wider range of scenarios, such as social networking and content production. Meanwhile, similar products sprang up everywhere, and the craze for generating one's own avatar through facial capture repeats itself every so often.

At the same time, Apple has pushed its exploration of virtual human technology further over the years, and many of the technical details carry traces of Animoji.

According to current reports, Apple has users scan their 3D face data in advance on Vision Pro to generate a 3D-modeled, rendered version of themselves, a virtual human that is close to a one-to-one likeness. And to make that virtual person more lifelike, Apple will use a technology called "emotion recognition".

The technology is designed to analyze the user's facial expressions and emotions through the camera. According to the patent, the system needs facial recognition to identify the user in order to offer customized behavior, and the technology actually traces back to Apple's earlier work.

It was originally developed for Siri. To reduce the number of misunderstood voice requests, Apple tried to improve accuracy by analyzing user sentiment. In an early patent application, Apple described a new way to help Siri interpret requests by adding facial analysis to future versions of Siri or other systems.
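Purely as an illustration (the patent does not disclose an algorithm, and every name, threshold, and rule below is invented for this sketch), using expression signals to decide whether an assistant misheard a request might look like this:

```swift
// Hypothetical sketch: if the user looks confused or frustrated right after
// the assistant responds, treat the interpretation as likely wrong and
// offer an alternative. All types, thresholds, and logic are illustrative.
enum Emotion {
    case neutral, happy, confused, frustrated
}

// Classify a coarse emotion from expression coefficients (0...1),
// such as the blend-shape values in the previous example.
func inferEmotion(browInnerUp: Float, mouthSmile: Float, mouthFrown: Float) -> Emotion {
    if mouthSmile > 0.5 { return .happy }
    if browInnerUp > 0.6 && mouthFrown > 0.3 { return .frustrated }
    if browInnerUp > 0.6 { return .confused }
    return .neutral
}

// Use the signal to decide whether to fall back to the next-best parse.
func shouldOfferAlternative(after emotion: Emotion) -> Bool {
    emotion == .confused || emotion == .frustrated
}
```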

With the arrival of Vision Pro, the emotion recognition groundwork laid over the years will undoubtedly see far greater use.

In addition to its efforts to build more realistic virtual humans, Apple has also begun to explore their potential application scenarios.

Apple's recent metaverse shopping patent describes an innovative technology that aims to revolutionize how shopping is done: in a virtual environment, users can interact directly with a variety of virtual products, just as they would in real life.

The technology uses computer-generated imagery (CGI) and computer-generated reality (CGR) to take the digital retail experience to the next level, letting users interact with remote salespeople in real time through virtual communication sessions.

For example, a user can take a virtual smartphone out of a virtual TV and try all of its features, with interactions that make the virtual object feel as real as a physical one. The patent also proposes application scenarios in VR environments, including virtual retail stores, virtual tables, and product displays; these are not limited to real-world settings and can also include virtual locations such as historical sites or fictional scenes.

With this patent, Apple aims to address the lack of instant feedback and interactivity in online shopping, giving users a face-to-face-like shopping experience at home or in any remote setting. Users can start a retail experience in a CGR environment with simple gestures or walk through an interactive virtual product demo, while salespeople remotely manipulate the product to highlight its features and functionality.

In the near future, Vision Pro users will be able to experience highly realistic virtual humans and interact with them in more immersive environments, opening up new experiences in entertainment, education, and remote communication.

At this point, Apple's virtual human path is clear: take realistic digital avatars as the main direction and, by refining motion, detail, emotion, and scene, bring the digital human infinitely close to reality.

Unlike today's hyper-realistic virtual idols or cartoon-style virtual humans, the path Apple has chosen is closer to building a true digital clone out of real-world material.

Although today's realistic virtual humans can look very convincing, they are mostly applied on 2D surfaces, that is, on phone and computer screens. The content they generate is mainly talking-head broadcasts: the camera is basically confined to the upper body, facial movement dominates, and the rest of the body barely moves.

In the era of spatial computing that Vision Pro is about to open, the demands on realistic digital humans, and on their full-body movement, will reach unprecedented heights.

Previously, the overly childish look of the cartoon avatars in Meta's Horizon Worlds, and the uncanny feeling of virtual characters that showed only an upper body, became two major obstacles in its development:

On the one hand, a childish image inevitably makes the virtual space feel like a toy, hindering its expansion into productivity; on the other hand, the weirdness of upper-body-only characters easily ruins the immersive experience.

Clearly, the virtual humans, and indeed the virtual world, in Apple's conception must aim to be infinitely close to the real one.

The spatial video recording feature that officially debuted with the iOS 17.2 update is another key step toward this goal. Although it can currently record only at 1080p and 30 fps, early hands-on feedback suggests it already achieves a sense of immersion close to the real world.

Admittedly, the 30 minutes HUGS needs to generate a digital human is still a little long; how well emotion recognition works in practice, and whether scenarios like shopping live up to expectations, will only become clear after Vision Pro goes on sale; and spatial video's 1080p ceiling remains some distance from the 8K or even 16K content standards a headset demands. Even so, the puzzle pieces of a virtual world built from real digital humans and real scenes keep falling into place.

Once the relevant technologies improve further and converge, a virtual world with real scenes and lifelike characters will only be a matter of time, and it is closer than we imagine.
