The insane technology behind Apple Vision Pro
Apple has always remained tight-lipped about the Metaverse and extended reality, unlike Mark Zuckerberg, who has openly discussed and advocated these concepts. However, if we delve into Apple's history of patents and acquisitions, it becomes apparent that Apple has been quietly developing its mixed-reality headset, 'Vision Pro', for quite a few years now. Developing a technology of this magnitude doesn't happen overnight; it requires meticulous planning and strategic execution.
The design of Vision Pro showcases Apple's exceptional expertise in chip design, sensors, and seamless software integration.
During the roughly 45-minute Vision Pro segment of the recent WWDC keynote, Apple showcased their exemplary storytelling skills, effectively convincing the audience that Vision Pro is equipped with groundbreaking technology that justifies its exorbitant price tag. Journalists who have had early access to the device rave about its unparalleled performance, surpassing anything currently available in its class. Now, the question arises: how far ahead is the technology of Vision Pro compared to its competition? Are we talking about a slight edge, or will it take years for others to catch up? Let's take a deep dive into this matter.
Display:
Let's dive into the most crucial aspect of the device: the screen. According to Apple, the primary display of Vision Pro boasts an impressive 23 million pixels spread across two panels, each roughly the size of a postage stamp. That is more pixels than a 4K TV for each eye. Powering this remarkable display is a Micro-OLED 'Apple silicon' backplane that manages to fit 64 pixels within the space of a single pixel on an iPhone screen. These pixels are incredibly tiny, measuring just 7.5 microns in width. Apple also revealed that the Vision Pro display supports a 90Hz refresh rate, along with a special 96Hz mode designed for content created at 24 frames per second. While these specifications sound impressive, they do not provide direct comparisons with existing market offerings.
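One small detail worth unpacking: why a dedicated 96Hz mode at all? 24fps film content divides evenly into 96Hz but not into 90Hz, so each film frame can be held for exactly four refresh cycles rather than an uneven cadence that produces judder. A quick back-of-the-envelope check (plain Python, just arithmetic):

```python
# Why a 96 Hz mode matters for 24 fps film content: each source frame should be
# displayed for a whole number of refresh cycles, otherwise frame durations
# alternate and motion judders (the classic 3:2-pulldown problem).
for refresh_hz in (90, 96):
    cycles_per_frame = refresh_hz / 24
    print(f"{refresh_hz} Hz / 24 fps = {cycles_per_frame:g} refresh cycles per film frame")
# 90 Hz -> 3.75 cycles (uneven cadence), 96 Hz -> exactly 4 cycles (smooth cadence)
```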
You may have heard about OLED derivatives in the past, but Micro-OLED (μOLED), also known as OLED on Silicon (OLEDoS), is a comparatively new technology that involves depositing minuscule organic light-emitting diodes directly onto a silicon wafer. Unlike conventional OLED screens, which typically top out at a density of 500-600 pixels per inch (PPI), first-generation μOLED screens can readily reach densities of 3,000-4,000 PPI. Depending on the design, these screens can deliver high luminance in the range of roughly 3,000 to 15,000 nits, effectively infinite contrast, rapid response times of 0.01 milliseconds or less, and longer emitter lifetimes than conventional OLEDs. However, their physical size is usually limited to less than an inch. Currently, μOLED screens are primarily used in the electronic viewfinders of mirrorless cameras and are slowly being introduced in extended reality (XR) devices.
In contrast to traditional OLED screens, μOLED backplanes are manufactured in semiconductor foundries rather than display fabs. It is rumored that the Vision Pro's μOLED system was designed by eMagin, a company that has since been acquired by Samsung Display. The backplanes of Vision Pro are reportedly manufactured by TSMC using legacy nodes, possibly around 28nm or larger. TSMC already mass-produces chips on 5nm-class nodes and is ramping even smaller 3nm-class nodes. Just imagine the insane pixel densities that could be achieved in the future! This also opens up the possibility for other companies to introduce more affordable μOLED devices with better-quality screens. Sony is reported to have designed the frontplane of the Vision Pro system, including OLED deposition, encapsulation, post-processing, and driving circuitry. According to Ross Young, CEO of Display Supply Chain Consultants, the μOLED display of Vision Pro measures approximately 1.41 inches diagonally, with each pixel measuring 7.5 microns, similar to the diameter of a human red blood cell. This corresponds to a pixel density of roughly 3,400 PPI and an approximate resolution of 3800 x 3000 per eye. So far, it appears to be the highest-resolution μOLED screen on the market.
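Those figures are internally consistent, which is reassuring. Here is a small sanity check (note that the 3800 x 3000 per-eye resolution is an analyst estimate, not an Apple-confirmed spec):

```python
# Sanity-checking the reported Vision Pro panel figures from the 7.5 micron pixel pitch.
import math

pitch_um = 7.5                       # reported pixel pitch in microns
ppi = 25_400 / pitch_um              # 25,400 microns per inch
print(f"Pixel density: {ppi:.0f} PPI")           # ~3387 PPI, i.e. "roughly 3,400 PPI"

res_h, res_v = 3800, 3000            # estimated per-eye resolution (Ross Young)
width_in = res_h * pitch_um / 25_400
height_in = res_v * pitch_um / 25_400
print(f"Panel: {width_in:.2f} x {height_in:.2f} inches "
      f"({math.hypot(width_in, height_in):.2f} inch diagonal)")  # ~1.12 x 0.89, ~1.43" diagonal
print(f"Pixels: {res_h * res_v / 1e6:.1f} M per eye, "
      f"~{2 * res_h * res_v / 1e6:.0f} M total")                 # ~11.4 M per eye, ~23 M total
```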
When evaluating Extended Reality (XR) devices, resolution alone doesn't provide the complete picture. Field of View (FOV) is another crucial factor that determines the observable area through the eyes or an optical device like a camera or AR/VR headset. A single human eye has a horizontal FOV of approximately 135°; with both eyes together, the total horizontal FOV extends to slightly over 180°, of which roughly 114° is the binocular overlap seen by both eyes at once, the region essential for depth perception. Another important measure is Pixels-Per-Degree (PPD), which quantifies the number of pixels per degree of viewing angle. Retinal resolution refers to the point where a person with 20/20 vision can no longer distinguish individual pixels in an image. For extended reality, the gold standard is around 60 PPD at the fovea: if an image of 3600 pixels (60 x 60) falls within a 1° x 1° area of the fovea, a person would be unable to discern higher-resolution details. Beyond 60 PPD, additional resolution doesn't meaningfully improve the visual experience.
Although Apple hasn't disclosed the Field of View (FOV) of the Vision Pro headset, journalists who have tried the device say it is competitive with other AR/VR headsets offering a FOV in the range of 100 to 120 degrees. Combined with the estimated ~3800 horizontal pixels per eye, that works out to an average of roughly 32 to 38 pixels per degree, with the effective density at the center of the lens likely higher; impressive for a consumer headset, though still short of the 60 PPD retinal benchmark, as the quick calculation below shows.
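The arithmetic behind that estimate (it ignores lens distortion, which concentrates pixels towards the center of the view, so treat it as a rough average rather than a peak figure):

```python
# Rough pixels-per-degree estimate from the figures above -- both the resolution
# and the FOV are estimates, not official Apple specifications.
res_h = 3800                    # estimated horizontal pixels per eye
for fov_deg in (100, 110, 120):
    ppd = res_h / fov_deg
    print(f"FOV {fov_deg} deg -> {ppd:.0f} PPD "
          f"({ppd / 60:.0%} of the ~60 PPD retinal benchmark)")
# For true retinal resolution across a 100 deg FOV you would need about
# 60 PPD * 100 deg = 6000 horizontal pixels per eye -- a target for future panels.
```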
Now, let's examine the existing options in the market and see where the Vision Pro display stands. Meta Quest Pro, the top-tier headset from Meta, offers a resolution of 1800 x 1920 pixels per eye at 90Hz, with a FOV of 106° and about 22 PPD, using conventional LCD panels. The PlayStation VR2 features 2000 x 2040 pixels per eye at 120Hz, with a FOV of 110° and about 18 PPD, on OLED panels. Bigscreen Beyond delivers 2560 x 2560 pixels per eye at 90Hz, with a FOV of 102° and about 32 PPD, on μOLED panels. In terms of PPD, Varjo is the sole company offering a bionic display that combines a central μOLED panel at 1920 x 1920 per eye and 70 PPD with a peripheral LCD panel at 2880 x 2720 per eye and 30 PPD.
Thus, while the Vision Pro's display system may not be unbeatable, it currently stands as the best option available in the market, offering exceptional performance and visuals.
Processor:
Apple positions Vision Pro as a platform for "spatial computing", aiming to provide users with an immersive spatial experience. To achieve this, Apple has opted for a distinctive dual-chip design. The first chip, the M2, runs visionOS, executes advanced computer-vision algorithms, and delivers the graphics. Notably, this is the same M2 chip that powers Apple's Mac computers. Alongside the M2 sits a brand-new chip called R1, which has a single purpose: processing input from the cameras, sensors, and microphones and streaming images to the displays within just 12 milliseconds. To put that into perspective, it's about 8 times faster than the blink of an eye!
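Two quick numbers help put that 12 ms figure in context (the ~100 ms blink duration is a common approximation, not an Apple spec):

```python
# Putting the 12 ms sensor-to-display latency in perspective.
latency_ms = 12
blink_ms = 100                     # a human blink takes roughly 100-150 ms
frame_budget_ms = 1000 / 90        # time between refreshes at 90 Hz

print(f"Blink vs pipeline: ~{blink_ms / latency_ms:.0f}x faster")   # ~8x, Apple's comparison
print(f"Frame budget at 90 Hz: {frame_budget_ms:.1f} ms")           # ~11.1 ms
# In other words, the R1 turns fresh sensor data into displayed pixels in roughly
# the time it takes to draw a single 90 Hz frame.
```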
But why did Apple choose to include this dedicated chip in the first place? The simple answer is to ensure users can experience the Vision Pro the way Apple intended. By incorporating the R1 chip, Apple aims to optimize the overall user experience and deliver seamless performance by fine-tuning the processing of input data from various sources. This meticulous attention to detail aligns with Apple's commitment to providing a cohesive and refined user experience across their products and services.
The Vision Pro boasts an impressive array of sensors: 12 cameras, 5 other sensors, and 6 microphones. Apple made two significant decisions in the design of the Vision Pro headset. Firstly, they prioritized augmented reality within the mixed-reality experience. This means that users are always aware of their real environment, with the option to immerse themselves in a virtual one; the headset overlays digital content onto the user's real-world surroundings. The two front cameras play a vital role in this by capturing the surrounding environment in real time and transmitting over a billion pixels per second to the displays, ensuring a clear depiction of the world around the user. Additionally, the LiDAR scanner and the TrueDepth camera work together to create a fused 3D map of the surroundings. This enables Vision Pro to render digital content accurately in real time within the space around the user, eliminating the need to re-map the room when moving from one space to another. The R1 chip ensures that the data from these sensors is passed on to the displays or the M2 processor near-instantaneously, resulting in the exceptional pass-through experience, characterized by low latency and high image quality, that most journalists have noted.
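To make the role of that fused depth map concrete, here is a deliberately minimal sketch of depth-aware compositing: virtual content is only drawn where it is closer to the viewer than the real world. This illustrates the general technique, not Apple's implementation, and all names in it are hypothetical:

```python
# Depth-aware passthrough compositing, in miniature: a virtual object is drawn
# only where it sits in front of the real world, using a per-pixel depth map of
# the kind a LiDAR/stereo fusion would provide. Illustrative only.
import numpy as np

def composite(camera_rgb, real_depth_m, virtual_rgb, virtual_depth_m):
    """Overlay virtual content on the passthrough feed with per-pixel occlusion."""
    in_front = virtual_depth_m < real_depth_m          # True where virtual content is closer
    return np.where(in_front[..., None], virtual_rgb, camera_rgb)

# Toy 2x2-pixel frame: the white virtual object (1.0 m away) is hidden behind a
# real surface at 0.5 m in the top-left pixel and visible against the 3.0 m wall elsewhere.
camera  = np.zeros((2, 2, 3), dtype=np.uint8)          # black passthrough frame
virtual = np.full((2, 2, 3), 255, dtype=np.uint8)      # white virtual object
real_d  = np.array([[0.5, 3.0], [3.0, 3.0]])
virt_d  = np.full((2, 2), 1.0)
print(composite(camera, real_d, virtual, virt_d)[..., 0])
# [[  0 255]
#  [255 255]]
```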
The second important decision Apple made was to forgo the external controllers typically bundled with AR/VR headsets. Instead, Apple opted for a more natural interaction method using the user's eyes, hands, and voice to navigate the interface. The eye gaze acts as a pointer, allowing users to target UI elements simply by looking at them.
In 2017, Apple acquired SensoMotoric Instruments, a German firm known for its technology capable of real-time tracking and recording of a wearer's gaze at a rate of 120 times per second. This technology, potentially utilized in the Vision Pro, reduces time lag and helps mitigate motion sickness by ensuring that the perception of movement aligns with the shift in perspective. Users who have experienced the Vision Pro's eye tracking have expressed that it feels almost as if the device can read their thoughts. Hand tracking is used for clicking and interacting with the UI. The headset incorporates 4 downward cameras, 2 side cameras, and IR cameras that track hand and finger movements, even in dark environments, without requiring users to hold their hands in front of the device.
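Conceptually, the interaction model is simple: the gaze ray determines what is hovered, and a pinch confirms the selection. The sketch below illustrates that loop with made-up types and values; it is not the visionOS API, just the general idea:

```python
# Gaze-plus-pinch selection, conceptually: gaze hovers, pinch clicks.
# Hypothetical names and values throughout -- not the visionOS API.
from dataclasses import dataclass

@dataclass
class UIElement:
    name: str
    center: tuple[float, float]   # angular position in the view, degrees
    radius: float                 # angular size, degrees

def gaze_target(gaze_deg, elements):
    """Return the element whose angular bounds contain the current gaze direction."""
    gx, gy = gaze_deg
    for el in elements:
        cx, cy = el.center
        if (gx - cx) ** 2 + (gy - cy) ** 2 <= el.radius ** 2:
            return el
    return None

elements = [UIElement("Close", (-10.0, 5.0), 1.5), UIElement("Play", (0.0, 0.0), 2.0)]

hovered = gaze_target((0.4, -0.3), elements)   # eye tracker output: degrees off-axis
pinch_detected = True                          # hand tracker saw a thumb-index pinch
if hovered and pinch_detected:
    print(f"Activate: {hovered.name}")         # -> Activate: Play
```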
All of this reflects Apple's focus on a human-centric approach to interaction.
Spatial Audio:
Apple introduced spatial audio technology at WWDC 2020. Essentially, it's Apple’s take on Dolby Atmos. This technology aims to provide users with surround sound and 3D audio through a pair of speakers. However, what sets the Vision Pro apart is its unique capability known as "audio ray tracing." In general, ray tracing involves the rendering of lights and shadows in computer-generated visuals, such as games, by simulating and tracking the path of every ray of light as it interacts with objects in the virtual scene, reflecting, refracting, transmitting, and absorbing accordingly.
In 2021, Apple was granted a patent for "Auralization," which relates to the simulation of sound propagation in virtual environments. Auralization involves synthesizing sound stimuli that mimic the realistic behavior of sound waves within enclosed spaces using methods of Geometrical Acoustics (GA). In Geometrical Acoustics, sound is modeled as rays that propagate in straight lines through the air and change direction whenever they encounter a surface. This can be thought of as "audio ray tracing."
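For a feel of what "sound modeled as rays" means in practice, here is a toy sketch of the image-source method for early reflections in a rectangular room, one of the classic Geometrical Acoustics building blocks. Apple hasn't published how its auralization pipeline actually works, so this is illustration only:

```python
# Toy Geometrical Acoustics: first-order early reflections in a rectangular
# ("shoebox") room via the image-source method. Each wall reflection is modeled
# as sound arriving from a mirrored copy of the source. Textbook technique,
# shown only to illustrate the idea -- not Apple's auralization pipeline.
import math

SPEED_OF_SOUND = 343.0  # m/s

def early_reflections(src, listener, room, absorption=0.3):
    """Return (delay_ms, relative_level) for the direct path and 6 first-order reflections."""
    # Mirror the source across each of the 6 walls to get image sources.
    images = [src]
    for axis, size in enumerate(room):
        for wall in (0.0, size):
            img = list(src)
            img[axis] = 2 * wall - src[axis]
            images.append(tuple(img))

    arrivals = []
    for i, img in enumerate(images):
        dist = math.dist(img, listener)
        delay_ms = 1000 * dist / SPEED_OF_SOUND
        level = (1.0 / dist) * (1.0 if i == 0 else (1.0 - absorption))  # spreading + wall loss
        arrivals.append((delay_ms, level))
    return sorted(arrivals)

# 6 m x 4 m x 3 m room, source and listener a couple of metres apart.
for delay, level in early_reflections((2.0, 2.0, 1.5), (4.0, 2.5, 1.5), (6.0, 4.0, 3.0)):
    print(f"{delay:6.1f} ms  level {level:.2f}")
```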
Given that users of the Vision Pro headset can choose between augmented and virtual reality experiences (or a combination of both), the inclusion of audio ray tracing allows for a convincing recreation of how sound should behave in each space. For instance, a song playing in the augmented environment of your living room will sound different if you choose to be in a virtual environment such as a cathedral. Apple claims that the Vision Pro can analyze the acoustic properties of the user's surroundings, including the physical materials present, further enhancing the audio experience.
While Apple did not explicitly focus on the gaming capabilities of the Vision Pro, this audio ray tracing technology has the potential to significantly enhance the gaming experience. It opens up possibilities for immersive and realistic audio effects within virtual gaming environments, offering users an unprecedented level of audio fidelity and immersion.
Miscellaneous features:
Apple being Apple, they always go the extra mile in everything they do. With typical virtual reality (VR) devices, users are fully immersed in a virtual world, often losing touch with the real world around them. While this may be suitable for certain use cases like gaming or watching movies, for many other scenarios the prolonged disconnection from reality can take a toll on one's well-being. Apple's primary goal with Vision Pro is to create an augmented reality (AR) device that seamlessly blends the digital and real worlds, offering users the flexibility to choose the level of immersion based on the activity at hand, such as gaming or movie watching. They also emphasize the importance of maintaining interaction with the real people around you.
To further this goal, Apple has included an external screen on Vision Pro, called EyeSight, that displays the user's eyes to those around them. While this feature aims to enhance the sense of connection and enable more natural social interactions, it can feel counterintuitive from a psychological standpoint. Personally, when encountering someone wearing a device over their eyes, even if it displays their digital eyes in real time, there is a natural inclination to feel a certain distance or detachment. It's akin to trying to hold a conversation with someone wearing earphones in transparency mode: although the person may hear us perfectly, there's a subconscious impression that they may not be fully engaged or interested in the interaction.
This raises an interesting point about the balance between technological immersion and maintaining genuine human connection. While Apple's intention is undoubtedly to enhance the social aspect of AR experiences, it remains to be seen how users will perceive and adapt to this feature in practice. The impact on social dynamics and the ability to foster meaningful connections with others while wearing Vision Pro will likely be a subject of ongoing exploration and refinement.
Conclusion:
Apple Vision Pro stands as a true marvel of technology. It is a testament to Apple's unwavering dedication and meticulous planning over the course of several years, drawing upon their vast expertise developed over decades. Undoubtedly, no competitor currently possesses the capability to come close to what Apple has achieved with this groundbreaking device, and bridging that gap in the future may prove to be a formidable challenge for others, considering Apple's commitment to constant improvement.
However, while the Vision Pro showcases impressive technological advancements, it is important to recognize that technology alone does not guarantee a successful product. The key factor lies in creating a human-centric experience that addresses real-life needs and fills existing gaps, whether they are openly acknowledged or subconsciously felt. This is where my ambivalence sets in.
Apple has undoubtedly put their best foot forward with the Vision Pro, pushing the boundaries of what is possible. Yet, I remain unconvinced that it fully addresses a significant gap in the market. It is crucial to consider how well the device caters to the needs and desires of its users, ensuring that it seamlessly integrates into their lives and enhances their experiences.
I am eager to hear your thoughts on this matter.
#Apple #VisionPro #AppleAR #AugmentedReality #SpatialComputing #Technology #HumanCentricDesign #Analytics
Disclaimer: The information presented in this article is based on research and sources available on the internet. While efforts have been made to ensure the accuracy and reliability of the information, the author does not guarantee its completeness or correctness, except for the information directly provided by the manufacturer. Readers are advised to cross-reference the information with official sources and exercise their own judgment before making any decisions or taking actions based on the content of this article.