From “Real World” to “Metaverse”

A miniseries in 6 episodes about Data, Technology & Business in the Metaverse


Part 3 – Creating Artificial Worlds

Welcome back. Great to have you here. This is the 3rd part of the miniseries. In the first episode, we looked at the definition of “Metaverse”, and in the second episode, the topic was capturing 3d worlds. If you have missed one of those parts, you can find it in the following list of episodes.

Episodes:

Part 1 – Introduction to Metaverse

Part 2 – Capturing 3d Worlds

Part 3 – Creating Artificial Worlds

Part 4 – Gestures, Emotions and Voices

Part 5 – Data Security and Computer Power

Part 6 – Doing Business


Displaying 3d Worlds

Let us now look at what is required to display an artificial, immersive 3d world. Currently, we use monitors and mobile phones to "look into" 3d worlds.

[Images: (c) Microsoft Office Stock image]

It looks like the next step after video calls on our laptops is to wear glasses, which are basically two little monitors, one in front of each eye. They are not very comfortable at the moment, certainly not for many hours; people even report getting headaches after a few hours. But those devices will be improved further and will most probably soon be as comfortable as normal glasses.

The industry calls this technology “Augmented Reality” and “Virtual Reality”, terms we have probably all heard. The difference between the two is that “Virtual Reality” shows only artificial images and objects, whereas “Augmented Reality” combines photos and artificial objects with the real world, both seen through the device at the same time.

You may know Augmented Reality from the IKEA app (see for example wired.com), where you can see how the new sofa might look in your living room. Or maybe you know Augmented Reality from Pokémon Go.

Virtual Reality and Augmented Reality are both shown on mobile devices, laptops and glasses. But to be as real as possible, we would like to be in the room, not only looking through a device into the room, right?


Building The 3d World Around Us

There are of course video projectors, like the ones we have used in cinemas for decades. We have also used them to display our presentations on whiteboards. They are perfect for bringing images and animations onto walls, and can also be used to project images onto a table or a floor. Or even onto the palm of a hand.

But what about holograms, which have been described in science fiction since the 1970s or so?

Well, there are apps that can produce a kind of hologram if you build a little transparent pyramid, as shown in this movie at youtube.com. This is actually fun, and I have tried it myself. However, this is of course not a solution for a real Metaverse.

Microsoft sells devices which, they say, produce holograms, like the HoloLens 2. Other companies have similar devices as well. But according to our definition a few lines above, those are in fact Augmented Reality devices, and not real holograms like the ones we know from StarWars.com. Well, I must admit, those devices are interesting anyway. So, let us quickly look at the HoloLens 2. Microsoft says it is “a self-contained Windows computer, running Windows Holographic, that runs apps and solutions in an immersive mixed reality environment”.

In comparison to other Augmented Reality devices, it does not use two LED monitors, but produces the images with red, green and blue lasers, which are projected in front of the eyes via vibrating mirrors. It offers head tracking by means of 4 cameras and eye tracking by means of 2 infrared cameras. It also captures how far away things are in front of the person, by recognizing the phase change of a laser pulse when it is reflected off 3d objects. For those of you who are interested in the resolution of this depth sensor, have a look at an image on zdnet.com, where we can see that it can even pick up the curve of a ping-pong ball as it flies through the air (more details: zdnet.com).

I have not seen any data transfer rates published by Microsoft for the HoloLens 2, but if we assume the data transfer from the device to a central computer basically consists of the 8MP camera images plus some data from the depth sensor and the head and eye tracking, I would assume the compressed data is below 50MB/s. Therefore, not a big deal nowadays.
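For the technically curious, here is a minimal sketch of the phase-shift principle behind such a depth sensor (continuous-wave time of flight). The modulation frequency, the measured phase and the function name are made-up example values for illustration, not HoloLens 2 specifications.

```python
# Minimal sketch: distance from the phase shift of an amplitude-modulated
# laser signal (continuous-wave time of flight). Example values only,
# not HoloLens 2 specifications.

import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s

def distance_from_phase(phase_shift_rad, mod_freq_hz):
    """One-way distance to the reflecting surface for a measured phase shift.

    The modulated light travels to the object and back, so the round trip
    corresponds to phase_shift / (2*pi) periods of the modulation signal.
    """
    wavelength = SPEED_OF_LIGHT / mod_freq_hz           # modulation wavelength
    round_trip = (phase_shift_rad / (2 * math.pi)) * wavelength
    return round_trip / 2

# Example: 100 MHz modulation and a measured phase shift of 1.2 rad
print(f"{distance_from_phase(1.2, 100e6):.3f} m")       # roughly 0.29 m
```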

What about other types of holograms?

Well, let us better call it 3d projection, even if it is often named “holographic”, like the example of the heads-up display in the automotive industry from WayRay (described by autoevolution.com). By the way, those heads-up displays are quite useful, as they can augment the driver's view with a lot of important information, like free parking spots or dangerous situations ahead.

The most promising way today to show a dynamically changing 3d object, however, is to use many laser projectors that generate the 3d object by reflecting the laser light rays off small particles, like the droplets in a stream of fog flowing from the ceiling to the floor (see for example neatorama.com).

From a data transfer point of view, to display a 3d object like the animal shown in the fog stream with 5 laser projectors, we probably need one 4k video stream per laser to see enough detail, so that would mean about 5 x 30MB/s of data transfer in total. Not a real problem for Wi-Fi.
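As a quick back-of-the-envelope check of these numbers, here is a small sketch. The 30MB/s per 4k stream is the assumption from the paragraph above, and the Wi-Fi figure is my own rough assumption for a modern link, not a measured value.

```python
# Rough bandwidth estimate for the fog-projection setup described above.
# Per-stream rate and Wi-Fi throughput are assumptions, not measurements.

NUM_PROJECTORS = 5
MB_PER_S_PER_4K_STREAM = 30          # assumed compressed 4k video stream
ASSUMED_WIFI_MB_PER_S = 200          # roughly a modern Wi-Fi 6 link (assumption)

total_mb_per_s = NUM_PROJECTORS * MB_PER_S_PER_4K_STREAM
print(f"Total: {total_mb_per_s} MB/s "
      f"(about {total_mb_per_s * 8 / 1000:.1f} Gbit/s), "
      f"fits within the assumed {ASSUMED_WIFI_MB_PER_S} MB/s Wi-Fi link: "
      f"{total_mb_per_s <= ASSUMED_WIFI_MB_PER_S}")
```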

However, do I really want to have fog streams everywhere in my house?

I guess that in the near future we will first wear glasses like Microsoft's HoloLens 2 or Meta's Oculus Rift, before we see “real” holograms like the ones at StarWars.com.


Generating A 3d World as Real as Possible

In the previous chapter, we talked about projecting images onto walls and tables. Now, how can a 3d world, an avatar or any other 3d object be projected onto walls or Augmented Reality glasses as realistically as possible?

Let us remember the Toy Story image that we have seen from Disney.com. How was it possible to create such a realistic animated movie, with shadows and reflections?

Well, this is all possible thanks to so-called “Physically-based rendering”. It produces images that look as real as reality. Look for example at the artificially created image at this link: pbr-book.org. It looks very real, and it is no longer possible to tell whether it is a real photo or just computer graphics. In fact, it is a 3d scene built from about 24'000 individual plants with about 3 billion triangles.

The method used to calculate such realistic images is often also called “Ray Tracing”, and it is one of my favorite hobbies. The following two images are my own.

[Images: two of the author's own ray-traced renderings]

To produce such nice images, we obviously first need to have the 3d objects available in the system. In the 2nd episode of this miniseries, we talked about capturing 3d objects with cameras, but such photos cannot be used directly as a representation of 3d objects without further processing. The reason is that those photos are built up from pixels, which means they are bitmap graphics. This is like zooming into the following Wikipedia image at google.com, which will not make the image any sharper. To know the details of 3d objects, i.e. their dimensions and how they are oriented in the 3d world, we instead need them as vector graphics, which is like having the details of the streets in this maps.google.com image, where you can zoom in and it will always remain sharp.

Having the 3d objects represented as vector graphics is possible by capturing them with a 3d scanner, by designing them manually from scratch in the system, or by taking photos with cameras and using software that calculates the dimensions and orientation of the 3d objects from these photos.
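To make this a bit more concrete, here is a minimal sketch of how a 3d object is typically stored as “vector graphics”: a list of vertices (points in 3d space) plus a list of triangles that refer to those vertices. The cube data below is just an illustrative example, not any particular file format.

```python
# Minimal sketch of a triangle mesh: vertices in 3d space plus triangles
# that index into the vertex list. Illustrative only, not a specific format.

from dataclasses import dataclass

@dataclass
class Mesh:
    vertices: list[tuple[float, float, float]]   # x, y, z coordinates
    triangles: list[tuple[int, int, int]]        # indices into `vertices`

# A unit cube: 8 corner points, 12 triangles (2 per face).
cube = Mesh(
    vertices=[(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
              (0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)],
    triangles=[(0, 1, 2), (0, 2, 3),   # bottom
               (4, 6, 5), (4, 7, 6),   # top
               (0, 4, 5), (0, 5, 1),   # front
               (1, 5, 6), (1, 6, 2),   # right
               (2, 6, 7), (2, 7, 3),   # back
               (3, 7, 4), (3, 4, 0)],  # left
)

print(len(cube.vertices), "vertices,", len(cube.triangles), "triangles")
```

Because the geometry is stored as exact coordinates rather than pixels, you can zoom in as much as you like and the object always remains sharp.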

But just having the 3d object in the system and showing it as a wireframe image like the left sphere below, or as a one-color object like the right sphere below…

[Image: a wireframe sphere (left) and a one-color sphere (right)]

…does not make it look like a realistic golden ball yet, right?

[Image: a realistically rendered golden ball]

Therefore, we need to calculate the specific color of each tiny part of the 3d object. The color of each little area is built up from the infinitely many light rays shining onto that area, as indicated in the following image, being reflected off it, or being transmitted through it.

[Image: many light rays hitting a small area of a 3d object]

The resulting ray from each little area may be reflected off this 3d object and off other objects before it may, or may not, reach our eyes.

Taking all those infinitely many light rays into the calculation for each little area of the 3d object is of course impossible.
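For readers who enjoy formulas: this idea of summing up infinitely many incoming light rays is what the well-known rendering equation expresses. In a simplified form (ignoring emission and wavelength dependence):

```latex
L_o(x, \omega_o) = \int_{\Omega} f_r(x, \omega_i, \omega_o)\, L_i(x, \omega_i)\, (\omega_i \cdot n)\, \mathrm{d}\omega_i
```

Here, the light L_o leaving a point x towards the viewer (direction ω_o) is the integral over all incoming directions ω_i of the incoming light L_i, weighted by the material's reflectance f_r and by the angle between ω_i and the surface normal n.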


The trick is to only look at those rays that reach our eyes, and to ignore all the others. For the rays that do reach our eyes, we can calculate the colors by following them back along their path and seeing which objects they have hit.


[Image: a ray traced from the eye back to a large blue sphere]


The color we see with our eyes, or as a pixel on a monitor, is mainly driven by the color of the object directly in front of us, here the large blue sphere representing the water glass above.



But there are more influencing factors to consider. We also need to check whether the 3d object is in the shadow of another object and would therefore appear darker.

Also, depending on the object's material, it can reflect the light ray like a mirror, and the reflected light ray may hit another object, which can reflect it further, and so forth. The object can also be transparent like a window, or rough like sand.

All those reflections and transmissions, the material structure and the light sources influence the color of the ray that reaches our eyes or is displayed as a pixel.

That is why this method is called “Ray Tracing”.

Such calculations have to be executed for each pixel of the screen and for each 3d object, and this finally produces a realistic-looking image that can be projected, like the one at pbr-book.org.
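To make this tangible, here is a deliberately tiny sketch of that per-pixel loop: for each pixel, a ray is sent “backwards” from the eye into the scene, tested against a single sphere, and shaded with a simple diffuse term plus a shadow check. All scene values (sphere position, light position, image size) are made-up examples; a real renderer like the one behind pbr-book.org additionally handles materials, reflections, transparency and millions of triangles.

```python
# Tiny "backwards" ray tracer: one sphere, one point light, diffuse shading
# and a shadow test, drawn as ASCII art. All scene values are made-up examples.

import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def ray_sphere(origin, direction, center, radius):
    """Distance t along a normalized ray to the sphere, or None if it misses."""
    oc = [o - c for o, c in zip(origin, center)]
    b = 2 * dot(direction, oc)
    c = dot(oc, oc) - radius * radius
    disc = b * b - 4 * c
    if disc < 0:
        return None
    t = (-b - math.sqrt(disc)) / 2
    return t if t > 1e-4 else None

SPHERE_CENTER, SPHERE_RADIUS = (0.0, 0.0, -3.0), 1.0
LIGHT_POS = (5.0, 5.0, 0.0)
WIDTH, HEIGHT = 40, 20
SHADES = " .:-=+*#%@"   # from dark to bright

for y in range(HEIGHT):
    row = ""
    for x in range(WIDTH):
        # Follow a ray backwards: from the eye through this pixel into the scene.
        px = (x + 0.5) / WIDTH * 2 - 1
        py = 1 - (y + 0.5) / HEIGHT * 2
        ray_dir = normalize([px, py, -1.0])

        t = ray_sphere((0.0, 0.0, 0.0), ray_dir, SPHERE_CENTER, SPHERE_RADIUS)
        if t is None:
            row += " "              # the ray hits nothing: background stays dark
            continue

        hit = [ray_dir[i] * t for i in range(3)]
        normal = normalize([hit[i] - SPHERE_CENTER[i] for i in range(3)])
        to_light = normalize([LIGHT_POS[i] - hit[i] for i in range(3)])

        # Shadow check: is anything between the hit point and the light?
        # With a single sphere this never darkens a lit point, but it is
        # exactly the test a full ray tracer performs for every hit.
        blocked = ray_sphere(hit, to_light, SPHERE_CENTER, SPHERE_RADIUS) is not None
        brightness = 0.0 if blocked else max(0.0, dot(normal, to_light))
        row += SHADES[min(len(SHADES) - 1, int(brightness * len(SHADES)))]
    print(row)
```

Running it prints a small ASCII image of a shaded sphere, which is already enough to see how the per-pixel ray, the surface normal and the light direction come together.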

By the way: even though these are still millions of calculations for one 3d world, the latest graphics cards allow almost real-time Ray Tracing, at least for scenes with a low number of 3d objects. However, rendering two persons in real time still requires immense computing power.

To be continued tomorrow... Thanks for taking the time to read it.

Do not miss the next episode, where we will discuss how gestures, emotions and voices can be handled in a Metaverse.





(Remark 1: There are thousands of aspects around Metaverse, and I am far from being an expert in all of those topics. Actually, I would be very happy to hear and learn from your knowledge and experience, so please use the “comments” feature to help us all to better understand the details around Metaverse. )

(Remark 2: Due to copyright reasons, I am providing links to sites containing images, rather than copying the images to this miniseries. Just click on the indicated links to see the pictures and the stories around them. The pictures, which I do show in this miniseries are either my own, or from “Microsoft Office Stock Images”.)

(Remark 3: The text only serves for educational purposes. It contains my own personal opinion and does not represent any official statements of the mentioned companies.)
