In this post I primarily want to convey an idea of the sheer size of the virtual and augmented reality universe that humanity is poised to gain access to, and why having access to this universe is likely to eventually change how most people view reality itself. I’ll begin with the following thought experiment.
Imagine that you live in a giant mansion with thousands of rooms. Each of these rooms represents a specific “reality” or “world” that is either purely virtual, purely real, or a mix of virtual and real. In other words, only one room in this “mansion” represents the unaltered “real world” of our everyday experience.
How likely do you think it is that you would choose to remain solely in the “real world” room at all times, given that:
- There are thousands of other rooms in the mansion that you can go into
- The “real world” room is more restrictive and dull than the other rooms
- Most people in the mansion are in the other rooms
Obviously, this is a rhetorical question. Augmented and virtual reality wearables, and the virtual reality ecosystem that supports them, are soon going to provide individuals with access to a huge number of compelling alternate realities. As people spend more and more time in these alternate realities, the real world is likely to be relegated in their minds to the status of a single “channel” on a vast spectrum of “reality channels.”
One Remote is Easier to Use Than Two
I consider “augmented reality” and “virtual reality” to be essentially interchangeable terms, and so I agree with Ray Kurzweil’s 1999 belief that users will eventually be able to use a single access device to experience both. [1] My reasoning is as follows.
“Augmented reality” today refers to the overlay of electronic imagery and sound over some of a real world scene and soundscape respectively. “Virtual reality,” in turn, refers to the overlay of electronic imagery and sound over all of a real world scene and soundscape.
So virtual reality can be thought of as an extreme form of augmented reality, in which an entire real world scene and soundscape are overlaid by electronic imagery and sound. Similarly, augmented reality can be thought of as a mixed form of virtual reality, in which some portions of a real world scene and soundscape are electronically recreated, and the remaining portions of that real world scene and soundscape are electronically replaced with virtual reality scenes and sounds.
Since virtual reality goggles completely immerse their users audiovisually, and augmented reality glasses do not, virtual reality goggles provide a much better starting point for users to access all of the types of “reality channels” mentioned in the mansion example above. In contrast, augmented reality glasses can easily allow stray light to get into a user’s eyes and stray sounds to enter a user’s ears while they try to provide a virtual reality experience, thus reducing the immersive quality of the experience.
No one is going to want to carry around both a set of glasses to access augmented reality “channels” and a separate set of goggles to access virtual reality “channels,” if it can be helped. This would make about as much sense as owning a TV that requires 2 remote controls to access all 99 of its channels, of which remote #1 can only access channels 1-50 and remote #2 can only access channels 51-99. Clearly, being able to access all 99 channels on that TV with a single remote would be a lot more convenient. As a result I expect that in the not-too-distant future, virtual reality technology will enable a user to seamlessly access all types of “reality channels” using only one set of goggles.
To accomplish this, I expect that virtual reality goggles will be used to electronically recreate the audiovisual aspects of the real world environments that users would naturally experience without goggles.
The end result of this would be something like indirectly viewing the outside world with your smartphone camera and screen, instead of directly viewing it with your eyes. Or alternately, it would be something like indirectly viewing the outside world through night vision goggles, instead of directly viewing it with your eyes. In both cases, even though you wouldn’t see the outside world directly with your own eyes, you would still be able to navigate through the outside world as if you were seeing it directly, by relying on electronically replicated images of the real world.
Processing all incoming audiovisual stimuli from the outside world before they reach the user will give virtual reality goggles much more control than augmented reality glasses currently have in seamlessly melding real world and virtual world scenes together. I expect that this technique will eventually enable many people to wear virtual reality goggles nearly continuously.
Types of Reality Channels
Building on the mansion example above, I’ll define four basic types of “reality channels” here that future virtual reality goggles should be able to access. These definitions should also come in handy as I discuss more involved topics in the future.
- Standard Virtual Reality (“VR”): Audiovisual apparatus broadcasts the sights and sounds of a totally distinct world from the “real world” into the user’s eyes and ears. Audiovisual input from the real world is blocked out to enable this.
- Digitized Reality (“DR”): Audiovisual apparatus broadcasts faithful replications of the same “real world” sights and sounds that the user would normally experience from his or her first-person perspective into the user’s eyes and ears, ideally with negligible distortion. In other words, the audiovisual apparatus does its best to behave as if it isn’t there at all.
- Tweaked Digitized Reality (“TDR”): This is simply Digitized Reality with some distortion that is permitted by the user. The audiovisual apparatus uses as input the same “real world” sights and sounds that the user would normally experience from his or her first-person perspective. However, these sights and sounds are distorted before reaching the user to enhance the user’s natural capabilities in some way. For example, the user might zoom in on part of a scene, use night vision technology to see in the dark, change the color scheme of the walls in a room s/he is thinking of painting, process incoming sound to silence traffic noises, etc.
- Augmented Reality (“AR”): This represents any combination of Digitized Reality and/or Tweaked Digitized Reality with Virtual Reality, and is equivalent to what is popularly known as “augmented reality” today except for the fact that the user sees a Digitized Reality version of the real world for the real world portions of a scene, rather than seeing the real world directly with his or her own eyes. Augmented Reality possibilities include mixing real world landscapes with purely virtual ones, overlaying imaginary objects over real world landscapes or real characters over imaginary landscapes, playing back previously recorded Digitized Reality experiences, etc.
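One way to picture how the four channel types relate is as a per-pixel choice of source. The sketch below is only an illustration under stated assumptions: the `Source` enum, `classify_channel`, and `compose` are hypothetical names, pixels are plain numbers, and a real headset would handle sound, stereo, and latency as well.

```python
from enum import Enum


class Source(Enum):
    """Where an output pixel comes from (names are illustrative)."""
    DIGITIZED = "DR"   # untouched camera passthrough
    TWEAKED = "TDR"    # camera passthrough after a user-approved tweak
    VIRTUAL = "VR"     # fully synthetic content


def classify_channel(mask):
    """Name the channel type implied by a per-pixel source mask."""
    sources = {src for row in mask for src in row}
    if not sources:
        raise ValueError("mask must contain at least one pixel")
    if sources == {Source.VIRTUAL}:
        return "VR"
    if sources == {Source.DIGITIZED}:
        return "DR"
    if Source.VIRTUAL not in sources:
        return "TDR"  # real world input only, with at least some tweaking
    return "AR"       # virtual content mixed with DR and/or TDR


def compose(camera, virtual, mask, tweak=lambda pixel: pixel):
    """Build the frame shown to the user, one pixel at a time."""
    frame = []
    for cam_row, virt_row, mask_row in zip(camera, virtual, mask):
        row = []
        for cam_px, virt_px, src in zip(cam_row, virt_row, mask_row):
            if src is Source.VIRTUAL:
                row.append(virt_px)
            elif src is Source.TWEAKED:
                row.append(tweak(cam_px))
            else:
                row.append(cam_px)
        frame.append(row)
    return frame
```

Under this framing, switching “reality channels” amounts to changing the mask and the tweak function, while the camera input stays the same.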
Comfort in Virtual Reality
Magic Leap’s CEO Rony Abovitz wrote the following words about ideal augmented and virtual reality displays in this Reddit forum. His words are also referenced in the Wired Magazine article at this link:
Our vision for AR and VR is a true replication of visual reality. The ONLY safe way forward is to make a digital light field that is naturally tuned into your brain and physiology. And it’s amazing how when you give the mind and body what they want, how much it gives back.
Along these lines, I also believe that one of the most important areas of virtual reality research right now should be the process of figuring out how to audiovisually replicate the external “real world” environment for a user, in the most naturally authentic way possible. This research needs to lead to a result that is so close to a user’s natural vision and hearing that the user will be comfortable wearing the goggles for hours on end.
Once users find that they can navigate the real world seamlessly with a pair of virtual reality goggles on, I believe that it will be much easier for them to accept alternate and virtual realities into the mix of what they view with the goggles. But establishing user comfort with purely “real world” uses of virtual reality goggles will be absolutely key to their widespread adoption.
Along these lines, I expect that 3D printing technology will soon enable virtual reality goggle frames to be custom printed to fit the contours of each user’s face, in a way that minimizes external distractions while maximizing comfort. Optometrists (or optometric algorithms) may also one day help to optimize goggles further for the vision of individual users. I also plan to discuss in a future post some of the conventions that I believe will be used to safely navigate virtual reality worlds, given that real world surroundings will vary quite significantly for users.
Some Quick Thoughts on Current Technologies
Augmented Reality for Mobile Devices
Google’s Project Tango group (also see this link) has done some amazing work on augmented reality technologies for modern mobile devices that have dynamic 2D displays, such as tablets. Their technology is capable of overlaying virtual objects and details over Digitized Reality representations of the real world, and displaying the integrated images on the screens of mobile devices. This is made possible in part because properly equipped mobile devices are able to generate 3D maps of real world objects in the spaces near them, with the help of depth sensing and camera technology. In turn, this enables virtual objects to be rendered in realistic locations in the Digitized Reality version of the real world that the supporting mobile devices display.
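To make the role of the 3D map concrete, here is a minimal sketch of a per-pixel depth test, which is a common generic way to place virtual objects behind or in front of real ones. It is not Project Tango's API: the function name `composite_with_depth`, the list-of-lists frame format, and the use of `None` for "no virtual object here" are all assumptions made for illustration.

```python
def composite_with_depth(camera, scene_depth, virtual, virtual_depth):
    """Overlay virtual pixels on a camera frame using a per-pixel depth test.

    camera / virtual: 2D lists of pixel values.
    scene_depth: 2D list of distances to the nearest real surface.
    virtual_depth: 2D list of distances to the virtual object, or None
                   where the virtual object does not cover the pixel.
    """
    frame = []
    for cam_row, depth_row, virt_row, vdepth_row in zip(
        camera, scene_depth, virtual, virtual_depth
    ):
        row = []
        for cam_px, real_d, virt_px, virt_d in zip(
            cam_row, depth_row, virt_row, vdepth_row
        ):
            # Show the virtual pixel only if it is closer than the real surface.
            visible = virt_d is not None and virt_d < real_d
            row.append(virt_px if visible else cam_px)
        frame.append(row)
    return frame
```

For example, a virtual robot placed 2 units away would appear in front of a wall 3 units away, but would be hidden behind a real chair 1 unit away.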
Note that Qualcomm announced in May of this year that one of their processors “will power Google’s next generation Project Tango smartphone development platform.” As Qualcomm also has a trademarked “mobile vision platform” associated with augmented reality that has won several awards, this will be a very interesting space to watch.
I plan to discuss 3D mapping further in future posts, because very precise 3D mapping will be a critical part of a high quality virtual reality ecosystem that is seamlessly integrated with the real world. For now, suffice it to say that I expect the information gleaned from augmented reality research on dynamic 2D displays to be incorporated into future generations of virtual reality goggles. In fact, this YouTube video posted by The Verge shows several users at a Google facility using head-mounted Project Tango tablets as virtual reality devices, so progress is already being made!
VR Goggles with Externally Mounted Sensors
There is also an interesting sensing device made by Leap Motion that can be attached to the outside of virtual reality goggles, to achieve a form of Augmented Reality that appears to meld Tweaked Digitized Reality with Virtual Reality. This sensor and its associated software enable a user to see electronic representations of his or her arms in a virtual world during gameplay.
Light Field Technologies
In the same Reddit forum quoted above, Magic Leap’s CEO wrote that his company has a “unique digital light-field technology.” I look forward to seeing what they are going to release in this space, although I noticed that this recent MIT Technology Review article on the subject was titled “Magic Leap Needs to Engineer a Miracle.”
I am also paying attention to OTOY’s “light field” technology and MIT Media Lab’s unique approach to holographic video. Visual technologies like these apparently enable a viewer to look at a scene on a 2D display device from different angles and see that scene rendered differently from each angle, somewhat like looking through a window at a real world scene. I believe that light field related technologies will play an important role in overcoming issues that cause discomfort with certain virtual reality displays today, such as the issue of “vergence-accommodation conflict” mentioned in this Wired Magazine article.
A technology that I also find particularly interesting is Trilite Technologies’ digital signage technology, which they plan to use in holographic billboards. Trilite’s technology will apparently be capable of rendering a glasses-free holographic image on a properly equipped billboard, showing viewing-angle dependent images to viewers across a wide range of angles. They plan to accomplish this by having each individual pixel element in such a billboard (known as a “Trixel”) emit light in a variety of different directions. Although this technology is currently orders of magnitude too large for use in virtual reality goggles, a miniaturized equivalent could one day prove useful in replicating certain aspects of human vision.
In Summary…
Consumers are very likely to flock towards virtual reality goggles that enable them to comfortably access all “reality channels” using the same device. This will prove to be a lot more preferable than owning one set of glasses for “augmented reality” purposes and another set of goggles for “virtual reality” purposes.
To accomplish this, companies will first have to figure out how to make virtual reality goggles electronically replicate “real world” environments for their users very comfortably and authentically. As technology progresses and these goggles become less bulky/goofy looking, I expect that large swathes of the public are going to wear them nearly continuously, due to the tremendous convenience they will provide, and the ease with which users will be able to switch between “reality channels” using them.
Ray Kurzweil has also forecast that by the late 2020s virtual reality technology will be able to completely mimic all aspects of reality for users by engaging all five senses. [2] This would spectacularly expand the spectrum of “reality channels” discussed above. In such a world, for example, a person could eat a piece of chocolate cake with his or her sense of smell electronically tweaked so that the cake smells like fresh strawberries instead of chocolate. So, as impressive as they are, the augmented and virtual reality wearables of today represent only a glimpse of the wonders to come.
P.S. Further Brainstorming
Within a few years, I believe that we are going to see virtual reality goggles that simultaneously use:
1. “Pico projectors” (i.e. tiny laser projectors), to project images directly onto users’ retinas [3]
2. Technology that senses how a user’s eyes are oriented, moving and focusing, which in turn determines the images that the pico projectors display on the user’s retinas
3. “Light field cameras” (Wikipedia link here) mounted on the outside of the goggles to capture a broad range of visual information about the external real world area near the user
Regarding #1, such displays are known as “virtual retinal displays” today (Wikipedia link here). Although these displays are obviously not yet in widespread use, the Avegant Glyph is a current example. To increase the field of view of such displays, it may eventually make sense to use multiple pico projectors per eye, each arranged at a slightly different angle around the user’s straight-ahead line of sight, to enable broader coverage of the user’s retinal surface. For example, 5 pico projectors per eye could be arranged in a pentagon formation around each eye’s straight-ahead line of sight. Each of these pico projectors would scan a triangular “raster” image onto the user’s retina (which would differ from the traditional rectangular shape of raster scanned images). The wavefronts of the image that ultimately reach the user’s retina would thus form something like the walls of the 5-sided pyramid formed by the 5 triangles that meet at each vertex of a convex regular icosahedron.
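The pentagon arrangement can be sketched with a little trigonometry. The snippet below is only a rough geometric illustration: the function name `projector_directions` and the 20 degree tilt are hypothetical, and the straight-ahead line of sight is assumed to be the +z axis. Each projector's aim direction is tilted away from that axis by the same angle, and the projectors are spaced evenly around it (72 degrees apart for five).

```python
import math


def projector_directions(count=5, tilt_deg=20.0):
    """Return unit aim vectors for projectors spaced evenly around the +z axis.

    count: number of projectors per eye (5 gives the pentagon layout).
    tilt_deg: assumed angle between each projector's aim and straight ahead.
    """
    tilt = math.radians(tilt_deg)
    directions = []
    for k in range(count):
        azimuth = 2.0 * math.pi * k / count  # even spacing around the axis
        directions.append((
            math.sin(tilt) * math.cos(azimuth),
            math.sin(tilt) * math.sin(azimuth),
            math.cos(tilt),
        ))
    return directions
```

Because the layout is symmetric, the sideways components of the five directions cancel out, so the combined coverage stays centered on the straight-ahead line of sight.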
Regarding #2, I note that the company Fove is developing an eye tracking virtual reality headset. As eye contact is also an important part of real world communication, I expect that there will be significant industry research in this area. I also think it’s likely that future virtual reality goggles will use retinal scans to authenticate their users’ identities.
Regarding #3, light field cameras capture enough raw data about their surroundings to enable a user to focus on different areas of the same snapshot! They typically accomplish this by using a large number of miniature lenses to capture raw images, which are then processed with advanced techniques. Since future virtual reality goggles should track the eye movements of their users using the technologies mentioned in #2, it should be very possible to deduce which areas of a raw light field camera image to focus on when displaying a final, processed version of that image to a user.
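As a rough sketch of how gaze could drive focus, suppose the raw light field data has already been processed into a depth map plus a small stack of images, each focused at a different distance. Both of those inputs, and the function names `focus_depth_at_gaze` and `choose_focal_slice`, are assumptions made for illustration rather than a description of any real product.

```python
def focus_depth_at_gaze(depth_map, gaze_col, gaze_row):
    """Look up the scene depth under the user's gaze point.

    depth_map: 2D list of distances, indexed as depth_map[row][col].
    Gaze coordinates outside the map are clamped to the nearest edge.
    """
    rows, cols = len(depth_map), len(depth_map[0])
    row = min(max(gaze_row, 0), rows - 1)
    col = min(max(gaze_col, 0), cols - 1)
    return depth_map[row][col]


def choose_focal_slice(slice_depths, target_depth):
    """Return the index of the focal slice closest to the target depth."""
    return min(
        range(len(slice_depths)),
        key=lambda i: abs(slice_depths[i] - target_depth),
    )
```

The goggles would then display the slice returned by `choose_focal_slice`, updating the choice whenever the eye tracker reports a new gaze point.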
References
[1] Ray Kurzweil forecast this for 2019, in The Age of Spiritual Machines, New York: Penguin Books, 1999, pp.202-203
[2] Kurzweil, Ray. The Singularity is Near, New York: Penguin Books, 2005, pp.341-343
[3] Ray Kurzweil forecast for 2009 the use of lasers for retinal projection in virtual reality, in The Age of Spiritual Machines, New York: Penguin Books, 1999, p.190