Monday, October 21, 2013

Can the GPU compensate for all Optical Aberrations?

Photo Credit: davidyuweb via Compfight cc
As faster, newer Graphics Processing Units (GPUs) become available, graphics cards can perform real-time image transformations that were previously relegated to custom-designed hardware. Can these GPUs overcome all the important optical aberrations, thus allowing HMD vendors to use simple, low-cost optics?

The short answer is: GPUs can overcome some, but not all aberrations. Let's look deeper into this question.

Optical aberrations are the result of imperfect optical systems. Every optical system is imperfect, though of course some imperfections are more noticeable than others. There are several key types of aberrations in HMD optics which take an image from a screen and pass it through viewing optics:
  • Geometrical distortion, which we covered in a previous post, causes a square image to appear curved. The most common variants are pincushion distortion and barrel distortion.
  • Color aberration. Optical systems affect different colors in different ways, as can be seen in a rainbow or when light passes through a prism. This results in color breakup, where a white dot on the original screen separates into its primary colors when viewed through the optical system.
  • Spot size (also referred to as astigmatism), which shows how a tiny dot on the original screen appears through the optical system. Beyond the theoretical limits (diffraction limit), imperfect optical systems cause this tiny dot to appear as a blurred circle or ellipse. In essence, the optical system is unable to perfectly focus each point from the source screen. When the spot size becomes large enough, it blurs the distinction between adjacent pixels and can make viewing the image increasingly difficult.
The diagram below shows an example of the spot size and color separation on various points in the field of view of a certain HMD optical system. This is shown for the three primary colors, with their wavelengths specified in the upper right corner. As you can see, the spot size is much larger for some areas than others, and colors start to appear separated.


Which of these issues can be corrected by a GPU, assuming no practical limits on processing power?

Geometrical distortion can be corrected in most cases. One approach is for the GPU to remap the image generated by the software so that it compensates for the known optical distortion. For instance, if the image through the optical system appears as if the corners of a square are pulled inwards, the GPU would morph that part of the image by pushing these corners outwards. Another approach is to render the image up-front with knowledge of the distortion, such as the algorithm covered in this article from an Intel researcher.
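To make the remapping idea concrete, here is a minimal sketch in Python/NumPy of a pre-distortion pass. It assumes a simple one-coefficient radial distortion model; the coefficient value and the model itself are illustrative assumptions, not parameters of any particular HMD.

    import numpy as np
    from scipy.ndimage import map_coordinates

    def predistort(image, k1=0.2):
        # Pre-warp 'image' so that optics with radial coefficient k1
        # (a hypothetical value) display it without apparent distortion.
        # For each pixel of the output (what is sent to the screen), we
        # sample the source image at the radially distorted location,
        # applying the inverse effect in advance.
        h, w = image.shape[:2]
        cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
        y, x = np.mgrid[0:h, 0:w]
        xn, yn = (x - cx) / cx, (y - cy) / cy      # normalized coordinates
        r2 = xn * xn + yn * yn
        xs = (xn * (1 + k1 * r2)) * cx + cx        # where to sample from
        ys = (yn * (1 + k1 * r2)) * cy + cy
        out = np.zeros_like(image)
        for c in range(image.shape[2]):
            out[..., c] = map_coordinates(image[..., c], [ys, xs],
                                          order=1, mode='constant')
        return out

In a real system this kind of remapping would run as a shader on the GPU rather than on the CPU, but the sampling logic is the same.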

Color aberration may also be addressed, though it is more complex. Theoretically, the GPU can understand not only the generic distortion function for a given optical system but also the color-specific one, and remap the color components of each pixel accordingly. This requires understanding not only the optical system but also the primary colors used in a particular display. Not all "greens", for instance, are identical.
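Extending the sketch above, chromatic compensation amounts to pre-warping each color channel with its own radial function, reusing the predistort function from the previous sketch. The per-channel coefficients here are made-up values chosen only to show the idea.

    def predistort_rgb(image, k1_rgb=(0.20, 0.22, 0.25)):
        # Warp each channel slightly differently (hypothetical coefficients)
        # so the wavelength-dependent distortion of the optics re-aligns
        # the red, green and blue components for the viewer.
        out = np.zeros_like(image)
        for c, k1 in enumerate(k1_rgb):
            out[..., c] = predistort(image[..., c:c+1], k1=k1)[..., 0]
        return out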

Where the GPU fails is in correcting astigmatism. If the optical system causes some parts of the image to be defocused, the GPU cannot generate an image that will 're-focus' the system. In simpler optics, this phenomenon is particularly noticeable away from the center of the image.

One might say that some defocus at the edge of the image is not an issue, since a person's central vision is much better than their peripheral vision, but this argument does not take into account the rotation of the eye and the desire to see details away from the center.

Another discussion is the cost-effectiveness of improving optics, or the "how good is good enough" debate. Better optics often cost more, perhaps weigh more, and not everyone needs the improved performance or is willing to pay for it. Obviously, less distortion is better than more distortion, but at what price?

Higher-performance GPUs might cost more, or might require more power. This might prove to be important in portable systems such as smartphones or goggles with on-board processors (such as the SmartGoggles), so fixing imperfections on the GPU is not as 'free' as it might appear at first glance.

HMD design is a study in tradeoffs. Modern GPUs are able to help overcome some imperfections in low-cost optical systems, but they are not the solution to all the important issues.



Monday, October 14, 2013

Where are the VR Abstraction Layers?

"A printer waiting for a driver"
Once upon a time, application software included printer drivers. If you wanted to use WordPerfect or Lotus 1-2-3, you had to have a driver for your printer included in that program. Then, operating systems such as Windows or Mac OS came along and included printer drivers that could be used by any application. Amongst many other things, these operating systems provided abstraction layers: as an application developer, you no longer had to know exactly which printer you were printing to, because the OS had a generic descriptor that told you about the printer's capabilities and provided a standard interface for printing.

The same is true for game controllers. The USB HID (Human Interface Device) descriptor tells you how many controls are in a game controller, and what it can do, so when you write a game, you don't have to worry about specific types of controllers. Similarly, if you make game controllers and conform to the HID specifications, existing applications are ready for you because of this abstraction layer.

Where are the abstraction layers for virtual reality? There are many types of VR goggles, but surely they can be characterized by a reasonably simple descriptor that might contain the following (a sketch of what such descriptors could look like appears after the lists below):

  • Horizontal and vertical field of view
  • Number of video inputs: one or two
  • Supported video modes (e.g. side by side, two inputs, etc.)
  • Recommended resolution
  • Audio and microphone capabilities
  • Optical distortion function
  • See-through or immersive configuration
  • etc.
Similarly, motion trackers can be described using:
  • Refresh rate (e.g. 200 Hz)
  • Capabilities: yaw, pitch, roll, linear acceleration
  • Ability to detect magnetic north
  • etc.
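As a thought experiment, here is what such descriptors might look like in code. This is a sketch only; the field names and example values below are my own assumptions for illustration, not an existing standard.

    from dataclasses import dataclass
    from typing import List, Tuple

    @dataclass
    class HMDDescriptor:
        horizontal_fov_deg: float
        vertical_fov_deg: float
        video_inputs: int                     # one or two
        video_modes: List[str]                # e.g. ["side-by-side", "dual-input"]
        recommended_resolution: Tuple[int, int]
        has_audio: bool
        has_microphone: bool
        distortion_coefficients: List[float]  # radial model, as in the earlier sketch
        see_through: bool

    @dataclass
    class TrackerDescriptor:
        refresh_rate_hz: float                # e.g. 200
        reports_yaw_pitch_roll: bool
        reports_linear_acceleration: bool
        detects_magnetic_north: bool

    # A hypothetical device advertising its capabilities to applications:
    example_hmd = HMDDescriptor(
        horizontal_fov_deg=90.0, vertical_fov_deg=60.0,
        video_inputs=2, video_modes=["dual-input"],
        recommended_resolution=(1280, 1024),
        has_audio=True, has_microphone=False,
        distortion_coefficients=[0.2], see_through=False)

An application could query such a descriptor at startup instead of hard-coding device-specific values.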
Today, when an application developer wants to make their application compatible with a head-mounted display, they have to understand the specific parameters of these devices. The process of enhancing the application involves two parts:
  1. Generic: change the application so that it supports head tracking; add two view frustums to support 3D (see the sketch after this list); modify the camera viewpoint; understand the role of the eye separation; move the GUI elements to a position that can be easily seen on the screen; etc.
  2. HMD-specific: understand the specific intricacies of an HMD and make the application compatible with it.
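As an illustration of the 'generic' part of step 1, the sketch below derives a pair of view matrices from a single eye-separation value. The 64 mm figure is a typical assumption rather than a recommendation for any specific HMD, and sign conventions and per-eye projection asymmetry vary between engines.

    import numpy as np

    def stereo_view_matrices(view, eye_separation_m=0.064):
        # Build left/right view matrices by offsetting the camera half the
        # eye separation along its local x axis. Convention here: a camera
        # moved left makes the world appear shifted right in camera space.
        half = eye_separation_m / 2.0
        def offset(dx):
            t = np.eye(4)
            t[0, 3] = dx
            return t @ view
        return offset(+half), offset(-half)   # (left eye, right eye)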

If these abstraction layers were widely available, the second step would be replaced by supporting the generic HMD or head-tracker driver. Once that is done, manufacturers would only need to write a good driver and, voilà, users could start using their gear immediately.

VR application frameworks such as Vizard from WorldViz provide an abstraction layer, but they are not as powerful as modern game engines. There are some early efforts, such as I'm in VR, to provide middleware, but I think a standard abstraction layer has yet to be created and gain serious steam. What's holding the industry back?

UPDATE: Eric Hodgson of Redirected Walking fame reminded me of VRPN as an abstraction layer for motion trackers, designed to provide a motion tracking API to applications either locally or over a network. As Eric notes, VRPN does not apply to display devices but does abstract the tracking information. I think that, because it has to work across numerous operating systems, VRPN does not provide particularly good plug-and-play capabilities. Also, its socket-based connectivity is excellent for tracking devices that, at most, send several hundred lightweight messages a second. To be extended to HMDs, several things would need to happen, including:

  • Create a descriptor message for HMD capabilities
  • Plug and play (which would also be great for the motion tracking)
  • Information about HMDs can be transferred over a socket, but if the abstraction layer does anything graphics-related (in the way OpenGL or DirectX abstract the graphics card), it would need to move away from running over sockets.


Sunday, October 6, 2013

Is Wider Field of View always Better?

I have always been a proponent of wide field of view products. The xSight and piSight products were revolutionary when they were introduced, offering a combination of wide field of view and high resolution. There is widespread agreement that wide field of view goggles provide greater immersion, and allow users to perform many tasks faster and better.
Johnson's criteria for detection, recognition and identification (image: Axis Communications)

But for a given display resolution, is a wider field of view always better? The answer is 'No', and thinking about this question provides an opportunity to understand the different sets of requirements of professional-market applications of virtual reality goggles (e.g. military training) versus gaming goggles.

Aside from the obvious physical attributes - pro goggles often have to be rugged - the professional market cares very much about pixel density (or the equivalent pixel pitch) because it determines the size and distance of simulated objects that can be detected. For instance, if you are being trained to land a UAV, or trying to detect a vehicle in the distance, you want to detect, recognize and identify the target as early, and thus as far away, as possible. The farther away the target appears, the fewer pixels it occupies on the screen for a given pixel density.

The question of exactly how many pixels are required was answered more than 50 years ago by John B. Johnson in what became known as the Johnson Criteria. Johnson looked at three key goals:
  • Detection - identifying that an object is present.
  • Recognition - recognizing the type of object, e.g. car vs. tank or person vs. horse.
  • Identification - such as determining the type of car or whether a person is a male or a female.
Based on extensive perceptual research, Johnson determined that to have a 50% probability that an observer would discriminate an object to the desired level, that object needs to occupy 2 horizontal pixels for detection, 8 horizontal pixels for recognition and 13 horizontal pixels for identification.

Let's walk through a numerical example to see how this works. The average man in the United States is 1.78m tall (5' 10") and has a shoulder width of about 46cm (18"). Let's assume that a simulator shows this person at a distance of 1000 meters. We want to be able to detect this person inside an HMD that has 1920 pixels across.

46 cm makes an angle of 0.026 degrees (calculated as arctan(0.46/1000)). At a minimum, we need this angle to span two pixels, so each pixel can cover no more than 0.013 degrees. Thus, the entire horizontal field of view of this high-resolution HMD can be no more than 1920 × 0.013 ≈ 25.3 degrees for us to achieve detection. If the horizontal field of view is larger than that, target detection will not be possible at this simulated distance.

Similarly, if we wanted to be able to identify that person at 100 meters, these 46 cm would make an angle of 0.26 degrees. Identification requires 13 pixels, so each pixel can cover no more than 0.02 degrees, and the horizontal field of view of our high-resolution 1920-pixel HMD can be no more than about 38.9 degrees. If the horizontal field of view is larger than that, target identification will not be possible at this simulated distance.
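The calculation generalizes; here is a short sketch that reproduces both numbers. The Johnson pixel counts (2, 8, 13) come from the criteria above; the rest is just the arithmetic of the example.

    import math

    JOHNSON_PIXELS = {"detection": 2, "recognition": 8, "identification": 13}

    def max_horizontal_fov_deg(target_width_m, distance_m, display_pixels, task):
        # Largest horizontal field of view (degrees) at which a display
        # 'display_pixels' wide still gives the target enough pixels
        # for the requested Johnson task.
        target_angle = math.degrees(math.atan(target_width_m / distance_m))
        return display_pixels * target_angle / JOHNSON_PIXELS[task]

    print(max_horizontal_fov_deg(0.46, 1000, 1920, "detection"))      # ~25.3 degrees
    print(max_horizontal_fov_deg(0.46, 100, 1920, "identification"))  # ~38.9 degrees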

Thus, while we all love wide field of view, thought must be put into the field of view and resolution selection depending on the desired use of the goggles.

Notes:

  • Johnson's article: John Johnson, "Analysis of image forming systems," in Image Intensifier Symposium, AD 220160 (Warfare Electrical Engineering Department, U.S. Army Research and Development Laboratories, Ft. Belvoir, Va., 1958), pp. 244–273.
  • Johnson's work was expressed in line pairs, but most people equate a line pair to a pair of pixels.
  • Johnson also looked at other goals such as determining orientation, but detection, recognition and identification are the most commonly-used today.

Thursday, October 3, 2013

Thoughts on the Future of the Microdisplay Business

If I were a shareholder of a microdisplay company such as eMagin or Kopin, I'd be a little worried about where future growth is going to come from.

For years, the microdisplay pitch was something like this: we make microdisplays for specialized applications - such as military products - where high performance is required, sometimes coupled with the ability to withstand harsh environments. One day, there will be a consumer market for such products in the form of virtual reality goggles or high volumes of camera viewfinders, and this will allow us to reduce the price of our products and expand our reach. While this is coming, we make money by selling into our specialized markets and doing contract research work.

This pitch is starting to look problematic. The consumer market is waking up, but not necessarily to the displays made by eMagin and Kopin.

In immersive virtual reality (i.e. not a see-through system), smartphone displays are a much more economical solution. Because more than a hundred million smartphones are sold every year, the cost of a high-resolution smartphone display can easily be less than 5% of the cost of a comparable microdisplay. Microdisplay pricing has always been a chicken-and-egg game: prices can go down if quantities increase, but quantities will increase only if prices go down AND enough capital is available for production line and tooling investments.

Other technologies are also good candidates: pico projectors might become very popular for heads-up displays in cars, and once they are made in large quantities, they could also replace the microdisplay as a technology of choice.

Pico projectors are physically small, which might be attractive for see-through goggles similar to Google Glass. The current generation of see-through consumer products does not seek to be high resolution or wide field of view, and thus low-cost LCOS displays (Google is reportedly using Hynix) can provide a good solution for a high-brightness display that can be used outdoors. Karl Guttag had an interesting article on why Kopin's transmissive displays are not a good fit for these kinds of applications.

One more thing on the subject of microdisplay prices. Though the financial reports do not reveal that microdisplays are a terrifically profitable business, I suspect prices are also kept at a certain level because of "most favored nation" clauses to key customers such as, perhaps, the US government. Such clauses might force a microdisplay company that reduces prices to offer the same reduced price levels to these 'most favored' customers. Thus if, for example, the US government is responsible for a large portion of a company's revenue and has a most-favored-nation clause, any price reduction beyond what is offered to the government would immediately result in a significant loss of revenue once the US government prices are also reduced.

There will always be specialized applications where a display like eMagin's can be a perfect fit: perhaps ones that require a very small physical size (such as when installed in a simulated weapon), or ones that must withstand extreme temperature and shock, or ones where quality is paramount and cost is secondary. But these do not sound like high-volume consumer applications.

The financial reports of both eMagin and Kopin reflect this reality. Both companies are currently losing money as they seek to address this reality.

What can be done to expand the business? One option is vertical integration. An opto-electronic system using a display needs additional components beyond the display, such as driver boards and optics. Today, these come from third-party vendors, but one could imagine microdisplay companies offering electronics and optics - or maybe even motion trackers - for small to medium-sized production runs. Another option, currently pursued by Kopin, is offering complete platforms and systems such as the Golden-i platform. Ostensibly, the margins on systems are much higher than the margins on individual components, especially as those components become commodities. Over time, perhaps there is greater intellectual property there as well.

It will be interesting to see how this market shakes out in the upcoming months.

Full disclosure: I am not a shareholder of either company but my company uses eMagin microdisplays for several of our products.