Electronic display technology for displaying graphical images and/or text has evolved dramatically to meet the pervasive user demand for more realistic and interactive displays. A wide range of display technologies with differing capabilities are now available including:
        Cathode Ray Tube (CRT)
        Bistable display
        Electronic paper
        Nixie tube displays
        Vector display
        Flat panel display
        Vacuum fluorescent display (VFD)
        Light-emitting diode (LED) displays
        Electroluminescent Displays (ELD)
        Plasma display panels (PDP)
        Liquid crystal display (LCD)
                High-Performance Addressing (HPA)
                Thin-film transistor displays (TFT)
        Organic light-emitting diode displays (OLED)
        Surface-conduction electron-emitter display (SED) (experimental)
        Laser TV (forthcoming)
        Carbon nanotubes (experimental)
        Nanocrystal displays (experimental), using quantum dots to make vibrant, flexible screens.
However, most display technologies are generally only capable of displaying two-dimensional images on a single screen. The ability to form images at different depths within a display, whether real or perceived, has been the subject of significant and ongoing research and development in the quest to provide display technology capable of replicating or augmenting the depth effects conferred by normal human sight.
The manner in which human beings process visual information has been the subject of extensive and prolonged research in an attempt to understand this complex process.
This research has included the effects of depth or ‘apparent depth’ provided by volumetric, three-dimensional or multi-focal plane displays.
The term “preattentive processing” has been coined to denote the act of the subconscious mind in analysing and processing visual information which has not become the focus of the viewer's conscious awareness.
When viewing a large number of visual elements, certain variations or properties in the visual characteristics of elements can lead to rapid detection by preattentive processing. This is significantly faster than requiring a user to individually scan each element, scrutinizing for the presence of the said properties. Exactly what properties lend themselves to preattentive processing has in itself been the subject of substantial research. Colour, shape, three-dimensional visual clues, orientation, movement and depth have all been investigated to discern the germane visual features that trigger effective preattentive processing.
Researchers have conducted experiments using target and boundary detection in an attempt to classify preattentive features. Preattentive target detection was tested by determining whether a target element was present or absent within a field of background distractor elements. Boundary detection involves attempting to detect the boundary formed by a group of target elements with a unique visual feature set within distractors. It may be readily visualised for example that a red circle would be immediately discernible set amongst a number of blue circles.
Equally, a circle would be readily detectable if set amongst a number of square shaped distractors. In order to test for preattentiveness, the number of distractors as seen is varied and if the search time required to identify the targets remains constant, irrespective of the number of distractors, the search is said to be preattentive. Similar search time limitations are used to classify boundary detection searches as preattentive.
A widely used threshold time for classifying preattentiveness is 200-250 milliseconds, as this only allows the user opportunity for a single ‘look’ at a scene. This timeframe is insufficient for a human to consciously decide to look at a different portion of the scene. Search tasks such as those stated above may be accomplished in less than 200 milliseconds, thus suggesting that the information in the display is being processed in parallel without focused attention, i.e. preattentively.
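The classification criterion described above can be illustrated with a short sketch. The Python below is an illustration only and is not the procedure used in the cited studies: the function names, the least-squares slope fit and the tolerance of 1 millisecond per distractor for a ‘constant’ search time are all assumptions introduced here. Only the 250 millisecond ceiling is taken from the text.

```python
from statistics import mean

PREATTENTIVE_THRESHOLD_MS = 250.0  # upper end of the 200-250 ms range quoted above
MAX_SLOPE_MS_PER_ITEM = 1.0        # assumed tolerance for a "constant" search time


def search_slope(set_sizes, times_ms):
    """Least-squares slope of search time (ms) against number of distractors.

    Requires at least two different set sizes, otherwise the slope is undefined.
    """
    mx, my = mean(set_sizes), mean(times_ms)
    num = sum((x - mx) * (y - my) for x, y in zip(set_sizes, times_ms))
    den = sum((x - mx) ** 2 for x in set_sizes)
    return num / den


def looks_preattentive(set_sizes, times_ms):
    """True if search time is roughly flat and stays under the threshold."""
    flat = abs(search_slope(set_sizes, times_ms)) <= MAX_SLOPE_MS_PER_ITEM
    fast = max(times_ms) <= PREATTENTIVE_THRESHOLD_MS
    return flat and fast
```

With hypothetical measurements, a search whose time stays near 180 milliseconds for 4 to 32 distractors would be classed as preattentive by this sketch, whereas one whose time grows by 25 milliseconds per added distractor would not.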
However, if the target is composed of a conjunction of unique features, i.e. a conjoin search, then research shows that it may not be detected preattentively. Using the above examples, if the target is, for example, a red circle set within distractors including blue circles and red squares, it is not possible to detect the red circle preattentively as all the distractors include one of the two unique features of the target.
Whilst the above example is based on a relatively simple visual scene, Enns and Rensink [1990] identified that targets given the appearance of being three dimensional objects can also be detected preattentively. Thus, for example a target represented by a perspective view of a cube shaded to indicate illumination from above would be preattentively detectable amongst a plurality of distractor cubes shaded to imply illumination from a different direction. This illustrates an important principle in that the relatively complex, high-level concept of perceived three dimensionality may be processed preattentively by the sub-conscious mind.
In comparison, if the constituent elements of the above described cubes are re-orientated to remove the apparent three dimensionality, subjects cannot preattentively detect targets which have, for example, been inverted. Additional experimentation by Brown et al [1992] confirmed that it is the three dimensional orientation characteristic which is preattentively detected. Nakayama and Silverman [1986] showed that motion and depth were preattentive characteristics and that, furthermore, stereoscopic depth could be used to overcome the effects of conjoin. This reinforced the work done by Enns and Rensink in suggesting that high-level information is conceptually being processed by the low-level visual system of the user. To test the effects of depth, subjects were tasked with detecting targets of different binocular disparity relative to the distractors. Results showed a constant response time irrespective of the increase in distractor numbers.
These experiments were followed by conjoin tasks. In the stereo colour (SC) conjoin tests, blue distractors were placed on a front plane whilst red distractors were located on a rear plane, and the target was either red on the front plane or blue on the rear plane. The stereo and motion (SM) trials utilised distractors on the front plane moving up or on the back plane moving down, with a target either on the front plane moving down or on the back plane moving up.
Results showed that the response times for SC and SM trials were constant and below the 250 millisecond threshold regardless of the number of distractors. The trials involved conjoin as the target did not possess any single feature that distinguished it from all of the distractors. However, it appeared the observers were able to search each plane preattentively in turn without interference from distractors in another plane.
This research was further reinforced by Melton and Scharff [1998] in a series of experiments in which the search task consisted of locating an intermediate-sized target amongst large and small distractors. The experiments tested the serial nature of the search when the target was embedded in the same plane as the distractors, and the preattentive nature of the search when the target was placed in a separate depth plane to the distractors.
The relative influence of the total number of distractors present (regardless of their depth) versus the number of distractors present solely in the depth plane of the target was also investigated. The results showed a number of interesting features including the significant modification of the response time resulting from the target presence or absence. In the target absence trials, the reaction times of all the subjects displayed a direct correspondence to the number of distractors whilst the target present trials did not display any such dependency. Furthermore, it was found that the reaction times in instances where distractors were spread across multiple depths were faster than for distractors located in a single depth plane.
Consequently, the use of a plurality of depth/focal planes as a means of displaying information can enhance preattentive processing with enhanced reaction/assimilation times.
Three-dimensional or multi-focal plane displays are known to provide numerous advantages or capabilities unavailable with conventional two-dimensional displays. Examples of three-dimensional and multi-focal plane displays include stereoscopic displays and Multi-Layer Displays (MLD) respectively.
Known three-dimensional displays seek to provide binocular depth cues to the viewer via a variety of techniques including separate head-mounted displays located directly in front of each eye, lenticular displays and holography. Unfortunately, each of these possesses certain limitations. Head-mounted displays add ergonomic inconvenience, reduce the viewer's peripheral awareness and are often cumbersome and can cause nausea, headaches and/or disorientation. Lenticular displays are only really effective at oblique viewing angles and holography is currently limited to displaying static images.
Stereoscopic (and auto-stereoscopic) displays provide the appearance of a 3D image by providing slightly different visual images to the left and right eyes of the viewer to utilise the binocular capabilities of the human visual system.
MLD systems are multi-focal plane displays that use multiple layered screens or ‘display layers’ aligned parallel with each other in a stacked arrangement with a physical separation between each screen. Each screen is capable of displaying images on a different focal plane and thus such MLD systems are often referred to as multi-focal plane displays. Thus, multiple images separated by a physical separation or ‘depth’ can be displayed on one display. PCT Publication No. WO 1999/042889 discloses such an MLD in which depth is created by displaying images on the background screen furthest from the viewer which will appear at some depth behind images displayed on the screen(s) closer to the user. The benefits of MLDs, in particular those utilising the technology described in the published PCT Patent Publication Nos. WO 1999/042889 and WO 1999/044095, are gaining increasingly widespread recognition and acceptance due to their enhanced capabilities compared to conventional single focal plane displays (SLD).
The benefits of MLDs are especially germane to displays using liquid crystal displays (LCD), though MLDs can also be formed using other display technologies, e.g. an LCD front display layer may be layered in front of an OLED rear display layer.
There are two main types of Liquid Crystal Displays used in computer monitors, passive matrix and active matrix. Passive-matrix Liquid Crystal Displays use a simple grid to supply the charge to a particular pixel on the display. Creating the grid starts with two glass layers called substrates. One substrate is given columns and the other is given rows made from a transparent conductive material. This is usually indium tin oxide. The rows or columns are connected to integrated circuits that control when a charge is sent down a particular column or row. The liquid crystal material is sandwiched between the two glass substrates, and a polarizing film is added to the outer side of each substrate.
A pixel is defined as the smallest resolvable area of an image, either on a screen or stored in memory. Each pixel in a monochrome image has its own brightness, from 0 for black to the maximum value (e.g. 255 for an eight-bit pixel) for white. In a colour image, each pixel has its own brightness and colour, usually represented as a triple of red, green and blue intensities. To turn on a pixel, the integrated circuit sends a charge down the correct column of one substrate and activates a ground on the correct row of the other. The row and column intersect at the designated pixel, which delivers the voltage to untwist the liquid crystals at that pixel.
The passive matrix system has significant drawbacks, notably slow response time and imprecise voltage control. Response time refers to the Liquid Crystal Display's ability to refresh the image displayed. Imprecise voltage control hinders the passive matrix's ability to influence only one pixel at a time. When voltage is applied to untwist one pixel, the pixels around it also partially untwist, which makes images appear fuzzy and lacking in contrast. Active-matrix Liquid Crystal Displays depend on thin film transistors (TFT). Thin film transistors are tiny switching transistors and capacitors. They are arranged in a matrix on a glass substrate.
To address a particular pixel, the proper row is switched on, and then a charge is sent down the correct column. Since all of the other rows that the column intersects are turned off, only the capacitor at the designated pixel receives a charge. The capacitor is able to hold the charge until the next refresh cycle. And if the amount of voltage supplied to the crystal is carefully controlled, it can be made to untwist only enough to allow some light through. By doing this in very exact, very small increments, Liquid Crystal Displays can create a grey scale.
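The addressing scheme described above can be sketched in a few lines of Python. This is a simplified illustration rather than a model of any real panel driver: the function names, the 0 to 255 drive level and the linear mapping from drive level to transmitted light are assumptions made here for clarity.

```python
def address_pixel(held, row, col, level):
    """Return a copy of the held charges after addressing one pixel.

    held  : list of rows of stored drive levels, one per pixel capacitor
    level : hypothetical drive level, 0 (no charge) to 255 (full charge)
    """
    if not 0 <= level <= 255:
        raise ValueError("level must be in 0..255")
    updated = [r[:] for r in held]       # capacitors keep their previous charge
    for r in range(len(updated)):
        row_switched_on = (r == row)     # every other row stays switched off
        if row_switched_on:
            updated[r][col] = level      # charge reaches only this capacitor
    return updated


def transmitted_fraction(level):
    """Assumed linear mapping from drive level to fraction of light passed."""
    return level / 255
```

Addressing row 1, column 2 of an all-zero 2×3 panel with level 128 changes only that one entry, which mirrors the statement that only the capacitor at the designated pixel receives a charge.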
Most displays today offer 256 levels of brightness per pixel. A Liquid Crystal Display that can show colours must have three sub-pixels with red, green and blue colour filters to create each colour pixel. Through the careful control and variation of the voltage applied, the intensity of each sub-pixel can range over 256 shades. Combining the sub-pixels produces a possible palette of approximately 16.8 million colours (256 shades of red×256 shades of green×256 shades of blue). Liquid Crystal Displays employ several variations of liquid crystal technology, including super twisted nematics, dual scan twisted nematics, ferroelectric liquid crystal and surface stabilized ferroelectric liquid crystal. They can be lit using ambient light, in which case they are termed reflective; backlit, in which case they are termed transmissive; or lit by a combination of backlighting and ambient light, in which case they are termed transflective.
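The palette arithmetic above can be checked with a brief Python sketch. The function names and the 24-bit packing layout (red in the high byte, blue in the low byte) are illustrative assumptions and are not taken from the source.

```python
LEVELS_PER_SUBPIXEL = 256  # shades per red, green or blue sub-pixel


def palette_size(levels=LEVELS_PER_SUBPIXEL, subpixels=3):
    """Number of distinct colours when each sub-pixel has `levels` shades."""
    return levels ** subpixels


def pack_rgb(red, green, blue):
    """Pack three 0..255 sub-pixel intensities into one 24-bit integer."""
    for value in (red, green, blue):
        if not 0 <= value < LEVELS_PER_SUBPIXEL:
            raise ValueError("sub-pixel intensity must be in 0..255")
    return (red << 16) | (green << 8) | blue


print(palette_size())  # 16777216, i.e. about 16.8 million colours
```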
There are also emissive technologies such as Organic Light Emitting Diodes (OLED), and other similar technologies which project an image directly onto the back of the retina, all of which are addressed in the same manner as Liquid Crystal Displays.
To aid clarity and avoid prolixity, reference herein will be made to an “MLD” with two display layers, i.e. an MLD having front and rear display layers. However, this should not be seen to be limiting as the MLD may include three or more display layers as required by the application.
In general, an MLD is used to simultaneously display images on the front and rear display layers. The MLD is configured to display output image data from a computer system, video/image feed or other image generator, and in most applications the images are composite images formed from multiple image components, e.g. a foreground object and a background scene, or a computer mouse cursor and a computer software Graphical User Interface (GUI). The image components may be displayed on the same display layer or spread between both display layers.
For ease of reference, the position of the image components or ‘graphical objects’ on each display layer can be given as a range of orthogonal x and y co-ordinates representative of the spatial position of the image component in the plane of a display layer relative to a common fixed reference point, e.g. the edge of a display layer, viewer's position or a fixed external focal point.
However, existing computer operating systems, computer graphic controllers and software have to date not been optimised or configured for volumetric displays such as the aforementioned MLD system. Current operating systems have graphics engines that are capable of generating an image for display on two display screens in only three primary modes: ‘clone’, ‘dual’ or ‘extended’ display mode. In the clone mode, both screens display the same images and changes on one screen are reflected in the other. In the dual display mode, the screens display independent images and operate independently, with the user selecting which screen to interact with. In the ‘extended’ display mode the two screens in effect operate together as an enlarged single screen with images capable of being spread between the screens across a common border, e.g. the right hand side of one screen and the left hand side of the other screen.
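The three display modes can be summarised by the region of desktop that each of the two screens shows. The Python below is a hedged sketch of that idea only: the `Mode` enumeration, the `viewports` function and the tuple layout are hypothetical and do not correspond to any particular operating system's graphics API.

```python
from enum import Enum


class Mode(Enum):
    CLONE = "clone"
    DUAL = "dual"
    EXTENDED = "extended"


def viewports(mode, width, height):
    """Return one (desktop_id, x, y, w, h) viewport per screen, for two screens.

    desktop_id identifies which desktop image a screen samples from;
    (x, y) is the top-left corner of the sampled region in that desktop.
    """
    if mode is Mode.CLONE:
        # Both screens show the same region of the same desktop.
        return [(0, 0, 0, width, height), (0, 0, 0, width, height)]
    if mode is Mode.DUAL:
        # Each screen shows its own independent desktop.
        return [(0, 0, 0, width, height), (1, 0, 0, width, height)]
    if mode is Mode.EXTENDED:
        # One desktop of double width; the second screen continues
        # from the right-hand edge of the first.
        return [(0, 0, 0, width, height), (0, width, 0, width, height)]
    raise ValueError(f"unknown mode: {mode!r}")
```

For two 1280×768 screens, the extended mode in this sketch yields a second viewport starting at x = 1280, which corresponds to the common border described above.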
MLD systems can therefore be used with existing operating systems and software by treating the rear display layer as a separate screen in clone, dual or extended display modes. As the images are displayed on the different display layers, the separation between the display layers provides physical and perceived separation between those images.
For example, in one possible application, a picture editing computer program may be used where the GUI and original picture may be displayed on a rear display layer while the GUI ‘toolbars’ are displayed on the front display layer, i.e. using a dual or extended display mode. The toolbars will thus always appear in front of the rest of the GUI and the picture. However, if the user wants to ‘reposition’ the GUI and toolbars together, i.e. maintaining the spatial relationship therebetween, they must each be separately manually ‘dragged’ into position. There is no way to move or manipulate these two windows together with a single user action. The manual repositioning requirement is clearly undesirable and hampers user operability.
The majority of video or graphical content designed for display on an MLD is configured to display in the ‘extended’ mode. However, in order for such ‘MLD content’ to be viewed correctly it has to be played back in a video player that is capable of displaying a ‘double-wide’ resolution, i.e. extending across both display layers. An example of such a video player is QuickTime® which can play video in double-wide resolution (e.g. 2560×768, where 2560 is double a 1280 pixel wide resolution). In order for a developer to view and assess the effectiveness of the MLD content, the content must therefore be exported to the double-wide resolution and run before the developer can view what they are creating. Moreover, double-wide video players use the entire viewable area to show MLD content in this way. This presents a problem if the developer does not want to lose the context of the rest of their work space, for example the content development environment and GUI.
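To make the double-wide layout concrete, the following Python sketch splits a double-wide frame into two equal halves, one per display layer. It is illustrative only: the function name is hypothetical, the frame is modelled as a plain list of pixel rows, and which half is routed to the front or rear layer depends on the display configuration, so the halves are simply labelled left and right here.

```python
def split_double_wide(frame):
    """Split a double-wide frame (list of pixel rows) into left and right halves.

    A 2560-pixel-wide frame yields two 1280-pixel-wide images, one for
    each display layer of a two-layer MLD driven in extended mode.
    """
    width = len(frame[0])
    if width % 2 != 0:
        raise ValueError("double-wide frame must have an even width")
    half = width // 2
    left = [row[:half] for row in frame]
    right = [row[half:] for row in frame]
    return left, right
```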
Thus, in order to arrange images for display on the different display layers of an MLD using conventional single layer display (SLD) operating systems, a user has two options, i.e. to:
        1. Manually position images on the front and/or rear layer; or
        2. Generate a “double-wide” window that will span across both layers of an MLD device, though this method works in full screen mode only.
The developer must generate images for different layers that will make up the MLD images. This process may involve a time-dependent frame rendering process as each frame for each layer is generated and optimised independently. This creates a number of time consuming steps, including creation of different image layers or sets and arranging the image layers so that they are synchronised. This process must be repeated every time content is modified.
It would thus be advantageous to provide a means for maintaining the spatial relationship between images on different display layers of an MLD during repositioning of one of the images.
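The desired behaviour can be pictured with a small Python sketch in which images on different display layers are moved by a single shared offset so that their relative positions are preserved. This illustrates the problem statement only and is not the method of the invention: the `LayerImage` class, its fields and the `move_group` function are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerImage:
    name: str
    layer: str  # "front" or "rear"
    x: int      # position in the plane of the display layer
    y: int


def move_group(images, dx, dy):
    """Move every image by the same offset, keeping relative positions fixed."""
    return [LayerImage(i.name, i.layer, i.x + dx, i.y + dy) for i in images]
```

For example, moving a toolbar on the front layer and a picture on the rear layer by the same (dx, dy) leaves the offset between them unchanged, which a single user action cannot currently achieve when each window is dragged separately.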
In general, graphical content and images designed for an MLD are created with an image pair comprising a rear and a front image corresponding to rear and front display layers. Further images may be added if the MLD includes more than two layers. Each image must therefore be created separately. One exemplary method of generating front and rear image pairs is described in PCT publication WO03/040820 where two identical images are displayed on the front and rear display layers and the luminance of each image is varied to create the perception that a composite image is ‘floating’ between, in front, or behind the display layers.
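As a rough illustration of varying the luminance of an identical front and rear image pair, the Python sketch below divides a pixel's luminance between the two layers according to a desired apparent depth. The linear weighting, the depth convention (0.0 at the front layer, 1.0 at the rear layer) and the function name are assumptions made for illustration; the weighting actually used in WO03/040820 may differ.

```python
def split_luminance(luminance, depth):
    """Divide a pixel's luminance between front and rear layers.

    depth is the assumed apparent position of the pixel:
    0.0 places it on the front layer, 1.0 on the rear layer.
    Returns (front_luminance, rear_luminance).
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be in the range 0.0 to 1.0")
    front = luminance * (1.0 - depth)  # brighter in front when depth is small
    rear = luminance * depth           # brighter at the rear when depth is large
    return front, rear
```

Under this assumed weighting, a pixel of luminance 200 at depth 0.25 would be shown at 150 on the front layer and 50 on the rear layer.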
It is also possible to generate image pairs from three-dimensional (3D) data (e.g. having x, y, z coordinates for image parts) by processing the depth or “z” data of the images and then displaying each image part on the front or rear display layer as determined by its relative depth. The collective depth data of a 3D image or images is known as a depth map, which is also used to generate left and right eye image pairs (instead of front and rear image pairs) for 3D stereoscopic drivers and displays. The stereoscopic image generation process is also reliant on the existence of the depth data, though it processes the depth data differently to volumetric displays such as an MLD. For example, stereoscopic displays present slightly different images to the left and right eye of a viewer, emulating the naturally different perspective of the viewer's eyes if they were viewing a 3D object, i.e. stereoscopic displays emulate binocular vision of a 3D object. In contrast, images for volumetric displays are split between front and rear display layers and thus provide a physical depth between the images.
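Assigning image parts to the front or rear display layer by relative depth can be sketched as a simple threshold on a depth map. The Python below is a minimal illustration under stated assumptions: smaller depth values are taken to be nearer the viewer, the threshold of 0.5 and the `blank` fill value are hypothetical, and real content pipelines may use a different rule.

```python
def split_by_depth(pixels, depth_map, threshold=0.5, blank=0):
    """Split an image into front and rear images using a per-pixel depth map.

    pixels and depth_map are lists of rows with the same shape.
    Pixels with depth below the threshold go to the front image,
    all others go to the rear image; the vacated positions are
    filled with `blank`.
    """
    front, rear = [], []
    for pixel_row, depth_row in zip(pixels, depth_map):
        front.append([p if d < threshold else blank
                      for p, d in zip(pixel_row, depth_row)])
        rear.append([p if d >= threshold else blank
                     for p, d in zip(pixel_row, depth_row)])
    return front, rear
```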
The depth map from 3D data can be processed to create the “depth fusion” effect on an MLD as described in PCT publication WO03/040820. However, creating a realistic 3D representation can be a difficult and time consuming task, even for experienced 3D artists. 2D images do not normally contain 3D data and thus cannot be readily adapted to an MLD without substantial work generating the depth data. Current image editing software is also not designed to operate in multiple display layers of an MLD. Thus, to create 3D content, the developers for MLD systems typically work with 2D image content that is displayed on a single 2D display when using common content editing tools, e.g. picture or video editors. Each image is then assigned to the front or rear layer as required. This can be an extremely time consuming and error-prone method as the assigning of images to front and rear layers is normally completed in a 2D workspace with the front and rear image pairs being positioned in adjacent windows in the workspace on a double wide window. Acceptable results may only be obtained after several attempts in aligning the images, increasing the time required and therefore the cost of MLD content generation.
One of the main problems with developing 3D content with existing 2D editing software is that there is no automated way to synchronise related images to be displayed on different display layers so that they move and alter together as they are edited. Instead, a user must move each image independently.
It would thus be advantageous to provide an improved method and/or system to generate depth data for a 2D image for subsequent display on an MLD or other volumetric display.
It is an object of the present invention to address the foregoing problems or at least to provide the public with a useful choice.
All references, including any patents or patent applications cited in this specification, are hereby incorporated by reference. No admission is made that any reference constitutes prior art. The discussion of the references states what their authors assert, and the applicants reserve the right to challenge the accuracy and pertinence of the cited documents. It will be clearly understood that, although a number of prior art publications are referred to herein, this reference does not constitute an admission that any of these documents form part of the common general knowledge in the art, in New Zealand or in any other country.
It is acknowledged that the term ‘comprise’ may, under varying jurisdictions, be attributed with either an exclusive or inclusive meaning. For the purpose of this specification, and unless otherwise noted, the term ‘comprise’ shall have an inclusive meaning—i.e. that it will be taken to mean an inclusion of not only the listed components it directly references, but also other non-specified components or elements. This rationale will also be used when the term ‘comprised’ or ‘comprising’ is used in relation to one or more steps in a method or process.