1. Technical Field
This invention relates generally to computer controlled imaging systems and, more particularly, to a method of input image preprocessing which substantially reduces image processing time, database generation and system on-line data storage requirements.
2. Discussion
There are numerous applications for computer controlled imaging systems which require real-time processing of image data to produce a realistic and interactively controllable visual display. One such application is aircraft flight simulators often used in the training of pilots. These systems reduce the expense and danger involved in flight training, especially military flight training, by providing realistic physical and visual simulation of flying a plane. These systems interlink a video display imagery system with a flight simulation computer to produce a visual display of the field of view seen from the window of the cockpit. This display is updated as the plane is apparently dynamically guided around and through a predefined gaming area, in response to the operation of simulated controls.
In order for such systems to provide effective training, the visual display of each scene must be as realistic as possible. Changes in each scene or to various objects therein must be made smoothly and quickly such that they are not distracting and appear to the operator as being responsive in real time to his operation of the simulated controls. The image processing systems used in the generation of the displays must, therefore, be highly efficient.
Three main techniques have been developed to efficiently generate and process the image data in these systems. The first is known as Computer Generated Imagery (CGI). CGI creates each scene for display using purely mathematical representations of each object and surface to appear in the scene. These mathematical representations are in the form of a series of points which define the outer limits of each object and surface. While this mathematical data is readily manipulable by a computer-based imaging system, the mathematically defined objects and surfaces often lack realism. Highly structured objects such as tanks, buildings and trees, even with extremely complex mathematical representations, often appear cartoonish and unrealistic.
Another approach, Computer Synthesized Imagery (CSI), provides a more realistic visual display by using stored real world images such as digitized photographs. By storing a digitized photographic or similar synthetically created image for each scene in a large indexed database, a computer can access each for display in real time. However, each scene created by CSI is limited in perspective and distance to the point of view of the camera at the time the scene was acquired. Therefore, it is not possible to simulate the dynamic navigation of a very large gaming area without a prohibitive number of photographs, each taken at a slightly different angle.
A third system, Computer Generated Synthesized Imagery (CGSI), improves upon both the CGI and CSI systems by combining the best of both technologies. This system is discussed in detail in U.S. Pat. No. 4,645,459 to Graf et al. entitled COMPUTER GENERATED SYNTHESIZED IMAGERY. The Graf patent is assigned to the Assignee of the present invention and its disclosure is incorporated herein by reference. With Graf's CGSI system, an object database contains digitized photographic quality images of all individual objects necessary to describe the entire gaming area. These objects may include such things as terrain patches, sky, water, trees, bushes, buildings and military targets. Each scene is constructed by placing individual detailed objects on a specified background surface which may be either CGI or CSI generated and which may be laid out in accordance with a known or determinable grid.
The construction of each CGSI scene typically begins with the placement of background surfaces such as patches of land, water or sky. The sequence continues with the addition of objects such as trees, rocks, roads or buildings and then special effects such as shadows and clouds. The scene is normally constructed by beginning with the objects most remote and ending with nearer objects in order to provide proper occlusion between objects. The employment of a "Z" buffer, however, allows the scene to be constructed in any order.
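The difference between back-to-front construction and "Z" buffer construction can be illustrated with a minimal Python sketch. It is not taken from the Graf patent; the function names, the fragment tuple layout `(x, y, depth, value)`, and the convention that a smaller depth means nearer to the eyepoint are all assumptions made for illustration.

```python
def composite_with_z_buffer(width, height, fragments, background=0):
    """Composite fragments in ANY order, keeping the nearest one per pixel.

    fragments: iterable of (x, y, depth, value) tuples (hypothetical layout);
    a smaller depth is assumed to be nearer to the eyepoint.
    """
    color = [[background] * width for _ in range(height)]
    depth = [[float("inf")] * width for _ in range(height)]
    for x, y, z, value in fragments:
        # Write only when this fragment is nearer than what is already stored.
        if z < depth[y][x]:
            depth[y][x] = z
            color[y][x] = value
    return color


def composite_back_to_front(width, height, fragments, background=0):
    """Composite by drawing the most remote fragments first.

    Nearer fragments are drawn later and simply overwrite farther ones,
    so the input must be sorted by depth before drawing.
    """
    color = [[background] * width for _ in range(height)]
    for x, y, _z, value in sorted(fragments, key=lambda f: f[2], reverse=True):
        color[y][x] = value
    return color
```

Under these assumptions both functions yield the same picture when no two fragments at a pixel share the same depth; the depth-buffer version simply removes the need to sort the objects first.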
To construct each scene, Graf employs a field of view computer to determine which of the various stored objects, surfaces and special effects will be present in the scene. The field of view computer then coordinates the scene display relative to a particular point of view based upon the simulated vehicle's attitude and position in the gaming area as provided by the vehicle simulation computer. For each instance of every object to appear in the final scene, the field of view computer provides the object identity, its position and the orientation that it must have in the final scene. Object controllers fetch the object images from the object library database and present them to object processing channels which perform the manipulations required to fit the objects into the scene. The stored image of each object used in a scene is individually tailored for perspective, size, position, rotation, warp, and intensity prior to placement in the scene.
Each object image retrieved from the object image database is first temporarily stored in memory as an input image, typically in the form of a square array of pixels uniformly arranged in columns and rows. The input image is then warped to create an output image for display in the scene having the proper perspective and size from the viewpoint of the cockpit. As each object image passes through a warp processing channel, a linear warp transformation shrinks or enlarges the image to the required size and rotates the image as required for final orientation in the display. This creates an output image having an arrangement of rows and columns of pixels different from that of the object and input images.
Since many scenes contain a number of smaller objects, Graf provided for up to 16 objects to be loaded into memory and warped to the output display. However, this limits the system in its ability to deal with highly dynamic scenes which contain a very large number of objects. This is due to the maximum rate at which the object images are warped, given a single channel warp processor and the processing inefficiency associated with combining, sizing, rotating and positioning in a single warp process.
To overcome such limitations, a Multiple Object Pipeline (MOPL) Display System was created as disclosed in U.S. Patent Application Ser. No. 622,128 which is assigned to the present Assignee, the disclosure of which is also incorporated herein by reference. The MOPL attempts to overcome the shortcomings of previous CGSI systems by simultaneously warping a number of objects in a multiplicity of parallel pipelined channels in order to improve computational efficiency and provide higher speed imaging. Conservation of database storage and transfer bandwidth is achieved by dealing with objects of variable spatial resolution on a "cell" basis to provide a variable number of cells per object, i.e., a single cell for small simple objects and a large number of cells to represent more complex or larger objects.
Each cell is stored as an object image and is individually translated to proper screen coordinates and then properly occluded with the other cells. For purposes of simplicity, an object such as a tree, even though not rectangular, can be stored as a rectangular array of pixels wherein any pixels within the rectangle but not part of the tree have values equated with transparency wherein those pixels are always transparent to an underlying pixel.
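The transparency convention described above can be sketched as follows. This is only an illustrative Python fragment: the `TRANSPARENT` sentinel, the `place_cell` function, and the list-of-lists image layout are hypothetical and are not part of the MOPL disclosure.

```python
# Hypothetical sentinel marking pixels that are inside the cell's rectangle
# but are not part of the object (for example, the area around a tree).
TRANSPARENT = None


def place_cell(scene, cell, top, left):
    """Copy a rectangular cell into the scene, skipping transparent pixels.

    scene and cell are lists of rows of pixel values; (top, left) is the
    scene position of the cell's upper-left corner. The scene is modified
    in place and also returned for convenience.
    """
    for r, row in enumerate(cell):
        for c, pixel in enumerate(row):
            if pixel is TRANSPARENT:
                continue  # leave the underlying scene pixel visible
            y, x = top + r, left + c
            # Ignore any part of the cell that falls outside the scene.
            if 0 <= y < len(scene) and 0 <= x < len(scene[0]):
                scene[y][x] = pixel
    return scene
```

For example, placing the 2 by 2 cell `[[TRANSPARENT, 7], [7, TRANSPARENT]]` at row 0, column 1 of a 2 by 3 scene of zeros changes only the two pixels that belong to the object.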
One of many warp transformation processes which can be used by the MOPL to put each "cell" into its proper screen coordinates in the scene is a two pass algorithm which can best be understood with the aid of FIG. 1. Each object image, which contains a single cell to be displayed, is read from memory into an input image which may be a square array of pixels having corners A, B, C and D as indicated. Corresponding screen coordinates A, B, C and D of an output image, which is to be ultimately displayed on the screen, are determined by the field of view computer. In this warp transformation process, the four corners of the input image are mapped to the four corners of the output image in two orthogonal passes.
As shown by FIG. 1, the first pass reads vertical lines or columns of the input image from left to right and writes to an intermediate image. Each column that is read is sized and migrated to its correct vertical axis orientation, the pixels in each column being interpolated to create the correct number of pixels required for that column in the output image. This pass is known as the vertical or Y transformation process. The second pass is the X or horizontal transformation in which horizontal rows of pixels are read from the intermediate image and written to the output image. The object of the second pass is to migrate and interpolate all rows into their correct horizontal axis orientation. Each output image once transformed then becomes part of a larger scene memory wherein each cell to appear in the scene is mapped into its proper position and properly occluded with the other cells prior to display.
The interpolation process used in this warp transformation process is based on a scale factor or compression/decompression ratio which is determined by the ratio of the output image size in pixels to the input image size. The size and position parameters for each interpolation are determined from the input image corner coordinates in relation to those of the output image. The process is computationally invariant for all transformations once the four output corners are established. The interpolation process consumes consecutive input pixels and generates consecutive output pixels. Continuous interpolation over the discrete field of pixels allows the continuous line sizing and subpixel positioning of output lines and columns to eliminate any aliasing of diagonal edges.
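The role of the scale factor and the two orthogonal passes can be illustrated with a simplified Python sketch. It assumes plain linear interpolation between pixel centers and handles only axis-aligned resizing, not the full four-corner warp with subpixel positioning described above; `resample_line` and `two_pass_scale` are hypothetical names.

```python
def resample_line(line, out_len):
    """Resample one row or column of pixels to out_len pixels.

    The scale factor is the ratio of output size to input size, as in the
    text. Linear interpolation between pixel centers is an assumption.
    """
    in_len = len(line)
    if in_len == 0 or out_len <= 0:
        return []
    if in_len == 1:
        return [float(line[0])] * out_len
    scale = out_len / in_len
    out = []
    for i in range(out_len):
        # Map the center of output pixel i back into input coordinates.
        pos = (i + 0.5) / scale - 0.5
        pos = max(0.0, min(pos, in_len - 1.0))  # clamp at the line ends
        lo = min(int(pos), in_len - 2)
        frac = pos - lo
        out.append(line[lo] * (1.0 - frac) + line[lo + 1] * frac)
    return out


def two_pass_scale(image, out_rows, out_cols):
    """Resize a non-empty image (list of rows) in two orthogonal passes."""
    in_cols = len(image[0])
    # Pass 1 (vertical, Y): resample every column to the output height.
    columns = [
        resample_line([row[c] for row in image], out_rows)
        for c in range(in_cols)
    ]
    intermediate = [
        [columns[c][r] for c in range(in_cols)] for r in range(out_rows)
    ]
    # Pass 2 (horizontal, X): resample every intermediate row to the output width.
    return [resample_line(row, out_cols) for row in intermediate]
```

Because each pass works on one line at a time, it consumes consecutive input pixels and emits consecutive output pixels, which is the property the passage attributes to the interpolation process.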
To preserve object image integrity and minimize lost information, it is beneficial to maximize the size of the intermediate image. This imposes a restriction on the amount of roll any single object may traverse since the intermediate image collapses to a line at a 45° rotation. For this reason, for an image requiring a rotation of 60°, the object image may be either stored or read in at a 90° rotation and then rotated only 30°. All object image data may even be stored at a 90° rotation in order to facilitate ease of reading pixels in column order. Storing all object images at the highest magnification required such that all transformations basically involve a reduction in size also contributes to the maintenance of image integrity.
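The 60-degree example above amounts to splitting a requested rotation into a quarter-turn read-in orientation plus a small residual rotation for the warp. A minimal Python sketch of that bookkeeping follows; the function name `split_rotation` and its sign convention are assumptions, and a request of exactly 45 degrees from a quarter turn remains the degenerate case noted in the text.

```python
def split_rotation(angle_deg):
    """Split a rotation into (stored_rotation, residual).

    stored_rotation is the nearest quarter turn (0, 90, 180 or 270 degrees)
    at which the object image would be stored or read in; residual is the
    remaining signed rotation left for the warp, kept within 45 degrees
    of zero.
    """
    angle = angle_deg % 360
    quarter_turns = int((angle + 45) // 90) % 4
    stored = quarter_turns * 90
    residual = angle - stored
    if residual > 180:
        residual -= 360  # e.g. 350 degrees becomes a residual of -10
    return stored, residual
```

With this convention, `split_rotation(60)` selects the 90-degree orientation and leaves a residual of 30 degrees in the opposite direction.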
To conserve system storage space as well as improve the speed of the input image delivery system, the pixel data corresponding to each cell can be stored in the object image library in a compressed form as a compressed object image. Prior to use of the object in a scene, the compressed cell pixel data is retrieved and then decompressed before becoming the input image to the warp transformation process. A compression technique commonly used by those skilled in the art is the JPEG DCT (Joint Photographic Experts Group Discrete Cosine Transform) image data compression standard. This standard is often used since it enables practical compression ratios of over 20:1. The JPEG-standard compression operates on 8 pixel by 8 line blocks of 64 pixels of an image and can be conveniently implemented using dedicated compression and decompression chips. Data compression, however, can decrease system throughput for objects which appear small on the display.
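The storage arithmetic implied by the 8 by 8 block size and the 20:1 ratio can be sketched in Python. The cell dimensions, the one-byte-per-pixel default, and both function names are hypothetical; the 20:1 figure is only the nominal ratio quoted above, not a measured result.

```python
import math


def jpeg_block_count(width, height, block=8):
    """Number of 8 pixel by 8 line blocks needed to cover a cell.

    Partial blocks at the right and bottom edges are counted as whole blocks.
    """
    return math.ceil(width / block) * math.ceil(height / block)


def estimated_compressed_bytes(width, height, bytes_per_pixel=1, ratio=20.0):
    """Rough compressed size of a cell at an assumed compression ratio."""
    return width * height * bytes_per_pixel / ratio
```

Under these assumptions a 128 by 128 cell is covered by 256 blocks and shrinks from 16,384 bytes to roughly 820 bytes, while even a 10 by 10 cell still occupies four full blocks. The fixed block granularity for small cells is an observation of this sketch rather than a statement from the source.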
Other inefficiencies also result from storing all of the cells as uniformly sized pixel arrays for both near and far geospecific data. As shown in FIG. 2, even though the cells which make up the various segments of a gaming area terrain are stored as like-sized object images, referred to in this example as Levels of Detail (LOD), only those cells making up terrain segments near to the eyepoint such as segment 1-2 use most of the pixel information contained in the object image. The terrain segments farther from the eyepoint, such as segment 8-9, have the pixel data of their object image spatially compressed into a very small region when mapped onto the screen. This is very inefficient because it involves the processing of many input pixels in order to generate only a few output pixels.
One way to improve efficiency in this situation is to use a multiplicity of object image databases to allow those objects that are to appear in the near terrain to be of a high resolution while providing image information for distant objects having a lesser number of input pixels to be processed. However, a considerable amount of on-line storage with a large input bandwidth is required to support multiple databases while simulating movement at or near ground level in the gaming area or movement at very high speeds in airborne situations. An alternate method supporting both near and distant geospecific processing efficiencies and improving overall system throughput is therefore desired.