The present invention relates to an image processing device and a system using the same, which are capable of performing graphics processing, drawing processing and display processing at high speed.
More specifically, the present invention relates to an information terminal machine, such as a facsimile machine, printer, graphics device, portable personal terminal machine, navigation machine and amusement device, and to an image processing system which performs inputting, processing, editing, accumulating, communicating, outputting and displaying of image data using the information terminal machine.
In particular, the present invention relates to an image processing device and a system using the same which can minimize the amount of data being transferred between a memory device and the image processing device to thereby perform high speed processing.
Further, in particular, the present invention relates to a graphics drawing method, and to an image processing device and a system using the same, which, in order to minimize the amount of data being transferred between a memory device and the image processing device, reduce the drawing suppression processing (hidden surface processing) required by graphics overlapping, and which, in particular, perform complex image processing and three dimensional graphics display processing at high speed.
Further, in particular, the present invention relates to a method which suppresses an increase of components in an image processing system having high speed access to a memory, thereby reducing the construction cost thereof.
Various conventional systems for effecting data transfer between a memory, which holds image data, and an image data processor and/or an image processing device, which processes the image data, have been developed. However, these conventional systems have inherent problems with regard to their processing speed, such as for accessing and drawing, in that their high speed processing and real time processing have proven to be insufficient for a device or a system which is required to process a great amount of data.
A proposal in which image data transferred between a memory device and an image processing device is drawn and processed in blocks representing a pixel aggregate is disclosed in Andy Goris et al. “A Configurable Pixel Cache for Fast Image Generation” (IEEE Computer Graphics and Applications, May 1987, pp. 24–32), which is hereinafter referred to as the Goris et al. publication.
According to Goris et al., pixel data is fetched in blocks on demand into a pixel cache, and a drawing process is executed such that a prefetch effect is limited within a block unit.
Even in a case when a drawing processing is performed while crossing the boundary between blocks, since the pixel data is generated via a common rasterizing mode, a prefetch for pixel data in the subsequent block is started.
Therefore, if the capacity of the pixel cache is small, in response to the prefetch of the pixel data in the subsequent block, the pixel data in the previous block is first pushed out of the pixel cache and then the pushed out pixel data again needs to be accessed via rasterization, which causes a problem in that frequent memory access is necessitated.
In particular, when a shading (fill) process is performed for a triangle, although it is necessary to generate pixel data two-dimensionally and to process the same, the memory access is performed one-dimensionally; therefore, even when performing a drawing processing of nearby pixel data, the prefetched data as indicated above cannot be utilized, which causes an inefficient memory access. In other words, it causes a problem in that additional memory bandwidth is necessitated.
Further, in a graphics architecture in which a memory area for drawing processing is arranged in a main memory, a sufficient memory bandwidth has to be provided, which causes a problem in that the memory cannot be efficiently used.
A three-dimensional graphics display device generally uses a display method in which an object to be displayed is divided into small three-dimensional graphics, such as triangles and quadrangles, and is modeled; a geometric arithmetic operation is performed on these small graphics based on their viewed direction; and, based on the resultant arithmetic values, the respective graphics are projected on the two dimensional coordinates of a display device, such as a CRT. At this time, however, it is required to check for overlapping of the respective graphics in their depth direction and to prevent drawing of hidden graphics on a frame buffer.
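The mismatch between two-dimensional pixel generation and one-dimensional memory access can be illustrated with a small count of the memory blocks touched by a square region. The following is only a sketch under assumed numbers (a 1024-pixel-wide frame, 64 pixels per memory block, an 8×8 region); the function names are hypothetical and do not come from the cited publication.

```python
def linear_blocks(x0, y0, w, h, width, block_pixels):
    """Distinct memory blocks touched when pixels are stored scanline by scanline."""
    return {(y * width + x) // block_pixels
            for y in range(y0, y0 + h)
            for x in range(x0, x0 + w)}


def tiled_blocks(x0, y0, w, h, tile_w, tile_h):
    """Distinct memory blocks touched when each block holds one tile_w x tile_h tile."""
    return {(x // tile_w, y // tile_h)
            for y in range(y0, y0 + h)
            for x in range(x0, x0 + w)}


# Assumed example: an 8x8 pixel region whose corner is at (96, 96).
linear = linear_blocks(96, 96, 8, 8, width=1024, block_pixels=64)
tiled = tiled_blocks(96, 96, 8, 8, tile_w=8, tile_h=8)
print(len(linear), len(tiled))  # prints "8 1"
```

With the one-dimensional layout each of the eight rows falls in a different block, whereas the same 64 pixels fit in a single two-dimensional block when the region is tile-aligned (and in at most four blocks when it is not).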
As a conventional method of judging graphics overlapping, a Z buffer algorithm, in other words a depth buffer algorithm, is generally used. This conventional method is, for example, explained in W. M. Newman et al. “Dialog Type Computer Graphics (II) 2nd edition” translated by Setsuo Ohsuga (McGraw-Hill Book Company, 1984, pp. 483–441), which is hereinafter referred to as the Newman et al. publication. According to this method, depth information is provided for every pixel. When drawing respective pixels, the depth information of a pixel to be drawn is compared with the depth information of the already drawn pixel at the same position, and when the pixel to be drawn is located in front of the already drawn pixel, the pixel is permitted to be drawn; conversely, when the pixel to be drawn is located deeper than the already drawn pixel, the drawing of the pixel is prevented.
Another method in which overlapping of graphics is checked via geometric calculation is, for example, explained on pages 442–443 of the Newman et al. publication. In this method, a circumscribed rectangle is defined for each of the polygons to be drawn and overlapping of these circumscribed rectangles is judged; when the respective circumscribed rectangles do not overlap each other, the judgment with regard to their depth is omitted.
On the other hand, a method which avoids the necessity of the Z buffer is explained in James D. Foley et al. “Fundamentals of Interactive Computer Graphics” translated by Atsumi Imamiya (Published by Japan Computer Association, 1982, pp. 569–572), which is hereinafter referred to as the Foley et al. publication. This method is based on a depth sort algorithm in which the drawing is started from the graphic in the deepest location and graphics closer to the viewer are successively drawn so as to overwrite the graphics already drawn.
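The per-pixel comparison described above can be sketched as follows. This is a minimal illustration rather than the method of the cited publication; it assumes that a smaller depth value means closer to the viewer, and the function and variable names are hypothetical.

```python
def draw_fragments(fragments, width, height, far=float("inf")):
    """Draw (x, y, z, color) fragments with a per-pixel depth comparison."""
    color = [[None] * width for _ in range(height)]
    depth = [[far] * width for _ in range(height)]
    z_reads = 0  # number of times the Z buffer had to be read
    for x, y, z, c in fragments:
        z_reads += 1
        if z < depth[y][x]:      # assumption: smaller z is in front
            depth[y][x] = z      # pixel is permitted to be drawn
            color[y][x] = c
        # otherwise the pixel is hidden and its drawing is prevented
    return color, depth, z_reads


# Three overlapping fragments at the same pixel, drawn in arbitrary order.
frags = [(0, 0, 5.0, "red"), (0, 0, 2.0, "blue"), (0, 0, 9.0, "green")]
color, depth, z_reads = draw_fragments(frags, width=2, height=2)
print(color[0][0], depth[0][0], z_reads)  # prints "blue 2.0 3"
```

The counter shows that the depth buffer is read once per overlapping fragment even though only the nearest fragment remains visible.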
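The circumscribed rectangle check can be sketched as below. It is an illustrative assumption of how such a test might be written, with polygons given as lists of (x, y) vertices and hypothetical function names; rectangles that merely touch are treated as overlapping.

```python
def bounding_rect(polygon):
    """Circumscribed axis-aligned rectangle (x_min, y_min, x_max, y_max)."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def rects_overlap(a, b):
    """True when two rectangles share at least one point."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 <= bx1 and bx0 <= ax1 and ay0 <= by1 and by0 <= ay1


def needs_depth_judgment(p, q):
    """Depth judgment is needed only if the circumscribed rectangles overlap."""
    return rects_overlap(bounding_rect(p), bounding_rect(q))


tri_a = [(0, 0), (4, 0), (0, 4)]
tri_b = [(10, 10), (14, 10), (10, 14)]
tri_c = [(2, 2), (6, 2), (2, 6)]
print(needs_depth_judgment(tri_a, tri_b))  # prints "False"
print(needs_depth_judgment(tri_a, tri_c))  # prints "True"
```

Overlapping rectangles do not guarantee that the polygons themselves overlap, so a positive result only means that the depth judgment cannot be skipped.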
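A depth sort of this kind can be sketched in a few lines. The sketch makes simplifying assumptions that are not in the cited publication: every graphic is an axis-aligned rectangle with a single depth value, a larger depth means farther from the viewer, and the names are hypothetical.

```python
def painter_draw(graphics, width, height):
    """Draw graphics from the deepest to the nearest, overwriting as we go."""
    frame = [[None] * width for _ in range(height)]
    writes = 0  # total pixel writes, including pixels later overwritten
    for g in sorted(graphics, key=lambda g: g["depth"], reverse=True):
        x0, y0, x1, y1 = g["rect"]  # half-open ranges: x0 <= x < x1
        for y in range(y0, y1):
            for x in range(x0, x1):
                frame[y][x] = g["color"]
                writes += 1
    return frame, writes


graphics = [
    {"rect": (2, 2, 4, 4), "depth": 1, "color": "near"},
    {"rect": (0, 0, 4, 4), "depth": 10, "color": "far"},
]
frame, writes = painter_draw(graphics, width=4, height=4)
print(frame[0][0], frame[3][3], writes)  # prints "far near 20"
```

No depth buffer is needed, but hidden pixels are still written (20 writes for a 16-pixel frame here), and the nearest graphic is always drawn last.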
Further, U.S. Pat. No. 4,303,986 discloses a provision of a selective writing means for a memory which stores two dimensional images.
In the method disclosed in the Newman et al. publication, a Z buffer algorithm is used for judgment of graphics overlapping, so that this method necessitates depth information for every pixel and therefore requires a Z buffer (frame buffer) of large capacity. For example, assuming a standard system having 24 bits for color information and 16 bits for the Z buffer with a screen having 1024×768 pixels, a memory of about 4M bytes in total is necessitated, of which about 2.4M bytes holds the color information and about 1.6M bytes is needed for the Z buffer. Further, in the conventional Z buffer algorithm, the same pixel is accessed as many times as the number of graphics that overlap, and every time a drawing is performed, the Z buffer content has to be read and the read Z value has to be compared, although only one of these comparison results is significant. For this reason, useless memory access is necessitated, which makes an improvement of drawing performance difficult.
On the other hand, the method disclosed in the Foley et al. publication, which avoids any need for the Z buffer, is effective with regard to memory capacity reduction; however, since graphics are successively drawn in an overlapping manner from the graphic in the deepest location, the graphic closest to the viewer sometimes cannot be drawn in time, depending on the number of graphics to be drawn. Thus, this method cannot be applied to a system which requires real time performance.
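The memory sizing can be checked with simple arithmetic. The sketch below assumes 24 bits of color and 16 bits of depth per pixel on a 1024×768 screen, and takes 1M bytes as 10^6 bytes; the function name is illustrative only.

```python
def buffer_bytes(width, height, bits_per_pixel):
    """Bytes needed to hold one value of bits_per_pixel bits for every pixel."""
    return width * height * bits_per_pixel // 8


color_bytes = buffer_bytes(1024, 768, 24)  # color information
z_bytes = buffer_bytes(1024, 768, 16)      # Z (depth) buffer
total_bytes = color_bytes + z_bytes

print(color_bytes, z_bytes, total_bytes)   # prints "2359296 1572864 3932160"
print(round(total_bytes / 1e6, 1), round(z_bytes / 1e6, 1))  # prints "3.9 1.6"
```

Under these assumptions the total is about 4M bytes, of which the Z buffer accounts for about 1.6M bytes.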
With regard to a conventional image processing system, for example, JP-A-5-258040 (1993) and JP-A-5-120114 (1993) disclose examples of a data processing system which makes use of a synchronous DRAM permitting high speed data transfer.
However, these image processing systems are not practical because they require excessive time when a plurality of image processing operations are performed as multiple tasks. Further, when bus control is performed via a time slot method by making use of a synchronous DRAM, mishits frequently occur, which causes a problem in that an increase of the throughput is limited.
Further, GAIN (Technical Report Published by Hitachi, Ltd. Semiconductor Division, No. 96, 1993.1, pp. 6–11) discloses an example for reducing the size and cost of the device by making use of a built-in RISC (Reduced Instruction Set Computer). However, the paper is silent with regard to a specific memory access method and bus utilization method which realize high speed image processing.
Still further, JP-A-4-107056 (1992) discloses a high speed processing method in which a bus which transfers image data from a decoder to a printer is made independent from an MPU bus.
Still further, in a conventional facsimile machine, an example of an image processing system is disclosed in Shuichi Fujikura et al. “A Development of a LSI for Facsimile Image Processing” (Oki Denki Research and Development Report, October, 1992. No.156, vol.59, No.4, pp.65–70) having a processor and high speed memory dedicated for image processing in an image input and output unit, and in which data distortion is corrected to thereby realize images of high quality. However, in association with gathering of control units, each formed by a one chip microcomputer, the space rate occupied by the image processing unit increases, which adversely affects the cost thereof.
Still further, the image processing system in a recent business use facsimile machine tends toward higher image quality, higher processing speed and larger memory capacity, such that LSIs dedicated for image processing and for coding are frequently constituted to have their own respective SRAMs. Accordingly, there arises a problem of increased cost of the devices.
Still further, a conventional image processing system used for a facsimile machine, a printer and a graphics device, as disclosed in JP-A-61-261961 (1986), has an SRAM (static memory) used for local processing by referring to nearby pixels at high speed and a DRAM (dynamic memory) used for storing data, such as symbol data and font data, operating at low speed, but having a large memory capacity. Therefore, the impossibility of integrating the above two types of memories is a significant problem from the point of view of device size reduction, integration into a single LSI, device constitution, device cost and product series development.
Still further, one of the reasons why high speed image processing could not be achieved with the above conventional art is that the image inputting and outputting processing and the communication processing function are required to have an extremely high real time property as well as a high speed bus throughput of about 4–20 MB/s such that their processings have to be performed via a dedicated processor and a local processing use dedicated memory independent from a main memory.
Due to the development of semiconductor micro-machining technology and improvements in microprocessor architecture, a high speed processor, such as a RISC, and a device having an operating speed of more than 100 MHz, such as a RAMBUS and synchronous DRAM, have appeared on the market. For example, the synchronous DRAM has already begun to draw attention as a memory having a large capacity operable at high speed. In contrast to the conventional DRAM, a synchronous DRAM can input and output data, address and control signals in synchronism with clocks to thereby realize a high speed data transfer comparable with the conventional SRAM; in addition, a synchronous DRAM having a larger memory capacity than the conventional DRAM can be realized at a low cost.
Image communication and processing have marked characteristics. The advantageous characteristics, with regard to the construction thereof, include regularity in address renewal, such as the continuity of the addresses to be processed, easy advance forecasting of the processing quantity, simple processing content, and limited influence of a processed result on nearby data. The disadvantageous characteristics include an intense real time requirement and a possible system break-down when the processing is not completed within a predetermined time. However, no devices and systems have been proposed until now which optimize the processing in view of these characteristics, so that it is necessary to provide a device and a system which take both the advantageous and the disadvantageous characteristics into consideration.