The present invention relates to a graphic display processing apparatus, and more particularly to the control of drawing in such an apparatus.
With the recent development of semiconductor technology, the function and performance of information processing apparatuses such as personal computers and workstations have improved significantly year by year. Factors in this improvement include speedup of the central processing unit (hereinafter referred to as CPU), increased capacity of storage units such as memory and external memory units, and improvements in the man-machine interface. Against this background, the so-called window system has come into practical use, in which one or more rectangular frames called windows are displayed on the screen and an application program is assigned to each of the windows. This type of window system features full graphics display processing, wherein not only images and figures but also characters are displayed through graphics. Window systems existed in the past as well, but they required too much processing time to be practical because of the limited performance of the CPU and the limited capacity of the storage unit. As is known in the art, the performance of the whole window system is affected significantly by the processing performance of the following drawing primitives in particular:
(1) Bit Block Transfer
(2) Character Drawing
(3) Line Segment Drawing

The parameters of equation (1) have the following meanings:
a: average time of one access to the VRAM
b: fixed overhead of transfer processing of one horizontal raster
c: fixed overhead of bit block transfer processing
m: overhead of processing necessary for one-word transfer
n: the number of VRAM accesses necessary for one transfer of data of one word
p: plane coefficient
x: the number of words transferred in the horizontal direction
y: the number of words transferred in the vertical direction
*: arithmetic multiplication sign
+: arithmetic addition sign

(1) Provided common to all the planes of the VRAM are a data structure transformer for transforming the format of external data by using mirror image inversion and swap separately or in combination, a bit mask register for controlling writing to the VRAM in unit of bit, an AND circuit for ANDing the contents of bit mask register and the data of data structure transformer bit by bit, a bit mask shifter for shifting data of the bit mask register, data structure transformer or AND circuit, a third merge register for holding the previous contents of data of the data structure transformer, a third shifter for shifting data of the data structure transformer and data of the third merge register, and a read data synthesizer for ORing bits of the contents of a read data selector of each plane to be described later so as to synthesize read data supplied to the CPU.
(2) There is provided an address generator including a source address register for holding an address of a source area on the VRAM, a destination address register for holding an address of a destination area, a pattern address register for holding an address of a pattern area, a source offset register for holding a value added to the contents of the source address register to update the same when a read access cycle by the access cycle generator to be described later ends, a destination offset register for holding a value added to the contents of the destination address register to update the same when a write access cycle by the access cycle generator to be described later ends, a pattern offset register for holding a value added to the contents of the pattern address register to update the same when the read access cycle by the access cycle generator to be described later ends, a first address adder for adding the contents of source address register and that of source offset register, the contents of destination address register and that of destination offset register or the contents of pattern address register and that of pattern offset register so as to update the value of each register, and a second address adder for adding the write data of CPU and the contents of source address register, destination address register or pattern address register so as to update the value of each register.
(3) There is provided an access cycle generator which, when receiving a request for reading the VRAM from the CPU or a sequential transfer counter to be described later, generates a read access request containing at least one of access operations to source, destination and pattern areas by using the address generator and when receiving a request for writing the VRAM from the CPU or the sequential transfer counter to be described later, generates a write access or the combination of read access and write access to a destination area by using the address generator and drives the memory controller for the VRAM.
(4) There is provided a sequential transfer counter which starts the access cycle generator by the designated number of sequential operations of write cycle or combination of read cycle and write cycle.
(5) There is provided a sequential transfer mask pattern generator which generates bit mask patterns respectively designated during the first and final write transfer processings by the sequential transfer counter and supplies the bit mask patterns, as external data, to a bit mask controller through the data structure transformer of data operation unit and during write transfer lying between the first and final write transfer processings, generates a bit pattern which permits writing of all the bits and supplies the bit mask pattern, as external data, to the bit mask controller through the transformer.

(1) Provided common to all the planes of the VRAM are a data structure transformer for transforming the format of external data by using mirror image inversion and swap separately or in combination, a bit mask register for controlling writing to the VRAM in unit of bit, an AND circuit for ANDing the contents of bit mask register and the data of data structure transformer bit by bit, a bit mask shifter for shifting data of the bit mask register, data structure transformer or AND circuit, a third merge register for holding the previous contents of data of the data structure transformer, a third shifter for shifting data of the data structure transformer and data of the third merge register, and a read data synthesizer for ORing bits of the contents of a read data selector of each plane to be described later so as to synthesize read data supplied to the CPU.
(2) There is provided an address generator including a source address register for holding an address of a source area on the VRAM, a destination address register for holding an address of a destination area, a pattern address register for holding an address of a pattern area, a source offset register for holding a value added to the contents of the source address register to update the same when a read access cycle by the access cycle generator to be described later ends, a destination offset register for holding a value added to the contents of the destination address register to update the same when a write access cycle by the access cycle generator to be described later ends, a pattern offset register for holding a value added to the contents of the pattern address register to update the same when the read access cycle by the access cycle generator to be described later ends, a first address adder for adding the contents of source address register and that of source offset register, the contents of destination address register and that of destination offset register or the contents of pattern address register and that of pattern offset register so as to update the value of each register, and a second address adder for adding the write data of CPU and the contents of source address register, destination address register or pattern address register so as to update the value of each register.
(3) There is provided an access cycle generator which, when receiving a request for reading the VRAM from the CPU or a sequential transfer counter to be described later, generates a read access request containing at least one of access operations to source, destination and pattern areas by using the address generator and when receiving a request for writing the VRAM from the CPU or the sequential transfer counter to be described later, generates a write access or the combination of read access and write access to a destination area by using the address generator and drives the memory controller for the VRAM.
(4) A data position transformer is provided which, when the number of bits of data from the CPU differs from the number of bits of the VRAM data bus, puts the data of the CPU to the left or right on the VRAM data bus through an image on the screen and supplies the thus put data, as external data, to the data structure transformer of data operation unit.

(1) Provided common to all the planes of the VRAM are a data structure transformer for transforming the format of external data by using mirror image inversion and swap separately or in combination, a bit mask register for controlling writing to the VRAM in unit of bit, an AND circuit for ANDing the contents of bit mask register and the data of data structure transformer bit by bit, a bit mask shifter for shifting data of the bit mask register, data structure transformer or AND circuit, a third merge register for holding the previous contents of data of the data structure transformer, a third shifter for shifting data of the data structure transformer and data of the third merge register, and a read data synthesizer for ORing bits of the contents of a read data selector of each plane to be described later so as to synthesize read data supplied to the CPU.
(2) There is provided an address generator including a source address register for holding an address of a source area on the VRAM, a destination address register for holding an address of a destination area, a pattern address register for holding an address of a pattern area, a source offset register for holding a value added to the contents of the source address register to update the same when a read access cycle by the access cycle generator to be described later ends, a destination offset register for holding a value added to the contents of the destination address register to update the same when a write access cycle by the access cycle generator to be described later ends, a pattern offset register for holding a value added to the contents of the pattern address register to update the same when the read access cycle by the access cycle generator to be described later ends, a first address adder for adding the contents of source address register and that of source offset register, the contents of destination address register and that of destination offset register or the contents of pattern address register and that of pattern offset register so as to update the value of each register, and a second address adder for adding the write data of CPU and the contents of source address register, destination address register or pattern address register so as to update the value of each register.
(3) There is provided an access cycle generator which, when receiving a request for reading the VRAM from the CPU or a sequential transfer counter to be described later, generates a read access request containing at least one of access operations to source, destination and pattern areas by using the address generator and when receiving a request for writing the VRAM from the CPU or the sequential transfer counter to be described later, generates a write access or the combination of read access and write access to a destination area by using the address generator and drives the memory controller for the VRAM.
(4) There is provided a sequential transfer counter which starts the access cycle generator by the designated number of sequential operations of write cycle or combination of read cycle and write cycle.
(5) There is provided a sequential transfer mask pattern generator which generates bit mask patterns respectively designated during the first and final write transfer processings by the sequential transfer counter and supplies the bit mask patterns, as external data, to a bit mask controller through the data structure transformer of data operation unit and during write transfer lying between the first and final write transfer processings, generates a bit pattern which permits writing of all the bits and supplies the bit mask pattern, as external data, to the bit mask controller through the transformer.
(6) There is provided a dot mask generator which generates a bit pattern for permitting write of only one bit on the VRAM data bus, supplies as external data the bit pattern to the bit mask controller through the data structure transformer of data operation unit, selectively renders, upon completion of write cycle to the VRAM, the bit pattern unchanged or rotated by one bit clockwise or counterclockwise, and when an overflow takes place as a result of the rotation, increments the value of destination of the address generator by +1 for clockwise rotation and decrements by -1 for counterclockwise rotation.
Bit block transfer is a generalized processing for transferring data in an area defined by a rectangle to another rectangular area, and is an important drawing primitive which occupies 50% or more of the processing of the whole window system. Character drawing occupies 20% to 30% of the processing in a general window system. The importance of its processing performance is readily understood considering that application programs consisting principally of character display, such as word processors, are executed on the window system. Line segment drawing occupies, in general, 10% to 30% of the overall processing. It is also an important drawing primitive, given that its share of the processing increases further when a sophisticated figure is displayed by an application program for, for example, computer-aided design (CAD).
Improvements in the performance of the CPU and increases in the capacity of the storage unit were mentioned above, but they merely provide the background techniques that bring the window system close to practical use. In other words, the crux of processings (1) to (3) above is how quickly data that has undergone pre-processing, such as calculation of the coordinates to be drawn, can be stored in the memory for display, i.e., video random access memory (hereinafter referred to as VRAM). In order to speed up the drawing processing per se, the display system has to be provided with a drawing speedup mechanism. Under these circumstances, many expedients for high-speed drawing have hitherto been contrived.
In a conventional system as disclosed in Japanese Patent Application Laid-open No. JP-A-59-119385, coordinates of a pixel are designated and an address is calculated through hardware to carry out bit block transfer. In a conventional system as disclosed in Japanese Patent Application Laid-open No. JP-A-01-107295, read/write operation from the CPU is expanded to perform bit block transfer. In accordance with a conventional system as disclosed in Japanese Patent Application Laid-open No. JP-A-01-140196, an address of the VRAM is generated by means of an address register and an address offset register.
A display system having a VRAM of plane type is considered. Another structure of VRAM called a packed pixel type is also available but the plane type is suitable for speedup of bit block transfer. In the plane type, the VRAM as viewed from the CPU is constructed of one or more planes and the CPU accesses it plane by plane. The number of planes determines the number of colors which can be displayed simultaneously or the number of gradations of gray scale which can be displayed simultaneously. For example, two colors or gradations can be displayed with one plane, 16 colors or gradations can be displayed with 4 planes and 256 colors or gradations can be displayed with 8 planes. When the transfer processing of one horizontal raster is effected repetitively for the number of rasters necessary to complete the whole processing, time t for bit block transfer processing can generally be expressed by the following equation:

t = p*y*(b + (a*n + m)*x) + c   (1)
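where the parameters a, b, c, m, n, p, x and y have the meanings given in the parameter list set out earlier, following the list of drawing primitives.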
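Equation (1) can be evaluated directly. The following is only a minimal sketch: the function name `bitblt_time` and the sample values in the usage note are illustrative assumptions, not figures from the source, and the time unit is arbitrary.

```python
def bitblt_time(x, y, a, b, c, m, n, p=1):
    """Evaluate equation (1): t = p*y*(b + (a*n + m)*x) + c.

    x: words transferred in the horizontal direction
    y: number of repetitions in the vertical direction
    a: average time of one VRAM access
    b: fixed overhead per horizontal raster
    c: fixed overhead of the whole bit block transfer
    m: processing overhead per one-word transfer
    n: VRAM accesses needed per one-word transfer
    p: plane coefficient (1 when all planes are processed simultaneously)
    """
    return p * y * (b + (a * n + m) * x) + c
```

For instance, with the assumed values x=10, y=4, a=1, b=2, c=5, m=0 and n=2, the function returns 93 for p=1 and 357 for p=4, which illustrates how plane-by-plane processing multiplies everything except the fixed overhead c.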
First, the presuppositions of the graphic display processing apparatus are considered. In bit block transfer, when data representative of a plurality of pixels read out of a transfer originator, i.e., source area, is written into a desired position of a transfer destination, i.e., destination area, data of one word of the source is, with high probability, written so as to cross two words of the destination. For example, in the case of a graphic display processing apparatus having a hardware construction capable of reading 16 pixels by one VRAM access, data can be written into the destination through one write operation without resort to shift processing at a probability of 1/16, and data subject to shift processing crosses two words at a probability of 15/16. Therefore, without any support by hardware, the value of the number n of VRAM accesses in bit block transfer changes with the presence or absence of shift processing, resulting in different values of processing time t. In contrast, with the merge function provided, the value of n can be kept constant regardless of shift processing of the data to be transferred. The merge function is a technique disclosed in, for example, Japanese Patent Application Laid-open No. JP-A-63-231548. The outline of the function is as follows. When a shifter is used during bit block transfer, shifting causes data to overflow, leaving a remainder which is not drawn in the transfer destination word during the first round. In order to permit the remainder to be drawn during the next round of word transfer, a register is provided which holds the data of the one preceding round, whereby the next transfer data is merged with the held data to provide data of 2-word length, from which the portion necessary for the next transfer is cut out and drawn.
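The merge function outlined above can be sketched in software as follows. This is only an illustration under stated assumptions: a 16-bit word whose most significant bit is the leftmost pixel, a rightward shift, and the hypothetical name `merge_shift`. It is not the circuit of the cited publication.

```python
WORD_BITS = 16
WORD_MASK = (1 << WORD_BITS) - 1

def merge_shift(source_words, shift):
    """Shift one raster right by `shift` pixels (0..15) using a merge register.

    Each source word is read exactly once; the merge register keeps the word
    of the preceding round so that its overflowed remainder can be drawn in
    the next destination word. Returns len(source_words) + 1 words.
    """
    merge = 0  # merge register: source word of the preceding round
    destination = []
    for word in list(source_words) + [0]:  # extra round flushes the remainder
        two_words = (merge << WORD_BITS) | word  # data of 2-word length
        destination.append((two_words >> shift) & WORD_MASK)  # cut out one word
        merge = word
    return destination
```

With a shift of 4, the single source word 0xFFFF yields the two destination words 0x0FFF and 0xF000, yet the source is still read only once, which is why n does not depend on the shift amount.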
With the aim of further increasing the speed of a graphic display processing apparatus capable of keeping the value of n constant regardless of the presence or absence of shift processing through the use of the merge function, the present invention presupposes realization of a graphic display processing apparatus having the merge function.
The parameter p called plane coefficient will now be described. The parameter p is determined by how many planes the VRAM of the graphic display processing apparatus in question has and how many planes of the VRAM can be handled for drawing processing simultaneously. When all the planes of the VRAM can be processed simultaneously, the value of parameter p is 1 regardless of the number of planes of the VRAM. When the processing is permitted to be carried out only plane by plane, the value of parameter p is 4 in the case of, for example, 16-color display (4 planes) and is 8 in the case of 256-color display (8 planes). When the CPU reads the contents of the VRAM to store them in the main memory, the VRAM must be read plane by plane and the value of parameter p cannot be made 1 regardless of the number of planes. But in the drawing processing within the VRAM according to the present invention, each plane is provided with a control circuit for support of drawing to permit the drawing processings of all the planes to be effected in parallel. Aiming at further speedup of a graphic display processing apparatus capable of keeping the value of parameter p at 1 regardless of the number of planes, the present invention presupposes realization of a drawing processing apparatus capable of processing all the planes simultaneously.
In order to reduce the processing time t, it is necessary either to reduce the transfer word number x in the horizontal direction by increasing the number of bits of data processed at a time, or to make the parameters a, b and m small. The transfer word number x may be reduced by increasing the number of memory chips constituting the VRAM, widening the data bus width of the VRAM and increasing the amount of hardware of the control circuits needed therefor. But at present, because of physical and economical restrictions, the data bus width per VRAM plane is often 16 to 32 bits. In addition, bit block transfer for a small area is affected more by the other parameters.
The parameter a is a basic parameter determining not only the speed of bit block transfer but also the drawing speed of the display unit as a whole. Since the parameter a signifies the average access time for the VRAM, its value may be decreased by using a memory element such as a multi-port memory for the VRAM, by accessing the VRAM with a plurality of words in high-speed page mode, by reducing the access time per se through adoption of a device of shorter access time, or by eliminating the synchronous overhead due to the difference between the operation period of the CPU and that of the VRAM. The parameter c is an overhead related to the pre-processing of drawing by, for example, the application program, operating system and device driver, and cannot be made sufficiently small by means of hardware of the apparatus. Generally, however, the ratio of c to t is often small, and the absolute value of parameter c is a factor which is automatically reduced through improvements in the performance of the CPU.
The parameter b is multiplied by the number of rasters necessary for bit block transfer processing. Therefore, when parameter b becomes large relative to the term a*n*x, as in the case of bit block transfer of a vertically elongated area, the influence of this parameter becomes prominent and so the parameter b must be minimized as far as possible. Factors dominating the parameter b will be described later. The parameter m represents the time required for the raster operation to be effected between transfer originator data read out of the VRAM and transfer destination data. When processed through software, the parameter m becomes about 15 times as large as the term a*n, causing significant speed reduction. It is to be noted that in equation (1), the coefficient of the transfer word number x, namely (a*n + m), is liable to have the greatest influence upon the whole processing time t.
Problems encountered in the conventional systems will now be clarified based on equation (1). In the technique shown in Japanese Patent Application Laid-open No. JP-A-59-119385, only a simple transfer processing of a rectangular area such as source copy can be carried out, and this technique cannot be utilized for transfer requiring overlap of graphic forms, which needs an operation between an original graphic at the transfer originator and a graphic to be drawn. "Source copy" is a kind of bit block transfer processing for copying a graphic at the transfer originator (source) onto an area of the transfer destination (destination). The operation processing effected between the original graphic and the graphic to be drawn is called raster operation. In both the processing of overlapping two graphic forms or adding patterns and the processing of displaying graphic cursors by means of a mouse or a pointing device, drawing is done using bit block transfer accompanied by the raster operation. In the window system, bit block transfer can be considered to be accompanied by the raster operation except in particular cases. The prior art in question is effective only for the particular instance of bit block transfer which does not include any raster operation. Also, though not clearly described in Japanese Patent Application Laid-open No. JP-A-59-119385, the prior art of interest needs a control unit such as a microprocessor for updating a read address register and a write address register. Even if the control unit is dedicated to the apparatus of Japanese Patent Application Laid-open No. JP-A-59-119385, it obviously takes a long time to update the read address register and write address register. This is a factor that increases the parameters m and b. Additionally, in the technique of the literature, x component and y component of coordinates of a given dot are used as upper and lower terms, respectively, to combine y and x so as to determine a VRAM address.
Accordingly, if the lateral size of the bit map of the VRAM is other than a power of 2, the VRAM address cannot be calculated from the coordinates by this combination alone. This is a factor that increases the parameter c.
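The power-of-2 restriction can be illustrated with a short sketch. It assumes, for illustration only, word-granular coordinates with the y component placed in the upper bits and the x component in the lower bits; the function names are hypothetical and do not come from the cited publication.

```python
def address_by_concatenation(x_word, y, x_bits):
    """Combine y (upper bits) and x (lower bits) into one address.

    Correct only when one raster occupies exactly 2**x_bits words.
    """
    return (y << x_bits) | x_word

def address_by_multiplication(x_word, y, words_per_raster):
    """General address calculation, valid for any raster length."""
    return y * words_per_raster + x_word
```

For a raster of 64 words (1024 pixels at 16 pixels per word) both functions give 131 for x_word=3, y=2. For a raster of 40 words (640 pixels) the correct address is 83, which bit concatenation cannot produce, so the multiplication has to be carried out elsewhere, for example in software.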
A technique disclosed in Japanese Patent Application Laid-open No. JP-A-01-107295 is capable of effecting a raster operation between an original graphic and a graphic to be drawn and of causing the read/write operation by the CPU to perform transfer of data. But in this prior art, the read cycle of the CPU is expanded to perform bit block transfer, so that a synchronous overhead occurs between the cycle time of the CPU and the cycle time of the VRAM, and the full performance of the VRAM cannot be realized. This is a factor that increases the parameter a.
An address generator shown in Japanese Patent Application Laid-open No. JP-A-01-140196 is comprised of an address register and an address offset register. This address generator can perform address calculation even when the lateral size of the bit map of the VRAM is other than a power of 2, but disadvantageously, when the capacity of the VRAM is increased and the number of bits of an address necessary for accessing is increased, register setting must sometimes be performed twice in a general information apparatus having a data bus of 16 bits. For example, in order to designate, in unit of word, an address of a VRAM having a bit map of 2048×1024 pixels, an address of 17 bits is needed. Then, to set an address register of 17 bits, a total of two write operations into registers must be done, including one write operation into a register of 16 bits and the other write operation into a register of one bit. Generally, setting of a control register and the like is effected with the I/O cycle by the CPU and consumes time. In order to reduce the number of setting operations of the control register, the control program may hold the address of the area to be transferred on the VRAM and set only the lower part of the address which needs to be changed. This method, however, uses registers of the CPU for the purpose of holding the address and disadvantageously decreases the number of registers of the CPU which would otherwise be utilized for other types of control. In performing bit block transfer, these address control registers are rewritten frequently, with the result that the number of setting operations is increased and the program becomes complicated, giving rise to factors of performance degradation. This leads to an increase of the parameter b.
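The 17-bit figure in the example above follows from simple arithmetic, sketched below. The function names are hypothetical, and a 16-pixel word on a single plane is assumed, consistent with the 16-bit VRAM access used elsewhere in this description.

```python
def word_address_bits(width_px, height_px, word_bits=16):
    """Number of address bits needed to select any word of one plane."""
    words = (width_px * height_px) // word_bits
    return (words - 1).bit_length()

def register_writes(address_bits, bus_bits=16):
    """Register write operations needed to set an address over the data bus."""
    return -(-address_bits // bus_bits)  # ceiling division
```

A bit map of 2048×1024 pixels holds 131072 words per plane, so `word_address_bits(2048, 1024)` gives 17 and `register_writes(17)` gives 2, matching the two write operations described above.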
When data of a rectangular area of a desired size is transferred, it is frequent that the opposite ends of the rectangle respectively begin and end partway through a word, which is the unit of accessing the VRAM. Accordingly, in performing transfer in unit of raster, separate processing is needed for drawing the partial words at the beginning and end of the transfer, giving rise to a decrease in processing speed. According to specific measurement results, the processing time required for the processing of the opposite ends amounted, on average, to about 40% of the processing time for the whole bit block transfer. This is a factor which increases the parameter b remarkably.
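The partial-word handling at both ends of a raster can be illustrated by computing a write mask for each word. This is a hedged sketch only: the name `raster_masks`, the 16-bit word, and the convention that bit 15 is the leftmost pixel are assumptions made for illustration.

```python
WORD_BITS = 16
FULL_MASK = (1 << WORD_BITS) - 1

def raster_masks(x_start, width):
    """Per-word write masks for pixels x_start .. x_start + width - 1.

    Requires width >= 1. A set bit permits writing of that pixel.
    """
    x_end = x_start + width - 1
    first_word = x_start // WORD_BITS
    last_word = x_end // WORD_BITS
    # First word: permit pixels from x_start to the right end of the word.
    first_mask = FULL_MASK >> (x_start % WORD_BITS)
    # Final word: permit pixels from the left end of the word to x_end.
    last_mask = (FULL_MASK << (WORD_BITS - 1 - x_end % WORD_BITS)) & FULL_MASK
    if first_word == last_word:
        return [first_mask & last_mask]
    middle = [FULL_MASK] * (last_word - first_word - 1)
    return [first_mask] + middle + [last_mask]
```

A 40-pixel span starting at pixel 4 gives the masks 0x0FFF, 0xFFFF and 0xFFF0: only the first and final words need special masks, while every word in between permits writing of all bits.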
In addition, depending on the application program, the data structure of the general memory constituting the main memory sometimes differs from that of the memory constituting the VRAM, and the control program is required to include conversion, leading to a decrease in processing speed. This is a factor that increases the parameter m.
Since equation (1) is a general expression, equations which indicate processing time t in typical different processings in bit block transfer will now be described.
In the case of drawing consisting of only write to the VRAM such as paint-out drawing, only an operation of writing color information into the VRAM generally proceeds and hence the value of n is 1. Therefore, the processing time t is given by equation (2):

t = y*(b + (a + m)*x) + c   (2)

In the case of source copy, which is simple transfer from the transfer originator to the transfer destination, VRAM read at the transfer originator and VRAM write at the transfer destination are needed and so the value of n is 2. Therefore, the processing time t is given by equation (3):

t = y*(b + (2*a + m)*x) + c   (3)

In the case where data at the transfer originator and data originally present at the transfer destination undergo operation processing and are written into the transfer destination, VRAM read at the transfer originator, VRAM read at the transfer destination and VRAM write at the transfer destination are needed and so the value of n is 3. Therefore, the processing time t is given by equation (4):

t = y*(b + (3*a + m)*x) + c   (4)

In the case where data at the transfer originator, data originally present at the transfer destination and a pattern undergo operation processing and are written into the transfer destination, VRAM read at the transfer originator, VRAM read at the transfer destination, VRAM read of the pattern and VRAM write at the transfer destination are needed and so the value of n is 4. Therefore, the processing time t is given by equation (5):

t = y*(b + (4*a + m)*x) + c   (5)
These are the basic kinds of operation in bit block transfer. In the prior art, the value of parameter m in equations (2) to (5) cannot be kept constant. That is, even when means is provided which makes the value of parameter m zero, or nearly zero in comparison with the value of parameter a, to achieve speedup in simple processing such as source copy pursuant to equation (3), the load on the CPU increases as soon as bit block transfer with raster operation pursuant to equation (4) or (5) begins, and the value of parameter m increases to about 5 to 20 times the value of parameter a. Disadvantageously, the processing speed therefore differs significantly depending on the kind of bit block transfer.
Problems encountered in character drawing will now be described. Character drawing signifies the processing of writing a character font (the form of a character) to a desired position on a full bit map screen. Generally, the address of the VRAM increases in the horizontal direction of the display. On the assumption that the data bus of the VRAM is of, for example, 16 bits, data of the VRAM having the plane structure is arranged horizontally in units of 16 pixels and therefore drawing must be carried out after the character font is suitably shifted for positioning. The thus shifted character font sometimes crosses a word boundary of the VRAM, and in such cases the write operation must be effected twice.
While the VRAM address increases in the horizontal direction of the display as described previously, character font data results from slicing a character in unit of dot, and the slices are sequentially stored in the vertical direction. The conventional system disclosed in Japanese Patent Application Laid-open No. JP-A-01-140196 is capable of addressing the VRAM vertically and therefore, when combined with the other conventional systems, it may be considered suitable for expanding the character font data vertically. But this is possible only when the number of bits of the character font data equals the number of bits of the VRAM data bus. Take the case where the font data is of 8 bits and the VRAM data bus is of 16 bits, for instance. When a character font is to be transferred to the VRAM by using a byte (8 bits) data transfer instruction of the CPU, the VRAM address increases vertically but the character font data from the CPU appears on the upper 8 bits and lower 8 bits of the VRAM data bus alternately. This is because the CPU address is assigned in unit of byte and, in the 16-bit bus, the lower 8 bits and upper 8 bits are defined as even and odd addresses, respectively. Accordingly, without any expedient applied, the conventional systems in combination fail to draw the character font at the expected position.
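The alternation between the upper and lower halves of the bus can be modeled in a few lines. The sketch assumes only what is stated above, namely a 16-bit bus on which even byte addresses use the lower 8 bits and odd byte addresses use the upper 8 bits; the function names are hypothetical.

```python
def byte_lane(cpu_byte_address):
    """Half of the 16-bit bus on which a byte transfer appears."""
    return "upper" if cpu_byte_address & 1 else "lower"

def place_on_bus(font_byte, cpu_byte_address):
    """16-bit bus value produced by a byte write of one font slice."""
    shift = 8 if cpu_byte_address & 1 else 0
    return (font_byte & 0xFF) << shift
```

Writing the same font slice 0xA5 to consecutive byte addresses 0 and 1 puts 0x00A5 and then 0xA500 on the bus, so successive slices of the character land in different horizontal positions unless the data is repositioned.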
Incidentally, because the 8-bit VRAM data bus is predominant internationally in the display systems of personal computers, the character font is also designed to have an 8-bit width in many applications. On the other hand, in a display system of the class having 1000 dots in the horizontal direction, the VRAM data bus is often of 16 bits or more in order to increase the drawing speed. Therefore, the inconveniences set forth above will generally occur in the future. Character drawing faces the problems described above.
To describe line segment drawing, a straight line is considered as a line segment. Many kinds of algorithm for generating the coordinates of the dots constituting a straight line have been contrived, but a unit for drawing a line segment at a high speed has in general a different construction from that of the aforementioned unit for increasing the speed of bit block transfer and must be provided separately.