1. Field of the Invention
The present invention relates to methods and apparatus for generating images on a cathode ray tube ("CRT") or other display device. More particularly, the present invention relates to a pseudo-superscalar technique for video processing that allows the simultaneous execution of instructions in a single-scalar computer system.
2. Art Background
When processors were first introduced, instructions were launched one at a time. By contrast, the superscalar computer systems of today execute multiple instructions simultaneously by employing multiple instruction paths that allow instructions to be launched in parallel. A superscalar computer system determines which instructions may be executed independently, i.e., without a dependency upon another instruction, and then launches those instructions simultaneously. This multiple instruction launch results in a significant increase in instruction throughput.
The processing of video data is an area in which multiple instruction launch is particularly beneficial. Video data may be represented by a 24-bit wide data signal which contains the information associated with a single picture element, or pixel, on a CRT or other display device. The data signal comprises three 8-bit data components, each corresponding to one of the three sub-pixels within the pixel: a red sub-pixel, a green sub-pixel, and a blue sub-pixel. Typical video data processing operations such as dithering and interpolation require that all three data components in the data signal be operated upon in order to produce a new pixel value for display on the CRT. These operations often involve simple arithmetic such as addition and subtraction, which may result in the propagation of a carryover value caused by overflow.
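As an illustration of the 24-bit data signal described above, the following minimal sketch packs three 8-bit data components into one value and extracts them again. The function names and the component order (red in the most significant byte, blue in the least significant) are assumptions made for this example only; the text above does not specify a layout.

```python
def pack_rgb(red, green, blue):
    """Pack three 8-bit components into one 24-bit value.

    Assumed layout (not specified in the text): red in bits 23-16,
    green in bits 15-8, blue in bits 7-0.
    """
    for component in (red, green, blue):
        if not 0 <= component <= 0xFF:
            raise ValueError("each component must fit in 8 bits")
    return (red << 16) | (green << 8) | blue


def unpack_rgb(pixel):
    """Extract the (red, green, blue) components from a 24-bit value."""
    return (pixel >> 16) & 0xFF, (pixel >> 8) & 0xFF, pixel & 0xFF
```

Under this assumed layout, `pack_rgb(0x12, 0x34, 0x56)` yields `0x123456`, and `unpack_rgb` recovers the three components from it.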
Thus, if the data signal is operated on as a single 24-bit wide value, a carryover value can propagate from one data component into the next. Any such carryover value will invalidate the contents of the adjacent data component. Prior art solutions typically require the extraction of individual data components from the data signal and subsequent operation on each data component separately. In a single-scalar architecture, i.e., an architecture where only one instruction is launched at a time, this results in high processing overhead because the number of instructions required is more than trebled. A superscalar system overcomes this by launching the three sets of instructions simultaneously, completing all three sets in the time it takes a single-scalar system to complete one set.
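The carryover problem and the extract-and-operate approach attributed to the prior art can be sketched as follows. This is an illustrative example only: the function names, the byte layout (red high, blue low), and the choice to let each component wrap modulo 256 on overflow are assumptions, since the text does not say how an overflowing component should be handled.

```python
MASK_24 = 0xFFFFFF  # keep results within the 24-bit data signal


def add_packed_naive(pixel, delta):
    """Add two packed values as single 24-bit integers.

    A carry out of one 8-bit component spills into its neighbour.
    """
    return (pixel + delta) & MASK_24


def add_per_component(pixel, delta):
    """Extract each component, add it separately, and repack.

    Assumption: each component wraps modulo 256 on overflow.
    """
    result = 0
    for shift in (16, 8, 0):
        a = (pixel >> shift) & 0xFF
        b = (delta >> shift) & 0xFF
        result |= ((a + b) & 0xFF) << shift
    return result


pixel = 0x1020F0   # red 0x10, green 0x20, blue 0xF0 (assumed layout)
delta = 0x000020   # add 0x20 to the blue component only

naive = add_packed_naive(pixel, delta)      # 0x102110: green corrupted to 0x21
isolated = add_per_component(pixel, delta)  # 0x102010: green still 0x20
```

In the naive result the overflow of the low component increments the middle component, which is the invalidation described above. The per-component version avoids this at the cost of separate shift, mask, and add steps for each of the three components, which is the instruction overhead the text refers to.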
However, despite the increase in performance that superscalar systems offer, the cost of replacing or redesigning a single-scalar system is often prohibitive. Therefore, as will be described, the present invention provides a method and apparatus for simulating multiple instruction launch in a single-scalar system.