1. Field of the Invention
The present invention relates to an apparatus and method useable for the real-time processing of video signals and other signals requiring digital processing.
2. The Background Art
With the rise in demand for increasingly complex electronic devices, such as computers performing video signal processing functions, it is often required that large quantities of data be manipulated by common operations, for example to lighten or darken an image or to merge two images together.
Signals requiring manipulation often contain a great deal of information, such as when an NTSC signal is being operated upon to produce various effects. A digital NTSC signal produces approximately 10.4 million pixels per second, and each pixel contains information for three colors, red, green, and blue. Thus, a digital NTSC signal results in more than 31 million pieces of digital data per second.
A typical computer CPU running at 200 MHz and having to process more than 30 million pieces of digital video data per second has fewer than seven CPU clock cycles available for processing each color component of each pixel. In order to provide a smooth transition during real-time video processing, it is necessary that processing of each frame be completed prior to its display. Thus, speed of processing is extremely important in real-time signal processing applications.
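The cycle budget stated above can be checked with a short calculation. The following C fragment is only an illustrative sketch: the constants are the approximate figures given in the text, and the function name is hypothetical.

```c
#include <stdint.h>

/* Approximate figures taken from the text; names are hypothetical. */
#define CPU_HZ               200000000u /* 200 MHz processor clock */
#define PIXELS_PER_SEC        10400000u /* approximate digital NTSC pixel rate */
#define COMPONENTS_PER_PIXEL         3u /* red, green, blue */

/* Returns the whole number of clock cycles available per color component. */
uint32_t cycles_per_component(void)
{
    /* 10.4 million pixels x 3 components = 31.2 million components per second */
    uint32_t components_per_sec = PIXELS_PER_SEC * COMPONENTS_PER_PIXEL;

    /* Integer division discards the fraction: 200e6 / 31.2e6 is about 6.4 */
    return CPU_HZ / components_per_sec;
}
```

Under these assumed figures the function returns 6, consistent with a budget of fewer than seven clock cycles per color component.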
In modern day computers, memory operations are the slowest operations performed, with the typical read or store operation requiring a full clock cycle in which to be executed. With the number of operations being performed on a real-time video signal as seen above, it is critical that even the simplest operations be performed in the most efficient manner. However, signal processing systems, though useful for their intended purposes, often do not optimize processing functions so that memory use is minimized. It would therefore be beneficial to provide a signal processing system and methods for its use which perform specialized processing functions while minimizing the number of transfers into and out of memory.
In addition to limiting the number of memory operations performed, it is often desirable to reduce the number of discrete operations performed in a given period of time. Typically, repetitive operations are performed sequentially rather than in parallel. Performing signal processing operations on many bytes of data simultaneously often reduces the amount of time required to perform those operations on the entire set of data, as compared to the amount of time which might be required to perform the same operations sequentially. It would therefore be beneficial to provide an apparatus and method which processes several bytes of data simultaneously.
In video processing operations, such as when decoding a video signal using the MPEG-II compression standard, or when scaling or rotating images, it is necessary to combine the incoming signal being processed with other internally generated signal data in order to achieve a given result. The present invention is only concerned with the combination of two signals when the overflow that may result is ignored, such as when adding the angular data corresponding to two pixel vectors.
In the prior art, 8-bit signal information from the incoming signal, such as pixel information, was added to other 8-bit information, one 8-bit byte at a time. This method, while useful for its intended purposes, utilizes the processor inefficiently, and fails to optimize memory operations.
In the discussion that follows, memory refers to any memory used by a Central Processing Unit to perform operations. The data being manipulated is typically real-time graphical image data, but may be any type of signal data requiring similar processing.
To illustrate the method of the prior art, a first set of four 8-bit bytes of signal data such as pixel information will be combined with a different set of four 8-bit bytes to produce a composite output.
FIG. 1 is a flow chart of the prior art method of combining video signal data represented by two groups of four 8-bit bytes each.
Referring to FIG. 1, at step 10 a data byte from the first group of video image data is loaded into memory. At step 12, a data byte from the second group of data is loaded into memory. Step 14 then adds the data byte from the first group to the data byte from the second group, ignoring any overflow condition. The method proceeds at step 16 when the processor causes the result of step 14 to be written into memory. The method continues at step 18 where it is determined if all four data bytes in each group have been combined. If not, the method proceeds again with step 10. If yes, the method ends.
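By way of illustration only, the byte-at-a-time flow of FIG. 1 might be expressed in C roughly as follows. The function name, the array parameters, and the GROUP_SIZE constant are hypothetical and are not taken from the figure; the sketch assumes that ignoring the overflow means the 8-bit sum wraps modulo 256.

```c
#include <stddef.h>
#include <stdint.h>

#define GROUP_SIZE 4 /* four 8-bit bytes per group, as in the example */

/* Hypothetical sketch of the prior art loop: one byte pair per iteration. */
void combine_bytewise(const uint8_t first[GROUP_SIZE],
                      const uint8_t second[GROUP_SIZE],
                      uint8_t result[GROUP_SIZE])
{
    for (size_t i = 0; i < GROUP_SIZE; i++) { /* step 18: repeat until all bytes are combined */
        uint8_t a = first[i];                 /* step 10: load a byte from the first group */
        uint8_t b = second[i];                /* step 12: load a byte from the second group */
        uint8_t sum = (uint8_t)(a + b);       /* step 14: add, discarding any carry out of bit 7 */
        result[i] = sum;                      /* step 16: store the result */
    }
}
```

For example, combining the bytes 200 and 100 in this sketch yields 44, since the sum of 300 exceeds the 8-bit range and the overflow is ignored.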
In order to combine the two signals, one memory operation was required for each data byte in both original signals, for a total of eight load operations. A total of four addition operations were required to combine the signals, and a total of four store operations were required to store the results in memory. Since each memory operation takes approximately one clock cycle to complete, combining two signals using the prior art method requires a minimum of 16 operations (eight loads, four additions, and four stores), of which the twelve memory operations alone account for no fewer than twelve clock cycles.
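The operation count given above can be tallied as in the following C sketch. The structure and function names are hypothetical, and the one-clock-cycle cost per memory operation is the approximation stated in the text rather than a measured value.

```c
/* Hypothetical tally of the operations used by the byte-at-a-time method. */
typedef struct {
    unsigned loads;  /* memory reads */
    unsigned adds;   /* additions */
    unsigned stores; /* memory writes */
} op_count;

/* Counts operations needed to combine two groups of bytes_per_group bytes. */
op_count count_bytewise_ops(unsigned bytes_per_group)
{
    op_count c;
    c.loads = 2u * bytes_per_group; /* one load per byte in each of the two groups */
    c.adds = bytes_per_group;       /* one addition per byte pair */
    c.stores = bytes_per_group;     /* one store per result byte */
    return c;
}

/* Total number of discrete operations. */
unsigned total_ops(op_count c)
{
    return c.loads + c.adds + c.stores;
}

/* Lower bound on clock cycles, assuming about one cycle per memory operation. */
unsigned min_memory_cycles(op_count c)
{
    return c.loads + c.stores;
}
```

With four bytes per group, this tally gives eight loads, four additions, and four stores, for 16 operations in total and at least twelve clock cycles spent on memory operations.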