The present invention relates generally to computer systems and more specifically to circuits for performing operations in a computer system.
As is known in the art, computer systems and special hardware included therein are typically used to perform highly compute-intensive activities for a variety of applications. Video compression is one such compute-intensive activity, and it generally has high bandwidth and storage requirements. Video compression is often used to translate video images, such as those from a camera, VCR, or laser disk, into digitally encoded frames which can be easily transferred over a network or stored in a memory. When desired, the compressed images are decompressed for viewing on a computer monitor or other such device.
General techniques for performing compression are set forth in common video compression standards such as MPEG (Moving Picture Experts Group), motion JPEG (Joint Photographic Experts Group), and H.261. Implementations of the video compression techniques set forth in the foregoing standards generally have a high bandwidth requirement for performing mathematical operations, such as division.
The bandwidth and storage requirements associated with video data typically make it economically infeasible to deal with digital video data in its original form. Thus, in recent years a number of standards have been developed to compress video data for a variety of bandwidth and storage sensitive applications. These applications include video teleconferencing and the transmission of video over a network or digital satellite link. The methods included in these standards often use a compression technique which requires performing a large number of mathematical calculations upon a large amount of video data within a very short amount of time. For example, one step of video compression requires performing matrix and vector division upon the video data.
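By way of illustration only, the matrix division step mentioned above may be sketched as an element-by-element division of a block of video data by a matching block of divisors. The following Python sketch is not taken from any of the foregoing standards. The function names `round_div`, `divide_block`, and `divisions_per_second`, the round-to-nearest rule, and the assumption of one division per sample are all hypothetical and are chosen only to show why the number of divisions grows quickly with the amount of video data.

```python
def round_div(numerator, divisor):
    """Divide two integers and round to the nearest integer.

    Ties are rounded away from zero. The divisor is assumed to be positive;
    both the rounding rule and that restriction are illustrative assumptions.
    """
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    quotient, remainder = divmod(abs(numerator), divisor)
    if 2 * remainder >= divisor:
        quotient += 1
    return quotient if numerator >= 0 else -quotient


def divide_block(block, divisors):
    """Divide each element of a 2-D block by the divisor at the same position."""
    if len(block) != len(divisors):
        raise ValueError("block and divisors must have the same number of rows")
    result = []
    for data_row, divisor_row in zip(block, divisors):
        if len(data_row) != len(divisor_row):
            raise ValueError("rows must have the same length")
        result.append([round_div(n, d) for n, d in zip(data_row, divisor_row)])
    return result


def divisions_per_second(width, height, frames_per_second):
    """Count divisions per second, assuming one division per sample per frame."""
    return width * height * frames_per_second


if __name__ == "__main__":
    # Hypothetical 2x2 block and divisors, kept small for readability.
    print(divide_block([[16, -7], [5, 0]], [[8, 2], [3, 1]]))
    # Hypothetical frame size and rate, used only to show the scale involved.
    print(divisions_per_second(640, 480, 30))
```

Under these assumed figures, a single 640 by 480 frame at 30 frames per second already calls for more than nine million divisions each second, which suggests why the throughput of the division operation matters.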
One of the challenges of video compression and decompression is to provide a solution for encoding and decoding video data in a manner which produces a high quality image at minimal cost. In particular, for video compression, the solution should meet the computational requirements for performing division at a very fast rate to maintain a high quality of video compression. The solution should also have a throughput which meets the timing requirements of the other dependent components of the solution, which together form a complex dataflow.
In meeting the performance demands of a compute-intensive activity, such as video data compression, a variety of approaches have been taken to provide video compression arrangements that include both hardware and software components. One approach reduces the bandwidth requirements of the operations performed in video compression and decompression, such as division and multiplication operations. That is, rather than provide an arrangement that performs video compression meeting a high bandwidth requirement, a video compression technique is employed which requires a lower bandwidth. For example, rather than compress a large amount of video data, a choice is made to ignore or discard certain pieces of video data. The result is that the arrangement providing video compression has a lower bandwidth requirement for operations such as division. The drawback, however, is lower quality video compression.
Another approach, which avoids sacrificing quality for speed, employs multiple circuits, such as multiple dividers, rather than reducing the bandwidth requirements. For example, if nine quotients must be produced per cycle to meet bandwidth requirements, nine parallel independent dividers are used, each having an associated register containing the corresponding quotient. One drawback of this technique is the amount of hardware needed. Multiple dividers are required, along with associated registers to hold each of the nine quotients. Also, an additional control mechanism is required to select each of the associated registers.
Yet another technique for meeting the required bandwidth of an operation such as division uses an arrangement with a pipelined design. In one such arrangement for performing division, a fully pipelined divider is typically used, which includes a replication of hardware for each stage in the pipeline. A fully pipelined divider typically provides more hardware than is needed to address the very specific task of performing division for video compression and decompression. In other words, this approach does not focus upon minimizing the total area consumed by the hardware required to meet the division bandwidth requirement. The technique typically results in using more space on a computer chip and more hardware than needed to meet division bandwidth requirements.
Computer video compression and decompression techniques are one type of application having high bandwidth requirements when performing a highly compute-intensive activity, such as division. Other applications, such as systems which perform fluid flow analysis, face a similar problem of providing an arrangement which is able to meet high bandwidth requirements for mathematical calculations, such as division.