Methods for encoding an audio-visual signal are known in the art. According to these methods, a video signal is digitized, analyzed and encoded in a compressed manner. These methods are implemented in computer systems, either in software, hardware or a combined software-hardware form.
Most hardware encoding systems consist of a set of semiconductor circuits, which are arranged on a large circuit board. State of the art encoding systems include a single semiconductor circuit, which is based on a high power processor.
Reference is now made to FIG. 1, which is a schematic illustration of a video encoding circuit, referenced 10, which is known in the art.
Circuit 10 includes a motion estimation processor 12, a motion estimation memory 14 connected to the motion estimation processor 12, a RISC processor 16 connected to the motion estimation processor 12 and an image buffer 18, connected to RISC processor 16.
RISC processor 16 transfers a portion of the video signal from image buffer 18 to memory unit 14. Motion estimation processor 12 analyzes the motion of the video signal. Motion estimation processor 12 utilizes memory unit 14 as a storage area for the video signal portion which is currently processed by it. When the motion estimation processor 12 has completed analyzing the motion of a video signal portion, it transfers the results of the motion estimation analysis to the RISC processor 16.
The RISC processor 16 performs all other processing and encoding tasks which the video signal has to undergo, such as discrete cosine transform (DCT), quantization, entropy encoding, bit-stream production and the like. The RISC processor 16 utilizes the image buffer 18 as a storage area for the video signal portion which is currently processed by it, and as a temporary storage for its computational purposes.
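By way of illustration only, two of the tasks listed above, the discrete cosine transform and quantization, can be sketched as follows. The 8 by 8 block size, the orthonormal scaling and the single uniform quantization step are assumptions made for this example and are not taken from circuit 10.

```python
import math

N = 8  # block size, assumed for illustration


def dct_2d(block):
    """Naive 2-D DCT-II of an N x N block with orthonormal scaling."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coeffs, step):
    """Uniform quantization with one hypothetical step size for all coefficients."""
    return [[int(round(value / step)) for value in row] for row in coeffs]
```

For a block of constant pixel values, only the first (DC) coefficient is expected to be non-zero after quantization, which is what makes the subsequent entropy encoding effective.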
It will be appreciated by those skilled in the art that such encoding systems have several disadvantages. For example, one disadvantage of circuit 10 is that each of the processing units 12 and 16 has a separate storage area. Accordingly, each of the processed portions of the video signal, such as ISO/IEC 13818 (MPEG-2) macro-blocks, has to be transferred to both memory unit 14 and image buffer 18. RISC processor 16 has to access image buffer 18 for the same data, each time this data is required. Such repeated retrieval of large data blocks greatly increases the data traffic volume over the data transmission lines of the encoding system.
Another disadvantage is that circuit 10 is able to execute the processing and encoding tasks only in a serial manner, and is thereby capable of processing only a single macro-block at a time, which requires high operational processor frequencies. Circuit 10 receives a macro-block, processes it and produces an encoded bit-stream. Internally, the RISC processor 16 operates in the same manner.
Hence, as long as the RISC processor 16 has not completed transmitting the encoded bit-stream of a selected macro-block, it cannot receive the next macro-block.
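The relation between strictly serial macro-block processing and the required operational frequency can be illustrated by a short calculation. The frame size, the frame rate and, in particular, the number of cycles spent per macro-block are hypothetical figures chosen for this example only.

```python
def required_clock_hz(width, height, fps, cycles_per_macroblock, mb_size=16):
    """Clock rate needed when macro-blocks are processed strictly one at a time.

    All arguments are illustrative assumptions; mb_size is the macro-block
    edge length in pixels.
    """
    macroblocks_per_frame = (width // mb_size) * (height // mb_size)
    return macroblocks_per_frame * fps * cycles_per_macroblock


# Hypothetical example: 720 x 480 pixels, 30 frames per second,
# 2000 cycles per macro-block.
clock = required_clock_hz(720, 480, 30, 2000)
```

Since no two macro-blocks overlap in time, the required frequency grows linearly with the number of cycles needed per macro-block.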
It will be appreciated by those skilled in the art that the operational frequency of circuit 10 has a direct effect on the heat produced by it, thereby requiring large cooling elements as well as massive cooling devices such as fans and the like.
It will be appreciated by those skilled in the art that such a circuit structure requires that input-output (I/O) operations be performed extremely fast, thereby greatly increasing the storage memory bandwidth requirements.
Another disadvantage of such systems is that all processing and encoding procedures (excluding motion estimation) are executed by the same RISC processor. In this case, the same circuit performs various types of computations, which makes the utilization of the processor's hardware resources very inefficient.
Methods for estimating motion in a video signal are known in the art. According to these methods, a frame is compared with previous frames. The difference between the frames is used to estimate a level of motion. These methods analyze a frame and map it, thereby indicating areas in the frame which have no motion over previous frames and areas in the frame which are assigned a motion level.
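A minimal sketch of such a mapping is given below. It assumes, for the purpose of illustration, that frames are lists of rows of luminance values, that the motion level of an area is the sum of absolute pixel differences, and that the block size and threshold are free parameters; none of these choices is dictated by the methods referred to above.

```python
def motion_map(current, previous, block=2, threshold=0):
    """Return a grid holding one motion level per block of the frame.

    A level of 0 marks an area with no motion over the previous frame.
    """
    rows, cols = len(current), len(current[0])
    levels = []
    for by in range(0, rows, block):
        row_levels = []
        for bx in range(0, cols, block):
            sad = 0  # sum of absolute differences for this area
            for y in range(by, min(by + block, rows)):
                for x in range(bx, min(bx + block, cols)):
                    sad += abs(current[y][x] - previous[y][x])
            row_levels.append(sad if sad > threshold else 0)
        levels.append(row_levels)
    return levels
```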
According to one such method, each pixel in the search area is analyzed. This method requires a vast number of estimation operations and is thereby extremely resource consuming. This method is also called a full exhaustive search.
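The full exhaustive search can be sketched as follows, assuming a sum of absolute differences (SAD) as the matching criterion and a square search area of a given radius around the block position. The function names and parameters are hypothetical and serve the example only.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
    return total


def full_search(cur, ref, cx, cy, size, radius):
    """Test every displacement within +/- radius.

    Returns (best_dx, best_dy, best_sad) for the block of the current frame
    whose top-left corner is at (cx, cy).
    """
    rows, cols = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > cols or ry + size > rows:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, size)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

For a radius r, up to (2r + 1) squared candidate positions are evaluated for every block, each requiring size squared pixel comparisons, which accounts for the resource consumption noted above.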
According to another method known in the art, the search area is scanned in a center weighted manner, which can be logarithmic and the like, whereby the center of the search area is scanned thoroughly at full resolution and the rest of the search area is scanned at lower resolution. Areas which are detected as having some motion in the low resolution search are scanned again at full resolution. This reduces the overall number of estimation operations.
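One possible reading of the center weighted scan is sketched below. It assumes that the matching cost is supplied as a hypothetical function cost(dx, dy), that the low resolution scan visits every second displacement, and that only the neighborhood of the best low resolution candidate is rescanned. It is a simplified illustration and not a specific known algorithm; unlike the full exhaustive search, it may miss the best match.

```python
def center_weighted_search(cost, radius, center_radius=1, coarse_step=2):
    """Minimize cost(dx, dy) over a search area of +/- radius.

    Returns (best_displacement, best_cost, number_of_evaluations).
    """
    evaluated = {}

    def probe(dx, dy):
        inside = abs(dx) <= radius and abs(dy) <= radius
        if inside and (dx, dy) not in evaluated:
            evaluated[(dx, dy)] = cost(dx, dy)

    # Full resolution scan of the center of the search area.
    for dy in range(-center_radius, center_radius + 1):
        for dx in range(-center_radius, center_radius + 1):
            probe(dx, dy)
    # Low resolution scan of the whole search area.
    for dy in range(-radius, radius + 1, coarse_step):
        for dx in range(-radius, radius + 1, coarse_step):
            probe(dx, dy)
    # Full resolution rescan around the best candidate found so far.
    bx, by = min(evaluated, key=evaluated.get)
    for dy in range(by - coarse_step + 1, by + coarse_step):
        for dx in range(bx - coarse_step + 1, bx + coarse_step):
            probe(dx, dy)
    best = min(evaluated, key=evaluated.get)
    return best, evaluated[best], len(evaluated)
```

The third returned value can be compared with the (2 * radius + 1) squared evaluations of a full exhaustive search over the same area.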
Reference is now made to FIG. 2, which is a schematic illustration of a DSP processor, referenced 50, which is known in the art.
DSP processor 50 is a single instruction multiple data (SIMD) type machine. It includes a plurality of identical processing units (P.U.) 52, 56, 60, 64, 68 and 72, and a random access memory (RAM) 61. RAM 61 is divided into segments 54, 58, 62, 66, 70 and 74.
Each memory segment is exclusively assigned and connected to a processing unit, whereby RAM segments 54, 58, 62, 66, 70 and 74 are assigned to and connected to processing units (P.U.) 52, 56, 60, 64, 68 and 72, respectively.
This structure has several disadvantages. One disadvantage of such a machine is that the same operation is performed by all of the processing units at the same time.
Another disadvantage of the SIMD machine is that the data is not shared among the processing units. For example, processing unit 56 can access data contained in RAM segment 66 via processing unit 64 only. It cannot do so directly. It will be appreciated by those skilled in the art that such a configuration is inefficient.
A further disadvantage is that individual operations that vary for different data items cannot be efficiently performed by an SIMD machine. The programming of such operations into the processing units is very difficult. Such individual operations can only be performed in a serial manner, while masking all irrelevant data, resulting in shutting off most of the processing units. The utilization of the hardware resources in an SIMD machine during such operations is very low, and the performance of the machine is dramatically decreased.
Another disadvantage relates to the interconnection structure between the processing units. It will be appreciated that a processing unit within an SIMD machine is connected to a limited number of neighboring processing units. Hence, communication between such a processing unit and a processing unit not connected thereto is often a complex operation.
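The effect of masking can be illustrated by a small software model, given here as a sketch only. It assumes that each distinct operation costs one serial pass over all processing units and that a unit whose data item is masked out remains idle during that pass; the function name and the notion of "cases" are hypothetical.

```python
def run_cases(lanes, cases):
    """Emulate data-dependent operations on a lock-step machine.

    lanes: one data item per processing unit.
    cases: list of (predicate, operation) pairs; each pair costs one pass in
    which only the units whose item satisfies the predicate are active.
    Returns (results, utilization), where utilization is the fraction of
    unit-passes that performed useful work.
    """
    original = list(lanes)
    result = list(lanes)
    active_slots = 0
    for predicate, operation in cases:
        mask = [predicate(value) for value in original]
        active_slots += sum(mask)
        result = [operation(value) if active else kept
                  for value, kept, active in zip(original, result, mask)]
    utilization = active_slots / (len(cases) * len(lanes))
    return result, utilization
```

When the cases are mutually exclusive, each unit is active in at most one pass, so the utilization cannot exceed one divided by the number of distinct operations.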
Bit-stream processing and generation, in a conventional encoding circuit, is performed by a general purpose processor. Bit-stream generation requires some specific operations, which cannot be performed efficiently by a general purpose processor. In order to perform such special operations, a general purpose processor uses a small portion of its processing resources, while shutting off the rest of them. Therefore, the disadvantage is that the resources of such a processor are not utilized efficiently.
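An example of such a specific operation is the packing of variable length codes into bytes, sketched below. The class name, the most-significant-bit-first ordering and the zero padding of the last byte are assumptions made for this example and do not describe any particular encoder.

```python
class BitWriter:
    """Packs variable length codes, most significant bit first, into bytes."""

    def __init__(self):
        self.buffer = bytearray()
        self.accumulator = 0
        self.bit_count = 0

    def write(self, value, length):
        """Append the lowest 'length' bits of 'value' to the stream."""
        for shift in range(length - 1, -1, -1):
            self.accumulator = (self.accumulator << 1) | ((value >> shift) & 1)
            self.bit_count += 1
            if self.bit_count == 8:
                self.buffer.append(self.accumulator)
                self.accumulator = 0
                self.bit_count = 0

    def flush(self):
        """Pad the final partial byte with zero bits and return the stream."""
        if self.bit_count:
            self.buffer.append(self.accumulator << (8 - self.bit_count))
            self.accumulator = 0
            self.bit_count = 0
        return bytes(self.buffer)
```

Each appended bit involves a shift, a mask and a test on a counter, which are narrow operations that leave most of a wide general purpose datapath unused.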