Methods for encoding an audio-visual signal are known in the art. According to these methods, a video signal is digitized, analyzed and encoded in a compressed manner. These methods are implemented in computer systems, either in software, hardware or combined software-hardware forms.
Most hardware encoding systems consist of a set of semiconductor circuits arranged on a large circuit board. State of the art encoding systems include a single semiconductor circuit. Such a circuit is typically based on a high-power processor.
Reference is now made to FIG. 1, which is a block diagram illustration of a prior art video encoding circuit 10.
Encoding circuit 10 includes a video input processor 12, a motion estimation processor 14, a digital signal processor 16 and a bitstream processor 18. Processors 12-18, respectively, are generally connected in series.
Video input processor 12 captures and processes a video signal, and transfers it to motion estimation processor 14. Motion estimation processor 14 analyzes the motion of the video signal, and transfers the video signal and its associated motion analysis to digital signal processor 16. According to the data contained within the associated motion analysis, digital signal processor 16 processes and compresses the video signal, and transfers the compressed data to bitstream processor 18. Bitstream processor 18 formats the compressed data and creates therefrom an encoded video bitstream, which is transferred out of encoding circuit 10.
It will be appreciated by those skilled in the art that such an encoding circuit has several disadvantages. For example, one disadvantage of encoding circuit 10 is that bitstream processor 18 transfers the encoded video bitstream, data word by data word, directly to an element external to encoding circuit 10. Accordingly, each time such data word is ready, the encoded video data word is individually transferred to the external element. Transfer of the encoded video in such a fashion greatly increases the data traffic volume and creates communication bottlenecks in communication lines such as computer buses. Additionally, circuit 10 requires a dedicated storage/bus which is allocated on a full time basis, hence, magnifying these disturbances.
Another disadvantage is that encoding circuit 10 is able to perform the encoding of video signals, only. Usually, moving picture compression applications include multiframe videos and their associated audio paths. While the encoding circuit 10 performs video compression and encoding, the multiplexing of compressed video, audio and user data streams are performed separately. Such an approach increases the data traffic in the compression system and requires increased storage and processing bandwidth requirements, thereby greatly increasing the overall compression system complexity and cost.
Reference is now made to FIG. 2, which is a block diagram of a prior art video input processor 30, as may be typically included in encoding circuit 10. Video input processor 30 includes a video capture unit 32, a video preprocessor 34 and a video storage 36. The elements are generally connected in series.
Video capture unit 32 captures an input video signal and transfers it to video preprocessor 34. Video preprocessor 34 processes the video signal, including noise reduction, image enhancement, etc., and transfers the processed signal to the video storage 36. Video storage 36 buffers the video signal and transfers it to a memory unit (not shown) external to video input processor 30.
It will be appreciated by those skilled in the art that such video input processor has several disadvantages. For example, one disadvantage of processor 30 is that it does not perform image resolution scaling. Accordingly, only original resolution pictures can be processed and encoded.
Another disadvantage is that processor 30 does not perform statistical analysis of the video signal, since in order to perform comprehensive statistical analysis a video feedback from the storage is necessary, thus allowing interframe (picture to picture) analysis, and processor 30 is operable in “feed forward” manner, only. Accordingly, video input processor 30 can not detect developments in the video contents, such as scene change, flash, sudden motion, fade in/fade out etc.
Reference is now made to FIG. 3 which is a block diagram illustration of a prior art video encoding circuit 50, similar to encoding circuit 10, however, connected to a plurality of external memory units. As an example, FIG. 3 depicts circuit 50 connected to a pre-encoding memory unit 60, a reference memory unit 62 and a post-encoding memory unit 64, respectively. Reference is made in parallel to FIG. 4, a chart depicting the flow of data within circuit 50.
Encoding circuit 50 includes a video input processor 52, a motion estimation processor 54, a digital signal processor 56 and a bitstream processor 58. Processors 54 to 58, respectively, are generally connected in series.
In the present example, video encoding circuit 50 operates under MPEG video/audio compression standards. Hence, for purposes of clarity, reference to a current frame refers to a frame to be encoded. Reference to a reference frame refers to a frame that has already been encoded and reconstructed, preferably by digital signal processor 56, and transferred to and stored in reference memory unit 62. Reference frames are compared to current frames during the motion estimation task, which is generally performed by motion estimation processor 54.
Video input processor 52 captures a video signal, which contains a current frame, or a plurality of current frames, and processes and transfers them to external pre-encoding memory unit 60. External pre-encoding memory unit 60 implements an input frame buffer (not shown) which accumulates and re-orders the frames according to the standard required for the MPEG compression scheme.
External pre-encoding memory unit 60 transfers the current frames to motion estimation processor 54. External reference memory unit 62 transfers the reference frames also to motion estimation processor 54. Motion estimation processor 54, reads and compares both sets of frames, analyzes the motion of the video signal, and transfers the motion analysis to digital signal processor 56.
Digital signal processor 56 receives the current frames from the external pre-encoding memory 60, and according to the motion analysis received from motion estimation processor 54, processes and compresses the video signal. Digital signal processor 56 then transfers the compressed data to the bitstream processor 58. Digital signal processor 56 further reconstructs the reference frame and stores it in reference memory 62. Bitstream processor 58 encodes the compressed data and transfers an encoded video bitstream to external post-encoding memory unit 64.
It will be appreciated by those skilled in the art that such an encoding circuit has several disadvantages. For example, one disadvantage of encoding circuit 50 is that a plurality of separate memory units are needed to support its operations, thereby greatly increasing the cost and the complexity of any encoding system based on device 50.
Another disadvantage is that encoding circuit 50 has a plurality of separate memory interfaces. This increases the data traffic volume and the number of external connections of encoding circuit 50, thereby greatly increasing the cost and the complexity of encoding circuit 50. Another disadvantage is that encoder circuit 50 does not implement video and audio multiplexing, which is typically required in compression schemes.
Reference is now made to FIG. 5, a block diagram illustration of a typical interlaced formatted video in a normal encoding latency mode. The top line depicts the video fields before encoding, while bottom line depicts compressed frames after encoding.
Video is generally received in a progressive or interlaced form. Typical interlaced rates are 60 fields/sec for NTSC standard and 50 fields/sec for PAL standard.
In order to minimize encoding latency, encoding circuits should begin processing of an image immediately after receipt of the minimal amount of image data. Video is comprised of a plurality of fields, wherein each frame has a top and bottom field, referenced herein as top m and bot m. The video fields illustrated in FIG. 5 are referenced top 0 and bot 0, top 1 and bot 1, etc. such that each pair of associated top and bot refers to a single frame.
Encoding circuits begin the encoding process after capturing M pictures, where M is defined as M=I/P ratio. I is defined as an I picture, which is the Intra frame or the first frame (frame 0) of the series of frames to be encoded, and P is a P picture, which is the predictive frame (frame 1), and is referenced from frame 0. The I/P ratio refers to a distance between successive I/P frames in video sequence. Typically, prior art encoding circuits, such as encoding circuit 10 or encoding circuit 50 begin processing the image after receipt of 2 or more pictures. Note that in FIG. 5, the I picture appears after the progression of 3 pictures, and as such, M=3.
It will be appreciated by those skilled in the art that such an encoding latency is a lengthy time period, and hence, has several disadvantages. One such disadvantage is that a large amount of storage is required to accumulate frames. Another disadvantage is that large latency does not enable use of encoding circuit 50 in time-sensitive interactive applications such as video conferencing and the like.