At present, a growing number of video stream processing apparatuses that support IP multimedia conferencing, e.g. the Multipoint Control Unit (MCU) and the Multimedia Gateway (MG), use an FPGA chip to implement video stream processing in hardware logic, thereby improving video stream processing performance.
FIG. 1 illustrates a common FPGA-based video stream processing architecture of the prior art. An AD chip, which is connected to a terminal camera apparatus or a terminal display apparatus, performs A/D or D/A conversion on the input/output video stream. The FPGA chip is mainly used for pre-processing the input video stream and post-processing the output video stream, for example zooming, cropping, multi-image stitching, de-interlacing and captioning of the video image. The Digital Signal Processing (DSP) chip is used for decoding/encoding the input/output video stream and for sending it, over a specific communication protocol, to the network side or to another processing apparatus for transmission or further processing.
Because the frame rates of the video streams received or sent by the AD chip may differ from one another, the FPGA chip needs to receive a video stream from an input interface at one frame rate and forward it through an output interface at another frame rate, and may even need to perform multi-image stitching on input video streams of various frame rates. Therefore, the FPGA chip needs to perform frame rate adaptation during video stream processing.
In the prior art, the FPGA chip implements adaptation between frame rates by means of a Frame Extract/Insert unit. For the FPGA chip, the required input and output frame rates may differ from one IP multimedia conference to another, and there may be multiple input video streams for the same IP multimedia conference (i.e. multiple images must be stitched for each output video stream). The FPGA chip therefore typically takes the minimum frame rate among the input and output data streams of all the IP multimedia conferences as the system frame rate, so as to implement adaptation for the various input/output frame rates. FIG. 2 illustrates the logic by which an FPGA chip performs frame rate adaptation. The input A/D conversion signal is received via the A/D conversion interface, image processing is carried out, and frame extraction is then performed, so that the frame rate of every input data stream is adjusted to the same system frame rate. Frame insertion is performed on the output video stream, so that every output data stream is changed from the system frame rate to its corresponding output frame rate. In addition, in order to buffer the video stream and to stitch the multiple images in the multi-image case, a Buffer is configured to buffer the data stream. The interface capacity of the Buffer is determined by multiplying the system frame rate by the maximum image size and by the number of input/output interfaces.
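As a rough illustration of the two quantities described above, the following Python sketch computes the system frame rate as the minimum over all input and output rates, and the Buffer interface capacity as the stated product. The function names, the example rates, and the choice of bytes per frame as the unit of image size are assumptions made for illustration only; they are not taken from the prior-art design.

```python
def system_frame_rate(input_rates, output_rates):
    """Return the minimum frame rate over all input and output streams."""
    return min(list(input_rates) + list(output_rates))


def buffer_interface_capacity(system_rate, max_image_bytes, num_interfaces):
    """System frame rate x maximum image size x number of input/output interfaces.

    With the image size given in bytes per frame (an assumption), the
    result is in bytes per second.
    """
    return system_rate * max_image_bytes * num_interfaces


# Hypothetical example: three input streams and two output streams.
input_rates = [60, 30, 25]    # frames per second
output_rates = [50, 30]       # frames per second

rate = system_frame_rate(input_rates, output_rates)

# Assumed largest image: 1920x1080 at 2 bytes per pixel.
max_image_bytes = 1920 * 1080 * 2
num_interfaces = len(input_rates) + len(output_rates)

capacity = buffer_interface_capacity(rate, max_image_bytes, num_interfaces)
print(rate)      # 25
print(capacity)  # 518400000
```

In this example four of the five streams run faster than the resulting system frame rate of 25 frames per second, so each of them has to pass through frame extraction or insertion.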
The conventional FPGA chip frame rate adaptation solution therefore has the following problems.
1. A dedicated Frame Extract/Insert unit must be provided, which increases the difficulty of chip logic design and the resource overhead of chip processing.
2. An accurate frame extraction/insertion calculation must be performed to implement the frame rate adaptation, and this calculation becomes considerably more complicated if the input/output frame rate is not an integer multiple of the system frame rate.
3. The minimum frame rate of all the interfaces is taken as the system frame rate, so the actual input/output frame rates of most data streams are higher than the system frame rate, which degrades the image quality of most data streams.
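Problems 2 and 3 can be made concrete with a small sketch. The Python function below is one simple way to decide which input frame fills each output slot, using integer arithmetic only. It is not the calculation used by any particular Frame Extract/Insert unit; the name `resample_indices` and the example rates are hypothetical.

```python
def resample_indices(in_rate, out_rate, num_in_frames):
    """Map each output frame slot to the index of the input frame it shows.

    When out_rate < in_rate some input frames are skipped (extraction);
    when out_rate > in_rate some input frames are repeated (insertion).
    """
    num_out_frames = num_in_frames * out_rate // in_rate
    return [k * in_rate // out_rate for k in range(num_out_frames)]


# Integer ratio (60 -> 30): every second frame is kept, a fixed pattern.
print(resample_indices(60, 30, 6))    # [0, 2, 4]

# Non-integer ratio (60 -> 25): the gap between kept frames varies.
print(resample_indices(60, 25, 12))   # [0, 2, 4, 7, 9]

# Insertion (25 -> 30): one input frame is shown twice.
print(resample_indices(25, 30, 5))    # [0, 0, 1, 2, 3, 4]
```

With an integer ratio the unit only needs a fixed "keep every N-th frame" counter, whereas the 60 to 25 case requires tracking a running remainder to produce the uneven spacing. The same case also shows the quality cost of a low system frame rate: only 5 of every 12 input frames survive extraction.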