1. Field of the Invention
The invention relates to video encoding methods with variable video frames.
2. Description of the Related Technology
A video information stream comprises a time sequence of video frames. The time sequence of video frames can be recorded, for instance, by a video camera/recorder. Each of the video frames can be considered as a still image. The video frames are represented in a digital system as an array of pixels. The pixels comprise luminance (light intensity) and chrominance (color) information. The information is stored in a memory of the digital system. For each pixel some bits are reserved. From a programming point of view each video frame can be considered as a two-dimensional data type, although the video frames are not necessarily rectangular. Note that fields from an interlaced video time sequence can also be considered as video frames.
A particular aspect of the considered video frames is that they are variable in size and even location with respect to a fixed reference such as, e.g., the display. Moreover, the considered video frames support the object concept by indicating whether a pixel belongs to an object or not.
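As an illustration only, a video frame of the kind described above could be modeled as a two-dimensional data type that carries its own size, its location relative to the display, and a per-pixel flag indicating object membership. The names Pixel, VideoFrame, make_frame, x0, y0 and inside are assumptions introduced for this sketch and do not appear in the MPEG-4 standard or in the invention.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Pixel:
    y: int   # luminance (light intensity)
    cb: int  # chrominance (color), blue-difference component
    cr: int  # chrominance (color), red-difference component


@dataclass
class VideoFrame:
    x0: int      # horizontal offset of the frame relative to the display origin
    y0: int      # vertical offset of the frame relative to the display origin
    width: int   # frame width in pixels, may differ from frame to frame
    height: int  # frame height in pixels, may differ from frame to frame
    pixels: List[List[Pixel]]  # height rows of width pixels
    inside: List[List[bool]]   # True where the pixel belongs to the object

    def object_pixel_count(self) -> int:
        # Number of pixels flagged as belonging to the object.
        return sum(flag for row in self.inside for flag in row)

    def display_coords(self, row: int, col: int) -> Tuple[int, int]:
        # Map a frame-local (row, col) position to display (x, y) coordinates.
        return (self.x0 + col, self.y0 + row)


def make_frame(x0: int, y0: int, width: int, height: int) -> VideoFrame:
    # Hypothetical helper: a frame with placeholder pixel values and no object pixels.
    pixels = [[Pixel(16, 128, 128) for _ in range(width)] for _ in range(height)]
    inside = [[False] * width for _ in range(height)]
    return VideoFrame(x0, y0, width, height, pixels, inside)


frame = make_frame(x0=40, y0=24, width=4, height=3)
frame.inside[1][1] = True
frame.inside[1][2] = True
```

The bounding rectangle remains a plain two-dimensional array, while the inside flags capture the fact that the object itself need not be rectangular.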
In principle when the video information stream must be transmitted between two digital systems, this can be realized by sending the video frames sequentially in time, for instance by sending the pixels of the video frames and thus the bits representing the pixels sequentially in time.
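The cost of such direct transmission can be estimated with simple arithmetic. The sketch below is illustrative only; the frame size, the number of bits per pixel and the frame rate are assumed values chosen for the example and are not taken from the invention.

```python
def raw_bits_per_second(width: int, height: int,
                        bits_per_pixel: int, frames_per_second: int) -> int:
    # Every pixel of every frame is sent as-is, so the bit rate is a plain product.
    return width * height * bits_per_pixel * frames_per_second


# Assumed example: 352 x 288 pixels, 12 bits per pixel, 30 frames per second.
rate = raw_bits_per_second(352, 288, 12, 30)
print(rate)  # 36495360 bits per second
```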
There exist, however, more elaborate transmission schemes enabling faster and more reliable communication between two digital systems. The transmission schemes are based on encoding the video information stream in the transmitting digital system and decoding the encoded video information stream in the receiving digital system. Note that the same principles can be exploited for storage purposes.
During encoding, the original video information stream is transformed into another digital representation. The digital representation is then transmitted. During decoding, the original video information stream is reconstructed from the digital representation.
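The encode, transmit and decode principle can be sketched with a deliberately simple scheme. The example below is not MPEG-4 and not the method of the invention; it assumes frames of equal size holding only luminance values, sends the first frame unchanged, and sends each later frame as its difference from the previous frame. The function names encode and decode are chosen for this sketch.

```python
def encode(frames):
    # First frame is kept as-is; later frames become differences to the previous frame.
    encoded = []
    previous = None
    for frame in frames:
        if previous is None:
            encoded.append([row[:] for row in frame])
        else:
            encoded.append([[cur - prev for cur, prev in zip(cur_row, prev_row)]
                            for cur_row, prev_row in zip(frame, previous)])
        previous = frame
    return encoded


def decode(encoded):
    # Rebuild each frame by adding the received differences to the previous frame.
    frames = []
    previous = None
    for data in encoded:
        if previous is None:
            frame = [row[:] for row in data]
        else:
            frame = [[prev + diff for prev, diff in zip(prev_row, diff_row)]
                     for prev_row, diff_row in zip(previous, data)]
        frames.append(frame)
        previous = frame
    return frames


stream = [[[10, 10], [20, 20]],
          [[10, 11], [20, 20]]]
representation = encode(stream)
reconstructed = decode(representation)
```

In this toy case the second frame is represented mostly by zeros, which hints at why a transformed representation can be cheaper to transmit or store than the original pixels.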
The MPEG-4 standard defines such a transmission (and storage) efficient encoded digital representation of a video information stream.
Encoding requires operations on the video information stream. The operations are performed on a digital system (for instance in the transmitting digital system). Each operation performed by a digital system consumes power. The way in which the operations for encoding are performed is called a method. The methods have some characteristics such as encoding speed and the overall power consumption needed for encoding.
The digital system can either be application-specific hardware or a programmable processor architecture. It is well known that most of the power consumed by such digital systems while performing real-time multi-dimensional signal processing, such as video stream encoding, is due to the memory units in the digital systems and the communication paths between the memory units. More precisely, individual read and write operations from and to memory units by processors and/or datapaths, and between memories, become more expensive in power when the memory units are larger, and so does the access time or latency of the busses. Naturally, the number of read and write operations also determines the overall power consumption and the bus loading. The longer the communication path, the larger the power consumption of a data transfer operation. Communication here means the communication between memory units and the processors and datapaths found in the digital system, and between the memories themselves. There is also a difference between on-chip and off-chip memories. Note that the same considerations are valid when considering speed as a performance criterion.
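The dependence of power consumption on memory size, access count and on-chip versus off-chip placement can be made concrete with a toy cost model. Everything numerical in the sketch below is an assumption made for illustration: the square-root growth of the cost per access with memory size, the factor of 10 for off-chip accesses, and the access counts. The sketch only shows the qualitative trend described above.

```python
import math


def energy_per_access(words: int, off_chip: bool = False) -> float:
    # Assumed model: cost per access grows with the square root of the memory size,
    # and an off-chip access is assumed to cost 10 times more than an on-chip one.
    base = math.sqrt(words)
    return base * (10.0 if off_chip else 1.0)


def total_energy(accesses) -> float:
    # accesses: iterable of (access_count, memory_size_in_words, off_chip)
    return sum(count * energy_per_access(words, off_chip)
               for count, words, off_chip in accesses)


# Case 1: 1000 accesses served directly by a large off-chip memory.
all_large = total_energy([(1000, 65536, True)])

# Case 2: 256 words are fetched once from the large memory,
# then the same 1000 accesses are served by a small on-chip memory.
with_small = total_energy([(256, 65536, True), (1000, 256, False)])
```

Under these assumed numbers the second arrangement costs roughly a quarter of the first, because most accesses hit the small memory.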
As the power consumption of the digital system is dominated by read and write operations, that is, by manipulations of data types such as video frames, the methods are considered to be data-dominated.
As the algorithm specification, the algorithm choice and its implementation determine the number of operations and the required memory sizes, it is clear that these have a large impact on the overall power consumption and on other performance criteria such as speed and bus loading.
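The following sketch illustrates how two implementations of the same specification can differ in memory size and in the number of memory accesses. The computation chosen, a sum of absolute differences between two pixel rows, and the function names sad_two_pass and sad_fused are assumptions made for this example; the access counts are tallied by hand in the code and treat every array read or write as one access.

```python
def sad_two_pass(cur, ref):
    # Implementation A: store all differences first, then sum them in a second loop.
    diff = [0] * len(cur)  # intermediate buffer of N words
    accesses = 0
    for i in range(len(cur)):
        diff[i] = abs(cur[i] - ref[i])
        accesses += 3      # read cur[i], read ref[i], write diff[i]
    total = 0
    for i in range(len(cur)):
        total += diff[i]
        accesses += 1      # read diff[i]
    return total, accesses, len(diff)


def sad_fused(cur, ref):
    # Implementation B: the two loops are merged, so no intermediate buffer is needed.
    total = 0
    accesses = 0
    for i in range(len(cur)):
        total += abs(cur[i] - ref[i])
        accesses += 2      # read cur[i], read ref[i]
    return total, accesses, 0


cur = [1, 5, 9, 4]
ref = [2, 5, 7, 8]
result_a = sad_two_pass(cur, ref)  # (7, 16, 4)
result_b = sad_fused(cur, ref)     # (7, 8, 0)
```

Both versions return the same result, but the merged version needs half the array accesses and no intermediate storage.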
A method for encoding a video information stream that results in minimal power consumption of the digital system on which the method is implemented, and that exhibits excellent performance, e.g., being fast, must be based on optimized data storage, related to memory sizes, and optimized data transfer, related to the number of read and write operations. Such a method can be developed by transforming an initial, less power-optimal method by means of various code manipulations. Such a transformation approach must be supported by an adequate exploration methodology.
In general a method can be described as an ordered set of operations which are repetitively executed. The repetition is organized in a loop. During execution data is consumed and produced. The code manipulations can be loop and/or data-flow transformations. The transformations change the ordering of the operations in the loop and result in another data consumption-production ordering. Data reuse concepts can also be used to obtain a method that is closer to optimal in power consumption and speed. Data reuse deals with specifying from and to which memory data is read and written. More particularly, applying the data reuse concept means making copies of data in smaller memories and letting the processors and/or datapaths access the data from those smaller memories.
Naturally, when such a power consumption and speed optimal encoding method exists, it can be implemented on a digital system adapted for the method. This adaptation can be done by efficient programming of programmable (application-specific) processor architectures or by actually designing an application-specific or domain-specific processor with the appropriate memory units.
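The data reuse concept can be illustrated with a sliding-window computation, in which consecutive windows overlap and therefore read the same data repeatedly. The sketch below is an assumption-laden example and not the method of the invention: the lists big and small merely stand in for a large memory and a small memory, and the function names and the access counters are introduced for illustration.

```python
def window_sums_direct(big, w):
    # Every window element is read from the large memory each time it is needed.
    big_reads = 0
    sums = []
    for start in range(len(big) - w + 1):
        s = 0
        for k in range(w):
            s += big[start + k]
            big_reads += 1
        sums.append(s)
    return sums, big_reads


def window_sums_with_reuse(big, w):
    # Each element is read once from the large memory and copied into a small
    # circular buffer of w words; the window is then read from the small buffer.
    big_reads = 0
    small_reads = 0
    small = [0] * w
    sums = []
    for i in range(len(big)):
        small[i % w] = big[i]  # overwrite the oldest element with the newest
        big_reads += 1
        if i >= w - 1:         # buffer holds a complete window
            s = 0
            for k in range(w):
                s += small[k]
                small_reads += 1
            sums.append(s)
    return sums, big_reads, small_reads


data = [3, 1, 4, 1, 5, 9, 2, 6]
direct = window_sums_direct(data, 3)      # 18 reads from the large memory
reused = window_sums_with_reuse(data, 3)  # 8 reads from the large memory, 18 from the small one
```

The total number of reads does not decrease, but most of them move from the large memory to the small one, which is where the saving in power and access time is expected to come from.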
The fact that the power consumption is heavily dominated by data storage and data transfer of multi-dimensional data types is demonstrated in the publications [F. Catthoor, F. Franssen, S. Wuytack, L. Nachtergaele, H. De Man, “Global communication and memory optimizing transformations for low power signal processing systems”, IEEE workshop on VLSI signal processing, La Jolla Calif., October 1994] and [R. Gonzales, M. Horowitz, “Energy dissipation in general-purpose microprocessors”, IEEE J. Solid-state Circ., Vol. SC-31, No. 9, pp. 1277-1283, September 1996] for custom hardware and programmable processors, respectively.
Power consumption in deep submicron CMOS digital devices is dominated by the charging of wires on-chip and off-chip. The technological evolution aims at minimizing the power consumption by lowering the supply voltages, using short thin wires and small devices, and using reduced logic swing. These non-application-specific approaches do not exploit the characteristics of the application in the design of the digital system and/or in the implementation on a given digital system.
The following general principles for power consumption reduction are known: match architecture and computation, preserve locality and regularity inherent in the application, exploit signal statistics and data correlations, and deliver energy and performance on demand. These guidelines must, however, be translated and extended to the more memory-related context found in multi-media applications.
The data storage and transfer exploration methodology, applied for constructing the encoding methods presented in the invention, is discussed in the detailed description of the invention.
The different aspects of the invention will be illustrated for encoding following the MPEG-4 standard, discussed in the detailed description of the invention. The current realizations of MPEG-based video coding multi-media applications can be divided into two main classes: the customized architectures and the programmable architectures. The disadvantages of the customized approach [P. Pirsch, N. Demassieux, W. Gehrke, “VLSI architectures for video compression-a survey”, Proc. of the IEEE, invited paper, Vol. 83, No. 2, pp. 220-246, February 1995] are that the design is difficult, as only limited design exploration support is available, that the result is application-specific, and that it still has large power consumption due to a rigid memory hierarchy and central bus architecture. Many programmable processor solutions for video and image processing have been proposed, also in the context of MPEG [K. Roenner, J. Kneip, “Architecture and applications of the HiPar video signal processor”, IEEE Trans. on Circuit and Systems for Video Technology, special issue on “VLSI for video signal processors”]. Power consumption management and reduction for such processors are, however, hardly tackled. The disadvantages of the implementation on a programmable processor are (1) the large power consumption, due to expensive data transfers of which many are not really necessary, (2) most of the chip/board area is taken up by memories and busses, (3) addressing and control complexity are high, and (4) the speed is too low, such that parallel processing is necessary, which is difficult to program efficiently due to data communication.
Much work has been published in the past on cache coherence protocols for parallel processors. These approaches are mostly based on load balancing and parallelization issues for arithmetic operations. Although some work on data localization issues in order to obtain better cache usage exists, it is clear that a more data transfer and storage oriented solution is required for data-dominated applications such as multi-media applications. Data reuse is the basis for traditional caching policies. These policies are, however, not sufficiently application-oriented: they do not sufficiently exploit the particular algorithm which must be implemented, and they are not based on global optimization considerations.
The use of global and aggressive system-level data-flow and loop transformations is illustrated for a customized video compression architecture for the H.263 video conferencing decoder standard in [L. Nachtergaele, F. Catthoor, B. Kapoor, D. Moolenaar, S. Janssens, “Low power storage exploration for H.263 video decoder”, IEEE workshop on VLSI signal processing, Monterey Calif., October 1996] and for other realistic multi-media kernels in [F. Catthoor, S. Wuytack, E. De Greef, F. Franssen, L. Nachtergaele, H. De Man, “System-level transformations for low power data transfer and storage”, in paper collection on “Low power CMOS design” (eds. A. Chandrakasan, R. Brodersen), IEEE Press, pp. 609-618, 1998] and [S. Wuytack, F. Catthoor, L. Nachtergaele, H. De Man, “Power Exploration for Data Dominated Video Applications”, Proc. IEEE Intnl. Symp. on Low Power Design, Monterey, pp. 359-364, August 1996].