In embodiments, the invention relates to dynamic bit-rate reduction of pre-compressed video streams. Reducing the bit-rate of a pre-compressed data stream is called transrating, and devices that perform this task are called transrators. Transrating and transrators are subsets of the more general fields of transcoding and transcoders, respectively. Video transcoding is a process in which pre-compressed video data is converted into another valid compressed video data stream. Embodiments of the present invention may be used in a variety of applications where bit-rate reduction is desired or required, such as in video recorders, servers, and network video servers and clients.
Digital video compression has made it possible to store, stream and transport large amounts of video content, which was once impractical due to the excessive size of the data files required to convey the necessary information. Digital video compression, especially in the MPEG formats and particularly the MPEG-2 format, is widely used in devices including DVD players, satellite and terrestrial set top boxes, network video servers and receivers, and many more.
A digital video is made up of individual still images or “frames” that, when played in sequence, give the impression of movement. Although each digital video compression format has its own particular characteristics, a number of common features are also present. One such common feature is the use of intra frames, which are coded independently from other frames. In MPEG terminology, such frames are referred to as I frames. An I frame may be thought of as a key frame or reference video frame which acts as a point of comparison to other frames during encoding, decoding and play-back.
Another common feature is the use of inter frames which are coded with reference to other frames. They can only be decoded after their reference frames are decoded. Inter frames are of two types, commonly referred to in MPEG terminology as P frames and B frames. P frames may also be referred to as reference inter frames whereas B frames may be referred to as non-reference inter frames.
As mentioned above, transrating is a subset of a broader type of video stream processing referred to as transcoding. When used herein, transcoding is used to mean the changing of any characteristics of a digitally compressed video stream to produce a new valid digitally compressed video stream. Transrating is a transcoding process which aims only at bit-rate change, usually reduction, and it is an essential component in band-limited network environments.
Transrators with dynamic bit-rate adaptation mechanisms are particularly important when variable bit-rate (VBR) encoded video is to be streamed over a constant bit-rate (CBR) channel. If the bit-rate of the VBR video fluctuates continuously, fast adaptation is required to produce a suitable CBR output signal. Therefore, the ratio of instantaneous output bit-rate to instantaneous input bit-rate must dynamically change to produce a nearly CBR video output whose bit-rate is always below that of the transmission channel.
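By way of illustration only, the dynamically changing ratio described above may be sketched as follows. The function name `target_ratio`, the per-window bit-rate measurements, and the 5% safety margin are assumptions made for this sketch and are not taken from the described embodiments.

```python
def target_ratio(input_bps: float, channel_bps: float, margin: float = 0.95) -> float:
    """Return the output/input bit-rate ratio for one measurement window.

    `margin` (hypothetical) keeps the output strictly below the channel rate.
    """
    if input_bps <= 0:
        return 1.0  # nothing to reduce in an empty window
    budget = channel_bps * margin
    # Never scale up: a window already under budget passes through unchanged.
    return min(1.0, budget / input_bps)


# Hypothetical VBR input measured over four consecutive windows (bits/s).
vbr_input = [3_000_000, 6_000_000, 4_500_000, 8_000_000]
channel = 4_000_000  # hypothetical CBR channel capacity (bits/s)

ratios = [target_ratio(rate, channel) for rate in vbr_input]
output = [rate * ratio for rate, ratio in zip(vbr_input, ratios)]
```

In this sketch the ratio is 1.0 whenever the input already fits the channel and falls as the input bit-rate rises, so that the output stays below the channel rate in every window.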
When transrating, there are four main issues to consider: the complexity of the operation, the quality required of the output signal, the bit-rate required of the output signal, and the adaptation speed. A method and apparatus are sought which provide the highest quality output video using the lowest system complexity and the lowest bit-rate, with the fastest adaptation speed possible. Different transrating techniques have been developed and implemented, and the architecture and performance of known systems differ. In some examples, the transrator architectures trade speed for quality and are useful for applications with no time constraint. In other examples, the transrator architectures trade quality for speed and simplicity so as to be useful in real-time applications.
Depending on their purposes and operational platforms, there are a variety of transrator architectures currently used. Examples of architectures and methods of transrating that are known are shown in each of FIGS. 1 to 7. In FIG. 1 a simple transrator is shown comprising a cascade of a decoder and an encoder. Using this architecture, the digital video stream can be decoded into frames and encoded again using different encoding parameters. The decoder and encoder parts are decoupled and this is therefore an extremely flexible transrator architecture. However the cost of this flexibility is high computational complexity, relatively low speed and high latency.
Furthermore, in spite of these high costs, the architecture does not guarantee the best output due to the fact that two inherently lossy processes (decoding and encoding) are cascaded. Thus, this architecture is impractical for most purposes.
FIG. 2 shows an example in which only B frames are transcoded. I and P frames are routed directly through the system. The transcoding system used in the apparatus of FIG. 2 is shown in more detail in FIG. 3. As can be seen, the system is complex as it requires the sequence of variable length decoding, de-quantisation, inverse transformation, quantisation, forward transformation and variable length encoding. Although such an arrangement may be effective, transrating only the B frames and leaving I and P frames untouched will not, for most digital videos, produce a satisfactory reduction in bit-rate where this is required. Furthermore, the complex sequence of steps performed on the B frames means that the process is slow and the lossy inverse quantisation can lead to significant degradation in quality of the output signal.
FIGS. 4 and 5 show an apparatus and method flow diagram as described in U.S. Pat. No. 6,763,070. The system described therein relates to a transrating scheme in which a cut-off index is determined and transform coefficients beyond this cut-off index are eliminated. The cut-off index is determined by the rate control information derived from the input bit-rate, the required output bit-rate and previously processed macro blocks.
Lastly, with reference to the prior art, U.S. Pat. No. 6,937,770 discloses a system and apparatus for adapted bit-rate control for rate reduction of MPEG coded video. The system utilises a scale factor between the average frame size (number of bits) of the input stream and the desired frame size of the output stream. This scale factor is used to compute the number of bits used for each macro block of the desired rate output stream. The scale factor may be dynamically changed to produce a desired rate output. Despite its low complexity and fast adaptivity, the scheme leads to problems such as distortion, heavy blocking and drift artefacts in high motion areas of a frame.
According to a first aspect of the present invention, there is provided a method of transrating a video signal made up of an input bit stream representative of frames of a video, each frame being made up of blocks of pixels, there being a corresponding block of data within the input bit stream for each block of pixels, the method comprising: for the bit stream of a frame of the video signal, identifying the type of frame; and for certain types of frame, disregarding a configurable proportion of the data in respect of plural blocks within the frame, thereby taking into account local motion activity within the frame.
Preferably, the proportion of data is disregarded in respect of all blocks within the frame. Preferably the proportion is the same in respect of all blocks.
Preferably, the input bit stream is representative of transform coefficients of frames of the video signal, e.g. a pre-compressed video signal in accordance with some format such as one of the MPEG formats, and wherein the disregarded data is a proportion of the non-zero transform coefficients for all blocks within a frame.
The local motion activity within a block of an image is dependent on the number of non-zero transform coefficients coded for that block. Therefore, by taking into account the number of coefficients on a per block basis, local motion activity is considered and accounted for in the transrating operation. This contrasts with known transrating operations in which the only factors taken into account when determining the size or number of bits that can be allocated to each macro-block in a frame of a transrated video signal are the input bit-rate and the desired output bit-rate. The use of some proportion of the coefficients (preferably substantially the same for all blocks within any frame), as opposed to a fixed number, ensures that motion activity within a block is accounted for and that blocks in which there is motion do not suffer significant amounts of visual degradation.
Furthermore, the method allows for the transrating of reference inter or “P” frames. Without the use of the present method, transrating of P frames would increase the drift effect and blockiness, especially in high motion scenes. In the present method, this may be avoided by leaving the I frames within the input bit stream intact and copying them directly to the output bit stream, and so motion vectors and other such parameters are copied directly to the output bit stream. Any loss of data in an I frame will propagate directly to the related P and B inter frames. Using the original I frames within the output bit stream will reduce the drift effect significantly.
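The difference between keeping a proportion of the coefficients and keeping a fixed number may be illustrated with the following sketch. The helper names, the example coefficient counts, the keep fraction of 0.5, the fixed limit of 9, and the rounding up of the kept count are all assumptions chosen for illustration and are not specified in this description.

```python
import math


def kept_by_proportion(nonzero_count: int, keep_fraction: float) -> int:
    """Coefficients kept when the same fraction is kept in every block.

    Rounding up is an assumption of this sketch.
    """
    return math.ceil(nonzero_count * keep_fraction)


def kept_by_fixed_count(nonzero_count: int, max_coeffs: int) -> int:
    """Coefficients kept when every block is capped at the same fixed number."""
    return min(nonzero_count, max_coeffs)


# Hypothetical non-zero coefficient counts: low, medium and high motion blocks.
counts = [4, 10, 40]

proportional = [kept_by_proportion(n, 0.5) for n in counts]  # [2, 5, 20]
fixed = [kept_by_fixed_count(n, 9) for n in counts]          # [4, 9, 9]

# Fraction of each block's coefficients that is discarded under each scheme.
loss_proportional = [1 - k / n for k, n in zip(proportional, counts)]
loss_fixed = [1 - k / n for k, n in zip(fixed, counts)]
```

With these example values, the proportional scheme discards half of the coefficients in every block, whereas the fixed limit leaves the low motion block untouched and discards more than three quarters of the coefficients of the high motion block.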
In the present case, since local motion activity is taken into account in the process of transrating, by disregarding a configurable proportion of the data in respect of all blocks within the frame, an even distribution of visual degradation is made so that distortion is less visible in the transrated video stream and heavy blocking and drift artefacts are avoided in high motion areas of the picture frame. A block is a subregion of pixels within a frame. In the example of MPEG-2 compression, a block is typically an 8×8 group of pixels.
Accordingly, in a particular embodiment, the invention provides a method of transrating a video signal made up of a bit stream corresponding to a series of non-zero transform coefficients representative of frames of a video, each frame being made up of blocks of pixels, there being a corresponding block of transform coefficients for each block of pixels, the method comprising: for the bit stream of a frame of the video signal, identifying the type of frame; and in dependence on the type of frame, performing a transrating operation on the frame, wherein for certain types of frame, a configurable proportion of the transform coefficients is disregarded in respect of all blocks within the frame, thereby generating a transrated output bit stream.
Preferably, the method comprises identifying whether the frame is an I frame, a P frame or a B frame and, if it is identified as an I frame, performing no transrating operation on the frame.
Preferably, certain transform coefficients are removed from the frame by the insertion of an End of Block (EOB) code at a defined point within each block of the frame within the output bit stream.
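The insertion of an End of Block code may be sketched as follows, purely as an illustration. The sketch works on a symbolic list of (run, level) pairs rather than on real variable length codes, and the names `EOB` and `truncate_block`, the rounding up, and the rule of keeping at least one coefficient are assumptions of the sketch rather than features taken from this description.

```python
import math
from typing import List, Tuple, Union

EOB = "EOB"  # symbolic stand-in for the End of Block code
Symbol = Union[Tuple[int, int], str]  # (run, level) pair or EOB


def truncate_block(symbols: List[Symbol], keep_fraction: float) -> List[Symbol]:
    """Keep the leading fraction of a block's coefficients, then place EOB.

    `symbols` lists the block's (run, level) pairs in scan order, ending in EOB.
    """
    coeffs = [s for s in symbols if s != EOB]
    if not coeffs:
        return [EOB]
    # Assumption: round up and keep at least one coefficient per coded block.
    keep = max(1, math.ceil(len(coeffs) * keep_fraction))
    # Everything after the new EOB position is simply not copied to the output.
    return coeffs[:keep] + [EOB]


block = [(0, 12), (1, -3), (0, 2), (4, 1), EOB]
shortened = truncate_block(block, 0.5)
# shortened == [(0, 12), (1, -3), "EOB"]
```

Because the block is shortened only by moving the End of Block position earlier, the retained coefficients are copied as they are, and no de-quantisation or re-quantisation step appears in this sketch.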
Preferably, an output bit stream is generated comprising all the non-zero coefficients from blocks within I frames and only the maintained coefficients of blocks from the P and B frames.
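The frame-type dependent handling set out in the preceding paragraphs may be sketched as follows. The `Frame` data class, the function names, the representation of a block as a list of its non-zero coefficients in scan order, and the rounding up of the kept count are assumptions made for illustration only.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    kind: str                # "I", "P" or "B"
    blocks: List[List[int]]  # non-zero coefficients of each block, in scan order


def transrate_frame(frame: Frame, keep_fraction: float) -> Frame:
    """Copy I frames unchanged; keep the same fraction of every P/B block."""
    if frame.kind == "I":
        return frame  # no transrating operation is performed on I frames
    if frame.kind not in ("P", "B"):
        raise ValueError(f"unknown frame type: {frame.kind!r}")
    # Assumption: the kept count is rounded up for each block.
    kept = [block[: math.ceil(len(block) * keep_fraction)] for block in frame.blocks]
    return Frame(frame.kind, kept)


def transrate_stream(frames: List[Frame], keep_fraction: float) -> List[Frame]:
    """Build the output sequence frame by frame."""
    return [transrate_frame(f, keep_fraction) for f in frames]


stream = [
    Frame("I", [[50, -7, 3, 1]]),
    Frame("P", [[9, -4, 2, 1], [6, 1]]),
    Frame("B", [[3, -1]]),
]
out = transrate_stream(stream, 0.5)
# I frame unchanged; P blocks become [9, -4] and [6]; B block becomes [3].
```

Here the keep fraction is passed in as a single parameter, so it could be varied from frame to frame when a different output bit-rate is desired.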
Preferably, the input bit stream is an encoded MPEG-2 video signal.
Preferably, the input bit stream is in the form of a discrete cosine transform of an original image file.
According to a second aspect of the present invention, there is provided apparatus for transrating a video signal made up of a bit stream corresponding to a series of transform coefficients for frames of the video, each frame being made up of blocks of pixels, the apparatus comprising a receiver for receiving the encoded video signal in the form of a digital bit stream; a reader arranged upon receipt of a frame to identify the type of frame; and a controller for varying the operation performed on the frame in dependence on the type of frame, wherein for certain types of frame, a configurable proportion of the transform coefficients are disregarded in respect of all blocks within the frame, to thereby generate an output bit stream.
According to another aspect of the present invention, there is provided a method of transcoding a video signal made up of an input bit stream representative of frames of a video, each frame being made up of blocks of pixels, there being a corresponding block of data within the input bit stream for each block of pixels, the method comprising: for the bit stream of a frame of the video signal, identifying the type of frame; and for certain types of frame, disregarding, in respect of plural blocks or each block within the frame, a substantially equal proportion of the data irrespective of the actual amount of data required to represent the block, thereby taking into account local motion activity within the frame.