Digital video offers a great many advantages over traditional analogue systems, supporting services such as video telephony and multimedia applications. However, a key problem of digital video when compared to analogue systems is the demand it places on communications and storage resources. For example, a bit rate of approximately 160 Mbps is required in order to transmit broadcast quality video, which compares with a bandwidth of approximately 5 MHz for analogue video of comparable quality. Thus, for digital video to be practical, the quantity of data in the digital signal must be reduced.
Data reduction is achieved by using compression techniques to remove redundant data while still retaining sufficient information to allow the original image to be reproduced with an acceptable quality. There are two types of redundancy in video signals: spatial and temporal. For the coding of images, techniques which exploit spatial redundancy only are termed intra-frame (i.e. they treat each frame separately), while those which exploit temporal redundancy are termed inter-frame (i.e. they exploit similarities between frames). The latter invariably also exploit spatial redundancy.
Several coding techniques have been developed for redundancy removal. These include run length coding, conditional replenishment, transform coding, Huffman coding and differential pulse code modulation (DPCM). Many of these are utilised in key standards such as JPEG, MPEG-1 and MPEG-2, and H.261/H.263. JPEG defines the form of compressed data streams for still images; MPEG-1/MPEG-2 are for compression of moving pictures; H.261/H.263 have primarily been defined for video telephony applications employing low bit rate communications links (of the order of tens of kbit/s).
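By way of illustration only, run length coding, the first of the techniques listed above, can be sketched as follows. This is a minimal generic example and not the entropy coding defined by any of the standards mentioned; the function names are hypothetical.

```python
from typing import List, Tuple


def run_length_encode(values: List[int]) -> List[Tuple[int, int]]:
    """Collapse consecutive repeated values into (value, run_length) pairs."""
    runs: List[Tuple[int, int]] = []
    for v in values:
        if runs and runs[-1][0] == v:
            # Same value as the current run: extend it by one.
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            # Value changed: start a new run.
            runs.append((v, 1))
    return runs


def run_length_decode(runs: List[Tuple[int, int]]) -> List[int]:
    """Expand (value, run_length) pairs back into the original sequence."""
    out: List[int] = []
    for value, count in runs:
        out.extend([value] * count)
    return out
```

A sequence containing long runs of identical values, such as the zero-valued coefficients that commonly follow transform coding, is represented by far fewer symbols, and decoding restores the sequence exactly.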
Video compression and expansion systems are often referred to as ‘video codecs’, implying the ability to encode and/or decode images. Current video telephony systems have primarily been designed for use in PSTN or packet networks, and are governed by ITU-T recommendations H.324, which covers low bit rate multimedia communication, and H.323, which covers video conferencing over traditional shared media local area networks. The video coding parameters of the algorithm controlling encoding in the video codec are normally selected on the basis of the relatively error free transmission channels these systems can provide. However, the video coding algorithms of video codecs are flexible in that they allow selection of the coding parameters. This is particularly beneficial for transmission on channels which are prone to error. In such conditions the coding parameters can be modified so as to attempt to minimise the effect of transmission errors on the picture quality. Where errors have occurred in transmission, it has been found that the decoded video normally exhibits additional blockiness, annoying green and pink squares, temporal jerkiness and sometimes chequered patterns.
In existing systems, two parameters which are typically adjusted in encoding are the amount of intra-refresh information and the frequency of start codes. In PSTN networks, the video codec starts the coding with a full intra-frame. Intra-frame pictures are coded without reference to other pictures, which means that they contain all the information necessary for their reconstruction by the decoder, and for this reason they are an essential entry point for access to a video sequence. Because the resolution of intra-frames is high, the compression rate is relatively low, and therefore a full intra-frame places huge demands on the number of data bits required to define the picture. As a result, the transmission of a full intra-frame on small bandwidth lines, even when small buffers are used to minimise delays, takes a long time, to the extent that the decoder must freeze the previous picture on the screen for a while, in effect to allow the following picture to catch up. Thus, as an alternative approach, in succeeding frames intra-frame information is updated (or refreshed) on sequential portions of the picture frames rather than on the whole picture frame, typically on a block-by-block basis of 16×16 pixels; hence the picture is said to be intra-refreshed. If the rate at which the blocks are refreshed is slow (which it usually is in PSTN), transmission error artifacts on the image can persist for a long time, and will vanish only when the erroneous block is intra-refreshed. In error prone networks, it is therefore necessary to increase the number of intra-refresh macro blocks in each frame, or the rate at which full intra-frames are sent.
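The relationship between the intra-refresh rate and the persistence of an error artifact can be illustrated with a rough calculation. The sketch below assumes a simple cyclic refresh in which a fixed number of 16×16 macro blocks is refreshed in each frame in turn, and it ignores any spreading of the error into neighbouring blocks through inter-frame prediction; the function names and the example figures are hypothetical.

```python
import math
from typing import Tuple


def macroblocks_per_frame(width: int, height: int, mb_size: int = 16) -> int:
    """Number of mb_size x mb_size macro blocks needed to cover the picture."""
    return math.ceil(width / mb_size) * math.ceil(height / mb_size)


def worst_case_error_lifetime(
    width: int, height: int, refreshed_per_frame: int, frame_rate: float
) -> Tuple[int, float]:
    """Frames and seconds needed for one full cyclic refresh of the picture.

    A block corrupted just after its own refresh must wait roughly this
    long before it is intra-refreshed again (simplifying assumption).
    """
    total = macroblocks_per_frame(width, height)
    frames = math.ceil(total / refreshed_per_frame)
    return frames, frames / frame_rate
```

For a 176×144 picture (99 macro blocks) at an assumed 10 frames per second, refreshing 3 macro blocks per frame gives a cycle of 33 frames, or about 3.3 seconds, whereas refreshing 11 per frame shortens the cycle to 9 frames, or about 0.9 seconds, at the cost of more intra-coded data in every frame.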
Another technique used to minimise the impact of transmission errors is to reduce the size of the affected areas. Since the coded bit stream contains variable length coding (VLC) code words, an error in the bit stream in most cases causes the decoder to lose synchronisation with the VLC code words. The decoder can only continue decoding after receiving a fixed length distinct code word called a start code. Typically, start codes are found at the beginning of coded picture frames; however, most video coding standards also allow start codes to be inserted elsewhere in a picture, for instance at the beginning of each row of macro blocks or even more often. Thus, in order to reduce the size of the areas affected by transmission errors, start codes can be introduced in the picture at more frequent locations. The density of these start codes is a compromise between reduced picture quality owing to an increased number of header bits, and the size of the area which is affected by transmission errors. In error prone environments it is advantageous to sacrifice some visual image quality in order to reduce the image area affected by transmission errors.
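The compromise described above can be illustrated with a simplified model in which a start code precedes every n-th macro block, and an error makes everything from the damaged macro block up to the next start code undecodable. The sketch below is illustrative only: the function names are hypothetical, and the 32-bit header size used in the example is an assumed figure rather than one taken from any standard.

```python
import math


def lost_macroblocks(error_mb: int, resync_interval: int, total_mbs: int) -> int:
    """Macro blocks lost when an error hits index error_mb (0-based).

    Assumes a start code precedes every resync_interval-th macro block,
    and that decoding resumes at the next start code or the next picture.
    """
    next_sync = ((error_mb // resync_interval) + 1) * resync_interval
    return min(next_sync, total_mbs) - error_mb


def header_overhead_bits(resync_interval: int, total_mbs: int, header_bits: int) -> int:
    """Bits spent per picture on start codes and their headers."""
    start_codes = math.ceil(total_mbs / resync_interval)
    return start_codes * header_bits
```

For a picture of 99 macro blocks arranged in rows of 11, an error in macro block 5 costs 94 macro blocks if the only start code is at the beginning of the picture, but only 6 if every row begins with a start code. With an assumed 32 header bits per start code, the per-picture overhead rises from 32 bits to 288 bits.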
The overall current approach is to pre-program intra-refresh information and start code parameters into the algorithm controlling the video codec depending on the anticipated level of transmission errors. Since these parameters can be varied in an encoder, if for example there is a high probability of losing a significant amount of information in a transmission, then the intra-refresh information and start codes are sent more often. However, with high C/I (carrier to interference ratio) or C/N (carrier to noise ratio) levels, relatively less intra-refresh or start code information is required, thus allowing for better image quality.
Insertion of additional intra-refresh data and start codes is reasonably effective for mitigating the effects of predictable transmission errors, but these approaches are not without certain shortcomings. Principally, these shortcomings stem from the fact that actual transmission errors are not always predictable, and in situations where there is a wide margin between the predicted transmission error and the actual transmission error, the intra-refresh and start code parameters will not be consistent with the required level for these encoding parameters. For example, on the one hand, if the transmission errors are fewer than anticipated, then the level of intra-refresh or start code information will be in excess of that required, and the excess will thus be redundant. On the other hand, if the transmission errors are much worse than those predicted, then the intra-refresh and start code information will be insufficient, and spread so widely both temporally and spatially in the decoded pictures that the result will be poor image quality. Coding parameters are thus set at an intermediate rate, but of course in this case image quality is compromised and thus not at an optimum.
Against this background, the present invention aims to address the problems arising from transmission errors on video coded signals.