One of the recent targets in telecommunications has been to provide systems in which good-quality, real-time transmission of video, audio and data is available. Video transmission consists of a continuous stream of data carrying moving pictures. As is generally known, the amount of data needed to transfer pictures is high compared to many other types of media, and so far the use of video in low bit rate terminals has been negligible. Transmission of data in digital form, however, has provided for increased signal-to-noise ratios and increased information capacity along the transmission channel. In the near future, advanced digital mobile telecommunication systems will also introduce services that increase transmission bit rates, which means that transmission of video even over low bit rate mobile channels will soon become more feasible.
In circuit-switched multimedia transmission, bit streams from the sender's different media sources (e.g. video, audio, data and control) are multiplexed into a single bit stream, and at the receiving end the bit stream is demultiplexed back into the individual media streams to be decoded appropriately. Since the bit streams from and to different sources are not equal in size, the multiplexing usually also comprises logical framing. This means that the multiplex signal to be transmitted is structured according to a chosen control protocol, and framing data blocks (e.g. bits, flags, etc.) are inserted to identify different data blocks.
The basic principle of the multiplexing scheme is illustrated with the block diagram of FIG. 1. It is to be noted that the figure merely illustrates the basic concepts and implies nothing about the actual sizes, numbers or order of the transmitted packets. In this example, data packets from two different media sources are first multiplexed for transmission and, after transmission, demultiplexed for forwarding to different decoders. Data packets A1, A2, A3, . . . from the audio encoder and data packets V1, V2, . . . from the video encoder are combined in the multiplexer MUX into consecutive packet data units PDU1 (step 1) and PDU2 (step 2). Since video packets V1 and V2 are large, they are broken into segments, e.g. V1 -> V1.1/V1.2, for transmission. The multiplexer adds a framing data block F to each of the PDUs to indicate the boundaries and the structure of the contents of the PDUs. In the demultiplexer DMUX, data packets A1, A2, A3, . . . and V1, V2, . . . are separated from the PDUs according to the information given in the framing data block F, and forwarded as data signals dA1, dV1, . . . to the relevant decoders. Segmented video data units V1.1/V1.2 are first combined into single video data packets (e.g. V1) and then forwarded to the video decoder.
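The scheme of FIG. 1 can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not the framing of any particular protocol: the framing data block F is represented as a list of hypothetical `(source, length, is_last)` entries, each PDU carries at most one audio packet and one video segment, and all packets are assumed to be non-empty. The names `segment`, `multiplex` and `demultiplex` are introduced here for illustration.

```python
def segment(packet: bytes, size: int) -> list[bytes]:
    """Break a packet into segments of at most `size` bytes (e.g. V1 -> V1.1/V1.2)."""
    return [packet[i:i + size] for i in range(0, len(packet), size)]


def multiplex(audio: list[bytes], video: list[bytes], seg_size: int):
    """Combine audio packets and video segments into consecutive PDUs.

    Each PDU is a (framing, payload) pair. The framing block is a
    hypothetical stand-in for F: one (source, length, is_last) entry
    per data block in the payload.
    """
    video_segments = []
    for packet in video:
        parts = segment(packet, seg_size)
        for k, part in enumerate(parts):
            video_segments.append((part, k == len(parts) - 1))

    pdus = []
    for i in range(max(len(audio), len(video_segments))):
        framing, payload = [], b""
        if i < len(audio):
            framing.append(("A", len(audio[i]), True))
            payload += audio[i]
        if i < len(video_segments):
            part, is_last = video_segments[i]
            framing.append(("V", len(part), is_last))
            payload += part
        pdus.append((framing, payload))
    return pdus


def demultiplex(pdus):
    """Separate the data blocks using the framing entries and rejoin segments."""
    complete = {"A": [], "V": []}
    partial = {"A": b"", "V": b""}
    for framing, payload in pdus:
        offset = 0
        for source, length, is_last in framing:
            partial[source] += payload[offset:offset + length]
            offset += length
            if is_last:  # final segment: the packet can go to its decoder
                complete[source].append(partial[source])
                partial[source] = b""
    return complete["A"], complete["V"]


audio_in = [b"a1", b"a2", b"a3"]
video_in = [b"VIDEO-ONE", b"V2"]
audio_out, video_out = demultiplex(multiplex(audio_in, video_in, seg_size=5))
```

In this sketch the demultiplexer relies entirely on the framing entries, so a corrupted framing block would misplace every data block that follows it within the PDU.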
To optimize the use of channel capacity, signals are generally compressed before transmission. This is especially important with video transmission, where the amount of data to be transmitted is large. Compressed video, however, is easily afflicted by transmission errors, mainly for two reasons. Firstly, compressed video coding is based on predictive differential coding, in which a sampling system is used and the value of the signal at each sample time is predicted to be a particular linear function of the past values of the quantized signal. This causes propagation of errors, both spatially and temporally, which means that once an error occurs, it remains easily visible to the human eye for a relatively long time. Transmissions at low bit rates are especially susceptible, since they contain only a few intra-coded frames, which would stop the temporal propagation. Secondly, information symbols in compressed video are coded mainly using variable length codes, which also increases the susceptibility to errors. When a bit error changes a codeword into another one of different length, the decoder loses synchronization and also decodes consecutive error-free blocks incorrectly until the next synchronization code.
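The effect of a single bit error on variable length codes can be demonstrated with a small sketch. The code table below is a hypothetical prefix code chosen for illustration and is not taken from any video coding standard; the helper names `encode`, `decode` and `flip_bit` are likewise assumptions.

```python
# Hypothetical prefix-free variable length code (illustration only).
CODE = {"A": "0", "B": "10", "C": "110", "D": "111"}
DECODE = {bits: symbol for symbol, bits in CODE.items()}


def encode(symbols: str) -> str:
    """Concatenate the codewords of the given symbols into a bit string."""
    return "".join(CODE[s] for s in symbols)


def decode(bits: str) -> str:
    """Read bits until they match a codeword, emit the symbol, and repeat."""
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in DECODE:
            symbols.append(DECODE[current])
            current = ""
    return "".join(symbols)  # trailing incomplete bits are ignored


def flip_bit(bits: str, index: int) -> str:
    """Simulate a transmission error by inverting one bit."""
    flipped = "1" if bits[index] == "0" else "0"
    return bits[:index] + flipped + bits[index + 1:]


sent = encode("ABACAD")            # "01001100111"
received = flip_bit(sent, 1)       # one bit error inside the codeword of B
print(decode(sent))                # ABACAD
print(decode(received))            # AAAACAD
```

Here one corrupted bit turns the two-bit codeword of B into two one-bit codewords, so the decoder outputs seven symbols instead of six and every later symbol is shifted by one position. In a video bit stream, where symbol position determines which block or coefficient a value belongs to, such a shift corrupts the blocks that follow even though their bits arrived intact.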
To limit the image degradation introduced by transmission errors, error detection and/or error correction methods can be applied, retransmissions can be used, and/or the effects of the received corrupted data can be concealed. Normally retransmission provides a reasonable way to protect data streams from errors, but the long round-trip delays associated with low bit rate transmission, together with moderate or high error rates, make it practically impossible to use retransmission, especially with real-time videophone applications. Error detection and correction methods usually require a large overhead, since they add redundancy to the data. Consequently, for low bit rate applications, error concealment can be considered the preferred way to protect and recover images from transmission errors.
To be able to conceal transmission errors, they have to be detected and localized. The more that is known about the type and the location of the error, the better the concealment method can be targeted at the problem, and accordingly the better the image quality that will be achieved. The video reception process provides different methods of error detection, associated with different protocol layers of video transmission, as illustrated in FIG. 2. The channel coding layer 20 provides means for detecting, as well as correcting, errors in received bit streams. The transmission protocol layer 22 usually comprises a CRC (Cyclic Redundancy Check) which is run for received video signals, on the basis of which incorrect signals can be rejected. In the video decoding layer 24, errors are usually detected as illegal variable-length codes or incorrectly positioned synchronization codes. Some errors can be detected and corrected even from the decoded images in the picture layer 26. The error concealment method can utilize error data from any or each of these layers. In this application, however, error detection in the demultiplexing phase is examined in more detail.
For receiving video data, the received synchronous bit stream is forwarded to a demultiplex protocol unit for demultiplexing, logical framing, sequence numbering, error detection and error correction by means of retransmission, as appropriate to each media type. The demultiplexed bit streams are forwarded to appropriate decoders, which carry out redundancy reduction coding and decoding for said demultiplexed bit streams.
The multiplexing protocol for low bit rate multimedia communication over highly error-prone channels is described in ITU-T Recommendation H.223. The multiplex consists of a multiplex layer and an adaptation layer. The multiplex layer mixes the various logical channels into a single bit stream. It transfers logical channel information in packets, delimited by a flag. A flag can be an HDLC (High-Level Data Link Control) flag, with which HDLC zero-bit insertion for transparency is also used. It is also possible to use PN framing, where the flag is a 16-bit pattern as described in annexes A, B, and C of H.223. Each data packet contains a one-octet header followed by a variable number of information field octets. The header octet includes a multiplex code, which specifies, by reference to a multiplex table, the mapping of the information field octets to various logical channels. Each data packet may contain a different multiplex code, and therefore a different mix of logical channels. The multiplex layer does not perform error control, except for a CRC (Cyclic Redundancy Check) on the header octet.
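The role of the multiplex code can be illustrated with a simplified sketch. The header layout (multiplex code in the upper four bits of the octet), the table contents and the flat repeating pattern of `(channel, octet count)` entries are all assumptions made for this example; they do not reproduce the actual bit layout or multiplex table syntax of H.223, and the header CRC is omitted.

```python
# Hypothetical multiplex table: code -> repeating pattern of (channel, octets).
MULTIPLEX_TABLE = {
    0: [("control", 1)],
    1: [("audio", 2), ("video", 3)],
}


def split_information_field(header: int, info: bytes, table=MULTIPLEX_TABLE):
    """Map information field octets to logical channels via the multiplex code.

    Assumption for this sketch: the multiplex code occupies the upper four
    bits of the header octet, and every count in the table is positive.
    """
    code = header >> 4
    pattern = table[code]
    channels: dict[str, bytearray] = {}
    pos = 0
    while pos < len(info):
        for channel, count in pattern:
            chunk = info[pos:pos + count]
            if not chunk:  # information field exhausted mid-pattern
                break
            channels.setdefault(channel, bytearray()).extend(chunk)
            pos += count
    return {name: bytes(data) for name, data in channels.items()}


# Multiplex code 1: two audio octets, then three video octets, repeated.
result = split_information_field(0x10, b"aavvvaavv")
print(result)  # {'audio': b'aaaa', 'video': b'vvvvv'}
```

Because the mapping depends only on the multiplex code, an undetected error in the header octet would assign every information field octet of the packet to the wrong logical channel, which is why the header is the one part of the packet the multiplex layer protects.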
The adaptation layer handles error control and sequence numbering, as appropriate to each information stream. Recommendation H.223 defines three adaptation layers, AL1, AL2, and AL3, of which AL3 is intended primarily for digital video. AL3 includes a 16-bit CRC for error detection, by which transmission errors can be localized to a single AL3 layer packet. In the specification of the adaptation layers, it is also mentioned that such error indications could be passed from a video demultiplexer to a video decoder, but actual procedures for implementing such demultiplexer indications are not presented.
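A per-packet error indication of this kind can be sketched as follows. The example uses the CRC-CCITT routine `binascii.crc_hqx` from the Python standard library with an assumed initial value of 0xFFFF and an assumed big-endian placement of the CRC at the end of the packet; the exact CRC parameters and field layout of AL3 are not reproduced here, and the function names are hypothetical.

```python
import binascii


def add_crc(payload: bytes) -> bytes:
    """Append a 16-bit CRC to the payload (sender side)."""
    crc = binascii.crc_hqx(payload, 0xFFFF)
    return payload + crc.to_bytes(2, "big")


def check_crc(packet: bytes) -> tuple[bytes, bool]:
    """Return the payload and an error-free flag (receiver side).

    The flag is the only error information available at this level:
    it says that the packet is corrupted, not where.
    """
    payload = packet[:-2]
    received_crc = int.from_bytes(packet[-2:], "big")
    return payload, binascii.crc_hqx(payload, 0xFFFF) == received_crc


packet = add_crc(b"one video packet")
payload, ok = check_crc(packet)          # ok is True

# Simulate a single bit error in the first payload octet.
corrupted = bytes([packet[0] ^ 0x01]) + packet[1:]
payload_bad, ok_bad = check_crc(corrupted)   # ok_bad is False
```

The boolean returned by `check_crc` corresponds to the kind of indication that could be passed from the demultiplexer to the video decoder alongside the payload.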
An indication of a possible error in the received packet is useful in many cases, especially if retransmission is possible. In low bit rate video transmission, however, the amount of information contained in one video packet has to be large in order to limit the number of bits used for framing and redundancy. This means that the information about possible errors in the packet is not, in itself, very useful: in many cases, rejecting the whole video packet loses too much information, which may lead to inadequate picture quality.