The present invention relates to the communication of digital high definition television (HDTV) data, and more particularly to a data packet format for communicating HDTV data and a receiver for receiving such data.
Television signals are conventionally transmitted in analog form according to various standards adopted by particular countries. For example, the United States has adopted the standards of the National Television System Committee ("NTSC"). Most European countries have adopted either the PAL (Phase Alternating Line) or the SECAM (sequential color with memory) standard.
Digital transmission of television signals can deliver video and audio services of much higher quality than analog techniques. Digital transmission schemes are particularly advantageous for signals that are broadcast by satellite to cable television affiliates and/or directly to home satellite television receivers. It is expected that digital television transmitter and receiver systems will replace existing analog systems just as digital compact discs have largely replaced analog phonograph records in the audio industry.
A substantial amount of digital data must be transmitted in any digital television system. This is particularly true where high definition television is provided. In a digital television system, a subscriber receives the digital data stream via a receiver/descrambler that provides video, audio, and data to the subscriber. In order to most efficiently use the available radio frequency spectrum, it is advantageous to compress the digital television signals to minimize the amount of data that must be transmitted.
The video portion of a television signal comprises a sequence of video "frames" that together provide a moving picture. In digital television systems, each line of a video frame is defined by a sequence of digital data elements referred to as "pixels." A large amount of data is required to define each video frame of a television signal. For example, about 7.4 megabits of data are required to provide one video frame at NTSC resolution. This assumes that a 640 pixel by 480 line display is used, with 8 bits of intensity value for each of the primary colors red, green and blue (640 x 480 x 24 bits). High definition television requires substantially more data to provide each video frame. In order to manage this amount of data, particularly for HDTV applications, the data must be compressed.
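The frame-size figure above can be checked with a short calculation. The following is a minimal sketch; the function name `bits_per_frame` is illustrative only and does not come from the source.

```python
def bits_per_frame(width, height, bits_per_component=8, components=3):
    """Uncompressed size of one video frame, in bits."""
    return width * height * bits_per_component * components

# 640 pixels x 480 lines, 8 bits each for red, green and blue
ntsc_bits = bits_per_frame(640, 480)
print(ntsc_bits)                   # 7372800
print(round(ntsc_bits / 1e6, 1))   # 7.4 (megabits)
```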
Video compression techniques enable the efficient transmission of digital video signals over conventional communication channels. Such techniques use compression algorithms that take advantage of the correlation among adjacent pixels in order to derive a more efficient representation of the important information in a video signal. The most powerful compression systems not only take advantage of spatial correlation, but can also utilize similarities among adjacent frames to further compact the data. In such systems, differential pulse code modulation (DPCM) is used to transmit only the difference between an actual frame and a prediction of the actual frame. The prediction is based on information derived from a previous frame of the same video sequence. Examples of such systems can be found in U.S. Pat. Nos. 5,068,724 entitled "Adaptive Motion Compensation for Digital Television" and 5,057,916 entitled "Method and Apparatus for Refreshing Motion Compensated Sequential Video Images," both incorporated herein by reference.
In the system disclosed in the '724 patent, video signals are processed with motion compensation (DPCM) and without motion compensation (PCM), and a comparison is made to determine which type of processing results in the least amount of data for transmission. U.S. Pat. No. 5,091,782 entitled "Apparatus and Method for Adaptively Compressing Successive Blocks of Digital Video" and incorporated herein by reference, discloses a system wherein video signals are provided in both a frame format and a field format for processing with motion compensation. The resultant signals are compared on a block-by-block basis to determine which format yields the fewest errors.
Motion estimation of a video signal is provided by comparing the current luminance block with the luminance blocks in the previous frame within a specified tracking range. The previous frame luminance block with the minimum total absolute change compared to the current block is chosen. The position of the chosen block is called the motion vector, which is used to obtain the predicted values of the current block. For additional coding efficiency, the motion vectors can be differentially encoded and processed by a variable length encoder for transmission as side information to a decoder. A low pass filter may be provided in the DPCM loop for the purpose of smoothing out the predicted values as necessary. In order to protect the coded bitstream from various kinds of random noise, a forward error correction (FEC) scheme can be used.
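The block-matching search described above can be sketched as follows. This is an illustrative example only, assuming an exhaustive search, a square tracking range, and frames stored as lists of rows; the names `sad` and `find_motion_vector` are hypothetical and are not taken from the referenced patents.

```python
def sad(cur_block, prev_frame, top, left):
    """Total absolute change between cur_block and the same-sized
    block of prev_frame whose upper left corner is at (top, left)."""
    return sum(
        abs(cur_block[r][c] - prev_frame[top + r][left + c])
        for r in range(len(cur_block))
        for c in range(len(cur_block[0]))
    )

def find_motion_vector(cur_block, prev_frame, block_top, block_left, search_range):
    """Return (dy, dx, cost) for the previous-frame block with the
    minimum total absolute change within +/- search_range pixels."""
    n_rows, n_cols = len(cur_block), len(cur_block[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            top, left = block_top + dy, block_left + dx
            # Skip candidates that fall outside the previous frame.
            if (top < 0 or left < 0
                    or top + n_rows > len(prev_frame)
                    or left + n_cols > len(prev_frame[0])):
                continue
            cost = sad(cur_block, prev_frame, top, left)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best[1], best[2], best[0]

# Previous frame: a 6x6 luminance ramp with a distinct value per pixel.
prev_frame = [[r * 6 + c for c in range(6)] for r in range(6)]
# Current 2x2 block at (2, 2) holds what was at (1, 2) in the previous frame.
cur_block = [row[2:4] for row in prev_frame[1:3]]
print(find_motion_vector(cur_block, prev_frame, 2, 2, 2))  # (-1, 0, 0)
```

The returned offset plays the role of the motion vector; the decoder would use it to fetch the predicted values of the current block from its own copy of the previous frame.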
One category of coding schemes for compressing the data rate by removing redundant information is known as "entropy coding." Entropy coding achieves compression by using the statistical properties of the signals and is, in theory, lossless. Another category, which relies on a human visual model (i.e., perception), yields results that can be lossy. Thus, picture quality can be degraded when the latter is used. In implementing such techniques, either intraframe or interframe coding can be used. Intraframe coding is used for the first picture and for later pictures after a change of scene. Interframe coding is used for sequences of pictures containing moving objects.
A coding algorithm that uses such coding techniques has been proposed by the CCITT Specialist Group. See, e.g., "Description of Reference Model 8 (RM8)," Doc. No. 525, CCITT SG XV Working Party XV-4, Specialist Group on Coding for Visual Telephony, June 1989. In the CCITT scheme, a hybrid of transform coding and differential pulse code modulation (DPCM) with motion estimation is used. The DPCM is not operative for intraframe coding. For entropy coding, both one- and two-dimensional variable length codings are used.
The discrete cosine transform (DCT) described by N. Ahmed, T. Natarajan, and K. R. Rao, "Discrete Cosine Transform," IEEE Trans. Computers, Vol. C-23, pp. 90-93, Jan. 1974, is used in the CCITT system to convert the input data, which is divided into macroblocks and sub-blocks, into transform coefficients. The DCT is performed on the difference between blocks of current frame data and corresponding blocks of a predicted frame (which is obtained from the previous frame information). If a video block contains no motion or the predicted value is exact, the input to the DCT will be a null matrix. For slowly moving pictures, the input matrix to the DCT will contain many zeros. The output of the DCT is a matrix of coefficients that represent energy in the two-dimensional frequency domain. In general, most of the energy is concentrated at the upper left corner of the matrix, which is the low frequency region. If the coefficients are scanned in a zigzag manner, the resultant sequence will contain long strings of zeros, especially toward the end of the sequence. One of the major objectives of this compression algorithm is to create zeros and to bunch them together for efficient coding.
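The behavior described above can be illustrated with a small example. This sketch assumes an orthonormal two-dimensional DCT-II on a 4 x 4 block (chosen for brevity) and one common zigzag convention; neither the block size nor the exact scan order is specified in the text above, and the names `dct_2d` and `zigzag_order` are illustrative.

```python
import math

def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block (direct, unoptimized)."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

def zigzag_order(n):
    """(row, col) positions in zigzag order, starting at the upper left."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )

# An exact prediction gives a null difference block, so every coefficient is zero.
null_coeffs = dct_2d([[0] * 4 for _ in range(4)])

# A flat (constant) difference block puts all of its energy in the
# upper left (lowest frequency) coefficient.
flat_coeffs = dct_2d([[10] * 4 for _ in range(4)])

# Reading coefficients in zigzag order places the low frequencies first,
# so the trailing entries of the scanned sequence tend to be zeros.
scanned = [flat_coeffs[r][c] for r, c in zigzag_order(4)]
```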
To maintain efficiency, a variable threshold is also applied to the coefficient sequence before quantization. This is accomplished by increasing the DCT threshold when a string of zeros is detected. A DCT coefficient is set to zero if it is less than or equal to the threshold.
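One possible reading of this variable threshold is sketched below. The details are assumptions made for illustration: the threshold rises by a fixed increment each time a coefficient is zeroed, is capped at 1.5 times its base value, resets to the base value when a coefficient survives, and is compared against the coefficient magnitude. The text above does not specify these parameters.

```python
def variable_threshold(coeffs, base, increment=1, max_threshold=None):
    """Zero out small coefficients, raising the threshold during runs of zeros.

    Assumed rule (not specified in the source): the threshold grows by
    `increment` after each zeroed coefficient, up to `max_threshold`,
    and resets to `base` after any coefficient that is kept.
    """
    if max_threshold is None:
        max_threshold = 1.5 * base  # assumed cap
    threshold = base
    out = []
    for c in coeffs:
        if abs(c) <= threshold:
            out.append(0)
            threshold = min(threshold + increment, max_threshold)
        else:
            out.append(c)
            threshold = base
    return out

print(variable_threshold([50, 7, 9, 10, 12, 30, 9], base=8))
# [50, 0, 0, 0, 12, 30, 9]
```

With a fixed threshold of 8, only the coefficient 7 would be zeroed in this example; the rising threshold extends the run of zeros to three, which is the effect the paragraph above describes.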
A uniform quantizer is used after the transform. The step size of the quantizer can be adjusted according to the transmission rate, as indicated by the occupancy of a buffer. When the transmission rate reaches its limit, the step size will be increased so that less information needs to be coded. When this occurs, a degraded picture will result. On the other hand, picture quality will be improved by decreasing the step size when the transmission rate is below its limit.
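The relationship between buffer occupancy and step size can be sketched as follows. The linear mapping, the step-size limits, and the truncating quantizer are assumptions chosen for illustration; the text above states only that the step size increases as the buffer fills.

```python
def step_size_for(occupancy, min_step=2, max_step=32):
    """Map buffer occupancy (0.0 = empty, 1.0 = full) to a quantizer
    step size using a hypothetical linear rule."""
    occupancy = min(max(occupancy, 0.0), 1.0)
    return min_step + round(occupancy * (max_step - min_step))

def quantize(coeffs, step):
    """Uniform quantizer: divide by the step size and truncate toward zero."""
    return [int(c / step) for c in coeffs]

def dequantize(levels, step):
    """Receiver-side reconstruction of the coefficient values."""
    return [q * step for q in levels]

coeffs = [100, -37, 12, 3]

fine = quantize(coeffs, step_size_for(0.0))    # buffer empty, step size 2
coarse = quantize(coeffs, step_size_for(1.0))  # buffer full, step size 32

print(fine)    # [50, -18, 6, 1]
print(coarse)  # [3, -1, 0, 0]
print(dequantize(coarse, 32))  # [96, -32, 0, 0]
```

The coarse step produces more zeros and smaller levels, so fewer bits are needed, at the cost of a larger reconstruction error.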
To further increase coding efficiency, a two-dimensional variable length coding scheme is used for the sequences of quantized DCT coefficients. In a given sequence, the value of a non-zero coefficient (amplitude) is defined as one dimension and the number of zeros preceding the non-zero coefficient (runlength) is defined as another dimension. The combination of amplitude and runlength is defined as an "event."
A shorter length code is assigned to an event which occurs more frequently. Conversely, infrequent events receive longer length codes. An EOB (end of block) marker is provided to indicate that there are no more non-zero coefficients in the sequence.
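The conversion of a scanned coefficient sequence into (runlength, amplitude) events terminated by an EOB marker can be sketched as shown below. The function names and the string used for the marker are illustrative; the assignment of variable length codes to the events is not shown.

```python
EOB = "EOB"  # end-of-block marker (representation chosen for illustration)

def to_events(sequence):
    """Convert quantized coefficients into (runlength, amplitude) events.

    Trailing zeros are not coded; the EOB marker stands in for them.
    """
    events = []
    run = 0
    for value in sequence:
        if value == 0:
            run += 1
        else:
            events.append((run, value))
            run = 0
    events.append(EOB)
    return events

def from_events(events, length):
    """Rebuild a coefficient sequence of the given length from its events."""
    out = []
    for event in events:
        if event == EOB:
            break
        run, amplitude = event
        out.extend([0] * run)
        out.append(amplitude)
    out.extend([0] * (length - len(out)))  # zeros implied by EOB
    return out

sequence = [12, 0, 0, -3, 1, 0, 0, 0]
events = to_events(sequence)
print(events)  # [(0, 12), (2, -3), (0, 1), 'EOB']
print(from_events(events, len(sequence)) == sequence)  # True
```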
The coded coefficient values are multiplexed together with various side information such as block classification, quantization information, and differential motion vectors. Some of the side information may also be variable length coded. The resultant bitstream is sent to a buffer for transmission.
At a receiver, a variable length decoder is necessary to perform the inverse operation of the encoder and recover the transform coefficients. In order to increase the efficiency of the receiver, it is desirable to transmit the variable length coded data in a format that provides useful information to the receiver. For example, it would be advantageous to provide a data format that includes various data fields that enable the receiver to avoid unnecessary processing. It would also be advantageous to indicate to the receiver whether a data packet currently being received has been encoded as DPCM data or PCM data. If the data has been encoded as PCM data (i.e., without motion compensation), then the receiver could be instructed to skip over any processing relating to motion compensation. Similarly, where a received packet does not contain video data, the receiver could be instructed to bypass the decoding routines that relate to the processing of video data. It would be still further advantageous for the transmitted data format to provide a field that indicates the length of a received data packet. Such information would be useful to prevent a loss of synchronization in the event of a transmission error. Other information useful to the receiver can also be advantageously provided in transmitted data packets.
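The way a receiver might use such fields can be sketched as follows. Every field name, step name, and the convention that the length field counts the whole packet are hypothetical and serve only to illustrate the idea; they do not describe the packet format of the invention.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    # All fields are hypothetical stand-ins for the kinds of
    # information discussed above.
    length: int       # packet length in bytes (assumed to count the whole packet)
    has_video: bool   # whether the packet carries video data
    is_dpcm: bool     # True for DPCM (motion compensated), False for PCM
    payload: bytes

def processing_steps(packet):
    """List the (hypothetical) receiver stages needed for a packet."""
    if not packet.has_video:
        # Bypass all video decoding routines.
        return ["handle_non_video"]
    steps = ["variable_length_decode", "inverse_transform"]
    if packet.is_dpcm:
        # Only DPCM data needs motion compensation.
        steps.append("motion_compensate")
    return steps

def next_packet_offset(offset, packet):
    """Locate the next packet from the length field, so that a corrupted
    payload need not cause a loss of synchronization."""
    return offset + packet.length

pcm = Packet(length=64, has_video=True, is_dpcm=False, payload=b"")
dpcm = Packet(length=64, has_video=True, is_dpcm=True, payload=b"")
data = Packet(length=32, has_video=False, is_dpcm=False, payload=b"")

print(processing_steps(pcm))   # ['variable_length_decode', 'inverse_transform']
print(processing_steps(dpcm))  # [..., 'motion_compensate'] is appended
print(processing_steps(data))  # ['handle_non_video']
```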
The present invention provides a data packet format for use in a digital HDTV system, as well as a receiver for such packets, all having the aforementioned advantages.