Notice: More than one reissue application has been filed for the reissue of U.S. Pat. No. 5,937,095. The reissue applications are application No. 10/662,949, which is a continuation of application No. 09/925,423 (the present application).
(1) Field of the Invention
The present invention relates to a method for encoding and decoding digital moving picture signals for use in TV phones, TV conferences and the like.
(2) Description of the Prior Art
In a general method for encoding digital moving picture signals, a frame of an inputted moving picture is divided into plural blocks each composed of N×M pixels, and processes of motion detection, prediction, orthogonal transform, quantization, variable length coding, etc. are conducted on each block.
In a general method for decoding digital moving picture signals, blocks each composed of N×M pixels are regenerated in a reverse procedure, that is, through processes of variable length decoding, reverse quantization, reverse orthogonal transform, motion compensation, etc.
The above general encoding method and decoding method for encoding and decoding digital moving picture signals enable removal of redundancy contained in moving picture signals, and efficient communication and storage of a moving picture with less information.
In the general encoding method and decoding method for encoding and decoding digital moving picture signals, the processes are conducted on each pixel block, as stated above. In general, a set of pixel blocks forms a subframe and a set of subframes forms a frame; these are the units processed in the general encoding and decoding method.
Hereinafter, encoding and decoding of each block, subframe and frame will be described by way of an example of a general encoding and decoding method for encoding and decoding digital moving picture signals, with reference to ITU-T Recommendation H.261 (hereinafter referred to simply as H.261), dated March 1993.
H.261 defines an encoding method and a decoding method for encoding and decoding the luminance signals and the color difference signals of digital moving picture signals separately. However, only the luminance signals will be described here, for the sake of convenience. Basically, the encoding method and decoding method for the luminance signals are not different from those for the color difference signals.
As shown in FIG. 1, one frame 101 of digital moving picture signals is composed of 352×288 pixels according to H.261. The frame 101 is divided into twelve subframes 102, called GOBs (Groups of Blocks), each composed of 176×48 pixels (hereinafter, a subframe in the description of the prior art will be referred to as a GOB). Further, each GOB 102 (subframe) is divided into thirty-three blocks 103, called macro blocks, each composed of 16×16 pixels.
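The hierarchy above can be verified by simple arithmetic; the following sketch (illustrative only — the constant and variable names are not part of H.261) derives the GOB and macro block counts from the pixel dimensions given above.

```python
# Dimensions stated above: CIF luminance frame, GOB (subframe), macro block.
FRAME_W, FRAME_H = 352, 288
GOB_W, GOB_H = 176, 48
MB_W, MB_H = 16, 16

# GOBs per frame: the frame tiles into 2 columns x 6 rows of GOBs.
gobs_per_frame = (FRAME_W // GOB_W) * (FRAME_H // GOB_H)

# Macro blocks per GOB: each GOB tiles into 11 columns x 3 rows.
mbs_per_gob = (GOB_W // MB_W) * (GOB_H // MB_H)

print(gobs_per_frame, mbs_per_gob)  # 12 33
```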
The encoding method according to H.261 defines that encoded information for one frame corresponds to the spatial hierarchical structure of the frame 101, GOBs 102 and macro blocks 103 described above, as shown in FIG. 2.
In FIG. 2, each part enclosed in a rectangle shows encoded information, and the number of coding bits is shown under each of the rectangles. In FIG. 2, arrows show linkages of the encoded information. A series of encoded moving picture signals such as this is called a bit stream 104.
In the bit stream 104 according to H.261 shown in FIG. 2, a part including all encoded information for one macro block 103 is called a macro block layer 103S, a part including all encoded information for one GOB 102 is called a GOB layer 102S, and a part including all encoded information for one frame 101 is called a frame layer 101S.
Meanings of the encoded information in each of the layers shown in FIG. 2 are given below:
Frame Layer 101S
PSC (20 bits): a frame identifier 105; a unique code which can always be identified in the bit stream, expressed as “0000 0000 0000 0001 0000”;
TR (5 bits): a frame number 106; indicating the time position at which this frame 101 should be displayed;
PTYPE (6 bits): frame type information 107; various information about the frame 101;
PEI (1 bit): extension data insertion information 108; a flag indicating the presence of the following PSPARE 109;
PSPARE (8 bits): extension data 109.
GOB Layer 102S (Subframe)
GBSC (16 bits): a GOB identifier 110; a unique code which can always be identified in the bit stream, expressed as “0000 0000 0000 0001”;
GN (4 bits): a GOB number 111; indicating a spatial position of this GOB 102 within the frame 101;
GQUANT (5 bits): quantization characteristics information 112; indicating the quantization characteristic used when a macro block 103 in the GOB 102 is encoded;
GEI (1 bit): extension data insertion information 113; a flag indicating the presence of the following GSPARE 114;
GSPARE (8 bits): extension data 114.
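The fixed-length fields listed above can be tallied as follows. This is an illustrative sketch only — the dictionary names and the helper function are the editor's, not part of H.261 — summing the header bits of each layer, with and without the optional extension data.

```python
# Fixed-length header fields of the frame layer and GOB layer, as listed
# above (field name -> bit width). PSPARE / GSPARE are present only when
# the PEI / GEI flag is set.
FRAME_LAYER_FIELDS = {"PSC": 20, "TR": 5, "PTYPE": 6, "PEI": 1, "PSPARE": 8}
GOB_LAYER_FIELDS = {"GBSC": 16, "GN": 4, "GQUANT": 5, "GEI": 1, "GSPARE": 8}

def header_bits(fields, with_spare):
    """Total header bits of a layer, with or without the extension data."""
    total = sum(fields.values())
    if not with_spare:
        total -= 8  # drop the optional PSPARE / GSPARE field
    return total

print(header_bits(FRAME_LAYER_FIELDS, False))  # 32
print(header_bits(GOB_LAYER_FIELDS, False))    # 26
```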
Incidentally, the encoded information 115 of the macro block layer, which is the lowest hierarchy in FIG. 2, is generated by the encoding processes of motion detection, prediction, orthogonal transform, quantization, variable length coding, etc., as described before, and its number of coding bits is not fixed. The number of coding bits of the macro block layer 103S generally increases if the spatial levels of the pixels included in the macro block 103 change largely, or if the levels of the pixels at the same spatial positions change largely over time. Such a macro block 103 is hereinafter referred to as a macro block 103 which is difficult to encode.
To the contrary, if the levels of the pixels included in the macro block 103 are steady in relation to space and time, the number of coding bits of the macro block layer 103S remarkably decreases, or sometimes becomes zero. Such a macro block 103 is hereinafter referred to as a macro block 103 which is easy to encode.
In the decoding method according to H.261, the PSC 105, which is the identifier of the frame layer 101S, is first found in the bit stream 104. Incidentally, a state where a decodable code has been successfully found is referred to as establishment of synchronization. When the PSC 105 is found in the bit stream and synchronization of the frame layer 101S is established, it can be identified that the bit stream 104 until the next PSC 105 appears is the encoded information for one frame. Further, the time position of the frame 101 composed of 352×288 pixels obtained by decoding the bit stream 104 for that one frame can be obtained by examining the frame number 106 following the PSC 105.
After the establishment of the frame layer, a GBSC 110, which is the identifier of the GOB layer 102S, is found in the following bit stream 104 in the decoding method according to H.261. When synchronization of the GOB layer is established, it can be identified that the bit stream 104 until the next GBSC 110 appears is the encoded information for one GOB 102. Further, the spatial position within a frame 101 at which the GOB 102 composed of 176×48 pixels, obtained by decoding the bit stream 104 for that one GOB 102, should be placed can be obtained by examining the GN 111, which is the GOB number following the GBSC 110.
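The establishment of synchronization described above amounts to searching the bit stream for a unique code. The following is a minimal sketch, with the bit stream modeled as a string of '0'/'1' characters; `find_sync` is a hypothetical helper, and the start-code values used are those of H.261 (note that the GBSC is the first 16 bits of the PSC, so the search for a GOB start resumes after the PSC).

```python
# Sketch of synchronization: scan a bit stream for the unique start codes.
PSC = "00000000000000010000"   # 20-bit picture start code
GBSC = "0000000000000001"      # 16-bit GOB start code (a prefix of PSC)

def find_sync(bits, code, start=0):
    """Return the index of the first occurrence of `code` at or after
    `start`, or -1 if synchronization cannot be established."""
    return bits.find(code, start)

# Toy stream: garbage bits, then a PSC, a short payload, then a GBSC.
stream = "1011" + PSC + "110101" + GBSC + "0110"
p = find_sync(stream, PSC)                    # frame layer synchronization
g = find_sync(stream, GBSC, p + len(PSC))     # then GOB layer synchronization
print(p, g)  # 4 30
```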
In the decoding method according to H.261, the bit stream 104 of the following macro block layer 103S is decoded after the establishment of the GOB layer 102S. The decoding of the macro block layer 103S is a procedure that regenerates a macro block 103 composed of 16×16 pixels through processes of variable length decoding, reverse quantization, reverse orthogonal transform, motion compensation, etc., as described before. It should be noted here that the macro block layer 103S, unlike the PSC 105 or GBSC 110, has no unique code by which it can always be identified, and the encoded information of each macro block is composed of an undefined number of bits of a variable length code.
As shown in FIG. 3, in the GOB (subframe) layer 102S, the encoded information from the first macro block 115_1 to the thirty-third macro block 115_33 is expressed as a series of variable length codes without a unique code. If decoding of the macro block encoded information is initiated from the point indicated by A in FIG. 3, and successively conducted in the order of the first, the second, . . . the nth, . . . the thirty-third macro blocks, it is possible to regenerate all the macro blocks 103 in the GOB layer 102S. However, if the decoding of the macro block encoded information is initiated from the point indicated by B or C in FIG. 3, it is impossible to identify the point from which the encoded information 115 of one macro block starts, which leads to a failure to establish synchronization. In that case, decoding and regenerating all the macro blocks 103 becomes unfeasible until the next GBSC 110 appears. In other words, the GBSC 110 also represents the starting point for decoding the macro block layer 103S.
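The loss of synchronization at a point such as B or C can be illustrated with a toy prefix code (not H.261's actual variable length code tables): starting the decoding even one bit late yields a different, incorrect symbol sequence.

```python
# Toy illustration only: a run of variable length codes has no internal
# marker, so decoding begun at the wrong bit silently produces wrong
# symbols, i.e. synchronization is lost.
VLC = {"0": "A", "10": "B", "11": "C"}  # a simple prefix code

def decode(bits):
    """Decode a '0'/'1' string with the toy prefix code above."""
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in VLC:
            out.append(VLC[buf])
            buf = ""
    return out

stream = "10" + "0" + "11" + "10"  # encodes B, A, C, B
print(decode(stream))      # ['B', 'A', 'C', 'B']  (correct start point)
print(decode(stream[1:]))  # ['A', 'A', 'C', 'B']  (one bit late: wrong)
```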
Finally, in the decoding method according to H.261, the GOB 102, which is a set of regenerated macro blocks 103, is placed at the spatial position within a frame 101 directed by the GN 111, and the frame 101, which is a set of the regenerated GOBs 102, is placed at the time position directed by the TR 106.
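The placement of a regenerated GOB at the spatial position directed by the GN 111 can be sketched as below, assuming the conventional H.261 CIF arrangement of GOBs 1 through 12 in two columns of six rows (odd GOB numbers on the left, even on the right); the function name is illustrative.

```python
# Sketch of placing a decoded GOB within the frame from its GOB number GN,
# assuming the two-column CIF GOB layout of H.261.
GOB_W, GOB_H = 176, 48

def gob_position(gn):
    """Top-left pixel (x, y) of GOB number `gn` (1..12) in the frame."""
    col = (gn - 1) % 2   # 0 = left column, 1 = right column
    row = (gn - 1) // 2  # 0..5, top to bottom
    return col * GOB_W, row * GOB_H

print(gob_position(1))   # (0, 0)
print(gob_position(12))  # (176, 240)
```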
As above, it is possible to decode one frame 101 of a digital moving picture correctly in relation to space and time according to H.261.
However, the above general method for encoding and decoding digital moving picture signals has a drawback in that, if a part of a bit stream 104 is lacking or an error occurs therein, it might be impossible to accurately decode all subframes (GOBs) 102 in relation to time until synchronization of the next frame layer 101S is established.
The reason for the above is that the only codes which can be identified at all times in the bit stream 104 are the PSC 105, which is a frame identifier, and the GBSC 110, which is a subframe identifier, in the general decoding method. If a part of the bit stream 104 is lacking or an error occurs therein, it is impossible to recover synchronization of the decoding until the next GBSC 110 appears, so that the decoding becomes unfeasible. Even when the next GBSC 110 appears, the bit stream 104 of that subframe layer 102S cannot be correctly decoded in relation to time. This will be understood from FIG. 4.
FIG. 4 shows an example where the fifth GOB 102_5 in the nth frame 101_n through the sixth GOB 102_6 in the (n+1)th frame 101_n+1 cannot be decoded in relation to time due to lacking portions of the bit stream 104 or errors occurring therein in a burst. In this example, not only the PSC 105 corresponding to the (n+1)th frame but also the following TR 106 is missing or in error. It is therefore possible to correctly decode the seventh GOB 102_7 in relation to space by establishing synchronization from the GBSC 110 corresponding to the seventh GOB 102_7 in the (n+1)th frame 101_n+1 and decoding the following GN 111, but impossible to specify whether this GOB 102_7 is positioned in the nth frame or in the (n+1)th frame in relation to time.
In terms of decoding the eighth GOB 102_8 through the twelfth GOB 102_12 in the (n+1)th frame, it is likewise impossible to specify whether these GOBs 102 are positioned in the nth frame or in the (n+1)th frame in relation to time.
In consequence, if a part of the bit stream 104 is missing or an error occurs therein, it becomes impossible to correctly decode all GOBs 102 in relation to time until synchronization of the next frame layer 101S is established.
Further, the general method for encoding and decoding digital moving picture signals has another drawback in that, if a GOB 102 including a picture in motion in relation to time cannot be decoded, the picture quality of the reproduced picture is largely degraded.
This problem will be described in more detail with reference to FIG. 5. FIG. 5 shows one frame of decoded moving picture signals, in which a figure is moving in the center of the frame. In FIG. 5, the part moving in relation to time is indicated by slanting lines, and the remaining part is a background which is still in relation to time. A scene like this is common in TV conferences, TV phones and the like.
Referring to FIG. 5, consider the case where any one of the first GOB 102_1 through the fourth GOB 102_4 cannot be decoded. The first through fourth GOBs 102_1 through 102_4 include a picture that is still in relation to time. If the second GOB 102_2 cannot be decoded, for example, a skillful operation is conducted in the decoding to substitute the second GOB 102_2 of the preceding frame 101_-1 for the second GOB 102_2 of the present frame 101. With this operation, degradation of the picture quality in the second GOB 102_2 of the present frame 101 may be hardly detected.
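The substitution described above can be sketched as copying the co-located region from the preceding decoded frame. The function name and the plain nested-list frame representation below are illustrative, not part of H.261, and the two-column CIF GOB layout is assumed.

```python
# Sketch of concealment by substitution: when GOB number `gn` of the
# present frame cannot be decoded, copy the co-located pixels from the
# preceding decoded frame. Frames are nested lists of luminance values.
GOB_W, GOB_H = 176, 48
FRAME_W, FRAME_H = 352, 288

def conceal_gob(current, previous, gn):
    """Overwrite GOB number `gn` (1..12) of `current` with the same
    region of `previous` (two-column CIF GOB layout assumed)."""
    x0 = ((gn - 1) % 2) * GOB_W
    y0 = ((gn - 1) // 2) * GOB_H
    for y in range(y0, y0 + GOB_H):
        current[y][x0:x0 + GOB_W] = previous[y][x0:x0 + GOB_W]

prev = [[1] * FRAME_W for _ in range(FRAME_H)]  # preceding frame (all 1s)
cur = [[0] * FRAME_W for _ in range(FRAME_H)]   # present frame (all 0s)
conceal_gob(cur, prev, 2)  # second GOB: the top-right region
print(cur[0][176], cur[0][0])  # 1 0
```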
However, a problem arises if the fifth through twelfth GOBs 102_5 through 102_12 shown in FIG. 5 cannot be decoded. The fifth through twelfth GOBs 102_5 through 102_12 include a picture moving in relation to time. This means, for example, that the picture in the ninth GOB 102_9 of the preceding frame 101_-1 is largely different in relation to time from that in the ninth GOB 102_9 of the present frame 101. If the decoding of the ninth GOB 102_9 is unfeasible, degradation of the picture quality of the ninth GOB 102_9 of the present frame 101 is obviously detected even if the skillful operation mentioned above is conducted in the decoding.
Accordingly, if decoding of a GOB 102 including a picture moving in relation to time becomes unfeasible, the quality of the reproduced picture is largely degraded.