1. Field of the Invention
The present invention relates to an image processing apparatus and method, and specifically, to interframe coding on video image data and decoding on coded video image data.
2. Description of the Related Art
Intraframe coding methods, such as Motion JPEG (Joint Photographic Experts Group) and DV (Digital Video), and interframe predictive coding methods, such as H.261, H.263, MPEG-1, and MPEG-2, are known image coding methods of the related art.
These coding methods have been internationally standardized by the International Organization for Standardization (ISO) and the International Telecommunication Union (ITU).
Intraframe coding methods are suitable for apparatuses that require video image editing and special playback, since these methods code each frame separately and frames can therefore be easily managed. Interframe coding methods, on the other hand, feature high coding efficiency since they use interframe prediction.
In addition, the coding standards include a new standard called “MPEG-4”, a next-generation general-purpose multimedia coding standard that can be used in many fields, such as computers, broadcasting, and communication.
In the H.261, H.263, MPEG-1, and MPEG-2 standards, frame-unit coding is divided into three types: an intra-coded picture (I picture), which is only intraframe-coded; a predictive-coded picture (P picture), which additionally performs interframe prediction from the closest past frame; and a bidirectionally predictive-coded picture (B picture), which further performs interframe prediction from the closest future frame.
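The dependency structure of the three picture types can be sketched as follows. This is a simplified model for illustration only; real codecs such as MPEG-2 also reorder frames in the bitstream so that B-picture references arrive first.

```python
def reference_frames(picture_type, index, frames):
    """Return the indices of frames that a picture of the given type
    depends on, in a simplified I/P/B model.

    I pictures are self-contained; P pictures reference the closest
    past I or P frame; B pictures additionally reference the closest
    future I or P frame.
    """
    if picture_type == "I":
        return []
    # Closest past reference (I or P) for P and B pictures.
    past = max(i for i in range(index) if frames[i] in ("I", "P"))
    if picture_type == "P":
        return [past]
    # B pictures also reference the closest future I or P frame.
    future = min(i for i in range(index + 1, len(frames))
                 if frames[i] in ("I", "P"))
    return [past, future]

# Example group of pictures: I B B P B B P
gop = ["I", "B", "B", "P", "B", "B", "P"]
print(reference_frames("P", 3, gop))  # [0]
print(reference_frames("B", 4, gop))  # [3, 6]
```

The sketch makes the key asymmetry visible: an I picture can be decoded in isolation, while P and B pictures are undecodable unless their reference frames are available.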
In interframe coding, all frames must be transmitted sequentially so that each frame can refer to the closest past frame. When one party transmits data after establishing a communication link with another party, as over, for example, a telephone line or an ISDN (Integrated Services Digital Network) line, no problem occurs, since the data reaches the other party in order without being lost on the path between both parties. However, over, for example, a local area network (LAN) or an asynchronous transfer mode (ATM) network, no communication link is established, and coded data is transmitted divided into smaller units (packets or cells); consequently, some packets may be lost in the communication channel, and the order of packets may be switched because packets travel over different routes.
So that the packet receiving party can know the original order of packets even if that order is switched, the reliability of the network is enhanced in one of two ways: the packet transmitting party attaches serial numbers to the packets, or a protocol (e.g., TCP (Transmission Control Protocol)) is used which confirms the arrival of packets and sends back a re-send request for undelivered packets from the receiving party to the transmitting party.
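As a minimal illustration of the serial-numbering approach, the receiver can restore the original order from a sequence number carried in each packet and detect gaps for which a re-send request could be issued. This is a sketch only; real protocols such as RTP must additionally handle sequence-number wrap-around.

```python
def reorder_packets(received):
    """Restore original packet order from (sequence_number, payload)
    pairs that may arrive out of order, and report any gaps.

    Returns the payloads sorted by sequence number, plus the list of
    missing sequence numbers for which a re-send request could be
    issued (as with TCP-style reliable delivery).
    """
    received = sorted(received)           # order by sequence number
    seqs = {s for s, _ in received}
    lo, hi = min(seqs), max(seqs)
    missing = [s for s in range(lo, hi + 1) if s not in seqs]
    payloads = [p for _, p in received]
    return payloads, missing

# Packets 0..4 sent; packet 2 is lost, packets 1 and 3 arrive swapped.
arrived = [(0, "a"), (3, "d"), (1, "b"), (4, "e")]
print(reorder_packets(arrived))  # (['a', 'b', 'd', 'e'], [2])
```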
When the operation of the network is unstable and packets are frequently lost, the use of the above protocol to perform re-sending accumulates transmission delays. Thus, the above protocol is not suitable for real-time transmission of video images.
In addition, broadcast and multicast, which have become popular as use of the Internet has spread and which are used for multipoint data transmission, employ a mechanism that delivers data to a plurality of points with a single packet transmission. In this mechanism, when a packet is lost on the way to one point, re-sending with a protocol such as the above greatly increases the network load, because identical packets must be sent through the network again even to the points at which the first packet arrived normally. Accordingly, in broadcast and multicast, it is common to use a protocol (e.g., UDP (User Datagram Protocol)) that does not perform re-sending. The use of this protocol, however, increases the probability that packets are lost.
When a wireless network is used, the data error rate and the rate of lost packets or data tend to increase, not only when transmitting packets obtained by dividing data, but even when transmitting data after establishing a link. In particular, when the receiving party detects errors in a received signal beyond its error-correcting ability, a method is employed that abandons the data in that section so that the other parts of the data can be processed normally. Accordingly, the amount of lost data is larger than in the case of a wired network.
Using video image data as an example of data lost in packet transmission, a specific example in the MPEG-4 case is described below.
FIG. 1 is an illustration of an example of lost frames in packet transmission of video image data.
FIG. 1 shows MPEG-4 video-image-data frames a to e. Frame a is an I frame that is intraframe-coded. Frame b to frame e are P frames that are interframe-predictive-coded.
As shown in FIG. 1, when frame c is lost during transmission, or when frame c cannot be decoded in time due to a delay, decoding cannot resume until the next I frame i (not shown) arrives. Since each P frame is predicted from the immediately preceding frame, it is impossible to decode frames d, e, . . . , which are the P frames existing until the next I frame i arrives.
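This error-propagation behavior can be sketched as a simple simulation. Here "decodable" merely tracks whether each frame's reference is available; it is an illustrative model, not a decoder.

```python
def decodable_frames(frames, lost):
    """Given a sequence of 'I'/'P' frame types and a set of lost
    frame indices, return the indices of decodable frames.

    An I frame needs only to arrive; a P frame is decodable only if
    it arrived AND its immediately preceding frame was decoded.
    """
    ok = []
    prev_decoded = False
    for i, frame_type in enumerate(frames):
        if i in lost:
            decoded = False
        elif frame_type == "I":
            decoded = True
        else:  # P frame: needs the previous frame as a reference
            decoded = prev_decoded
        if decoded:
            ok.append(i)
        prev_decoded = decoded
    return ok

# Frames a-e of FIG. 1 (one I frame, four P frames) followed by the
# next I frame; frame c (index 2) is lost in transit.
frames = ["I", "P", "P", "P", "P", "I"]
print(decodable_frames(frames, lost={2}))  # [0, 1, 5]
```

Even though frames d and e (indices 3 and 4) arrive normally, they cannot be decoded because the reference chain is broken at the lost frame; decoding resumes only at the next I frame (index 5).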
Accordingly, in order to ensure transmission of all the frames on a network on which such lost frames and delays frequently occur, a method has been employed that transmits all the frames using only intraframe coding, such as JPEG, rather than interframe coding. For example, in the case of JPEG coding, even if frame c is lost, the next frame can still be decoded. In this case, however, since no interframe coding is performed, the redundancy of temporal changes is not eliminated, causing a problem in that poor coding efficiency increases the amount of transmitted data.
In addition, there is a known technology in which, when video image data is transmitted using both interframe coding (intercoding) and intraframe coding (intracoding), the transmitting party estimates the portion of an image that may be affected by an error, forcibly intraframe-codes the estimated portion, and transmits the coded portion to the receiving party. In this case, it is difficult to estimate the portion of the image that may be affected by the error. The estimation may be wrong, so deterioration in image quality cannot be completely eliminated.
From the above description, image coding and decoding are in demand in which interframe coding is used to suppress the decrease in coding efficiency when few errors occur, and in which, even if an error does occur, a P frame can be decoded without awaiting a later I frame, so that deterioration in image quality is reduced.