1. Field of the Invention
The present invention relates to a method for scalable transmission of visual objects obtained by content-based segmentation, in order to provide high-quality video services, which are in high demand, over the Internet, whose transmission speed is variable.
2. Description of the Related Art
In general, a content-based video encoding scheme segments the content of an input picture (for example, a person, a desk, a flower and the like) into video objects, and then codes the segmented video objects. The principle of the coding and decoding is illustrated in FIG. 1.
FIG. 1 is a schematic diagram of an earlier video transmission system, wherein reference numeral 11 denotes the content-based object segmenting part, 12 denotes the encoding part, 13 denotes the multiplexing part, 14 denotes the transmitting part, 15 denotes the receiving part, 16 denotes the demultiplexing part, 17 denotes the decoding part, and 18 denotes the picture reconstructing and displaying part.
As shown in FIG. 1, the content-based object segmenting part 11 segments an input video frame sequence into, for example, content-object sequence A, content-object sequence B, and background object sequence C. Then, the encoding part 12 encodes each of the segmented video object sequences. The encoded video object sequences are then multiplexed by the multiplexing part 13 into one bitstream. Finally, the bitstream is transmitted by the transmitting part 14 via a network.
Once the transmitted bitstream arrives at the receiving part 15, the bitstream is demultiplexed into the several encoded video object sequences by the demultiplexing part 16. Then, the decoding part 17 decodes the encoded video object sequences back into the video object sequences. Finally, the picture reconstructing and displaying part 18 reconstructs the decoded video object sequences into a picture for display by using the spatial and temporal information.
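The multiplexing and demultiplexing steps described above can be sketched as follows. This is only an illustrative sketch, not the format used by the system of FIG. 1: the packet layout (a one-byte object identifier followed by a four-byte payload length) and the function names multiplex and demultiplex are assumptions introduced here, and the encoded chunks are stand-in byte strings rather than real coded video objects.

```python
import struct

# Hypothetical packet header: 1-byte object id + 4-byte big-endian payload length.
HEADER = struct.Struct(">BI")


def multiplex(streams):
    """Interleave encoded chunks of several video object sequences into one bitstream.

    streams maps an object id (0..255) to a list of encoded chunks (bytes).
    """
    out = bytearray()
    longest = max((len(chunks) for chunks in streams.values()), default=0)
    for i in range(longest):
        for obj_id, chunks in streams.items():
            if i < len(chunks):
                payload = chunks[i]
                out += HEADER.pack(obj_id, len(payload)) + payload
    return bytes(out)


def demultiplex(bitstream):
    """Split one bitstream back into per-object lists of encoded chunks."""
    streams = {}
    pos = 0
    while pos < len(bitstream):
        obj_id, length = HEADER.unpack_from(bitstream, pos)
        pos += HEADER.size
        streams.setdefault(obj_id, []).append(bitstream[pos:pos + length])
        pos += length
    return streams


if __name__ == "__main__":
    # Stand-in payloads for object sequences A, B and background C.
    encoded = {
        ord("A"): [b"a0", b"a1"],
        ord("B"): [b"b0"],
        ord("C"): [b"c0", b"c1", b"c2"],
    }
    assert demultiplex(multiplex(encoded)) == encoded
```

The round trip in the final lines shows that each object sequence can be recovered separately from the single bitstream, which is the property the decoding part 17 relies on.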
The encoding scheme for the video objects is a predictive encoding scheme comprising Intra type (I type) encoding, Predictive type (P type) encoding, and bidirectionally predictive type (B type) encoding. In I type encoding, prediction is performed within the object itself. In P type encoding, prediction is performed in the forward direction with regard to time. In B type encoding, prediction is performed in both directions with regard to time.
That is, I type encoding uses only the information in the current frame. P type encoding uses the previous I or P type video object as its reference. B type encoding uses both the previous I or P type video object and the next I or P type video object.
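The reference relationships among the three encoding types can be illustrated with a short sketch. It assumes that the video objects are listed in temporal order as a string of type letters; the function name reference_indices is hypothetical and is used here only for illustration.

```python
def reference_indices(types):
    """For each video object, list the indices of the objects it is predicted from.

    types is a string such as "IPBBP", assumed to be in temporal order.
    I type: no reference. P type: the previous I or P type object.
    B type: the previous and the next I or P type object.
    """
    anchors = [i for i, t in enumerate(types) if t in "IP"]
    refs = []
    for i, t in enumerate(types):
        prev = max((a for a in anchors if a < i), default=None)
        nxt = min((a for a in anchors if a > i), default=None)
        if t == "I":
            refs.append([])
        elif t == "P":
            refs.append([prev] if prev is not None else [])
        elif t == "B":
            refs.append([a for a in (prev, nxt) if a is not None])
        else:
            raise ValueError(f"unknown encoding type: {t!r}")
    return refs


if __name__ == "__main__":
    print(reference_indices("IPBBP"))  # [[], [0], [1, 4], [1, 4], [1]]
```

In this example the two B type objects depend on the P type objects at positions 1 and 4, so they cannot be decoded until both of those objects have been received.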
In addition, it is assumed that the encoding of the video objects recurs in a specific pattern, for example, IPBBP . . . IPBBP . . . IPBBP. The set of video objects having this specific pattern is called a Group Of Pictures (hereinafter referred to as GOP).
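As a small illustration of the GOP structure, the sketch below groups a sequence of encoding types into GOPs. It assumes that every GOP begins with an I type object, which matches the example pattern above but is not stated as a general rule; the function name split_into_gops is hypothetical.

```python
def split_into_gops(types):
    """Split a string of encoding types into GOPs, starting a new GOP at each I."""
    gops = []
    for t in types:
        if t == "I" or not gops:
            gops.append("")
        gops[-1] += t
    return gops


if __name__ == "__main__":
    print(split_into_gops("IPBBPIPBBPIPBBP"))  # ['IPBBP', 'IPBBP', 'IPBBP']
```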
However, there has been a problem in that motion picture services such as Video On Demand (hereinafter referred to as VOD) cannot guarantee high quality over the Internet, whose channel transmission speed varies from moment to moment.