Recently, television telephone services and moving image delivery services using mobile terminals have become increasingly popular. Services in which a server accumulates moving image data received from mobile terminals and later delivers that data are also expected to become popular in the future.
FIG. 1 is a diagram illustrating an example of a typical configuration of a conventional moving image conversion apparatus. Referring to FIG. 1, received data 100 received by the moving image conversion apparatus is demultiplexed into control data 102, image data 103, and audio data 104 by data receiver/demultiplexer 101.
Control data 102, which is information on image coding, is applied to image data decoder 105. In some cases, control data 102 is applied to image data encoder 107 as well as to image data decoder 105.
Image data 103 is applied to image data decoder 105.
Image data decoder 105 decodes image data 103 based on information derived from control data 102 to generate decoded image data, and supplies the decoded image data to switch 106.
Audio data 104 is applied to switch 106.
From the time when switch 106 is notified of the start of a conversion through conversion indication signal 109 until the time when it is notified of the termination of the conversion through the same signal, switch 106 applies the decoded image data to image data encoder 107 and applies audio data 104 to data output unit 108.
Image data encoder 107 encodes the decoded image data applied thereto in an intra-mode and an inter-mode to generate re-encoded image data, and applies the re-encoded image data to data output unit 108.
Data output unit 108 receives the audio data transferred from switch 106 and the re-encoded image data generated by image data encoder 107, and outputs both.
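The data flow of FIG. 1 described above can be sketched as follows. This is a minimal illustration only: the names Packet, decode_image, encode_image, and convert are hypothetical, the decoder and encoder are string placeholders rather than real codecs, and the start and termination notifications of conversion indication signal 109 are modeled as packet indices.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Packet:
    """One demultiplexed unit of received data 100 (hypothetical model)."""
    control: str  # stands in for control data 102
    image: str    # stands in for encoded image data 103
    audio: str    # stands in for audio data 104


def decode_image(image: str, control: str) -> str:
    # Placeholder for image data decoder 105; a real decoder would use
    # the coding information carried in the control data.
    return f"decoded({image})"


def encode_image(decoded: str) -> str:
    # Placeholder for image data encoder 107.
    return f"reencoded({decoded})"


def convert(packets: List[Packet], start: int, stop: int) -> List[Tuple[str, str]]:
    """Pass data through the switch only for indices in [start, stop)."""
    output: List[Tuple[str, str]] = []
    for index, packet in enumerate(packets):
        # The decoder runs on every packet, whether or not the switch is closed.
        decoded = decode_image(packet.image, packet.control)
        # Switch 106: closed between the start and termination notifications.
        if start <= index < stop:
            output.append((encode_image(decoded), packet.audio))
    return output
```

In this sketch the audio is forwarded unchanged, while every image that passes the switch is decoded and re-encoded, which mirrors the roles of switch 106 and data output unit 108.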
Third-generation (3GPP) mobile terminals widely employ MPEG4 (ISO/IEC 14496-2:2003, “Information technology - Coding of audio-visual objects - Part 2: Visual”) as a moving image coding scheme.
MPEG4 has an intra-mode for encoding image data of a current frame using only the image data of the current frame, and an inter-mode for encoding image data of a current frame with reference to image data of a past frame as well.
In the intra-mode, input pixels which make up image data are DCT (Discrete Cosine Transform) processed in units called macro blocks, and subsequently, DCT coefficients are variable-length-encoded.
In the inter-mode, a motion compensation prediction is performed using input pixels and decoded pixels of a past frame to calculate differential pixels, and the differential pixels are DCT processed, and subsequently, motion vectors, DCT coefficients, and the like are variable-length-encoded.
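The difference between the two modes can be illustrated with a small sketch. It assumes an orthonormal 8x8 DCT-II and omits quantization and variable-length encoding entirely. The function names are hypothetical, and the motion vector is taken as given rather than searched for.

```python
import math
from typing import List

Block = List[List[float]]


def dct_2d(block: Block) -> Block:
    """Orthonormal 2-D DCT-II of a square block (direct, unoptimized form)."""
    n = len(block)

    def scale(k: int) -> float:
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def predict_block(reference: Block, top: int, left: int,
                  dy: int, dx: int, n: int = 8) -> Block:
    """Motion-compensated prediction: fetch the block displaced by (dy, dx)
    from the decoded past frame."""
    return [[reference[top + dy + r][left + dx + c] for c in range(n)]
            for r in range(n)]


def intra_encode(block: Block) -> Block:
    # Intra-mode: transform the input pixels of the current frame directly.
    return dct_2d(block)


def inter_encode(block: Block, predicted: Block) -> Block:
    # Inter-mode: transform the differential pixels (input minus prediction).
    n = len(block)
    diff = [[block[r][c] - predicted[r][c] for c in range(n)] for r in range(n)]
    return dct_2d(diff)
```

When the prediction matches the input well, the differential pixels and therefore the coefficients are close to zero, which is why the inter-mode usually needs fewer bits but depends on the past frame being available.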
When a moving image conversion apparatus is used for the accumulation and delivery of moving images, no frame prior to the first frame of the accumulated images is available for reference when the first frame is reproduced. Accordingly, the first frame of the accumulated images must necessarily be data encoded in the intra-mode.
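The dependency can be seen in a toy decoder, sketched below under strong simplifications: each picture is reduced to a single integer, an intra frame carries the value itself, and an inter frame carries only a difference from the previously decoded picture. The function name and data layout are hypothetical.

```python
from typing import List, Optional, Tuple

Frame = Tuple[str, int]  # (mode, payload); mode is "intra" or "inter"


def decode_sequence(frames: List[Frame]) -> List[int]:
    """Decode a toy sequence; inter frames need the previous decoded picture."""
    previous: Optional[int] = None
    decoded: List[int] = []
    for mode, payload in frames:
        if mode == "intra":
            picture = payload                 # self-contained
        elif mode == "inter":
            if previous is None:
                # No past frame exists to serve as the reference.
                raise ValueError("inter frame has no reference frame")
            picture = previous + payload      # reference plus difference
        else:
            raise ValueError(f"unknown mode: {mode}")
        decoded.append(picture)
        previous = picture
    return decoded
```

A sequence that begins with an intra frame decodes normally, whereas the same sequence with its first frame removed fails at once, which is the situation the accumulated images must avoid.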
In this connection, Patent Document 1 discloses a video stream editing method and apparatus for processing MPEG video streams.
This apparatus first extracts a first partial stream from MPEG video stream 1 such that an I- or P-picture is the last displayed picture, and then extracts a second partial stream from MPEG video stream 2 such that an I- or P-picture is the first displayed picture.
Subsequently, this apparatus determines whether or not the first displayed picture of the second partial stream is an I-picture.
When the first displayed picture of the second partial stream is an I-picture, this apparatus leaves the first displayed picture of the second partial stream unchanged from the I-picture.
On the other hand, when the first displayed picture of the second partial stream is a P-picture, this apparatus sequentially decodes from an I-picture immediately before the P-picture to that P-picture to generate a decoded image of the P-picture. Subsequently, this apparatus again encodes the decoded image of the P-picture to generate image data of an I-picture, and substitutes the image data of the I-picture for the P-picture which is the first displayed picture of the second partial stream.
Subsequently, this apparatus combines the first partial stream with the second partial stream to generate a third stream.
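The editing procedure of Patent Document 1, as summarized above, might be sketched as follows. This is not the patent's implementation: pictures are modeled as (type, value) pairs containing only I- and P-pictures, a P-picture carries an integer difference from the preceding picture, and the names decode_up_to and splice are hypothetical.

```python
from typing import List, Tuple

Picture = Tuple[str, int]  # ("I", full value) or ("P", difference)


def decode_up_to(stream: List[Picture], index: int) -> int:
    """Decode from the I-picture immediately before `index` up to `index`."""
    start = index
    while start >= 0 and stream[start][0] != "I":
        start -= 1
    if start < 0:
        raise ValueError("no I-picture precedes the requested picture")
    value = stream[start][1]
    for _, delta in stream[start + 1:index + 1]:
        value += delta
    return value


def splice(first_part: List[Picture], second_source: List[Picture],
           cut_index: int) -> List[Picture]:
    """Join first_part with second_source[cut_index:], forcing an I-picture
    at the head of the second partial stream."""
    second_part = list(second_source[cut_index:])
    if second_part[0][0] == "P":
        # Decode the P-picture, then re-encode it as an I-picture.
        second_part[0] = ("I", decode_up_to(second_source, cut_index))
    # An I-picture at the head is left unchanged.
    return list(first_part) + second_part
```

Only the head picture of the second partial stream is re-encoded. The pictures that follow it are copied as they are, since they can still refer to the substituted I-picture.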
Patent Document 2 in turn discloses an image accumulation/reproduction apparatus which encodes images on a frame-by-frame basis. This image accumulation/reproduction apparatus comprises encoding means, accumulating means, receiving means, control means, decoding means, intra-encoding means, and transmitting means.
The encoding means encodes macro-blocks at all positions within frames of a predefined number of frames in a predefined intra-mode at least once, and encodes macro-blocks other than the macro-blocks encoded in the intra-mode in a predefined inter-mode. The accumulating means accumulates images encoded by the encoding means.
The receiving means receives a reproduction start position from the outside for images accumulated in the accumulating means. The control means traces back from the reproduction start position, received by the receiving means, by the predefined number of frames to read images from the accumulating means. The decoding means decodes the read images to create an image frame for the reproduction start position. The intra-encoding means encodes the image frame created by the decoding means in an intra-mode. The transmitting means transmits the image frame encoded by the intra-encoding means.
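The trace-back operation of Patent Document 2 can be sketched in the same toy style. The assumptions are that each frame is a list of macro-blocks, that each macro-block is either ("intra", value) or ("inter", difference from the same position in the previous frame), and that prediction never crosses macro-block positions. The function name and the data layout are hypothetical.

```python
from typing import List, Optional, Tuple

MacroBlock = Tuple[str, int]   # ("intra", value) or ("inter", difference)
StoredFrame = List[MacroBlock]


def build_start_frame(stored: List[StoredFrame], start: int,
                      window: int) -> StoredFrame:
    """Trace back `window` frames from `start`, decode forward, and return
    the frame at `start` re-encoded entirely in the intra-mode."""
    first = max(0, start - window)
    state: List[Optional[int]] = [None] * len(stored[0])
    for frame in stored[first:start + 1]:
        for pos, (mode, value) in enumerate(frame):
            if mode == "intra":
                state[pos] = value            # position is refreshed
            elif state[pos] is not None:
                state[pos] += value           # apply difference to known value
            # An inter macro-block with no known reference stays undecodable.
    if any(v is None for v in state):
        raise ValueError("window too short: some macro-blocks never refreshed")
    return [("intra", v) for v in state]
```

Because every macro-block position is encoded in the intra-mode at least once within the predefined number of frames, tracing back by that number is enough to reconstruct every position of the frame at the reproduction start position in this model.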
Patent Document 1: JP-A-2002-300528
Patent Document 2: JP-A-2002-314940