1. Field of the Invention
The present invention generally relates to a method of encoding and decoding moving pictures, and particularly relates to a method of efficiently encoding and decoding foreground images and background images that are provided separately for moving pictures. Further, the present invention relates to a decoder and an encoder based on this method.
2. Description of the Related Art
Technologies for coding images into digital data for the purpose of data transmission and data storage have been employed in digital broadcasting and digital videotape recording. MPEG-4 is the standard that succeeds MPEG-2, which is widely used today, and employs object-based coding that encodes foreground pictures and background pictures separately after they are separated from the original images. Object-based coding has advantages such as improved coding efficiency resulting from the separation of foreground images from background images. In particular, MPEG-4 includes a scheme called “sprite coding”.
“Sprite” is an extended, panoramic image used for background pictures. This extended image is coded and transmitted in advance. On the receiver side, an image patch is extracted from the extended image at proper locations so as to be used as a background picture of the decoded image. In this sprite scheme, all that needs to be coded is the extended background image and position parameters used for image extraction at the receiver end. This eliminates a need to encode every frame, thereby making it possible to improve image coding efficiency.
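The receiver-side extraction described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the position parameters are a top-left offset and that the window has a fixed size; the function name `extract_patch` and the tiny test image are hypothetical and are not taken from MPEG-4.

```python
def extract_patch(sprite, top, left, height, width):
    """Return the height x width window of `sprite` whose top-left corner is (top, left).

    `sprite` is a list of rows (a grayscale image). The (top, left) offset is an
    assumed, simplified form of the position parameters.
    """
    if top < 0 or left < 0 or top + height > len(sprite) or left + width > len(sprite[0]):
        raise ValueError("requested window lies outside the extended image")
    return [row[left:left + width] for row in sprite[top:top + height]]


# A 4x6 extended background image, coded and transmitted once in advance.
sprite = [[10 * r + c for c in range(6)] for r in range(4)]

# Hypothetical per-frame position parameters: the only background data sent per frame.
frame_params = [(0, 0), (0, 1), (1, 2)]
frames = [extract_patch(sprite, top, left, 2, 3) for top, left in frame_params]
```

In this sketch three background pictures are reconstructed from one stored image plus three small parameter pairs, which is the source of the coding gain described above.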
When the brightness level is uniformly changed for the entirety of the background, parameters representing such a change are coded, and the background image is modified on the receiver side according to these parameters. If the background image shows a change other than a uniform brightness-level change, a picture that corresponds to the point of change is coded again so as to update the background image.
In the production of broadcast programs, a method has been used that synthesizes foreground pictures with background pictures to produce composed images. Nowadays, virtual studio techniques that utilize computer graphics for background images are widely used. One of the methods for producing synthesized images having a natural appearance is defocusing. When a foreground picture and a background picture are both in focus, a composed image lacks a natural appearance. With defocusing of the background picture, however, the sense of distance and depth is enhanced, thereby making images appear more natural. Such a method is employed in various types of image synthesizing apparatuses.
In the related-art scheme that encodes foregrounds and backgrounds separately, if the background image of moving pictures blurs, the extended background image is encoded and transmitted again to cope with the blurring. This method, however, is undesirably inefficient since the extended background image needs to be coded each time there is a change.
Accordingly, there is a need for a scheme that encodes only a minimum amount of data when coping with blurring of background pictures.
Moreover, there is another drawback in the related art as will be described in the following.
A sprite (i.e., the extended background image) is generated by putting together a plurality of background pictures through application of image processing called panorama image processing or image mosaic processing. In such processing, camera parameters regarding camera panning and zooming or the like are estimated from video signals or directly obtained from the camera position sensors with an aim of determining relative positions of images for the purpose of integrating them together.
When images are to be integrated together, geometric distortions of the camera lens need to be compensated for and removed from the images before they are integrated. Such lens distortions can be represented by a formula based on a model that employs several parameters.
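As an illustration of such a parametric model, the sketch below uses a polynomial radial distortion model with two coefficients. The choice of this particular model, the coefficient names `k1` and `k2`, and the function names are assumptions made for illustration; the text above does not specify which model is used.

```python
def distort_point(x, y, k1, k2=0.0):
    """Map an undistorted point to its distorted position.

    Coordinates are normalized and measured from the optical center.
    Assumed model: scale = 1 + k1 * r^2 + k2 * r^4, with r^2 = x^2 + y^2.
    """
    r2 = x * x + y * y
    scale = 1.0 + k1 * r2 + k2 * r2 * r2
    return x * scale, y * scale


def undistort_point(xd, yd, k1, k2=0.0, iterations=20):
    """Approximately invert distort_point by fixed-point iteration.

    This converges for mild distortion; it is a sketch, not a robust solver.
    """
    x, y = xd, yd
    for _ in range(iterations):
        r2 = x * x + y * y
        scale = 1.0 + k1 * r2 + k2 * r2 * r2
        x, y = xd / scale, yd / scale
    return x, y


# Round trip: distort a point, then remove the distortion again.
xd, yd = distort_point(0.5, 0.25, k1=-0.1)
xu, yu = undistort_point(xd, yd, k1=-0.1)
```

Removing distortion before integration corresponds to `undistort_point` in this sketch, while adding it back corresponds to `distort_point`.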
When a portion of the extended background image is extracted at the decoder end to produce moving pictures, no consideration is given to the fact that the extended background image lacks lens distortion. Because lens distortion was removed at the coder end for the sake of precise image integration, the original moving pictures cannot be reconstructed at the decoder end precisely as they were at the coder end unless lens distortion is added back to the extracted background pictures.
Accordingly, when the background picture from which lens distortion has been removed is composed with a foreground picture that still has lens distortion, the reconstructed image may not appear natural because of the disparity between the presence and absence of the lens distortion.
Accordingly, there is a need for a scheme that suppresses unrealistic appearance of reconstructed images caused by disparity between the presence and absence of lens distortion.
It is a general object of the present invention to provide a coding and decoding scheme that substantially obviates one or more of the problems caused by the limitations and disadvantages of the related art.
It is another and more specific object of the present invention to provide a coding and decoding scheme that encodes only a minimum amount of data when coping with blurring of background pictures.
It is yet another object of the present invention to provide a coding and decoding scheme that suppresses unrealistic appearance of reconstructed images caused by disparity between the presence and absence of lens distortion.
Features and advantages of the present invention will be set forth in the description which follows, and in part will become apparent from the description and the accompanying drawings, or may be learned by practice of the invention according to the teachings provided in the description. Objects as well as other features and advantages of the present invention will be realized and attained by a coding and decoding scheme particularly pointed out in the specification in such full, clear, concise, and exact terms as to enable a person having ordinary skill in the art to practice the invention.
To achieve these and other advantages and in accordance with the purpose of the invention, as embodied and broadly described herein, a method of coding and decoding moving pictures according to the present invention includes the steps of coding an extended background image and a foreground picture separately from each other, coding parameters indicative of an image area within the extended background image, coding a defocus value, decoding the extended background image and the foreground picture, decoding the parameters, decoding the defocus value, extracting a background picture from the image area indicated by the decoded parameters within the decoded extended background image, blurring the background picture to an extent indicated by the decoded defocus value, and composing the blurred background picture with the decoded foreground picture.
According to the method as described above, the defocus value is coded and transmitted from the coder end, and is decoded and used to defocus the background picture at the decoder end. This makes it possible to produce a composed image of the background picture and the foreground picture having natural appearance, and all that is necessary to achieve this is to encode and transmit the defocus value, which is a minimum amount of data necessary for the focus control purpose.
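The decoder-side steps described above (extract, blur, compose) can be sketched as follows. The sketch assumes that the defocus value is an integer radius of a simple mean (box) filter and that the foreground comes with a binary mask; neither assumption is stated in the text, and all function names are hypothetical.

```python
def box_blur(image, radius):
    """Blur a grayscale image with a (2*radius+1)-wide mean filter.

    Assumption: the defocus value is interpreted as `radius`. Near the borders
    only the pixels that exist are averaged.
    """
    if radius <= 0:
        return [row[:] for row in image]
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        ys = range(max(0, y - radius), min(h, y + radius + 1))
        row = []
        for x in range(w):
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            total = sum(image[j][i] for j in ys for i in xs)
            row.append(total / (len(ys) * len(xs)))
        out.append(row)
    return out


def compose(background, foreground, mask):
    """Take the foreground pixel where the mask is set, else the background pixel."""
    return [[f if m else b for b, f, m in zip(brow, frow, mrow)]
            for brow, frow, mrow in zip(background, foreground, mask)]


def decode_frame(sprite, area, defocus, foreground, mask):
    """Extract the background from `area`, blur it by `defocus`, and compose."""
    top, left, height, width = area
    background = [row[left:left + width] for row in sprite[top:top + height]]
    return compose(box_blur(background, defocus), foreground, mask)


# Decoded extended background (one bright pixel) and a one-pixel foreground.
sprite = [[0, 0, 0, 0], [0, 90, 0, 0], [0, 0, 0, 0]]
foreground = [[255] * 3 for _ in range(3)]
mask = [[False] * 3, [False] * 3, [False, False, True]]

sharp = decode_frame(sprite, (0, 0, 3, 3), 0, foreground, mask)
blurred = decode_frame(sprite, (0, 0, 3, 3), 1, foreground, mask)
```

Only the single `defocus` number changes between the two calls; the extended background image itself is not transmitted again.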
According to another aspect of the present invention, a method of coding and decoding moving pictures includes the steps of coding an extended background image and a foreground picture separately from each other, coding parameters indicative of an image area within the extended background image, coding a lens distortion value, decoding the extended background image and the foreground picture, decoding the parameters, decoding the lens distortion value, extracting a background picture from the image area indicated by the decoded parameters within the decoded extended background image, distorting the background picture to an extent indicated by the decoded distortion value, and composing the distorted background picture with the decoded foreground picture.
In the method as described above, the distortion value that represents the amount of lens distortion is coded and transmitted from the coder end, and is decoded and used to distort the background picture at the decoder end. This makes it possible to produce a composed image of the background picture and the foreground picture having substantially the same amount of lens distortion, thereby suppressing unrealistic appearance in the reconstructed image.
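A corresponding sketch for the distortion case is given below. It assumes a one-parameter radial warp applied by nearest-neighbour resampling, with the lens distortion value interpreted as the coefficient `k`; the warp is defined in the sampling direction (each output pixel looks up a source pixel), which is a simplification chosen for brevity. The function names and the mask-based composition are hypothetical.

```python
def apply_radial_distortion(image, k):
    """Warp a grayscale image with an assumed one-parameter radial model.

    Each output pixel at normalized radius r samples the input picture at
    radius r * (1 + k * r^2). Nearest-neighbour sampling is used, and source
    coordinates are clamped to the image borders.
    """
    h, w = len(image), len(image[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    norm = max(cx, cy) or 1.0  # avoid division by zero for a 1x1 image
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            nx, ny = (x - cx) / norm, (y - cy) / norm
            scale = 1.0 + k * (nx * nx + ny * ny)
            sx = min(max(int(round(cx + nx * scale * norm)), 0), w - 1)
            sy = min(max(int(round(cy + ny * scale * norm)), 0), h - 1)
            row.append(image[sy][sx])
        out.append(row)
    return out


def compose(background, foreground, mask):
    """Take the foreground pixel where the mask is set, else the background pixel."""
    return [[f if m else b for b, f, m in zip(brow, frow, mrow)]
            for brow, frow, mrow in zip(background, foreground, mask)]


# Extracted 5x5 background picture whose pixel value equals its column index.
background = [[x for x in range(5)] for _ in range(5)]
foreground = [[9] * 5 for _ in range(5)]
mask = [[(x == 2 and y == 2) for x in range(5)] for y in range(5)]

unchanged = apply_radial_distortion(background, 0.0)
frame = compose(apply_radial_distortion(background, -0.5), foreground, mask)
```

With `k` equal to zero the background picture is returned unchanged, so a decoder built this way degrades gracefully when no lens distortion is signalled.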