1. Field of the Invention
The present invention relates generally to the processing of a media file, and more particularly, to a method and an apparatus for providing and processing a media file for an Augmented Reality (AR) service.
2. Description of the Related Art
Augmented Reality refers to a technology that displays a virtual object (i.e., an AR object) overlapping real-world images that a user can see. Augmented reality synthesizes real-time image/voice information with the virtual object or related information to provide an augmented information service, and is also called Mixed Reality (MR) in the sense that it expands human senses and recognition. In particular, since mobile terminals and smart phones having built-in sensors such as a camera or a Global Positioning System (GPS) are widely distributed, and various consolidated services using high-speed mobile Internet have been introduced, the use of augmented reality services on mobile devices is rapidly increasing.
The International Organization for Standardization (ISO) base media file format (FF) defines a general structure for time-based multimedia files such as video or audio files, and serves as the basis for other file formats such as the MPEG (Moving Picture Experts Group)-4 (MP4) file format or the 3GPP (3rd Generation Partnership Project) file format.
FIG. 1 illustrates a logical configuration of an ISO based media file in the prior art. As shown in FIG. 1, a media file 100 is divided into a file header area 102, a metadata area 104, and a media data area 106.
The file header area 102 includes basic information of content included in the media file. For example, information such as a content identifier, a content manufacturer, and a manufacture time may be included in the file header. When the media file is divided into a plurality of tracks or streams, map configuration information of each track may be further included.
The metadata area 104 includes individual information on each of a plurality of media objects of the content included in the media file. Various pieces of profile information for decoding the media object and information about the location of the media object are included. Here, the media object is the minimum unit of content. In the case of video, an image frame displayed on a screen in every unit period may be the media object, and in the case of voice, an audio frame reproduced in every unit period may be the media object. A plurality of media objects may exist in each track, and the information needed to reproduce the media objects is included in the metadata area 104.
The media data area 106 is an area in which the media object is actually stored.
The physical structure of the ISO based media file is composed of boxes. An individual box may consist of related data and lower ranking boxes, or may exist as a container box comprised of only lower ranking boxes. For example, a track schematically shown in FIG. 1 is physically stored in a track box, and the track box is a container box comprised of various lower ranking boxes which store track header information, media information, media decoding information, etc.
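The nested box layout described above can be illustrated with a minimal sketch. It assumes the box header layout of ISO/IEC 14496-12 (a 32-bit big-endian size followed by a four-character type, with a size of 1 signalling a 64-bit size field and a size of 0 meaning the box runs to the end of the enclosing range). The function names `iter_boxes` and `make_box`, and the short list of container types, are illustrative choices and do not come from the text; the box types `ftyp`, `moov`, `trak`, `tkhd`, and `mdat` are likewise taken from that standard and correspond only loosely to the areas of FIG. 1.

```python
import struct
from typing import Iterator, Optional, Tuple

# Box types treated as pure containers in this sketch (an assumption;
# the standard defines more container types than are listed here).
CONTAINER_TYPES = {b"moov", b"trak", b"mdia", b"minf", b"stbl"}


def iter_boxes(
    buf: bytes, start: int = 0, end: Optional[int] = None, depth: int = 0
) -> Iterator[Tuple[int, str, int, int]]:
    """Yield (depth, type, payload_offset, payload_size) for each box found."""
    if end is None:
        end = len(buf)
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", buf, pos)
        header = 8
        if size == 1:
            # A 64-bit size follows the four-character type.
            if pos + 16 > end:
                break
            (size,) = struct.unpack_from(">Q", buf, pos + 8)
            header = 16
        elif size == 0:
            # The box extends to the end of the enclosing range.
            size = end - pos
        if size < header or pos + size > end:
            break  # malformed box; stop rather than guess
        yield depth, box_type.decode("ascii", "replace"), pos + header, size - header
        if box_type in CONTAINER_TYPES:
            # A container box holds only lower ranking boxes.
            yield from iter_boxes(buf, pos + header, pos + size, depth + 1)
        pos += size


def make_box(box_type: bytes, payload: bytes = b"") -> bytes:
    """Build a box with a 32-bit size header (hypothetical helper for the demo)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


if __name__ == "__main__":
    track = make_box(b"trak", make_box(b"tkhd", b"\x00" * 4))
    sample = (
        make_box(b"ftyp", b"isom" + b"\x00" * 4)
        + make_box(b"moov", track)
        + make_box(b"mdat", b"\x01\x02")
    )
    for depth, name, offset, size in iter_boxes(sample):
        print("  " * depth + f"{name} (payload {size} bytes at offset {offset})")
```

The recursion mirrors the container relationship: the track box is visited as a child of its parent, and its own lower ranking boxes are listed one level deeper.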
A conventional ISO based media file does not define any meta information needed for providing the augmented reality service, nor does it provide a method of synthesizing multimedia content included in different layers, i.e., an instruction method for reproducing an image and the virtual object in overlapping relation. Therefore, the conventional ISO based media file is of limited use for the augmented reality service.
In the conventional technology, an apparatus for reproducing the media file analyzes the image currently being reproduced in real time, extracts an area for displaying the virtual object, displays the virtual object on the corresponding area, and then synthesizes the virtual object with the image being reproduced to provide a final image to a user. Extracting the area of the image in which the virtual object is to be displayed is a core technology of augmented reality, and is divided into a marker based technology and a non-marker based technology. In marker based augmented reality, the apparatus recognizes an image that includes a particular marker such as a black and white pattern or a barcode, determines the relative coordinates of the area in which the virtual object is to be displayed, and displays the virtual object based thereon. In non-marker based augmented reality, by contrast, an object within the image is directly identified and related information is obtained.
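The last two steps of the marker based flow, determining the display area relative to the marker and synthesizing the virtual object with the image, can be sketched as follows. This is only an illustration under stated assumptions: marker recognition is assumed to have already produced a bounding box, frames are modelled as nested lists of RGB tuples, and the names `overlay_region` and `composite` as well as the uniform `alpha` blending are hypothetical and not taken from the text.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Frame = List[List[Pixel]]
Box = Tuple[int, int, int, int]  # (x, y, width, height)


def overlay_region(marker: Box, offset: Tuple[int, int], size: Tuple[int, int]) -> Box:
    """Place the display area relative to the marker's top-left corner."""
    mx, my, _, _ = marker
    dx, dy = offset
    w, h = size
    return (mx + dx, my + dy, w, h)


def composite(frame: Frame, obj: Frame, region: Box, alpha: float = 1.0) -> Frame:
    """Blend the virtual object over the frame inside the given region.

    Assumes obj has at least region-height rows and region-width columns.
    Pixels that fall outside the frame are skipped.
    """
    x0, y0, w, h = region
    out = [row[:] for row in frame]  # leave the input frame untouched
    for j in range(h):
        for i in range(w):
            y, x = y0 + j, x0 + i
            if 0 <= y < len(out) and 0 <= x < len(out[y]):
                fg, bg = obj[j][i], out[y][x]
                out[y][x] = tuple(
                    round(alpha * f + (1.0 - alpha) * b) for f, b in zip(fg, bg)
                )
    return out


if __name__ == "__main__":
    frame = [[(0, 0, 0)] * 4 for _ in range(4)]        # 4x4 black image
    logo = [[(255, 255, 255)] * 2 for _ in range(2)]   # 2x2 white virtual object
    marker_box = (1, 1, 1, 1)                          # assumed detector output
    region = overlay_region(marker_box, offset=(0, 0), size=(2, 2))
    for row in composite(frame, logo, region):
        print(row)
```

In a real reproducing apparatus the marker box would come from an image recognition step performed on every frame, which is where the real-time analysis cost arises.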
The former is advantageous in that the location at which the virtual object is to be placed can be determined relatively accurately compared to the non-marker based technology; however, a natural image cannot be provided because the marker always needs to be included within the image. In the latter case, the marker does not need to be inserted within the image, and thus the image is more natural than a marker based image. Also, the augmented reality service may be provided for a conventional media file that was written without considering augmented reality. However, since an object within the image must be recognized in real time to extract a feature point at which the virtual object is displayed, accuracy is relatively lower than with the marker based technology. In addition, most of the non-marker based feature point extraction methods currently proposed require a significant amount of computation in the receiving terminal, which means that the quality of the augmented reality service may vary depending on the feature point extraction algorithm applied and the computation capacity of the receiving apparatus (or reproducing apparatus).
On the other hand, in a conventional augmented reality service, the virtual object and additional information displayed to a user are determined by the application program which provides the corresponding service. For example, in the case of an augmented reality service which intermittently provides a logo of a related company for the purpose of promoting a product, the logo displayed to the user is determined entirely by the application program. Therefore, in the conventional augmented reality technology, an advertisement image is displayed regardless of the content of the image actually being reproduced, and thus the effectiveness of the advertisement may be reduced.