1. Field of the Invention
This invention generally relates to methods and apparatuses for image identification, and more specifically to methods and apparatuses for identifying images containing an Embedded Media Marker (EMM).
2. Description of the Related Art
The techniques of linking dynamic media with a static paper document through devices such as camera phones can be applied to many interesting applications, such as multimedia-enhanced books and multimedia advertisements on paper. For example, two-dimensional barcodes can be printed on such static paper documents and are easily recognized by modern camera phones. However, barcodes tend to be visually obtrusive and interfere with the document layout when associated with specific document content.
Other systems rely on the document content for identification. For example, visual features within the document can be utilized to identify the document. It is also possible to link media to text on the static paper document by utilizing features based on the word bounding boxes of the document (boxes that surround one or more words of a static paper document). However, these methods fail to achieve good accuracy and scalability unless they provide guidance as to which content within the static paper document can potentially link to media information. Specifically, if such guidance is not adequately provided to users, an aimlessly captured query image that is submitted for identification may contain various distortions that lead to low identification accuracy. Similarly, without such indications, previous systems have needed to characterize and index entire document pages for proper identification, thereby incurring high time and memory costs for large datasets.
To address these problems, index indicators such as Embedded Media Markers (EMMs) have been utilized for identification purposes. EMMs are nearly transparent markers printed on paper documents at certain locations which are linked with additional media information. Analogous to hyperlinks, EMMs indicate the existence of links. An EMM-signified patch overlaid on the document can be captured by the user with a camera phone in order to view associated digital media. Once the EMM-signified patch is captured by the camera phone, the captured image can be compared to a database of EMMs or index indicators for identification, and the identification result can be utilized to retrieve the appropriate digital media.
FIG. 1 displays a sequence of a conventional process using an EMM, with an example document 100 with an EMM overlaid at the top right corner 101. The user takes a close-up of an EMM-signified patch 102 on the example document. By using the EMMs, only the EMM-signified patches need to be characterized and indexed. This can greatly reduce feature extraction time and memory usage and further enhance accuracy by excluding noisy features of contents outside the EMM.
Subsequently, at the identifying stage, the EMMs can guide users to capture an EMM-signified region, yielding a query image with far fewer distortions 103. After a sufficient query image is obtained, the next task of EMM identification is to recognize the camera-phone-captured query image as an original EMM-signified patch indexed in the dataset, so as to retrieve and play relevant media on cell phones 104.
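By way of illustration only, the indexing step of the conventional process described above, in which only the EMM-signified patches are characterized, may be sketched as follows. The sketch is in Python and rests on assumptions not found in this description: features are represented as hypothetical (x, y, descriptor) tuples, the EMM-signified patch is assumed to be an axis-aligned rectangle, and the function name is chosen for illustration.

```python
def features_in_patch(features, patch):
    """Keep only the features located inside an EMM-signified patch.

    features: list of (x, y, descriptor) tuples (hypothetical representation).
    patch: (left, top, right, bottom) rectangle in page coordinates,
           assumed axis-aligned for this sketch.
    """
    left, top, right, bottom = patch
    return [
        (x, y, descriptor)
        for (x, y, descriptor) in features
        if left <= x <= right and top <= y <= bottom
    ]


# Example: a page with three features, one of which lies inside an
# EMM-signified patch near the top right corner of the page.
page_features = [(5, 5, "a"), (50, 50, "b"), (95, 10, "c")]
emm_patch = (80, 0, 100, 20)
indexed = features_in_patch(page_features, emm_patch)
```

In this example only one of the three page features would be stored, which reflects the reduction in memory usage and the exclusion of features outside the EMM noted above.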
EMMs can be represented as meaningful-awareness markers overlaid on the original paper document to guide image capture and limit processing cost. However, current EMM identification systems still rely strictly on general local-feature-based matching approaches, such as strict comparison of geographical features, without considering any particular matching constraints. Such strict comparison of geographical features suffers from low accuracy and high memory/time complexity in practice.
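For illustration, a general local-feature-based matching approach that applies no matching constraints may be sketched as below. This is a minimal sketch under stated assumptions rather than the implementation of any particular system: descriptors are assumed to be numeric tuples compared by Euclidean distance, each query descriptor votes for the indexed patch containing its nearest descriptor, and the names `identify_patch` and `index` are hypothetical.

```python
from collections import Counter
from math import dist


def identify_patch(query_descriptors, index):
    """Return the id of the indexed patch that receives the most votes.

    query_descriptors: list of numeric tuples from the query image.
    index: dict mapping patch id -> list of descriptor tuples
           (hypothetical layout, assumed for this sketch).
    No constraint on the spatial arrangement of matches is applied.
    """
    votes = Counter()
    for q in query_descriptors:
        best_id, best_distance = None, float("inf")
        # Exhaustive search: every query descriptor is compared
        # against every descriptor stored in the index.
        for patch_id, descriptors in index.items():
            for d in descriptors:
                distance = dist(q, d)
                if distance < best_distance:
                    best_id, best_distance = patch_id, distance
        if best_id is not None:
            votes[best_id] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Example with two indexed patches and a three-descriptor query.
index = {
    "emm_a": [(0.0, 0.0), (1.0, 1.0)],
    "emm_b": [(10.0, 10.0), (11.0, 11.0)],
}
query = [(0.1, 0.2), (0.9, 1.1), (10.2, 9.9)]
result = identify_patch(query, index)
```

Because each query descriptor is compared against the entire index and each vote is counted regardless of where the matched feature lies, a sketch of this kind scales poorly with dataset size and can be misled by spurious matches.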
Therefore, there is a need for an identification scheme which provides for high accuracy with low memory and time complexity.