In recent years, the popularity of digital photography has increased as digital cameras have become more affordable and more powerful, at least in terms of image storage and resolution. Extensive databases of digital pictures are created for both personal and commercial purposes. Because of the low cost of generating an image as an electronic file and the convenience of storing a large number of pictures in electronic form, most people and businesses are taking advantage of this digital phenomenon.
However, such a large amount of data becomes difficult to manage, organize or index. For example, most users would like to be able to tag an image with various pieces of information, for remembering different landmarks present in the image or the names of the persons present in the image. Methods for tagging pictures are available; see, for example, Lloyd-Jones et al., U.S. Patent Application Publication No. 2002/0055955, Simske, U.S. Patent Application Publication No. 2004/0049734, Anderson, U.S. Patent Application Publication No. 2005/0174430, Bhalotia et al., U.S. Patent Application Publication No. 2007/0043748, and Shneiderman, U.S. Pat. No. 7,010,751, the entire disclosures of which are incorporated here by reference. However, the existing techniques only allow the user to add predefined boxes at various locations of an image and to type desired text inside the predefined boxes. All of these techniques superimpose the inserted predefined boxes over the existing pictures and save the added data as metadata. Thus, the tagging is achieved by associating an X and Y position on the image with the added data, where the X and Y position on the image is selected by the user with the help of a mouse. In this respect, FIG. 1 shows an image 10 displayed on a screen (not shown). The image includes two persons 12 and 14 and a landmark 16. The user, by using the mouse, moves a cursor 18 from a predefined menu 20 over the image 10, selecting the X and Y position 22 for the placement of a predefined box. The predefined box 24 is selected from the menu 20 and added at the X and Y position 22, as shown in FIG. 2. Then, the user may type desired text inside the box 24.
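For illustration only, the position-based tagging described above may be sketched as a small data structure in which each predefined box stores the user-selected X and Y position together with the typed text, and the boxes are kept as metadata alongside the image. The names BoxTag, TaggedImage and add_tag, the file name, and the pixel coordinates are hypothetical and are not taken from any of the cited references.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BoxTag:
    """One predefined box: a user-selected X and Y position plus typed text."""
    x: int
    y: int
    text: str


@dataclass
class TaggedImage:
    """An image together with the tag metadata superimposed on it."""
    filename: str
    width: int
    height: int
    tags: List[BoxTag] = field(default_factory=list)

    def add_tag(self, x: int, y: int, text: str) -> BoxTag:
        # The position must lie inside the image (assumed pixel coordinates).
        if not (0 <= x < self.width and 0 <= y < self.height):
            raise ValueError("tag position lies outside the image")
        tag = BoxTag(x, y, text)
        self.tags.append(tag)
        return tag


# Hypothetical usage: the user places one box and types a name into it.
image = TaggedImage("image_10.jpg", width=640, height=480)
image.add_tag(120, 200, "person 12")
```

In such a scheme every tag is tied to a single point of a single image, which reflects the limitation noted in this section: the tags describe positions rather than the content of selected parts of the image.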
According to this approach, the user needs to perform the tagging on a computer system that offers the above-discussed functionality. Therefore, the tagging operation is limited to computer systems and is not supported by a TV set. In addition, the tagging operation maps multiple predefined boxes to a single image, which limits the analysis of selected parts of the image.
A user may also want to use the large collection of digital images, for example, to find the same person in different images. In other words, the user may, for example, want to identify all the images that include his mother. One alternative is to look through all the images and mark those that include his mother. This approach is time-consuming and thus undesirable. Another alternative is to add tags, as discussed above, to each picture and to describe within those tags the persons and landmarks present in the pictures. However, this metadata has to be entered prior to searching, which is a challenging task.
In another context, the user may also want to be able to buy products that are displayed in images presented on a TV set or a computing device. These images may include movies, videos, advertisements, interviews, or any other kind of information that is presented on a TV set or a computer screen. In this respect, U.S. Patent Application Publication No. 2007/0078774, to Brown, the entire content of which is incorporated here by reference, discloses a method and apparatus for identifying products in a media program and making such products available for consumer purchase.
Brown discloses that dedicated devices (BeamBack) are interposed between the consumer and the device presenting the media program, and also that the content of the media program is linked a priori to product metadata. The scenes of the media program include identifiers that are linked to other identifiers corresponding to still images, which include objects presented in the media program. When a consumer is watching the media program, the consumer may press a button on a device to show his interest in objects presented in a particular scene. Based on the correspondence between the identifiers of the media program and the still images, the appropriate still images are shown to the consumer together with the corresponding metadata.
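By way of a rough, non-authoritative sketch, the a priori linking summarized above may be pictured as two lookup tables prepared before the media program is shown: one linking scene identifiers to still-image identifiers, and one linking still-image identifiers to product metadata. This is not Brown's implementation; all identifiers, table names and metadata fields below are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical tables that must exist before the media program is presented.
SCENE_TO_STILLS: Dict[str, List[str]] = {
    "scene-001": ["still-A", "still-B"],
    "scene-002": [],  # a scene for which no links were prepared
}
STILL_METADATA: Dict[str, Dict[str, str]] = {
    "still-A": {"product": "example product 1"},
    "still-B": {"product": "example product 2"},
}


def stills_for_scene(scene_id: str) -> List[Tuple[str, Dict[str, str]]]:
    """Return the still images, with their metadata, linked to a scene."""
    still_ids = SCENE_TO_STILLS.get(scene_id, [])
    return [(sid, STILL_METADATA[sid]) for sid in still_ids
            if sid in STILL_METADATA]


# Hypothetical usage: the consumer presses the button during scene-001.
shown = stills_for_scene("scene-001")
```

A scene without prepared links, or an unknown scene identifier, yields an empty list, so nothing can be shown for content that was not linked in advance.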
FIG. 3 shows how an image 10 displayed on a screen has selectable predefined regions 26 and 28 that correspond to advertised objects, a purse 26 and a shoe 28 in this example. The consumer may use the mouse to move the cursor 18 over the purse 26 to request more information about the purse. This process is possible because a priori links have been established between the purse 26 and the requested information. In this regard, it is noted that the consumer receives no information about a hat 30 displayed in the image 10 if no prior links have been established between the hat 30 and the corresponding information.
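The behavior illustrated in FIG. 3 may be sketched, under stated assumptions, as a hit test against a fixed list of predefined regions. The rectangular region shape, the pixel coordinates, the information strings, and the names Region and lookup_info are hypothetical and serve only to show why an object without a prior link returns nothing.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Region:
    """A predefined selectable region with information linked a priori."""
    name: str
    left: int
    top: int
    right: int
    bottom: int
    info: str

    def contains(self, x: int, y: int) -> bool:
        # Half-open rectangle: the right and bottom edges are excluded.
        return self.left <= x < self.right and self.top <= y < self.bottom


def lookup_info(regions: List[Region], x: int, y: int) -> Optional[str]:
    """Return the linked information for the region under the cursor, if any."""
    for region in regions:
        if region.contains(x, y):
            return region.info
    return None


# Hypothetical layout: only the purse 26 and the shoe 28 have prior links.
regions = [
    Region("purse 26", 50, 300, 150, 400, "information about the purse"),
    Region("shoe 28", 400, 380, 500, 460, "information about the shoe"),
]
purse_info = lookup_info(regions, 100, 350)  # cursor 18 over the purse
hat_info = lookup_info(regions, 300, 40)     # cursor 18 over the hat 30
```

Because the hat 30 has no entry in the list of regions, the lookup for a cursor position over the hat yields no information, which mirrors the limitation described above.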
However, according to this approach, the consumer needs a special device and, additionally, both a priori links and product metadata need to be in place before the image is displayed, which might be expensive for the content providers and inconvenient for the consumer. Additionally, it may be desirable to provide methods and systems that provide information about displayed items without using menus and/or text boxes, for example, methods and systems that enable a user to select an item from a movie without distracting overlays being present.
Accordingly, it would be desirable to provide systems and methods that avoid the above-noted limitations of the existing systems.