Recent years have seen rapid technological development in the arena of digital visual media searching. Indeed, as a result of the proliferation of personal computing devices and digital cameras, individuals and businesses now routinely manage large repositories of digital images and digital videos. Accordingly, digital visual media searching has become a ubiquitous need for individuals and businesses in a variety of scenarios, ranging from casual users seeking to locate specific moments in a personal photo collection to professional graphic designers sorting through stock images to enhance creative projects.
In response, developers have created a variety of digital search systems that can search digital visual media. In large part, these conventional digital search systems fall within two major search paradigms: text-based search (i.e., systems that utilize a keyword to search a repository of digital images) and similar-image search (i.e., systems that utilize an existing digital image to search for similar digital images). Although these conventional digital search systems are capable of identifying digital visual media portraying certain content, they also have a number of shortcomings. For example, although conventional digital search systems are able to identify content in digital images, they are unable to efficiently identify digital visual content reflecting a particular spatial configuration.
To illustrate, users often seek to find digital images with a specific visual arrangement of objects. For example, a professional designer may need a digital image portraying a specific object in a particular location for a creative project. Existing digital search systems allow users to search for digital images portraying specific content, but cannot accurately identify digital images based on spatial arrangement.
To illustrate this point, FIGS. 1A and 1B show the results of conventional search systems for an image of a person holding a tennis racket on their left. FIG. 1A illustrates the results of a conventional text-based search, while FIG. 1B illustrates the results of a conventional similar-image search. As FIG. 1A illustrates, a word query 102 is limited in its capability to reflect spatial features in a search. Specifically, the word query 102 can describe desired content (i.e., “Tennis Racket”), but fails to provide an avenue for imposing accurate spatial constraints. Indeed, although the word query 102 includes text describing a particular configuration (i.e., “Left”), such a term fails to translate into a meaningful search result. Thus, as shown, the word search results 102a portray digital images that include tennis rackets; however, the spatial configuration of the tennis rackets portrayed within the digital images is haphazard. Accordingly, a user seeking a picture of a person holding a tennis racket to their left will have to sort through the word search results 102a in an attempt to find a digital image that matches the desired spatial arrangement.
Similarly, as shown in FIG. 1B, the image query 104 is limited in its ability to reflect spatial information in a search. As an initial matter, to search for a digital image of a person holding a tennis racket on their left, the image query 104 requires an image of a person holding a tennis racket on their left. Of course, this imposes a significant inconvenience on the user, inasmuch as the lack of an example digital image is the very reason for conducting a search in the first place. Even assuming, however, that a user already has an image of a person holding a tennis racket on their left to generate the image query 104, the image query 104 fails to adequately incorporate spatial concepts into the search. Indeed, although the image search results 104a generally include tennis rackets and tennis players, the image search results 104a portray tennis rackets in a variety of different spatial configurations. Thus, a user seeking a picture of a person holding a tennis racket to their left will have to sort through the image search results 104a in an attempt to find a digital image that matches the desired spatial arrangement.
As FIGS. 1A and 1B show, conventional digital search systems generally lack the ability to return accurate search results for images with a particular spatial arrangement of objects.