During the past 30 years, computer systems have evolved from relatively simple processing engines with limited memories and mass-storage capacities that primarily operated on alpha-numeric input, text files, and numeric data files to high-powered, multi-processor processing engines that access vast local memories and high-capacity local mass storage devices via internal buses as well as vast remote memories and extremely high-capacity mass storage devices via various types of external communications media. Modern computers are capable of storing, managing, and accessing terabytes and even petabytes of a wide variety of different types of digitally encoded data, including video and audio data, photographic images, text-based and numeric data, and many types of complex data objects generated, stored, managed, and retrieved by a variety of different data management applications and systems. Many modern data management systems provide various types of indexing and data-object-locating facilities. For example, attribute values for attributes associated with a data object can be assigned to the data object during or following storage of the data object, and query-based data-management and data-retrieval facilities provided by modern data management systems can locate data objects having attributes with attribute values that satisfy criteria expressed in attribute-value-based queries.
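The attribute-value-based querying described above can be pictured with a minimal Python sketch. It is illustrative only and is not the interface of any particular data management system; the `DataObject` class, the `query` function, and the attribute names `kind` and `year` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    """A stored object plus the attribute values assigned to it (hypothetical model)."""
    name: str
    attributes: dict = field(default_factory=dict)


def query(objects, **criteria):
    """Return the objects whose attributes equal every requested value."""
    return [
        obj for obj in objects
        if all(obj.attributes.get(key) == value for key, value in criteria.items())
    ]


# A tiny made-up library; attribute values are short primitives only.
library = [
    DataObject("img_001.jpg", {"kind": "photo", "year": 2004}),
    DataObject("report.txt", {"kind": "text", "year": 2004}),
    DataObject("img_002.jpg", {"kind": "photo", "year": 2005}),
]

matches = query(library, kind="photo", year=2004)
print([obj.name for obj in matches])
```

A query of this kind can only find objects through attributes that someone chose to assign when the objects were stored or afterward.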
Unfortunately, the capacities of modern computer-based data-object storage, management, and retrieval systems often exceed the capabilities of the data-object location facilities provided by these systems. Attribute values may be constrained to relatively short text strings, integer values, and other primitives which lack the expressive power, flexibility, and natural-language capabilities needed by human users to classify data objects for storage, retrieval, and location.
As one example, it may be exceedingly difficult for a human user to formulate queries using relational-database query languages or other such simple, algebraic query languages in order to find one or a few photographic images within a large database containing hundreds of thousands of photographic images. The user would need to understand and remember the various types of attributes and attribute values that have been associated with photographic images within the database in order to formulate queries to find photographic images. Moreover, many of the queries that a user might want to make would succeed only if attributes and attribute values had previously been assigned to data objects with an extremely high level of foresight, and may require very complex queries as well as procedural techniques for directly querying the content of photographic images.
For instance, a user may desire to find all photographic images within a library that include sub-images of a child between the ages of two and four playing with a beach ball. Although it is possible that a Boolean-valued attribute child_playing_with_a_beach_ball_included may have been associated with each photographic image, it is highly unlikely that attributes of such particularity would have been specified during photographic-image storage and characterization operations. In the case that titles have been stored for each photographic image, it might be possible to locate candidate photographic images by retrieving photographic images that include the phrase “beach ball” within the titles, but the list of photographic images satisfying that criterion would almost certainly be vastly over-inclusive as well as vastly under-inclusive. Many of the retrieved images might, for example, include sub-images of beach balls without children, or with children outside the specified age range of two to four. On the other hand, many images that do include the desired sub-image might have titles that do not include the phrase “beach ball,” such as “Aunt Alice's Big Day at the Beach.”
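The over-inclusive and under-inclusive behavior of a title search can be shown with a small Python sketch. Apart from “Aunt Alice's Big Day at the Beach,” the titles are invented for illustration, as is the flag recording whether each image actually contains the desired sub-image. The function name `title_search` is likewise hypothetical.

```python
# Each entry: (title, whether the image really shows a child aged two to four
# playing with a beach ball). The flags are made-up ground truth.
images = [
    ("Beach ball on the sand", False),
    ("Teenagers with a beach ball", False),
    ("Toddler and beach ball", True),
    ("Aunt Alice's Big Day at the Beach", True),
]


def title_search(images, phrase):
    """Return the titles that contain the phrase, ignoring letter case."""
    phrase = phrase.lower()
    return [title for title, _ in images if phrase in title.lower()]


hits = title_search(images, "beach ball")
relevant = [title for title, wanted in images if wanted]

# Over-inclusive: retrieved, but not what the user wanted.
false_positives = [title for title in hits if title not in relevant]
# Under-inclusive: wanted, but never retrieved.
missed = [title for title in relevant if title not in hits]

print("retrieved:", hits)
print("false positives:", false_positives)
print("missed:", missed)
```

Even in this four-image library, the phrase search returns two unwanted images and misses one wanted image, because a title is only a loose proxy for image content.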
Alternatively, a procedure could be developed to electronically access a photographic image and search the image for sub-images of small children playing with beach balls. However, such procedures would be extremely costly and time-consuming to develop, and applying such a procedure to all of the images in a large image database, or image library, would consume prodigious amounts of processing cycles and processing time, resulting in impractical searches or searches that could simply not be performed, even with unlimited financial resources. The data-storage requirements for storing a sufficiently large number of such specialized procedures would generally be prohibitive as well, and could easily exceed the data storage used to store the photographic images.
Thus, current techniques by which human users can locate photographic images within photographic-image libraries, and other types of complex data objects within other types of complex-data-object libraries, are often inadequate. As increasingly complex software applications generate greater and greater amounts of data of ever-increasing complexity, the need for better methods to allow users to locate particular data objects within large data-object libraries is rapidly increasing, and has been identified as a critical problem in a variety of fields, from database management systems and electronic-data archiving systems to management and processing of scientific data and development of internet search engines.