1. Field of the Invention
The present invention generally concerns information visualization, and interactive techniques and interactive environments for the manipulation of images and image data in computers.
The present invention more specifically concerns (i) particular orderings of image databases in which images are searched by their perceptual characteristics--called perceptual databases--and (ii) particular metrics and methods for a man-computer interface--called a "display space"--in which a user may navigate among the (ordered) images of the (perceptual) database, and among the results of computerized searches of the images of the (perceptual) database.
2. Description of the Prior Art
2.1 Background References to the Explanation of the Invention
The present specification makes abundant reference to certain papers and textbooks that also form the most pertinent prior art to the present invention. In order to make convenient the frequent citation of these references hereinafter, in the SUMMARY OF THE INVENTION section and also within the DESCRIPTION OF THE PREFERRED EMBODIMENT of this specification, the references are first listed in this section 2.1.
(Note that, of the following references, references "Santini 96a", "Santini 96b", "Santini 96c" and "Santini 97a"--some of which have not yet appeared in printed form, but are at the time of the filing of this specification slated for publication--are the inventor's own work, and are thus not prior art. These references are included in the present and following sections so that they may be optimally current and comprehensive.)
Reference "Ashby 88" is F. Gregory Ashby and Nancy A. Perrin. Toward a unified theory of similarity and recognition. Psychological Review, 95(1): 124-150, 1988.
Reference "Beck 82" is J. Beck. Textural segmentation. In J. Beck, editor, Organization and representation in perception. Erlbaum, 1982.
Reference "Boothby 75" is William M. Boothby. An Introduction to Differentiable Manifolds and Riemannian Geometry. Pure and Applied Mathematics. Academic Press, 1975.
Reference "Bruce 85" is Vicki Bruce and Patrick Green. Visual Perception: Physiology, Psychology, and Ecology. Lawrence Erlbaum Associates, 1985.
Reference "Chang 95" is Shih Fu Chang and John R. Smith. Extracting multi-dimensional signal features for content-based visual query. In SPIE Symposium on Communications and Signal Processing. 1995.
Reference "Chen 96" is H. Chen, B. Schatz, T. Ng, J. Martinez, A. Kirchoff, and C. Lin. A parallel computing approach to creating engineering concept spaces for semantic retrieval: The Illinois digital library initiative project. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(8), August 1996.
Reference "Flicker 95" is Myron Flickner, Harpreet Sawhney, Wayne Niblack, Jonathan Ashley, Qian Huang, Byron Dom, Monika Gorkani, Jim Hafner, Denis Lee, Dragutin Petkovic, David Steele, and Peter Yanker. Query by image and video content: The QBIC system. IEEE Computer, 1995.
Reference "Hailman 77" is J. P. Hailman. Optical Signals: Animal communication and light. Indiana University Press, 1977.
Reference "Alston 45" is Alston S. Householder and Herbert D. Landahl. Mathematical biophysics of the central nervous system. Principia Press, Bloomington, Ind., 1945.
Reference "Hsu 96" is Chih-Cheng Hsu, Wesley W. Chu, and Ricky K. Taira. A knowledge-based approach for retrieving images by content. IEEE Transactions on Knowledge and Data Engineering, 8(4): 522-532, August 1996.
Reference "Hubel 67" is D. H. Hubel and T. N. Wiesel. Receptive fields and functional architecture of monkey striate cortex. Journal of Physiology (London), 195, 1967.
Reference "Idris 95" is F. Idris and S. Panchanathan. Image indexing using wavelet vector quantization. In Proceedings of the SPIE Vol. 2606-Digital Image Storage and Archiving Systems, Philadelphia, Pa., USA, 25-26 October, pages 269-275. 1995.
Reference "Jacobs 95" is Charles E. Jacobs, Adam Finkelstein, and David H. Salesin. Fast multiresolution image querying. In Proceedings of SIGGRAPH 95, Los Angeles, Calif. ACM SIGGRAPH, New York, 1995.
Reference "Krumhansl 78" is Carol L. Krumhansl. Concerning the applicability of geometric models to similarity data: The interrelationship between similarity and spatial density. Psychological Review, 85: 445-463, 1978.
Reference "Lovelock 89" is D. Lovelock and H. Rund. Tensors, Differential Forms, and Variational Principles. Dover Books on Advanced Mathematics, 63. Dover Publications, Inc., New York, 1975, 1989.
Reference "Malik 90" is Jitendra Malik and Pietro Perona. Preattentive texture discrimination with early vision mechanisms. Journal of the Optical Society of America A, 7(5), 1990.
Reference "Mason 91" is Carol Mason and Eric R. Kandel. Central visual pathways. In Eric R. Kandel, James H. Schwartz, and Thomas M. Jessell, editors, Principles of Neural Science, chapter 30, pages 420-439. Appleton & Lange, 1991.
Reference "Nabil 96" is Mohammad Nabil, Anne H. H. Ngu, and John Shepherd. Picture similarity retrieval using the 2D projection interval representation. IEEE Transactions on Knowledge and Data Engineering, 8(4): 533-539, August 1996.
Reference "Okubo 87" is T. Okubo. Differential Geometry. Monographs and Textbooks in pure and applied mathematics. Marcel Dekker, Inc., 270 Madison Ave., New York 10016, 1987.
Reference "Olson 70" is R. R. Olson and F. Attneave. What variables produce similarity grouping? American Journal of Psychology, 83: 121, 1970.
Reference "Pentland 94" is A. Pentland, R. W. Picard, and S. Sclaroff. Photobook: Tools for content-based manipulation of image databases. In SPIE Conference on Storage and Retrieval of Images and Video Databases II, Volume 2185. San Jose, Calif., February 1994.
Reference "Santini 95" is Simone Santini and Ramesh Jain. Similarity matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1995. (Submitted).
Reference "Santini 96a" is Simone Santini and Ramesh Jain. Gabor space and the development of preattentive similarity. In International Conference on Pattern Recognition, Vienna, 1996. Available at http://www-cse.ucsd.edu/users/ssantini.
Reference "Santini 96b" is Simone Santini and Ramesh Jain. Similarity queries in image databases. In Proceedings of CVPR '96, International IEEE Computer Vision and Pattern Recognition Conference, 1996.
Reference "Santini 96c" is Simone Santini and Ramesh Jain. The graphical specification of similarity queries. Journal of Visual Languages and Computing, 1997 (to appear). Available at http://www-cse.ucsd.edu/users/ssantini.
Reference "Santini 97a" is Simone Santini and Ramesh Jain. Similarity is a geometer. Multimedia Tools and Applications, 1997 (to appear). Available at http://www-cse.ucsd.edu/users/ssantini.
Reference "Sawhney 96" is H. Sawhney and S. Ayer. Compact representation of videos through dominant and multiple motion estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(8), August 1996.
Reference "Shapley 90" is Robert Shapley, Terrence Caelli, Stephen Grossberg, Michael Morgan, and Ingo Rentschler. Computational theories of visual perception. In Lothar Spillman and John S. Werner, editors, Visual Perception: The Neurophysiological Foundation, pages 417-448. Academic Press, 1990.
Reference "Shepard 87" is Roger N. Shepard. Toward a universal law of generalization for psychological science. Science, 237: 1317-1323, 1987.
Reference "Stark 95a" is Hans-Georg Stark and Gernod P. Laufkotter. Image indexing and content based access to databases of medical images and wavelets. In Proceedings of the SPIE Vol. 2569-Wavelet Applications in Signal and Image Processing III, San Diego, Calif., USA, 12-14 July, pages 790-800, 1995.
Reference "Treisman 86" is A. Treisman. Features and objects in visual processing. Scientific American, 255: 114B-125, 1986.
Reference "Treisman 87" is A. Treisman. Properties, parts, and objects. In K. R. Boff, L. Kaufman, and J. P. Thomas, editors, Handbook of Perception and Human Performance. Wiley, 1987.
Reference "Tversky 77" is Amos Tversky. Features of similarity. Psychological Review, 84(4): 327-352, July 1977.
Reference "Tversky 70" is Amos Tversky and David H. Krantz. The dimensional representation and the metric structure of similarity data. Journal of Mathematical Psychology, 7: 572-597, 1970.
Reference "Van Essen 84" is David C. Van Essen, William T. Newsome, and John H. R. Maunsell. The visual field representation in striate cortex of the macaque monkey: Asymmetries, anisotropies, and individual variability. Vision Research, 24(5): 429-448, 1984.
Reference "Warner 94" is Frank W. Warner. Foundations of Differentiable Manifolds and Lie Groups. Graduate Texts in Mathematics, 94. Springer-Verlag, 1983.
Reference "Wilson 96" is Hugh R. Wilson, Dennis Levi, Lamberto Maffei, Jyrki Rovamo, and Russel DeValois. The perception of form, retina to striate cortex. In Lothar Spillman and John S. Werner, editors, Visual Perception: The Neurophysiological Foundation. Academic Press, 1990.
Reference "Yao 79" is Christopher H. Yeo. The anatomy of the vertebrate nervous system: an evolutionary and developmental perspective. In David A. Oakley and H. C. Plotkin, editors, Brain, Behavior, and Evolution, pages 28-51. Methuen, London, 1979.
The present invention will shortly be seen to concern certain new constructs for similarity-based image databases, and a new type of man-machine (man-computer) interface for navigation in a database of images.
"Image databases" are databases in which images are searched through their perceptual characteristics. These databases operate directly on the image data, and require no encoding or labeling by an operator. Databases based on this idea, or on similar ideas, which we call "perceptual databases", have received a lot of attention circa 1997. Reference, for example, Pentland 94, Stark 95a, Idris 95, Jacobs 95, Flicker 95, Chang 95, Nabil 96, Hsu 96, Chen 96, and Sawhney 96.
Working with perceptual data requires rethinking and re-analyzing many aspects of database organization that are usually taken for granted. Reference, for example, Santini 95, Santini 96a, Santini 96b, Santini 97a.
In particular, the historically fundamental operation in databases, the matching of a database item against a query, loses meaning at the perceptual level, and therefore image databases should not use it. Reference Santini 96a, Santini 96b, Santini 96c, and Santini 97a.
How are searches to be made in a database if things cannot be matched? It is one insight of the present invention that modern computer scientists may not be the first organisms in the history of planet earth looking for this answer. Many primitive animals that lacked a nervous system extensive enough to sustain complex perception and categorization faced the same problem a few hundred million years ago. The way biological evolution solved the problem is simple: instead of extracting and matching meanings, animals came to look for generic perceptual similarities between images. Reference Hailman 77 and Bruce 85.
Accordingly, the present invention finds it useful (i) to organize, and (ii) to interrogate, image databases to the same end. Rather than trying to give all possible meanings to an image--which is impossible even for humans--databases should be (i) organized and (ii) presented (which are separate, but related, things) so as to rely on simple perceptual cues, and on reasonable similarity measures.
As an aside, and in order to illustrate by example the complexity of ascribing meaning to an image, consider the (real) example of a picture of Stalin in the 1920's on a podium. This picture had originally been a picture of Trotsky standing with Stalin at a rally, but Trotsky had later been "removed" for propaganda reasons. Sometimes the information is not in what's in the picture, but in what isn't in the picture.
The real moment of this example--which may seem excessively subtle and arcane in illustration of the simple truth that it is difficult to describe images in words ("one picture is worth a thousand words")--will only later be brought home in this specification: it will later be explained that the incongruity of this very picture might (when co-located among others) prove to be detectable and perceivable (to an otherwise unknowing human) by the action of the methods of the present invention. To locate and to distinguish "what is not there" is, with all due modesty, "quite a trick". The promise of even this (minor, narrow) aspect of the present invention alone should pique the interest of the reader.
Continuing with biological "clues" to the organization of visual databases, it is the contention of the present invention that such databases should not try to recognize objects in the image, nor even attempt object segmentation, but should instead rely on a naive and acritical observation of the patterns of color and intensity in the image.
This kind of visual data assimilation and processing is common in animals, especially animals without a sophisticated central nervous system. Rudimentary as it may be, it has proven surprisingly effective in support of the necessities of these animals' life on earth. It corresponds roughly to what in humans is known as preattentive similarity. Reference Malik 90, Treisman 86, Treisman 87, Beck 82.
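By way of illustration only, the idea of ranking images by generic perceptual similarity, rather than by matching labels or recognized objects, can be sketched as follows. The sketch is not the similarity measure of the present invention: the coarse color histogram, the Euclidean distance, and the names `color_histogram`, `distance`, and `rank_by_similarity` are all assumptions chosen for brevity. Images are assumed to be given as lists of (R, G, B) tuples with 8-bit channels.

```python
from math import sqrt

def color_histogram(pixels, bins=4):
    """Coarse normalized RGB histogram (assumed feature; no objects, no labels)."""
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        # Quantize each 8-bit channel into `bins` levels and build a flat index.
        idx = ((r * bins // 256) * bins + (g * bins // 256)) * bins + (b * bins // 256)
        hist[idx] += 1.0
    total = len(pixels) or 1
    return [h / total for h in hist]

def distance(h1, h2):
    """Euclidean distance between two histograms (an assumed, simplistic measure)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))

def rank_by_similarity(query_pixels, database, bins=4):
    """Order database image names from most to least similar to the query."""
    q = color_histogram(query_pixels, bins)
    scored = [(distance(q, color_histogram(p, bins)), name)
              for name, p in database.items()]
    return [name for _, name in sorted(scored)]

# Hypothetical toy "images" of four pixels each.
database = {
    "reddish": [(250, 10, 10)] * 4,
    "bluish": [(10, 10, 250)] * 4,
    "mixed": [(250, 10, 10)] * 2 + [(10, 10, 250)] * 2,
}
query = [(240, 20, 20)] * 4
print(rank_by_similarity(query, database))
```

Note that the query never "matches" any stored image; the result is an ordering of the whole database by nearness to the query, which is the kind of result among which a user would then navigate.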
The displays of the present invention--mainstay of most of its aspects--are not known by the inventors to exist in, nor to be analogous to anything in, the prior art not already discussed. However, the sophisticated manner in which the present invention orders, as well as selects, the objects of a computerized search may be compared to certain patented inventions within the prior art. Although the present invention does not concern artificial intelligence, this prior art is often in the area of artificial intelligence.
One such patent is U.S. Pat. No. 4,899,290 to Hartzband for a SYSTEM FOR SPECIFYING AND EXECUTING PROTOCOLS FOR USING ITERATIVE ANALOGY AND COMPARATIVE INDUCTION IN A MODEL-BASED COMPUTATION SYSTEM. This patent, assigned to Digital Equipment Corporation (Maynard, Mass.), concerns a system for performing iterative specialization and iterative generalization among objects in a set. Initially, an analogy or symmetric comparison operation is performed between a predetermined pair of objects to determine the similarities (and differences if symmetric comparison is performed) between the objects, and to generate respective similarity and difference reference structures. The system then iteratively performs analogy or symmetric comparison operations using the previously generated reference structures and the reference structure of the object being processed during the iteration to determine the similarity or difference between the object and the previously determined reference structure.
U.S. Pat. No. 5,267,329 to Ulich, et al. for a PROCESS FOR AUTOMATICALLY DETECTING AND LOCATING A TARGET FROM A PLURALITY OF TWO DIMENSIONAL IMAGES is concerned with the quality of image classification, as is the present invention. However, the classification is performed entirely by machine (computer). This patent, assigned to Kaman Aerospace Corporation (Colorado Springs, Colo.), concerns a novel data processing technique for detecting and locating a target from a plurality of two-dimensional images generated by an imaging sensor such as an imaging lidar system. This series of two-dimensional images (made with one or more imaging detectors) is processed in an optimal statistical fashion to reliably detect and locate targets. The process by which the images are mathematically modified reduces the deleterious effects of noise and thereby provides the highest possible probability of detection while simultaneously maintaining a very low probability of false alarm. The data processing technique described also provides an estimate of the reliability of the detection, the target location and an output image to be displayed for visual confirmation and perhaps classification by the operator. The method includes some or all of the following steps: noise reduction, spatial filtering, noise parameter extraction, asymmetric threshold detection, contrast stretching, localization, recognition, range or depth determination and subimage mosaic generation. The method is reportedly particularly well suited for processing two-dimensional images of underwater targets generated by an imaging sensor located on an airborne platform whereby the underwater target is precisely and accurately detected, located and identified.
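Purely to make the flavor of such an iterative comparison concrete, a loose sketch follows. It is not taken from the Hartzband patent: it assumes, hypothetically, that each object is a plain set of attribute strings, that a "similarity reference structure" is the set of shared attributes, and that a "difference reference structure" is the set of attributes not shared by all objects. The function names are likewise illustrative.

```python
def symmetric_comparison(a, b):
    """Return (similarity, difference) structures for two attribute sets.

    Assumption: similarity = shared attributes, difference = attributes
    present in exactly one of the two sets.
    """
    return a & b, a ^ b

def iterative_generalization(objects):
    """Fold the comparison over a sequence of at least two attribute sets.

    The first pair seeds the reference structures; each later object is
    compared against the running similarity structure.
    """
    first, second, *rest = objects
    similarity, difference = symmetric_comparison(first, second)
    for obj in rest:
        shared, unshared = symmetric_comparison(similarity, obj)
        difference |= unshared   # accumulate everything not common to all
        similarity = shared      # generalize: keep only what is still shared
    return similarity, difference

# Hypothetical objects described by attribute strings.
objs = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "e"}]
sim, diff = iterative_generalization(objs)
print(sorted(sim), sorted(diff))
```

Under these assumptions the similarity structure can only shrink as more objects are processed, which is one simple reading of "iterative generalization".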
Similarly, U.S. Pat. No. 5,325,466 to Kornacker for a SYSTEM FOR EXTRACTING KNOWLEDGE OF TYPICALITY AND EXCEPTIONALITY FROM A DATABASE OF CASE RECORDS concerns sophisticated location of data. This patent, assigned to Perceptive Decision Systems, Inc. (Columbus, Ohio), concerns a knowledge tree building system. The system iteratively partitions a database of case records into a tree of conceptually meaningful clusters. Each cluster is automatically assigned a unique conceptual meaning in accordance with its unique pattern of typicality and exceptionality within the knowledge tree; no prior domain-dependent knowledge is required. The system fully utilizes all available quantitative and qualitative case record data. Knowledge trees built by the system are particularly well suited for artificial intelligence applications such as pattern classification and nonmonotonic reasoning.
Finally, U.S. Pat. No. 5,416,892 to Kyung-ho Loken-Kim for BEST FIRST SEARCH CONSIDERING DIFFERENCE BETWEEN SCORES again concerns sophisticated searching. This patent, assigned to Fujitsu Limited (Kawasaki, Japan), concerns a best first search for problem-solving in an artificial intelligence system employing a novel search priority index. The search priority index is calculated based on a difference between scores of a node and the next node in breadth. Searching steps required to attain a solution can be reduced by employing the search priority index.
All these patents generally show that searching in large and/or complex databases often needs to be sophisticated. This much is acknowledged by the present invention. However, the present invention will show that the effective sophistication of the search need not inevitably translate into any of (i) difficulty in the search, (ii) opacity of the criterion (criteria) of search, and/or (iii) uncertainty as to the quality of search results.