(1) Technical Field
The present invention relates to the fields of adaptive filtering, computer vision, and object recognition, and more particularly to the field of wearable computer systems that identify and provide information regarding objects in an environment.
(2) Discussion
Computing technology has an ever-increasing impact on our daily lives. One recent trend is mobile, wearable computing, used to design intelligent assistants that provide location-aware information access and help users accomplish their tasks more efficiently. Thus, in the future, a user driving by a hotel or a restaurant may be able to access information such as recommendations by other visitors, the restaurant menu, and hours of operation simply by pointing a finger at the establishment. However, several technical issues must be solved before this is possible. Currently available computer systems cannot deal well with environmental uncertainties, because the computational requirement for dealing with uncertainty is generally very high.
Currently, the dominant approach to the problems associated with such systems is the use of computer vision algorithms. In the past, the applicability of computer vision algorithms aimed at real-time pattern recognition and object tracking has been hindered by excessive memory requirements and slow computational speeds. Recent computer vision approaches for tracking applications have reduced the necessary computation time by reducing the image search area to a smaller window, which is centered around the last known position of the moving object being tracked. The main drawback of these methods is that when the object moves so fast relative to the frame capture rate of the system that it travels farther than the window extent between frames, it moves out of the window range. This leads to a loss of tracking and forces the system to reset the image search area to the full view of the camera in order to recover the position of the object. The repeated reduction and expansion of the image search area slows the system's performance considerably.
Some tracking solutions have attempted to overcome these problems by gradually varying the search window's size according to the moving object's speed. The faster the object moves, the larger the search window becomes, while still centering on the last known position of the object. Therefore, if the object is moving rapidly, the search window becomes large and the computation time for the system increases, thus slowing down the system's response time.
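The cost of this speed-dependent approach can be illustrated with a minimal sketch. The function names, the linear growth rule, and the values of `base_half` and `gain` are assumptions chosen for illustration only; they are not taken from any of the systems discussed here.

```python
def search_window(last_pos, speed, base_half=20, gain=2.0):
    """Return (left, top, right, bottom) of a square search window.

    The window stays centered on the last known position, and its
    half-width grows linearly with the object's speed (pixels per frame).
    base_half and gain are hypothetical tuning values.
    """
    half = int(round(base_half + gain * speed))
    x, y = last_pos
    return (x - half, y - half, x + half, y + half)


def window_area(window):
    """Number of pixels that must be searched inside the window."""
    left, top, right, bottom = window
    return (right - left) * (bottom - top)


slow = search_window((100, 100), speed=0)    # stationary object
fast = search_window((100, 100), speed=30)   # fast-moving object

# The searched area grows with the square of the half-width.
print(window_area(slow), window_area(fast))
```

Under these assumed values, a fourfold increase in half-width produces a sixteenfold increase in the number of pixels searched, which is the source of the slowdown described above.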
More advanced systems, such as that discussed by C. Jennings in “Robust Finger Tracking with Multiple Cameras”, Proc. Conference on Recognition, Analysis, and Tracking of Faces and Gestures in Real-Time Systems, Corfu, Greece, 1999, pp. 152–160, use state-space estimation techniques to center a smaller search window around a predicted position of an object, rather than around its current position. In this way, as the moving object's speed increases, the predicted window position accompanies the object, thereby keeping it inside the window's view. The window size thus remains small and centered around the object of interest regardless of its speed. This, in turn, keeps memory allocations to a minimum, freeing memory space that can be used by other simultaneous applications. However, abrupt changes in the object's movement patterns introduce modeling uncertainties, and such systems then break down, resulting in loss of the tracked object.
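The difference between centering the window on the last known position and centering it on a predicted position can be sketched as follows. This is a simplified illustration only: a plain constant-velocity extrapolation stands in for the state-space estimator, and the function names and the window half-width are hypothetical.

```python
def predict_next(prev_pos, curr_pos):
    """Constant-velocity prediction: next = curr + (curr - prev)."""
    return (2 * curr_pos[0] - prev_pos[0], 2 * curr_pos[1] - prev_pos[1])


def in_window(center, point, half=10):
    """True if point lies inside a square window of half-width `half`."""
    return (abs(point[0] - center[0]) <= half
            and abs(point[1] - center[1]) <= half)


# Object observed moving 40 pixels per frame to the right.
prev_pos, curr_pos = (100, 100), (140, 100)
predicted = predict_next(prev_pos, curr_pos)

steady_next = (180, 100)   # object keeps its velocity
abrupt_next = (100, 100)   # object abruptly reverses direction

# Small window on the last known position loses the fast object.
print(in_window(curr_pos, steady_next))
# The same small window on the predicted position keeps it.
print(in_window(predicted, steady_next))
# An abrupt change violates the motion model and the object is lost.
print(in_window(predicted, abrupt_next))
```

The last case corresponds to the failure mode noted above: the prediction is only as good as the motion model behind it.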
The state-space solutions presented to date are generally prone to failure when uncertainties are introduced by the surrounding environment or through the ego-motion of the user. It is well known that a central premise of Kalman filtering is that the underlying model parameters {F, G, H, R, Q} are accurate. Once this assumption is violated, the performance of the filter deteriorates appreciably.
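For orientation, the parameters {F, G, H, R, Q} can be read against the standard linear state-space model on which the Kalman filter is built. The form below is the usual textbook notation and is assumed here, since the text itself does not write the model out; the symbols x_i, y_i, u_i, v_i and the index i are likewise assumptions.

```latex
\begin{aligned}
x_{i+1} &= F\,x_i + G\,u_i, \\
y_i     &= H\,x_i + v_i, \\
\mathrm{E}\!\left[u_i u_j^{T}\right] &= Q\,\delta_{ij}, \qquad
\mathrm{E}\!\left[v_i v_j^{T}\right] = R\,\delta_{ij}.
\end{aligned}
```

Here x_i is the state (for example, position and velocity of the tracked object), y_i is the measurement, u_i is the process noise with covariance Q, and v_i is the measurement noise with covariance R. The filter's estimates are optimal only if these quantities match the true dynamics, which is the premise that unmodeled environmental effects and ego-motion violate.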
Therefore, a robust estimation technique is needed that models the uncertainties created by the environment and the user's random ego-motion in the state-space model, and that is effective in keeping the object inside a small search window, thereby reducing the number of times the image search area has to be expanded to full view and thus improving the system's response time. Furthermore, in order to provide information regarding objects or places in a scene, it is desirable to combine such a robust estimation technique with a vision-based pointer tracking system that incorporates object recognition, so that information associated with recognized objects can be presented to a user.
The following references are presented for further background information:
[1] T. Kailath, A. H. Sayed, and B. Hassibi, Linear Estimation, Prentice Hall, NJ, 2000.
[2] A. H. Sayed, “A framework for state-space estimation with uncertain models”, IEEE Transactions on Automatic Control, vol. 46, no. 9, September 2001.
[3] T. Brown and R. C. Thomas, “Finger tracking for the Digital Desk”, Australasian User Interface Conference (AUIC), vol. 1, pp. 11–16, Australia, 2000.
[4] A. Wu, M. Shah, and N. Da Vitoria Lobo, “A virtual 3D blackboard: 3D finger tracking using a single camera”, Proc. Fourth IEEE International Conference on Automatic Face and Gesture Recognition, pp. 536–543, 2000.
[5] D. Comaniciu and P. Meer, “Robust analysis of feature space: color image segmentation”, Proc. Conference on Computer Vision and Pattern Recognition, San Juan, Puerto Rico, 1997, pp. 750–755.
[6] V. Colin de Verdiere and J. L. Crowley, “Visual recognition using local appearance”, Proc. European Conference on Computer Vision, Freiburg, Germany, 1998.
[7] S. M. Dominguez, T. A. Keaton, and A. H. Sayed, “Robust finger tracking for wearable computer interfacing”, Proc. Perceptive User Interfaces, Orlando, Fla., November 2001.
[8] T. Keaton and R. Goodman, “A compression framework for content analysis”, Proc. Workshop on Content-based Access of Image and Video Libraries, Fort Collins, Colo., June 1999, pp. 68–73.
[9] H. Schneiderman and T. Kanade, “Probabilistic modeling of local appearance and spatial relationships for object recognition”, Proc. Conference on Computer Vision and Pattern Recognition, Santa Barbara, Calif., 1998, pp. 45–51.
[10] C. Tomasi and T. Kanade, “Detection and tracking of point features”, Technical Report CMU-CS-91-132, Carnegie Mellon University, Pittsburgh, Pa., April 1991.
[11] J. Yang, W. Yang, M. Denecke, and A. Waibel, “Smart sight: a tourist assistant system”, Proc. Intl. Symposium on Wearable Computers, vol. 1, October 1999, pp. 73–78.
[12] R. K. Mehra, “On the identification of variances and adaptive Kalman filtering”, IEEE Transactions on Automatic Control, AC-15, pp. 175–183, 1970.