1. Field of the Invention
The present invention relates to image processing, and in one exemplary aspect to reducing noise in the optical image-based tracking of objects through random media.
2. Description of Related Technology
Image data processing is useful in a broad variety of different disciplines and applications. One such application relates to the tracking of objects or targets in a random or substantially randomized medium. Object tracking through random media is used, for example, in astronomical and space imaging, free space laser communication systems, automated LASIK eye surgery, and laser-based weapon systems. Each of these applications requires a high degree of precision.
Inherent in object tracking is the need to accurately locate the object as a function of time. A typical tracking system might, e.g., gather a number of sequential image frames via a sensor. It is important to be able to accurately resolve these frames into regions corresponding to the object or target being tracked, and other regions not corresponding to the object (e.g., background). Making this more difficult are the various sources of noise which may arise in such systems, including: (i) noise generated by the sensing system itself; (ii) noise generated by variations or changes in the medium interposed between the object being tracked and the sensor (e.g., scintillation); and (iii) object reflection or interference noise (e.g., speckle).
One very common prior art approach to image location relies on direct spatial averaging of such image data, processing one frame of data at a time, in order to extract the target location or other relevant information. Such spatial averaging, however, fails to remove image contamination from the aforementioned noise sources. As a result, the extracted target locations have a lower degree of accuracy than is desired.
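By way of illustration only, the single-frame spatial averaging described above may be sketched as an intensity-weighted centroid. The function name `frame_centroid`, the 5×5 frame size, and the synthetic intensity values are assumptions made for this sketch and are not taken from any cited reference. The sketch shows how a uniform additive background, standing in for sensor noise, pulls the estimated location away from the true target position.

```python
import numpy as np

def frame_centroid(frame):
    """Intensity-weighted centroid (row, col) of a single image frame.

    Hypothetical helper for illustration; assumes non-negative intensities.
    """
    frame = np.asarray(frame, dtype=float)
    total = frame.sum()
    if total <= 0:
        raise ValueError("frame has no positive intensity to average")
    rows, cols = np.indices(frame.shape)
    return (rows * frame).sum() / total, (cols * frame).sum() / total

# Synthetic clean frame: one bright target pixel at (row=1, col=1).
clean = np.zeros((5, 5))
clean[1, 1] = 10.0

# Same frame with a uniform background added to every pixel
# (an assumed, simplified stand-in for sensor noise).
noisy = clean + 1.0

clean_estimate = frame_centroid(clean)   # lands on the target pixel
noisy_estimate = frame_centroid(noisy)   # biased toward the frame center
```

In this sketch the clean frame yields the target pixel exactly, while the noisy frame yields an estimate displaced toward the center of the frame, since every background pixel contributes to the average.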
A number of other approaches to image acquisition and processing are disclosed in the prior art as well. For example, U.S. Pat. No. 5,387,930 to Toh issued Feb. 7, 1995 entitled “Electronic image acquisition system with image optimization by intensity entropy analysis and feedback control” discloses an image acquisition system wherein parameters associated with the system, such as any of the lens aperture, the lens focus and image intensity, are adjusted. Incoming image data is processed to determine the entropy of the image and with this information the aperture can be optimized. By determining the dynamic range of the scene, the black and white levels thereof can be identified and the gain and offset applied to the image adjusted to minimize truncation distortion. Specular highlights can be detected by calculating the ratio of changes in maximum and minimum intensities between different but related images.
U.S. Pat. No. 5,489,782 to Wernikoff issued Feb. 6, 1996 entitled “Method and apparatus for quantum-limited data acquisition” discloses a methodology of forming an image from a random particle flux. Particles of the flux are detected by a discrete-cell detector having a cell size finer than conventionally used. The count data are filtered through a band-limiting filter whose bandwidth lies between a bandwidth corresponding to the detector cell size and the flux bandwidth of interest. Outliers may be flattened before filtering. Neighborhoods around each cell are evaluated to differentiate stationary regions (where neighboring data are relatively similar) from edge regions (where neighboring data are relatively dissimilar). In stationary regions, a revised estimate for a cell is computed as an average over a relatively large neighborhood around the cell. In edge regions, a revised estimate is computed as an average over a relatively small neighborhood. For cells lying in an edge region but near a stationary/edge boundary, a revised estimate is computed by extrapolating from data in the nearby stationary region.
U.S. Pat. No. 5,640,468 to Hsu issued Jun. 17, 1997 entitled “Method for identifying objects and features in an image” discloses scene segmentation and object/feature extraction in the context of self-determining and self-calibration modes. The technique uses only a single image, instead of multiple images as the input to generate segmented images. First, an image is retrieved. The image is then transformed into at least two distinct bands. Each transformed image is then projected into a color domain or a multi-level resolution setting. A segmented image is then created from all of the transformed images. The segmented image is analyzed to identify objects. Object identification is achieved by matching a segmented region against an image library. A featureless library contains full shape, partial shape and real-world images in a dual library system. Also provided is a mathematical model called a Parzen window-based statistical/neural network classifier. All images are considered three-dimensional. Laser radar based 3-D images represent a special case.
U.S. Pat. No. 5,850,470 to Kung, et al. issued Dec. 15, 1998 entitled “Neural network for locating and recognizing a deformable object” discloses a system for detecting and recognizing the identity of a deformable object such as a human face, within an arbitrary image scene. The system comprises an object detector implemented as a probabilistic DBNN, for determining whether the object is within the arbitrary image scene and a feature localizer also implemented as a probabilistic DBNN, for determining the position of an identifying feature on the object. A feature extractor is coupled to the feature localizer and receives coordinates sent from the feature localizer which are indicative of the position of the identifying feature and also extracts from the coordinates information relating to other features of the object, which are used to create a low resolution image of the object. A probabilistic DBNN based object recognizer for determining the identity of the object receives the low resolution image of the object inputted from the feature extractor to identify the object.
U.S. Pat. No. 6,226,409 to Cham, et al. issued May 1, 2001 entitled “Multiple mode probability density estimation with application to sequential markovian decision processes” discloses a probability density function for fitting a model to a complex set of data that has multiple modes, each mode representing a reasonably probable state of the model when compared with the data. Particularly, an image may require a complex sequence of analyses in order for a pattern embedded in the image to be ascertained. Computation of the probability density function of the model state involves two main stages: (1) state prediction, in which the prior probability distribution is generated from information known prior to the availability of the data, and (2) state update, in which the posterior probability distribution is formed by updating the prior distribution with information obtained from observing the data. In particular this information obtained from data observations can also be expressed as a probability density function, known as the likelihood function. The likelihood function is a multimodal (multiple peaks) function when a single data frame leads to multiple distinct measurements from which the correct measurement associated with the model cannot be distinguished. The invention analyzes a multimodal likelihood function by numerically searching the likelihood function for peaks. The numerical search proceeds by randomly sampling from the prior distribution to select a number of seed points in state-space, and then numerically finding the maxima of the likelihood function starting from each seed point. Furthermore, kernel functions are fitted to these peaks to represent the likelihood function as an analytic function. The resulting posterior distribution is also multimodal and represented using a set of kernel functions. It is computed by combining the prior distribution and the likelihood function using Bayes Rule.
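For context, the generic two-stage prediction/update computation summarized above may be sketched on a discrete state grid. This is a minimal illustration of Bayes' rule only; it is not the kernel-fitting or peak-search procedure of the cited reference. The function name `bayes_update` and the four-state prior and likelihood values are assumptions chosen for the sketch.

```python
import numpy as np

def bayes_update(prior, likelihood):
    """Posterior over discrete states: normalized product of prior and likelihood.

    Hypothetical helper for illustration only.
    """
    prior = np.asarray(prior, dtype=float)
    likelihood = np.asarray(likelihood, dtype=float)
    unnormalized = prior * likelihood
    evidence = unnormalized.sum()
    if evidence <= 0:
        raise ValueError("prior and likelihood have no overlapping support")
    return unnormalized / evidence

# Assumed prior from the prediction stage, favoring state 1.
prior = np.array([0.1, 0.6, 0.2, 0.1])

# Assumed bimodal likelihood: states 1 and 3 explain the data equally well.
likelihood = np.array([0.1, 0.4, 0.1, 0.4])

posterior = bayes_update(prior, likelihood)
best_state = int(np.argmax(posterior))
```

Here the likelihood alone cannot distinguish states 1 and 3, and the prior resolves the ambiguity in favor of state 1.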
U.S. Pat. No. 6,553,131 to Neubauer, et al. issued Apr. 22, 2003 entitled “License plate recognition with an intelligent camera” discloses a camera system and method for recognizing license plates. The system includes a camera adapted to independently capture a license plate image and recognize the license plate image. The camera includes a processor for managing image data and executing a license plate recognition program device. The license plate recognition program device includes a program for detecting orientation, position, illumination conditions and blurring of the image and accounting for the orientation, position, illumination conditions and blurring of the image to obtain a baseline image of the license plate. A segmenting program is included for segmenting characters depicted in the baseline image by employing a projection along a horizontal axis of the baseline image to identify positions of the characters. A statistical classifier is adapted for classifying the characters. The classifier recognizes the characters and returns a confidence score based on the probability of properly identifying each character. A memory is included for storing the license plate recognition program and the license plate images taken by an image capture device of the camera.
U.S. Pat. No. 6,795,794 to Anastasio, et al. issued Sep. 21, 2004 entitled “Method for determination of spatial target probability using a model of multisensory processing by the brain” discloses a method of determining spatial target probability using a model of multisensory processing by the brain. The method includes acquiring at least two inputs from a location in a desired environment where a first target is detected, and applying the inputs to a plurality of model units in a map corresponding to a plurality of locations in the environment. A posterior probability of the first target at each of the model units is approximated, and a model unit with a highest posterior probability is found. A location in the environment corresponding to the model unit with a highest posterior probability is chosen as the location of the next target.
U.S. Pat. No. 6,829,384 to Schneiderman, et al. issued Dec. 7, 2004 entitled “Object finder for photographic images” discloses an object finder program for detecting presence of a 3D object in a 2D image containing a 2D representation of the 3D object. The object finder uses the wavelet transform of the input 2D image for object detection. A pre-selected number of view-based detectors are trained on sample images prior to performing the detection on an unknown image. These detectors then operate on the given input image and compute a quantized wavelet transform for the entire input image. The object detection then proceeds with sampling of the quantized wavelet coefficients at different image window locations on the input image and efficient look-up of pre-computed log-likelihood tables to determine object presence.
U.S. Pat. No. 6,826,316 to Luo, et al. issued Nov. 30, 2004 entitled “System and method for determining image similarity” discloses a system and method for determining image similarity. The method includes the steps of automatically providing perceptually significant features of main subject or background of a first image; automatically providing perceptually significant features of main subject or background of a second image; automatically comparing the perceptually significant features of the main subject or the background of the first image to the main subject or the background of the second image; and providing an output in response thereto. In the illustrative implementation, the features are provided by a number of belief levels, where the number of belief levels is preferably greater than two. The perceptually significant features include color, texture and/or shape. In the preferred embodiment, the main subject is indicated by a continuously valued belief map. The belief values of the main subject are determined by segmenting the image into regions of homogeneous color and texture, computing at least one structure feature and at least one semantic feature for each region, and computing a belief value for all the pixels in the region using a Bayes net to combine the features.
U.S. Pat. No. 6,847,895 to Nivlet, et al. issued Jan. 25, 2005 entitled “Method for facilitating recognition of objects, notably geologic objects, by means of a discriminant analysis technique” discloses a method for facilitating recognition of objects, using a discriminant analysis technique to classify the objects into predetermined categories. A learning base comprising objects that have already been recognized and classified into predetermined categories is formed with each category being defined by variables of known statistical characteristics. A classification function using a discriminant analysis technique, which allows the various objects to be classified to be distributed among the categories from measurements available on a number of parameters, is constructed by reference to the learning base. This function is formed by determining the probabilities of the objects belonging to the various categories by taking account of uncertainties about the parameters as intervals of variable width. Each object is then assigned, if possible, to one or more predetermined categories according to the relative value of the probability intervals. The present invention does not require a library of known shapes (even if only known in a statistical sense). The present invention instead classifies each pixel, and shapes are inferred nonparametrically from the resulting posterior image.
United States Patent Publication No. 20030072482 to Brand published Apr. 17, 2003 entitled “Modeling shape, motion, and flexion of non-rigid 3D objects in a sequence of images” discloses a method of modeling a non-rigid three-dimensional object directly from a sequence of images. A shape of the object is represented as a matrix of 3D points, and a basis of possible deformations of the object is represented as a matrix of displacements of the 3D points. The matrices of 3D points and displacements form a model of the object. Evidence for an optical flow is determined from image intensities in a local region near each 3D point. The evidence is factored into 3D rotation, translation, and deformation coefficients of the model to track the object in the video.
United States Patent Publication No. 20030132366 to Gao, et al. published Jul. 17, 2003 entitled “Cluster-weighted modeling for media classification” discloses a probabilistic input-output system used to classify media in printer applications. The probabilistic input-output system uses at least two input parameters to generate an output that has a joint dependency on the input parameters. The input parameters are associated with image-related measurements acquired from imaging textural features that are characteristic of the different classes (types and/or groups) of possible media. The output is a best match in a correlation between stored reference information and information that is specific to an unknown medium of interest. Cluster-weighted modeling techniques are used for generating highly accurate classification results.
United States Patent Publication No. 20040022438 to Hibbard published Feb. 5, 2004 entitled “Method and apparatus for image segmentation using Jensen-Shannon divergence and Jensen-Renyi divergence” discloses a method of approximating the boundary of an object in an image, the image being represented by a data set, the data set comprising a plurality of data elements, each data element having a data value corresponding to a feature of the image. The method comprises determining which one of a plurality of contours most closely matches the object boundary at least partially according to a divergence value for each contour, the divergence value being selected from the group consisting of Jensen-Shannon divergence and Jensen-Renyi divergence.
Despite the foregoing plethora of different approaches to object location and image processing, there is still an unsatisfied need for practical and effective methods that account for sensor-based, medium-induced, and/or reflection-related noise sources. Ideally, such improved methods would be readily implemented using extant hardware and software, and would utilize information on an inter-frame (frame-to-frame) basis in order to isolate and remove unwanted noise artifacts, thereby increasing the accuracy of the image (and location).