1. Field of the Invention
The present invention relates to image processing and object tracking, and in one exemplary aspect to reducing sensor noise and scintillation noise in the optical image-based tracking of targets through media.
2. Description of Related Technology
Image data processing is useful in a broad variety of different disciplines and applications. One such application relates to the tracking of objects or targets through a random or substantially randomized medium. Object tracking through random media is used, for example, in astronomical and space imaging, free space laser communication systems, automated LASIK eye surgery, and laser-based weapon systems. Each of these applications requires a high degree of precision.
Inherent in object tracking is the need to accurately locate the target as a function of time. A typical tracking system might, e.g., gather a number of sequential image frames via a sensor. It is important to be able to accurately resolve these frames into regions corresponding to the target being tracked, and other regions not corresponding to the target (e.g., background). Making this more difficult are the various sources of noise which may arise in such systems, including: (i) noise generated by the sensing system itself; (ii) noise generated by variations or changes in the medium interposed between the target being tracked and the sensor (e.g., scintillation); and (iii) target reflection or interference noise (e.g., speckle).
One very common prior art approach to image location relies on direct spatial averaging of such image data, processing one frame of data at a time, in order to extract the target location or other relevant information. Such spatial averaging, however, fails to remove image contamination from the aforementioned noise sources. As a result, the extracted target locations have a lower degree of accuracy than is desired.
Two fundamental concepts are utilized under such approaches: (i) the centroid method, which uses an intensity-weighted average of the image frame to find the target location; and (ii) the correlation method, which registers the image frame against a reference frame to find the target location.
Predominantly, the “edge N-point” method is used, which is a species of the centroid method. In this method, a centroid approach is applied to the front-most N pixels of the image to find the target location.
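By way of illustration only, the centroid method and its edge N-point variant may be sketched as follows. The function names, the list-of-lists frame layout, the threshold parameter, and the convention that column 0 is the "front" of the target are all assumptions made for this sketch and are not drawn from any of the cited references.

```python
def centroid(pixels):
    """Intensity-weighted mean (row, col) of (row, col, intensity) triples."""
    pixels = list(pixels)
    total = sum(v for _, _, v in pixels)
    if total <= 0:
        raise ValueError("no signal to average")
    row = sum(r * v for r, _, v in pixels) / total
    col = sum(c * v for _, c, v in pixels) / total
    return row, col


def frame_centroid(frame):
    """Centroid method: weight every pixel of the frame by its intensity."""
    return centroid((r, c, v) for r, row in enumerate(frame)
                    for c, v in enumerate(row))


def edge_n_point(frame, n, threshold=0.0):
    """Edge N-point sketch: centroid of the first n above-threshold pixels.

    Assumption: the frame is scanned column by column starting at column 0,
    which stands in for the leading ("front") edge of the target.
    """
    picked = []
    for c in range(len(frame[0])):
        for r in range(len(frame)):
            if frame[r][c] > threshold:
                picked.append((r, c, frame[r][c]))
                if len(picked) == n:
                    return centroid(picked)
    return centroid(picked)  # fewer than n pixels exceeded the threshold


frame = [
    [0, 0, 0, 0],
    [0, 2, 2, 0],
    [0, 2, 2, 0],
    [0, 0, 0, 0],
]
print(frame_centroid(frame))   # whole-frame estimate
print(edge_n_point(frame, 2))  # estimate from the two front-most pixels
```

Because both estimates are computed from a single frame, any noise present in that frame's intensities passes directly into the reported location.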
However, despite their common use, none of the foregoing methods (including the N-point method) is well suited to use in applications with high levels of scintillation that occur in actively illuminated targets, for targets with significant speckle or scintillation of intensity, or in the presence of sensor noise.
A number of other approaches to image acquisition/processing and target tracking are disclosed in the prior art as well. For example, U.S. Pat. No. 4,227,077 to Hopson, et al. issued Oct. 7, 1980 entitled “Optical tracking system utilizing spaced-apart detector elements” discloses an optical tracking system in which an image of an object to be tracked is nutated about the image plane. Individual detector elements are arranged in a hexagonal array within the image plane such that the individual detector elements are located respectively at the centers of contiguous hexagonal cells. The nutation is provided, in one embodiment, by means of a tiltable mirror which rotates about an axis through the center of the mirror. The nutation of image points relative to the positions of the detector elements permits the various portions of an image to be scanned by individual detector elements. The inward spiraling of image points is utilized to provide for acquisition of the image points, and the position of an image point relative to a detector element at a given instant of time is utilized to provide elevation and azimuthal tracking data for tracking a desired object.
U.S. Pat. No. 4,671,650 to Hirzel, et al. issued Jun. 9, 1987 entitled “Apparatus and method for determining aircraft position and velocity” discloses an apparatus and method for determining aircraft position and velocity. The system includes two CCD sensors which take overlapping front and back radiant energy images of front and back overlapping areas of the earth's surface. A signal processing unit digitizes and deblurs the data that comprise each image. The overlapping first and second front images are then processed to determine the longitudinal and lateral relative image position shifts that produce the maximum degree of correlation between them. The signal processing unit then compares the first and second back overlapping images to find the longitudinal and lateral relative image position shifts necessary to maximize the degree of correlation between those two images. Various correlation techniques, including classical correlation, differencing correlation, zero-mean correction, normalization, windowing, and parallel processing are disclosed for determining the relative image position shift signals between the two overlapping images.
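The general idea of finding the relative image shift that maximizes correlation may be illustrated with a brute-force search over integer shifts using a zero-mean product sum. This is a simplified sketch and not the procedure of the cited patent; the function name best_shift, the max_shift parameter, and the restriction to whole-pixel shifts are assumptions.

```python
def best_shift(reference, frame, max_shift):
    """Return (dy, dx) so that frame[r + dy][c + dx] best matches reference[r][c].

    Scores each candidate shift by the zero-mean product sum over the
    region where the two images overlap, and keeps the highest score.
    """
    rows, cols = len(reference), len(reference[0])
    ref_mean = sum(map(sum, reference)) / (rows * cols)
    frm_mean = sum(map(sum, frame)) / (rows * cols)
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            score = 0.0
            for r in range(rows):
                rr = r + dy
                if not 0 <= rr < rows:
                    continue
                for c in range(cols):
                    cc = c + dx
                    if 0 <= cc < cols:
                        score += ((reference[r][c] - ref_mean)
                                  * (frame[rr][cc] - frm_mean))
            if best is None or score > best[0]:
                best = (score, dy, dx)
    return best[1], best[2]


# A single bright pixel moves from (1, 1) in the reference to (2, 3) in the frame.
reference = [[0] * 5 for _ in range(5)]
reference[1][1] = 1
frame = [[0] * 5 for _ in range(5)]
frame[2][3] = 1
print(best_shift(reference, frame, max_shift=2))
```

The exhaustive search costs on the order of the number of pixels times the number of candidate shifts, which is acceptable for a sketch but is one reason practical systems use windowing or parallel processing.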
U.S. Pat. No. 4,739,401 to Sacks, et al. issued Apr. 19, 1988 and entitled “Target acquisition system and method” discloses a system for identifying and tracking targets in an image scene having a cluttered background. An imaging sensor and processing subsystem provides a video image of the image scene. A size identification subsystem is intended to remove background clutter from the image by filtering the image to pass objects whose sizes are within a predetermined size range. A feature analysis subsystem analyzes the features of those objects which pass through the size identification subsystem and determines if a target is present in the image scene. A gated tracking subsystem and scene correlation and tracking subsystem track the target objects and image scene, respectively, until a target is identified. Thereafter, the tracking subsystems lock onto the target identified by the system.
U.S. Pat. No. 5,147,088 to Smith, et al. issued Sep. 15, 1992 entitled “Missile tracking systems” discloses a missile tracking system that includes a target image sensor and a missile image sensor which record image data during respective target image exposure periods and missile image exposure periods. The missile is provided with an image enhancer such as a beacon or a corner reflector illuminated from the ground, which enhances the missile image only during the missile image exposure periods.
U.S. Pat. No. 5,150,426 to Banh, et al. issued Sep. 22, 1992 entitled “Moving target detection method using two-frame subtraction and a two quadrant multiplier” discloses a method and apparatus for detecting an object of interest against a cluttered background scene. The sensor tracking the scene is movable on a platform such that each frame of the video representation of the scene is aligned, i.e., appears at the same place in sensor coordinates. A current video frame of the scene is stored in a first frame storage device and a previous video frame of the scene is stored in a second frame storage device. The frames are then subtracted by means of an inverter and a frame adder to remove most of the background clutter. The subtracted image is put through a first leakage-reducing filter, preferably a minimum difference processor filter. The current video frame in the first frame storage device is put through a second leakage-reducing filter, preferably a minimum difference processor filter. The outputs of the two processors are applied to a two quadrant multiplier to minimize the remaining background clutter leakage and to isolate the moving object of interest.
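The frame subtraction step alone may be sketched as below. The sketch covers only the differencing of two aligned frames followed by a simple threshold; the leakage-reducing filters and the two quadrant multiplier of the cited patent are not modeled, and the function names and threshold value are assumptions.

```python
def frame_difference(current, previous):
    """Pixel-wise difference of two aligned frames of equal size."""
    return [[c - p for c, p in zip(cur_row, prev_row)]
            for cur_row, prev_row in zip(current, previous)]


def moving_pixels(current, previous, threshold):
    """Coordinates whose intensity changed by more than threshold."""
    diff = frame_difference(current, previous)
    return [(r, c) for r, row in enumerate(diff)
            for c, d in enumerate(row) if abs(d) > threshold]


# Constant background of 5; a bright object moves one pixel to the right.
previous = [[5, 5, 5, 5], [5, 9, 5, 5], [5, 5, 5, 5]]
current = [[5, 5, 5, 5], [5, 5, 9, 5], [5, 5, 5, 5]]
print(moving_pixels(current, previous, threshold=2))
```

Note that plain differencing flags both the position the object vacated and the position it now occupies, which is one reason further filtering of the subtracted image is applied in practice.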
U.S. Pat. No. 5,387,930 to Toh issued Feb. 7, 1995 entitled “Electronic image acquisition system with image optimization by intensity entropy analysis and feedback control” discloses an image acquisition system wherein parameters associated with the system, such as any of the lens aperture, the lens focus and image intensity, are adjusted. Incoming image data is processed to determine the entropy of the image and with this information the aperture can be optimized. By determining the dynamic range of the scene the black and white levels thereof can be identified and the gain and offset applied to the image adjusted to minimize truncation distortion. Specular highlights can be detected by calculating the ratio of changes in maximum and minimum intensities between different but related images.
U.S. Pat. No. 5,489,782 to Wernikoff issued Feb. 6, 1996 entitled “Method and apparatus for quantum-limited data acquisition” discloses a methodology of forming an image from a random particle flux. Particles of the flux are detected by a discrete-cell detector having a cell size finer than conventionally used. The count data are filtered through a band-limiting filter whose bandwidth lies between a bandwidth corresponding to the detector cell size and the flux bandwidth of interest. Outliers may be flattened before filtering. Neighborhoods around each cell are evaluated to differentiate stationary regions (where neighboring data are relatively similar) from edge regions (where neighboring data are relatively dissimilar). In stationary regions, a revised estimate for a cell is computed as an average over a relatively large neighborhood around the cell. In edge regions, a revised estimate is computed as an average over a relatively small neighborhood. For cells lying in an edge region but near a stationary/edge boundary, a revised estimate is computed by extrapolating from data in the nearby stationary region.
U.S. Pat. No. 5,640,468 to Hsu issued Jun. 17, 1997 entitled “Method for identifying objects and features in an image” discloses scene segmentation and object/feature extraction in the context of self-determining and self-calibration modes. The technique uses only a single image, instead of multiple images as the input to generate segmented images. First, an image is retrieved. The image is then transformed into at least two distinct bands. Each transformed image is then projected into a color domain or a multi-level resolution setting. A segmented image is then created from all of the transformed images. The segmented image is analyzed to identify objects. Object identification is achieved by matching a segmented region against an image library. A featureless library contains full shape, partial shape and real-world images in a dual library system. Also provided is a mathematical model called a Parzen window-based statistical/neural network classifier. All images are considered three-dimensional. Laser radar based 3-D images represent a special case.
U.S. Pat. No. 5,647,015 to Choate, et al. issued Jul. 8, 1997 entitled “Method of inferring sensor attitude through multi-feature tracking” discloses a method for inferring sensor attitude information in a tracking sensor system. The method begins with storing at a first time a reference image in a memory associated with the tracking sensor. Next, the method includes sensing at a second time a second image. The sensed image comprises a plurality of sensed feature locations. The method further includes determining the position of the tracking sensor at the second time relative to its position at the first time and then forming a correlation between the sensed feature locations and the predetermined feature locations as a function of the relative position. The method results in an estimation of a tracking sensor pose that is calculated as a function of the correlation. Because the method is primarily computational, implementation ostensibly requires no new hardware in a tracking sensor system other than that which may be required to provide additional computational capacity.
U.S. Pat. No. 5,850,470 to Kung, et al. issued Dec. 15, 1998 entitled “Neural network for locating and recognizing a deformable object” discloses a system for detecting and recognizing the identity of a deformable object such as a human face, within an arbitrary image scene. The system comprises an object detector implemented as a probabilistic DBNN, for determining whether the object is within the arbitrary image scene and a feature localizer also implemented as a probabilistic DBNN, for determining the position of an identifying feature on the object. A feature extractor is coupled to the feature localizer and receives coordinates sent from the feature localizer which are indicative of the position of the identifying feature and also extracts from the coordinates information relating to other features of the object, which are used to create a low resolution image of the object. A probabilistic DBNN based object recognizer for determining the identity of the object receives the low resolution image of the object inputted from the feature extractor to identify the object.
U.S. Pat. No. 5,947,413 to Mahalanobis issued Sep. 7, 1999 entitled “Correlation filters for target reacquisition in trackers” discloses a system and method for target reacquisition and aimpoint selection in missile trackers. At the start of iterations through the process, distance classifier correlation filters (DCCFs) memorize the target's signature on the first frame. This stored target signature is used in a subsequent confidence match test, so the current sub-frame target registration will be compared against the stored target registration from the first frame. If the result of the match test is true, a patch of image centered on the aimpoint is used to synthesize the sub-frame filter. A sub-frame patch (containing the target) of the present frame is selected to find the target in the next frame. A next frame search provides the location and characteristics of a peak in the next image, which indicates the target position. The DCCF shape matching processing registers the sub-frame to the lock coordinates in the next frame. This process will track most frames, and operation will repeat. However, when the similarity measure criterion is not satisfied, maximum average correlation height (MACH) filters update the aim-point and re-designate the track-point. Once the MACH filters are invoked, the process re-initializes with the new lock coordinates. The MACH filters have pre-stored images which are independent of target and scene data being processed by the system.
U.S. Pat. No. 6,173,066 to Peurach, et al. issued Jan. 9, 2001 and entitled “Pose determination and tracking by matching 3D objects to a 2D sensor” discloses a method of pose determination and tracking that ostensibly does away with conventional segmentation while taking advantage of multi-degree-of-freedom numerical fitting or match filtering as opposed to a syntactic segment or feature oriented combinatorial match. The technique may be used for image database query based on object shape descriptors by allowing the user to request images from a database or video sequence which contain a key object described by a geometric description that the user designates or supplies. The approach is also applicable to target or object acquisition and tracking based on the matching of one or a set of object shape data structures.
U.S. Pat. No. 6,226,409 to Cham, et al. issued May 1, 2001 entitled “Multiple mode probability density estimation with application to sequential markovian decision processes” discloses a probability density function for fitting a model to a complex set of data that has multiple modes, each mode representing a reasonably probable state of the model when compared with the data. Particularly, an image may require a complex sequence of analyses in order for a pattern embedded in the image to be ascertained. Computation of the probability density function of the model state involves two main stages: (1) state prediction, in which the prior probability distribution is generated from information known prior to the availability of the data, and (2) state update, in which the posterior probability distribution is formed by updating the prior distribution with information obtained from observing the data. In particular this information obtained from data observations can also be expressed as a probability density function, known as the likelihood function. The likelihood function is a multimodal (multiple peaks) function when a single data frame leads to multiple distinct measurements from which the correct measurement associated with the model cannot be distinguished. The invention analyzes a multimodal likelihood function by numerically searching the likelihood function for peaks. The numerical search proceeds by randomly sampling from the prior distribution to select a number of seed points in state-space, and then numerically finding the maxima of the likelihood function starting from each seed point. Furthermore, kernel functions are fitted to these peaks to represent the likelihood function as an analytic function. The resulting posterior distribution is also multimodal and represented using a set of kernel functions. It is computed by combining the prior distribution and the likelihood function using Bayes Rule.
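The state update stage and the notion of a multimodal likelihood may be illustrated on a small discrete state grid. This is a simplified sketch under stated assumptions: the cited patent works with sampled seed points and fitted kernel functions in a continuous state space, whereas the sketch below uses plain lists of probabilities, and the function names bayes_update and local_peaks are hypothetical.

```python
def bayes_update(prior, likelihood):
    """Posterior over discrete states: normalized prior times likelihood."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("prior and likelihood have no common support")
    return [u / total for u in unnormalized]


def local_peaks(values):
    """Indices of strict local maxima, i.e. the modes of a sampled function."""
    peaks = []
    for i, v in enumerate(values):
        left = values[i - 1] if i > 0 else float("-inf")
        right = values[i + 1] if i < len(values) - 1 else float("-inf")
        if v > left and v > right:
            peaks.append(i)
    return peaks


# The likelihood has two peaks: the data alone cannot say which is correct.
likelihood = [0.1, 0.8, 0.1, 0.8, 0.1]
# A prior from state prediction favors state 3.
prior = [0.1, 0.1, 0.1, 0.6, 0.1]

posterior = bayes_update(prior, likelihood)
print(local_peaks(likelihood))
print(posterior.index(max(posterior)))
```

In this example the likelihood by itself is ambiguous between two states, and it is the prior carried forward from earlier data that makes one mode of the posterior dominate.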
U.S. Pat. No. 6,553,131 to Neubauer, et al. issued Apr. 22, 2003 entitled “License plate recognition with an intelligent camera” discloses a camera system and method for recognizing license plates. The system includes a camera adapted to independently capture a license plate image and recognize the license plate image. The camera includes a processor for managing image data and executing a license plate recognition program device. The license plate recognition program device includes a program for detecting orientation, position, illumination conditions and blurring of the image and accounting for the orientation, position, illumination conditions and blurring of the image to obtain a baseline image of the license plate. A segmenting program segments characters depicted in the baseline image by employing a projection along a horizontal axis of the baseline image to identify positions of the characters. A statistical classifier is adapted for classifying the characters. The classifier recognizes the characters and returns a confidence score based on the probability of properly identifying each character. A memory is included for storing the license plate recognition program and the license plate images taken by an image capture device of the camera.
U.S. Pat. No. 6,795,794 to Anastasio, et al. issued Sep. 21, 2004 entitled “Method for determination of spatial target probability using a model of multisensory processing by the brain” discloses a method of determining spatial target probability using a model of multisensory processing by the brain. The method includes acquiring at least two inputs from a location in a desired environment where a first target is detected, and applying the inputs to a plurality of model units in a map corresponding to a plurality of locations in the environment. A posterior probability of the first target at each of the model units is approximated, and a model unit with a highest posterior probability is found. A location in the environment corresponding to the model unit with a highest posterior probability is chosen as the location of the next target.
U.S. Pat. No. 6,829,384 to Schneiderman, et al. issued Dec. 7, 2004 entitled “Object finder for photographic images” discloses an object finder program for detecting presence of a 3D object in a 2D image containing a 2D representation of the 3D object. The object finder uses the wavelet transform of the input 2D image for object detection. A pre-selected number of view-based detectors are trained on sample images prior to performing the detection on an unknown image. These detectors then operate on the given input image and compute a quantized wavelet transform for the entire input image. The object detection then proceeds with sampling of the quantized wavelet coefficients at different image window locations on the input image and efficient look-up of pre-computed log-likelihood tables to determine object presence.
U.S. Pat. No. 6,826,316 to Luo, et al. issued Nov. 30, 2004 entitled “System and method for determining image similarity” discloses a system and method for determining image similarity. The method includes the steps of automatically providing perceptually significant features of main subject or background of a first image; automatically providing perceptually significant features of main subject or background of a second image; automatically comparing the perceptually significant features of the main subject or the background of the first image to the main subject or the background of the second image; and providing an output in response thereto. In the illustrative implementation, the features are provided by a number of belief levels, where the number of belief levels is preferably greater than two. The perceptually significant features include color, texture and/or shape. In the preferred embodiment, the main subject is indicated by a continuously valued belief map. The belief values of the main subject are determined by segmenting the image into regions of homogeneous color and texture, computing at least one structure feature and at least one semantic feature for each region, and computing a belief value for all the pixels in the region using a Bayes net to combine the features.
U.S. Pat. No. 6,847,895 to Nivlet, et al. issued Jan. 25, 2005 entitled “Method for facilitating recognition of objects, notably geologic objects, by means of a discriminant analysis technique” discloses a method for facilitating recognition of objects, using a discriminant analysis technique to classify the objects into predetermined categories. A learning base comprising objects that have already been recognized and classified into predetermined categories is formed with each category being defined by variables of known statistical characteristics. A classification function using a discriminant analysis technique, which allows distribution among the categories of the various objects to be classified from measurements available on a number of parameters, is constructed by reference to the learning base. This function is formed by determining the probabilities of the objects belonging to the various categories by taking account of uncertainties about the parameters as intervals of variable width. Each object is then assigned, if possible, to one or more predetermined categories according to the relative value of the probability intervals.
United States Patent Publication No. 20030072482 to Brand published Apr. 17, 2003 entitled “Modeling shape, motion, and flexion of non-rigid 3D objects in a sequence of images” discloses a method of modeling a non-rigid three-dimensional object directly from a sequence of images. A shape of the object is represented as a matrix of 3D points, and a basis of possible deformations of the object is represented as a matrix of displacements of the 3D points. The matrices of 3D points and displacements form a model of the object. Evidence for an optical flow is determined from image intensities in a local region near each 3D point. The evidence is factored into 3D rotation, translation, and deformation coefficients of the model to track the object in the video.
United States Patent Publication No. 20030132366 to Gao, et al. published Jul. 17, 2003 entitled “Cluster-weighted modeling for media classification” discloses a probabilistic input-output system used to classify media in printer applications. The probabilistic input-output system uses at least two input parameters to generate an output that has a joint dependency on the input parameters. The input parameters are associated with image-related measurements acquired from imaging textural features that are characteristic of the different classes (types and/or groups) of possible media. The output is a best match in a correlation between stored reference information and information that is specific to an unknown medium of interest. Cluster-weighted modeling techniques are used for generating highly accurate classification results.
United States Patent Publication No. 20030183765 to Chen, et al. published Oct. 2, 2003 entitled “Method and system for target detection using an infra-red sensor” discloses a target detection and tracking system that provides dynamic changing of the integration time (IT) for the system IR sensor within a discrete set of values to maintain sensor sensitivity. The system changes the integration time to the same or a different sensor integration time within the discrete set based on the image data output from the sensor satisfying pre-determined system parameter thresholds. The system includes an IT-related saturation prediction function allowing the system to avoid unnecessary system saturation when determining whether an IT change should be made. The tracking portion of the system provides tracking feedback allowing target objects with a low sensor signature to be detected without being obscured by nearby uninterested objects that produce system saturation.
United States Patent Publication No. 20030026454 to Lewins, et al. published Feb. 6, 2003 entitled “Probability weighted centroid tracker” discloses a system for tracking a target that includes an image sensor mounted to a gimbal for acquiring an image, wherein the image includes a plurality of pixels representing the target and a background. The system further includes a motor for rotating the gimbal and an autotracker electrically coupled to the image sensor and the motor. The autotracker includes a probability map generator for computing a probability that each of the plurality of pixels having a particular intensity is either a portion of the target or a portion of the background, a pixel processor in communicative relation with the probability map generator for calculating a centroid of the target based upon the probabilities computed by the probability map generator, and a controller in communicative relation with the pixel processor for generating commands to the motor based upon the centroid.
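One way a probability-weighted centroid might be computed is sketched below. The sketch assumes that per-intensity likelihoods for target and background are available as lookup tables and combines them with Bayes' rule; the table contents, the prior_target parameter, and the function names are hypothetical and are not taken from the cited publication.

```python
def target_probability(intensity, p_given_target, p_given_background,
                       prior_target=0.5):
    """P(target | intensity) from two hypothetical intensity lookup tables."""
    t = p_given_target.get(intensity, 0.0) * prior_target
    b = p_given_background.get(intensity, 0.0) * (1.0 - prior_target)
    return t / (t + b) if (t + b) > 0 else 0.0


def probability_weighted_centroid(frame, p_given_target, p_given_background,
                                  prior_target=0.5):
    """Centroid in which each pixel is weighted by its target probability."""
    total = row_sum = col_sum = 0.0
    for r, row in enumerate(frame):
        for c, intensity in enumerate(row):
            w = target_probability(intensity, p_given_target,
                                   p_given_background, prior_target)
            total += w
            row_sum += r * w
            col_sum += c * w
    if total == 0:
        raise ValueError("no pixel has nonzero target probability")
    return row_sum / total, col_sum / total


# Assumed likelihood tables: bright pixels are mostly target, dark mostly background.
p_target = {0: 0.1, 9: 0.9}
p_background = {0: 0.9, 9: 0.1}
frame = [[0, 0, 0], [0, 9, 9], [0, 0, 0]]
print(probability_weighted_centroid(frame, p_target, p_background))
```

Weighting by probability rather than raw intensity means that dark pixels still contribute slightly to the estimate, so the result depends on how well the assumed likelihood tables describe the scene.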
United States Patent Publication No. 20040021852 to DeFlumere published Feb. 5, 2004 and entitled “Reentry vehicle interceptor with IR and variable FOV laser radar” discloses a dual mode seeker for intercepting a reentry vehicle or other target. In one embodiment, the seeker is configured with an onboard 3D ladar system coordinated with an onboard IR detection system, where both systems utilize a common aperture. The IR and ladar systems cooperate with a ground based reentry vehicle detection/tracking system for defining a primary target area coordinate and focusing the IR FOV thereon. The IR system obtains IR image data in the IR FOV. The ladar system initially transmits with a smaller laser FOV to illuminate possible targets, rapidly interrogating the IR FOV. The ladar system obtains data on each possible target to perform primary discrimination assessments. Data fusion is employed to resolve the possible targets as between decoys/clutter and a reentry vehicle. The laser FOV is expandable to the IR FOV.
United States Patent Publication No. 20040022438 to Hibbard published Feb. 5, 2004 entitled “Method and apparatus for image segmentation using Jensen-Shannon divergence and Jensen-Renyi divergence” discloses a method of approximating the boundary of an object in an image, the image being represented by a data set, the data set comprising a plurality of data elements, each data element having a data value corresponding to a feature of the image. The method comprises determining which one of a plurality of contours most closely matches the object boundary at least partially according to a divergence value for each contour, the divergence value being selected from the group consisting of Jensen-Shannon divergence and Jensen-Renyi divergence.
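For reference, the Jensen-Shannon divergence between two discrete distributions may be computed as in the following sketch. Only the standard definition of the divergence is shown; how the cited publication builds the distributions from a candidate contour is not reproduced here, and the use of base-2 logarithms and of normalized histograms as inputs are assumptions of the sketch.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon(p, q):
    """Jensen-Shannon divergence: mean KL divergence to the midpoint mixture."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical normalized intensity histograms inside and outside a contour.
inside = [0.7, 0.2, 0.1]
outside = [0.1, 0.2, 0.7]
print(jensen_shannon(inside, outside))
```

With base-2 logarithms the divergence is zero for identical distributions and reaches one for distributions with no overlap, so a larger value indicates a contour that better separates two dissimilar regions.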
For additional information on other prior art approaches to object (e.g., missile) tracking, see also Fitts, J. M. “Correlation Tracking via Optimal Weighting Functions,” Technical Report Number P73-240, Hughes Aircraft Co. (1973); Ulick, B. L. “Overview of Acquisition, Tracking, and Pointing System Technologies,” SPIE. Acquisition, Tracking and Pointing, 887 (1988) 40-63; and Van Rheeden, D. R. and R. A. Jones. “Effects of Noise on Centroid Tracker Aim Point Estimation,” IEEE Trans AES. 24(2) (1988), each of the foregoing incorporated herein by reference in its entirety.
Despite the foregoing plethora of different approaches to object (target) location, tracking and image processing, there is still an unsatisfied need for practical and effective apparatus and methods that account for sensor-based, medium-induced, and/or reflection-related noise sources. Ideally, such improved apparatus and methods would be readily implemented using extant hardware and software, adaptable to both active and passive illumination environments, and would utilize information on an inter-frame (frame-to-prior or subsequent frame) basis in order to isolate and remove unwanted noise artifacts, thereby increasing the accuracy of the image (and location of the target).