2.1 Acquisition of Multi-Dimensional Imagery Data
Multi-dimensional imagery data is an electronic picture, i.e., image, of a scene. Multi-dimensional data may be acquired in numerous ways. Laser Detection And Ranging (“LADAR”) systems are commonly employed for this purpose.
Referring to FIG. 2, generally speaking, laser beams are transmitted from a platform 18 onto a scene, e.g., a scanned field of view. Upon encountering an object 12 (or multiple objects) and surrounding environment 14, varying degrees of the transmitted laser beams, characteristic of the particular scene or portion thereof, are reflected back to and detected by a sensor on the platform 18. The object 12 may be either airborne or, as shown in FIG. 2, on the ground 16.
The platform 18 can then process the reflected signals to obtain multi-dimensional imagery data regarding the object 12 that caused the reflection. The imagery data derived from the laser reflection can be processed to derive information about the distance between the object 12 and the platform 18, commonly referred to as “range,” as well as information about a number of features of the object 12 such as its height, length, width, average height, etc. The quality and accuracy of the information about the features depend in large part on the conditions prevailing at the time the data is collected, including the orientation of the object relative to the platform (e.g., aspect and depression angles), obscurations, and pixel resolution.
LADAR data is generally acquired by scanning the field of view to generate rows and columns of discrete units of information known as “pixels.” Pixels are used to generate a two-dimensional “image” of the scanned field of view and are correlated to the third dimension, range information. Data acquisition, and particularly LADAR data acquisition, is well known in the art, and any suitable technique may be employed. Some suitable techniques are disclosed in, e.g., U.S. Pat. Nos. 5,200,606; 5,224,109; 5,285,461; and 5,701,326, owned by the assignee.
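By way of illustration only, the arrangement of pixels in rows and columns, each correlated to a range value, might be represented as in the following minimal sketch. The array contents, the units (meters), and the helper name `to_points` are assumptions made for the example and are not taken from any of the referenced patents.

```python
# Hypothetical 3 x 4 range image: the row and column index locate a pixel in
# the two-dimensional image, and the stored value is its range (assumed meters).
range_image = [
    [100.0, 100.5, 101.0, 101.5],
    [98.0, 60.2, 60.4, 99.0],
    [97.0, 60.1, 60.3, 98.5],
]


def to_points(image):
    """Flatten a range image into (row, column, range) triples."""
    return [
        (row, col, rng)
        for row, line in enumerate(image)
        for col, rng in enumerate(line)
    ]


points = to_points(range_image)
```

Each triple pairs a two-dimensional image position with the third dimension, range, which is the form of data on which the processing described below operates.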
2.2 Processing Multi-Dimensional Imagery Data
The platform 18 typically transmits many laser signals across a general area that may contain one or more objects reflecting the laser signals. It therefore is appropriate to examine the reflected data to determine if any objects 12 are present and if so, determine which particular reflecting objects 12 might be of interest. Automatic target recognition (“ATR”) systems are used to identify objects 12 represented in multi-dimensional data to determine whether they are potential targets. ATR systems are often divided into four subsystems: object detection, object segmentation, feature extraction, and object identification.
Object identification involves taking object features such as the ones discussed above and establishing an identity for the object based on comparison(s) to features of known objects. The accuracy of the identification depends on several factors, including the correctness of the object features used in the comparison and the number of known objects constituting potential identifications.
Feature extraction involves selecting one or more features of object 12, such as its height, width, length, average length, etc., from the multi-dimensional imagery data representing object 12. Preceding feature extraction, object segmentation severs an object 12 from its surrounding environment 14. However, an object must first be detected within the multi-dimensional imagery data, meaning that each of the aforementioned subsystems depends upon the detection subsystem.
Object detection can be thought of as being the first sweep through the imagery data. It searches for the presence of one or more objects by interpreting the meaning of the image data. The imagery data includes pixel information having either (x, y) or (x, y, z) coordinates in multi-dimensional space. Pixel coordinates x, y represent vertical and horizontal position while the z coordinate represents the range, or depth, of a particular point or area in the scene relative to the platform 18. The term “pixel” is derived from the phrase “picture element.” A picture (i.e., an image) is a depiction or representation of a scene. Each pixel in the array of pixels that combine to create a picture represents a certain amount of space in the scene. The amount of space represented by each pixel directly affects the resolution of the pixel. The greater the area represented by a pixel, the lower its resolution.
Resolution is easily understood by reference to an everyday example. For a given scene, a digital camera with a zoom lens will be able to bring a subset of the scene closer than would a digital camera without a zoom lens. In the zoomed close-up digital picture, each pixel represents less space in the scene than does each pixel in the distant digital picture. Therefore, the close-up digital picture and its pixels have greater resolution of the scene than the distant digital picture and its pixels. In this way resolution is a product of the distance between the scene and the platform 18, taking into account any magnification ability of the platform 18.
Resolution is not only a function of distance; it is also a function of the number of pixels available to represent a scene. The fewer the available pixels, the more area each pixel must represent. The number of available pixels is sometimes referred to as “pixel density.”
The relation between pixel density and resolution is easily understood by considering the difference between the same digital camera with and without a wide angle lens. A wide angle lens causes a picture to display a larger scene, i.e., more area per pixel, than does the camera without a wide angle lens. In this way, resolution is a product of pixel density, taking into account any wide angle ability of the platform 18.
Distance and pixel density have a multiplicative effect on resolution. Thus, resolution can be succinctly described as the angular separation between adjacent pixels multiplied by the effective range from the platform 18 to the object 12 in the scene.
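The multiplicative relation can be illustrated with a short calculation. The following is a minimal sketch only; the function name `pixel_footprint`, the use of radians and meters, and the numeric values are assumptions chosen for the example.

```python
def pixel_footprint(angular_separation_rad, range_m):
    """Approximate extent of the scene represented by one pixel.

    Follows the relation stated above: angular separation between pixels
    multiplied by the effective range. Units (radians, meters) are assumed.
    """
    return angular_separation_rad * range_m


# Hypothetical values: 1 milliradian between pixels, object at 500 m and 1000 m.
near = pixel_footprint(0.001, 500.0)
far = pixel_footprint(0.001, 1000.0)
```

Under these assumed values, doubling the range doubles the area each pixel must represent along each image axis, and doubling the angular separation (i.e., halving the pixel density over the same field of view) has the same effect. A larger footprint corresponds to lower resolution.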
Object detection is generally accomplished by identifying pixels with variances in range coordinates, relative to other pixels, exceeding predefined thresholds. Common detection methods search for object boundaries, object features, or some combination thereof. A detection operator is described in U.S. Pat. No. 5,424,823 (System For Identifying Flat Orthogonal Objects Using Reflected Energy Signals), owned by the assignee. The operator examines a local neighborhood about a central pixel and counts the number of pixels that are within a range threshold from the central pixel. If the number of pixels exceeds a threshold, then the central pixel is turned on or identified as a detection pixel. This operation finds groups of pixels that are connected in image space and close in range space. The operator assumes that the object of interest has a large vertical section to detect. Most objects exhibit this phenomenon, but not necessarily at all orientations. Another limitation of this operation is that an unknown subset of the object may be detected. For example, the front of the object may be detected, but due to its orientation the detection may be on the left side of the target. Thus, for subsequent object segmentation to occur, a large segmentation window must be used to extract the object from the background. This not only makes the task of segmentation more difficult, it also makes the task take a longer period of time.
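The neighborhood-counting operation described above can be sketched as follows. This is an illustrative reading of the prose description only and is not the implementation disclosed in U.S. Pat. No. 5,424,823; the function name `detect`, the parameter names, the window size, the threshold values, and the synthetic scene are all assumptions.

```python
def detect(range_image, half_window=1, range_threshold=1.0, count_threshold=4):
    """Mark pixels whose neighborhood holds enough pixels close in range.

    For each central pixel, count the neighbors inside a square window whose
    range is within range_threshold of the central pixel. The central pixel
    becomes a detection pixel if that count exceeds count_threshold.
    """
    rows, cols = len(range_image), len(range_image[0])
    detections = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            center = range_image[r][c]
            count = 0
            for rr in range(max(0, r - half_window), min(rows, r + half_window + 1)):
                for cc in range(max(0, c - half_window), min(cols, c + half_window + 1)):
                    if (rr, cc) == (r, c):
                        continue  # do not count the central pixel itself
                    if abs(range_image[rr][cc] - center) <= range_threshold:
                        count += 1
            detections[r][c] = count > count_threshold
    return detections


# Hypothetical 5 x 5 scene: ground range changes by 3 m from row to row,
# while a vertical object (3 x 3 block) sits at a constant 80 m range.
scene = [[100.0 + 3.0 * r] * 5 for r in range(5)]
for r in range(1, 4):
    for c in range(1, 4):
        scene[r][c] = 80.0

detections = detect(scene)
detected_count = sum(flag for row in detections for flag in row)
```

In this assumed scene, the ground pixels are not detected because their ranges change from row to row, whereas the vertical object presents many neighboring pixels at nearly the same range. With the assumed thresholds, only the center and edge pixels of the block are turned on while its corner pixels are not, which illustrates the limitation noted above that only a subset of the object may be detected.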
2.3 Problems with Prior Art Detection
A significant problem with some prior art detection methods is false detection, i.e., detection of an object in the imagery data when no object really exists. Since prior art detection methods generally search for objects by locating discontinuities in range coordinates of pixels, terrain or other naturally occurring aspects of an environment are often incorrectly determined to be objects. Complex scenes exacerbate the false detection problem.
Another problem with some prior art detection methods is failure to detect objects. Prior art detectors exhibit object aspect and pitch dependency whereby the position of an object 12 relative to platform 18 effectively hides the object from prior art detection methods.
Yet another problem with prior art detection methods is processor overusage. An ATR system must, as a practical matter, quickly establish the best possible detection with available computing resources. Some prior art systems attempting to address the aforementioned difficulties expend valuable resources, computing and otherwise.
The improved detection operator attempts to minimize the aforementioned deficiencies by determining the outer boundary of the object and then limiting the segmentation window to those pixels that lie near the boundary.