In remote sensing applications of natural environmental scenes containing man-made objects or targets, the difficulty and complexity of identifying not only the presence of targets but also their location within the scene increase rapidly as the distance between the sensing member and the scene increases. Remote sensing from satellites, providing a downward-looking view of environmental scenes, and forward-looking views taken at a relatively low angle of depression with respect to the plane of the environmental scene from relatively great distances are two examples of remote sensing applications of natural environmental scenes wherein the confidence level of locating a man-made object or target within the scene decreases as the cross-sectional area of the object decreases relative to the total area of the scene subtended by the remote sensing member. Accordingly, it would be desirable to provide a method and system capable of extracting, during subsequent examinations of representations of such scenes, the locations of man-made objects with a higher level of confidence than available heretofore.
In instances requiring the determination of the location of man-made objects or targets within a natural environmental scene in as short a period of time as possible after remote sensing of the scene, i.e., as close to real-time location determination as possible, the system used for such relatively rapid extraction of information on object location must not only provide object location determination with a high level of confidence but must be capable of reaching that determination rapidly. The importance of near real-time determination of the location of man-made objects in natural environmental scenes can be envisioned by a remote sensor's ability to rapidly zoom its optical system for an additional, enlarged view of the object or objects shortly after the initial or first view has provided the location of such object or objects within the scene.
Another aspect of remote sensing, i.e., sensing of scenes relatively distant from the sensing member, is the range of levels of brightness or levels of contrast within the scene as perceived by the sensing member within the spectral region of response of the sensing member. For example, a sensing member with a forward-looking view toward a natural environmental scene comprising a section of sky above, a section of trees in the center and a section of ground below will perceive within its spectral region of response different levels of brightness and contrast emanating from the respective sky, tree, and ground sections at an instant in time. If the same scene were to be viewed by the same sensing member at another time, for example, in winter versus summer, or at dawn versus high noon, the differences among the brightness and contrast levels associated with the respective sky, tree and ground sections can be smaller or larger than the corresponding differences during the earlier scene sensing. Likewise, different spectral regions of response of the sensing member or members will render differing brightness levels or contrast levels among the sky, tree and ground sections. A scene viewed by a sensing member responsive only to near-infrared radiation in the spectral region of wavelengths from about 700 nm to about 900 nm will give a scene representation in terms of brightness levels and contrast ranges quite different from the same scene viewed by a sensing member responsive only to green-yellow light in the spectral region of wavelengths from about 520 nm to about 620 nm. Accordingly, it would be advantageous to employ a method and system capable of extracting information about the location of man-made objects or targets in natural environmental scenes viewed under differing conditions of scene brightness and contrast levels, or viewed within differing spectral ranges of response of the sensing member or of sensing members.
A number of sensing members and sensing systems have been proposed. Generally, sensing members responsive to radiation emanating only from the scene to be viewed or sensed are referred to as passive sensors. Sensing members responsive to signals emitted toward the scene by an emitter and reflected by the scene and partially collected by the sensing member are referred to as active sensors. Radar equipment and other microwave systems, and laser-based systems, can be considered to have active sensors which respond to some fraction of the signal emitted by part of the system and partially reflected toward the sensor by elements or regions of the scene. Among passive sensors or sensing systems are photographic materials, such as photographic films of black-and-white or color-rendering capacity. Scenes captured or stored on photographic materials are generally thought of as being two-dimensional photographic representations of the scene and are considered to contain scene information in analog form provided by the nature of the photographic process. Such photographic scene representations lend themselves to detailed subsequent analysis of features by visual or microscope-aided inspection. Alternatively, the analog information of the photographic material can be transformed into digital information by various known digitizing techniques such as sequential line scanning, flying spot scanning, half-tone replication, rephotographing with a digital camera, and the like. The advantage of providing and storing scene digital information resides in the application of computer-aided digital signal processing to the task of feature identification, thereby facilitating a substantially automated approach to locating certain features contained within the scene.
Another, generally passive, class of sensing elements is that of so-called area array sensors, currently most prominently represented by silicon-based charge-coupled devices (CCDs). These sensors comprise a relatively large number (several thousand to several million) of individually addressable, closely adjacent radiation-sensitive domains (also called picture elements or pixels), with each domain capable of generating an electrical signal in response to levels of incident radiation, covering a wide range of signal levels, for example, 256 grey levels of signal. In such area arrays, the domains are spatially arranged in a series of parallel rows and also in a series of parallel columns orthogonal to the rows. Thus, a scene viewed by such an area array sensor will be read out in the form of signal levels associated with discrete and known spatial coordinates of the device. These signals can then be stored by known means for digital data storage, for example on magnetic tapes or disks, optical disks, or electronic devices referred to as frame storage memories.
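The row-and-column readout described above can be illustrated with a minimal Python sketch. The function names `quantize` and `read_out`, the normalized input range of 0.0 to 1.0, and the sample `frame` are assumptions made for illustration only; they do not describe any particular sensor.

```python
def quantize(value, levels=256):
    """Map a normalized radiation level in [0.0, 1.0] to an integer grey level.

    The normalized input range is an assumption made for this sketch.
    """
    clipped = min(max(value, 0.0), 1.0)
    return min(int(clipped * levels), levels - 1)


def read_out(frame):
    """Yield (row, column, grey_level) for every pixel, row by row."""
    for r, row in enumerate(frame):
        for c, value in enumerate(row):
            yield r, c, quantize(value)


# Hypothetical 2 x 2 frame of normalized radiation levels.
frame = [[0.0, 0.5],
         [0.25, 1.0]]
samples = list(read_out(frame))
```

Each entry in `samples` pairs a known spatial coordinate with a grey level, which is the form in which such signals would be written to a frame storage memory.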
In the context of automatic detection of man-made objects or targets or features contained within natural environmental scenes, several problems are faced, among them:
(a) Due to remote sensing of the scene, targets may comprise dimensions of only a few tens of pixels out of possibly many thousands of pixels subtending the entire scene.
(b) The need for automatic detection with the computer-aided processing of digital signals and digital data to arrive at location-specific determination of a target or targets or features of interest within the scene.
(c) Near real-time determination of target location which requires relatively rapid processing of suitably derived digital data.

(a) Partitioning the picture elements (pixels) representing the scene into a plurality of individual identical and spatially coordinated groups of pixels with each group containing an identical number of pixels, and storing the associated digital signals; and
(b) Determining simultaneously from each one of all individual groups of pixels a plurality of digital data indicative of a texture measure or texture index of each group of pixels, and storing the spatially coordinated digital data.
(c) Self-calibrating the stored texture measures by the ratio of texture measure values among neighboring groups of pixels along rows and columns, and storing the self-calibrated texture measure values;
(d) Deciding by the application of a group of statistical tests to the spatially coordinated self-calibrated texture measures the identity of the group or groups of pixels having high or highest texture measure values, thereby identifying a particular group or groups of pixels as an area or areas of interest (AOI), most likely to contain a target;
(e) Detecting automatically the presence of a target within an area or areas of interest;
(f) Displaying the spatial location coordinates of the target within the area of interest;
(g) Reporting the spatial location coordinates of the target; and
(h) Resetting the automatic detection system for acceptance of a subsequent representation of a sensed scene.
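The partitioning, texture measurement, self-calibration and area-of-interest selection steps listed above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the text does not specify the texture measure or the statistical tests, so grey-level variance stands in for the texture measure, a fixed ratio threshold stands in for the group of statistical tests, and all function names, the smoothing term `eps`, and the threshold `factor` are hypothetical.

```python
from statistics import mean, pvariance


def partition(image, size):
    """Split the image into identical size x size pixel groups keyed by (group_row, group_col)."""
    rows, cols = len(image), len(image[0])
    groups = {}
    for gr in range(rows // size):
        for gc in range(cols // size):
            groups[(gr, gc)] = [image[gr * size + i][gc * size + j]
                                for i in range(size) for j in range(size)]
    return groups


def texture_measure(pixels):
    """Assumed stand-in texture measure: population variance of the grey levels."""
    return pvariance(pixels)


def self_calibrate(texture, eps=1.0):
    """Ratio of each group's texture to the mean texture of its row and column neighbors.

    eps is an assumed smoothing term that avoids division by zero in flat regions.
    """
    calibrated = {}
    for (gr, gc), value in texture.items():
        keys = ((gr - 1, gc), (gr + 1, gc), (gr, gc - 1), (gr, gc + 1))
        neighbors = [texture[k] for k in keys if k in texture]
        if not neighbors:
            calibrated[(gr, gc)] = 1.0
        else:
            calibrated[(gr, gc)] = (value + eps) / (mean(neighbors) + eps)
    return calibrated


def areas_of_interest(calibrated, factor=2.0):
    """Assumed stand-in for the statistical tests: keep groups whose ratio exceeds factor."""
    return sorted(key for key, ratio in calibrated.items() if ratio > factor)


# Hypothetical 8 x 8 scene: uniform background with one highly textured 4 x 4 patch.
image = [[100] * 8 for _ in range(8)]
for r in range(4, 8):
    for c in range(4, 8):
        image[r][c] = 255 if (r + c) % 2 else 0

groups = partition(image, 4)
texture = {key: texture_measure(pixels) for key, pixels in groups.items()}
aoi = areas_of_interest(self_calibrate(texture))
```

Because each group is compared only with its own neighbors, the ratio does not depend on a reference data base or on the absolute brightness of the scene, which is the point of the self-calibration step. In this example only the group at position (1, 1), which holds the textured patch, is flagged.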
Various methods have been described for processing digital signals associated with, or derived from, pictorial information. For example, Doi, et al. in U.S. Pat. No. 4,839,807, issued Jun. 13, 1989, discuss the comparison of texture measures determined from digitized images of abnormal lungs with similar measures contained in a data base for normal lungs. A texture index is determined from normalized texture measures, and a threshold texture index is then chosen for initial selection of abnormal regions of interest of lung tissue having a large texture index above the threshold texture level. The selected abnormal regions are then further classified into categories of abnormality. Thus, the method and system of Doi et al. for automated distinction between normal lungs and abnormal lungs in digital chest radiographs require a reference data base of texture measures for normal lungs.
Mori, et al. in U.S. Pat. No. 4,617,682, issued Oct. 14, 1986, discuss a method and apparatus for automatic quantitative measurement of textures by image analysis of a material having various optically anisotropic textures. The image is divided into a plurality of sections, and brightness of each section is classified by storage of grey levels of signals corresponding to these sections. Texture patterns are recognized on the basis of variations of grey levels, such variations being observed before and after movement of a mask. Textures of the material are then determined in accordance with predetermined criteria. Thus, Mori et al. employ an iterative process to determine texture and coarseness measures from grey level signals of areas of sample images which are prejudged to contain information suitable for image analysis.
Ledinh, et al. in U.S. Pat. No. 4,897,881, issued Jan. 30, 1990, discuss a fast texture parameter extractor having parallel processing capability to obtain a texture parameter from any four picture pixels automatically selected in a rectangular pattern. Transformed signals from these four pixels are uncorrelated with respect to one another. While the Ledinh et al. feature extractor may provide real-time extraction of textural features due to parallel signal processing, the texture extraction is based upon the application of four two-dimensional masks, independent of the images, to four adjacent pixels of a video display image.
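The general idea of applying four fixed, image-independent two-dimensional masks to a group of four adjacent pixels can be sketched as below. The specific masks of Ledinh et al. are not given in this text; the 2 x 2 Hadamard-type masks used here are an assumed, commonly used orthogonal set chosen purely for illustration, and the names `MASKS` and `transform_block` are hypothetical.

```python
# Assumed illustrative masks (2 x 2 Hadamard-type basis); not taken from the cited patent.
MASKS = (
    ((1, 1), (1, 1)),    # overall level of the four pixels
    ((1, 1), (-1, -1)),  # top row versus bottom row
    ((1, -1), (1, -1)),  # left column versus right column
    ((1, -1), (-1, 1)),  # diagonal versus anti-diagonal
)


def transform_block(block):
    """Apply each fixed mask to a 2 x 2 block of grey levels and return the four responses."""
    return tuple(
        sum(mask[i][j] * block[i][j] for i in range(2) for j in range(2))
        for mask in MASKS
    )


flat = transform_block(((10, 10), (10, 10)))
checker = transform_block(((0, 255), (255, 0)))
```

A flat block responds only to the first mask, while a checkerboard block also responds strongly to the diagonal mask. Because the masks are fixed in advance, the four responses can be computed in parallel, but they cannot adapt to the content of the particular scene.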
Ledley in U.S. Pat. No. 4,229,797, issued Oct. 21, 1980, describes a method and system for whole-picture image processing which provides for automatic texture and color analysis. The whole-picture analysis is performed relative to a predetermined whole reference picture.