This invention relates to methods and systems for geometric alignment of overlapping digital images and, specifically, to a computer-implemented method of creating, from a set of component images, a single, seamless composite image of the entire area covered by the set of component images. The method of the present invention has particular applicability to vertical-viewing aerial imagery, but can also be applied to other types of imagery.
In earth imaging operations, a set of images is typically acquired from an airplane or satellite in earth orbit using an image acquisition system. The image acquisition system may include a digital camera capable of capturing multispectral or hyperspectral single frame images. Alternatively, the images may be captured by other methods, for example, by conventional film aerial photography, later scanned to create digital images. Multispectral images comprise pixel intensity data in up to 10 spectral bands (e.g., red, green, blue, and near-IR) with relatively broad bandwidth (25 to 150 nm), while hyperspectral images comprise data for a larger number of spectral bands (typically numbering in the hundreds) with a narrow bandwidth (typically 1 to 10 nm).
Each image is defined by an image data record comprising a three-dimensional pixel data array, with X columns of pixels and Y rows of pixels for each of n spectral bands. The image data record is captured by the digital camera and stored on a computer-readable storage device, such as a disk drive or memory card. Typical digital images may be 1000 pixels × 1000 pixels in 7 spectral bands, or 4000 pixels × 4000 pixels in 4 spectral bands, or even 9000 pixels × 9000 pixels in a single spectral band ("black and white" or panchromatic).
For example, FIG. 1 shows an aircraft 2 carrying a prior art image acquisition system such as the ADAR System 5500 sold by the assignee of the present invention, Positive Systems, Inc., Whitefish, Mont. A GPS receiver of the image acquisition system (not shown) utilizes signals 4 from GPS satellites 6 in earth orbit to accurately determine the position and altitude of the aircraft 2. The angular orientation of the aircraft 2 (and, consequently, the image acquisition system) may also be measured by a gyroscope or accelerometer subsystem, which is integrated with the GPS receiver in an inertial measurement unit ("IMU"). Orientation is typically indicated by three angles measured by the IMU, namely, Phi (φ), Omega (ω), and Kappa (κ), which represent angular displacement about the respective X, Y, and Z axes, where X is parallel to the aircraft wings, Y is parallel to the aircraft body, and Z runs vertically through the aircraft. Although not typical, the position and orientation data could easily be recorded in a coordinate reference frame other than the Cartesian coordinate system. For example, position and orientation data can be recorded in a polar coordinate frame of reference. Digital image data is often acquired by time-interval photography and stored by the image acquisition system in association with contemporaneous position, orientation, and timing data. To increase accuracy, the position and orientation data is collected from the GPS receiver (or an alternative source such as a GLONASS receiver, a LORAN receiver, manual data entry, etc.) at the same moment when the image is captured. GPS-sensed position, altitude, and orientation data is not required, but can aid in automation of the mosaicking process, as described below.
FIG. 1 illustrates the sequential acquisition of a set of digital images 10 such that the images 10 overlap to ensure complete coverage of the area being imaged and to provide a basis for alignment of the images 10 relative to each other. FIG. 2A depicts four adjacent images A, B, C and D that include overlapping regions 20. FIG. 2B depicts a composite image called a mosaic 30 that depicts the surface area covered by the set of adjacent images A-D (FIG. 2A). Commercially available GPS equipment and orientation sensors are not capable of measuring position, altitude, and orientation with sufficient accuracy for the creation of the mosaic 30 so that no visible image alignment errors are present. Therefore, to provide a geometrically seamless mosaic 30, the adjacent images A, B, C and D must be manipulated to reduce misalignment. Further, a truly seamless mosaic must be adjusted radiometrically to ensure a uniform image brightness at the overlapping boundaries of the image frames.
Automated prior art methods of facilitating the alignment of overlapping images to form a mosaic are computationally intensive because they require correlation calculations to be made at a large number of locations in the subject images. Conversely, the amount of computation required can be reduced by limiting the number of locations where the correlation coefficients are calculated, but not without affecting the quality of the resulting image alignment.
For example, U.S. Pat. No. 5,649,032 of Burt et al. describes a method of automatically creating a mosaic image from multiple, overlapping still images captured by a video camera. Image alignment may be performed in batch, recursive, or hierarchical modes involving generation of the mosaic from a "pyramid image" by first tiling a subset of the component images in coarse alignment, then progressively improving alignment of the tiled images by calculating affine transformations at progressively greater resolution. This method does not involve selection of possible tie point locations based on whether the image data at a location on a subject image is likely to have a matching location on a target image. Rather, the method involves searching for matching locations on the target image regardless of the quality of the image data at the corresponding location of the subject image.
U.S. Pat. No. 5,187,754 of Currin et al. describes a method of forming a composite image mosaic from aerial photographs with the aid of an overview image obtained, e.g., by satellite. Tie points or ground control points are painstakingly identified manually by an operator using a computer mouse. Overlapping images are then automatically aligned by a tie point correlation method. The method does not involve automated selection of possible tie point locations.
U.S. Pat. Nos. 5,528,290 of Saund et al. and 5,581,637 of Cass et al. describe a system for creating a composite, mosaic image of a classroom whiteboard using a motorized camera pivotally mounted at a fixed location relative to the whiteboard. The camera pivots to capture multiple overlapping image frames, which are transmitted for reassembly at a viewing location. Landmarks projected or marked on the whiteboard in locations where the image frames overlap are selected using a gradient analysis applied at all pixel locations in the overlapping region of the images. Images are then aligned on a "board coordinate system" frame of reference by applying a weighted perspective distortion correction based upon a significance factor of the landmarks identified. This method is computationally expensive because it requires each pixel location in the overlapping area to be analyzed for the presence of a landmark. It would not be suitable for aerial imaging applications in which greater numbers of images at a much higher resolution than a video camera must be aligned accurately relative to geographic coordinates to form seamless mosaic images.
Thus, a need exists for a more efficient method and system of selecting tie point pairs in overlapping images for use in aligning the overlapping images to form a mosaic image. Methods suitable for use in aerial image processing applications are also needed.
In accordance with the present invention, a method of automatically creating mosaic images is implemented in a computer usable medium. The method involves obtaining a plurality of images including overlapping areas, identifying one or more search site points (SSPs) in the overlapping areas, and calculating a numeric interest measure (IM) indicative of the presence of image features at the SSPs. If the IM exceeds a predetermined threshold, the site is treated as an interesting point (IP), and the system proceeds with a search in an overlapping image for a tie point (TP) correlating to the IP. The TP together with the IP comprise a tie point pair (TPP) that can be used to calculate and apply geometric transformations to align the images and thereby form a seamless mosaic.
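The search for a tie point correlating to an IP can be illustrated with a minimal sketch. It assumes single-band images stored as lists of rows and uses normalized cross-correlation over a small patch as the matching score; the scoring function, the function names (`patch`, `ncc`, `find_tie_point`), and the parameters `guess`, `search`, and `half` are illustrative assumptions, not details taken from the present description.

```python
import math

def patch(image, row, col, half):
    """Flatten the (2*half+1)-square patch centered on (row, col)."""
    return [image[r][c]
            for r in range(row - half, row + half + 1)
            for c in range(col - half, col + half + 1)]

def ncc(a, b):
    """Normalized cross-correlation of two equal-length value lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0:
        return 0.0  # a featureless patch cannot be matched reliably
    return sum(x * y for x, y in zip(da, db)) / denom

def find_tie_point(subject, ip, target, guess, search=2, half=1):
    """Return ((row, col), score) of the best match for ip near guess.

    ASSUMPTION: guess is an approximate location in the target image,
    e.g. derived from position data; search is the search radius.
    """
    template = patch(subject, ip[0], ip[1], half)
    rows, cols = len(target), len(target[0])
    best = None
    for r in range(guess[0] - search, guess[0] + search + 1):
        for c in range(guess[1] - search, guess[1] + search + 1):
            # Skip locations whose patch would fall off the target image.
            if r - half < 0 or c - half < 0 or r + half >= rows or c + half >= cols:
                continue
            score = ncc(template, patch(target, r, c, half))
            if best is None or score > best[1]:
                best = ((r, c), score)
    return best

def make_image(top, left, size=9):
    """Hypothetical test image: a 3x3 feature on a flat background."""
    feature = [[1, 2, 3], [4, 9, 6], [7, 8, 5]]
    img = [[0] * size for _ in range(size)]
    for i in range(3):
        for j in range(3):
            img[top + i][left + j] = feature[i][j]
    return img

subject = make_image(2, 2)   # feature centered on (3, 3)
target = make_image(3, 4)    # same feature shifted by (+1, +2)
tie_point, score = find_tie_point(subject, (3, 3), target, guess=(3, 3))
```

Here the IP at (3, 3) in the subject image and the returned location in the target image would together form one tie point pair.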
Each point on a subject image at which IM is calculated is known as an interesting point candidate site (hereinafter "IP-candidate"). The system may calculate an IM of more than one IP-candidate within a predefined search window surrounding the SSP. The IMs calculated for the IP-candidates within a particular search window are compared to identify the IP-candidate at each SSP with the greatest IM, and thus, the greatest likelihood that features at the IP-candidate will yield good tie point pairs (both in the subject image and an overlapping target image).
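The selection of the IP-candidate with the greatest IM can be sketched as follows. This passage does not specify how the IM is computed, so the sketch substitutes the variance of pixel values in a small patch as a stand-in measure; `interest_measure`, `best_ip_candidate`, and the parameters `window`, `half`, and `threshold` are hypothetical names chosen for illustration.

```python
def interest_measure(image, row, col, half=1):
    """ASSUMED stand-in IM: variance of the patch centered on (row, col)."""
    values = [image[r][c]
              for r in range(row - half, row + half + 1)
              for c in range(col - half, col + half + 1)]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def best_ip_candidate(image, ssp, window=1, half=1, threshold=0.0):
    """Return ((row, col), im) for the best IP-candidate around ssp.

    Every site within `window` pixels of the SSP is an IP-candidate.
    Returns None when no candidate's IM exceeds `threshold`.
    """
    rows, cols = len(image), len(image[0])
    best = None
    for r in range(ssp[0] - window, ssp[0] + window + 1):
        for c in range(ssp[1] - window, ssp[1] + window + 1):
            # Skip candidates whose patch would fall off the image.
            if r - half < 0 or c - half < 0 or r + half >= rows or c + half >= cols:
                continue
            im = interest_measure(image, r, c, half)
            if best is None or im > best[1]:
                best = ((r, c), im)
    if best is None or best[1] <= threshold:
        return None
    return best

# Hypothetical 7x7 image: four bright pixels arranged around (3, 4).
image = [[0] * 7 for _ in range(7)]
for r, c in [(2, 4), (3, 3), (3, 5), (4, 4)]:
    image[r][c] = 10

result = best_ip_candidate(image, ssp=(3, 3))
flat = best_ip_candidate([[5] * 7 for _ in range(7)], ssp=(3, 3))
```

In this example the search window around the SSP at (3, 3) contains nine IP-candidates, and a featureless image yields no candidate at all, so no tie point search would be attempted there.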
The system may be of separable or modular design to facilitate its operation in an enhanced computing environment, such as a multithreaded processing environment, a multi-processor computer, a parallel processing computer, a computer having multiple logical partitions, or by distributed processing methods. In a preferred embodiment, the system and method also perform radiometric balancing of the images to reduce tonal mismatch.
Additional aspects and advantages of this invention will be apparent from the following detailed description of a preferred embodiment thereof, which proceeds with reference to the accompanying drawings.
As used in this application, the following terms shall be given the meanings defined below, unless the context indicates otherwise:
Bidirectional reflectance: A property of an object relating to the variation in apparent brightness of light reflected from the object as a function of the angle from which that reflected light is observed. Stated differently, given a uniform source of illumination, any object will appear to reflect different amounts of the incoming light depending on the angle between the incoming light source, the object, and an observer of the object.
Datum: An ellipsoid representing a reference surface of the earth. When creating maps, it is typical to reference all ground elevations to a reference elevation other than sea level. The reference elevation of the datum may be significantly higher than sea level. See also Ellipsoid.
DEM/DTM: A digital elevation model (DEM) or digital terrain model (DTM) is a raster representation of the elevation or terrain of an area, with the raster values representing the height of the surface of the earth either above sea level or some other ellipsoid.
Downsampling: A method for viewing or analyzing an image at a reduced resolution compared to the original image. For example, given an image 1000×1000 pixels in size, a "downsampled" image 100×100 pixels in size may be created by averaging every 10×10 pixel area to create a single pixel representing the average value of the 100 pixels in the 10×10 area. This downsampled image represents the original image, but with reduced resolution and scale. If the original image had a ground sample distance of 1 meter per pixel, the downsampled image would show the same image but at 10 meters per pixel. An advantage of downsampling is that it reduces both the amount of memory required to store and process the image (including volatile memory and nonvolatile memory such as disk storage) and the number of computer processing cycles required to analyze the image.
Ellipsoid: An ellipsoid is the mathematical representation of the earth, which is in fact not spherical but slightly flattened at the poles. See also Datum.
Geometric properties: The spatial information contained in an image, such as the ability to make geometric measurements between points shown in an image.
Map projection: The mathematics used to project an area of the earth, which is a near-spherical surface, onto a flat surface (i.e., a paper or electronic map) and to provide the definition for the projection of that map. Typical map projections include conical surfaces or cylindrical surfaces onto which the spherical earth is projected.
Monochromatic: An image of simple brightness values within a specific spectral band (color hue), including black and white images. See also Panchromatic.
Mosaic: A compilation of multiple images into a single larger image.
Multispectral/Hyperspectral: Imagery which includes multiple views of the same area on the ground, but with each view representing a different color, each represented (or stored) as an individual monochromatic image. Multispectral typically refers to imagery containing from four to approximately ten spectral bands, whereas hyperspectral data refers to imagery which may have dozens to hundreds of spectral bands.
Optical vignetting/Vignette effect: A common effect observed within any image, whether captured by an electronic sensor or film. The vignette effect is a simple darkening of the image as a function of distance from the center (i.e., the corners of the image appear darker than the center).
Panchromatic: A term from the film industry that typically refers to black and white images recorded with either film or a digital sensor sensitive to light in the full visible spectrum (approximately 400 nm to 700 nm).
Pixel: An abbreviation for "picture element." A pixel is the smallest unit of light information within an image that is displayable on an electronic screen (such as a video screen) or transferable to physical media (such as a hard copy print of a digital image).
Radiometric: A general reference to brightness or intensity values in an image (as opposed to "geometric," which refers to the spatial information in an image).
Raster: A method of displaying image information on a two-dimensional grid in which each point of the image grid is represented by one raster value. The digital values of each point on the grid are stored as pixels so that, when displayed on a computer screen, they appear as an image to the human eye. See also DEM/DTM.
Rectify/Rectification: An image is "rectified" when it is geometrically adjusted to enlarge or reduce the size of the image or a portion of the image, or to make curved lines appear straight or straight lines appear curved. Rectification performed as part of the mosaicking process involves adjusting the image geometry such that each pixel represents the same amount of area on the ground, and so that the geographic coordinates (latitude and longitude) of each pixel are known, to some limited degree of accuracy.
Remote Sensing: The use of aerial or satellite imaging, by electronic sensors or other types of devices, to collect information about material properties (vegetation, soil, minerals, oceans, etc.) of a land area for later analysis.
Resampling: A fundamental step in the rectification or transformation process. The mathematical transformation of an image from its current geometric state to another creates a new set of X,Y coordinates for each pixel in the original image. When any pixel in the new coordinate system (new image) lands outside a perfect X,Y location, the pixel values must be re-sampled to determine the appropriate brightness value of the new output pixel. The two common resampling algorithms are referred to as "bilinear interpolation" and "nearest neighbor." In nearest neighbor, the new pixel value is taken to be identical to the original pixel which is nearest to that new pixel location. The advantage of nearest neighbor resampling is that no pixel values are altered. In the case of bilinear interpolation, the new pixel value is a linear average from the original pixels above, below, left, and right of the new pixel. The advantage of bilinear interpolation is a smoother appearance in the output image, but since pixel values are altered, this algorithm is less desirable if quantitative radiometric analysis will be performed on the imagery.
Scan (scanned image): When a hard copy image (photograph) is placed on an electronic instrument and turned into a digital image, this is referred to as "scanning" the image.
Tie points: Pixels of two or more separate images which correspond to the same feature of an area shown in the images. Tie points are used to form a mosaic image.
Transformation: See Rectification.
Warp: Layman's term for the action of geometrically rectifying or transforming an image.
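The Downsampling and Resampling definitions above can be illustrated with a short sketch, assuming a single-band image stored as a list of rows. The function names are hypothetical, and the bilinear routine uses one common formulation (a distance-weighted average of the four original pixels surrounding the new location), which is an assumption rather than a formula given in the definitions.

```python
import math

def downsample(image, factor):
    """Average each factor-by-factor block into one output pixel."""
    rows, cols = len(image), len(image[0])
    if rows % factor or cols % factor:
        raise ValueError("image dimensions must be multiples of factor")
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            total = sum(image[r + dr][c + dc]
                        for dr in range(factor)
                        for dc in range(factor))
            out_row.append(total / (factor * factor))
        out.append(out_row)
    return out

def nearest_neighbor(image, x, y):
    """Return the unaltered value of the original pixel closest to (x, y)."""
    col = min(max(int(math.floor(x + 0.5)), 0), len(image[0]) - 1)
    row = min(max(int(math.floor(y + 0.5)), 0), len(image) - 1)
    return image[row][col]

def bilinear(image, x, y):
    """Distance-weighted average of the four pixels surrounding (x, y)."""
    max_row, max_col = len(image) - 1, len(image[0]) - 1
    x = min(max(x, 0.0), float(max_col))  # clamp to the image extent
    y = min(max(y, 0.0), float(max_row))
    c0, r0 = int(math.floor(x)), int(math.floor(y))
    c1, r1 = min(c0 + 1, max_col), min(r0 + 1, max_row)
    fx, fy = x - c0, y - r0
    top = image[r0][c0] * (1 - fx) + image[r0][c1] * fx
    bottom = image[r1][c0] * (1 - fx) + image[r1][c1] * fx
    return top * (1 - fy) + bottom * fy

# A 4x4 image reduced by a factor of 2 becomes a 2x2 image.
small = downsample([[1, 3, 5, 7],
                    [1, 3, 5, 7],
                    [2, 2, 6, 6],
                    [4, 4, 8, 8]], 2)

grid = [[10, 20],
        [30, 40]]
nn_value = nearest_neighbor(grid, 0.4, 0.6)  # an original value, unaltered
bl_value = bilinear(grid, 0.5, 0.5)          # a blend of all four pixels
```

The contrast between the two resampling results reflects the trade-off stated in the Resampling definition: nearest neighbor returns a value that exists in the original image, while bilinear interpolation produces a new, smoothed value.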