1. Field of the Invention
The present invention relates to sensor models for describing a ground-to-image relationship for image sensors. More specifically, the present invention relates to a replacement sensor model that is generic to a plurality of image sensors and that reliably provides a full range of image exploitation.
2. Description of the Related Art
Image sensors are widely known tools for obtaining images of various objects. More particularly, image sensors such as airborne or space-borne image sensors are known to be useful in obtaining images of underlying objects as the sensors pass by or over the objects. Various characteristics of these objects that are demonstrated by the resultant images can then be observed, including a temperature of the object, its position (latitude, longitude and/or elevation), its appearance, its color, its relation to another object, etc.
Since there exist these various characteristics of the imaged objects that can be observed, the term “image exploitation” is generically used to refer to a use to which an image is put. A subset of image exploitation is referred to as “geopositioning,” where, for example, an image is exploited by determining a latitude and longitude of the imaged object based on the image. Another example is determining latitude and longitude along with elevation of the imaged object based on multiple images. Geopositioning is considered “optimal” when it is performed in a manner such that its solution errors are minimized. This, in turn, requires statistical knowledge of the errors affecting the solution, such as the error in the knowledge of the sensor's position when the image was taken.
Image sensors such as those just discussed have various internal characteristics such as focal length, as well as various external characteristics such as location and attitude of the sensor at a given point in time. These characteristics, which vary individually by sensor, are the typical sources of error just referred to. These sources of error must be accounted for in order to obtain reliable and accurate image information, and/or to be aware of a possible extent of any resulting errors. Such characteristics and their specific values are conventionally referred to as “support data,” or “sensor support data.” The support data can be obtained by, for example, receipt of navigational data transmitted from the image sensor's platform to a ground station. For example, data may be obtained from a Global Positioning System (GPS) receiver on board a satellite.
An imaging methodology that reflects the above-discussed concepts is known as a “sensor model,” or, more specifically, a “rigorous (or physical) sensor model.” Specifically, a rigorous sensor model for an image sensor such as the one discussed above relates ground coordinates (i.e., three-dimensional data pertaining to the imaged object) to image coordinates (i.e., two-dimensional pixel data for the image of the object), utilizing the sensor support data that accompany each image and that specify sensor characteristics at a time the image was obtained, as discussed above.
As demonstrated in FIG. 1, a sensor model 110 consists of a ground-to-image transformation; i.e., a mathematical function with a three-dimensional ground coordinate set 120 as input and a two-dimensional image (pixel) location as output 130. The ground-to-image transformation of sensor model 110 is parameterized by the support data and as such is descriptive of the sensor's characteristics. The transformation of ground coordinates into image coordinates effected by sensor model 110 is thus described by Equation (1):

u=F(X,S) and v=G(X,S),  (1)

where u and v are the image pixel coordinates, X is a three-dimensional vector describing the ground data as discussed above, and S is the sensor support data state vector (the error covariance shown in FIG. 1 is discussed in detail below).
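Equation (1) can be sketched in code. The projection below is a hypothetical nadir-looking frame-camera parameterization, chosen only for illustration; the support-data vector S of a real sensor would contain that sensor's actual characteristics, not these made-up ones.

```python
def ground_to_image(X, S):
    """Sketch of Equation (1): u = F(X, S), v = G(X, S).

    A simplified nadir-looking frame-camera model. The support-data
    vector S here holds sensor position (sx, sy, sz), focal length f,
    and a pixel scale -- a hypothetical parameterization, not any
    particular sensor's actual support data.
    """
    sx, sy, sz, f, pixel_scale = S
    x, y, z = X
    dz = sz - z                        # height of the sensor above the ground point
    # Perspective projection onto the image plane, then pixel scaling.
    u = f * (x - sx) / dz / pixel_scale
    v = f * (y - sy) / dz / pixel_scale
    return u, v

S = (0.0, 0.0, 500e3, 1.0, 1e-5)       # sensor at 500 km altitude (illustrative)
u, v = ground_to_image((100.0, -50.0, 0.0), S)
```

The same ground point fed through a different support-data vector S would, of course, yield different pixel coordinates, which is exactly why S must accompany each image.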
The sensor model 110 is routinely used in image exploitation. As an example of this type of use of the sensor model 110, the geopositioning process referred to above will be discussed in more detail.
Proper geopositioning conventionally consists of two types of solutions: ground point (target) extraction and triangulation. The first of these two is obtained by the Extraction Process, where the location of ground point(s) is extracted from the two-dimensional image coordinates of the object identified and measured in the image(s). For example, an operator of an image sensor within a satellite may obtain a plurality of images of relatively large areas on the earth's surface, such as a city or a portion of a city. Such an area of the entire image is typically known as the “image footprint.” Corresponding image data may then be stored by the operator, along with the support data of the image sensor applicable at the time the image was taken. In this way, a location of a particular target object such as a particular building within the city may be “extracted” from the image data.
In order to perform this process, for example using one image, an inverse transformation of the sensor model 110 and associated equation (1) is used; i.e., pixel data (u,v) and the vertical component of X, known a priori, are used as inputs to obtain the two horizontal components of X as outputs. The inverse transformation is typically a numerical, iterative inverse of the ground-to-image function. The support data state vector S is treated as a vector of fixed parameters. Examples of the Extraction Process are discussed in more detail below with respect to FIGS. 2-5.
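The numerical, iterative inverse can be sketched as follows, reusing the same hypothetical nadir-looking projection introduced above. A finite-difference Newton iteration stands in for whatever scheme a given implementation actually uses; S is held fixed throughout, as the text describes.

```python
import numpy as np

def ground_to_image(X, S):
    # The same hypothetical nadir-looking projection as before (an
    # assumption for illustration, not a real sensor's Equation (1)).
    sx, sy, sz, f, scale = S
    x, y, z = X
    dz = sz - z
    return np.array([f * (x - sx) / dz / scale,
                     f * (y - sy) / dz / scale])

def image_to_ground(uv, z, S, iters=20):
    """Numerical, iterative inverse: pixel (u, v) plus an a priori
    elevation z in, horizontal ground coordinates (x, y) out."""
    xy = np.zeros(2)                   # initial horizontal estimate
    for _ in range(iters):
        r = np.asarray(uv) - ground_to_image((xy[0], xy[1], z), S)
        # Finite-difference Jacobian of (u, v) with respect to (x, y).
        J = np.zeros((2, 2))
        h = 1e-3
        base = ground_to_image((xy[0], xy[1], z), S)
        for j in range(2):
            p = xy.copy()
            p[j] += h
            J[:, j] = (ground_to_image((p[0], p[1], z), S) - base) / h
        xy = xy + np.linalg.solve(J, r)
    return xy

S = (0.0, 0.0, 500e3, 1.0, 1e-5)
uv_true = ground_to_image((120.0, 80.0, 10.0), S)   # forward projection
xy = image_to_ground(uv_true, 10.0, S)              # recover (x, y)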
The inverse transformation is a straight-forward solution technique applicable to one image. However, in order to also provide estimates of the one image solution accuracy, as well as utilize more than one image, a more sophisticated approach is required. This approach is characterized by the “M-image Extraction Algorithm.” This algorithm is discussed in detail below; generally speaking, however, the algorithm utilizes various image data corresponding to a plurality of images of the same target object, at least some of which offer different perspectives of the object.
Multiple perspectives are required to extract three-dimensional ground positions from two-dimensional images. Also, each of the images is weighted (the criteria for performing the weighting are also discussed below), and the ground point locations are solved for iteratively in an optimal fashion (i.e., an estimate of X is used to begin iteration of the algorithm, leading to progressively more accurate estimates of X). Solution accuracy estimates are also provided.
FIG. 2 illustrates the role of the sensor model's ground-to-image transformation in the extraction of a ground point 210 using a stereo pair of images 220 and 230. In FIG. 2, images 220 and 230 may either be obtained by two separate image sensors (perhaps at the same point in time), or by a single image sensor at different points in time. In FIG. 2, the ground-to-image transformation (or its inverse) specifies the image rays. The image rays intersect the image coordinates measured in each image.
FIG. 2 assumes perfect sensor support data. However, as discussed above, there will virtually always be errors in the sensor support data, as illustrated by incorrect sensor position/orientation 310 in FIG. 3, and any such errors will propagate to errors in extracted ground points. Thus, the support data errors should be accounted for during extraction in some manner in order to achieve optimal image exploitation (i.e., here, optimal geopositioning).
One way of accounting for support data errors is to determine a statistical “outer bound” or possible extent of the support data errors as they propagate to errors in the extracted ground points. For example, support data may indicate that a satellite (and its image sensor) was in a certain position at the time a given image was taken. An error in this data (i.e., the satellite was not actually at that exact position) may result in an extracted ground point being calculated as being, for example, up to ten meters off from its actual location. Such knowledge of this extent to which the support data (and thus the determined position of the extracted ground point) is inaccurate is referred to as “error propagation.” Error propagation can also refer to related concepts, such as, for example, a determination that a relative position between two particular imaged objects is accurate to within a particular distance.
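Error propagation of this kind is commonly computed by mapping the support-data covariance through a Jacobian. A hedged sketch, with all matrix values illustrative rather than taken from any real sensor:

```python
import numpy as np

# A support-data error covariance C_S (here, a 3x3 sensor-position error
# covariance with illustrative variances in m^2) is propagated through the
# Jacobian J of the extracted ground point with respect to the support
# data, giving the ground-point error covariance P = J C_S J^T.
C_S = np.diag([25.0, 25.0, 100.0])     # position variances: 5 m, 5 m, 10 m (1-sigma)

# Hypothetical Jacobian: near-nadir imaging makes horizontal ground error
# roughly track horizontal sensor-position error, with a small altitude term.
J = np.array([[1.0, 0.0, 0.02],
              [0.0, 1.0, 0.02]])

P = J @ C_S @ J.T                      # 2x2 ground-point error covariance
sigma_x = np.sqrt(P[0, 0])             # 1-sigma horizontal ground error, m
```

The resulting sigma_x of roughly five meters is the kind of statistical "outer bound" on the extracted position that the text describes.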
Additionally, if a given image sensor obtains multiple images of a single object, or multiple image sensors each obtain a single image of an object(s), it would be useful to know which of the images are associated with more accurate support data. This knowledge is the primary criterion mentioned above that allows for the weighting of the multiple images in the M-image Extraction Algorithm, so that the more accurate images are given relatively more importance in arriving at a composite solution.
In short, it is useful to know an extent to which the support data is accurately known; i.e., an extent of errors contained in the support data. This knowledge is quantified by an “error covariance” matrix. Specifically, the quantification of the accuracy of sensor support data is represented by an a priori (initial) error covariance matrix of the support data (see FIG. 1). In its most general form, the a priori error covariance of the sensor support data errors is an (mn×mn) matrix, where m equals the number of images and n is the number of sensor support data error components per image.
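Assembling the (mn×mn) a priori matrix can be sketched as follows. The per-image variances and the assumption of zero cross-correlation between images are illustrative; as the text notes, off-diagonal blocks would carry any cross-correlation between different images' errors.

```python
import numpy as np

# m images, each with n support-data error components: here, 3 position
# errors (m^2) and 3 attitude errors (rad^2), with illustrative variances.
m, n = 3, 6
per_image = np.diag([25.0, 25.0, 100.0, 1e-8, 1e-8, 1e-8])

# Build the (mn x mn) a priori covariance. Per-image blocks sit on the
# diagonal; cross-image blocks are left zero here -- an assumption of
# uncorrelated images made purely for illustration.
C = np.zeros((m * n, m * n))
for i in range(m):
    C[i * n:(i + 1) * n, i * n:(i + 1) * n] = per_image
```

For correlated images (e.g., images taken moments apart on the same pass), the off-diagonal blocks would be populated instead of left zero.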
The accuracy of the extracted ground point's location is quantified by its a posteriori (after solution) error covariance matrix. The a posteriori error covariance matrix is primarily a function of the number of images, their geometry, and the a priori error covariance matrix of the sensor support data. Along with the ground point solution, the a posteriori error covariance is also output from the M-image Extraction Algorithm.
Equation (2) formally defines the error covariances (all error processes are assumed unbiased, i.e., zero mean, and E{ } corresponds to expected value):

CS=E{εS εS^T}, the sensor state vector a priori error covariance,
P=E{εX εX^T}, the extracted ground point a posteriori error covariance,  (2)

where εS and εX are the error vectors (multi-variate random variables) associated with S and X.
FIG. 4 illustrates the extraction of a ground point using multiple images from multiple image sensor positions 410a-410n, as might be performed by the M-image Extraction Algorithm. As in FIGS. 2 and 3, the ground-to-image transformation of the rigorous sensor model can be used to specify the image rays. Sensor support data errors prevent the multiple rays from intersecting at a common point, but the error covariance associated with the rigorous sensor model can quantify an extent of the errors, and thereby determine which rays correspond to more accurate support data. In this way, these rays can be given more weight in executing the algorithm. Generally, the more images utilized, the more accurate the ground point extraction (solution) will be.
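The weighting idea behind the M-image Extraction Algorithm can be illustrated with a short sketch. The linear closed form below, which finds the point minimizing the weighted sum of squared perpendicular distances to known rays, is a simplification for illustration; the actual algorithm is iterative and works from the sensor models themselves, as described above.

```python
import numpy as np

def weighted_ray_intersection(origins, dirs, weights):
    """Find the ground point closest, in a weighted least-squares sense,
    to a set of image rays. Rays derived from better-known support data
    receive larger weights, as in the M-image idea described in the text."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, d, w in zip(origins, dirs, weights):
        d = d / np.linalg.norm(d)
        M = w * (np.eye(3) - np.outer(d, d))   # projects out the along-ray direction
        A += M
        b += M @ o
    return np.linalg.solve(A, b)

# Three illustrative rays that intersect at the origin.
origins = [np.array([0.0, 0.0, 500.0]),
           np.array([100.0, 0.0, 500.0]),
           np.array([0.0, 100.0, 500.0])]
dirs = [np.array([0.0, 0.0, -1.0]),
        np.array([-100.0, 0.0, -500.0]),
        np.array([0.0, -100.0, -500.0])]
X = weighted_ray_intersection(origins, dirs, [1.0, 1.0, 1.0])
```

With support-data errors, the rays would not intersect, and raising the weight of a more accurate image would pull the solution toward its ray.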
FIGS. 2-4 have depicted only two of the three ground (object space) dimensions for simplicity. Images 510 and 520 in FIG. 5 further illustrate exemplary effects of support data errors, but in all three dimensions. Again, the image rays do not intersect due to the errors, but the error covariance will quantify an extent of the errors and thereby allow for error propagation and weighting of the images to obtain an optimal ground point extraction. In this particular example, both images are given equal weight; hence, the solution is at the midpoint of the minimum separation vector.
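The equal-weight, two-image case of FIG. 5 reduces to a classical closest-points computation between two skew lines. A minimal sketch, with ray values chosen purely for illustration:

```python
import numpy as np

def midpoint_of_min_separation(o1, d1, o2, d2):
    """For two non-intersecting image rays given equal weight (as in the
    FIG. 5 example), return the midpoint of the minimum separation
    vector -- the two-image solution described in the text."""
    d1 = d1 / np.linalg.norm(d1)
    d2 = d2 / np.linalg.norm(d2)
    w = o1 - o2
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    d, e = d1 @ w, d2 @ w
    denom = a * c - b * b              # zero only for parallel rays
    t1 = (b * e - c * d) / denom       # parameter of closest point on ray 1
    t2 = (a * e - b * d) / denom       # parameter of closest point on ray 2
    p1 = o1 + t1 * d1
    p2 = o2 + t2 * d2
    return (p1 + p2) / 2.0

# Two skew rays whose closest points straddle the solution.
o1, d1 = np.array([0.0, 0.0, 1.0]), np.array([0.0, 1.0, -1.0])
o2, d2 = np.array([1.0, 0.0, 0.0]), np.array([-1.0, 0.0, 0.0])
X = midpoint_of_min_separation(o1, d1, o2, d2)
```

Unequal weights would move the solution along the separation vector toward the more trusted ray instead of the midpoint.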
The above discussion has focused on the Extraction Process in performing optimal geopositioning. However, as mentioned above, geopositioning may also include the Triangulation Process.
In the Extraction Process, a location of a ground point(s) is extracted from image data, and an extent of support data errors, quantified by an error covariance matrix, permits error propagation and proper weighting of a plurality of images in obtaining an optimal ground point solution. The Triangulation Process also solves for the ground points, but additionally solves for the support data errors of all the images involved, adjusting or reducing them so that corresponding errors in the ground point solution or in subsequent extractions are actually reduced.
Solving for the support data errors according to the Triangulation Process typically requires additional information to be input into the process. For example, a location (ground point) of a control point within the image may be previously known to a great degree of accuracy, and can thus be used as a reference to actually solve for (and correct) errors in the sensor support data. Triangulation, as is known, typically involves a weighted least squares adjustment, using the a priori error covariance of the support data and the control points, and/or other previously-known information, to adjust the support data.
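The weighted least squares adjustment can be sketched in a few lines. All matrices below are illustrative stand-ins; a real adjustment linearizes the sensor model about the current support data to obtain the design matrix, and the control-point residuals come from actual measurements.

```python
import numpy as np

# Control-point residuals r (observed minus predicted image coordinates),
# a design matrix A relating support-data corrections to those residuals,
# measurement weights W, and the a priori support-data covariance C_S
# acting as a soft constraint on the adjustment. Values are illustrative.
A = np.array([[1.0, 0.5],
              [0.0, 1.0],
              [1.0, 1.0]])             # 3 residuals, 2 adjustable parameters
r = np.array([0.9, 0.4, 1.3])          # residuals, pixels
W = np.eye(3)                          # control-point measurement weights
C_S = np.diag([4.0, 4.0])              # a priori support-data covariance

# Normal equations with the a priori covariance folded in as a prior.
N = A.T @ W @ A + np.linalg.inv(C_S)
delta = np.linalg.solve(N, A.T @ W @ r)   # support-data correction
C_post = np.linalg.inv(N)                 # reduced (a posteriori) covariance
```

The a posteriori covariance C_post is smaller than C_S, reflecting the improvement that the control points contribute to the adjusted support data.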
In summary, a complete sensor model consists of a ground-to-image transformation, the sensor support data, and the a priori error covariance of the sensor support data. The error covariance is relative to an identified set of sensor parameters. The relevant sensor support data errors are the dominant errors, such as sensor position error and attitude error. The support data a priori error covariance matrix quantifies the expected magnitude of the errors, the correlation between different error components from the same image, and, when applicable, the cross correlation between errors from different images (i.e., the extent to which errors in different images are related to one another). Error covariance can be used for extraction, including error propagation and weighting of a plurality of images. If additional information is known with respect to, for example, control points within the image footprint, error covariance can also allow solving for the sensor support data errors and adjusting them so as to obtain a more accurate determination of a target ground point(s).
Using the above techniques, rigorous sensor models are conventionally recognized as providing a ground-to-image transformation in support of optimal image exploitation, including geopositioning. However, since such rigorous sensor models are specific to particular image sensors, users of the image sensors must have access to the specific models. If a user is utilizing many different image sensors, the costs associated with obtaining, maintaining and utilizing all of the associated rigorous sensor models becomes burdensome. Moreover, many operators of image sensors do not wish to essentially share proprietary information relating to their equipment by distributing rigorous sensor models to users.
Therefore, a more general sensor model, termed the abstract sensor model, has been developed. It is also a ground-to-image function, and is conventionally expressed as a polynomial as shown in Equation (3), although a ratio of two polynomials, commonly known as a rational function, is sometimes used:

u=a0+a1x+a2y+a3z+a4xy+ . . . and v=b0+b1x+b2y+b3z+b4xy+ . . . ,  (3)

In Equation (3), x, y, z are the components of the ground position vector X and a0, a1, . . . , b0, b1, . . . are the polynomial coefficients. Typically, X is relative to a ground coordinate system that is centered at the middle of the image footprint and scaled such that the coordinate values x, y, and z range from −1.0 to 1.0.
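A minimal evaluation of the Equation (3) polynomial, including the footprint scaling just described, might look like this. The coefficient values, footprint center, and half-extents are all illustrative assumptions.

```python
import numpy as np

def abstract_model(X, a, b, center, half_extent):
    """Evaluate the truncated polynomial of Equation (3) with the five
    terms 1, x, y, z, xy. The scaling maps ground coordinates into the
    [-1, 1] cube centered on the image footprint, as described above."""
    x, y, z = (np.asarray(X) - np.asarray(center)) / np.asarray(half_extent)
    terms = np.array([1.0, x, y, z, x * y])
    return terms @ a, terms @ b        # (u, v) pixel coordinates

a = np.array([512.0, 480.0, 5.0, -2.0, 0.1])   # illustrative coefficients
b = np.array([512.0, 4.0, 470.0, -1.5, 0.2])
u, v = abstract_model((500.0, -250.0, 100.0), a, b,
                      center=(0.0, 0.0, 0.0),
                      half_extent=(1000.0, 1000.0, 500.0))
```

A user needs only this generic evaluation code plus the coefficient values distributed with the image; no sensor-specific software is involved.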
The ground-to-image function corresponds to the polynomial and the “support data” are the values of the polynomial coefficients. Such a conventional abstract sensor model has various advantages. For example, the user need not know nor obtain the specific rigorous sensor models. Thus, one such model can be used for many different sensors, which also affords a common interface for the distribution and use of imagery and its support data (since a given user only needs software corresponding to the polynomial, and can then simply “plug in” the values of the coefficients and image data).
Moreover, these advantages result in lower costs for user development and maintenance. The use of an abstract sensor model polynomial defining a conventional abstract sensor model can also result in higher user throughput, due to its usually faster evaluation as compared to the original sensor model. Finally, the use of a polynomial can lead to wider availability of a specific sensor's imagery, since the polynomial coefficients (unlike the actual support data) do not have any obvious, literal physical meaning and therefore do not provide insight into proprietary sensor characteristics (as the original sensor model does).
In generating coefficients for the polynomial, the rigorous sensor model is used to generate a grid of image pixel-ground point correspondences across a representative area of the image footprint, where the ground points correspond to multiple elevation planes. That is, a number of latitude, longitude and elevation points over a representative area of the footprint are selected and inserted into the ground-to-image function to generate corresponding image points. (Alternatively, a number of image points are selected and inserted into the corresponding image-to-ground function, along with an a priori elevation, to generate corresponding ground points.) The coefficients are then fit to the grid of resulting correspondences to obtain the coefficients' values. The original sensor model can use either original sensor support data or triangulated sensor support data. Any support data errors that still exist within the sensor model when generating the polynomial coefficients will be reflected within the coefficients. However, like the coefficients themselves, such errors will not have any apparent physical meaning and so cannot conventionally be accurately quantified by an associated error covariance.
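The grid-and-fit procedure can be sketched as follows. The stand-in "rigorous" function and the five-term basis of Equation (3) are illustrative assumptions; a real generation run would evaluate the actual rigorous sensor model over the footprint.

```python
import numpy as np

# Stand-in for a rigorous ground-to-image function (one image coordinate).
# This is a made-up function for illustration, not any actual sensor model.
def rigorous_u(x, y, z):
    return 512.0 + 480.0 * x + 5.0 * y - 2.0 * z + 0.1 * x * y

# Grid over the scaled [-1, 1] footprint, at three elevation planes.
g = np.linspace(-1.0, 1.0, 9)
pts = np.array([(x, y, z) for x in g for y in g for z in (-1.0, 0.0, 1.0)])

# Design matrix for the Equation (3) terms: 1, x, y, z, xy.
x, y, z = pts.T
M = np.column_stack([np.ones(len(pts)), x, y, z, x * y])
u_obs = rigorous_u(x, y, z)            # image coordinates from the "rigorous" model

# Least-squares fit of the coefficients to the grid of correspondences.
coeffs, *_ = np.linalg.lstsq(M, u_obs, rcond=None)
fit_rms = np.sqrt(np.mean((M @ coeffs - u_obs) ** 2))
```

Here the stand-in function lies exactly in the polynomial basis, so the fit error is essentially zero; with a real sensor model the residual would be the sub-pixel fit error discussed below.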
A straightforward method for performing the fit to determine the coefficients is the well-known least squares fitting process. The size of the “fit error,” i.e., the difference in image coordinates between the polynomial evaluation and the original sensor model evaluation at a common ground point(s), depends on the form (degree) of the polynomial and the details of the fitting process. Such details of generating a conventional abstract sensor model polynomial as just described are conventionally well-known.
Equation (4) illustrates a typical polynomial abstract sensor model. It is order 2 in x, order 2 in y, and order 1 in z. Typical polynomial fit error is sub-pixel, with 0.01 pixel root-mean-square error not uncommon.
u = Σ_{i=0}^{2} Σ_{j=0}^{2} Σ_{k=0}^{1} a_{ijk} x^i y^j z^k and v = Σ_{i=0}^{2} Σ_{j=0}^{2} Σ_{k=0}^{1} b_{ijk} x^i y^j z^k  (4)
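A direct evaluation of the triple sum of Equation (4) might look like the following sketch; the coefficient values are random placeholders used only to cross-check the loop against an equivalent vectorized form.

```python
import numpy as np

def eval_equation_4(x, y, z, a):
    """Evaluate one image coordinate of Equation (4): order 2 in x,
    order 2 in y, order 1 in z, with a 3x3x2 coefficient array a
    (values illustrative)."""
    u = 0.0
    for i in range(3):
        for j in range(3):
            for k in range(2):
                u += a[i, j, k] * x**i * y**j * z**k
    return u

rng = np.random.default_rng(0)
a = rng.normal(size=(3, 3, 2))         # placeholder coefficients
u = eval_equation_4(0.3, -0.5, 0.1, a)

# Cross-check against a vectorized evaluation of the same triple sum.
xi = np.array([0.3**i for i in range(3)])
yj = np.array([(-0.5)**j for j in range(3)])
zk = np.array([0.1**k for k in range(2)])
u_vec = np.einsum('ijk,i,j,k->', a, xi, yj, zk)
```

The v coordinate is evaluated identically with the b coefficient array.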
Conventional abstract sensor models can match the ground-to-image relationship of an original sensor model with a great deal of accuracy for many image sensors. That is, they are generally capable of providing very similar image outputs when given the same ground point inputs (or vice-versa). Thus, such abstract sensor models can be used to perform a rudimentary extraction of a ground point target object from an image(s) as discussed above.
However, abstract sensor models have no associated error covariance that is equivalent in any meaningful way with the sensor support data error covariance of the rigorous sensor model. Without such an error covariance, conventional abstract sensor models cannot perform useful error propagation, determine optimal weights to assign multiple images in determining a composite solution, or solve/correct for “support data” errors (i.e., support data errors reflected in the polynomial coefficients) in a triangulation process.
What is needed is an abstract model that provides an accurate but flexible ground-to-image relationship that is easy to use and applicable to many sensors, such as that provided by conventional abstract sensor models, together with a meaningful error covariance relative to a set of adjustable parameters that allows for optimal image exploitation, such as that conventionally provided by rigorous sensor models.