Electro-optical imaging satellites collect enormous amounts of geospatial imagery. As an example, the current DigitalGlobe constellation of satellites is capable of collecting over three million square kilometers of high-resolution geospatial imagery every day. Many other commercial providers collect similarly large amounts of geospatial imagery. The United States Government also collects geospatial imagery, although it provides the geospatial imagery to a very restricted customer base.
When geospatial imagery is provided, the satellite image is accompanied by metadata that represents the ground-to-image geometry. The ground-to-image geometry allows a geospatial coordinate (e.g., latitude, longitude, and height) to be mapped to the corresponding pixel coordinate (e.g., row number and column number) of the satellite image. A Rational Polynomial Coefficient (“RPC”) camera model is one type of such metadata. The RPC camera model is an approximation of a Rigorous Projection (“RP”) model that describes the precise relationship between image coordinates and ground coordinates. (See “The Compendium of Controlled Extensions (CE) for the National Imagery Transmission Format (NITF),” v. 2.1, National Imagery and Mapping Agency, Nov. 16, 2001.) The RPC camera model provides 20 numerator coefficients and 20 denominator coefficients for a row equation and 20 numerator coefficients and 20 denominator coefficients for a column equation. The row equation takes a geospatial coordinate as input and outputs the row number of the image pixel corresponding to that geospatial coordinate, and the column equation takes a geospatial coordinate as input and outputs the column number of that pixel. All the geospatial coordinates along a ray from the camera to a point on the surface of the earth map to the same pixel coordinate.
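As an illustration of how the row and column equations of an RPC camera model might be evaluated, the following Python sketch computes each equation as the ratio of a 20-term numerator polynomial to a 20-term denominator polynomial in normalized latitude, longitude, and height. The passage above states only the coefficient counts. The normalization offsets and scales, the ordering of the 20 cubic terms (taken here to follow the commonly used RPC00B convention), and every name in the code (RpcModel, cubic_terms, ground_to_image) are assumptions made for illustration rather than details of any particular provider's metadata.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


def cubic_terms(P: float, L: float, H: float) -> List[float]:
    """Return the 20 monomials of a cubic polynomial in three variables.

    P, L, H are normalized latitude, longitude, and height.
    The term order is an assumption (RPC00B-style ordering).
    """
    return [
        1.0, L, P, H, L * P, L * H, P * H, L * L, P * P, H * H,
        P * L * H, L ** 3, L * P * P, L * H * H, L * L * P,
        P ** 3, P * H * H, L * L * H, P * P * H, H ** 3,
    ]


def rational(num: Sequence[float], den: Sequence[float],
             terms: Sequence[float]) -> float:
    """Evaluate (num . terms) / (den . terms)."""
    n = sum(c * t for c, t in zip(num, terms))
    d = sum(c * t for c, t in zip(den, terms))
    if d == 0.0:
        raise ValueError("denominator polynomial evaluated to zero")
    return n / d


@dataclass
class RpcModel:
    # 20 coefficients each, as described in the text.
    row_num: Sequence[float]
    row_den: Sequence[float]
    col_num: Sequence[float]
    col_den: Sequence[float]
    # Normalization parameters (assumed; not described in the text).
    lat_off: float
    lat_scale: float
    lon_off: float
    lon_scale: float
    h_off: float
    h_scale: float
    row_off: float
    row_scale: float
    col_off: float
    col_scale: float

    def ground_to_image(self, lat: float, lon: float,
                        height: float) -> Tuple[float, float]:
        """Map a geospatial coordinate to a (row, column) pixel coordinate."""
        P = (lat - self.lat_off) / self.lat_scale
        L = (lon - self.lon_off) / self.lon_scale
        H = (height - self.h_off) / self.h_scale
        terms = cubic_terms(P, L, H)
        # Each equation yields a normalized value that is then de-normalized.
        row = rational(self.row_num, self.row_den, terms) * self.row_scale + self.row_off
        col = rational(self.col_num, self.col_den, terms) * self.col_scale + self.col_off
        return row, col
```

Because height is an input to both equations, distinct geospatial coordinates lying along the same camera ray can evaluate to the same row and column, which is consistent with the many-to-one mapping described above.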
Although geospatial imagery is useful in its own right, it can be much more useful when annotations are applied to it. For example, a person viewing a satellite image that includes San Francisco may notice that it contains a tower whose top looks somewhat like a firehose nozzle. If an annotation were associated with the portion of the image corresponding to the tower, the user could click on the tower to see the annotation. The annotation may provide the name of the tower (i.e., “Coit Tower”) along with an explanation that the tower was not designed to look like a firehose nozzle. If annotations were associated with satellite images, a wealth of information could be made available to people viewing those images.
Because adding annotations is difficult to automate, geospatial imagery is typically annotated manually by an image analyst, which is itself a difficult and time-consuming task. A significant problem with automatically annotating objects (e.g., Coit Tower) in geospatial imagery is that satellite images are often collected from off-nadir (i.e., not looking straight down) viewpoints, sometimes with large obliquity. Such off-nadir viewing angles make segmentation, the first step in traditional annotation processing, much more difficult and error-prone. Additionally, shadows are difficult to process and can be mistaken by automated systems for objects in the satellite image. For example, the shadows cast by tall structures can appear to be objects themselves or may obscure actual objects. Unfortunately, the cost of manually annotating satellite images can be prohibitive, and the results often contain errors. Thus, the vast majority of satellite imagery is never annotated.